Hi,
Currently, I receive files with header and trailer records.
How can I setup pipeline and have pipeline skip header and trailer and parse remaining records and load it into table?
If we have to use Stored Procedure, can you please provide a sample that can detect header and trailer. I might prefer header, trailer and file name inserted as a record in a separate table for auditing purposes and actual data into Target table.
You can get the filename in the process, which can be passed into the stored procedure via pipeline_batch_id() or pipeline_source_file() builtins. Here’s an example:
MySQL [db]> create or replace procedure proc(q query(a int, batch int)) as begin insert into t select batch from q; end //
Query OK, 1 row affected (0.015 sec)
MySQL [db]> create or replace pipeline p as load data fs "/tmp/*.csv" into procedure proc(a) set batch = pipeline_batch_id()//
Query OK, 0 rows affected (0.196 sec)
Note that you can always find the source for the pipeline batch by looking at the relevant tables in INFORMATION_SCHEMA, e.g. PIPELINES_BATCHES.
However, the transform may need to do something extra for headers and trailer, e.g. tag the line with a code so that you can properly route the row in a stored procedure later.