I have successfully created a pipeline that connects to S3, but I am getting this bogus error message:
```
ERROR 1262 ER_WARN_TOO_MANY_RECORDS: Leaf Error (node-71a961e2-6fe3-4556-baf6-977cea79e9dd-leaf-ag2-1.svc-71a961e2-6fe3-4556-baf6-977cea79e9dd:3306): Leaf Error (node-71a961e2-6fe3-4556-baf6-977cea79e9dd-leaf-ag1-0.svc-71a961e2-6fe3-4556-baf6-977cea79e9dd:3306): Row 1 was truncated; it contained more data than there were input columns
```
The reason I say "bogus" is that I have used these exact same buckets to ingest into Redshift, Aurora, and Athena. I have carefully counted the columns in the table, the columns in the text files, and the columns specified in the CREATE PIPELINE statement, and they all match. I have also specified the correct delimiters ('\t' and '\n').
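For reference, my statement has roughly this shape (bucket, table, and credential values are anonymized placeholders, not my real ones):

```sql
CREATE PIPELINE my_pipeline AS
LOAD DATA S3 'my-bucket/my-prefix/'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "...", "aws_secret_access_key": "..."}'
INTO TABLE my_table
-- Tab-delimited fields, newline-terminated rows, matching the source files:
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
```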
One thing that might help is disambiguating the error message: I can't tell whether it thinks the extra columns are in the source file, in the table schema, or in the pipeline spec.
Also, as an aside: the JSON configuration for pipelines mentions an "extended_null" field, but I don't see a way to include it in the CREATE PIPELINE command. I am going to need it.
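If it matters, my naive guess was to set it inside the CONFIG JSON, along these lines, but I have no idea whether that is actually valid syntax:

```sql
-- Pure guesswork on my part: I do NOT know that CONFIG accepts "extended_null";
-- this is just where I expected to be able to set it.
CREATE PIPELINE my_pipeline AS
LOAD DATA S3 'my-bucket/my-prefix/'
CONFIG '{"region": "us-east-1", "extended_null": true}'
CREDENTIALS '{"aws_access_key_id": "...", "aws_secret_access_key": "..."}'
INTO TABLE my_table
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
```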