I have a pipeline that ingests compressed CSV files from AWS S3, and it fails with this error:
Leaf Error (127.0.0.1:3307): Cannot extract data for pipeline. AWS request failed:
NotFound: Not Found
status code: 404, request id: BWQ4W2WKA1DP490V, host id: taGqlq5FvnFlwOhu3vM0sMZtu/ZliPLWK+RlSBkTJtCHqsSPLu6lmRGRXs2YB7TYnXK03T2jsC8=
All files have the same structure.
- What is the best way to continue processing the files that have not been processed yet? Right now the only thing that helps is to drop the pipeline, delete the already processed files, and then recreate the pipeline.
- What is the reason for this behavior (e.g., can the pipeline try to pull a file while it is still being uploaded to S3)?
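For context, the per-file state and the recorded errors can be inspected through information_schema; a sketch, assuming a pipeline named events_pipeline:

```sql
-- Per-file state as tracked by the pipeline; FILE_STATE is typically
-- 'Loaded', 'Unloaded' (not yet processed), or 'Skipped'.
SELECT file_name, file_state
FROM information_schema.pipelines_files
WHERE pipeline_name = 'events_pipeline';

-- Errors recorded by the pipeline, e.g. the 404 extraction failure above.
SELECT error_kind, error_message
FROM information_schema.pipelines_errors
WHERE pipeline_name = 'events_pipeline';
```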
Reason for the issue: the source system sometimes sends test files with a different structure to the same bucket.
In my case it is Singular that sends such files, fortunately with a different extension, so they can be excluded by extension: ‘singular-s3-exports-files/events/*.csv.gz’.
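Concretely, the fixed pipeline definition looks roughly like this (bucket name, table name, region, and credentials are placeholders; the important part is the *.csv.gz glob, and as far as I know pipelines decompress .gz files automatically based on the extension):

```sql
CREATE PIPELINE events_pipeline AS
LOAD DATA S3 'my-bucket/singular-s3-exports-files/events/*.csv.gz'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}'
INTO TABLE events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```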
For the question on retrying unprocessed files: currently the pipeline has only one way to reprocess a file, which is to drop the pipeline and recreate it. I have created an internal JIRA ticket for this failure scenario so that we can improve the user experience.
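For reference, that workaround looks like this (same placeholder names as in the sketch above; note that the recreated pipeline will reload any matching files still present in the bucket, so remove or move the already loaded ones first to avoid duplicates):

```sql
DROP PIPELINE events_pipeline;

-- Recreate with the same definition as before.
CREATE PIPELINE events_pipeline AS
LOAD DATA S3 'my-bucket/singular-s3-exports-files/events/*.csv.gz'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}'
INTO TABLE events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

START PIPELINE events_pipeline;
```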
Can you help me understand the second problem, the one about the pipeline's behavior?
Hi @mkumar,
About this issue - I found the reason and described it above to help colleagues.
About your comment - I’m afraid I don’t fully understand what you mean.
Sure, let me clarify. I am trying to understand the 2nd bullet in the original forum post: is it a question about the behavior of the pipeline if it tries to read a file while that file is still being uploaded to S3?