Pipeline support for ingesting ORC files stored in Hadoop tables into SingleStoreDB

I need to clarify whether SingleStoreDB supports ingesting ORC files from an HDFS environment. Plain CSV files in HDFS can be ingested into SingleStoreDB, but for ORC-compressed data we get an error when running pipelines.

When we check the supported formats documentation, ORC is not listed.

Can you clarify whether ORC is supported?

Supported File Formats

Pipelines support the following file formats:

  • JSON
  • Avro
  • Parquet
  • CSV

We don’t support pipelines from ORC. If you need to load those to SingleStore, consider converting them to another format, such as Parquet, then loading those using pipelines.
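If the ORC data is exposed as a Hive table, one way to do that conversion is a CREATE TABLE ... AS SELECT in Hive. This is only a sketch under assumptions: the table names sales_orc and sales_parquet are hypothetical, and it assumes you have Hive (or a compatible engine) available on the cluster.

```sql
-- HiveQL sketch: copy an ORC-backed table into a Parquet-backed one.
-- sales_orc and sales_parquet are hypothetical names for illustration.
CREATE TABLE sales_parquet
STORED AS PARQUET
AS
SELECT *
FROM sales_orc;

-- Shows the HDFS location of the new table's Parquet files,
-- which is the path a pipeline would later read from.
DESCRIBE FORMATTED sales_parquet;
```

The Parquet files end up in the new table's HDFS directory, so the original ORC table stays untouched.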

Dear hanson, thank you for your reply. We store this ORC data in a Hadoop environment (ORC-compressed tables) as part of our big-data setup. I assume those tables are conceptually the same as files, though. Does that make sense?

Best regards,

I think so. You have ORC files in HDFS and you want to load them. If you convert them to Parquet (also stored in HDFS) then you can load them with

CREATE PIPELINE … AS LOAD DATA HDFS … FORMAT PARQUET;

See here for details.
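For illustration, a fuller version of that statement might look like the sketch below. Everything specific in it is an assumption: the table sales, its columns, the pipeline name sales_pipeline, and the HDFS host, port, and path are all hypothetical placeholders, and the column mapping has to match the field names actually present in your Parquet files.

```sql
-- Hypothetical destination table in SingleStore.
CREATE TABLE sales (
    id BIGINT,
    amount DOUBLE
);

-- Pipeline sketch: read Parquet files from an HDFS directory.
-- Host, port, and path are placeholders, not real values.
CREATE PIPELINE sales_pipeline AS
LOAD DATA HDFS 'hdfs://namenode-host:8020/path/to/sales_parquet/'
INTO TABLE sales
(id <- id,
 amount <- amount)
FORMAT PARQUET;

-- Check that the pipeline can read and parse a row before starting it.
TEST PIPELINE sales_pipeline LIMIT 1;

START PIPELINE sales_pipeline;
```

Running TEST PIPELINE first is a cheap way to catch a wrong path or a mismatched field name before any data is written to the table.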


Thanks for answering, you saved my day.
