I am trying to load a few hundred tables from MySQL into SingleStore DB using the LOAD DATA with CSVs option - LOAD DATA · SingleStore Documentation.
We first export these tables from MySQL into CSVs, place those CSVs on an NFS shared space, and then import them with the LOAD DATA option. However, this is taking too long for bulky files.
We are looking at the following option, but it says it is not recommended for use in production - Load Data from MySQL · SingleStore Documentation
Could you please suggest an alternative that is safe to use in production and provides good performance? mysqldump does not seem to be a fit for us.
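For reference, the per-file load we run today is essentially the following (table name, path, and delimiters are placeholders for our real setup):

```sql
-- Minimal sketch of the per-file load; names and paths are placeholders.
LOAD DATA INFILE '/nfs/exports/orders_000.csv'
INTO TABLE orders
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the header row written by the export
```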
For really huge files, consider extracting logical partitions of the file (say by date range) into separate CSV files, using concurrent external application processes or threads. Then use SingleStore PIPELINEs to load them to SingleStore.
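A minimal sketch of what that could look like with a filesystem pipeline, assuming the CSVs for one table land under a single NFS directory (pipeline, table, and path names are illustrative, not from your setup):

```sql
-- One pipeline per table; it picks up every CSV matching the glob and loads files in parallel.
CREATE PIPELINE orders_pipeline AS
LOAD DATA FS '/nfs/exports/orders/*.csv'
INTO TABLE orders
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip header rows, if your export writes them

START PIPELINE orders_pipeline;

-- Check per-file progress and errors while it runs.
SELECT * FROM information_schema.PIPELINES_FILES
WHERE PIPELINE_NAME = 'orders_pipeline';
```

Note that a filesystem pipeline needs the NFS path to be reachable from every node in the cluster.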
We tried the chunking approach based on the primary key in MySQL. My team split the huge tables into CSVs of 1 million records each, but exporting the MySQL tables to CSV is itself taking a huge amount of time.
Since the shard key is unique, we cannot use it for logical partitioning. A date range is also a difficult choice: because there are 100 tables, each table would be split into differently sized CSVs.
What size would you suggest for each CSV? Maybe 50K records per CSV?
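For context, each chunk export on the MySQL side looks roughly like this (table, columns, ID bounds, and path are placeholders for our actual ones):

```sql
-- Export one primary-key chunk from MySQL to CSV on the NFS share; bounds and names are placeholders.
-- INTO OUTFILE writes on the MySQL server host and is subject to secure_file_priv,
-- so the NFS path must be mounted and writable there.
SELECT *
FROM orders
WHERE id >= 1 AND id < 1000001
INTO OUTFILE '/nfs/exports/orders/orders_000.csv'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```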
You don’t have to split up all the tables using the same logical partitioning scheme. The logical partitioning is only to give you a convenient way to divide up the data. SingleStore should be able to handle the 1M row CSV files fine, and 50K as well, with pipelines or LOAD DATA. If 50K works for your export, it sounds like a good choice.
Thank you sir for answering, you made my day.