Pipeline creation in MemSQL (Studio)

mangesh.j · February 11, 2021, 6:36am

Hi Team,

I am new to MemSQL. I have just created an account and can access only MemSQL Studio. I want to create a simple pipeline to ingest data into a MemSQL database. I have created a user named admin.
Question: Where do I need to write the CREATE PIPELINE statement in MemSQL? Can I write it in the MemSQL Studio SQL editor? I want to load a simple CSV/JSON file from AWS S3 (I have already uploaded a JSON file to an S3 bucket).
Please suggest where I can create a pipeline in MemSQL and how I can grant privileges to my admin user in SingleStore Studio.
Note: Apart from MemSQL Studio, I do not have any MemSQL tool or external database to access, and I am not permitted to install any MemSQL tool on my machine. I can only access MemSQL Studio.
I do have access to the AWS S3 environment.
I would appreciate it if anyone could guide me.

Regards,
Mangesh

hanson · February 12, 2021, 1:17am

Sure, you can create pipelines with SQL from Studio and start/stop them and otherwise control them from SQL as well. You will need permission to do that from your DBA. The permissions are given in the permissions matrix documentation:

If you have access to an S3 bucket, you can create a pipeline that reads from it.
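
As a rough sketch of what that looks like in the Studio SQL editor: the `admin` user is the one from the question, while `my_db`, `my_pipeline`, the `'%'` host, and the exact set of privileges are placeholders and assumptions, so check them against the permissions matrix before use.

```sql
-- Run by a user who is allowed to grant privileges (for example the DBA).
-- 'my_db' and the '%' host are placeholders, not values from this thread.
GRANT CREATE PIPELINE, ALTER PIPELINE, START PIPELINE, DROP PIPELINE, SHOW PIPELINE
    ON my_db.* TO 'admin'@'%';

-- Once a pipeline exists, it is controlled with plain SQL as well.
USE my_db;
SHOW PIPELINES;
START PIPELINE my_pipeline;  -- 'my_pipeline' is a hypothetical name
STOP PIPELINE my_pipeline;
```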

mangesh.j · February 15, 2021, 10:20am

Thanks Hanson. I was able to create a pipeline in the MemSQL editor. I placed just one CSV file in the AWS S3 bucket, and when I ran the pipeline in MemSQL I got an error. I cannot figure out whether it is a roles/permissions issue on the AWS S3 side or an issue with the leaf node in MemSQL.

Error after executing the pipeline:

Leaf Error (node-d9da318f-a85d-4ac9-a654-453833506d70-leaf-ag1-0.svc-d9da318f-a85d-4ac9-a654-453833506d70:3306):
Cannot extract data for pipeline. AWS request failed: NotFound: Not Found status code: 404, request id: B41252724C41C2E3,
host id: 7Zs3TpbxnOgVaWAhqcACnSpIZX0KPC3mJ63SPjNoZbTT+p5qg80ICyi3EJnEFMiVeWogILlrpQg=Extract

And the pipeline I created in MemSQL:

CREATE OR REPLACE PIPELINE ingest_stream.Test_Pipeline_library
AS LOAD DATA S3 's3snowflakebucket/testfolder/books.txt'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<redacted>", "aws_secret_access_key": "<redacted>"}'
INTO TABLE classic_books
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
NULL DEFINED BY '' OPTIONALLY ENCLOSED;

Could you please advise on this?

Regards,

Mangesh
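
For an extraction error like the one above, a few statements can help narrow things down from the SQL editor. This is only a sketch and assumes the database and pipeline names from the post above.

```sql
USE ingest_stream;

-- Pull a small sample from the source without writing to the target table.
TEST PIPELINE Test_Pipeline_library LIMIT 1;

-- Run a single batch in the foreground so any error is returned directly.
START PIPELINE Test_Pipeline_library FOREGROUND LIMIT 1 BATCHES;

-- Review the errors recorded for this pipeline.
SELECT *
FROM information_schema.PIPELINES_ERRORS
WHERE DATABASE_NAME = 'ingest_stream'
  AND PIPELINE_NAME = 'Test_Pipeline_library';
```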

mangesh.j · February 15, 2021, 12:22pm

Hi Hanson,

I was finally able to resolve this error: Leaf Error (node-d9da318f-a85d-4ac9-a654-453833506d70-leaf-ag1-0.svc-d9da318f-a85d-4ac9-a654-453833506d70:3306):
Cannot extract data for pipeline. AWS request failed: NotFound: Not Found status code: 404, request id: B41252724C41C2E3,
I dropped the pipeline and re-created it. Now I am able to load the AWS S3 data through the MemSQL pipeline.
My guess is that the earlier pipeline was pointing to the leaf node rather than the aggregator node,
because I was able to load the data after re-creating the pipeline without changing any pipeline code or AWS permissions.
I hope the MemSQL pipeline also stores metadata about recently loaded files, so that the same file cannot be loaded again and only new S3 files are loaded into the MemSQL table. Please share your thoughts.

Regards,

Mangesh
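
On the last question: pipelines do keep per-file metadata, and files already marked as loaded are skipped in later batches. A minimal sketch for inspecting it, assuming the database and pipeline names used in this thread:

```sql
-- One row per file the pipeline has seen, with its load state.
SELECT FILE_NAME, FILE_SIZE, FILE_STATE
FROM information_schema.PIPELINES_FILES
WHERE DATABASE_NAME = 'ingest_stream'
  AND PIPELINE_NAME = 'Test_Pipeline_library';

-- To make the pipeline forget one file so that it is picked up again,
-- use the FILE_NAME value returned by the query above (placeholder here):
-- ALTER PIPELINE Test_Pipeline_library DROP FILE '<FILE_NAME value>';
```

As far as I know, this metadata belongs to the pipeline itself, so dropping and re-creating a pipeline discards it and previously loaded files may be loaded again.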