Hi,
What is the exact syntax needed to create a pipeline to GCS?
I understand it should be supported (as of which exact version?).
I guess it is based on the following:
CREATE PIPELINE mypipeline AS
LOAD DATA S3 'my-bucket-name'
CONFIG '{"region": "us-west-1"}'
CREDENTIALS '{"aws_access_key_id": "your_access_key_id", "aws_secret_access_key": "your_secret_access_key", ["role_arn": "replace_with_your_role_arn"]}'
INTO TABLE my_table
Hi,
To clarify, I mean Google Cloud Storage: using a pipeline to load files from a bucket.
For backups, MemSQL already uses the S3 API.
Can the same be done with pipelines?
If yes - in what version, and with what exact syntax?
If no - are you planning to provide it? Roughly when?
This feature is going to be released in 7.1 (mid-Spring, I believe).
Is this still relevant to you? If yes, what version of MemSQL are you currently running and what flavor: cloud (Helios) or self-managed? If there still is an interest, I’ll ask around to see if we have spare cycles to backport it or maybe even do a custom test-build.
The docs are coming really soon – sorry for that. Meanwhile I’ll try to describe here.
Basically the syntax is similar to the S3 one:
CREATE PIPELINE library
AS LOAD DATA GCS 'my-bucket-name'
CREDENTIALS '{"access_id": "YOUR_ACCESS_KEY_ID", "secret_key": "YOUR_SECRET_ACCESS_KEY"}'
INTO TABLE `classic_books`
FIELDS TERMINATED BY ',';
The differences are GCS (which stands for Google Cloud Storage) instead of S3, and that in CREDENTIALS you specify access_id and secret_key fields, where the S3 syntax uses aws_access_key_id and aws_secret_access_key respectively.
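Not part of the official docs, but if you generate these statements programmatically, building the CREDENTIALS JSON with a JSON library sidesteps quoting and escaping mistakes (e.g. smart quotes pasted from a document). A minimal sketch, with function names of my own invention:

```python
import json

def s3_credentials(access_key_id, secret_access_key, role_arn=None):
    # Field names expected by the S3 pipeline syntax
    creds = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if role_arn is not None:  # role_arn is optional
        creds["role_arn"] = role_arn
    return json.dumps(creds)

def gcs_credentials(access_id, secret_key):
    # GCS pipelines rename the fields to access_id / secret_key
    return json.dumps({"access_id": access_id, "secret_key": secret_key})
```

The resulting string can be interpolated into the CREATE PIPELINE statement in place of the hand-written CREDENTIALS literal.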
access_id and secret_key are Google's HMAC keys. You can follow this guide to create them. As a quick sanity check: access_id is usually a 24- or 60-character alphanumeric string linked to the Google account, typically all uppercase and starting with "GOOG"; secret_key is usually a 40-character Base64-encoded string linked to a specific access_id.
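As a rough illustration, those rules of thumb can be turned into a heuristic check (heuristic only — the exact key formats are Google's to change):

```python
import base64

def looks_like_gcs_hmac_pair(access_id: str, secret_key: str) -> bool:
    """Heuristic sanity check for a GCS HMAC key pair.

    access_id: usually a 24- or 60-character uppercase alphanumeric
    string starting with "GOOG".
    secret_key: usually a 40-character Base64-encoded string.
    """
    id_ok = (
        len(access_id) in (24, 60)
        and access_id.isalnum()
        and access_id == access_id.upper()
        and access_id.startswith("GOOG")
    )
    if len(secret_key) != 40:
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(secret_key, validate=True)
    except ValueError:
        return False
    return id_ok
```

This only catches obvious copy-paste mistakes; a pair that passes can still be invalid or revoked.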