New

Managing Stage files with Fusion SQL

Notebook


SingleStore Notebooks

Managing Stage files with Fusion SQL

Fusion SQL can be used to manage your workspace groups and workspaces, but it can also be used to upload, download, and manage files in your workspace group or starter workspace Stage. We'll show you how to work with files in Stage in this notebook.

Displaying the Stage Fusion SQL commands

The SHOW FUSION COMMANDS displays the commands that are handled by the Fusion engine. You can use the LIKE to filter the commands.

In [1]:

1

commands = %sql SHOW FUSION COMMANDS LIKE '%stage%'

2

for cmd in commands:

3

print(*cmd, '\n')

Creating a workspace group

We'll start by creating a workspace group. We can get a region in the US by using the SHOW REGIONS command and the random package.

In [2]:

1

import random

2

import secrets

3

4

us_regions = %sql SHOW REGIONS LIKE '%us%'

5

6

region_id = random.choice(us_regions).ID

7

region_id

In [3]:

1

wg_name = 'Fusion Notebook'

2

password = secrets.token_urlsafe(20) + '-x&'

In [4]:

1

%%sql

2

CREATE WORKSPACE GROUP '{{ wg_name }}'

3

IN REGION ID '{{ region_id }}' WITH PASSWORD '{{ password }}'

4

WITH FIREWALL RANGES '0.0.0.0/0'

Uploading and downloading Stage files

Uploading and downloading files to your Stage is easy with Fusion SQL. The commands are shown below.

DOWNLOAD STAGE FILE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ TO '<local-path>' ]
    [ OVERWRITE ] [ ENCODING '<encoding>' ];

UPLOAD FILE TO STAGE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] FROM '<local-path>' [ OVERWRITE ];

First we'll create a data file locally that we can work with.

In [5]:

1

%%writefile mydata.csv

2

name,age,height

3

Sue,27,65

4

Joe,32,70

5

Max,44,69

6

Ann,33,64

We can now upload our data file to our workspace group Stage.

In [6]:

1

%%sql

2

UPLOAD FILE TO STAGE 'stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv'

We can list the files in a Stage with the SHOW STAGE FILES command.

In [7]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}'

Downloading the file is just as easy as uploading.

In [8]:

1

%%sql

2

DOWNLOAD STAGE FILE 'stats.csv' IN '{{ wg_name }}' TO 'stats.csv' OVERWRITE

In [9]:

1

!cat stats.csv

If you just want to display the contents of the Stage file without saving it to a local file, you simply leave the TO option off the DOWNLOAD STAGE FILE.

In [10]:

1

%%sql

2

DOWNLOAD STAGE FILE 'stats.csv' IN '{{ wg_name }}' ENCODING 'utf-8'

Creating folders

Up to this point we have just worked with files at the root of our Stage. We can use Fusion SQL to create folder structures as well. This is done with the CREATE STAGE FOLDER command.

CREATE STAGE FOLDER '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ OVERWRITE ];

The following code will create this folder structure:

project-1/
project-1/data/
project-2/
project-2/data/

In [11]:

1

for name in ['project-1', 'project-1/data', 'project-2', 'project-2/data']:

2

%sql CREATE STAGE FOLDER '{{ name }}' IN '{{ wg_name }}';

In [12]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE

Now that we have a folder structure we can put files into those folders.

In [13]:

1

%%sql

2

UPLOAD FILE TO STAGE 'project-1/data/stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv';

3

UPLOAD FILE TO STAGE 'project-2/data/stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv';

Now when we do a recursive listing of our Stage, we'll see the newly created files.

In [14]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE

We can list the files at a specific path as well.

In [15]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}' AT 'project-2/data'

Loading data from Stage

We are going to load data from a Stage into a database table. For this, we need to have a workspace and a database.

In [16]:

1

%%sql

2

CREATE WORKSPACE 'stage-loader' IN GROUP '{{ wg_name }}' WITH SIZE 'S-00' WAIT ON ACTIVE

In [17]:

1

%%sql

2

SHOW WORKSPACES IN GROUP 'Fusion Notebook'

Action Required

Make sure to select the stage-loader workspace from the drop-down menu at the top of this notebook.

In [18]:

1

%%sql

2

CREATE DATABASE IF NOT EXISTS stage_loader

Action Required

Make sure to select the stage_loader database from the drop-down menu at the top of this notebook. It updates the connection_url to connect to that database.

In [19]:

1

%%sql

2

DROP TABLE IF EXISTS stats;

3

CREATE TABLE stats (

4

name TEXT,

5

age INT,

6

height INT

7

);

Load the data from the Stage using a pipeline.

In [20]:

1

%%sql

2

CREATE PIPELINE IF NOT EXISTS stage_test

3

AS LOAD DATA STAGE 'project-2/data/stats.csv'

4

BATCH_INTERVAL 2500

5

SKIP DUPLICATE KEY ERRORS

6

INTO TABLE stats

7

IGNORE 1 LINES

8

FIELDS TERMINATED BY ','

9

LINES TERMINATED BY '\n'

10

FORMAT CSV;

11

START PIPELINE stage_test FOREGROUND;

12

DROP PIPELINE stage_test;

We can now query the table and select the output into a Stage. Note that the GROUP BY 1 is used here to combine the outputs from all of the database partitions into a single file. If you don't use that, you'll get multiple output files, each with a portion of the result set.

In [21]:

1

%%sql

2

SELECT * FROM stats GROUP BY 1 INTO STAGE 'project-3/data/stats.csv'

3

FIELDS TERMINATED BY ','

4

LINES TERMINATED BY '\n'

In [22]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}' AT 'project-3' RECURSIVE

In [23]:

1

%%sql

2

DOWNLOAD STAGE FILE 'project-3/data/stats.csv' ENCODING 'utf-8'

Deleting Stage files and folders

Files and folders can be deleted from a workspace Stage as well. This is done with the DROP STAGE FILE and DROP STAGE FOLDER commands.

DROP STAGE FILE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ];

DROP STAGE FOLDER '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ RECURSIVE ];

Let's delete the stats.csv file at the root of our Stage.

In [24]:

1

%%sql

2

DROP STAGE FILE 'stats.csv' IN '{{ wg_name }}'

In [25]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}'

Now let's delete the project-2 folder including all of the files in it.

In [26]:

1

%%sql

2

DROP STAGE FOLDER 'project-2' IN '{{ wg_name }}' RECURSIVE

In [27]:

1

%%sql

2

SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE

In [28]:

1

%%sql

2

DROP STAGE FOLDER 'project-1' IN '{{ wg_name }}' RECURSIVE;

3

DROP STAGE FOLDER 'project-3' IN '{{ wg_name }}' RECURSIVE;

Conclusion

We have demonstrated how to create and delete files and folders in a workspace group Stage using Fusion SQL. Note that it also supports managing Stage for starter workspaces. It is also possible to work with Stage files using the SingleStoreDB Python SDK, see the API documentation for more details.

Details


About this Template

Learn how to manage your Stage files in SingleStoreDB Cloud using Fusion SQL.

Notebook Icon

This Notebook can be run in Standard and Enterprise deployments.

Tags

starterfusionpython

License

This Notebook has been released under the Apache 2.0 open source license.

See Notebook in action

Launch this notebook in SingleStore and start executing queries instantly.