5.24.4. SnowFlake Bulk Upload through a S3 Bucket

<< Click to Display Table of Contents >>

Navigation:  5. Detailed description of the Actions > 5.24. TA - Data Engineering >

5.24.4. SnowFlake Bulk Upload through a S3 Bucket

 

Icon: clip1191

 

Function: snowflakeS3BulkUpload

 

Property window:

 

clip1194

 

Short description:

 

Upload a table to SnowFlake.

 

Long Description:

 

Inserting rows into Snowflake using the standard ODBC connector is horribly slow.

 

To get more speed, you can use the current   clip1192 Action. This action proceeds in 5 steps:

 

1.Using the user-provided .gel file (parameter P1), it creates on the local hard drive a temporary csv file with all the data to copy into SnowFlake.

2.It copies the temporary csv file from the local hard drive into a S3 bucket.

3.It copies the temporary csv file from the S3 bucket to a Staging Table (parameter P4) into SnowFlake.

4.It copies the content of the Staging Table to the Final Table (parameter P3) into SnowFlake. Before the copy starts, to avoid to get any Primary Keys “in double” inside the Final Table, it removes from the Final table all the rows that have the same Primary Keys as the Primary Keys inside the Staging table. The name(s) of the Primary key column(s) is(are) given in parameter P2.

5.Cleaning: It deletes the staging table, the csv file inside the S3 bucket and inside the local hard drive.

 

 

To be able to connect to your S3 bucket storage, you need to get from Amazon these 4 parameters:

 

Paramater P6: your “Bucket Name”

Paramater P7: your “Region”  

Paramater P8: your “Access Key ID”

Paramater P9: your “Secret Access Key”

 

Please refer to the section 5.23.6. for the procedure to get these 4 parameters.

 

 

To be able to connect to your SnowFlake database (to run the SQL command to import the data from the S3 bucket), you need to:

 

Download & install the ODBC drivers for RedShift. You can download these drivers from us here: http://download.timi.eu/ODBC/ODBC_drivers_SnowFlake/

 

Get the ODBC connection URL to connect to your Snowflake database: Enter this URL as the Anatella parameter P10. The instructions to get this ODBC connection URL is given here:
 

https://docs.snowflake.com/en/user-guide/odbc-parameters.html#configuration-parameters

 

Typically, this ODBC connection URL looks like this:

 
Driver={SnowflakeDSIIDriver}; Server=dy47396.eu-west-1.snowflakecomputing.com; Database=MYTESTDB; UID=<my_user>; PWD=<my_password>

 

 
To get the value of the “server” parameter inside the above ODBC connections string, you just need to look at the URL used to query your SnowFlake instance:

 

 

clip1193