Introduction
In today’s data-driven world, businesses constantly strive to make more informed decisions. A retail company, aiming to analyze its sales data for customer behavior insights, product performance, and sales trends, faces a common challenge: how to efficiently manage and analyze large volumes of data. This is where Snowflake, a cloud data platform, shines. In this blog post, we will explore how Snowflake can revolutionize data handling, using our retail company as a guiding example.
Getting Started with Snowflake
Before diving into the technicalities, let’s set up our environment. Snowflake offers a straightforward setup process, and you can start with their free trial.
Navigating the Snowflake UI
Once you’ve set up your account, you’ll enter the Snowflake UI. This web interface is where you’ll manage databases, execute queries, and monitor performance. Familiarize yourself with the main areas: Worksheets for SQL queries, Databases for managing schemas and tables, and Warehouses which are your computational resources.
Creating Database and Schema
Our retail company needs a structured way to store its data. We start by creating a dedicated database and schema.
-- Create a new database for our retail data
CREATE DATABASE IF NOT EXISTS retail_data;
-- Create a schema for sales information
CREATE SCHEMA IF NOT EXISTS retail_data.sales;
The retail_data
database will house all our data, while the sales
schema will specifically contain tables related to sales transactions.
Creating Tables
Next, we design tables to store essential data: sales transactions, customer information, and product details.
-- Creating a table for sales transactions
CREATE TABLE retail_data.sales.transactions (
transaction_id INT AUTOINCREMENT,
product_id INT,
customer_id INT,
quantity_sold INT,
sale_date DATE,
sale_amount FLOAT
);
-- Creating a table for customer information
CREATE TABLE retail_data.sales.customers (
customer_id INT AUTOINCREMENT,
first_name STRING,
last_name STRING,
email STRING,
join_date DATE
);
-- Creating a table for product details
CREATE TABLE retail_data.sales.products (
product_id INT AUTOINCREMENT,
product_name STRING,
product_category STRING,
price FLOAT
);
We use AUTOINCREMENT
for primary keys, ensuring unique identifiers for each record.
Role and User Management
In any organization, different departments need different levels of access. Snowflake’s role-based access control is perfect for this.
-- Create roles for different departments
CREATE ROLE sales_team;
CREATE ROLE marketing_team;
CREATE ROLE it_department;
-- Granting usage privileges on the database to roles
GRANT USAGE ON DATABASE retail_data TO ROLE sales_team;
GRANT USAGE ON DATABASE retail_data TO ROLE marketing_team;
GRANT USAGE ON DATABASE retail_data TO ROLE it_department;
-- Granting specific privileges to each role
GRANT SELECT ON retail_data.sales.transactions TO ROLE sales_team;
GRANT SELECT ON retail_data.sales.customers TO ROLE marketing_team;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA retail_data.sales TO ROLE it_department;
These roles reflect our organizational structure, ensuring that each team accesses only what they need.
Inserting Sample Data
Before we can run queries, we need to populate our tables with sample data. Let’s insert some records into our transactions
, customers
, and products
tables.
-- Inserting data into the products table
INSERT INTO retail_data.sales.products (product_name, product_category, price)
VALUES
('Winter Jacket', 'Apparel', 99.99),
('Running Shoes', 'Footwear', 79.99),
('Backpack', 'Accessories', 49.99);
-- Inserting data into the customers table
INSERT INTO retail_data.sales.customers (first_name, last_name, email, join_date)
VALUES
('John', 'Doe', 'john.doe@email.com', '2021-01-08'),
('Jane', 'Smith', 'jane.smith@email.com', '2021-02-15');
-- Inserting data into the transactions table
INSERT INTO retail_data.sales.transactions (product_id, customer_id, quantity_sold, sale_date, sale_amount)
VALUES
(1, 1, 2, '2021-03-10', 199.98),
(2, 1, 1, '2021-03-15', 79.99),
(3, 2, 1, '2021-03-20', 49.99);
Simple and Join Queries
Now that we have data, let’s run some queries.
Simple Query
A straightforward query might involve fetching all transactions:
SELECT * FROM retail_data.sales.transactions;
Join Query
To get more meaningful insights, let’s combine data from different tables. Suppose we want to see the details of each product sold in each transaction:
SELECT
t.transaction_id,
p.product_name,
t.quantity_sold,
t.sale_date,
t.sale_amount
FROM
retail_data.sales.transactions t
JOIN
retail_data.sales.products p ON t.product_id = p.product_id;
This query provides a more comprehensive view by combining the transactions with product information.
Writing Stored Procedures
Stored Procedures (SPs) are essential for encapsulating logic in the database. Let’s create an SP for a common retail task: calculating the total sales for a particular day.
Stored Procedure Example
CREATE OR REPLACE PROCEDURE total_sales_on_date(target_date DATE)
RETURNS FLOAT
LANGUAGE SQL
AS
$$
DECLARE
total_sales FLOAT;
BEGIN
SELECT SUM(sale_amount) INTO total_sales
FROM retail_data.sales.transactions
WHERE sale_date = target_date;
RETURN total_sales;
END;
$$;
-- To execute the stored procedure for a specific date
CALL total_sales_on_date('2021-03-10');
This SP calculates the total sales for a given date, an operation that could be part of a larger daily sales report.
Advanced Stored Procedures
Let’s create a more complex stored procedure that demonstrates conditional logic and error handling. We’ll write an SP that updates the price of a product and logs the change in an audit table.
First, we need an audit table:
CREATE TABLE retail_data.sales.price_audit (
audit_id INT AUTOINCREMENT,
product_id INT,
old_price FLOAT,
new_price FLOAT,
update_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Now, let’s create the stored procedure:
CREATE OR REPLACE PROCEDURE update_product_price(product_id INT, new_price FLOAT)
RETURNS OBJECT
LANGUAGE SQL
AS
$$
DECLARE
product_not_found EXCEPTION (-20001, 'Product not found.');
update_error EXCEPTION (-20002, 'Error updating product price.');
current_price FLOAT;
BEGIN
-- Check if the product exists
IF ((SELECT COUNT(*) FROM retail_data.sales.products WHERE product_id = :product_id) = 0) THEN
RAISE product_not_found;
END IF;
-- Get the current price
SELECT price INTO current_price FROM retail_data.sales.products WHERE product_id = :product_id;
-- Update the product's price
UPDATE retail_data.sales.products SET price = :new_price WHERE product_id = :product_id;
-- Insert a record into the price_audit table
INSERT INTO retail_data.sales.price_audit (product_id, old_price, new_price)
VALUES (:product_id, :current_price, :new_price);
RETURN OBJECT_CONSTRUCT('Status', 'Success', 'Message', 'Price updated successfully');
EXCEPTION
WHEN product_not_found THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Product not found');
WHEN update_error THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Update error', 'SQLCODE', SQLCODE, 'SQLERRM', SQLERRM);
WHEN OTHER THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Other error', 'SQLCODE', SQLCODE, 'SQLERRM', SQLERRM);
END;
$$;
-- Test the stored procedure
CALL update_product_price(2, 200.50);
This SP includes error handling and transaction logging, making it a robust solution for real-world applications.
User-Defined Functions (UDFs)
UDFs allow you to extend Snowflake’s SQL capabilities. Let’s create a UDF that calculates the total revenue from a customer, including a discount based on their loyalty.
CREATE OR REPLACE FUNCTION calculate_revenue_with_discount(customer_id INT, discount_rate FLOAT)
RETURNS FLOAT
LANGUAGE SQL
AS
$$
SELECT SUM(sale_amount) * (1 - discount_rate) as discount_revenue
FROM retail_data.sales.transactions
WHERE customer_id = customer_id
$$;
-- Test the function
SELECT calculate_revenue_with_discount(1, 0.1) as disount_revenue;
This UDF demonstrates how custom functions can encapsulate complex logic for reuse in various queries.
Please find below complete Snowflake Sql code used in this tutorial
-- Create a new database for our retail data
CREATE DATABASE IF NOT EXISTS retail_data;
-- Create a schema for sales information
CREATE SCHEMA IF NOT EXISTS retail_data.sales;
-- Creating a table for sales transactions
CREATE TABLE retail_data.sales.transactions (
transaction_id INT AUTOINCREMENT,
product_id INT,
customer_id INT,
quantity_sold INT,
sale_date DATE,
sale_amount FLOAT
);
-- Creating a table for customer information
CREATE TABLE retail_data.sales.customers (
customer_id INT AUTOINCREMENT,
first_name STRING,
last_name STRING,
email STRING,
join_date DATE
);
-- Creating a table for product details
CREATE TABLE retail_data.sales.products (
product_id INT AUTOINCREMENT,
product_name STRING,
product_category STRING,
price FLOAT
);
CREATE TABLE retail_data.sales.price_audit (
audit_id INT AUTOINCREMENT,
product_id INT,
old_price FLOAT,
new_price FLOAT,
update_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create roles for different departments
CREATE ROLE sales_team;
CREATE ROLE marketing_team;
CREATE ROLE it_department;
-- Granting usage privileges on the database to roles
GRANT USAGE ON DATABASE retail_data TO ROLE sales_team;
GRANT USAGE ON DATABASE retail_data TO ROLE marketing_team;
GRANT USAGE ON DATABASE retail_data TO ROLE it_department;
-- Granting specific privileges to each role
GRANT SELECT ON retail_data.sales.transactions TO ROLE sales_team;
GRANT SELECT ON retail_data.sales.customers TO ROLE marketing_team;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA retail_data.sales TO ROLE it_department;
-- Inserting data into the products table
INSERT INTO retail_data.sales.products (product_name, product_category, price)
VALUES
('Winter Jacket', 'Apparel', 99.99),
('Running Shoes', 'Footwear', 79.99),
('Backpack', 'Accessories', 49.99);
-- Inserting data into the customers table
INSERT INTO retail_data.sales.customers (first_name, last_name, email, join_date)
VALUES
('John', 'Doe', 'john.doe@email.com', '2021-01-08'),
('Jane', 'Smith', 'jane.smith@email.com', '2021-02-15');
-- Inserting data into the transactions table
INSERT INTO retail_data.sales.transactions (product_id, customer_id, quantity_sold, sale_date, sale_amount)
VALUES
(1, 1, 2, '2021-03-10', 199.98),
(2, 1, 1, '2021-03-15', 79.99),
(3, 2, 1, '2021-03-20', 49.99);
SELECT * FROM retail_data.sales.transactions;
SELECT
t.transaction_id,
p.product_name,
t.quantity_sold,
t.sale_date,
t.sale_amount
FROM
retail_data.sales.transactions t
JOIN
retail_data.sales.products p ON t.product_id = p.product_id;
CREATE OR REPLACE PROCEDURE total_sales_on_date(TARGET_DATE DATE)
RETURNS FLOAT
LANGUAGE SQL
AS
$$
DECLARE
total_sales FLOAT;
BEGIN
SELECT SUM(sale_amount) INTO total_sales
FROM retail_data.sales.transactions
WHERE sale_date = :TARGET_DATE;
RETURN total_sales;
END;
$$;
-- To execute the stored procedure for a specific date
CALL total_sales_on_date('2021-03-10');
CREATE OR REPLACE PROCEDURE update_product_price(product_id INT, new_price FLOAT)
RETURNS OBJECT
LANGUAGE SQL
AS
$$
DECLARE
product_not_found EXCEPTION (-20001, 'Product not found.');
update_error EXCEPTION (-20002, 'Error updating product price.');
current_price FLOAT;
BEGIN
-- Check if the product exists
IF ((SELECT COUNT(*) FROM retail_data.sales.products WHERE product_id = :product_id) = 0) THEN
RAISE product_not_found;
END IF;
-- Get the current price
SELECT price INTO current_price FROM retail_data.sales.products WHERE product_id = :product_id;
-- Update the product's price
UPDATE retail_data.sales.products SET price = :new_price WHERE product_id = :product_id;
-- Insert a record into the price_audit table
INSERT INTO retail_data.sales.price_audit (product_id, old_price, new_price)
VALUES (:product_id, :current_price, :new_price);
RETURN OBJECT_CONSTRUCT('Status', 'Success', 'Message', 'Price updated successfully');
EXCEPTION
WHEN product_not_found THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Product not found');
WHEN update_error THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Update error', 'SQLCODE', SQLCODE, 'SQLERRM', SQLERRM);
WHEN OTHER THEN
RETURN OBJECT_CONSTRUCT('Error type', 'Other error', 'SQLCODE', SQLCODE, 'SQLERRM', SQLERRM);
END;
$$;
-- Test the stored procedure
CALL update_product_price(2, 200.50);
select * from retail_data.sales.price_audit;
CREATE OR REPLACE FUNCTION calculate_revenue_with_discount(customer_id INT, discount_rate FLOAT)
RETURNS FLOAT
LANGUAGE SQL
AS
$$
SELECT SUM(sale_amount) * (1 - discount_rate) as discount_revenue
FROM retail_data.sales.transactions
WHERE customer_id = customer_id
$$;
-- Test the function
SELECT calculate_revenue_with_discount(1, 0.1) as disount_revenue;
Conclusion
Throughout this blog post, we’ve explored the foundational elements of working with Snowflake SQL, from setting up databases and tables to writing complex stored procedures and UDFs. Our journey through a real-world retail scenario has highlighted the power and flexibility of Snowflake in handling diverse data warehousing needs.
Snowflake’s robust features, such as stored procedures, user-defined functions, and comprehensive data handling capabilities, make it an ideal choice for businesses looking to leverage their data for strategic decision-making.
In conclusion, this blog post provides a thorough overview of Snowflake SQL, guided by a practical retail example. Each section builds upon the previous ones, creating a comprehensive guide that not only instructs but also demonstrates real-world applicability. To enhance your blog, consider adding visual aids, further examples, and possibly a Q&A section addressing common issues or concerns.