Unleashing the Power of Snowflake’s Advanced SQL Functions

Tahar Chanane
4 min readMay 21, 2023

Navigating the world of data analysis can sometimes feel like traversing a complex labyrinth. With the right tools, however, we can turn this labyrinth into a highway of insights. One such powerful tool at our disposal is SQL (Structured Query Language) functions.

In this article, we will take an in-depth look at Snowflake, a leading cloud data platform, and reveal how its advanced SQL functions can dramatically enhance your data manipulation skills.

Window Functions

In the realm of SQL, window functions are like magical windows that offer a glimpse into your data universe. Unlike aggregate functions that compress your data into a single row, window functions maintain the individual identity of each row while performing computations across a related set. Some of Snowflake’s most commonly used window functions include RANK(), ROW_NUMBER(), and LAG().

Imagine wanting to know the average sales for each product in your database without losing the granular sales data of each item. Snowflake’s window functions make this a breeze!

SELECT product, sales, AVG(sales) OVER (PARTITION BY product) as avg_sales
FROM sales_table;

This query returns the average sales for each product without discarding the granular sales data for individual items.

SELECT product, sales, RANK() OVER (ORDER BY sales DESC) as rank
FROM sales_table;

Here, we’re ranking products based on their sales. The product with the highest sales gets a rank of 1.

SELECT product, sales, LAG(sales) OVER (ORDER BY sales) as previous_sales
FROM sales_table;

This query uses LAG() to fetch the sales of the previous row. This allows us to compare a product’s sales with the sales of the product in the previous row.

Recursive CTEs (Common Table Expressions)

Recursive CTEs (Common Table Expressions) are like the secret passages within the castle of SQL. They can traverse hierarchical or recursive data models, revealing relationships hidden within your data, such as organizational hierarchies or product categories. Snowflake’s Recursive CTEs can transform a complex hierarchy into a simple, understandable data set. They allow you to unlock insights that would be hard to glean from a flat table.

WITH RECURSIVE organization AS (
SELECT employee_id, manager_id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.manager_id FROM employees e, organization o WHERE e.manager_id = o.employee_id
)
SELECT * FROM organization;

This recursive CTE generates an organizational hierarchy. It starts from the employees who have no managers (likely the top-level executives) and then recursively includes employees who report to the managers already in the processed result set.

Table Functions

Snowflake’s table functions turn the output of a function into a table, enabling SQL queries to be run on function results. This feature is a game-changer for functions that return complex or variable data types, such as OBJECT_INSERT() or ARRAY_CAT().

Table functions can help you transform and manipulate complex data types with ease, offering the simplicity of handling data as if it were in a standard SQL table.

SELECT value:id::string AS user_id, value:email::string AS user_email
FROM TABLE(FLATTEN(input => PARSE_JSON('[{"id": 1, "email": "user1@example.com"}, {"id": 2, "email": "user2@example.com"}]')));

This script demonstrates the power of table functions. It parses a JSON string into a table format, allowing you to query it as you would with a regular SQL table.

SELECT OBJECT_INSERT('{"a":1,"b":2}', 'c', 3) AS new_json;

Here, we’re using the OBJECT_INSERT function to add a new key-value pair to an existing JSON object.

SELECT ARRAY_CAT(ARRAY_CONSTRUCT(1, 2, 3), ARRAY_CONSTRUCT(4, 5, 6)) AS new_array;

This code snippet concatenates two arrays using the ARRAY_CAT function.

Matching Regular Expressions with REGEXP

Snowflake offers a tool that can decipher patterns within your data — REGEXP (Regular Expressions). Whether you’re hunting for a specific text string or searching for a pattern, REGEXP functions offer a flexible way of finding your target.

The power of REGEXP functions in Snowflake lies in their flexibility. They allow you to extract insights from text data in ways that would be impossible with standard string functions.

SELECT email FROM user_table WHERE REGEXP_LIKE(email, '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$');

In this example, we’re using REGEXP_LIKE to filter out valid email addresses from a user table.

SELECT REGEXP_REPLACE('123-456-7890', '\\-', '.') AS new_format;

Here, REGEXP_REPLACE is used to replace all hyphens in a phone number with periods.

SELECT REGEXP_SUBSTR('The quick brown fox jumps over the lazy dog', 'q[^ ]*') AS word_starting_with_q;

This code uses REGEXP_SUBSTR to extract the first word that starts with ‘q’ from a given string.

In conclusion, mastering these advanced SQL functions in Snowflake can unlock new layers of data insights and amplify the efficiency of your data analysis process. These are just a glimpse of the many powerful SQL functions Snowflake offers. From manipulating strings, working with dates, to handling NULL values, Snowflake has a SQL function to cater to your every need.

--

--

Tahar Chanane

Data Consultant, Professional Services @Qrious. BInfSc Data Science. 3x Azure, 4x Databricks, 2x Snowflake, and Tableau Certified.