Geek Logbook

Tech sea log book

Understanding Window Functions in SQL: Beyond Simple Aggregations

When we think about SQL functions, we often start with scalar functions (UPPER(), ROUND(), NOW()) or aggregate functions (SUM(), AVG(), COUNT()). But there is a third type that is essential for advanced analytics: window functions.

The “Window”: The Metaphor Behind the Concept

A window function is evaluated for every row, but not in isolation — it does so by looking at a set of related rows, defined dynamically. This set of rows is what we call a window.

Unlike GROUP BY, which collapses rows into a single result, window functions preserve the row-by-row detail and allow us to calculate contextual metrics.


Three Key Components of a Window

1. PARTITION BY: Grouping Without Losing Granularity

The first step is to define partitions. This works like a logical GROUP BY, but without summarizing the rows.

SELECT 
  employee_id,
  department,
  salary,
  SUM(salary) OVER (PARTITION BY department) AS total_salary_department
FROM employees;

2. ORDER BY: Establishing a Sequence

For cumulative metrics, rankings, or functions such as LAG() and LEAD(), we need an order. This step defines the evaluation sequence within each partition.

SELECT 
  employee_id,
  salary,
  RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;

3. ROWS BETWEEN ... AND ...: Defining the Frame

The frame specifies how many rows relative to the current row are included in the calculation. A classic example is a running total.

SELECT 
  date,
  sales,
  SUM(sales) OVER (
    ORDER BY date 
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS cumulative_sales
FROM daily_sales;

This gives us the total sales from the beginning up to the current row.


ROWS vs RANGE

  • ROWS: counts physical rows (exactly N before/after).
  • RANGE: groups by values in the ordering expression (e.g., all rows with the same date).

Common Window Functions

  • Aggregate functions: SUM(), AVG(), COUNT()
  • Ranking functions: ROW_NUMBER(), RANK(), DENSE_RANK()
  • Value access functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()

Why They Matter

Window functions are critical for advanced analytics such as:

  • Calculating running totals
  • Comparing with the previous row (day-over-day changes)
  • Ranking records dynamically
  • Moving averages and sliding windows

They let you perform these operations without complex subqueries or losing detail.


Conclusion

Window functions are one of the most powerful and flexible tools in modern SQL. By understanding PARTITION BY, ORDER BY, and ROWS BETWEEN, you can solve complex analytical problems elegantly and efficiently — all while staying within the expressive power of SQL.

Tags: