Thursday, January 16, 2014

Window function in SQL Server with an example

Windows functions in SQL Server's T-SQL is not related to the Windows operating system but the kind of more detailed ordering (and all the calculations you can do with the ordered set) than you can get by using the regular Group By clause. Windows functions are obviously much more powerful than what you get by mere grouping and subqueries.

The following windows functions were introduced in SQL Server 2005:
Row_Number
Rank
Dense_Rank
NTILE


These are enhanced in SQL Server 2012 by providing that provide better analytics,

  • Windows Order and Frame clauses
  • Windows Offset functions
  (Lag, Lead, First_Value, Last_value)
  • Windows distribution functions
  (Percent-Rank, Cume_Dist, Percentile_Disc and Percentile_Cont)

Conceptually it is as if you have created a window of rows to which you apply some processing and the result is returned in a single row as related to the windowed rows). You define the window in which the processing takes place by the clause OVER.

The following example is using the Orders and Order Details table in Northwind database. If you need a copy of Northwind follow this link:

The following query when run to get raw data,

Get raw filtered data

[Order Details].OrderID, Cast((Orders.OrderDate) as nvarchar(12)) as 'Order Date', [Order Details].Quantity
FROM            [Order Details] INNER JOIN
                         Orders ON [Order Details].OrderID = Orders.OrderID and [Order Details].OrderID <=10253

Returns the following:

OrderID     Order Date   Quantity
----------- ------------ --------
10248       Jul  4 1996  12
10248       Jul  4 1996  10
10248       Jul  4 1996  5
10249       Jul  5 1996  9
10249       Jul  5 1996  40
10250       Jul  8 1996  10
10250       Jul  8 1996  35
10250       Jul  8 1996  15
10251       Jul  8 1996  6
10251       Jul  8 1996  15
10251       Jul  8 1996  20
10252       Jul  9 1996  40
10252       Jul  9 1996  25
10252       Jul  9 1996  40
10253       Jul 10 1996  20
10253       Jul 10 1996  42
10253       Jul 10 1996  40

(17 row(s) affected)

Using Group BY clause:
There are groups (17 rows are in 6 groups) and for these groups we can find the sum of quantity orderd like in,

Select   [Order Details].OrderID, SUM([Order Details].Quantity) as 'QUAN'
FROM            [Order Details] INNER JOIN
                         Orders ON [Order Details].OrderID = Orders.OrderID and [Order Details].OrderID <=10253
                         group by  [Order Details].OrderID

the rows returned are,
OrderID     QUAN
----------- -----------
10248       27
10249       49
10250       60
10251       41
10252       105
10253       102

(6 row(s) affected)

Using the Rank() function

Now we can rank the value Quan and show they are ranked using the Rank() function as in the following:

Select   [Order Details].OrderID,
         SUM([Order Details].Quantity) as 'QUAN',
        RANK() OVER (ORDER BY SUM([Order Details].Quantity)) as RankQuan
FROM            [Order Details] INNER JOIN
                         Orders ON [Order Details].OrderID = Orders.OrderID and [Order Details].OrderID <=10253
                         group by  [Order Details].OrderID

We have a new column produced by,
 RANK() OVER (ORDER BY SUM([Order Details].Quantity)) as RankQuan
which really shows how the rank is evaluated providing a rank for each row.

The response to the above is,

OrderID     QUAN        RankQuan
----------- ----------- --------------------
10248       27          1
10251       41          2
10249       49          3
10250       60          4
10253       102         5
10252       105         6

(6 row(s) affected)

As you notice the Rank() function took care of ordering as well in addition to ranking them.



Use Rank() function one more time:

Let us modify the query further
(and answer the questions whether one can use another Rank() function.

Select   [Order Details].OrderID,
         SUM([Order Details].Quantity) as 'QUAN',
        RANK() OVER (ORDER BY SUM([Order Details].Quantity)) as RankQuan,
        Rank() OVER (ORDER BY [Order Details].OrderID) as 'OrdIDRank'
FROM       [Order Details]        INNER JOIN
                         Orders ON [Order Details].OrderID = Orders.OrderID and [Order Details].OrderID <=10253
                         group by  [Order Details].OrderID

The response is as follows:
OrderID     QUAN        RankQuan             OrdIDRank
----------- ----------- -------------------- --------------------
10248       27          1                    1
10249       49          3                    2
10250       60          4                    3
10251       41          2                    4
10252       105         6                    5
10253       102         5                    6

(6 row(s) affected)

You can see that this has not affected the RankQuan except that the ordering in the final rank gets ordered consequitively.