Common Table Expressions(CTE) in SQL SERVER 2008
Introduction
It’s a headache for developers to write or read a complex SQL query using a number of
Join
s. Complex SQLstatements can be made easier to understand and maintainable in the form of CTE or Common Table expressions. In this post, I will try to cover some of the benefits that we get when working with CTE.
When dealing with sub-queries, it is often required that you need to select a part of the data from a sub query or even join data from a query with some other tables. In that case, either you have an option to name your sub-queries with an alias or to use it directly. Gradually your requirement is getting more and more complex and your query would look unmaintainable at any time. CTE allows you to define the subquery at once, name it using an alias and later call the same data using the alias just like what you do with a normal table. CTE is standard ANSI SQL standard.
Using the Code
Say, for instance, you have a query like this:
SELECT * FROM (
SELECT A.Address, E.Name, E.Age From Address A
Inner join Employee E on E.EID = A.EID) T
WHERE T.Age > 50
ORDER BY T.NAME
The query looks really a mess. Even if I need to write something that wraps around the entire query, it would gradually become unreadable. CTE allows you to generate Tables beforehand and use it later when we actually bind the data into the output.
Rewriting the query using CTE expressions would look like:
With T(Address, Name, Age) --Column names for Temporary table
AS
(
SELECT A.Address, E.Name, E.Age from Address A
INNER JOIN EMP E ON E.EID = A.EID
)
SELECT * FROM T --SELECT or USE CTE temporary Table
WHERE T.Age > 50
ORDER BY T.NAME
Yes as you can see, the second query is much more readable using CTE. You can specify as many query expressions as you want and the final query which will output the data to the external environment will eventually get reference to all of them.
Multiple CTE in One SELECT Statement Query
With T1(Address, Name, Age) --Column names for Temporary table
AS
(
SELECT A.Address, E.Name, E.Age from Address A
INNER JOIN EMP E ON E.EID = A.EID
),
T2(Name, Desig)
AS
(
SELECT NAME, DESIG FROM Designation)
SELECT T1.*, T2.Desig FROM T1 --SELECT or USE CTE temporary Table
WHERE T1.Age > 50 AND T1.Name = T2.Name
ORDER BY T1.NAME
So the queries are separated using commas. So basically you can pass as many queries as you want and these queries will act as a subqueries, getting you the data and name it as a temporary table in the query.
According to the syntax, the CTE starts with a
With
clause. You can specify the column names in braces, but it is not mandatory.Common Table Expression Syntax
A Common Table Expression contains three core parts:
- The CTE name (this is what follows the
WITH
keyword) - The column list (optional)
- The query (appears within parentheses after the
AS
keyword)
The query using the CTE must be the first query appearing after the CTE.
When to Use Common Table Expressions
Common Table Expressions offer the same functionality as a view, but are ideal for one-off usages where you don't necessarily need a view defined for the system. Even when a CTE is not necessarily needed, it can improve readability. In Using Common Table Expressions, Microsoft offers the following four advantages of CTEs:
- Create a recursive query.
- Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
- Enable grouping by a column that is derived from a scalar subselect, or a function that is either not deterministic or has external access.
- Reference the resulting table multiple times in the same statement.
Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.
Using scalar subqueries (such as the (
SELECT COUNT(1) FROM ...
) examples we've looked at in this article) cannot be grouped or filtered directly in the containing query. Similarly, when using SQL Server 2005's ranking functions - ROW_NUMBER()
, RANK()
, DENSE_RANK()
, and so on - the containing query cannot include a filter or grouping expression to return only a subset of the ranked results. For both of these instances, CTEs are quite handy.
CTEs can also be used to recursively enumerate hierarchical data.
Points of Interest
See the awful series with CTE:
WITH ShowMessage(STATEMENT, LENGTH)
AS
(
SELECT STATEMENT = CAST('I Like ' AS VARCHAR(300)), LEN('I Like ')
UNION ALL
SELECT
CAST(STATEMENT + 'CodeProject! ' AS VARCHAR(300))
, LEN(STATEMENT) FROM ShowMessage
WHERE LENGTH < 300
)
SELECT STATEMENT, LENGTH FROM ShowMessage
A Common Table Expression (CTE) is a temporary, named result set. Once created, a CTE can then be used in subsequent statements in the same SQL batch. This provides two benefits. First, it allows queries to be written in a more clear and concise manner. Second, because CTEs can refer to themselves, they allow the creation of recursive queries. In previous versions of SQL Server, stored procedures could call themselves, but SQL statements could not. CTEs therefore represent a new and powerful addition to the T-SQL language.
ReplyDeleteAs CTEs can refer to themselves, thus allowing recursive SQL statements to be created. This capability is very useful when you store hierarchical data in an SQL table. Retrieving this data is most easily accomplished via a recursive SELECT statement. However, this was not possible in earlier versions of SQL Server. Developers were reduced to creating recursive stored procedures or more complex WHILE loops. Now with CTEs, we can easily use recursion to simply the process.
Suppose that we create a table to store employee information. Each row of the table contains an Employee ID, pertinent employee information, and a Manager ID. The Manager ID contains the Employee ID of the employee's manager. This hierarchy can be arbitrarily deep.
we see a recursive CTE that retrieves data from this table. The first thing we can think of is to have a CTE that contains two SELECT statements joined together with a UNION ALL clause. Although multiple statements can be used in a nonrecursive query, they are required in a recursive one. The first SELECT statement handles the "anchor" or terminating condition. This statement returns those employees at the top of the hierarchy and who therefore have no managers. The second statement handles the recurring condition, returning those employees that do have managers. Notice that this second statement contains an INNER JOIN between the Employee table and the CTE itself. This is what makes the CTE recursive.