Recursive query using Common Table Expression


It is very common to have applications where we are having data with hierarchy relationships i.e we have parent-child relationship between two entities. Database design for such cases involves storing the hierarchy relation between the components in the same table, i.e. the record Id and its ParentId, which is nothing but the Id of the parent record which is in the same table . So when we are required to fetch the actual data , let’s say, we need to get the data of employees with their names & manager names, then we use different ways to write the query. It could be using self join, it could be using sub-queries or may involve the use of temporary tables etc. Apart from these, there is one another easy way to achieve this, using the Common Table Expressions and we will be discussing this approach.

What is Common Table Expression ?

Common table expression is just another way of storing temporary data, like we have temporary table or the table variables, but the main difference is that their scope of existence is only the immediate query following the cte. We will not be going in to much details about cte and our focus will be on discussing the recursive query using cte.

So before we discuss this further, we will be discussing the structure of cte. Then we will create an example to demonstrate its use and discuss how it works, as per the structure defined for it.

Structure/Syntax of Common Table Expression :

WITH cte_name AS

(

query1 or anchor query

UNION ALL

query2 or recursive query

)

So, a CTE starts with the keyword WITH, followed by its name. Then we have two different queries which will be the core for getting the required data. They are divide into 2 parts.

Anchor query :  This query is the start point which acts as a base to enable the process of recursion in the cte. We have a kind of a condition in this query, which makes it our starting point for rest of the data fetching query. For example, in order order to fetch hierarchy based data, this query may have where condition to get the first record. So this query basically provides the base data for rest of the process for fetching the data.

Recursive query : After anchor query is executed, result set generated by the anchor query is used as an input for the recursive query and is JOIN-ed with the recursive query to generate new results. New result set created, is again JOIN-ed with the recursive query and further results are generated. So this addition to the result set and its JOIN with the recursive query continues until all the records are processed or you can say the further JOIN returns no data.

How Common table expression works ?

Now, we have had the overview of its structure, we will be discussing how actually the cte works. We will be discussing the process in 3 steps. When a SELECT statement is executed on a cte, the first query which gets executed is the anchor query. Since this is the start point for getting the required data, we need to take care of how the base condition is to be applied so that we get the starting point for the rest of the recursion process. We will be using an example to get all the managers & theemployees workers under them . Then after these steps, we will discuss the process in detail, using the same example.

Step 1 : We will write the anchor query as the first step. To get the list of all the employees under a manager, our anchor query should be starting with the record where the ManagerId for the employee is null. This means, we are starting from the top of the hierarchy or management level to get the further data.

Step 2 : Now we have the starting point, we will join it with the rest of the data set using the Union condition.

Step 3 : We will write the recursive query, which means, we use the input of step 1 for this step to apply JOIN between the result set of step 1 and rest of the data in the table. So for first time this recursive query is JOIN-ed with anchor query result set, we get the immediate workers under the manager on the top of the hierarchy, and this process further continues with the results being added to the result set and JOIN-ed with the main table.

Now, we will be discussing the process using an example. We will be using situations of hierarchy data of Employee-Manager relationship and try to get the names of the managers and employees who work under them, using the cte. For this we create a simple table with 3 columns, EmpId, EmpName and ManagerId. Here, ManagerId is nothing but the EmpId of the same table.

After this we will add some dummy data like the following. Based on these entries we will try to get the name of the managers and the workers under them.

Recursion_1

So as per the above data, we should have following data :

1. Mike is Self manager i.e. is having no Manager.

2. Mike is Manager of Nathan & Greg

3. Nathan is manager of Dan & Jack

4. Greg is manager of Kile.

So now we will try to write the cte based on the steps we discussed. So our first step will be to create the anchor query.

Step 1 : We will start with the Manager at the top of the hierarchy and get his immediate workers. Since top level employee will not be having any Manager, his ManagerId will be null. So this will be the having the base or anchor query with the where condition as following :

 Recursion_2

Now this is our base result set and will be the input for the recursive query.

Step 2 : We will be adding the UNION condition.

Step 3 : Run the anchor query and you will see that we have the name of the manager at the top most level of the hierarchy. Now we need to get the immediate workers for the same. To do so, we simply apply the JOIN with the rest of the data in the table with the condition :

EH1.ManagerId = HierarchyData.EmpId

which says, get the employees from the table, who are having the ManagerId equal to the EmployeeId of the record we have in the result set. This provides us the manager name for the employees having manager name in the cte result set. Our recursive query JOIN-ed with the anchor query will form the complete cte and will look like the following :

Recursion_3

So  when the recursive query executes, new results are generated and added to the previous result set. This result set is again JOIN-ed with the existing table with the same ON condition specified above and this time we get the immediate workers of the second level manager. In this way, this process keeps on repeating again and again until the complete result set is generated and we get the data, which we discussed in the start of the article.

So this was all about the recursion using Common Table Expression. Hope this helps.

Advertisements

About Jasminder

.Net developer and blogger by profession, keen to learn new technologies, love nature, music and cricket.
This entry was posted in SQL Server and tagged . Bookmark the permalink.

I have a suggestion..

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s