This is a classic SQL problem- probably more a textbook exercise than a real-life business problem, but a classic nonetheless. "Find all employees whose salary is higher than the average salary for their department."
In Access, the most straightforward approach is a multi-query solution. However it can be also be accomplished in a single query using subqueries. I'll discuss both approahces here
Assume our data in tblEmployeeInfo is like this:
EmployeeID | DepartmentID | Salary |
1 | 1 | $30,000.00 |
2 | 1 | $25,000.00 |
3 | 1 | $40,000.00 |
4 | 1 | $33,000.00 |
5 | 2 | $25,000.00 |
6 | 2 | $50,000.00 |
Two Query Solution
First create a query that finds average salary by department. I'll call this qryAvgSalaryByDepartment. This can be easily created in the query grid:
Or here's the SQL:
SELECT DepartmentID, Avg(tblEmployeeInfo.Salary) AS DeptAvgSalary FROM tblEmployeeInfo GROUP BY DepartmentID;
When we run the query we see the following results:
or
SELECT EmployeeID, tblEmployeeInfo.DepartmentID, Salary FROM tblEmployeeInfo INNER JOIN qryAvgSalaryByDepartment ON tblEmployeeInfo.DepartmentID = qryAvgSalaryByDepartment.DepartmentID WHERE Salary > DeptAvgSalary;and will yield the final results we were looking for
Subquery Solution
You can use a query with a subquery to do the two steps of the two query solution in one query. First he's the SQL:
SELECT EmployeeID, DepartmentID, Salary FROM tblEmployeeInfo AS T1 WHERE Salary > ( SELECT Avg(Salary) FROM tblEmployeeInfo as T2 WHERE T1.DepartmentID=T2.DepartmentID ) ;When we run this query we get the same results as the two query solution. Let's look at it in detail to understand how it works.
Inside the brackets you have a complete query- that's the subquery. It looks a lot like the first query we created. That query had a GROUP BY clause that told it to give us the average for each department. This query has a WHERE clause that tells it to give us the average for only one department at a time (more about how it knows which department to give us in a minute. The subquery's result is a single value- the average salary. That value can then be used in the main query. (Not all subqueries return a single value, some return a set of values.)
How does the subquery know which department to reurn an average salary for? It's WHERE clause tells it to select only records from the department of the employee being processed in the main query.WHERE T1.DepartmentID=T2.DepartmentID The main query will process each record in tblEmployeeInfo. And for each record it will run the subquery to get an average salary for their department. The WHERE clause tells the query to use a claue from the main query as the criteria in the subquery.
T1 and T2 are aliases we give the tables being referred to in the main query and the subquery. The reason we need aliases is that they're the same table- tblEmployeeInfo. By using the AS keyword we can specify a name we want to use to refer to the two different instances of the table in out query. I just used T1 and T2 for table 1 and table 2. You could be more imaginative. If the subquery doesn't use the same table as the main query you don't need an alias. In SQL texbooks these aliases are sometimes called carelation IDs when they're used to match (or corelate) records in the main and subquery.
And just because in SQL there's always another way to do it, here's another subquery approach which uses a JOIN instead of a filter based on a filed in the outer query. In my mind this is less "elegant". But I'm not sure I have a specific reason for this. I haven't researched it, but I have a hunch that the solution using JOIN is not supported in all SQL implementations. Here's the SQL:
SELECT T1.EmployeeID, T1.DepartmentID, T1.Salary FROM tblEmployeeInfo AS T1 INNER JOIN ( SELECT DepartmentID, Avg(Salary) AS AvgSalary FROM tblEmployeeInfo GROUP BY DepartmentID ) AS T2 ON T1.DepartmentID = T2.DepartmentID WHERE Salary > AvgSalary ;
See also: all my posts about subqueries.
No comments:
Post a Comment