Difference between Outer apply and outer join
The JOIN and APPLY operators are use for different purpose. The APPLY operator is used to apply the result of the table expression in the right to each row from the left. You can use any column from the left side as an input or parameter to the table expression in the right side.
The JOIN operator is exactly to join both sides, but you can not use columns from the left side as you do with the apply operator.
For example, assume that you have a inline table-value function that return the id of the customer with most orders for specific region, and you want to select rows from the customers table, pass the region to the inline function and reference the result of the function in the column list. You will do this using the APPLY operator:
select * from dbo.customers as c outer apply dbo.ufn_CustByRegion(c.region)
but you can not do this with a join.
select * from dbo.customers as c join dbo.ufn_CustByRegion(c.region) -- this yields an error
I am using OUTER APPLY to still be able to see customers with no orders.
You can check BOL for more info about these operators.
Thanks for reply.... the main difference i have seen in these are Left Join: i need to prepare the query as table (Temporary Table) and actual join will be provided after filtering of data using "ON" Cross Apply: i can prepare the Temporary Table in that i can make use of my parent table reference. Example: Left Join: Select r.Name, r.EmailID, ISNULL(j.Employer) as PresentEmployer From CandidateInfo r WITH (NOLOCK) LEFT JOIN (Select CandID, Employer From JOB_EMPLOYER WITH (NOLOCK)) as j ON j.CandID = r.ID OUTER APPLY: Select r.Name, r.EmailID, ISNULL(j.Employer) as PresentEmployer From CandidateInfo r WITH (NOLOCK) OUTER APPLY (Select Employer From JOB_EMPLOYER WITH (NOLOCK) Where CandID = r.ID) as j These two sample queries are examples in which i actually use Pivot output as table and that will be joining with parent table for getting details. As per my observation in SQL Server 2005, i was not seen any greater performance when i user CROSS APPLY but query output was faster and worker table load was also pretty low when i user CROSS APPLY in SQL Server 2008. But same query was not performing properly if use this logic for other combination. So i want to know that is there any dependency or specific formation where and when we can use CROSS APPLY.
Hiya! This article will explain OUTER and CROSS APPLY and show you how to use them by means of sample code. OUTER and CROSS APPLY are unique to SQL Server so this article is intended for anybody using SQL in a SQL Server environment. It will also cover many examples of where you can use OUTER and CROSS APPLY and their pro's and con's.
Use cases in this article include:
example
Instead of giving definitions I would like to explain by example. Think of
CROSS APPLY as a row-by-row INNER JOIN . If we have: SELECT *
FROM Vehicles V
INNER JOIN MileageLog ML ON V.ID = M.VehicleID
to join a vehicle and its mileage log we could do exactly the same thing using
CROSS APPLY : SELECT *
FROM Vehicles V
CROSS APPLY (SELECT * FROM MileageLog ML WHERE V.ID = ML.VehicleID) ML
These two queries will produce identical results. We could use
OUTER APPLY instead of CROSS APPLY to get the same effect as a LEFT JOIN . That is SELECT *
FROM Vehicles V
LEFT JOIN MileageLog ML ON V.ID = ML.VehicleID
will give the same results as:
SELECT *
FROM Vehicles V
OUTER APPLY (SELECT * FROM MileageLog ML WHERE V.ID = ML.VehicleID) ML
Notice how our
ON condition becomes a WHERE condition in the subquery. Also notice how we give an alias for theAPPLY just like we can alias tables in a JOIN statement - this is required for APPLY statements. TOP N Rows
These queries now do the same thing and the
JOIN is easier to write and remember, so why on earth would we useAPPLY instead?
Let's say that instead of all mileage log entries for every vehicle we now only want the last 5 entries for every vehicle. One way of doing this is with
ROW_NUMBER , PARTITION BY and a nested query: SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY ML.VehicleID ORDER BY ML.EntryDate DESC) RN
FROM Vehicles V
INNER JOIN MileageLog ML ON V.ID = ML.VehicleID
) IQ
WHERE IQ.RN <= 5
Which would only return the first 5 entries for every vehicle. To do so using a
CROSS APPLY statement: SELECT *
FROM Vehicles V
CROSS APPLY (
SELECT TOP 5 *
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
ORDER BY ML.EntryDate DESC) ML
The are a few important things to take note of here:
TOP 10 PERCENT rows without an APPLY statement it would have to be something like:
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY ML.VehicleID ORDER BY ML.EntryDate DESC) RN
FROM Vehicles V
INNER JOIN MileageLog ML ON V.ID = ML.VehicleID
) IQ
INNER JOIN
(
SELECT ML.VehicleID, COUNT(*) AS RowCount
FROM MileageLog ML
GROUP BY ML.VehicleID
) MLCount ON IQ.VehicleID = MLCount.VehicleID
WHERE RN / cast(MLCount.RowCount as float) <= 0.1
As you can see this becomes a more complex query since we now require aggregates and single-row expressions in order to calculate our own percentages. It also very quickly becomes unclear what we were trying to do.
If we use
CROSS APPLY doing this is simply: SELECT *
FROM Vehicles V
CROSS APPLY (
SELECT TOP 10 PERCENT *
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
ORDER BY ML.EntryDate DESC) ML
Are you starting to see how
CROSS APPLY can make your life easier? UNPIVOTUNPIVOT unfolds a single row into multiple rows. The syntax for UNPIVOT works well if you're doing single tableUNPIVOT s and gets rather complicated when you're joining or doing multiple. I'm not going to cover UNPIVOT examples here for the sake of brevity - feel free to Google (or the search engine of your preference) a few examples before reading on.
If we have the following data (first row is column names) in the table tbl:
A B C D
E 1 2 3
F 4 5 6
We can unpivot it using a
CROSS APPLY as follows: SELECT A, Category, Value
FROM tbl
CROSS APPLY (
SELECT 'B' AS Category, B AS Value UNION ALL
SELECT 'C', C UNION ALL
SELECT 'D', D
) CA
Viola, that's it. It will unfold the data like such:
A Category Value
E B 1
E C 2
E D 3
F B 4
F C 5
F D 6
Which is the same results that
UNPIVOT would give.
A few important things to note:
Multi-field expressions
Lets say we want to know which day every vehicle travelled the furthest:
SELECT *, (
SELECT TOP 1 EventDate
FROM MileageLog ML
WHERE V.ID = ML.VehicleID ORDER BY DistanceTravelled DESC) AS DayMostTravelled
FROM Vehicles V
Simple enough, right? Doing this with
OUTER APPLY looks like such: SELECT *
FROM Vehicles V
OUTER APPLY (
SELECT TOP 1 EventDate AS DayMostTravelled
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
ORDER BY DistanceTravelled DESC
) CA
Only a few small changes in the code is necessary:
So if we now want to know the date and the distance travelled on that day?
SELECT V.*, IQ.EventDate AS DayMostTravelled, IQ.DistanceTravelled
FROM Vehicles V
OUTER JOIN (
SELECT VehicleID, EventDate, DistanceTravelled,
ROW_NUMBER() OVER (PARTITION BY VehicleID ORDER BY DistanceTravelled DESC) RN
FROM MileageLog
) IQ ON IQ.VehicleID = V.ID AND IQ.RN = 1
Since this is no longer a single field we now have to use
JOIN and ROW_NUMBER to get our desired information. Doing this with OUTER APPLY on the other hand: SELECT * FROM Vehicles V
OUTER APPLY (
SELECT TOP 1 EventDate AS DayMostTravelled, DistanceTravelled
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
ORDER BY DistanceTravelled DESC
) CA
This gives us an easy way to select multiple fields from a related row based on some condition.
Using expressions in other expressions
We can use
CROSS APPLY to give expressions names and use them in other expressions. SELECT V.*, CA1.AvgDistance, CA1.TotalDistance
FROM Vehicles V
OUTER APPLY (
SELECT Avg(DistanceTravelled) AS AvgDistance, Sum(DistanceTravelled) AS TotalDistance
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
) CA1
The query above simply gets the average and total distance travelled for each vehicle.
SELECT V.*, CA1.AvgDistance, CA1.TotalDistance, CA2.ServicesLeft
FROM Vehicles V
OUTER APPLY (
SELECT Avg(DistanceTravelled) AS AvgDistance, Sum(DistanceTravelled) AS TotalDistance
FROM MileageLog ML
WHERE V.ID = ML.VehicleID
) CA1
OUTER APPLY (
SELECT COUNT(*) AS ServicesLeft
FROM VehicleServicePlans VSP
WHERE VSP.VehicleID = V.ID
AND VSP.ServicePlanDistance > CA1.TotalDistance
) CA2
As you see we can add a second
OUTER APPLY to now use the results of the first and do some additional calculations. Chaining APPLY s in this way makes it easy to seperate same-row logic into multiple sections. APPLY and TVFs
APPLY also works with TVFs.
Let's say we have a TVF to get the fields of a table:
CREATE FUNCTION FieldsForTable (@tablename nvarchar(1000))
RETURNS TABLE
AS
RETURN
select * from sys.columns where object_id = object_id(@tablename)
If we now want to get the fields for all tables starting with an A we can do it using
CROSS APPLY : SELECT * FROM sys.tables T CROSS APPLY dbo.FieldsForTable(T.name)
WHERE T.name LIKE 'a%'
Note: This could obviously be done using a single JOIN statement - the example is exactly that and just demonstrates how to use CROSS APPLY with TVFs.
Notes on the performance of APPLY
Since APPLY works on a row-by-row level:
|
No comments:
Post a Comment