|
|
|
Forum Newbie
      
Group: Forum Members
Last Login: 9/13/2007 6:12:00 PM
Posts: 5,
Visits: 1
|
|
Is EXISTS faster than IN? For example, table B has millions of records.
Select A.student_id from A where EXISTS (select student_id from B where A.student_id=B.student_id)
vs.
Select student_id from A where A.student_id IN (select B.student_id from B)
student_id is NOT NULL
IammaI
|
|
|
|
|
Forum Guru
      
Group: Forum Members
Last Login: 8/20/2008 2:31:00 AM
Posts: 53,
Visits: 10
|
|
Hi, u r query Select A.student_id from A where EXISTS (select student_id from B where A.student_id=B.student_id)
is equivalent to
Select student_id from A where A.student_id IN (select B.student_id from B where B.student_id is not null)
also the performance wise first query is better.
regds, sachin
|
|
|
|
|
Forum Newbie
      
Group: Forum Members
Last Login: 9/13/2007 6:12:00 PM
Posts: 5,
Visits: 1
|
|
|
|
|
|
Junior Member
      
Group: Forum Members
Last Login: 2/26/2008 11:42:13 AM
Posts: 16,
Visits: 32
|
|
Hi,
Can u pls explain why 'Exists' performance is better than 'in'?
Regards,
abc
|
|
|
|
|
Forum Guru
      
Group: Forum Members
Last Login: 10/15/2008 10:54:22 AM
Posts: 56,
Visits: 68
|
|
Hi,
The performance of ‘EXISTS’ and ‘IN’ will be same for queries having fewer members in their expression list.
But if the members in the expression list is more, then the performance of ‘EXISTS’ is better than ‘IN’.
This is because, while using ‘IN’ if the set of constants consists of 70 or fewer members, the system uses hashing to retrieve the rows.
If the set consists of more than 70 members, the system does a full-table scan.
Note that full-table scans are much more time consuming than row hashed accesses.
Regards,
Balamurugan
|
|
|
|
|
Forum Member
      
Group: Forum Members
Last Login: 5/20/2008 5:21:02 AM
Posts: 28,
Visits: 20
|
|
The SQL that uses EXISTS may be processed as a correlated subquery.
Experience shows that certain correlated subqueries take longer to process than similar SQL's which use a JOIN to achieve the same result.
I'd recommend the IN form of the SQL over the EXISTS form of the SQL.
--Foxbat
|
|
|
|
|
Supreme Being
      
Group: Forum Members
Last Login: Today @ 4:54:41 PM
Posts: 534,
Visits: 285
|
|
In almost any case there will be no difference between an IN-subquery and an EXISTS-correlated subquery, the optimizer will rewrite both to a join.
Compared to a JOIN there's also hardly any difference, if the subquery is the unique part of the 1:m relation.
If the subquery is the m-part, there's an automatic DISTINCT added to the subquery (in Explain that step will be executed before the join), but you have to add it manually to the JOIN (after the join within Explain).
If that subquery column is very non-unique that automatic DISTINCT might be overridden by a GROUP BY to enhance performance.
If it's NOT IN vs. NOT EXISTS there might be a huge difference, because NOT IN has to deal with NULLs via three-way-logic, whereas NOT EXISTS simply ignores NULL. If there's any NULLable column (inner or outer) used by the subquery, it is recommended to use NOT EXISTS instead of NOT IN. And NOT EXISTS is always more efficient than an Outer Join solution filtering for NULLs.
Dieter
|
|
|
|