EXISTS vs. IN
Teradata Teradata Discussion Forums Teradata.com Discussion Forum
Visit Teradata.com
Home       Guidelines    Member List
Welcome Guest ( Login | Register )
        


This online forum is for user-to-user discussions of Teradata products, and is not an official customer support channel for Teradata. If you require direct assistance, please contact Teradata support.


EXISTS vs. IN Expand / Collapse
Author
Message
Posted 8/31/2007 1:21:06 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: Forum Members
Last Login: 9/13/2007 6:12:00 PM
Posts: 5, Visits: 1
Is EXISTS faster than IN? For example, table B has millions of records.

Select A.student_id from A
where EXISTS
(select student_id from B where A.student_id=B.student_id)

vs.

Select student_id from A
where A.student_id IN (select B.student_id from B)

student_id is NOT NULL


IammaI

Post #8770
Posted 9/7/2007 7:12:46 AM
Forum Guru

Forum GuruForum GuruForum GuruForum GuruForum GuruForum GuruForum GuruForum Guru

Group: Forum Members
Last Login: 8/20/2008 2:31:00 AM
Posts: 53, Visits: 10
Hi,
u r query
Select A.student_id from A
where EXISTS
(select student_id from B where A.student_id=B.student_id)

is equivalent to


Select student_id from A
where A.student_id IN (select B.student_id from B where B.student_id is not null)

also the performance wise first query is better.

regds,
sachin

Post #8832
Posted 9/13/2007 6:12:55 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: Forum Members
Last Login: 9/13/2007 6:12:00 PM
Posts: 5, Visits: 1
Thanks Sachin.

IammaI

Post #8896
Posted 2/17/2008 11:52:17 PM
Junior Member

Junior MemberJunior MemberJunior MemberJunior MemberJunior MemberJunior MemberJunior MemberJunior Member

Group: Forum Members
Last Login: 2/26/2008 11:42:13 AM
Posts: 16, Visits: 32

Hi,

Can u pls explain why 'Exists' performance is better than 'in'?

Regards,
abc
Post #10609
Posted 2/18/2008 4:48:01 AM
Forum Guru

Forum GuruForum GuruForum GuruForum GuruForum GuruForum GuruForum GuruForum Guru

Group: Forum Members
Last Login: 10/15/2008 10:54:22 AM
Posts: 56, Visits: 68
Hi,

The performance of ‘EXISTS’ and ‘IN’ will be same for queries having fewer members in their expression list.
But if the members in the expression list is more, then the performance of ‘EXISTS’ is better than ‘IN’.
This is because, while using ‘IN’ if the set of constants consists of 70 or fewer members, the system uses hashing to retrieve the rows.
If the set consists of more than 70 members, the system does a full-table scan.
Note that full-table scans are much more time consuming than row hashed accesses.

Regards,
Balamurugan
Post #10614
Posted 2/22/2008 5:18:40 AM
Forum Member

Forum MemberForum MemberForum MemberForum MemberForum MemberForum MemberForum MemberForum Member

Group: Forum Members
Last Login: 5/20/2008 5:21:02 AM
Posts: 28, Visits: 20
The SQL that uses EXISTS may be processed as a correlated subquery.

Experience shows that certain correlated subqueries take longer to process than similar SQL's which use a JOIN to achieve the same result.

I'd recommend the IN form of the SQL over the EXISTS form of the SQL.


--Foxbat
Post #10667
Posted 2/22/2008 11:33:19 AM
Supreme Being

Supreme BeingSupreme BeingSupreme BeingSupreme BeingSupreme BeingSupreme BeingSupreme BeingSupreme Being

Group: Forum Members
Last Login: Today @ 4:54:41 PM
Posts: 534, Visits: 285
In almost any case there will be no difference between an IN-subquery and an EXISTS-correlated subquery, the optimizer will rewrite both to a join.

Compared to a JOIN there's also hardly any difference, if the subquery is the unique part of the 1:m relation.
If the subquery is the m-part, there's an automatic DISTINCT added to the subquery (in Explain that step will be executed before the join), but you have to add it manually to the JOIN (after the join within Explain).
If that subquery column is very non-unique that automatic DISTINCT might be overridden by a GROUP BY to enhance performance.

If it's NOT IN vs. NOT EXISTS there might be a huge difference, because NOT IN has to deal with NULLs via three-way-logic, whereas NOT EXISTS simply ignores NULL. If there's any NULLable column (inner or outer) used by the subquery, it is recommended to use NOT EXISTS instead of NOT IN. And NOT EXISTS is always more efficient than an Outer Join solution filtering for NULLs.

Dieter
Post #10671
« Prev Topic | Next Topic »


Reading This Topic Expand / Collapse
Active Users: 0 ( 0 guests, 0 members, 0 anonymous members )
No members currently viewing this topic.


All times are GMT -5:00, Time now is 5:14pm

Powered By InstantForum.NET v4.1.4 © 2008
Execution: 0.234. 9 queries. Compression Disabled.