• Visitors can check out the Forum FAQ by clicking this link. You have to register before you can post: click the REGISTER link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. View our Forum Privacy Policy.
  • Want to receive the latest contracting news and advice straight to your inbox? Sign up to the ContractorUK newsletter here. Every sign up will also be entered into a draw to WIN £100 Amazon vouchers!

Reply to: SQL Left Join

Collapse

You are not logged in or you do not have permission to access this page. This could be due to one of several reasons:

  • You are not logged in. If you are already registered, fill in the form below to log in, or follow the "Sign Up" link to register a new account.
  • You may not have sufficient privileges to access this page. Are you trying to edit someone else's post, access administrative features or some other privileged system?
  • If you are trying to post, the administrator may have disabled your account, or it may be awaiting activation.

Previously on "SQL Left Join"

Collapse

  • wurzel
    replied
    Originally posted by TheFaQQer View Post
    It depends on whether the column is nullable or not, and (probably) which database you are using.

    In MySQL, for example, left join performs well where both columns are not-nullable. However, it performs badly where one or both of the columns is nullable, in which case an EXISTS / NOT EXISTS would perform better.

    You could also use a NOT IN instead of NOT EXISTS, which (I think!) would work best, although if the column is nullable then you would need to include an additional check for nulls.
    Using SQl Server 2008 here. Will run both variants through Query Analyzer & look at execution plans when I've got some time.

    Leave a comment:


  • TheFaQQer
    replied
    Originally posted by lilelvis2000 View Post
    Wouldn't using a not exists clause work better?

    eg: select a* from tableA a
    where not exists (select * from TableB b where b.id=a.id)
    One more thing....

    You'd need to include a check for both columns being null in the two tables, if the database doesn't use an equals operator on nulls (e.g. Oracle)

    Code:
    SELECT a.*
    FROM   tableA a
    WHERE  NOT EXISTS ( SELECT *
                        FROM   tableB b
                        WHERE  b.id = a.id
                        OR     ( b.id IS NULL and a.id IS NULL )
                      )

    Leave a comment:


  • TheFaQQer
    replied
    If the column in one of the tables is nullable, and in the other table is not nullable, then Oracle uses the same execution plan regardless of whether you use NOT EXISTS or a join:

    Code:
    SQL > create table cuk1(id number,
      2     constraint cuk1_pk primary key(id));
    
    Table created.
    
    SQL > create table cuk2(id number);
    
    Table created.
    
    SQL > begin
      2    for i in 1..100 loop
      3      insert into cuk1 values(i);
      4    end loop;
      5    commit;
      6  end;
      7  /
    
    PL/SQL procedure successfully completed.
    
    SQL > begin
      2    for i in 1..100000 loop
      3      insert into cuk2
      4        values(mod(i,97));
      5    end loop;
      6    commit;
      7  end;
      8  /
    
    PL/SQL procedure successfully completed.
    
    SQL > create index cuk2_idx on cuk2(id);
    
    Index created.
    
    SQL > exec dbms_stats.gather_table_stats(USER,'CUK1');
    
    PL/SQL procedure successfully completed.
    
    SQL > exec dbms_stats.gather_table_stats(USER,'CUK2');
    
    PL/SQL procedure successfully completed.
    
    SQL > -- Not exists check.....
    SQL > explain plan for
      2    select id from cuk1 a
      3     where not exists
      4      (select 1 from cuk2 b where b.id=a.id);
    
    Explained.
    
    SQL > select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 1819530608
    
    ------------------------------------------------------------------------------
    | Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |         |     4 |    24 |    50   (4)| 00:00:01 |
    |*  1 |  HASH JOIN ANTI    |         |     4 |    24 |    50   (4)| 00:00:01 |
    |   2 |   INDEX FULL SCAN  | CUK1_PK |   100 |   300 |     1   (0)| 00:00:01 |
    |   3 |   TABLE ACCESS FULL| CUK2    |   100K|   292K|    48   (3)| 00:00:01 |
    ------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - access("B"."ID"="A"."ID")
    
    15 rows selected.
    
    SQL > -- Outer join check
    SQL > explain plan for
      2    select a.id
      3    from cuk1 a
      4    left outer join cuk2 b on (a.id = b.id)
      5    where b.id is null;
    
    Explained.
    
    SQL > select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 1819530608
    
    ------------------------------------------------------------------------------
    | Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |         |     4 |    24 |    50   (4)| 00:00:01 |
    |*  1 |  HASH JOIN ANTI    |         |     4 |    24 |    50   (4)| 00:00:01 |
    |   2 |   INDEX FULL SCAN  | CUK1_PK |   100 |   300 |     1   (0)| 00:00:01 |
    |   3 |   TABLE ACCESS FULL| CUK2    |   100K|   292K|    48   (3)| 00:00:01 |
    ------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - access("A"."ID"="B"."ID")
    
    15 rows selected.
    
    SQL > drop table cuk1;
    
    Table dropped.
    
    SQL > drop table cuk2;
    
    Table dropped.

    Leave a comment:


  • TheFaQQer
    replied
    Originally posted by wurzel View Post
    Would one perform better than the other?
    It depends on whether the column is nullable or not, and (probably) which database you are using.

    In MySQL, for example, left join performs well where both columns are not-nullable. However, it performs badly where one or both of the columns is nullable, in which case an EXISTS / NOT EXISTS would perform better.

    You could also use a NOT IN instead of NOT EXISTS, which (I think!) would work best, although if the column is nullable then you would need to include an additional check for nulls.

    Leave a comment:


  • lilelvis2000
    replied
    Originally posted by wurzel View Post
    Perfect thanks.



    Would one perform better than the other?
    You would need to try it. go on.

    Leave a comment:


  • vetran
    replied
    Try this, makes sense to me

    Coding Horror: A Visual Explanation of SQL Joins

    Leave a comment:


  • wurzel
    replied
    Originally posted by TheFaQQer View Post
    It depends on what you want



    The difference is that the join condition in the second cursor may return no rows (there are no rows in b with a null ID). This means that (a.id=b.id AND b.id IS NULL) essentially means that you return everything in TableA......
    Perfect thanks.

    Originally posted by lilelvis2000 View Post
    Wouldn't using a not exists clause work better?

    eg: select a* from tableA a
    where not exists (select * from TableB b where b.id=a.id)

    though

    select a* from TableA a left join TableB b on a.id=b.id
    where b.id IS NULL

    is another way to go.
    Would one perform better than the other?

    Leave a comment:


  • lilelvis2000
    replied
    Wouldn't using a not exists clause work better?

    eg: select a* from tableA a
    where not exists (select * from TableB b where b.id=a.id)

    though

    select a* from TableA a left join TableB b on a.id=b.id
    where b.id IS NULL

    is another way to go.

    Leave a comment:


  • TheFaQQer
    replied
    Originally posted by Spacecadet View Post
    Ahem


    Didn't read that bit.

    Leave a comment:


  • Spacecadet
    replied
    Originally posted by TheFaQQer View Post
    It depends on what you want
    Originally posted by wurzel View Post
    I want to return everything form table A that doesn't have a corresponding record in B.
    Ahem

    Leave a comment:


  • TheFaQQer
    replied
    Originally posted by wurzel View Post
    What approach is correct?
    It depends on what you want

    Originally posted by wurzel View Post
    SELECT a.* FROM TableA a
    LEFT JOIN TableB b ON a.id = b.id
    WHERE b.id IS NULL

    SELECT a.* FROM TableA a
    LEFT JOIN TableB b ON (a.id = b.id AND b.id IS NULL)
    The difference is that the join condition in the second cursor may return no rows (there are no rows in b with a null ID). This means that (a.id=b.id AND b.id IS NULL) essentially means that you return everything in TableA.

    Code:
    SQL > CREATE table CUK1 (id number);
    
    Table created.
    
    Elapsed: 00:00:00.48
    SQL > begin
      2    for i in 1..4 loop
      3      insert into cuk1 values (i);
      4    end loop;
      5    commit;
      6  end;
      7  /
    
    PL/SQL procedure successfully completed.
    
    Elapsed: 00:00:00.46
    SQL >
    SQL > create table cuk2 (id number);
    
    Table created.
    
    Elapsed: 00:00:00.48
    SQL >
    SQL > begin
      2    for i in 3..4 loop
      3      insert into cuk2 values (i);
      4    end loop;
      5    commit;
      6  end;
      7  /
    
    PL/SQL procedure successfully completed.
    
    Elapsed: 00:00:00.48
    SQL > select * from cuk1;
    
            ID
    ----------
             1
             2
             3
             4
    
    Elapsed: 00:00:00.59
    SQL > select * from cuk2;
    
            ID
    ----------
             3
             4
    
    Elapsed: 00:00:00.59
    SQL > select cuk1.id, cuk2.id
      2  from   cuk1 LEFT OUTER JOIN cuk2 ON (cuk1.id = cuk2.id);
    
            ID         ID
    ---------- ----------
             3          3
             4          4
             1
             2
    
    Elapsed: 00:00:00.64
    SQL > SELECT a.* FROM cuk1 a
      2  LEFT JOIN cuk2 b ON a.id = b.id
      3  WHERE b.id IS NULL;
    
            ID
    ----------
             1
             2
    
    Elapsed: 00:00:00.61
    SQL > SELECT a.* FROM cuk1 a
      2  LEFT JOIN cuk2 b ON (a.id = b.id AND b.id IS NULL);
    
            ID
    ----------
             1
             2
             3
             4
    
    Elapsed: 00:00:00.60
    SQL > SELECT a.*
      2  FROM   cuk1 a
      3  JOIN   cuk2 b ON ( a.id = b.id );
    
            ID
    ----------
             3
             4
    
    Elapsed: 00:00:00.62
    SQL > SELECT a.* FROM cuk1 a
      2  LEFT JOIN cuk2 b ON a.id = b.id
      3  WHERE b.id IS NOT NULL;
    
            ID
    ----------
             3
             4
    
    Elapsed: 00:00:00.59
    SQL > drop table cuk1;
    
    Table dropped.
    
    Elapsed: 00:00:00.51
    SQL >
    SQL > drop table cuk2;
    
    Table dropped.
    
    Elapsed: 00:00:00.57
    SQL >

    Leave a comment:


  • Spacecadet
    replied
    Originally posted by wurzel View Post
    What approach is correct?
    The first one

    Leave a comment:


  • wurzel
    started a topic SQL Left Join

    SQL Left Join

    Apologies if this is is a very basic question but my SQL is a bit rusty these days.

    Simple scenario, I have 2 tables, A & B and I want to return everything form table A that doesn't have a corresponding record in B. So I've tried the followwing 2 statements:

    SELECT a.* FROM TableA a
    LEFT JOIN TableB b ON a.id = b.id
    WHERE b.id IS NULL

    SELECT a.* FROM TableA a
    LEFT JOIN TableB b ON (a.id = b.id AND b.id IS NULL)

    They both return different results. I would have expected the results to be same for both.

    What approach is correct?

Working...
X