an unsystematic journey through the source-code of MySQL and MariaDB: November 2014

Friday, 14 November 2014

JOIN (3)

This is the third post with my observations from inspecting a simple SQL-statement. You can find the other posts here: JOIN and JOIN (2).
And this is the SQL-statement I'm looking at:

MariaDB [TestOpt]> select SQL_NO_CACHE count(A.PZN) from TestSmall A straight_join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367';
+--------------+
| count(A.PZN) |
+--------------+
|           14 |
+--------------+
1 row in set (0.48 sec)

MariaDB [TestOpt]>

The first post described what happens in the storage-engine (in my case this is of type MyISAM), the second post described what happens at the level(s) above. In this post I will narrow my focus and look at one very small point: applying the WHERE-clause.

This is a specialized description: it's only valid for the SQL-statement given above. Don't generalize it, but it should give you a starting point for your own examinations.

applying WHERE

As I've written in my last post you will find the code for handling this SQL-statement in the function sub_select() (in sql/sql_select.cc). The WHERE-clause is inspected in the function evaluate_join_record() (called from sub_select()). The whole WHERE-clause is handled by this single line in this function:


    select_cond_result= MY_TEST(select_cond->val_int());

As written before, the execution of this query starts with the table TestSmall, reading record by record and applying the WHERE-clause to each record. When a match is found it switches over to the table TestBig and searches for the corresponding records in this table. For each record found in TestBig the WHERE-clause is applied too.

So let's look at this code-line.

MY_TEST is simply a macro defined in my_global.h:


#define MY_TEST(a) ((a) ? 1 : 0)

It converts the result into a boolean value (0 or 1).
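To make this a bit more concrete, here is a tiny stand-alone sketch. Only the macro itself is taken from the server; the helper normalize_cond() is a name I made up for this example. Item::val_int() returns a long long, and the sketch simply shows that any non-zero value ends up as 1.

```cpp
// Stand-alone sketch, not server code. Only the macro is copied from my_global.h.
#define MY_TEST(a) ((a) ? 1 : 0)

// Hypothetical helper: takes a value like the one val_int() returns
// and collapses it to exactly 0 or 1.
inline int normalize_cond(long long val_int_result)
{
  return MY_TEST(val_int_result);
}
```

Calling normalize_cond(-7) or normalize_cond(1LL << 40) gives 1 and normalize_cond(0) gives 0. Note that simply assigning 1LL << 40 to a 32-bit int would lose the set bit, which is one reason for testing against zero instead of casting.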

select_cond is a pointer to an object of type COND, so let's look at this.

code

If we want to know what this COND contains I suggest modifying the code of MariaDB a bit (I did this in MariaDB but you can do it in MySQL too, it should be a similar task). You have to add these functions to the code in ha_myisam.cc:

display_functype()
ha_myisam::cond_push()
handle_Item()

These functions are described here: WHERE. Please do not forget to modify the header-file accordingly, and add the declaration of the 2 internal functions at the top of the cc-file.
Additionally I modified the function evaluate_join_record() (located in sql/sql_select.cc) and added the following 3 lines (the two fprintf() calls and the call of cond_push() between the DBUG-statements):

  DBUG_ENTER("evaluate_join_record");
  fprintf(stderr, "evaluate_join_record() for table: %s\n", join_tab->table->s->table_name.str);
  join_tab->table->file->cond_push( select_cond) ;
  fprintf(stderr, "end of COND.\n"); 
  DBUG_PRINT("enter", .............

The DBUG-statements should show you where to add these lines. The action happens in the call of the function cond_push() which writes the WHERE-clause to the file /var/log/mysql/error.log.
So please add these lines and functions, compile everything and start the server. We will now look at what happens when we execute the SQL-statement given above.

TestSmall

Let's inspect the first record from the table TestSmall. This is what this record looks like:

MariaDB [TestOpt]> select *  from TestSmall limit 1;
+---------+---------+------+------+--------------------------+---------------------+------------+
| Id      | PZN     | EVP  | HAP  | ArtikelBez               | ArtikelText         | Hersteller |
+---------+---------+------+------+--------------------------+---------------------+------------+
| 1000001 | 4972059 | 9.70 | 4.32 | PASCOSEDON TROPFEN 20 ML | PASCOSEDON TROPFEN  | 36350      |
+---------+---------+------+------+--------------------------+---------------------+------------+
1 row in set (0.00 sec)

MariaDB [TestOpt]>

When the SQL-statement from the beginning of this post is executed the function evaluate_join_record() is called, and for the first record evaluated the added code produces this output:

table_name = <TestSmall>    <select SQL_NO_CACHE count(*) from TestSmall A straight_join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367'>
COND-ITEM   args: 0 type=[COND_AND_FUNC]    
FUNC-ITEM   [=] args: 2 type=[EQ_FUNC]  
FIELD-ITEM  [TestOpt] [A] [Hersteller]  str_value=<36350> name=<Hersteller>
STRING-ITEM     str_value=<00020>   name=<00020>
FUNC-ITEM   [isnotnull] args: 1 type=[ISNOTNULL_FUNC]   
FIELD-ITEM  [TestOpt] [A] [PZN] name=<PZN>
end of COND.

Maybe this form is more intuitive for you:

                        AND
                      /     \
    A.Hersteller = '00020'   A.PZN IS NOT NULL

As you can see the WHERE-condition was modified by the server:

it contains only the part relevant for checking the table TestSmall
the check for PZN IS NOT NULL was added
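To illustrate the first point, here is a toy sketch of how a WHERE-clause could be split into per-table parts. This is not the code of the server: all names (table_map_t, Conjunct, cond_for_table(), example_where()) are invented for this example. The assumption is that each predicate knows which tables it references and can be checked as soon as all of those tables have been read. The sketch does not model the added IS NOT NULL check, and it would attach A.PZN = B.PZN to TestBig, whereas in the real output further down this equality does not appear because the index-lookup already takes care of it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model: each bit stands for one table of the join (hypothetical names).
typedef unsigned table_map_t;
const table_map_t TABLE_A = 1u << 0;   // TestSmall
const table_map_t TABLE_B = 1u << 1;   // TestBig

struct Conjunct
{
  std::string text;          // printable form of the predicate
  table_map_t used_tables;   // tables referenced by the predicate
};

// The three predicates of the statement inspected in this post.
std::vector<Conjunct> example_where()
{
  const Conjunct c1 = {"A.PZN = B.PZN", TABLE_A | TABLE_B};
  const Conjunct c2 = {"A.Hersteller = '00020'", TABLE_A};
  const Conjunct c3 = {"B.Hersteller = '36367'", TABLE_B};
  std::vector<Conjunct> where;
  where.push_back(c1);
  where.push_back(c2);
  where.push_back(c3);
  return where;
}

// Return the predicates that can be checked right after a record of
// `current` has been read, given that `already_read` tables are available.
std::vector<std::string> cond_for_table(const std::vector<Conjunct> &where,
                                        table_map_t already_read,
                                        table_map_t current)
{
  std::vector<std::string> result;
  const table_map_t available = already_read | current;
  for (std::size_t i = 0; i < where.size(); ++i)
  {
    const table_map_t used = where[i].used_tables;
    const bool computable = (used & ~available) == 0;   // nothing missing
    const bool needs_current = (used & current) != 0;   // not checked earlier
    if (computable && needs_current)
      result.push_back(where[i].text);
  }
  return result;
}
```

With this toy, cond_for_table(example_where(), 0, TABLE_A) contains only the check on A.Hersteller, which corresponds to the first point in the list above.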

Let's look into the function-hierarchy of the evaluation of this record:

TestSmall: check record with Id = 1000001 (compare: 36350 and  00020)
evaluate_join_record(): select_cond_result= MY_TEST(select_cond->val_int());
    Item_cond_and::val_int()
        Item::val_bool()
            Item_func_eq::val_int()
                Arg_comparator::compare()
                    Arg_comparator::compare_string()
                        Item_field::val_str()
                        Item_string::val_str()
                        sortcmp()
                            my_strnncollsp_simple()

The last function returns a non-zero value, which says that the two arguments compared are not equal, so the whole expression is false. At the top level there is an AND-operation: when one of its arguments is false the whole AND results in false, and therefore the right branch of the tree is not evaluated.

With this result the record is thrown away and the next record from TestSmall is read. This process will be repeated until this record is read:

MariaDB [TestOpt]> select * from TestSmall where Hersteller = '00020' limit 1;
+---------+--------+------+------+--------------------------------+----------------------------+------------+
| Id      | PZN    | EVP  | HAP  | ArtikelBez                     | ArtikelText                | Hersteller |
+---------+--------+------+------+--------------------------------+----------------------------+------------+
| 1002100 | 178324 | 3.95 | 1.83 | TINY TOON ADVENTURES KIND 1 ST | TINY TOON ADVENTURES KIND  | 00020      |
+---------+--------+------+------+--------------------------------+----------------------------+------------+
1 row in set (0.01 sec)

MariaDB [TestOpt]>

When the WHERE-clause is applied to this record this is the function-hierarchy in this case:

TestSmall: check the record with Id = 1002100
evaluate_join_record(): select_cond_result= MY_TEST(select_cond->val_int());
    Item_cond_and::val_int()
        Item::val_bool()
            Item_func_eq::val_int()
                Arg_comparator::compare()
                    Arg_comparator::compare_string()
                        Item_field::val_str()
                        Item_string::val_str()
                        sortcmp()
                            my_strnncollsp_simple()
            Item::val_bool()
                Item_func_isnotnull::val_int()
                    Item_field::is_null()
                        Field::is_null()

As you can see the right branch of the tree is also checked for this record as the column Hersteller contains the correct value. So the server switches over to the table TestBig and searches for records with the value '178324' in the column PZN (the value is taken from the current record from TestSmall).
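The two hierarchies above differ only in whether the second argument of the AND is evaluated. Here is a minimal sketch of this behaviour with toy classes. The names ToyItem, ToyEq, ToyIsNotNull and ToyAnd are my own, they only mimic the classes in sql/item.h. The real Item_cond_and::val_int() also has to deal with NULL-values, which I ignore here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the base-class Item (hypothetical, heavily simplified).
struct ToyItem
{
  virtual ~ToyItem() {}
  virtual long long val_int() = 0;
};

// field = 'constant'
struct ToyEq : ToyItem
{
  const std::string *field;   // column value of the current record
  std::string constant;
  int calls;                  // counts how often this node was evaluated
  ToyEq(const std::string *f, const std::string &c)
    : field(f), constant(c), calls(0) {}
  long long val_int() { ++calls; return *field == constant ? 1 : 0; }
};

// field IS NOT NULL
struct ToyIsNotNull : ToyItem
{
  const bool *is_null;        // NULL-flag of the column in the current record
  int calls;
  explicit ToyIsNotNull(const bool *n) : is_null(n), calls(0) {}
  long long val_int() { ++calls; return *is_null ? 0 : 1; }
};

// AND over a list of arguments: stops at the first argument that is false.
struct ToyAnd : ToyItem
{
  std::vector<ToyItem *> args;
  long long val_int()
  {
    for (std::size_t i = 0; i < args.size(); ++i)
      if (!args[i]->val_int())
        return 0;             // the remaining arguments are not evaluated
    return 1;
  }
};
```

If you build a ToyAnd from a ToyEq on Hersteller and a ToyIsNotNull on PZN, then for a record with Hersteller = '36350' only the ToyEq is called. For a record with Hersteller = '00020' both nodes are called, just like in the second hierarchy.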

TestBig

Again I will only look at the code that handles the WHERE-clause. So here is the first record from TestBig to evaluate:

MariaDB [TestOpt]> select * from TestBig where PZN = 178324 limit 1;
+----------+--------+--------+------+--------------------------------------+----------------------------+------------+
| Id       | PZN    | EVP    | HAP  | ArtikelBez                           | ArtikelText                | Hersteller |
+----------+--------+--------+------+--------------------------------------+----------------------------+------------+
| 01046189 | 178324 |  10.45 | 7.99 | MARLENE MS K1 AD K BLACK M      2 ST | BASILIKUM AETHERISCHES OEL | 20860      |
+----------+--------+--------+------+--------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)

MariaDB [TestOpt]>

This is the WHERE-clause as printed by the code shown above in the case of this record from the table TestBig:

table_name = <TestBig>  <select SQL_NO_CACHE count(A.PZN) from TestSmall A straight_join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367'>
FUNC-ITEM   [=] args: 2 type=[EQ_FUNC]  
FIELD-ITEM  [TestOpt] [B] [Hersteller]  str_value=<20860>   name=<Hersteller>
STRING-ITEM     str_value=<36367>   name=<36367>
end of COND.

And here it is in a different form:

This hierarchy shows how this WHERE-clause is applied:

TestBig: check record with Id = 01046189 (compare: 20860 and 36367)
evaluate_join_record()
    Item_func_eq::val_int()
        Arg_comparator::compare()
            Arg_comparator::compare_string()
                sortcmp()
                    my_strnncollsp_simple()

The last function returns a non-zero value so this record is thrown away and the next record is read from the table TestBig (the next record with the value '178324' in the column PZN).
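The comparison at the bottom of this hierarchy can be imitated with a few lines. This is only a sketch under my assumptions: the function compare_space_padded() is invented for this example, it treats the shorter string as if it were padded with spaces (which is what the "sp" in the name of my_strnncollsp_simple() hints at) and compares byte by byte. The real function additionally maps every byte through the sort-order of the collation, which I leave out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy comparison: returns 0 if both strings are equal when
// trailing spaces are ignored, otherwise a negative or positive value.
int compare_space_padded(const std::string &a, const std::string &b)
{
  const std::size_t n = a.size() > b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i)
  {
    // positions beyond the end of a string count as a space
    const unsigned char ca = i < a.size() ? static_cast<unsigned char>(a[i]) : ' ';
    const unsigned char cb = i < b.size() ? static_cast<unsigned char>(b[i]) : ' ';
    if (ca != cb)
      return ca < cb ? -1 : 1;
  }
  return 0;
}
```

For the record shown above, compare_space_padded("20860", "36367") is non-zero, so the record does not match.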

COND

On entering the function evaluate_join_record() the (partial) WHERE-clause is extracted, a variable of type COND * points to this object.

COND is nothing more than an alias for the class Item, which is defined in the file sql/item.h. Item is the base-class for a lot of classes used for bringing the WHERE-clause into an internal form. You can find a class-hierarchy here: Item Class Reference. Don't let this page confuse you. As you've seen in the hierarchies above the server used some of these classes to construct our WHERE-clause. And thanks to inheritance and object-orientation it was fairly easy to apply this WHERE-clause to a record.

correctness

This text is still a simplified presentation of what's going on in the software. In case something is wrong with my description please use the mail-function of this site and send me a mail describing my error. I will look at it. Thanks.

Wednesday, 5 November 2014

JOIN (2)

Here I'm presenting some more information about how MySQL/MariaDB handles a JOIN.

In my post JOIN I described what the storage-engine does in executing a simple (but non-trivial) SQL-statement. Here I take the same statement and describe what the database-server does. So in my last post my description started when the server calls a function in the engine, in this post my description will end when such a function is called. The last post described function-hierarchies in the MyISAM-engine, this post will describe hierarchies above the engine-level and therefore valid for all storage-engines.

environment

The environment is the same as in the post JOIN. There are two tables: TestSmall with approx. 300K records in it and TestBig with 10 million records in it. The structure of the tables is the same as before, and both tables have an index on the column PZN.

statement

And here is the statement I want to inspect:

MariaDB [TestOpt]> select SQL_NO_CACHE count(A.PZN) from TestSmall A straight_join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367';
+--------------+
| count(A.PZN) |
+--------------+
|           14 |
+--------------+
1 row in set (0.48 sec)

MariaDB [TestOpt]> explain select SQL_NO_CACHE count(A.PZN) from TestSmall A straight_join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: A
         type: ALL
possible_keys: PZN
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 301036
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: B
         type: ref
possible_keys: PZN
          key: PZN
      key_len: 8
          ref: TestOpt.A.PZN
         rows: 448
        Extra: Using where
2 rows in set (0.00 sec)

MariaDB [TestOpt]>

Also included is the EXPLAIN for this SQL-statement.

pseudocode

Here is what the server does (in pseudocode): the server scans the table TestSmall and fetches record after record (table-scan). For each record the WHERE-clause will be applied and if a match is found the server extracts from the current record the value of the column PZN, switches over to the table TestBig and searches all records with this value. For this operation it uses the index on the column PZN of TestBig. Again for each record found in TestBig the WHERE-clause is applied, in the case of a match this record is handled.
This process is continued in TestBig until all records with this PZN-value are examined. It then switches back to the table TestSmall and continues scanning through this table.
The whole process stops when all records in TestSmall are read.
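The description above can be written down as a small stand-alone program fragment. This is only a sketch of the idea, not the code of the server: the struct Row, the type PznIndex and the function count_matches() are my own names, the table TestSmall is modeled as a std::vector and the index on TestBig.PZN as a std::multimap.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy record: only the two columns used by the statement.
struct Row
{
  long pzn;
  std::string hersteller;
};

// The index on the column PZN of TestBig, modeled as PZN -> record.
typedef std::multimap<long, Row> PznIndex;

// Nested-loop join: scan TestSmall, look up matching records in TestBig
// via the index, apply the part of the WHERE-clause belonging to each table.
long count_matches(const std::vector<Row> &test_small,
                   const PznIndex &test_big_by_pzn,
                   const std::string &hersteller_a,
                   const std::string &hersteller_b)
{
  long count = 0;
  for (std::size_t i = 0; i < test_small.size(); ++i)        // table-scan
  {
    const Row &a = test_small[i];
    if (a.hersteller != hersteller_a)                         // WHERE on TestSmall
      continue;
    // switch over to TestBig: all records with the same PZN-value
    std::pair<PznIndex::const_iterator, PznIndex::const_iterator> range =
        test_big_by_pzn.equal_range(a.pzn);
    for (PznIndex::const_iterator it = range.first; it != range.second; ++it)
      if (it->second.hersteller == hersteller_b)              // WHERE on TestBig
        ++count;
    // back to TestSmall, continue the scan
  }
  return count;
}
```

Called with the values '00020' and '36367' this function plays the role of the count() in the statement, of course on whatever toy data you fill into the two containers.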

hierarchy

Instead of long explanations here is the function-hierarchy for executing this statement:

 JOIN::exec()
     JOIN::exec_inner()
         do_select()
             sub_select()
                 // access the first record:
                 (*join_tab->read_first_record)() -> join_init_read_record()
                     init_read_record()
                         table->file->ha_rnd_init_with_error() -> handler::ha_rnd_init_with_error()
                             ha_rnd_init() -> handler::ha_rnd_init()
                                 rnd_init() -> ha_myisam::rnd_init()
                         (*tab->read_record.read_record)() -> rr_sequential()
                             info->table->file->ha_rnd_next() -> handler::ha_rnd_next()
                                 rnd_next() -> ha_myisam::rnd_next()
                 evaluate_join_record()

                 // iterate: read all records until match
                 info->read_record() -> rr_sequential()
                     info->table->file->ha_rnd_next() -> handler::ha_rnd_next()
                         rnd_next() -> ha_myisam::rnd_next()
                 evaluate_join_record()
                 // until END_OF_FILE

The whole work is done in the function sub_select(). The function is entered with 2 objects plus a boolean, let's look at the objects: an object of class JOIN and an object of struct JOIN_TAB. The class JOIN contains information about all tables to be accessed in executing this query, the struct JOIN_TAB contains information about one table, the table that is inspected by this call of sub_select().

If you want to look into the data of these parameters please add the following lines in the code of the function sub_select() (found in the file sql/sql_select.cc) after entrance into this function:

  // debugging aid: list all tables known to the JOIN
  // (the loop relies on join->table being terminated by a NULL-pointer)
  TABLE ** ptrTables = join->table;
  const char *ptrTable ;
  fprintf(stderr, "entering sub_select:\n\tstructure JOIN:\n");
  while ( *ptrTables != NULL )
  {
      ptrTable = (*ptrTables)->s->table_name.str;
      fprintf(stderr, "\t\ttable: <%s>\n", ptrTable);
      ++ptrTables;
  }
  // the table handled by this call of sub_select()
  const char *ptr2ndTable = join_tab->table->s->table_name.str;
  fprintf(stderr, "\tstructure JOIN_TAB:\n\t\ttable: <%s>\n", ptr2ndTable);

And here is the output found in the log-file (on my machine it's in /var/log/mysql/error.log):

entering sub_select: 
    structure JOIN:
        table: <TestSmall>
        table: <TestBig>
    structure JOIN_TAB:
        table: <TestSmall>

As you can see the JOIN contains information about the tables TestSmall and TestBig (in this order), JOIN_TAB contains information about TestSmall and we will start with this table.

Coming back to the function-call-hierarchy given above let's break it into pieces and look at it. As reading the first record may need an extra step for initialization this access is handled separately:

in sub_select():
                 (*join_tab->read_first_record)() -> join_init_read_record()
                     init_read_record()
                         table->file->ha_rnd_init_with_error() -> handler::ha_rnd_init_with_error()
                             ha_rnd_init() -> handler::ha_rnd_init()
                                 rnd_init() -> ha_myisam::rnd_init()
                         (*tab->read_record.read_record)() -> rr_sequential()
                              info->table->file->ha_rnd_next() -> handler::ha_rnd_next()
                                  rnd_next() -> ha_myisam::rnd_next()
                 evaluate_join_record()

The last function in this hierarchy is evaluate_join_record(). In this function the record read is examined (= the WHERE-clause will be applied).

After the first record is read and processed all following records are read by this loop (again I show you the hierarchy of function-calls, not the full code executed):

in sub_select():
                // iterate: read all records until match
                info->read_record() -> rr_sequential()
                    info->table->file->ha_rnd_next() -> handler::ha_rnd_next()
                        rnd_next() -> ha_myisam::rnd_next()
                evaluate_join_record()
                // until END_OF_FILE

That's all.

But I've only shown how the table TestSmall is accessed. So where does the code jump over and access TestBig? This happens in the function evaluate_join_record(). Here the WHERE-clause is applied to the current record (from TestSmall) and in case of a match the function sub_select() is called (a recursion). This is the line of code in evaluate_join_record() that calls sub_select():

      /* A match from join_tab is found for the current partial join. */
      rc= (*join_tab->next_select)(join, join_tab+1, 0);

On entering the function sub_select() again the parameters now look like:

entering sub_select: 
    structure JOIN:
        table: <TestSmall>
        table: <TestBig>
    structure JOIN_TAB:
        table: <TestBig>

You see that the data in JOIN is identical to the first entry but the value of the JOIN_TAB now points to data regarding TestBig.
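The recursion via the function-pointer next_select can be sketched with toy structures. Everything here is hypothetical (ToyJoin, ToyJoinTab, toy_sub_select(), toy_end()), only the shape of the call through next_select with join_tab + 1 is taken from the line shown above. In this sketch every record is treated as a match, and the last table points to a made-up end-function that just counts result rows.

```cpp
#include <string>
#include <vector>

struct ToyJoin;
struct ToyJoinTab;
typedef int (*next_select_func)(ToyJoin *, ToyJoinTab *, bool);

// Information about one table (toy version of JOIN_TAB).
struct ToyJoinTab
{
  const char *table_name;
  int rows;                      // number of records this table delivers
  next_select_func next_select;  // what to call when a record matches
};

// Information about the whole query (toy version of JOIN).
struct ToyJoin
{
  std::vector<std::string> trace;  // which table each call was entered with
  long result_rows;
  ToyJoin() : result_rows(0) {}
};

// Hypothetical end-function: reached when a record of the last table matched.
int toy_end(ToyJoin *join, ToyJoinTab *, bool)
{
  ++join->result_rows;
  return 0;
}

int toy_sub_select(ToyJoin *join, ToyJoinTab *join_tab, bool /*end_of_records*/)
{
  join->trace.push_back(join_tab->table_name);
  for (int i = 0; i < join_tab->rows; ++i)
  {
    // stands in for evaluate_join_record(): a match was found,
    // so continue with the next table (or with the end-function)
    const int rc = (*join_tab->next_select)(join, join_tab + 1, false);
    if (rc != 0)
      return rc;
  }
  return 0;
}
```

With an array of two ToyJoinTab entries, the first one for TestSmall with next_select = toy_sub_select and the second one for TestBig with next_select = toy_end, a call of toy_sub_select() for the first entry enters the function once for TestSmall and once more for TestBig for every record of TestSmall.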

As described above the code in sub_select() checks for the first record, now in the table TestBig:

in sub_select():
                        // search for the first record:
                        (*join_tab->read_first_record)() -> join_read_always_key()
                             table->file->ha_index_init() -> handler::ha_index_init()
                                 index_init() -> ha_myisam::index_init()
                             table->file->ha_index_read_map() -> handler::ha_index_read_map()
                                 index_read_map() -> ha_myisam::index_read_map()
                         evaluate_join_record()

This looks like the same hierarchy as before, but the functions called differ because a different JOIN_TAB is used.

If there are more records in TestBig these are fetched by these function-calls (still in sub_select()):

in sub_select():
                         // iterate: search all records until a no-match
                         info->read_record() -> join_read_next_same()
                             table->file->ha_index_next_same() -> handler::ha_index_next_same()
                                 index_next_same() -> ha_myisam::index_next_same()
                         evaluate_join_record()
                         // until NESTED_LOOP_NO_MORE_ROWS
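The point that the same loop in sub_select() reads TestSmall by a table-scan and TestBig by an index-lookup, only because other functions are plugged in, can be shown with a toy model. All names are invented for this sketch (ToyTab, scan_first() and so on) and the "index" is simply a sorted vector of PZN-values.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct ToyTab;
typedef int (*read_func)(ToyTab *);

const int TOY_OK = 0;
const int TOY_NO_MORE_ROWS = -1;

struct ToyTab
{
  std::vector<long> keys;       // PZN-values, sorted ascending
  long lookup_key;              // value to search for (index-lookup only)
  std::size_t pos;              // current position
  read_func read_first_record;  // how to get the first record
  read_func read_record;        // how to get every following record
};

// table-scan: start at the beginning, stop at the end of the table
int scan_first(ToyTab *t)
{
  t->pos = 0;
  return t->pos < t->keys.size() ? TOY_OK : TOY_NO_MORE_ROWS;
}
int scan_next(ToyTab *t)
{
  ++t->pos;
  return t->pos < t->keys.size() ? TOY_OK : TOY_NO_MORE_ROWS;
}

// index-lookup: position on the first equal key, stop at the first other key
int ref_first(ToyTab *t)
{
  t->pos = std::lower_bound(t->keys.begin(), t->keys.end(), t->lookup_key)
           - t->keys.begin();
  return (t->pos < t->keys.size() && t->keys[t->pos] == t->lookup_key)
         ? TOY_OK : TOY_NO_MORE_ROWS;
}
int ref_next_same(ToyTab *t)
{
  ++t->pos;
  return (t->pos < t->keys.size() && t->keys[t->pos] == t->lookup_key)
         ? TOY_OK : TOY_NO_MORE_ROWS;
}

// The driver is identical for both access-methods; returns the records read.
std::size_t toy_read_all(ToyTab *t)
{
  std::size_t records = 0;
  int error = (*t->read_first_record)(t);
  while (error == TOY_OK)
  {
    ++records;                  // here evaluate_join_record() would be called
    error = (*t->read_record)(t);
  }
  return records;
}
```

Setting read_first_record/read_record to scan_first/scan_next makes toy_read_all() visit every entry, setting them to ref_first/ref_next_same makes it visit only the entries equal to lookup_key.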

And this finishes our journey through the handling of the given SQL-statement.