Monday, 17 April 2017

subselect: execution (5)


I want to come back to my initial plan and look into the code, following the flow of C-statements while executing one given SQL-statement. For this text I will look in detail how the server executes the WHERE-clause. I will start my description with the the WHERE applied to the table TestBig (the subselect-part), and then look at the handling of the WHERE for the table TestSmall.


warning

Before I describe my observation let me issue a warning: the functions, hierarchies and data-examples shown here are valid for the environment on my PC. If you do your own tests this may look a bit different.


environment

Again I'm using my usual environment which I already described in one of my older posts. It's the same tables, the same data in the tables, it's even the same SQL-statement as in my last post. But for this post my focus changes. In this text I will look into the code that realizes the WHERE-clause.


the SQL-statement

For this text this is the statement I want to inspect:
MariaDB [TestOpt]> select SQL_NO_CACHE   B.PZN, B.ArtikelText  from TestSmall B where exists ( select 1 from TestBig A where A.Hersteller = '36367' and A.PZN = B.PZN) and B.Hersteller = '00020';
+--------+----------------------------+
| PZN    | ArtikelText                |
+--------+----------------------------+
| 178324 | TINY TOON ADVENTURES KIND  |
| 87886  | MONDOS KONDOME COLOR       |
| 87923  | MONDOS KONDOME NATURE      |
| 87952  | MONDOS KONDOME XXL         |
+--------+----------------------------+
4 rows in set (25 min 4.84 sec)

MariaDB [TestOpt]> 



WHERE

As I've written in my last text the WHERE-clause for the table TestBig looks like that:

For this text I will sharpen my look and will only look at this detail: the WHERE-clause. So where is the code that executes this WHERE?


the code

You will find the code for handling the WHERE in the file sql_select.cc:
static enum_nested_loop_state
evaluate_join_record(JOIN *join, JOIN_TAB *join_tab,
                     int error)
{
  ....
  COND *select_cond= join_tab->select_cond;
  ....
  if (select_cond)
  {
    select_cond_result= MY_TEST(select_cond->val_int());
  ....
}
I've marked the relevant part in bold.

MY_TEST is simply a macro that converts any output into 1 or 0. So the WHERE-clause is represented by the call of val_int() of the object pointed to by the variable select_cond. That's all.

Naturally that's not all. For this concrete example I will show you how the tree, given in graphical form above, is handled by this simple call. The tree is constructed somewhere else, but this is not part of this text. So let's simply start with an example to show the function-hierarchy that is equivalent to the tree.


applying WHERE to TestBig

And as I've already written I will start with the WHERE applied to records read from the table TestBig.

So here is the record that I will examine:
MariaDB [TestOpt]> select * from TestBig limit 1;
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| Id       | PZN    | EVP    | HAP  | ArtikelBez                         | ArtikelText                | Hersteller |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| 01000000 | 999999 | 132.36 | 0.00 | UROSAFE BEINBEUTEL 7732C      1  L | SPRING DE LU AM K1 NO TOP5 | 25677      |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)

MariaDB [TestOpt]>

As you can see the value in the column Hersteller does not match the condition so the left part of the tree results in false. As the right and the left part of the tree are combined via an AND-statement the right part does not need to be examined.

first example: left part of the tree

For the record given here his is the hierarchy that detects and handles this:
Item_cond_and::val_int()        called via the variable select_cond
    Item::val_bool()
        Item_func_eq::val_int()
            Arg_comparator::compare()
                Arg_comparator::compare_string()
                    Item_field::val_str()
                    Item_string::val_str()
                    sortcmp()

The function sortcmp() returns with a value of -1 ( = not equal to 0) because the values differ (36367 ≠ 25677). So the code jumps back, the function item_func_eq::val_int() sets the return-value to 0 (= false) and at the AND-level will returns with false as a return-value.
This record from TestBig is thrown away. The server continues with reading the next record from TestBig.

second example: right part of the tree

The next record fetched from TestBig looks like this:
MariaDB [TestOpt]> select * from TestBig limit 1,1;
+----------+------+------+-------+-------------------------+---------------------+------------+
| Id       | PZN  | EVP  | HAP   | ArtikelBez              | ArtikelText         | Hersteller |
+----------+------+------+-------+-------------------------+---------------------+------------+
| 01000001 | 111  | 0.00 | 18.91 | PK FUER TIERE    250 ML | URETERSCHIENEAC4P66 | 36367      |
+----------+------+------+-------+-------------------------+---------------------+------------+
1 row in set (0.00 sec)

MariaDB [TestOpt]> 

As you can see for this record the column Hersteller contains the correct value so the left part of the tree returns true. Now the right part has to be examined. As I've already described the left part of the tree I will generously skip over this part. Here is how this record is checked:
Item_cond_and::val_int()        called via the variable select_cond
    // checking the column Hersteller, same as above, omitted here
    // as the return-value is true for this case the comparison continues:
    Item::val_bool():   
        Item_func_eq::val_int()
            Arg_comparator::compare()
                Arg_comparator::compare_string()
                    Item_field::val_str()
                    Item_field::val_str()
                    sortcmp()

The only difference to the hierarchy given above is inside the function compare_string(). Now two columns have to be compared instead of a column with a string-constant, so compare TestSmall.PZN and TestBig.PZN. As the values are 111 (for A aka TestBig) and 12 (for B = TestSmall, shown later in this text) so the comparisons results in false (aka not equal) and this record is thrown away. The server continues with reading the next record from TestBig and so on.

As you can see from the box in the beginning of this text there are 4 records that will be returned by this statement so there must be 4 records in the table TestBig where both parts of the tree result in true. These records are handled further by the server but this is not part of this text.

coding the AND

Here is the code that realizes the AND:
longlong Item_cond_and::val_int()
{
  DBUG_ASSERT(fixed == 1);
  List_iterator_fast li(list);
  Item *item;
  null_value= 0;
  while ((item=li++))
  {
    if (!item->val_bool())
    {
      if (abort_on_null || !(null_value= item->null_value))
    return 0;               // return FALSE
    }
  }
  return null_value ? 0 : 1;
}

The while-loop realizes the AND, that's all. If one of the computations inside the brackets returns a false the whole statement is false (it's an AND !) so the function is left with a return-value FALSE.


applying WHERE to TestSmall

So I've shown how the WHERE in the subselect-part of the statement is handled. Now I want to show how the WHERE is applied to the table TestSmall. And this is, in graphical form, the condition to be applied:

third example: checking TestSmall

Here is the first record from TestSmall that that matches the WHERE-condition:
MariaDB [TestOpt]> select * from TestSmall where Hersteller = '00020' limit 1;
+---------+------+------+------+--------------------------------+----------------------------+------------+
| Id      | PZN  | EVP  | HAP  | ArtikelBez                     | ArtikelText                | Hersteller |
+---------+------+------+------+--------------------------------+----------------------------+------------+
| 1002100 | 12   | 3.95 | 1.83 | TINY TOON ADVENTURES KIND 1 ST | TINY TOON ADVENTURES KIND  | 00020      |
+---------+------+------+------+--------------------------------+----------------------------+------------+
1 row in set (0.04 sec)

MariaDB [TestOpt]> 

Again I omit the left part of the tree as it's already shown above. So let's look at what happens on the right side of this tree. Here is the function-hierarchy:
Item_cond_and::val_int()        called via the variable select_cond
    // checking the column Hersteller, same as above, omitted here
    // as the return-value is true for this case the comparison continues:
    Item_cache_wrapper::val_bool()
        Item_cache_wrapper::cache()
            Item_cache_int::cache_value()
                Item::val_int_result() 
                    Item_exists_subselect::val_int()
                        Item_subselect::exec()
                             subselect_single_select_engine::exec()
                                JOIN::exec()
                                    JOIN::exec_inner()
                                        do_select()
                                            sub_select() 
                                                // read record from table
                                                // evaluate this record
                                                // until EOF

In the graphical representation the right part of the tree simply contains the box SUBSELCT. This is a statement of its own which is handled as (partially) described above. The result is put into a cache. The hierarchy shows in the lower parts how the table TestBig is scanned and in the upper parts how the cache is checked for results. But that's another story.



correctness

This text is a simplified presentation of what's going on in the software. As I'm still learning the inner-workings of this software errors on my side can happen. In case something is wrong with my description please use the mail-function of this site and send me a mail describing my error. I will look at it. Thanks.