Friday, 10 July 2015

0xCA

In the last post I described a numerical value: 0xA8, which was used in the this post but I didn't explain it - the last post cleared this (I hope). Now if you read the post you will see another value used: 0xCA, again not explained so I want to use this post to explain the meaning of this value and where it comes from.

some hints

As in the last post I did all my tests here with MariaDB, the code-examples presented here are taken from the source-code of MariaDB 10.0.10, but the explanations should also be valid for MySQL.

Please do not generalize the results presented here. A lot of the data on this post depends on the data in the tables, and this is my data. If you do your own tests don't expect to see the same numbers (like the values 0xA8 in JOIN_TAB->const_keys).

usage

Where is the value 0xCA used? Let's simply look into the code:
/**
  Find shortest key suitable for full table scan.

  @param table                 Table to scan
  @param usable_keys           Allowed keys

  @return
    MAX_KEY     no suitable key found
    key index   otherwise
*/

uint find_shortest_key(TABLE *table, const key_map *usable_keys)
{
  double min_cost= DBL_MAX;
  uint best= MAX_KEY;
  if (!usable_keys->is_clear_all())
  {
    for (uint nr=0; nr < table->s->keys ; nr++)
    {
      if (usable_keys->is_set(nr))
      {
        double cost= table->file->keyread_time(nr, 1, table->file->records());
        if (cost < min_cost)
        {
          min_cost= cost;
          best=nr;
        }
      }
    }
  }
  return best;
}
As you can see it looks for the (estimated) fastest way to do a full table-scan using an index. For finding the best way it iterates on all indexes defined for a table and the parameter usable_keys with the value 0xCA tells the code which index contains usable information and therefore costs should be calculated.

And this is the function hierarchy:
make_join_statistics()
    get_quick_record_count()
        SQL_SELECT::test_quick_select()        parameter: &head->covering_keys
            find_shortest_key()                parameter: usable_keys
As shown above the parameter hast the name usable_keys in the function find_shortest_key(). One level up in the hierarchy it comes from head->covering_keys:
SQL_SELECT::test_quick_select(
    ........
    /* Calculate cost of full index read for the shortest covering index */
    if (!head->covering_keys.is_clear_all())
    {
      int key_for_use= find_shortest_key(head, &head->covering_keys);
      ....
    }
    .........
}
And head is a variable of type TABLE *, here is the definition:
class SQL_SELECT :public Sql_alloc {
 public:
  ....
  TABLE *head;
  ....
};
So let's look into the definition of a TABLE:
struct TABLE
{
  ....
  /* 
    Map of keys that can be used to retrieve all data from this table 
    needed by the query without reading the row.
  */
  key_map covering_keys;
  ....
};

where does it come from

Where does covering_keys gets it's value? Here this happens:
static void update_field_dependencies(THD *thd,
                                      Field *field,
                                      TABLE *table)
{
    ....
    table->covering_keys.intersect(field->part_of_key);   
    ....
  DBUG_VOID_RETURN;
}
This function is called with the parameters table TestBig and field Hersteller. In this call the operation is 0xFF AND 0xEA, resulting in a value of 0xEA. In a second call the field is PZN and its computing 0xEA AND 0xDA, resulting in a value of 0xCA, our value.

Some lines above I've shown where covering_keys belongs to, and I will soon explain where part_of_key belongs to.

Here is the function hierarchy:
handle_select()
    mysql_select()
        JOIN::prepare()
            setup_without_group()
                setup_conds()
                    Item_cond::fix_fields()
                        Item_func::fix_fields()
                            Item_field::fix_fields()
                                find_field_in_tables()
                                    find_field_in_table_ref()
                                        find_field_in_table()
                                            update_field_dependencies()
Warning:: the function-hierarchy shown here depends on the SQL-statement used. So in your tests this hierarchy may look different.

Now we have this status: I wanted to explain where the value 0xCA comes from and I said: take the values 0xFF, 0xEA and 0xDA and AND them together, this will produce the value 0xCA. So now I have to explain 3 values not one, a remarkable progress.

part_of_key

The member-var. part_of_key is defined in the class Field:
 class Field
{
  ......
  /* Field is part of the following keys */
  key_map key_start, part_of_key, part_of_key_not_clustered;
  ....
};
and it contains as a bitmap the information in which index this field is used. And the var. key_start was explained in the last post.

So here is a list of all indexes defined in my case on the table TestBig:
MariaDB [TestOpt]> show index from TestBig;  
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| TestBig |          1 | EVP        |            1 | EVP         | A         |       11350 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            2 | ArtikelText | A         |     1666666 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            3 | Hersteller  | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HAP        |            1 | HAP         | A         |       10162 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            2 | ArtikelText | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            3 | PZN         | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PZN        |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Hersteller |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PznHerst   |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PznHerst   |            2 | Hersteller  | A         |     1111111 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HerstPzn   |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HerstPzn   |            2 | PZN         | A         |     1111111 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
14 rows in set (0.00 sec)

MariaDB [TestOpt]> 
The index with the name EVP is the first one in the list and in the bitmaps it is represented by the 0th bit. Similar the index with the name Ind1 is the second one defined and therefore represented by bit 1 in the bitmaps and so on. So I can explain the values used:
  • 0xFF = 011111111b: as we have 8 indexes this value represents an inital value: use all indexes
  • 0xEA = 011101010b: one may use indexes Ind1, Ind12, Hersteller, PznHerst and HerstPzn: all indexes which use the field Hersteller
  • 0xDA = 011011010b: one may use indexes Ind1, Ind2, PZN, PznHerst and HerstPzn: all indexes which use the field PZN

looking for part_of_key: You will find this code-excerpt in the function TABLE_SHARE::init_from_binary_frm_image():
    for (uint key=0 ; key < keys ; key++,keyinfo++)                     // iterate over the indexes defined
    {
      .....
      for (i=0; i < key_parts; key_part++, i++)                         // iterate over the parts of this index
      {
        Field *field;
        ....
        field= key_part->field= share->field[key_part->fieldnr-1];      // get the Field-object for this part of this index
        ....
        if (i == 0)                                                     // if this is the first part of the index
          field->key_start.set_bit(key);                                // set the corresponding bit in key_start (see last post)
        if (field->key_length() == key_part->length &&
            !(field->flags & BLOB_FLAG))
        {
          if (handler_file->index_flags(key, i, 0) & HA_KEYREAD_ONLY)
          {
            share->keys_for_keyread.set_bit(key);                       // a little of carelessness: it's called 14 times to set 8 bits
            field->part_of_key.set_bit(key);                         // here these bits are set
            if (i < keyinfo->user_defined_key_parts)
              field->part_of_key_not_clustered.set_bit(key);
          }
          if (handler_file->index_flags(key, i, 1) & HA_READ_ORDER)
            field->part_of_sortkey.set_bit(key);
        }
        ....
      }
      ....
    }

The value in covering all indexes (aka 0xFF) is copied here:
inline void setup_table_map(TABLE *table, TABLE_LIST *table_list, uint tablenr)
{
  ....
  table->covering_keys= table->s->keys_for_keyread;
  ....
}

With these lines I want to stop. I hope I could explain where the value 0xCA comes from and what the meaning of it is.


correctness

This text is still a simplified presentation of what's going on in the software. In case something is wrong with my description please use the mail-function of this site and send me a mail describing my error. I will look at it. Thanks.