Friday, 14 August 2015

Impressum

to be compliant with German law I have to add these lines:

Angaben gemäß § 5 TMG:

August Quint
Gronaustrasse 7
D-65205 Wiesbaden

Kontakt:

Mail: August.Quint@googlemail.com

Verantwortlich für den Inhalt nach § 55 Abs. 2 RStV:

August Quint

Haftungsausschluss (Disclaimer)

Haftung für Inhalte
Als Diensteanbieter sind wir gemäß § 7 Abs.1 TMG für eigene Inhalte auf diesen Seiten nach den allgemeinen Gesetzen verantwortlich. Nach §§ 8 bis 10 TMG sind wir als Diensteanbieter jedoch nicht verpflichtet, übermittelte oder gespeicherte fremde Informationen zu überwachen oder nach Umständen zu forschen, die auf eine rechtswidrige Tätigkeit hinweisen. Verpflichtungen zur Entfernung oder Sperrung der Nutzung von Informationen nach den allgemeinen Gesetzen bleiben hiervon unberührt. Eine diesbezügliche Haftung ist jedoch erst ab dem Zeitpunkt der Kenntnis einer konkreten Rechtsverletzung möglich. Bei Bekanntwerden von entsprechenden Rechtsverletzungen werden wir diese Inhalte umgehend entfernen.
Haftung für Links
Unser Angebot enthält Links zu externen Webseiten Dritter, auf deren Inhalte wir keinen Einfluss haben. Deshalb können wir für diese fremden Inhalte auch keine Gewähr übernehmen. Für die Inhalte der verlinkten Seiten ist stets der jeweilige Anbieter oder Betreiber der Seiten verantwortlich. Die verlinkten Seiten wurden zum Zeitpunkt der Verlinkung auf mögliche Rechtsverstöße überprüft. Rechtswidrige Inhalte waren zum Zeitpunkt der Verlinkung nicht erkennbar. Eine permanente inhaltliche Kontrolle der verlinkten Seiten ist jedoch ohne konkrete Anhaltspunkte einer Rechtsverletzung nicht zumutbar. Bei Bekanntwerden von Rechtsverletzungen werden wir derartige Links umgehend entfernen.
Urheberrecht
Die durch die Seitenbetreiber erstellten Inhalte und Werke auf diesen Seiten unterliegen dem deutschen Urheberrecht. Die Vervielfältigung, Bearbeitung, Verbreitung und jede Art der Verwertung außerhalb der Grenzen des Urheberrechtes bedürfen der schriftlichen Zustimmung des jeweiligen Autors bzw. Erstellers. Downloads und Kopien dieser Seite sind nur für den privaten, nicht kommerziellen Gebrauch gestattet. Soweit die Inhalte auf dieser Seite nicht vom Betreiber erstellt wurden, werden die Urheberrechte Dritter beachtet. Insbesondere werden Inhalte Dritter als solche gekennzeichnet. Sollten Sie trotzdem auf eine Urheberrechtsverletzung aufmerksam werden, bitten wir um einen entsprechenden Hinweis. Bei Bekanntwerden von Rechtsverletzungen werden wir derartige Inhalte umgehend entfernen.

Datenschutzerklärung:

Datenschutz
Die Nutzung unserer Webseite ist in der Regel ohne Angabe personenbezogener Daten möglich. Soweit auf unseren Seiten personenbezogene Daten (beispielsweise Name, Anschrift oder eMail-Adressen) erhoben werden, erfolgt dies, soweit möglich, stets auf freiwilliger Basis. Diese Daten werden ohne Ihre ausdrückliche Zustimmung nicht an Dritte weitergegeben.
Wir weisen darauf hin, dass die Datenübertragung im Internet (z.B. bei der Kommunikation per E-Mail) Sicherheitslücken aufweisen kann. Ein lückenloser Schutz der Daten vor dem Zugriff durch Dritte ist nicht möglich.
Der Nutzung von im Rahmen der Impressumspflicht veröffentlichten Kontaktdaten durch Dritte zur Übersendung von nicht ausdrücklich angeforderter Werbung und Informationsmaterialien wird hiermit ausdrücklich widersprochen. Die Betreiber der Seiten behalten sich ausdrücklich rechtliche Schritte im Falle der unverlangten Zusendung von Werbeinformationen, etwa durch Spam-Mails, vor.

Datenschutzerklärung für die Nutzung von Google Analytics
Diese Website benutzt Google Analytics, einen Webanalysedienst der Google Inc. ("Google"). Google Analytics verwendet sog. "Cookies", Textdateien, die auf Ihrem Computer gespeichert werden und die eine Analyse der Benutzung der Website durch Sie ermöglichen. Die durch den Cookie erzeugten Informationen über Ihre Benutzung dieser Website werden in der Regel an einen Server von Google in den USA übertragen und dort gespeichert. Im Falle der Aktivierung der IP-Anonymisierung auf dieser Webseite wird Ihre IP-Adresse von Google jedoch innerhalb von Mitgliedstaaten der Europäischen Union oder in anderen Vertragsstaaten des Abkommens über den Europäischen Wirtschaftsraum zuvor gekürzt.
Nur in Ausnahmefällen wird die volle IP-Adresse an einen Server von Google in den USA übertragen und dort gekürzt. Im Auftrag des Betreibers dieser Website wird Google diese Informationen benutzen, um Ihre Nutzung der Website auszuwerten, um Reports über die Websiteaktivitäten zusammenzustellen und um weitere mit der Websitenutzung und der Internetnutzung verbundene Dienstleistungen gegenüber dem Websitebetreiber zu erbringen. Die im Rahmen von Google Analytics von Ihrem Browser übermittelte IP-Adresse wird nicht mit anderen Daten von Google zusammengeführt.
Sie können die Speicherung der Cookies durch eine entsprechende Einstellung Ihrer Browser-Software verhindern; wir weisen Sie jedoch darauf hin, dass Sie in diesem Fall gegebenenfalls nicht sämtliche Funktionen dieser Website vollumfänglich werden nutzen können. Sie können darüber hinaus die Erfassung der durch das Cookie erzeugten und auf Ihre Nutzung der Website bezogenen Daten (inkl. Ihrer IP-Adresse) an Google sowie die Verarbeitung dieser Daten durch Google verhindern, indem sie das unter dem folgenden Link verfügbare Browser-Plugin herunterladen und installieren: http://tools.google.com/dlpage/gaoptout?hl=de.

Datenschutzerklärung für die Nutzung von Google Adsense
Diese Website benutzt Google AdSense, einen Dienst zum Einbinden von Werbeanzeigen der Google Inc. ("Google"). Google AdSense verwendet sog. "Cookies", Textdateien, die auf Ihrem Computer gespeichert werden und die eine Analyse der Benutzung der Website ermöglicht. Google AdSense verwendet auch so genannte Web Beacons (unsichtbare Grafiken). Durch diese Web Beacons können Informationen wie der Besucherverkehr auf diesen Seiten ausgewertet werden.
Die durch Cookies und Web Beacons erzeugten Informationen über die Benutzung dieser Website (einschließlich Ihrer IP-Adresse) und Auslieferung von Werbeformaten werden an einen Server von Google in den USA übertragen und dort gespeichert. Diese Informationen können von Google an Vertragspartner von Google weiter gegeben werden. Google wird Ihre IP-Adresse jedoch nicht mit anderen von Ihnen gespeicherten Daten zusammenführen.
Sie können die Installation der Cookies durch eine entsprechende Einstellung Ihrer Browser Software verhindern; wir weisen Sie jedoch darauf hin, dass Sie in diesem Fall gegebenenfalls nicht sämtliche Funktionen dieser Website voll umfänglich nutzen können. Durch die Nutzung dieser Website erklären Sie sich mit der Bearbeitung der über Sie erhobenen Daten durch Google in der zuvor beschriebenen Art und Weise und zu dem zuvor benannten Zweck einverstanden.

Datenschutzerklärung für die Nutzung von Google +1
Erfassung und Weitergabe von Informationen:
Mithilfe der Google +1-Schaltfläche können Sie Informationen weltweit veröffentlichen. über die Google +1-Schaltfläche erhalten Sie und andere Nutzer personalisierte Inhalte von Google und unseren Partnern. Google speichert sowohl die Information, dass Sie für einen Inhalt +1 gegeben haben, als auch Informationen über die Seite, die Sie beim Klicken auf +1 angesehen haben. Ihre +1 können als Hinweise zusammen mit Ihrem Profilnamen und Ihrem Foto in Google-Diensten, wie etwa in Suchergebnissen oder in Ihrem Google-Profil, oder an anderen Stellen auf Websites und Anzeigen im Internet eingeblendet werden.
Google zeichnet Informationen über Ihre +1-Aktivitäten auf, um die Google-Dienste für Sie und andere zu verbessern. Um die Google +1-Schaltfläche verwenden zu können, benötigen Sie ein weltweit sichtbares, öffentliches Google-Profil, das zumindest den für das Profil gewählten Namen enthalten muss. Dieser Name wird in allen Google-Diensten verwendet. In manchen Fällen kann dieser Name auch einen anderen Namen ersetzen, den Sie beim Teilen von Inhalten über Ihr Google-Konto verwendet haben. Die Identität Ihres Google-Profils kann Nutzern angezeigt werden, die Ihre E-Mail-Adresse kennen oder über andere identifizierende Informationen von Ihnen verfügen.

Monday, 3 August 2015

optimizer: indexes (2)

some time ago I started to describe what's happening in the code of (mostly) MariaDB when it handles a SQL-statement. The SQL-statement in this series of posts is always the same, the structure of the tables involved and the data stored in the tables are identical, I've changed only a few parts around it and looked how the code reactss on these modifications. With this post I would like to continue this.

This text here belongs to the series that describes the handling of multiple indexes on one table. These are the older posts:

this series started here: indexes
one minor piece omitted in the above post was described here: 0xA8
another tiny piece omitted but described in detail here: 0xCA

In this post I would like to continue this description.

the environment

I want to describe the environment I'm using (again).

For this series of posts I'm looking at how the optimizer in the code of MariaDB¹⁾ inspects a SQL-statement and finds a good plan for collecting the data requested in the SQL-statement. In this post I look at how the optimizer treated indexes on tables so I created these indexes:

MariaDB [TestOpt]> show index from TestSmall;
Empty set (0.00 sec)

MariaDB [TestOpt]> show index from TestBig;  
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| TestBig |          1 | EVP        |            1 | EVP         | A         |       11350 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            2 | ArtikelText | A         |     1666666 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind1       |            3 | Hersteller  | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HAP        |            1 | HAP         | A         |       10162 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            2 | ArtikelText | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Ind2       |            3 | PZN         | A         |    10000000 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PZN        |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | Hersteller |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PznHerst   |            1 | PZN         | A         |       22321 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | PznHerst   |            2 | Hersteller  | A         |     1111111 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HerstPzn   |            1 | Hersteller  | A         |        2581 |     NULL | NULL   | YES  | BTREE      |         |               |
| TestBig |          1 | HerstPzn   |            2 | PZN         | A         |     1111111 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
14 rows in set (0.00 sec)

MariaDB [TestOpt]>

As you can see, the table TestSmall contains no indexes at all, the table TestBig contains 8 indexes. Please do not try to see any meaning in these indexes, they are intended for the demonstration here.

the statement

This is the statement I want to follow it's way through the code of MariaDB:

MariaDB [(none)]> use TestOpt;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [TestOpt]> select SQL_NO_CACHE  count(A.PZN) from TestSmall A join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367';
+--------------+
| count(A.PZN) |
+--------------+
|           14 |
+--------------+
1 row in set (0.45 sec)

MariaDB [TestOpt]> explain select SQL_NO_CACHE  count(A.PZN) from TestSmall A join TestBig B on (A.PZN = B.PZN) where A.Hersteller = '00020' and B.Hersteller = '36367'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: A
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 301036
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: B
         type: ref
possible_keys: Ind1,Ind2,PZN,Hersteller,PznHerst,HerstPzn
          key: PznHerst
      key_len: 14
          ref: TestOpt.A.PZN,const
         rows: 9
        Extra: Using where; Using index
2 rows in set (0.00 sec)

MariaDB [TestOpt]>

I've included the explain of the statement.

You can see the optimizer chooses to start with the table TestSmall doing a full table-scan (no index available) and reading 300K records. After reading one record from TestSmall it applies the WHERE-clause to this record and when a match is found it switches over to the table TestBig and searches for records matching the value for Hersteller (from the WHERE-clause) and PZN (from the record just read from TestSmall) using the index PznHerst.

warning

As in the last posts I did all my tests with MariaDB, the code-examples presented here are taken from the source-code of MariaDB 10.0.10, but the explanation should also be valid for other versions and for MySQL²⁾.

The example uses MyISAM as the storage engine for the tables and indexes, some (but not all) of the statements here are only valid for this engine.

Please do not generalize the results presented here. A lot of the data presented here depends on the data in the tables, and this is my data. If you do your own tests don't expect to see the same numbers or plans.

prerequisites

In this post I ended with the cost-values for accessing a table alone. Let me repeat the results:

TestSmall:
found_records  =    301,036
read_time      =      6,127
s->worst_seeks =     18,382
TestBig:
found_records  = 10,000,000
read_time      =    211,613
s->worst_seeks =    634,840

As there are indexes defined on the table TestBig the number of records to inspect when this table is accessed via an index are computed:

table TestBig:
index 3 = Ind2       = 204,922 records
index 5 = Hersteller = 455,132 records
index 7 = HerstPzn   = 380,971 records

Please keep in mind that these are estimated values.

choosing a plan

A lot of words until here, so I want to come to the topic of this post: how does the code make the decision for the best plan, the plan described in the explain above? Which alternatives are inspected? What are the arguments for this plan? Why are alternatives rejected?

Here is the function hierarchy of the handling of this statement during inspection and decision-making:

make_join_statistics()
    choose_plan()
        greedy_search()
            best_extension_by_limited_search()
                best_access_path()        --> (1)     will explain this some lines later
                --> (2)
                best_extension_by_limited_search()
                    best_access_path()    --> (3)
                --> (4)
                best_access_path()        --> (5)
                --> (6)
                best_extension_by_limited_search()
                    best_access_path()     --> (7)
                --> (8)

warning: this hierarchy is valid for the SQL-statement given above and for the environment I am working in. If you do your own tests this hierarchy may look differently. And in the hierarchy I've added some hints to steps which I would like to explain in some details later.

Let's start with an overview of the steps involved in the inspection of this statement:
The strategy is to start with a table and calculate the cost of accessing this table alone. This plan is extended by adding the second table to it and calculating the cost of accessing the records of the second table, finally the cost of accessing these 2 tables is calculated. And as no more table exist³⁾ this calculation is finished.
Now the code tries the opposite way: the inspection begins with the second table and calculates the costs of accessing this table alone. Then this plan is extended by adding the first table and calculate the cost of accessing this table is calculated, finally the costs are combined.

As we now have two values a comparison is possible and the plan with the lower cost-value is chosen, the other plan and its values are forgotten. As no other possibilities exists the computation stops and we have a plan to execute. So it starts executing the plan computed.

Let's look a bit deeper into the code and at first I want to show the places where I will show results of computations soon.
The hints above marked as (1), (3), (5) and (7) refer to these lines of code in the function best_access_path():

  /* Update the cost information for the current partial plan */
  pos->records_read= records;
  pos->read_time=    best;
  pos->key=          best_key;
  pos->table=        s;
  pos->ref_depend_map= best_ref_depends_map;
  pos->loosescan_picker.loosescan_key= MAX_KEY;
  pos->use_join_buffer= best_uses_jbuf;

the other hints refer to the function best_extension_by_limited_search() and especially these lines of code:

      /* Find the best access method from 's' to the current partial plan */
      POSITION loose_scan_pos;
      best_access_path(join, s, remaining_tables, idx, disable_jbuf,
                       record_count, join->positions + idx, &loose_scan_pos);

      /* Compute the cost of extending the plan with 's' */

      current_record_count= record_count * position->records_read;
      current_read_time=read_time + position->read_time +
                        current_record_count / (double) TIME_FOR_COMPARE;

So let's dive into the code and look what the results of the computations are.

Everything starts from scratch with an empty plan, so the first table inspected is the table TestBig.
Stop! Why do we start with the table TestBig? If you look at the SQL-statement this table is found in the second place and its the table with the most records in it. So why?

If you look into the code of choose_plan() you see this call near the beginning:

  my_qsort2(join->best_ref + join->const_tables,
            join->table_count - join->const_tables, sizeof(JOIN_TAB*),
            jtab_sort_func, (void*)join->emb_sjm_nest);

In this function the entries are sorted using the comparison-function join_tab_cmp() (via the var. jtab_sort_func). And in this function the number of records are compared:

  if (jt1->found_records > jt2->found_records)
    return 1;

and as the values are 301,036 (for jt1 aka. TestSmall) and 204,922 (for jt2 aka. TestBig) it returns a value of 1 which will result in swapping the entries, so the JOIN_TAB representing the table TestBig is in the first position of the list and therefore the code starts the inspection with this table.

And for the table TestBig it calculates the costs, so here is the result of the step marked with (1):

(1) TestBig: 
records_read: 204,922
read_time:    19,858
key:          3 = Ind2

The whole cost for accessing this table alone is computed as:

(2)
current_record_count= 204,922
current_read_time   = 60,843

Let's add the next table to this partial plan. So we inspect the table TestSmall and found this path:

(3) TestSmall:
records_read: 301,036
read_time:    128,679
key;          NULL

Now we put all the numbers together and calculate the cost of accessing both tables by the ways found so far:

(4) TestBig + TestSmall:
current_record_count= 61,688,899,192 = 204,922 * 301,036
current_read_time   = 12,337,969,360 =  60,843 + 128,679 + 61,688,899,192/5

If you look at the numbers you will see that the values involved in the computation are a combination of the numbers taken from step (2) and (3).

Here the inspection of this plan ends and we store it:

        if (current_read_time < join->best_read)  
        {
          memcpy((uchar*) join->best_positions, (uchar*) join->positions, 
                 sizeof(POSITION) * (idx + 1));
          join->record_count= partial_join_cardinality;
          join->best_read= current_read_time - 0.001;
        }

The best plan so far has this (initial) value: join->best_read = DBL_MAX = 1.7976931348623157e+308
so our plan just computed has a better value and is therefore copied to join->best_positions.

So we go back to the beginning but in this case we start with the next table in the list: TestSmall. Here are the results:

(5) TestSmall
records_read: 301036
read_time:    6127
key:          NULL

The whole cost for accessing this table alone is computed as:

(6)
current_record_count= 301,036
current_read_time   = 66,334

We extend this plan by adding the table TestBig. For accessing this table these costs are computed:

(7) TestBig:
records_read: 9
read_time:    391,573
key:          6  = PznHerst

From these numbers we calculate the cost of the total plan:

(8)
current_record_count= 2,709,324 = 301,036 * 9
current_read_time   = 999,772  = 66,334 + 391,573 + 2,709,324/5

As no more table is in our list the inspection stops here and we can compare the cost of this plan with the best plan so far:

        if (current_read_time < join->best_read)  
        {
          memcpy((uchar*) join->best_positions, (uchar*) join->positions,   
                 sizeof(POSITION) * (idx + 1));
          join->record_count= partial_join_cardinality;
          join->best_read= current_read_time - 0.001;
        }

and as 999,772 is less than 12,337,969,360 the code copies the plan to the final position.

We would go back to the beginning but as there are no more tables on the list the computation stops here and we finally have a plan to execute.

That's all. The best plan is found and in the next step it will be executed.

correctness

This text is a simplified presentation of what's going on in the software. In case something is wrong with my description please use the mail-function of this site and send me a mail describing my error. I will look at it. Thanks.

some notes:

1) in this post I'm using version 10.0.10

2) I didn't check this, sorry

3) valid for this test-case = for this statement