an unsystematic journey through the source-code of MySQL and MariaDB: od

I would like to take you with me for a diving-tour through the data-file of the MyISAM-engine, the MYD-file. You will find this engine in MySQL and MariaDB.

You will find a description of the internals here. I want to add some information to this description and give you some examples.

od is a GNU-tool to dump the contents of a file in octal form (od = octal dump), but I would like to present the data in hexadecimal form.

So let's start.
There are three possible formats for the MYD-file: fixed, dynamic, and packed. As I've never used the packed format I can't say anything about this so I omit it. Let me start with the fixed format.

fixed format

I created a table in the fixed format, inserted 2 records into the table and verified it:

MariaDB [Test]> use Test;
Database changed
MariaDB [Test]> create table TestOD (                                                                                                                                                                             
    ->                 Id   char(8) not null,
    ->                 PZN  char(7) not null,
    ->                 EVP  char(8) not null,
    ->                 HAP  char(8) not null,
    ->                 ArtikelBez char(40) not null,
    ->                 ArtikelText char(26) not null,
    ->                 Hersteller  char(5) not null)
    ->                 engine = MyISAM
    -> ;
Query OK, 0 rows affected (0.04 sec)

MariaDB [Test]> insert into TestOD values( 12345678, 1234567, 1234.67, 123456.78, '12345678901234567890', 
    ->     '123456789012345678901234567890123456', '12345')
    -> ;
Query OK, 1 row affected, 2 warnings (0.00 sec)

MariaDB [Test]> insert into TestOD values( 223456789, 2345678, 2345.78, 234567.89, '234567890123456789012345678901234567890', 
    ->     '234567890123456789012345678901234567', '23456')
    -> ;
Query OK, 1 row affected, 3 warnings (0.00 sec)

MariaDB [Test]> select * from TestOD;
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
| Id       | PZN     | EVP     | HAP      | ArtikelBez                              | ArtikelText                | Hersteller |
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
| 12345678 | 1234567 | 1234.67 | 123456.7 | 12345678901234567890                    | 12345678901234567890123456 | 12345      |
| 22345678 | 2345678 | 2345.78 | 234567.8 | 234567890123456789012345678901234567890 | 23456789012345678901234567 | 23456      |
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
2 rows in set (0.01 sec)

MariaDB [Test]>

You can see that the table contains no columns of variable length. Now I want to look at the MYD-file:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
000000  ff                                                  // flag
        31 32 33 34 35 36 37 38                             // column Id  
        31 32 33 34 35 36 37                                // column PZN
        31 32 33 34 2e 36 37                                // column EVP   
        20 31 32 33 34 35 36 2e 37                          // column HAP
        31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36     // column ArtikelBez
        37 38 39 30 20 20 20 20 20 20 20 20 20 20 20 20
        20 20 20 20 20 20 20 20 
        31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36     // column ArtikelText 
        37 38 39 30 31 32 33 34 35 36   
        31 32 33 34 35                                      // column Herssteller  
        
000067  ff                                                  // flag
        32 32 33 34 35 36 37 38                             // column Id
        32 33 34 35 36 37 38                                // column PZN
        32 33 34 35 2e 37 38                                // column EVP   
        20 32 33 34 35 36 37 2e 38                          // column HAP  
        32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37     // column ArtikelBez
        38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33
        34 35 36 37 38 39 30 20  
        32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37     // column ArtikelText  
        38 39 30 31 32 33 34 35 36 37                         
        32 33 34 35 36                                      // column Hersteller

I rearranged the output of the od-command and added some comments to show how to interpret the numbers. And all numbers are in hex.
The data begins with a flag which I would like to skip. After this byte the data is stored in consecutive form, one column after the other. As you can see the value of the column ArtikelBez is not filled up to it's full length witch characters in the INSERT-statement. In the first record it's only 20 bytes long and the server added 20 bytes of space (=0x20), in the second record one byte is missing and the server added one byte of space at the end of this column.

Next I delete the first record in this table:

MariaDB [Test]> delete from TestOD where Id = '12345678';
Query OK, 1 row affected (0.00 sec)

MariaDB [Test]> select * from TestOD;
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
| Id       | PZN     | EVP     | HAP      | ArtikelBez                              | ArtikelText                | Hersteller |
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
| 22345678 | 2345678 | 2345.78 | 234567.8 | 234567890123456789012345678901234567890 | 23456789012345678901234567 | 23456      |
+----------+---------+---------+----------+-----------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)

MariaDB [Test]>

Here is how this looks like in the MYD-file:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
000000  00                                                  // record is deleted
        ff ff ff ff ff ff                                   // deleted-flags
        37 38 31 32 33 34 35 36 37 31 32 33 34 2e 36 37     // garbage
        20 31 32 33 34 35 36 2e 37 31 32 33 34 35 36 37 
        38 39 30 31 32 33 34 35 36 37 38 39 30 20 20 20 
        20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 
        20 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 
        36 37 38 39 30 31 32 33 34 35 36 31 32 33 34 35 
         
000067  ff                                                  // flag 
        32 32 33 34 35 36 37 38                             // column Id
        32 33 34 35 36 37 38                                // column PZN
        32 33 34 35 2e 37 38                                // column EVP   
        20 32 33 34 35 36 37 2e 38                          // column HAP  
        32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37     // column ArtikelBez
        38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33
        34 35 36 37 38 39 30 31   
        32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37     // column ArtikelText  
        38 39 30 31 32 33 34 35 36 37                         
        32 33 34 35 36                                      // column Hersteller

enough written about this type of format. I want to switch to the other format.

moving to dynamic

Next I want to change one of the columns to one of variable length:

MariaDB [Test]> drop table TestOD;
Query OK, 0 rows affected (0.00 sec)

MariaDB [Test]> create table TestOD (                                                                                                                                                                             
    ->                 Id   char(8) not null,
    ->                 PZN  char(7) not null,
    ->                 EVP  char(8) not null,
    ->                 HAP  char(8) not null,
    ->                 ArtikelBez varchar(40) not null,
    ->                 ArtikelText char(26) not null,
    ->                 Hersteller  char(5) not null)
    ->                 engine = MyISAM
    -> ;
Query OK, 0 rows affected (0.03 sec)

MariaDB [Test]> insert into TestOD values( 12345678, 123, 12.34, 123456.78, '123456789012345', 
    ->     '12345678901234567890', '12')
    -> ;
Query OK, 1 row affected, 1 warning (0.01 sec)

MariaDB [Test]> insert into TestOD values( 23456789, 234, 234.56, 234567.89, '2345678901234567890', 
    ->     '23456789012345678901', '23456')
    -> ;
Query OK, 1 row affected, 1 warning (0.00 sec)

MariaDB [Test]> select * from TestOD;
+----------+-----+--------+----------+---------------------+----------------------+------------+
| Id       | PZN | EVP    | HAP      | ArtikelBez          | ArtikelText          | Hersteller |
+----------+-----+--------+----------+---------------------+----------------------+------------+
| 12345678 | 123 | 12.34  | 123456.7 | 123456789012345     | 12345678901234567890 | 12         |
| 23456789 | 234 | 234.56 | 234567.8 | 2345678901234567890 | 23456789012345678901 | 23456      |
+----------+-----+--------+----------+---------------------+----------------------+------------+
2 rows in set (0.00 sec)

MariaDB [Test]> desc TestOD;
+-------------+-------------+------+-----+---------+-------+
| Field       | Type        | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| Id          | char(8)     | NO   |     | NULL    |       |
| PZN         | char(7)     | NO   |     | NULL    |       |
| EVP         | char(8)     | NO   |     | NULL    |       |
| HAP         | char(8)     | NO   |     | NULL    |       |
| ArtikelBez  | varchar(40) | NO   |     | NULL    |       |
| ArtikelText | char(26)    | NO   |     | NULL    |       |
| Hersteller  | char(5)     | NO   |     | NULL    |       |
+-------------+-------------+------+-----+---------+-------+
7 rows in set (0.01 sec)

MariaDB [Test]>

as you can see the table-definition is identical to the first example except that the column ArtikelBez iss now of type VARCHAR. So this must be a table of type dynamic. What does the data-file look like in hex:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -a -Ax TestOD.MYD
// block #1:
0000000 03                                                  // block-type
        00 43                                               // actual record-length
        01                                                  // no. of filler-bytes at the end of this block
        36                                                  // flags
        31 32 33 34 35 36 37 38                             // column Id as CHAR
        03 31 32 33                                         // column PZN as VARCHAR
        05 31 32 2e 33 34                                   // column EVP as VARCHAR
        31 32 33 34 35 36 2e 37                             // column HAP as CHAR
        0f 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35     // column ArtikelBez as VARCHAR
        14 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36  // column ArtikelText as VARCHAR
           37 38 39 30  
        02 31 32                                            // column Hersteller as VARCHAR
        00                                                  // filler-bytes
// block #2:
000048 03                                                   // block-type
       00 4a                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       32 33 34 35 36 37 38 39                              // column Id as CHAR
       03 32 33 34                                          // column PZN as VARCHAR
       06 32 33 34 2e 35 36                                 // column EVP as VARCHAR
       32 33 34 35 36 37 2e 38                              // column HAP as CHAR
       13 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelBez as VARCHAR
          37 38 39 30
       14 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelText as VARCHAR 
          37 38 39 30 
       31 32 33 34 35 36                                    // column Hersteller as CHAR
       00 00                                                // filler-bytes

So you see something we didn't intend: some of the CHAR-columns are treated as VARCHAR, e.g. PZN or ArtikelText. In the case of at least 1 column of a type with varying length MySQL/MariaDB will treat everything as a type with varying length in the data-file. And the VARCHAR-data begins with a length-byte followed by the data.

So let's do an UPDATE on the first record:

MariaDB [Test]> update TestOD set ArtikelBez = 'ABCDEFGHIJKLMOPQRSTUVWXYZ' where Id = '12345678';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [Test]> select * from TestOD;
+----------+-----+--------+----------+---------------------------+----------------------+------------+
| Id       | PZN | EVP    | HAP      | ArtikelBez                | ArtikelText          | Hersteller |
+----------+-----+--------+----------+---------------------------+----------------------+------------+
| 12345678 | 123 | 12.34  | 123456.7 | ABCDEFGHIJKLMOPQRSTUVWXYZ | 12345678901234567890 | 12         |
| 23456789 | 234 | 234.56 | 234567.8 | 2345678901234567890       | 23456789012345678901 | 23456      |
+----------+-----+--------+----------+---------------------------+----------------------+------------+
2 rows in set (0.00 sec)

MariaDB [Test]>

Everything looks fine, o let's look into the data-file:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
// block #1: the first record, but only partial
000000 05                                                   // block-type
       00 4d                                                // actual record-length
       00 3b                                                // length of this block 
       00 00 00 00 00 00 00 98                              // position of next block
       36                                                   // flags
       31 32 33 34 35 36 37 38                              // column Id as CHAR 
       03 31 32 33                                          // column PZN as VARCHAR
       05 31 32 2e 33 34                                    // column EVP as VARCHAR
       31 32 33 34 35 36 2e 37                              // column HAP as CHAR
       19 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4f 50      // column ArtikelBez as VARCHAR
          51 52 53 54 55 56 57 58 59 5a
       14 31 32 33 34 35                                    // column ArtikelText as VARCHAR (partial)
       
// block #2: the second record
000048 03                                                   // block-type
       00 4a                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       32 33 34 35 36 37 38 39                              // column Id as CHAR
       03 32 33 34                                          // column PZN as VARCHAR
       06 32 33 34 2e 35 36                                 // column EVP as VARCHAR
       32 33 34 35 36 37 2e 38                              // column HAP as CHAR 
       13 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelBez as VARCHAR
          37 38 39 30 
       14 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelText as VARCHAR
          37 38 39 30 31 
       32 33 34 35 36                                       // column Hersteller as CHAR
       00 00                                                // filler-bytes
       
// block #3: second block of the first record
000098 09                                                   // block-type
       00 12                                                // len of data in this block
       02                                                   // no. of filler-bytes at the end of this block
       36 37 38 39 30 31 32 33 34 35 36 37 38 39 30         // the missing part of column ArtikelText
       02 31 32                                             // column Hersteller as VARCHAR
       00 00                                                // filler-bytes

As you have already guessed it the first record has not enough room to hold the full record now: in the old record it contained 20 bytes in the column ArtikelBez, now it must hold 26 bytes in this column. So the server splits the data, writes the first part of the data in the space of the first record and searches for free space for the rest of the data. It finds free space only at the end of the file so it writes the rest of the data here.

blocks

When you look at the hex-values you see that the data-file consists of blocks and every block starts with a number that describes this block. As you can see in the last example this is not always 0x03. In the first example of a dynamic table we found 2 blocks of type 0x03, in the second example we had three blocks with types 0x005, ox03 and 0x09.
Following the block-type you see some values which I want to explain now. Here is a list of all possible blocks:

block-types:
 0             block contains deleted data (=free block)
     3 bytes   len of this block
     8 bytes   ptr. to next free block
     8 bytes   ptr. to previous free block

 1             block contains complete data
     2 bytes   len of this block (=len of data, =len of record)

 2             block contains complete data
     3 bytes   len of this block (=len of data, =len of record)

13             first block of this record
     4 bytes   len of record
     3 bytes   len of this block
     8 bytes   ptr. to next block of this record

 3             block contains complete data
     2 bytes   len of record
     1 byte    no. of filler bytes at end of record
               (len of block = len of record + this byte)

 4             block contains complete data
     3 bytes   len of record
     1 byte    no. of filler bytes at the end of record                
               (len of block = len of record + this byte)

 5             first block of this record
     2 bytes   len of record
     2 bytes   len of block
     8 bytes   ptr. to next block of this record
     
 6             first block of record
     3 bytes   len of record
     3 bytes   len of block
     8 bytes   ptr. to next block of this record
     
 7             last block of record
     2 bytes   len of this block (=len of data, =len of record)
     
 8             last block of this record
     3 bytes   len of this block (=len of data, =len of record)
     
 9             last block of this record
     2 bytes   len of data
     1 byte    no. of filler bytes at the end of record                
               (len of block = len of record + this byte)
               
10             last block of this record
     3 bytes   len of data
     1 byte    no. of filler bytes at the end of record                
               (len of block = len of record + this byte)
                    
11             another block of data
    2 bytes    len of data (=len of block)
    8 bytes    ptr. to next block of this record
    
12             another block of data
    3 bytes    len of data (=len of block)
    8 bytes    ptr. to next block of this record

These lines are taken from the file mi_dynrec.cc, look at the function _mi_get_block_info(). I simplified it a bit.

In the last example the data-file starts with a block of type 0x05 (=first block of this record). After this byte follows the length of this record (2 bytes), then the length of this block (2 bytes), and the ptr. to the next block of this record (ptr. = offset in file). And then the data of this record follows, as long as the block can hold.
Then comes the next block (= next record, unchanged from the previous example). And finally the next block of type 0x09 (=last block of a record) contains the rest of the data of the first record.

As you cans see every block of a data-file of a dynamic table begins with a type followed by some additional information. This header of a block can contain between 3 and 20 bytes.

modifying again

Now we have 2 records which created 3 blocks in the data-file. Let's add another record to this table:

MariaDB [Test]> insert into TestOD values( 34567890, 345, 345.67, 345678.90, '3456789012345678901234567890', 
    ->          '34567890123456789012', '3456')
    -> ;
Query OK, 1 row affected, 1 warning (0.00 sec)

MariaDB [Test]> select * from TestOD;
+----------+-----+--------+----------+------------------------------+----------------------+------------+
| Id       | PZN | EVP    | HAP      | ArtikelBez                   | ArtikelText          | Hersteller |
+----------+-----+--------+----------+------------------------------+----------------------+------------+
| 12345678 | 123 | 12.34  | 123456.7 | ABCDEFGHIJKLMOPQRSTUVWXYZ    | 12345678901234567890 | 12         |
| 23456789 | 234 | 234.56 | 234567.8 | 2345678901234567890          | 23456789012345678901 | 23456      |
| 34567890 | 345 | 345.67 | 345678.9 | 3456789012345678901234567890 | 34567890123456789012 | 3456       |
+----------+-----+--------+----------+------------------------------+----------------------+------------+
3 rows in set (0.00 sec)

MariaDB [Test]>

and here it is in hex:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
// block #1: the firt record, but only partial
000000 05                                                   // block-type
       00 4d                                                // actual record-length
       00 3b                                                // len of blck
       00 00 00 00 00 00 00 98                              // ptr. to next block to record
       36                                                   // flags
       31 32 33 34 35 36 37 38                              // column Id as CHAR
       03 31 32 33                                          // column PZN as VARCHAR
       05 31 32 2e 33 34                                    // column EVP as VARCHAR
       31 32 33 34 35 36 2e 37                              // column HAP as CHAR
       19 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4f 50      // column ArtikelBez as VARCHAR
          51 52 53 54 55 56 57 58 59 5a 
       14 31 32 33 34 35                                    // column ArtikelText as VARCHAR (partial)
       
// block #2: the second record
000048 03                                                   // block-type
       00 4a                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       32 33 34 35 36 37 38 39                              // column Id as CHAR
       03 32 33 34                                          // column PZN as VARCHAR
       06 32 33 34 2e 35 36                                 // column EVP as VARCHAR
       32 33 34 35 36 37 2e 38                              // column HAP as CHAR
       13 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelBez as VARCHAR
          37 38 39 30 
       14 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelText as VARCHAR
          37 38 39 30
       31 32 33 34 35 36                                    // column Hersteller as CHAR
       00 00 

// block #3: first record, second part
000098 09                                                   // block-type
       00 12                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       36 37 38 39 30 31 32 33 34 35 36 37 38 39 30         // rest of coulmn ArtikelText
       02 31 32                                             // column Hersteller as VARCHAR
       00 00                                                // filler-bytes
       
// block #4: third record
0000b0 03                                                   // block-type
       00 53                                                // actual record-length
       01                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       33 34 35 36 37 38 39 30                              // column Id as CHAR
       03 33 34 35                                          // column PZN as VARCHAR
       06 33 34 35 2e 36 37                                 // column EVP as VARCHAR
       33 34 35 36 37 38 2e 39                              // column HAP as CHAR
       1c 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelBez as VARCHAR
          38 39 30 31 32 33 34 35 36 37 38 39 30 
       14 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelText as VARCHAR
          38 39 30 31 32 
       33 34 35 36 20                                       // column Hersteller as CHAR
       00                                                   // filler-byte

I don't want to go into the details here because a lot of things repeat but I want to show you: our data-file now consists of 4 blocks:

block #1 (type = 0x05) contains the first part of the data of the first record
block #2 (type = 0x03) contains the data of the second record
block #3 (type = 0x09) contains the rest of the data of the first record
block #4 (type = 0x03) contains the data of the third record

complications

I want to complicate this situation again: let's do an UPDATE that will make the column ArtikelBez of the first record even longer:

MariaDB [Test]> update TestOD set ArtikelBez = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOOP' where Id = '12345678' ;
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 1

MariaDB [Test]> select * from TestOD;
+----------+-----+--------+----------+------------------------------------------+----------------------+------------+
| Id       | PZN | EVP    | HAP      | ArtikelBez                               | ArtikelText          | Hersteller |
+----------+-----+--------+----------+------------------------------------------+----------------------+------------+
| 12345678 | 123 | 12.34  | 123456.7 | abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMN | 12345678901234567890 | 12         |
| 23456789 | 234 | 234.56 | 234567.8 | 2345678901234567890                      | 23456789012345678901 | 23456      |
| 34567890 | 345 | 345.67 | 345678.9 | 3456789012345678901234567890             | 34567890123456789012 | 3456       |
+----------+-----+--------+----------+------------------------------------------+----------------------+------------+
3 rows in set (0.00 sec)

MariaDB [Test]>

Here is the result in the datafile:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
// block #1: the first record, but only partial
000000 05                                                   // block-type
       00 5c                                                // actual record-length
       00 3b                                                // len of thi block
       00 00 00 00 00 00 00 98                              // ptr. to next block of this record
       36                                                   // flags
       31 32 33 34 35 36 37 38                              // column Id as CHAR
       03 31 32 33                                          // column PZN as VARCHAR
       05 31 32 2e 33 34                                    // column EVP as VARCHAR
       31 32 33 34 35 36 2e 37                              // column HAP as CHAR
       28 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f      // column ArtikelBez as VARCHAR (partial)
          70 71 72 73 74 75 76 77 78 79 7a 41 42 43 44 
          45

// block #2: the second record           
000048 03                                                   // block-type
       00 4a                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       32 33 34 35 36 37 38 39                              // column Id as CHAR
       03 32 33 34                                          // column PZN as VARCHAR
       06 32 33 34 2e 35 36                                 // column EVP as VARCHAR
       32 33 34 35 36 37 2e 38                              // column HAP as CHAR
       13 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelBez as VARCHAR
          37 38 39 30 
       14 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelText as VARCHAR
          37 38 39 30 31 
       32 33 34 35 36                                       // column Hersteller as CHAR
       00 00                                                // filler-bytes
       
// block #3: first record, second part
000098 0b                                                   // block-type
       00 0d                                                // len of data in this block
       00 00 00 00 00 00 01 08                              // ptr. to next block of this record
       46 47 48 49 4a 4b 4c 4d 4e                           // rest of column ArtikelBez
       14 31 32 33                                          // column ArtikelText as VARCHAR (partial)
       
// block #4: third record
0000B0 03                                                   // block-type
       00 53                                                // actual record-length
       01                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       33 34 35 36 37 38 39 30                              // column Id as CHAR
       03 33 34 35                                          // column PZN as VARCHAR
       06 33 34 35 2e 36 37                                 // column EVP as VARCHAR
       33 34 35 36 37 38 2e 39                              // column HAP as CHAR
       1c 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelBez as VARCHAR
          38 39 30 31 32 33 34 35 36 37 38 39 30 
       14 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelText as VARCHAR
          38 39 30 31 32 
       33 34 35 36 20                                       // column Hersteller as CHAR
       00                                                   // filler-byte
        
// block #5: first record, third part
000108 09                                                   // block-type
       00 14                                                // löen of data in this block
       00                                                   // no. of filler-bytes at the end
       34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39      // rest of column ArtikelText
          30 
       02 31 32                                             // column Hersteller as VARCHAR

Now the data-file consists of 5 blocks:

block #1 (type = 0x05) contains the first part of the data of the first record
block #2 (type = 0x03) contains the data of the second record
block #3 (type = 0x0B) contains the second part of the data of the first record
block #4 (type = 0x03) contains the data of the third record
block #5 (type = 0x09) contains the last part of the data of the first record

As you can see the data of the first record is broken into the 3 parts and spread around the data-file.

There are some more cases still possible that I didn't show until now, so let's go to another case: DELETE the first record:

MariaDB [Test]> delete from TestOD where Id = '12345678';
Query OK, 1 row affected (0.00 sec)

MariaDB [Test]> select * from TestOD;
+----------+-----+--------+----------+------------------------------+----------------------+------------+
| Id       | PZN | EVP    | HAP      | ArtikelBez                   | ArtikelText          | Hersteller |
+----------+-----+--------+----------+------------------------------+----------------------+------------+
| 23456789 | 234 | 234.56 | 234567.8 | 2345678901234567890          | 23456789012345678901 | 23456      |
| 34567890 | 345 | 345.67 | 345678.9 | 3456789012345678901234567890 | 34567890123456789012 | 3456       |
+----------+-----+--------+----------+------------------------------+----------------------+------------+
2 rows in set (0.01 sec)

MariaDB [Test]>

and here is everything in hex:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
// block #1: first record, deleted block
000000 00                                                   // block-type
       00 00 48                                             // len of this block
       ff ff ff ff ff ff ff ff                              // ptr. to previous free block (no prev. block here)
       00 00 00 00 00 00 00 98                              // ptr. to next free block
       37 38 03 31 32 33 05 31 32 2e 33 34 31 32 33 34      // garbage (=old contents of thi block)
       35 36 2e 37 28 61 62 63 64 65 66 67 68 69 6a 6b 
       6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 41 
       42 43 44 45 

// block #2: second record 
000048 03                                                   // block-type
       00 4a                                                // actual record-length
       02                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       32 33 34 35 36 37 38 39                              // column Id as CHAR 
       03 32 33 34                                          // column PZN as VARCHAR
       06 32 33 34 2e 35 36                                 // column EVP as VARCHAR
       32 33 34 35 36 37 2e 38                              // column HAP as CHAR
       13 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelBez as VARCHAR 
          37 38 39 30 
       14 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36      // column ArtikelText as VARCHAR
          37 38 39 30 31 
       32 33 34 35 36                                       // column Hersteller as CHAR 
       00 00                                                // filler-bytes
       
// block #3: next deleted block (= former second block of first record)
000098 00                                                   // block-type
       00 00 18                                             // len of this block
       00 00 00 00 00 00 00 00                              // ptr. to previou fre block 
       00 00 00 00 00 00 01 08                              // ptr. to next free block
       14 31 32 33                                          // garbage
       
// block #4: the third record (same as above)
0000b0 03                                                   // block-type
       00 53                                                // actual record-length
       01                                                   // no. of filler-bytes at the end of this block
       16                                                   // flags
       33 34 35 36 37 38 39 30                              // column Id as CHAR
       03 33 34 35                                          // column PZN as VARCHAR
       06 33 34 35 2e 36 37                                 // column EVP as VARCHAR
       33 34 35 36 37 38 2e 39                              // column HAP as CHAR
       1c 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelBez as VARCHAR
          38 39 30 31 32 33 34 35 36 37 38 39 30 
       14 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37      // column ArtikelText as VARCHAR
          38 39 30 31 32 
       33 34 35 36 20                                       // column Hersteller as CHAR 
       00                                                   // filler-byte
        
// block #5: next deleted block (= former third block of first record)
000108 00                                                   // block-type
       00 00 18                                             // len of this block
       00 00 00 00 00 00 00 98                              // ptr. to previou free block
       ff ff ff ff ff ff ff ff                              // ptr. to next free block (no next free block here)
       30 02 31 32                                          // garbage

We still have 5 blocks but the types and the contents of block #1, #3 and #5 changed, the blocks #2 and #4 are unchanged as they contained the second and the third record.
Let's look at the changed blocks: they start with a block-type of 0x00, which marks them as deleted blocks. And as we now have 3 deleted blocks these free block are linked together via pointers following the length-info of this block. These free blocks have forward and backward pointers so they build a double-linked list.

something different

There are a lot of topics I didn't mention here like the interpretation of the filler-bytes or packing adjacent blocks or handling of NULL-values or ....
but I want to switch to another theme: we only had columns of type CHAR or VARCHAR in our test-table. What about the other data-types?

more data-types

Let's create a different table that contains more types than CHAR and VARCHAR:

MariaDB [Test]> drop table TestOD;
Query OK, 0 rows affected (0.00 sec)

MariaDB [Test]> create table TestOD (                                                                                                                                                                             
    ->                 Id   numeric(8,0) not null,
    ->                 PZN  char(7) not null,
    ->                 EVP  decimal(8,2) not null,
    ->                 HAP  decimal(9,2) not null,
    ->                 ArtikelBez varchar(40) not null,
    ->                 Datum date not null,
    ->                 Uhrzeit time not null,
    ->                 DatumZeit  datetime not null,
    ->                 Tstamp timestamp not null,
    ->                 PreciseTime datetime(6) not null)
    ->                 engine=MyISAM
    -> ;
Query OK, 0 rows affected (0.04 sec)

MariaDB [Test]> insert into TestOD values( 12345678, 1234567, 123456.78, 1234567.89, '1234567890123456789012345678901234567890', 
    ->     '2014-02-04', '14:59', '2014-02-04 14:59:00', current_timestamp, current_timestamp)
    -> ;
Query OK, 1 row affected (0.00 sec)

MariaDB [Test]> insert into TestOD values( -12345678, -1234567, -123456.78, -1234567.89, '1234567890123456789012345678901234567890', 
    ->     '2014-02-05', '15:01', '2014-02-05 15:01:00', current_timestamp, '2014-02-05 19:51:17.123456')
    -> ;
Query OK, 1 row affected, 1 warning (0.02 sec)

MariaDB [Test]> select * from TestOD;
+-----------+---------+------------+-------------+------------------------------------------+------------+----------+---------------------+---------------------+----------------------------+
| Id        | PZN     | EVP        | HAP         | ArtikelBez                               | Datum      | Uhrzeit  | DatumZeit           | Tstamp              | PreciseTime                |
+-----------+---------+------------+-------------+------------------------------------------+------------+----------+---------------------+---------------------+----------------------------+
|  12345678 | 1234567 |  123456.78 |  1234567.89 | 1234567890123456789012345678901234567890 | 2014-02-04 | 14:59:00 | 2014-02-04 14:59:00 | 2014-02-26 17:23:10 | 2014-02-26 17:23:10.000000 |
| -12345678 | -123456 | -123456.78 | -1234567.89 | 1234567890123456789012345678901234567890 | 2014-02-05 | 15:01:00 | 2014-02-05 15:01:00 | 2014-02-26 17:23:20 | 2014-02-05 19:51:17.123456 |
+-----------+---------+------------+-------------+------------------------------------------+------------+----------+---------------------+---------------------+----------------------------+
2 rows in set (0.00 sec)

MariaDB [Test]> desc TestOD;
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field       | Type         | Null | Key | Default           | Extra                       |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Id          | decimal(8,0) | NO   |     | NULL              |                             |
| PZN         | char(7)      | NO   |     | NULL              |                             |
| EVP         | decimal(8,2) | NO   |     | NULL              |                             |
| HAP         | decimal(9,2) | NO   |     | NULL              |                             |
| ArtikelBez  | varchar(40)  | NO   |     | NULL              |                             |
| Datum       | date         | NO   |     | NULL              |                             |
| Uhrzeit     | time         | NO   |     | NULL              |                             |
| DatumZeit   | datetime     | NO   |     | NULL              |                             |
| Tstamp      | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| PreciseTime | datetime(6)  | NO   |     | NULL              |                             |
+-------------+--------------+------+-----+-------------------+-----------------------------+
10 rows in set (0.00 sec)

MariaDB [Test]>

As you can see this table consists of many more data-types. So let's see how this is organized in the data-file:

august@AMD4:~/MariaDB/data/Test$ od -t x1 -Ax TestOD.MYD
000000 03                                                   // block-type
       00 58                                                // actual record-length
       00                                                   // no. of filler-bytes at the end of this block 
       00                                                   // flags 
       80 bc 61 4e                                          // column ID: bit-flag + 0x00bc614e
       31 32 33 34 35 36 37                                 // column PZN as CHAR
       81 e2 40 4e                                          // column EVP: flag + 0x01e240 + 0x4e = '+' + 123456 + 78
       80 12 d6 87 59                                       // column HAP: flag + 0x0012d687 + 0x59 = '+' + 1234567 + 89
       28 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35      // column ArtikelBez as VARCHAR
          36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 
          31 32 33 34 35 36 37 38 39 30 
       44 bc 0f                                             // column Datum: 0x0fbc44 = 1031236d = DD + 32*MM + 16*32*YYYY = 04 + 02*32 + 2014*16*32  
       ec 39 02                                             // column Uhrzeit: 0x0239ec = 145900d = HHMMSS
       ec bc b6 41 51 12 00 00                              // column DatumZeit: 0x125141b6bcec = 20140204145900d = YYYYMMDDHHMMSS
       ee 14 0e 53                                          // column Tstamp = 0x530E14EE = no. of seconds since 1970 (Unix-like)
       01 01 33 a0 34 e7 5b 80                              // column PreciseTime: see separate descriiption

00005C 03                                                   // block-type
       00 58                                                // actual record-length
       00                                                   // no. of filler-bytes at the end of the block
000060 00                                                   // flags
       7f 43 9e b1                                          // column ID: bit-flag + NOT 0x00bc614e
       2d 31 32 33 34 35 36                                 // column PZN as CHAR
       7e 1d bf b1                                          // column EVP: flag + NOT 0x7e1dbf + NOT b1 = '-' + 123456 + 78
000070 7f ed 29 78 a6                                       // column HAP: flag + NOT 0x7fed2978 + NOT a6 = '-' + 1234567 + 89 
       28 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35      // column ArtikelBez as VARCHAR
          36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 
          31 32 33 34 35 36 37 38 39 30 
       45 bc 0f                                             // column Datum: 0x0fbc45 = 1031237d =  DD + 32*MM + 16*32*YYYY = 05 + 02*32 + 2014*16*32
       54 4a 02                                             // column Uhrzeit: 0x024a54 = 150100d = HHMMSS
       94 0f c6 41 51 12 00 00                              // column DatumZeit: 0x125141cd0f94 = 20140205150100d = YYYYMMDDHHMMSS 
       f8 14 0e 53                                          // column Tstamp = 0x530e14f8 = no. of seconds since 1970 (Unix-like)
0000b0 01 01 31 fb d3 f3 39 80                              // column PreciseTime: see separate descriiption

As you can see in the comments I added to every column that the internal representation is a bit tricky sometimes, but I hope the comments explain everything.
Sometimes the information is stored in the form Hi-Lo (e.g. in the Id-column), sometimes the information is stored in the form Lo-Hi (e.g. in the Tstamp-column); even trickier is the information stored in the HAP-column: numbers are grouped by values up to 1 billion (= 9 digits) and stored separately. The column Tstamp contains the value in the form of the Unix-function time(), which returns the number of seconds since 1970/01/01. And the trickiest part is the column PreciseTime. Instead of an explanation I took a famous program and added some code so that it look like this now:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void unpack( unsigned long packed)
{
   unsigned long second_part = packed % 1000000ULL;
   packed /= 1000000ULL;
   int second = packed % 60ULL;
   packed /= 60ULL;
   int minute = packed %60ULL;
   packed /= 60ULL;
   int hour = packed % 24UL;
   packed /= 24UL;
   int day = packed % 32UL;
   packed /= 32UL;
   int month = packed % 13UL;
   packed /= 13UL;
   int year = packed;
   printf( "%4.4d-%02.2d-%02.2d %02.2d:%02.2d:%02.2d.%06.6ld\n", year, month, day, hour, minute, second, second_part);
}
int main(void) {
    puts("!!!Hello World!!!"); /* prints !!!Hello World!!! */

//  outputs: 2014-02-26 17:23:10.000000
    unpack( 0x010133a034e75b80 );
//  outputs: 2014-02-05 19:51:17.123456
    unpack( 0x010131fbd3f33980 );;

    return EXIT_SUCCESS;
}

I hope these lines of code show how the information is grouped together in a column of this type.

finally

This is not complete, it's only an overview. But I hope I gave you a good entry for further tests.

Sunday, 2 March 2014

od