And then: I tested my approach and compared it to the results of the original code. Well, what can I say, hmmm ... (please look at the little drawing at the bottom of this page)
OK, I'll say it: the original code was faster than my code.
There could be only one reason for this: the server caches a lot of the data (and my code didn't).
So how is this caching done by MariaDB/MySQL?
what is caching
You can find a good description of caching on Wikipedia. In a few words: if you connect a fast device to a slow one, a cache can help speed up some operations. In our case the fast device is the CPU and the slow device is the hard disk.
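To make this a bit more concrete, here is a toy example (my own illustration, not server code) of the basic pattern: serve requests from a buffer in RAM and touch the slow disk only when the buffer is exhausted. For simplicity it assumes that a request is never larger than the buffer and that records never cross a buffer boundary; the real code in the server also handles those cases.
#include <string.h>
#include <unistd.h>

#define BUF_SIZE (128 * 1024)

static char   buf[BUF_SIZE];   /* the cache: fast RAM            */
static size_t buf_pos= 0;      /* next unread byte in the buffer */
static size_t buf_len= 0;      /* number of valid bytes          */

/* read 'count' bytes; go to the disk only when the buffer is used up */
static ssize_t cached_read(int fd, char *dest, size_t count)
{
  if (buf_pos + count > buf_len)           /* miss: refill the buffer    */
  {
    ssize_t n= read(fd, buf, BUF_SIZE);    /* one big, slow disk read    */
    if (n <= 0)
      return n;
    buf_len= (size_t) n;
    buf_pos= 0;
  }
  memcpy(dest, buf + buf_pos, count);      /* hit: cheap copy within RAM */
  buf_pos+= count;
  return (ssize_t) count;
}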
OS
Caching is used quite often. The operating system does a lot of caching by itself. I don't want to describe this and I don't want to play with this, I only want to show the effect of caching done by the MySQL/MariaDB-server.
cash
If you look through the code of MySQL and MariaDB you will see the words cache and cash appear; they are used interchangeably. Don't let this confuse you, it's just a programmer's joke. Newer versions of the software replace the word cash with the word cache (the correct word).
environment
I did all my tests with the MyISAM-engine. Each engine can cache data however it wants, so the results may differ between the engines. My machine here is equipped with 32 GB of RAM. If your machine is equipped with a different amount of RAM then your results may differ a bit.
types
The MyISAM-engine supports 3 types of tables: static (fixed-length records), dynamic (variable-length records) and packed. I tested only static and dynamic.
standard
Let's start with the standard-approach:
MariaDB [TestOK]> create database TestCache;
Query OK, 1 row affected (0.00 sec)
MariaDB [TestOK]> use TestCache;
Database changed
MariaDB [TestCache]> create table TestStat (
-> Id char(8) not null,
-> PZN char(7) not null,
-> EVP char(8) not null,
-> HAP char(8) not null,
-> ArtikelBez char(40) not null,
-> ArtikelText char(26) not null,
-> Hersteller char(5) not null)
-> engine = MyIsam
-> ;
Query OK, 0 rows affected (0.05 sec)
MariaDB [TestCache]> create table TestDyn (
-> Id varchar(8) not null,
-> PZN varchar(7) not null,
-> EVP varchar(8) not null,
-> HAP varchar(8) not null,
-> ArtikelBez varchar(40) not null,
-> ArtikelText varchar(26) not null,
-> Hersteller varchar(5) not null)
-> engine = MyIsam
-> ;
Query OK, 0 rows affected (0.04 sec)
MariaDB [TestCache]> insert into TestStat select * from TestOK.ABDAOK
-> ;
Query OK, 10000000 rows affected (34.84 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
MariaDB [TestCache]> insert into TestDyn select * from TestOK.ABDAOK
-> ;
Query OK, 10000000 rows affected (46.90 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
MariaDB [TestCache]>
As you can see I created a database and in this database I created 2 tables. The tables are identical, except that the table TestStat is of type static (= fixed-length records) whereas the table TestDyn is of type dynamic (= variable-length records). Both tables contain 10 million records of identical data. The table TestDyn does not contain any updated data, and each record consists of only one block, so no data is spread around the datafile. This is how the tables look on the OS-side:
august@AMD4:~/MariaDB/data/TestCache$ ls -l
insgesamt 1880292
-rw-rw---- 1 august august 65 Mär 18 15:47 db.opt
-rw-rw---- 1 august august 685 Mär 18 15:50 TestDyn.frm
-rw-rw---- 1 august august 895389376 Mär 18 15:53 TestDyn.MYD
-rw-rw---- 1 august august 1024 Mär 18 16:15 TestDyn.MYI
-rw-rw---- 1 august august 679 Mär 18 15:50 TestStat.frm
-rw-rw---- 1 august august 1030000000 Mär 18 15:51 TestStat.MYD
-rw-rw---- 1 august august 1024 Mär 18 16:15 TestStat.MYI
august@AMD4:~/MariaDB/data/TestCache$
If you add up the column-widths given in the CREATE-statement you will get 102 as the width of the record. Add to this one byte for a flag and multiply this value by 10 million and you will get exactly the size of the file TestStat.MYD: (8 + 7 + 8 + 8 + 40 + 26 + 5 + 1) * 10,000,000 = 1,030,000,000 bytes. The file TestDyn.MYD is a bit smaller because the data is not padded to the full width of each column (which is the justification for using the VARCHAR column-type).
Let's do our first tests. Nothing is changed, these tests are all done with the server out of the box. And everything is a table-scan:
MariaDB [TestCache]> select count(*) from TestStat where Hersteller <> 'ABCDE';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (10.63 sec)
MariaDB [TestCache]> select count(*) from TestStat where Hersteller <> 'ABCDF';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (10.58 sec)
MariaDB [TestCache]> select count(*) from TestStat where Hersteller <> 'ABCDG';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (10.58 sec)
MariaDB [TestCache]> select count(*) from TestDyn where Hersteller <> 'ABCDE';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (14.16 sec)
MariaDB [TestCache]> select count(*) from TestDyn where Hersteller <> 'ABCDF';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (14.14 sec)
MariaDB [TestCache]> select count(*) from TestDyn where Hersteller <> 'ABCDG';
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (14.13 sec)
MariaDB [TestCache]>
All time-values are taken from the mysql frontend, as you can see. All tests were done 3 times to compensate for optimizations done by the OS (= caching on the OS-level).
Accessing the table TestStat is about 30% faster than accessing the table TestDyn. The reason for this is the more complex structure of the latter table. You will find a description of the structure of the MyISAM-tables here. You may also look at this post.
In all future examples of this post I will do the tests 3 times, but I will only present the average value for each test.
into the code
Let's step into the code and look at how the server does the read-operation. All operations that I used in my tests are sequential reads (= table scan) because no other access is possible (no index was created). These operations are executed by the following functions (a small sketch of the calling pattern follows the list):
- xxxx::rnd_init() is called at the beginning of the process. It initializes everything needed.
- xxxx::rnd_next() returns one row of the table. This function is called repeatedly until it returns the constant HA_ERR_END_OF_FILE.
- xxxx::rnd_end() is called at the end of the read-operation. Here you can clean things up.
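Here is that sketch: a rough simplification of how the server drives these three functions during a table scan. It is my own illustration, not the server's actual loop; the name scan_all_rows is mine, and I use the ha_rnd_*() wrapper names from sql/handler.h (in the engine itself you implement rnd_init()/rnd_next()/rnd_end()).
// rough sketch, not the server's real scan loop
int scan_all_rows(handler *h, uchar *record_buf)
{
  int error= h->ha_rnd_init(true);              /* true = sequential scan      */
  if (error)
    return error;
  while (!(error= h->ha_rnd_next(record_buf)))  /* one row lands in record_buf */
  {
    /* count it, filter it, hand it to the upper layers, ...                   */
  }
  h->ha_rnd_end();                              /* clean up the scan state     */
  return error == HA_ERR_END_OF_FILE ? 0 : error;
}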
This is the function hierarchy using MyISAM in the case of fixed-sized records:
ha_myisam.cc : ha_myisam::rnd_next()
mi_scan.c : mi_scan()
mi_statrec.c : _mi_read_rnd_static_record()
mf_iocache.c : _my_b_read()
mysql_file.h : inline_mysql_file_read
my_read.c : my_read() reads 128K from the datafile into the buffer
And in the case of variable length records:
ha_myisam.cc : ha_myisam::rnd_next()
mi_scan.c : mi_scan()
mi_dynrec.c : _mi_read_rnd_dynamic_record()
mi_cache.c : _mi_read_cache()
mf_iocache.c : _my_b_read()
mysql_file.h : inline_mysql_file_read
my_read.c : my_read() reads 128K from the datafile into the buffer
The last function fills the cache, and this buffer is used internally by the MyISAM-engine for the next read-operations.
modifications
I want to know how this buffer is used and especially what happens when I change the size of the buffer. So let's modify the code a bit. I will present the places where I added or changed code and mark added/changed lines in bold. First: when the cache is unable to satisfy the request the data must be read from the datafile, so let's count how many times the function read() of the C-library is called:
modify the file storage/myisam/ha_myisam.cc near the top of this file:
extern int counterReadStaticRecord;
extern int counterReadDynamicReacord;
somewhere in the middle of this file:
int ha_myisam::rnd_init(bool scan)
{
counterReadStaticRecord = counterReadDynamicReacord = 0;
if (scan)
return mi_scan_init(file);
return mi_reset(file); // Free buffers
}
int ha_myisam::rnd_end()
{
DBUG_ENTER("ha_myisam::rnd_end");
fprintf(stderr, "no. of reads:\tstatic=%d\tdynamic=%d\n", counterReadStaticRecord, counterReadDynamicReacord);
ds_mrr.dsmrr_close();
#if !defined(DBUG_OFF) && defined(SQL_SELECT_FIXED_FOR_UPDATE)
file->update&= ~HA_STATE_AKTIV; // Forget active row
#endif
DBUG_RETURN(0);
}
modify the file storage/myisam/mi_statrec.c before the function _mi_read_rnd_static_record():
//##########################################################
// the original code:
//#define my_b_read(info,Buffer,Count) \
// ((info)->read_pos + (Count) <= (info)->read_end ?\
// (memcpy(Buffer,(info)->read_pos,(size_t) (Count)), \
// ((info)->read_pos+=(Count)),0) :\
// (*(info)->read_function)((info),Buffer,Count))
int counterReadStaticRecord;
static inline int AQ_my_b_read(IO_CACHE *info, uchar *Buffer, ulong Count)
{
if ( (info)->read_pos + (Count) <= (info)->read_end )
{
memcpy(Buffer,(info)->read_pos,(size_t) (Count));
((info)->read_pos+=(Count));
return 0;
}
counterReadStaticRecord++;
return (*(info)->read_function)((info),Buffer,Count);
}
in the function _mi_read_rnd_static_record():
// error=my_b_read(&info->rec_cache,(uchar*) buf,share->base.reclength);
/////
error=AQ_my_b_read(&info->rec_cache,(uchar*) buf,share->base.reclength);
/////
I have to explain a bit what I did here and why:
There is a macro my_b_read; you will find it in include/my_sys.h.
I included it as a comment in the file mi_statrec.c and also duplicated it as a function with a new name: AQ_my_b_read. I hope this function shows more clearly what's going on in this code, and it also allows for easier debugging. And I added the counter-variable to this inline-function.
Then I replaced the call of the macro my_b_read with a call of my own function AQ_my_b_read.
Let's continue with our modifications, now for the dynamic table.
in storage/myisam/mi_cache.c somewhere near the start of the file add:
int counterReadDynamicReacord;
somewhere in the function _mi_read_cache():
else
info->read_pos=info->read_end; /* All block used */
counterReadDynamicReacord++; // <---- add this line
if (!(*info->read_function)(info,buff,length))
DBUG_RETURN(0);
why these modifications
A table-scan is done by first calling the function ha_myisam::rnd_init(). In this function I initialize the counters to 0. Then the function ha_myisam::rnd_next() is called until we reach the end of the datafile. Nothing new here.
And finally the function ha_myisam::rnd_end() is called which will report the values of our counter-variables.
In the case of a static table the read-operation is handled in the function AQ_my_b_read in the file mi_statrec.c so here I'm counting.
And for a dynamic table the read-operation is done in the function _mi_read_cache() so I'm counting here.
more modifications
Counting the number of disk-accesses is only a minor point. With this information we can compare the effect of modifying the size of the buffer. But we didn't modify this size yet, so let's do it now. In the file sql/sys_vars.cc:
static Sys_var_ulong Sys_read_buff_size(
"read_buffer_size",
"Each thread that does a sequential scan allocates a buffer of "
"this size for each table it scans. If you do many sequential scans, "
"you may want to increase this value",
SESSION_VAR(read_buff_size), CMD_LINE(REQUIRED_ARG),
VALID_RANGE(IO_SIZE*2, INT_MAX32), DEFAULT(128*1024),
BLOCK_SIZE(IO_SIZE));
As you can see the default value is 128*1024 = 128K (I marked the entry in bold). I can easily change this, recompile and do the tests, but there are restrictions built into the server-code which can lead to the following error-message appearing on start-up:
Sysvar 'read_buffer_size' failed 'min_val <= def_val'
I could try to find the places where these checks happen or look for something else. The minimum of the VALID_RANGE is IO_SIZE*2 (= 8K with the original IO_SIZE of 4096), so any smaller DEFAULT-value trips this check. I decided to modify this entry:
in include/my_global.h:
// #define IO_SIZE 4096
#define IO_SIZE 64
Now recompile everything (the last modification leads to a recompilation of almost everything) and do the tests. The error-message did not appear anymore.
results
Here are the results of my tests. I changed the DEFAULT-value from 128 bytes to 256 bytes to 512 bytes and so on up to 1 GB, did my tests with both types of tables, and did every test 3 times. Here are the results of the tests in graphical form:
If you are interested in the details of my tests please look at this text-file. It contains the raw data of my tests.
conclusion
The default value of this buffer is 128K bytes. If you look at the picture above you will see that this is a good compromise. If you make the buffer smaller everything will need more time. If you increase the size you will not get much improvement (and maybe the request will even need more time). And keep in mind that RAM is a finite resource, and you're not alone on this server.
some more information
I collected a lot more information than just the time elapsed for the SELECT. So let me give you some more data. Let's first check the number of reads.
The values are:
number of reads with different sizes of buffer:
size of buffer    static    dynamic
128b 4023438 3497615
256b 4023438 3497615
512b 2011719 1748808
1K 1005860 874404
2K 502930 437202
4K 251465 218601
8K 125733 109301
16K 62867 54651
32K 31434 27326
64K 15717 13663
128K 7859 6832
256K 3930 3416
512K 1965 1708
1M 983 854
2M 492 427
4M 246 214
8M 123 107
16M 62 54
32M 31 27
64M 16 14
128M 8 7
256M 4 4
512M 2 2
1G 1 1
In the last line you see why my testing stopped here: with a buffer-size of 1 GB the whole file is read in one read()-call.
In the first lines you will also see identical numbers for the buffer-sizes of 128 and 256 bytes. The reason for this is that a minimum buffer of 256 bytes is used even when the variable is set to a smaller value.
If you multiply the number of reads by the size of the buffer you will always get a value that is a bit bigger than the size of the corresponding file, so the whole file is read (e.g. 7859 * 128K = 1,030,094,848 bytes, just above the 1,030,000,000 bytes of TestStat.MYD).
And you will also see that doubling the size of the cache halves the number of reads.
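As a small cross-check (my own calculation, nothing from the server code): the number of read()-calls should simply be the file size divided by the buffer size, rounded up. A few lines of C++ confirm this for the 128K buffer:
#include <stdio.h>

int main(void)
{
  /* file sizes taken from the ls -l output above */
  unsigned long long stat_file= 1030000000ULL;   /* TestStat.MYD     */
  unsigned long long dyn_file =  895389376ULL;   /* TestDyn.MYD      */
  unsigned long long buf      =  128 * 1024;     /* read_buffer_size */

  /* ceil(file size / buffer size) */
  printf("static : %llu reads\n", (stat_file + buf - 1) / buf);  /* prints 7859 */
  printf("dynamic: %llu reads\n", (dyn_file  + buf - 1) / buf);  /* prints 6832 */
  return 0;
}
Both values match the numbers measured above for a buffer of 128K.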
support
The support for caching data is already there. In your own storage engine you can use it (or write your own routines, which are hopefully better/faster). Please look at the code in mysys/mf_iocache.c. The buffers are created in the function init_io_cache(). This function is called after a call of ha_myisam::rnd_init(). Here is the hierarchy:
ha_myisam.cc : ha_myisam::extra_opt()
mi_extra.c : mi_extra()
mf_iocache.c : init_io_cache()
and the buffers are freed after a call of ha_myisam::rnd_end() in the function end_io_cache(). This is the hierarchy:
ha_myisam.cc : ha_myisam::extra()
mi_extra.c : mi_extra()
mf_iocache.c : end_io_cache()
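If you want to reuse this machinery in your own engine, the pattern looks roughly like the following sketch. It is only an outline: data_file and read_buff_size are placeholders for an open datafile handle and the configured buffer size, and you should check include/my_sys.h and mysys/mf_iocache.c for the exact signatures and flags before relying on it.
/* sketch: attach a read cache to an open datafile and scan it record by record */
IO_CACHE cache;
uchar record[103];                                /* one fixed-length record */

if (init_io_cache(&cache, data_file, read_buff_size,
                  READ_CACHE, 0L, 0, MYF(MY_WME)))
  return HA_ERR_OUT_OF_MEM;                       /* buffer could not be allocated */

/* my_b_read() copies from the buffer and only calls the read-function
   (and in the end read()) when the buffer is exhausted */
while (!my_b_read(&cache, record, sizeof(record)))
{
  /* process one record */
}

end_io_cache(&cache);                             /* free the buffer again */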
functions
Above I've written about functions of the storage-engine that are called in the execution of the SELECT-stmt. Here is a listing of these functions as they are called by the server:
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::column_bitmaps_signal
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::store_lock
<TestMax> ha_maxtempl::external_lock
<TestMax> ha_maxtempl::table_cache_type
<TestMax> ha_maxtempl::register_query_cache_table
<TestMax> ha_maxtempl::table_cache_type
<TestMax> ha_maxtempl::info
<TestMax> ha_maxtempl::info
<TestMax> ha_maxtempl::scan_time
<TestMax> ha_maxtempl::scan_time
<TestMax> ha_maxtempl::rnd_init
<TestMax> ha_maxtempl::extra_opt
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_next
<TestMax> ha_maxtempl::rnd_end
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::external_lock
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::extra
<TestMax> ha_maxtempl::reset
For this test I created a table TestMax in the database TestCache and inserted 6 records into this table. Then I started the SELECT-stmt given above. You can see which functions are called, how often they are called and in which order they are called. An example: rnd_next() is called 7 times, once for each of the 6 records in the table and a 7th time, in which the EOF is detected and returned. The table TestMax uses the storage engine MAXTEMPL.