Some weeks ago I was asked by a company to help in a project. The company that asked is working in the business for arranging contracts between companies and software developers, so getting the right information about the job to do wasn't easy in the first discussion. A week later the job description was clear: implementing an encryption-function in MySQL/MariaDB. By asking Mr. Google I found this site: MariaDB Database Encryption. This describes in detail one approach for use in the engines InnoDB and XtraDB.
There was more information available: Mr. Google also told me this about encryption in MariaDB. I continued searching and started thinking about possible solutions.
I like to present some of my thoughts and experiments. As I said in other posts before: the code I will present here is only a proof of concept, it's not ready for commercial use.
As my focus is on the code of MariaDB/MySQL and not in the algorithm or code of any encryption-procedures I've chosen a simple function for simulating encryption: XOR. It's easy to implement, fast, encrypting and decrypting is identical (so I have only one routine to code) and the encrypted text is broken in some seconds (or less). Let's simply use this algorithm (you shouldn't use it for encrypting your data).
functions
MariaDB/MySQL contains functions for encrypting and decrypting your data. You will find a list of these functions here: 12.13 Encryption and Compression Functions. This page also contains examples on how to use these.As my "encryption" is XOR I would like to demonstrate this using this approach. So I have to write an UDF which implements this. You can find a description of UDFs here:CREATE FUNCTION UDF and an introduction into the rules for writing your own UDF here: 24.3 Adding New Functions to MySQL. And finally there is an example which you will find in the directory sql of the source-code: udf_example.c.
You will find the source-code of my little UDF in the archive in the file: string_xor.c1). Compile it and copy it to the directory where the database-server expects it:
august@AMD4:~/workspace/MariaDB/mariadb-10.0.10/sql$ gcc -shared -o string_xor.so string_xor.c -I ../include
august@AMD4:~/workspace/MariaDB/mariadb-10.0.10/sql$ cp string_xor.so ~/MariaDB/pgm/lib/plugin/
august@AMD4:~/workspace/MariaDB/mariadb-10.0.10/sql$
So let's test this little routine:
MariaDB [(none)]> CREATE FUNCTION string_xor RETURNS STRING SONAME 'string_xor.so';
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]> select hex(string_xor('ABCDE'));
+--------------------------+
| hex(string_xor('ABCDE')) |
+--------------------------+
| 1417161110 |
+--------------------------+
1 row in set (0.03 sec)
MariaDB [(none)]> select string_xor(unhex(1417161110));
+-------------------------------+
| string_xor(unhex(1417161110)) |
+-------------------------------+
| ABCDE |
+-------------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]>
Using the hex()- und unhex()-functions was necessary because the box above does not show correctly the output of the function string_xor(). If I omit the hex()-function it looks like this:
Thius output looks like encrypted. So let's continue our tests:
MariaDB [(none)]> create database TestEnc;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> use TestEnc;
Database changed
MariaDB [TestEnc]> create table TestNormal (
-> Id char(8),
-> PZN char(7),
-> EVP char(8),
-> HAP char(8),
-> ArtikelBez char(40),
-> ArtikelText char(26),
-> Hersteller char(5) )
-> engine=MYISAM
-> ;
Query OK, 0 rows affected (0.05 sec)
MariaDB [TestEnc]> create table TestEnc (
-> Id char(8),
-> PZN char(7),
-> EVP char(8),
-> HAP char(8),
-> ArtikelBez char(40),
-> ArtikelText char(26),
-> Hersteller char(5) )
-> engine=MYISAM
-> ;
Query OK, 0 rows affected (0.18 sec)
MariaDB [TestEnc]> insert into TestNormal values ( '01000000', '999999', '132.36', '0.00', 'UROSAFE BEINBEUTEL 7732C 1 L', 'SPRING DE LU AM K1 NO TOP5', '25677');
Query OK, 1 row affected (0.00 sec)
MariaDB [TestEnc]> insert into TestEnc values ( '01000000', '999999', '132.36', '0.00', 'UROSAFE BEINBEUTEL 7732C 1 L', 'SPRING DE LU AM K1 NO TOP5', string_xor('25677'));
Query OK, 1 row affected (0.00 sec)
MariaDB [TestEnc]>
As you can see in the last lines I've inserted a row into each table. For showing the result of these INSERT-statements I did some SELECTs. In the result I've marked the values for the column Hersteller in bold. So let's look what is stored in the table:
MariaDB [(none)]>MariaDB [TestEnc]> select * from TestNormal;
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| Id | PZN | EVP | HAP | ArtikelBez | ArtikelText | Hersteller |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| 01000000 | 999999 | 132.36 | 0.00 | UROSAFE BEINBEUTEL 7732C 1 L | SPRING DE LU AM K1 NO TOP5 | 25677 |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)
MariaDB [TestEnc]> select * from TestEnc;
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| Id | PZN | EVP | HAP | ArtikelBez | ArtikelText | Hersteller |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| 01000000 | 999999 | 132.36 | 0.00 | UROSAFE BEINBEUTEL 7732C 1 L | SPRING DE LU AM K1 NO TOP5 | g`cbb |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)
MariaDB [TestEnc]> select Id, PZN, EVP, HAP, ArtikelBez, ArtikelText, string_xor(Hersteller) as Hersteller from TestEnc;
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| Id | PZN | EVP | HAP | ArtikelBez | ArtikelText | Hersteller |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| 01000000 | 999999 | 132.36 | 0.00 | UROSAFE BEINBEUTEL 7732C 1 L | SPRING DE LU AM K1 NO TOP5 | 25677 |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
1 row in set (0.01 sec)
MariaDB [TestEnc]>
As you can see accessing the column Hersteller of our table TestEnc shows an encrypted value, you must use the decryption-routine to get the real value of this column.Let's look into the MYD-files of the two tables:
august@AMD4:~/MariaDB/data/TestEnc$ od -t x1 -Ax TestNormal.MYD
000000 01 flag
30 31 30 30 30 30 30 30 Id
39 39 39 39 39 39 20 PZN
000010 31 33 32 2e 33 36 20 20 EVP
30 2e 30 30 20 20 20 20 HAP
000020 55 52 4f 53 41 46 45 20 ArtikelBez
42 45 49 4e 42 45 55 54
000030 45 4c 20 37 37 33 32 43
20 20 20 20 20 20 31 20
000040 20 4c 20 20 20 20 20 20
53 50 52 49 4e 47 20 44 ArtikelText
000050 45 20 4c 55 20 41 4d 20
4b 31 20 4e 4f 20 54 4f
000060 50 35
32 35 36 37 37 Hersteller
000067
august@AMD4:~/MariaDB/data/TestEnc$ od -t x1 -Ax TestEnc.MYD
000000 01 flag
30 31 30 30 30 30 30 30 Id
39 39 39 39 39 39 20 PZN
000010 31 33 32 2e 33 36 20 20 EVP
30 2e 30 30 20 20 20 20 HAP
000020 55 52 4f 53 41 46 45 20 ArtikelBez
42 45 49 4e 42 45 55 54
000030 45 4c 20 37 37 33 32 43
20 20 20 20 20 20 31 20
000040 20 4c 20 20 20 20 20 20
53 50 52 49 4e 47 20 44 ArtikelText
000050 45 20 4c 55 20 41 4d 20
4b 31 20 4e 4f 20 54 4f
000060 50 35
67 60 63 62 62 Hersteller (encrypted)
000067
august@AMD4:~/MariaDB/data/TestEnc$
At the end of the tables you see the value of the column Hersteller. For table TestNormal you see it in plain text, for table TestEnc you see the encrypted values. To verify our little UDF let's do some calculations:
value decrypt encrypt
67 67 XOR 55 = 32 = '2' 32 XOR 55 = 67
60 60 XOR 55 = 35 = '5' 35 XOR 55 = 60
63 63 XOR 55 = 36 = '6' 36 XOR 55 = 63
62 62 XOR 55 = 37 = '7' 37 XOR 55 = 62
final words: can we use this approach? Maybe. Please keep in mind that every SQL-statement must contain the call of the encryption- or decryption-function for every column that contains encrypted values. In most of the cases this has to be done in the client-software. Using a hex-editor or decompiler (e.g. for Java) or some other tools this can easily be detected. So this approach is no real protection.
trigger
Let's switch over to the server-side and implement encryption via a trigger. You will find a description here: CREATE TRIGGER Statement.Here is the first step of the example using a trigger:
MariaDB [TestEnc]> create table TestTrigger (
-> Id char(8),
-> PZN char(7),
-> EVP char(8),
-> HAP char(8),
-> ArtikelBez char(40),
-> ArtikelText char(26),
-> Hersteller char(5) )
-> engine=MYISAM
-> ;
Query OK, 0 rows affected (0.04 sec)
MariaDB [TestEnc]> delimiter //
MariaDB [TestEnc]> CREATE TRIGGER test_xor BEFORE INSERT ON TestTrigger
-> FOR EACH ROW
-> BEGIN
-> SET NEW.Hersteller = string_xor( NEW.Hersteller);
-> END;//
Query OK, 0 rows affected (0.08 sec)
MariaDB [TestEnc]> delimiter ;
MariaDB [TestEnc]> insert into TestTrigger values ( '01000000', '999999', '132.36', '0.00', 'UROSAFE BEINBEUTEL 7732C 1 L', 'SPRING DE LU AM K1 NO TOP5', '25677');
Query OK, 1 row affected (0.00 sec)
MariaDB [TestEnc]> select * from TestTrigger;
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| Id | PZN | EVP | HAP | ArtikelBez | ArtikelText | Hersteller |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
| 01000000 | 999999 | 132.36 | 0.00 | UROSAFE BEINBEUTEL 7732C 1 L | SPRING DE LU AM K1 NO TOP5 | g`cbb |
+----------+--------+--------+------+------------------------------------+----------------------------+------------+
1 row in set (0.00 sec)
MariaDB [TestEnc]>
If you look at the result of the SELECT you see that only the column Hersteller is encrypted. And if you look at the definition of the trigger you see that it uses the UDF string_xor() defined before and it calls this function only for the column Hersteller.
Looks fine but let's look into the MYD-file:
august@AMD4:~/MariaDB/data/TestEnc$ l -l TestTrigger*
-rw-rw---- 1 august august 679 Aug 19 13:57 TestTrigger.frm
-rw-rw---- 1 august august 103 Aug 19 13:58 TestTrigger.MYD
-rw-rw---- 1 august august 1024 Aug 19 13:58 TestTrigger.MYI
-rw-rw---- 1 august august 337 Aug 19 13:57 TestTrigger.TRG
august@AMD4:~/MariaDB/data/TestEnc$ od -t x1 -Ax TestTrigger.MYD
000000 01 30 31 30 30 30 30 30 30 39 39 39 39 39 39 20
000010 31 33 32 2e 33 36 20 20 30 2e 30 30 20 20 20 20
000020 55 52 4f 53 41 46 45 20 42 45 49 4e 42 45 55 54
000030 45 4c 20 37 37 33 32 43 20 20 20 20 20 20 31 20
000040 20 4c 20 20 20 20 20 20 53 50 52 49 4e 47 20 44
000050 45 20 4c 55 20 41 4d 20 4b 31 20 4e 4f 20 54 4f
000060 50 35 67 60 63 62 62
000067
august@AMD4:~/MariaDB/data/TestEnc$
If you look at the hex-dump you will see exactly the same hex-values as in the example above (table TestEnc.MYD) so this approach worked.
How can we access this column with the encrypted value? Unfortunately there is no trigger for SELECT in MariaDB/MySQL. Other database-software also does not have such a trigger as you can see here for Oracle and Microsoft. So we have to create a stored procedure that contains this SELECT-statement and does the processing on the encrypted columns, returning the row after processing it.2). But I don't want to do this here but instead switch over to a different approach.
continued
Next I like to describe an approach that uses a storage engine so this post will be continued.some notes:
1) I will give you the link to the archive in the coming post.
2)
for protecting the data in your database this approach has some advantages: you can put all SQL-statements in stored-procedures and the clients are only allowed to call these stored-procedures. You can even let these stored procedures call other stored procedures which belong to a different user but have the right to access tables and manipulate data. So from the client side you will only see a stored procedure which contains a one-liner (the call of another stored procedure, which is invisible to the client).