Optimize SQL databases – Data types

2011-03-21

Now we will finish the series on database optimization with something more easy-going, data types. If you have worked with databases before, much of this might seem obvious, but this article may still offer something new. If nothing else, it can perhaps provide an explanation for why you do things in a certain way.

The purpose here is to create as tight tables as possible. This means that the columns must be of the correct data type as its purpose while they are as small as possible to avoid unnecessary overhead. This is done so that the database manager will get as little data as possible to search through, and for applications of a far bigger scale, it can also make a noticeable difference on the storage space. In this article, I will use the data types of MySQL, but most of it should be possible to use in other implementations as well.

We begin with the text handling. Should you store a very short text string or if it always has a fixed length, the data type CHAR() should be used. It stores text strings with a fixed length. The datatype VARCHAR() can be used to save some the storage space if the length of the text string varies. For both CHAR() and VARCHAR(), there is a maximum limit of 255 characters. If you set a higher value, it will be automatically converted to the data type TEXT. TEXT can store up to 64 kB of text. If more space is needed, then MEDIUMTEXT and LARGETEXT can be used which can store up to 16 MB and 4 GB of text. The binary counterpart is BLOB. For most data types, you can set the maximum number of characters allowed in the parenthesis after the data type.

When dealing with numbers, you should use the data types INT(), FLOAT() and DOUBLE(). If a column should be storing only whole numbers, you should always use INT. If the number has decimals, you should use FLOAT for smaller numbers or DOUBLE for larger numbers. You may set the maximum number of digits and decimals places in the parentheses for both FLOAT and DOUBLE like this: DOUBLE(2, 4). This will store up to two digits followed by a maximum of four decimals. If you want to save a value that only has two alternatives, true or false, the data type BIT() can be used.

Finally, I would like to tell you that there are different data types for storing date and time. DATE stores dates in the format YYYY-MM-DD, TIME stores time in the format HH:MM:SS and DATETIME stores both date and time in the format YYYY-MM-DD HH:MM:SS. By using the built-in datatypes rather than a text string, you get both some control of the input and you can use built-in functions like DATEDIFF(). You can read more about the different data types available in MySQL Reference Manual.

Other articles in this series: