How important a primary key can be for MySQL performance?

How important a primary key design can be for MySQL performance? The answer is: Extremely! If tables use InnoDB storage engine, that is.

It all begins with the specific way InnoDB organizes data internally. There are two major pieces of information that anyone should know:

  1. It physically stores rows together with and in the order of primary key values. It means that a primary key does not only uniquely identify a row, it is also part of it. Or perhaps rather, a physical row is part of table’s primary key.
  2. A secondary index entry does not point to the actual row position, which is how it works in MyISAM. Instead, every single index entry is concatenated with a value of the corresponding primary key. When a query reads a row through a secondary index, this added value is used in additional implicit lookup by the primary key, to locate the actual row.

What could be a “rule of the thumb” for InnoDB primary key design is that it should be as short as possible. Needlessly long values use a lot of space inside the secondary indexes and may degrade performance.

It is also good when primary key is also incremental, i.e. each new row has a value there that is larger than in any other existing row. This does not necessarily imply AUTO_INCREMENT column, which is just one of the possibilities. Because rows are kept in the order of the key, using values that constantly grow means InnoDB may simply “append” rows to a table. It benefits insert performance and prevents fragmentation as data does not have to be moved around to make room for an insert that needs to add a row in the middle of table.

However in some cases it turns out that choosing a simple, short and incremental column for primary key is not the best option when thinking about database scalability. With certain access patterns, especially when reads by range are prevalent, it turns out that longer and more complex design can produce better results in the longer term.

When AUTO_INCREMENT column becomes primary key, rows are placed one after another in the order of insertion. If two rows were added at the interval of 1 hour to a table receiving a constant stream of inserts, there would likely be some physical distance between both of them. I.e. they would be in two different places on disk as many other rows would be added in between these two.

For the sake of example let’s assume two rows were added for a user whose user_id is 50 and they have been assigned with id=10000000 and id=10001000, where id is the primary key in this table. What sort of work the following query has to perform in order to return results?

SELECT * FROM messages WHERE user_id = 50

It finds index entires where user_id is 50. With the primary key values from the secondary index, it can then locate the physical rows. So InnoDB first seeks a row where id is 10000000. But since it does not operate on individual rows at this point, but rather on the blocks of records called pages, it locates the proper page on disk and caches it in the buffer pool. Only then it can extract the actual row, return it, and move on to the next one where id is 10001000. Since the distance between the two records is significant, it turns out that it is not on the same page that was just pulled into the cache and which holds the first row. InnoDB has to go back to disk to fetch another page, cache it, extract the row and return it.

In the above example at least two I/Os were issued. In a pessimistic case for a few hundred matching rows, when a user has that many messages, that could result in a few hundred I/O operations. Random I/O. Assuming as little as a few milliseconds for each operation, the query may need as much as one second to execute. It would use index properly, it would only pull a few hundred rows, but would still perform poorly. With the exception for flash technology any typical storage is rather slow, and also very difficult to scale, so such scenario should be avoided.

Another negative effect, which could be further affecting MySQL performance, is buffer pool pollution. If application asks for a hundred rows and each of them resides on a different page, InnoDB will need to load one hundred pages that may be irrelevant to all other queries. When the buffer is full, loading anything new into it means dropping some of the existing information. That way InnoDB cache may fill with a lot of data that isn’t actually needed.

What if the primary key was different?

For example a composite key on pair of columns (user_id, id). In such case every row sharing common value in user_id would be stored next to each other. In other words, all messages belonging to a single user would be kept together on one or maybe just a few pages. How would the example query execute now?

SELECT * FROM messages WHERE user_id = 50

Execution begins the same way, however after finding the first matching row, it turns out the other row that the query is looking for is actually on the same page that was loaded into the buffer pool just a moment earlier. So instead of two I/O operations, one was enough. What if there were a few hundred rows to read? One or maybe a few reads instead of hundreds. That is a significant gain.

Additional gain is that the query no longer needs to do the extra lookup through the secondary index and with the lower overhead it can execute even faster. Querying InnoDB tables by primary key values is faster, which is different from MyISAM where it does not matter for performance whether access is by primary or secondary keys.

More? Improved data locality not only improves this query performance. It also allows more efficient use of memory, which results in caching more relevant information and thus improves overall MySQL performance.

Why (user_id, id)? Using user_id alone would not be possible, because primary key enforces uniqueness, while each user may have many rows. With id being a unique incremental column also some inserts may still see improved performance. For example if application inserted ten rows all for the same user, they would possibly be stored on a single page.

When designing an InnoDB table you have to think ahead. None of this matters when database is small, but it can make a tremendous difference once data grows sufficiently. Unfortunately many solutions, which developers use to build applications, completely miss this. Out of many frameworks I know, very few even support composite primary keys through their CRUD modules. On the other hand, thanks to this performance consultants have so much work ;-)

[MySQL Health Check]
About Maciej Dobrzanski

A MySQL consultant with the primary focus on systems, databases and application stacks performance and scalability. Expert on open source technologies such as Linux, BSD, Apache, nginx, MySQL, and many more. @linkedin

Comments

  1. You say:
    “For example a composite key on pair of columns (user_id, id). In such case every row sharing common value in user_id would be stored next to each other. In other words, all messages belonging to a single user would be kept together on one or maybe just a few pages.”

    But I don’t get it how this happens in reality. If I insert a (user_id = 50, id = 10000000) record, then insert 999 other records with “user_id 50″, and then a (user_id = 50, id = 10001000) record, how would InnoDB store the two “user_id = 50″ records next to each other on the same disk page? Where would the other 999 records have been stored?

    • Ivan,

      If these other 999 rows will carry the same user_id value, they would of course be stored “between” the first one (10000000) and the last one (10001000). My example was rather pointing to a more common situation where those other 999 rows would belong to other users.

      • That was exactly my question but my “not equal” was considered an HTML tag and obviously stripped. Please edit my question, it must read:
        …then insert 999 other records with “user_id (NOT equal to) 50″…

        The original was (I’ll use HTML quotes now):
        …then insert 999 other records with “user_id <=> 50″…

        • Ivan,

          These other 999 rows would be placed in other pages, appropriate based on the primary key values. For example:

          [page N]
          ..
          (49, 9389274)
          (50, 27234)
          ..
          (50, 9999921)
                 (50, 10000000)
                 (50, 10001000)
          (51, 987342,  ...)
          ..
          (51, 7834221)
          ... free space ...
          
          [page N+1]
          (51, 8212112)
          ..
          (51, 9529792)
                 (51, 10000001)
          (52, 8934)
          ... free space ...
          
          [page M]
          (212323, 7872342)
          (212324, 5928343)
          ..
          (212324, 9327222)
                 (212324, 10000002)
          (212325, 7889812)
          ..
          ... free space ...
          

          Please keep in mind that tables are not like flat files. InnoDB intentionally leaves some free space in each page, so that it could easily add new rows “in the middle” if necessary. If a page becomes nearly full (the factor is 15/16), it is immediately split into two half-empty pages.

Speak Your Mind

*