How InnoDB works with transactions and auto recovery

How InnoDB work with transactions:

When any transaction will be completed with COMMIT,  InnoDB will write those changes in InnoDB Buffer Pool. After that InnoDB will run some background operations like checkpoint.  Checkpoint is the most important operation which will writes the changes on disk.   Lets see how it will work.

During the checkpoint phase, InnoDB writes dirty pages to the double write buffer, and then writes pages from the doublewrite buffer to the actual tablespace. During checkpointing, as pages are flushed to the actual tablespace making the data changes persistent on disk, log_sequence_numbers (LSN) are also updated on the pages. The LSN info written to the page is what identifies whether a data page has current data or not, during the crash recovery phase.

How InnoDB does crash / auto recovery:

In the starting of crash recovery, InnoDB will compare the LSN of every log record to the LSN in the data pages to check if the changes in log record should be applied to the data pages or not. If the LSN in the data page is more current than the LSN of the log record then it means log records are not applied  to the data pages. In this case, crash recovery will be started. InnoDB will read the log files, find the missing log records and apply them to data pages and after that InnoDB will be started.

The purpose of checkpoint is to free up space in the InnoDB log files, because log files are fixed in their size, so before InnoDB can  overwrite log records with new ones, those log records must be flushed to the data pages.

How innodb_flush_log_at_trx_commit is important here:

The setting innodb_flush_log_at_trx_commit is the important one as it will guarantee the consistency of data by making sure that the changes made by a committed transaction have been logged to the log file on disk. So the setting above should make the backup consistent, provided that there are no write caches on disk that cache writes. Because if you have a write cache on disk enabled, then InnoDB would be fooled by the disk into believing that the transaction modification log has been successfully written to disk, while it would only have made it to the disk’s write cache. So this is something to be kept in mind. Otherwise, setting innodb_flush_log_at_trx_commit to 1 should make the backup you take consistent from the point of view of InnoDB.

One thought on “How InnoDB works with transactions and auto recovery

  1. If MySQL crashes, there may be data that is changed that has not been written to disk. Those data pages were marked as “dirty” in the innodb_buffer_pool, but after a MySQL crash the innodb_buffer_pool no longer exists. However, they were written to the redo log. On crash recovery, MySQL can read the redo log (InnoDB log files) and apply any changes that were not written to disk.

Leave a Reply