Sunday, June 8, 2008

Cause of Replication Delay? I/O or SQL Thread?

I was thinking of this and could not figure it out or maybe I am just confused... Hehehe.

If MySQL is lagging behind the master (7,000 secs.), would it be caused by the I/O thread being slow or by the SQL thread (of course, this is assuming that I only have these two options to choose from)? If it is the I/O thread, what should I be looking for?

I've always assumed that the I/O thread would not cause a huge replication delay as it is just reading the binary logs from the master. So, if the Master_Log_File and Relay_Master_Log_File are the same and the Exec_Master_Log_Pos and Read_Master_Log_Pos are the same too, then I can assume that the replication delay is not based on the I/O thread, right?

And does anyone know how to compute the Seconds_Behind_Master???


Volcane said...

Replication often lags - on a busy site its the natural state of things unfortunately.

Your master server no doubt has many parallel threads writing to it, many webservers and each webserver with many threads each working in paralls and all your cores are being used by all the threads, as you'd expect.

Contrast this with the slave, the slave has a single thread reading from the logs and replaying them, in a serial manner. It is only using one core of your server and the next query in the log will wait for the previous one to finish.

The end result is that with a busy master, your slave will lag a lot.

Andrew Johnstone said...


Take a look at googles mysql tools, it contains a number of extra stats for MySQL in particular for replication.

Jay said...

The lag is on the SQL thread which is replaying transactions. The relay thread (IO thread) receives the transaction from the master within nanoseconds of the master writing it out, and the IO thread has written that transaction for the SQL thread to replay shortly thereafter.

Baron said...

Compute the Seconds_Behind_Master from what? From the other stats in the output of SHOW SLAVE STATUS? You can't do it.

You should ignore it anyway, it's unreliable. A heartbeat record is based on actual facts, and is reliable. See mk-heartbeat in Maatkit.

Anonymous said...

Hi Baron,

I have just installed the mk-heartbeat as I noticed something peculiar with the Seconds_Behind_Master.

I was monitoring it by doing a show slave status per second and I noticed that the value was fluctuating. It was going from 4200 seconds to 0 second.

I guess the Seconds_Behind_Master is really not reliable. I wonder if this is something that will be fixed or removed in the future version of MySQL.

Greg Haase said...

I've recently been experiencing some issues with the I/O thread being behind, so let me take a crack at this one.

On the master, log into the database as a privileged use and and issue "show master logs". Note the last row - both the Log_name and the File_size.

Then on the slave, do a "show slave status" and note Master_Log_File and Read_Master_Log_Position. Hopefully, the numbers will be about the same.

In my case, the slave was 3 bin-logs behind.

We were seeing strange symptoms where the slave would report itself 12000 seconds behind, and then 0 seconds behind. It was flip flopping back and forth.

I finally figured out that when the SQL thread has finished all it's available statements from the relay log, it says it's 0 seconds behind, since it can't know when the next query is going to come. When the relay log fills in and the next query runs, it picks up the timestamp of that query and says "
Oh crap, I'm 3 hours behind". It processes the query, doesn't find any more commands to process and says - Oh good, I'm done - 0 seconds behind. Then the next command comes in and the slave looks at it and says "Oh crap, I'm 3 hours behind".

Hope that helps.

Anonymous said...

Hi Greg,

That's a possible explanation and it could also mean that the Second_Behind_Master is computed based on the recent query that the slave will try to execute.

I'm still trying to figure out how the slave is computing the Seconds_Behind_Master... I may have to resort in looking at the code.

Baron said...

Take it from someone who's looked at the source: the source is tough to grok :-) The value is computed by looking at the server's current timestamp and the timestamp of the executing query. The binlog events (and hence the relay log events) have a timestamp in them. Unfortunately, through a variety of means they can get really snarled up, and that's why you'll see the weird values. Don't waste brain time on it, just use a heartbeat record. There's no way to fix Seconds_behind_master the way it's done now.

Anonymous said...

Thanks for saving me time, Baron.

Sheeri K. Cabral said...

My understanding is that seconds_behind_master is calculated by taking the current master's time, as in the relay log. So if the I/O thread is lagging, the seconds_behind_master is incorrect.

check out

However, it's fairly easy to see which the cause of replication delay is -- you can see if the Read_Master_Log_Pos (from show slave status on the slave) matches the actual master position (from show master status on the master). If it doesn't match, it's lagging behind because of the I/O thread.

Anonymous said...

Hi Sheeri,

Thanks for your comment. But maybe I'm just confused but does it jive with what Baron mentioned on how the Seconds_Behind_Master is computed?

He mentioned that the value is computed by looking at the server's current timestamp and the timestamp of the executing query.

Baron said...

But the timestamp of the executing query, and the assumed timestamp of the master relative to that, come from the relay log. And this is written by the I/O thread, and if the I/O thread lags, as Sheeri mentioned, it gets all borked up in the relay log. (Try it: intentionally delay the I/O thread and then compare the relay log to the binlog on the master). It will make your head spin :-) Oh, it's not just the timestamp of the query. That's not altered in the relay log -- the CURRENT_TIMESTAMP at which the query executed on the master is preserved to the slave. What's not preserved is the exec_time in the log event, and this is part of the calculation of how far behind the query is.

Anonymous said...

Thanks for the clarification.