Monday, December 31, 2007

What is a Production DBA?

I often hear recruiters say that they are having a hard time to find a MySQL production DBA. Well, since MySQL is not yet that mature as compared to Oracle, most DBAs are therefore relatively unseasoned.

And most DBA are not into production work yet. So, what are the experience / characteristics of a production DBA? Here are my two cents.

1. Manages 20-30 databases which are 25-30 GB in size.
2. Knows what to do when he is on-call.
3. Understands the inner workings of MySQL. For example, does he know what mysqldump actually do aside from backing up the data? How will it perform with a 25 GB data?
4. Based on the 1st criteria, the DBA will be careful in writing his queries. For example, he knows that select * from a table is query that should be banned from MySQL.
5. Very handy with scripting languages which could be PERL, Shell, etc. If the DBA has to update the configuration of 30 databases, he cannot do this manually and therefore he has to rely on his scripting skills to be able to whip a quick script to do the work.

Wednesday, December 26, 2007

JasperServer Evaluation

This package is one of the many parts of the JasperSoft BI solution. I'm currently evaluating the JasperServer and below are my comments.

1. The documentation for JasperServer is terrible. Unlike other documentation which assumes that you don't know anything, JasperSoft assumes you know their stuff already. Because of this, it is quite hard to understand.

2. iReport which is the GUI interface to create JasperReport looks cool but I can't seem to get it working with JasperServer. Again, because of poor documentation or none at all. Oh wait! There is documentation which I think can be found in the 'JasperServer Ultimate Guide' which you have to buy for $50. Hmm, maybe a good way to make money for open source company. Create a nice looking open source software, but sell the documentation. Sweet!

3. I almost forgot this... when I was installing JasperServer, typically you would expect that if you have an error, the installation will stop or just let you know that there was a problem. With JasperServer, you have to look at the logs and make sure that you don't get any Java related errors. Sweet!

Well, I'm still evaluating this software but my other co-worker is evaluating InfoBright which I heard is a MySQL storage engine that compresses and makes your data ready for data warehouse processing.

Wednesday, December 19, 2007

How to Analyze Slow Query Logs in a Production Database

I have 25 GB of data and I keep the parsed slow query logs in my database. How am I going to analyze these slow query logs? Sure these logs can tell you the top 10 slow queries, but is this the only use of these logs?

What if I migrate my database to a new hardware? How do I compare the slow query logs of the two hardware? The problem is that this is a live system and therefore the amount of query are not the same, so it is like comparing apples and oranges.

After a cup of coffee, I realized that I would just do the following:

1. Get the high level average count of slow queries and average query time for 1 week before and after the migration.
2. Drill down the average count of slow queries and average query time on a server by server basis.

Tuesday, December 18, 2007

Essential MySQL DBA Skills

What are the skills of a MySQL DBA? Here are my 2 cents.

1. Solid understanding of database system including design and how data is processed.
2. Handy scripting language that you can use like PERL, Python, or even Shell.
3. MySQL Backup, High Availability and Performance Tuning.
4. SQL language.

Monday, December 17, 2007

How to Use Delete SQL in a Production Environment

I made a mistake in using the delete command in our production environment. I had 17 million rows and I only needed to keep 3 million rows, so I issued a delete sql to erase the 15 million rows. Bad move! Because I immediately saw how the I/O of the machine went up very high and the delete sql caused a huge load spike. On top of that, I had to run an optimize table command because MySQL did not free the deleted space and this caused another load spike.

In retrospect, I should have done the following items:

1. Instead of delete sql, I should have issued a create new table and select the 3 million rows from the old table. This would have been faster and it would not be necessary to run optimize table.

2. If I have to use delete and and optimize table, I could have copied the table to a non-production database and run the commands there. After it has finished executing all the processes, I can copy the new tables back to the production database. I have this option since there is no transaction happening on those tables on an hourly basis.

Sunday, December 16, 2007

MyISAM 4GB Default Limit

I was coincidentally doing a show table status on couple of tables when I saw that one of the MyISAM tables has reach the 4 GB default limit. The table stores all my show processlist parsed data and apparently my parsing and importing script did not notify me of any errors like the table was full or something. Anyway, I solved the problem by running the command below.

ALTER TABLE MAX_ROWS = 1000000000;

A good discussion on MyISAM 4 GB default limit can be found on this link:

Saturday, December 15, 2007

How to Test a MySQL Backup Recovery

Okay, so you have used some mysqldump to backup some of your data. But have you wondered how to test your backup? What if mysqldump does not really 'backup' your data? It is actually quite simple to test your backup and that is to verify the restore of your backup.

Here are a couple of ways to test your backup recovery (assuming that you have restored your backup already).

1. Do a select count(*) on all your tables before and after you restore your data.
2. Do a md5sum on your MySQL files before and after you restore your data.

Obviously when you do test your recovery process, you need to make sure that no data is being changed while you are doing it.

Friday, December 14, 2007

How to Issue an SQL to all MySQL Database without Replication

Here's the replication structure: 1 master, 5 slaves under it, each slaves have another 5 slaves under it.

Here's the problem: What if you have to update a couple of big tables and you know that if you apply it to the master that it will take a long time to replicate? What can you do?

Here's the solution: Run the update queries on each database by turning off replication. This can be done by running the command below.

set SQL_LOG_BIN = 0;

Thursday, December 13, 2007

First Day of the Month

I was looking for a function in the MySQL documentation on how to get the first day of the month and I couldn't find one. Although I found that there was a LAST_DAY function. So, here's my function to get the first day of the month.

select DATE_ADD(LAST_DAY(DATE_SUB(CURDATE(), interval 30 day), interval 1 day);

This is how I computed the first day of the month. I'm not sure if there is a simpler way of doing this.

Wednesday, December 12, 2007

MySQL Volume Partition

When I say volume partition, I'm referring to how to separate the different structure of your MySQL components to different volumes. For a newbie, it is so easy to just install MySQL using its default configuration. This means that your binary logs (assuming you are using it) will be in the same volume as your data files.

There is nothing wrong with this if you have a small database, but once your database grows, you will see an impact of what you did. What you sow, is what you reap. Let me explain this further, remember that binary logs needs to be written to the hard disk and the same goes for the data files. Imagine if there are two processes writing to your hard disk at the same time, what would happen? Setting aside the technological jargon, it will take longer to perform the necessary action.

This is the reason why you have to be careful in partitioning your MySQL into different volumes. Make sure that the binary logs and the data files are in a different partitions.

Tuesday, December 11, 2007

Book: DBA for Dummies

I just checked now and they do not have a book on DBA for Dummies. This was not surprising to me as I know that there are only a handful of DBAs out there specifically for MySQL.

And what topics would be covered in this book? Replication, backups, security, etc... But all of these topics can be found in other MySQL books. So, why would someone write a book for DBA for Dummies? Because it would contain advance topics or a in-depth details on the topics just mentioned. For example, how do you do circular replication? Or how do you manage 25 databases?

Monday, December 10, 2007

It's Monday... What do I do as a DBA?

1. Check if all the weekend database related jobs ran, this could be backup, or some data-related processing work.

2. Check your replication and make sure it was running smoothly (you should actually check this everyday).

3. Check your logs for any related errors.

4. Check the scripts that you are working on.

Sunday, December 9, 2007

Oracle vs. MySQL

I heard someone said that if you do not need to know the intrinsic details of the database, then go for Oracle. I love Oracle because of its robustness. No database can compare against it. It has been there for a long time... Some other database that rose together with Oracle have already died down, but this database still lives.

Am I going to compare MySQL and Oracle here? Nope. For all you know, you really did not have a choice as to which database you have to choose. Your boss chose it for you. So, what is the point of having a title called 'Oracle vs. MySQL' if I'm not going to compare them? Nothing really. It's just a point where there are certain things in life that you really don't have control about.

Saturday, December 8, 2007

MySQL Database Hardware Migration

What do you do when your database is falling behind?

1. Optimize the queries that are running on your database.
2. Try to partition your data so extensive I/O database will be distributed evenly.

If both approach don't work, you have one option left and that is to migrate your database to a new hardware. Here are pointers on what you have to do.

1. Backup your previous data.
2. Make sure that the new hardware configuration is identical with the old hardware.
3. Prepare for a downtime meaning you have to shutdown your database.
4. Copy your data to the new hardware.
5. Test your data in the new hardware.

Friday, December 7, 2007

Help! My Database Stopped Working

What do you do when MySQL stopped working? Here are my 2 cents on what you should do.

  1. Check if the database is running. Check the MySQL process.

    ps -aux | grep mysql

  2. Check the error logs. Where is the location of the error logs again? The default location would be at /usr/local/mysql/data/ and look for a file with .err at the end of it. Sample errors would be:

    • Database got restarted - check with your sys admin why it restarted.
    • Replication failed - do a show slave status and check the error then restart the replication.

  3. Check the MySQL process list and see if there are any read / write locking happening.

    show processlist

Thursday, December 6, 2007

MySQL Eggs

When I was playing some Windows games before, there used to be eggs like a combination of certain keys would produce messages like the program pays tribute to blah, blah, blah. I wish MySQL has something like that... that if I press certain combination of keys that instead of the language in English, it would become a combination of English and unknown language. Just a thought.

When you look at the new feature set of MySQL, they are just so serious in trying to become the no. 1 database out there. Where is the fun??

Wednesday, December 5, 2007

My.cnf File

This is the configuration file of MySQL. For new DBAs, depending on your needs (eg. if the database is small, medium, large), just copy the appropriate my.cnf in /usr/local/mysql/support-files/ . Then just change couple of parameters and you are all set. There is no need to make a huge fuss about it. Many experts will tell you to change this file, I recommend not to listen to them. MySQL was created to be simple... If you really need to change it, then make sure you know what you are doing.

Tuesday, December 4, 2007

MySQL Conference 2008

Yep. Guess you have heard that there is such a thing called "MySQL Conference 2008". In fact, this has been a yearly event. This year it would be held in Santa Clara, California on April 14-17, 2008.

But the question is should you attend at all? Well, if your company is going to pay for it, why not?! But if you are going to pay for it, think twice. Sure, you will learn a lot, but do you really need them especially since you are just starting as a MySQL DBA? Even if you are an intermediate DBA, it would just be for fun if you are going there... I'm not sure if the learning value outweighs the cost.

Anyway, check this out for details...

Monday, December 3, 2007

How to Install a MySQL Database

Just search the MySQL documentation and it will clearly tell you on how to install MySQL. Just for clarification, when you install MySQL. There's the installation for the server part and there's the installation for the client part.

When your boss or peers asked you to install MySQL, they are referring to the server part. The server is what you use to create your database and store your data. The client is what you use to 'query' your database server.

For simplicity sake, use MySQL RPMs. This is used in a Linux system. I assume that if you are using Windows that you should be able to guess what you need to do. There's a reason why you should use RPMs and the reason is that it is simple, efficient and easy to manage.

As a newbie, do not be tempted to do binary installation eventhough it looks easy and believe me it is easy too. But RPMs are way easier... here's how to install a MySQL database in a Linux system.

1. Download the appropriate MySQL RPM for your Linux system.

1.1 32-bit or 64 machine??? Oh, you need the assistance of your system administrator for this... specifically ask them if you have a 32-bit or 64-bit. For 32-bit machines, you need the RPM with 386 appended to it and the 64-bit has the 686 appended to it.

1.2 What kind of operating system? Again, ask your system administrator. It is easier to ask than to figure it on your own. So, if they tell you they are using a Fedora Core 6 (note on the version of the operating system, it is important), then just look for the corresponding RPM in the MySQL website.

2. Install the MySQL RPM.

Ahem, ahem... use this command....

rpm -ivh .rpm

How hard is that? Oh, in case you made a mistake. There are two things that will most likely happen, you will get an error saying that you downloaded the wrong RPM (you can blame your system administrator or you can blame your boss for giving you a small monitor -- can't see that right rpm in a 19" monitor) or you installed the wrong RPM.

If you installed the wrong RPM, use this command to erase it.

rpm -qa | grep -i mysql ((a) this will tell you what MySQL version was installed.)

rpm -e (result from the above) (This will remove the RPM.)

So, that's it. Your next step as outlined in the MySQL documentation is to start the server, issue some grant SQL, etc...

Sunday, December 2, 2007

Pareto Principle in MySQL

Pareto Principle, also known as the 80/20 rule, states that 80% of the output comes from 20% of the input. How are we going to apply this to MySQL?

If the MySQL documentation is 1,000 pages long (I'm giving an arbitrary number here), then according to the 80/20 rule, in order to learn 80% of MySQL, you just have to study 20% of the documentation which is 200 pages.

But! 200 pages is still a lot. Non-fiction books are almost 200 pages long. What if we apply 80/20 rule to this situation again? So, to learn 80% of the 200 pages, you just need to learn 20% of it which would be 40 pages.

Think about it, in a daily DBA work, do you really need to know 40 pages worth of MySQL documentation? Most likely not, you may have intuitively applied the 80/20 rule again and again until you are left with probably 3 pages worth of MySQL documentation which are the bare essentials to your work.

Saturday, December 1, 2007

MySQL Logs

My previous boss told us that in order to become a good system admnistrator, one must know how to parse the 'logs'.

I hate the 'logs'. As you are well aware of, there are three major logs in MySQL and those are the error logs, slow query logs, and the binary logs.

1. Error Logs

This is the simplest of all the logs and the most important for all the newbies. In case you forget anything, remember to know where the error logs are located because it will save you a lot of headaches in the feature.

2. Slow Query Logs

The infamous utility called the mysqldumpslow will help you parse your slow query logs. If you have a small database, then this can help you. But what if you have a 25 GB database with a rate of 1,000 queries per seconds? Mysqldumpslow will still be able to help you, but you may have to parse the slow query logs on your own.

You can use your favorite language, PERL, Python or even AWK to help you parse the slow query logs. Come to think about it, they are just text files. Is it easy to write a parsing script? Hmmm, it depends on your skill but it is worthwhile to learn regular expression and ways to parse these files on your own.

3. Binary Logs

When I first heard about this type of log, I hate it! This log contains all the changes done to your database. And there is a parameter in the my.cnf that controls the growth of the binary logs and it would be painful if you don't set it right.

Imagine the binary log grew to 2 GB, when you use the mysqlbinlog to convert the binary logs to text files, imagine how long it would take? Or what if you know a position in the binary log that you want to extract, how long would it take the mysqlbinlog to get it from a 2 GB file?