Note: This Wiki is outdated, personal views may have changed.
MySQL Optimization


On medium to large databases (more than 3GB of data in one table) it's very important to have a sufficiently optimized database - the standard programmer uploads the data and spends his time and effort optimizing the code instead of the database. On really large databases a large key_buffer_size can help. But before playing with database settings, it is essential to study and experiment with what the indexes are doing. In many cases a simple change to the query can take a 30 second response time down to 0.3 or even 0.03 seconds.

Don't believe that indexes will just magically help your performance as soon as you create them. There are issues that can render indexes useless: even tables with proper indexes can have queries that scan almost the entire table, even though the index itself is generated optimally. For example, calling the UPPER() function, or applying any function to the indexed column on the left-hand side of a comparison, can cause your table to be scanned almost entirely - because MySQL isn't as smart as one would assume. Don't assume anything.

Everyone says indexes help performance - but it isn't as simple as creating the index.
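As a hedged sketch (the customers table, email column, and index are hypothetical), this is the left-hand-side function trap described above:

```sql
-- Hypothetical table: customers(id, email) with an index on email.
-- BAD: the function wraps the indexed column, so MySQL cannot use the
-- index and scans every row:
SELECT * FROM customers WHERE UPPER(email) = 'BOB@EXAMPLE.COM';

-- BETTER: leave the indexed column bare and normalize the constant side
-- instead (assumes the column is stored in a known case, or uses a
-- case-insensitive collation):
SELECT * FROM customers WHERE email = 'bob@example.com';
```

The second query can seek directly into the index; the first one cannot, no matter how well the index was built.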

One thing to try, before you get carried away with increasing key_buffer_size or creating indexes, is to delete all your indexes and get a baseline figure for how your database performs without them. Write it down, and save an SQL script that regenerates the indexes. If after deleting your indexes (except the primary key) the database is much faster, or there is no difference, then you've probably got to change something in your SQL statement.

If you know about indexes, you might be thinking: why not just run the query without the index, by specifying in the SQL query that no index be used? Well, deleting the indexes is still better - it gets rid of them so that you don't have any room for mistakes (such as forgetting to specify no index). Make sure, however, that you know how to regenerate your indexes if you are going to delete them. Deleting your indexes could cause your development team to come and kill you - don't do this unless you know what you are doing and you know how to recreate your indexes.
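A minimal sketch of that workflow (the table and index names are hypothetical):

```sql
-- Save a regeneration script like this BEFORE dropping anything:
--   CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Drop the secondary index to get a baseline without it:
ALTER TABLE orders DROP INDEX idx_orders_customer;

-- The alternative being argued against: keep the index but tell MySQL
-- to ignore it for a single query - easy to forget on one of your tests:
SELECT * FROM orders IGNORE INDEX (idx_orders_customer)
WHERE customer_id = 42;
```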

If you are fairly fast, but not fast enough

If you find that indexes help and you have no problems in your SQL statements (such as functions on the left-hand side of comparisons), and you still want to tune the DB to be faster, try tuning key_buffer_size (on a shared server you cannot tune this yourself - you will have to contact your web host).

Before you increase the key_buffer_size you must realize a few things: on shared web hosts this is hard to do, since you must have root access. Most shared hosts set up the typical 8MB, 6MB, or 64MB key_buffer_size. This is usually far too small a key buffer size for large databases, though 64MB usually does fine for most smaller websites without gigabytes of data. You usually only need to increase key_buffer_size once you start dealing with gigabytes of data.

Do not go crazy and increase the key buffer size too much. Try 10 percent of your memory, then 15 percent, or 25 percent if you are on a dedicated server. But don't do this until you really know that your indexes are large in size - you can view the index size in the MySQL Administrator program. So if you have a machine with 1GB of memory and it is your own server, try 100MB or even 150MB for the key_buffer_size, but ONLY if you need it. If your indexes are smaller than 100MB and you have a dedicated server, then a larger buffer isn't going to help much at all.
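Before touching the setting, MySQL will tell you whether the buffer is actually undersized (the table name below is hypothetical):

```sql
-- How big are the indexes really? See the Index_length column:
SHOW TABLE STATUS LIKE 'orders';

-- Current buffer size, and how often key lookups miss the buffer and go
-- to disk - a high Key_reads to Key_read_requests ratio suggests the
-- buffer is too small:
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW STATUS LIKE 'Key_read%';
```

The setting itself goes under the [mysqld] section of my.cnf (for example key_buffer_size = 128M) and takes effect after a server restart.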

In order for the key buffer size tuning to actually take effect, you must already have created indexes. I won't discuss how to create proper indexes in this article, as that is another story you probably already know. Don't do any tuning until you first have proper indexes.

Increasing the key buffer size is not the one and only thing that makes large websites fast. But it is one thing that folks overlook - they install the typical MySQL database and leave the default settings on.

Since on many shared web hosts we cannot change MySQL settings like key_buffer_size, Z505 Software has seen drastic increases in website speed just by tweaking the SQL statement. One little change to an SQL statement can make a 5 or 30 second response time become a 0.03 second response time.

EXPLAIN SELECT command is overrated

Putting the EXPLAIN command in front of your SELECT statements can sometimes help you solve index issues - however, much of database tuning is about knowing how the database actually works. For example: why does EXPLAIN show that 1 million rows are being scanned, while at the same time reporting that your indexes are perfectly fine? You must know why 1 million rows are being scanned when you only want 1 or 10 thousand to be scanned. The EXPLAIN command is not magically going to tell you whether the key buffer size is large enough - it can only give you hints.
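A hedged sketch of reading that output (the orders table and its columns are hypothetical). The number to watch is the rows estimate, not just whether a key is listed:

```sql
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42 AND status = 'shipped';
-- EXPLAIN may list a perfectly good key on customer_id, yet still
-- estimate a huge rows count if that customer has many orders. The
-- index is "fine" but too broad; a composite index narrows the scan:
CREATE INDEX idx_customer_status ON orders (customer_id, status);
```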

Put yourself in the customer's shoes

Pretend you are the customer and you are visiting a random page backed by your database - a random page that pulls rows from the very end or the very middle of a table. Test from the command line, and not just from a GUI client.
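For example (table name hypothetical), a quick way to simulate a visitor landing deep in the data rather than on the freshest rows:

```sql
-- Rows near the start of a big table often come back fast even on a
-- mistuned server; rows from the middle are the honest test:
SELECT * FROM articles ORDER BY id LIMIT 10 OFFSET 500000;
```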

Test from CGI, php, asp, or your website

Sometimes, connecting through the command line or a MySQL GUI still isn't the same as a web program. In our experience, some databases have quick response times when command line and GUI clients are used, but things are much different when the CGI or web program connects - even when benchmarking only the SELECT and including no browser output time (i.e. setting a clock timer before and after the query and printing out the response time, completely separating the query time from the browser output time), and even with the exact same connection parameters.

This is not an exact science and requires real world experimentation.


Real world study

Z505 Software uploaded a 5GB database with optimal indexes on a shared server, where default key buffer size was about 67MB.

Performance was extremely slow when doing queries in the middle or at the end of the database table (30 second response). The absolute simplest queries were used and proper indexes had been created. The server was a dual Xeon with 4GB of ram.

At the very beginning of the table (first 10,000 rows out of millions) the database responded quickly (0.3s-2s). When accessing about the 500,000th row in the database (simple SELECT) the database sometimes took 30-40 seconds to respond. Testing was done over a period of 5 days in different server conditions.

The website was only getting 1 unique visitor per day (the tester's IP), and yet it still took 30-40 seconds to respond. One minor setting such as key buffer size can severely affect the performance of a large database.

After tweaking the databases using experimental testing - keeping exact track of SELECT query timing, paying close attention to the number of rows being scanned to obtain results, and tuning different settings and indexes - the response time was down to 0.3-0.4 seconds, and down to near 0 seconds using caching, even at the end or in the middle of the database. Then, after we modified the database table to contain a special extra column which kept some sorting intelligence about the rows, the response time moved down to 0.02-0.03 seconds (20-30 milliseconds).

Many times storing some intelligence in a separate column is all it takes to make a 3 second query become a 0.03 second query - but knowing how and why to implement this intelligence can be tricky. An index can shrink significantly when the intelligence column contains much less data than the actual column being selected, because the intelligence column holds shortform, known, presorted information about the SELECT column(s).
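One possible shape for such a column (a hedged sketch - the table, column, and key names are all hypothetical, and the 8-character prefix is an arbitrary choice):

```sql
-- Store a short, precomputed, presorted key instead of indexing the
-- bulky title column directly:
ALTER TABLE articles ADD COLUMN title_key CHAR(8);
UPDATE articles SET title_key = UPPER(LEFT(title, 8));
CREATE INDEX idx_title_key ON articles (title_key);

-- Queries then filter and sort on the compact column, whose index is
-- far smaller and far more likely to stay resident in the key buffer:
SELECT * FROM articles WHERE title_key = 'MYSQL OP';
```

The application must keep title_key in sync on every insert and update, which is the "tricky" part the paragraph above warns about.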

Database tuning can drastically change a website's response time - so don't buy new hardware just because the site starts to slow down or the database gets large. Consider first studying index buffering, and even deleting your indexes. Be experimental; don't always pay attention to what the book says.

Another problem with servers these days is that they have such powerful processors and so much memory that even somewhat inefficient, poorly tuned databases and queries can appear to run fast during limited testing at the command line or in a GUI client. But once the site becomes larger and has more traffic, all of a sudden everyone thinks it must be time to purchase new hardware, because the script worked fine when they tested it months ago. Little do they know that a 1 second response time on a powerful server may actually signal a serious flaw in the database query or database settings.


Database Consulting

Z505 Software offers free tips for optimizing databases, however we also do database consulting and custom database work. dbconsult(@)z505(.)com
