MySQL Query Performance Tips
Almost all of the modern web applications involve the use of relational database. So the performance of your applications are very much affected by the quality of the SQL queries you run. Here are some simple performance tips from MYSQL forge that you should consider when writing queries.
- Use EXPLAIN to profile the query execution plan
- Use Slow Query Log (always have it on!)
- Don’t use DISTINCT when you have or could use GROUP BY
- Insert performance
- Batch INSERT and REPLACE
- Use LOAD DATA instead of INSERT
- LIMIT m,n may not be as fast as it sounds
- Don’t use ORDER BY RAND() if you have > ~2K records
- Use SQL_NO_CACHE when you are SELECTing frequently updated data or large sets of data
- Avoid wildcards at the start of LIKE queries
- Avoid correlated subqueries and in select and where clause (try to avoid in)
- ORDER BY and LIMIT work best with equalities and covered indexes
- Separate text/blobs from metadata, don’t put text/blobs in results if you don’t need them
- Delete small amounts at a time if you can
- Make similar queries consistent so cache is used
- Don’t use deprecated features
- Turning OR on multiple index fields (<5.0) into UNION may speed things up (with LIMIT), after 5.0 the index_merge should pick stuff up.
- Use INSERT … ON DUPLICATE KEY update (INSERT IGNORE) to avoid having to SELECT
Now if you know all the above tips, then may be you want to start looking at database scaling, design and more from MYSQL forge.
Here are some more detailed articles about optimizing MySQL for those of you want to get your hands dirty:
- MySQL Queries Optimization by Peter Zaitsev
- Table Design and MySQL Index Details
- MySQL & PHP Performance Optimization Tips
- MySQL Presentations: Optimizing MySQL
Comments so far
excellent tips, very useful for newbie like me, thks for sharing ^_^
I’d also like to add, in extreme niche scenarios, using a Unix epoch style timestamp (with an INT datatype) is more efficient than the DateTime field. I can’t vouch for the performance difference, it’s probably pretty small, but back in 2001 with MySQL 4, I saw performance gains in the 8 fold difference (that’s 800%).
But I have faith in today’s hardware and MySQL 5 builds and I haven’t attempted to recreate my old benchmarks. Despite, the fact is MySQL has to store date/time internally, the format you see is massaged for humor consumptions, so conversions are being executed here, the less conversions the better. And on that note, I recently ran across a system where the coder irresponsibly retrieved all data for a certain date by doing
SELECT * FROM table WHERE date LIKE ’2008-11-01%’
Instead of simply using BETWEEN, obviously this incurred massive slowdowns.
The last one “INSERT … ON DUPLICATE KEY update (INSERT IGNORE) to avoid having to SELECT” in some cases can be really bad for your slaves, as INSERT IGNORE will propagate to all slaves, causing slave lag. If you expect lots of INSERTs to dupe, it’s better to SELECT and INSERT selectively.
Artem
Thanks very much for taking your time to create this very useful infos
Post a comment
Trackbacks/Pingbacks