Speed up your app with MySQL VIEWs and query_cache
One of the issues with creating highly normalized database designs is that you invariably end up with a lot of inefficient joins in your queries. This can really start to slow down an application. At RentPost, we were recently faced with this issue when building out a new flexible address system for all the entities within the database. The db schema is awesome, super flexible, but it made the select statements an absolute nightmare.
Enter MySQL VIEWS with query_cache. Because these entities don’t change a whole lot and they’re primarily used for select statements, we can fully take advantage of the speed that query_cache can offer in this scenario. The trick though (unless you’re running a website with a large number of selects vs insert/update/delete queries) is to set query_cache_type = 2. This enables your query_cache in an ondemand fashion, so it’s not interfering with the rest of your queries.
So, now, you can create your super fancy VIEW with your crazy joins and then apply sql_cache like the following to enable caching on your VIEW.
CREATE VIEW `foo` AS SELECT sql_cache a.id, a.name...
This is simply going to tell mysql to cache this select statement and now your view is going to be super duper snappy! So, when you have a particular dataset that’s commonly used across your application, where you have a lot of join issues, the combination of VIEWs and query_cache can offer some serious speed benefits.
I didn’t run any actual speed tests on this, but I observed a localhost query in a loop go from roughly 30 seconds to just under 2 seconds. Your mileage will vary and this example is a very drastic one, but the benefits are clear when used in the right situation.
Here are the mysql config settings you’ll need. I’ve added these to my my.cnf
query_cache_size = 64M #query cache of 64M
query_cache_type = 2 #ondemand use sql_cache after SELECT
UPDATE:
I forgot to mention this when this article was initially written. But, seeing as this is getting a fair bit of attention, I figured I needed to update and share more information on the topic.
Firstly, I should note that we’ve moved away from a view and instead opted for triggers to denormalized tables. There were a number of reasons for this, but, in general, the maintenance of views became a headache for us. New development setups, database testing (sqlite build process for instance) made views a general frustration. Not to mention, by having a denormalized table, we could treat it just like any other table. And, while a view is very similar to a “normal” table, it’s certainly not the same.
This all said, I don’t think views are a bad solution and can certainly provide some great performance benefits. If you’re still interested in using views and going this route, there is a big gotcha that you must be aware of.
Understanding how query_cache works will make all the difference in the world in terms of performance. If you go with the above described setup and create the view, then just write some queries against it, per normal, you’re likely not going to be seeing any performance gains. In fact, you’re likely going to see some pretty large performance hits!
The reason for this is quite simple actually. Because of the way mysql’s query optimizer functions and the way views are built and stored in cache, you have to be aware of how you’re writing your query and what’s happening behind the scenes. Without going into all the detail myself, and because someone else has already done an excellent job, you can read more about this here and here.
The secret is to use a derived table. A derived table query should grab a copy (it’s important that the SELECT query is always the same) of the view from cache and assign it for use within the current query. If you don’t do this, it will force a new version of the view results to be cached. And, when the next query comes along to perform a query against the view, it will need to completely re-create it, again. This is counterproductive and defeats the entire purpose for the query_cache in the first place. So instead, you should use a derived table, like so…
SELECT * FROM (SELECT * FROM view) as view WHERE view.name = "Jacob";
So, we just assign the full result of the view to an aliased derived table, then perform any additional filtering and modifications on the data within this query. This way, the original cached dataset of the view is kept in tact in cache as a full copy, not needing to be trashed and re-created by MySQL every time it’s used.
I hope this helps.