Monday, October 22, 2012


An Introduction to MySQL 5: Views

The MySQL 5 series introduced quite a few changes. Triggers and stored procedures were two of the big ticket items. One of the lesser known additions, at least from the amount of writing on the subject, is the introduction of views. While after a quick look at MySQL views, you might not see the obvious advantages, they’re there if you dig into them just a bit.


Introduction: What is a View

“A View is nothing more than a pseudo table doing the work of a defined query.”
In short, the easiest way to look at a view is that it is just a stored query which mimics a table. It’s nothing more than a pseudo table doing the work of a defined query. It doesn’t add any functionality, such as changing variables, nor does it fire other queries when something else happens. Views just sit there, fat dumb and happy doing what you defined them to do.
On first blush, this doesn’t sound like that big of a deal, but if you dig past the surface, you can start seeing some of the power of the lowly view. I am not going to say that views will change your life, but I will tell you that they will make the job of database interaction a little easier to work with. They will make your job a little easier when you make major version changes in your interaction layer. They will also make some difficult tasks such as dynamic reporting a little more efficient and a little easier to work with. I’ll take a little easier any day of the week.
With anything there are tradeoffs.
“Views may and probably will decrease your performance.”
As I have written in the past, I am a believer in making tradeoffs, as long as you understand what is on the table. More than likely someone will skim past this paragraph and make a comment that you should never use a view because of the performance hit. I disagree. You should use every tool in your toolbox, but when they make sense. You don’t use a hammer to put a screw into a wall, just as you wouldn’t use a view when you really need a heap / memory table. The flip side is the development usability for your applications. When it makes sense to save time, effort, and energy over the performance hit that you might take, take that choice. Development isn’t all about the performance of your applications, as there are other considerations such as support, time to market, and overall value of your time.
The tools that I am working with in this tutorial are pretty standard. I am using phpMyAdmin for the database interaction for explanation purposes. I will also be using very rudimentary table structures, strictly for ease of explanation. I don’t expect that these table structures would ever be used in a production environment, as I am merely using them for illustration.
One further note. There is no right or wrong way to name a view. However, I do name my views with the syntax of view_*primary_table*_*what_the_view_is_used_for* unless I am working for backwards compatibility changes. So, if I was creating a view for statistical reporting purposes on my salesforce table, my view name would be: view_salesforce_statistical_report. That can be rather long and you only have 64 characters to work with, so keep that in mind. It works for me, and my style, it might not work for you.
“I am not going to say that views will change your life, but I will tell you that they will make the job of database interaction a little easier to work with. They will make your job a little easier when you make major version changes in your interaction layer. They will also make some difficult tasks such as dynamic reporting a little more efficient and a little easier to work with.”

Definitions: How to Define a View

As I stated, a view is just a query. There are some slight configurations that we need to make when creating a view, so let’s discuss that first. In phpMyAdmin, in the “browse” table page, you will see a link called “Create View”.
Create View Link
If you click on that link you should see something that looks like this:
Create View Page
This, my friends, is where the magic happens. There isn’t much to explain on the page, but let’s take a quick look at the definitions that we need to understand to create a view.
First, there is “Create View” followed by “OR REPLACE”. If you click the OR REPLACE it will do exactly as you think, which is overwrite an existing view. Column names is the name of the columns in your table. Each is seperated by a comma, so it might look like first_name, second_name, etc. AS is the query.
There are two more items to explain, but the concepts are not hard. ALGORITHM has the selections of undefined, merge, and temp table. If you select “Merge” when there is a one to one relationship it will combine the incoming query with the view. Temp table is less efficient, but when you are not using a one to one relationship, such as a aggregation function like SUM() or you are using certain keywords like GROUP BY or HAVING or UNION, you have to use the temp table algorithm. That said, you can do like I do, and leave the algorithm as “undefined” and MySQL will select the best algorithm to use.
Finally, we have CASCADED CHECK OPTION and LOCAL CHECK options. The check options tell MySQL to validate the view definition if there is a condition on the query, like WHERE. If the WHERE clause excludes data, it will prevent updates or insertion of those rows where it should be excluded. The LOCAL CHECK deals with just the view you are defining, where CASCADED CHECK is for views that you have defined from other views. It will cascade the check for those as well.
That’s a view in a nutshell. Let’s take a look at some use cases to see them in action and where they may help your development efforts.

Backward Compatibility: For the Procrastinator

I have had it happen more times than I would care to mention when I design a single use table which I never think will need to be normalized further inevitably does. Let’s take the example that showed before with a few more records.
Not Normalized
Obviously, my normalization skills leave something to be desired in this example. What I probably should have done when I first created this table, was have a seperate table for addresses, and then just call an address_id in my sales force table. The problem is, once I move to a database change, I have to run back through my logical interaction layer and make numerous query changes. Instead of doing that much work, why not let a view come to the rescue.
First, let’s make the change to my table structure. I copy my table structure and data to my new table, addresses and make the table sane such as adding address_id and removing the unneeded structure:
Create Address Table
Address Table
Then I just need to delete the offending columns and add my address_id back to my sales table.
Sales Table
This is a pretty common change that you make on a semi-regular basis, although rather simplistic in nature. You figure out that you can normalize something because of a new feature. In this case, we can reuse our addresses for customers, or for other employees, or for whatever we might store addresses. This work isn’t that difficult, but depending on your query reuse, finding all of the places that you call our old sales_force table might be a much larger change in scope. In comes a view.
Instead of going back through our code right now, and instead wait for a normal release cycle, we can create a view to keep our old functionality intact. I changed the name of our sales_force table to sales_force_normalized:
Change My Table Name
Now we can create our view to maintain backwards compatibility:
Create My View
And we have our backwards compatibility with just the extra work of creating one query that sits in MySQL:
Backwards
Even when I enter a new sales person, my view will reflect the change:
Add Jin
And, presto:
Works!
About two minutes of work to maintain our backwards compatibility to our previous data structure. There are drawbacks to this method, in that you can not define an index against your view which is important when you are cascading views. In addition, you will still need to change your queries for INSERT, DELETE and UPDATE, but this will save you some work. Your performance could drop a bit, but as a stop gap, there is no easier way to make a change to your data structure to ease your code base into that change. Your queries in your logic layer will be untouched because as far as they know, they are looking at the original table.

Complex Queries: Making the Hard Bearable

Now that we have our proof of concept under our belts, let’s look at another use. I created another table to capture the sales data from my salesforce table and filled it with some random information. It looks like this:
It’s an extremely simplified table to capture the sales of the salesforce for illustration. There are always things that we want to extract for measurement on a table like this. I probably want to know the total sales. I probably would want to know the total sales by person. I also might want to know the rank of the sales performance. I could write queries in my database logic to perform each of these when I need them, or I could simply write a view to grab the data as needed. Since this is a tutorial about views, I guess the choice is pretty simple at this point which tactic to take.
Let’s start by evaluating the total sales, along with some other pertinent information:
Which gives us a view of:
I also included the query time on this one, as looking at 200 records, this is lightening fast, but performance will vary. Notice that I am not using the CHECK functions because I am not discriminating the information in a WHERE clause. Now that we have this information neatly packaged, it’s just a matter of building our reporting mechanism in our logic.
Getting this information isn’t that hard. Let’s take this just a step further and use a GROUP BY function and a join function against the salesforce. Again, I am using simplified queries to illustrate. In this case, we want to get the same information that we had from total sales, but this time broken down by our individual sales person.
Which gives us a view of:
Again, very simple in the end to get these values out of your database. Let’s take a look at one more example, which will combine the two views. I want to compare the totals against the individual, and so we will create a view of two views:
Which gives us a view of:

Conclusion

One other benefit of views is they do provide a further level of security to your applications. You are not exposing your table structure to your application. Instead, you’re exposing something that doesn’t really exist, except as a pseudo table. I wouldn’t call a view a best practice and use them primarily for writing secure applications, but I would look at it as an added benefit.
I use views in a limited fashion. Where I use views are as demonstrated above, particularly in the reporting mechanisms in my applications. Writing a query to perform the heavy lifting for me is much easier than writing the logic around more difficult queries. I do take a bit of a hit on my performance from time to time, which I tend to overcome by optimizing the original data structure. I’ve yet to have one be a show stopper in terms of performance, but the day is young. Thanks so much for reading.

No comments:

Post a Comment