Database lookup tables
February 27, 2009 · by Josh Swan
This past year I have been doing a lot of database design work. One of the things we had to decide was how to handle look-up codes for things like states. I was told that we would be breaking out each type of code into its own table. Each look-up table has approximately these columns: code, name, description, enabled, created_date, and updated_date. At first I was indifferent to the design since it seemed fairly clean. I noticed it was using the code as the primary key instead of a surrogate key but I let it go.
We have been using these tables for months now and we have grown to have around 100 look-up tables. I’ve decided that maybe this design wasn’t the best way to go. Here is how I came to this conclusion:
1) When querying our main table we often have to retrieve the associated look-up tables as well. This has resulted in queries that have 20 – 50 table joins in them causing a noticeable decrease in database performance.
2) Creating an application to edit the data for a code was more complicated since we had to search across 100 tables instead of only a couple tables or even just one.
3) Given that we used the code as the primary key we had several situations where the code changed and we had to update the references across the database. I was talked into this one with “that code should never change” from the business. If a surrogate key had been used these code changes would not have been as painful as they were.
4) Since we are using hibernate for all our database tables we also had to create an additional 100 hibernate models for these tables. This added some significant development time to the project.
5) Another thing that bothered me is that we kept creating these tables and they all looked exactly the same. Kind of felt like copy and pasting the same bit of code over and over again.
For these reasons, I have decided to use a single table in the future for look-up codes similar to the structure above except adding a discriminator column called code_type or type. This type column can be used for grouping related codes together in the table and returning a specific code set. Couple of notes about this new design:
- If a system so many codes that the table actually gets so large that it starts to perform poorly it may be necessary to create a couple of tables. For most systems, however, this is an unlikely problem since most databases can handle millions of records in a table effectively and codes tables don’t tend to grow much.
- Queries may get a little interesting when you have to join multiple code groups in the same query. Just remember to use table and/or column aliases to help keep the joins clear.
Filed in: Team Member Blog Comments (0)
Comments
There are no comments for this entry.
Commenting is not available in this section entry.