Awkward JDBC API and MySQL User Group MeetingLife has been super busy, but I have plenty of posting to catch up on. Fear not, there will be more content soon (after Monday, hopefully); I am working on editing a very relevant book, and I hope to be able to share my excitement after I finish. Also on Monday is the next Boston MySQL User Group, which will go through how to do capacity planning in MySQL with very little pain and effort. In the beginning we will have 10 minutes for user questions, so we can all benefit from each other's knowledge. I already have a user with a great question! We have rebuilding our product all summer, with a deadline of releasing the product in the next 2 months. Our lead developer had put a surrogate key in our new schema about a month ago. He said he needed the surrogate key because "the ORM requires it." I just accepted it. My mistake was that I made an assumption. The table already had a primary key, but it was a composite key. I assumed that the ORM required a unique key to be one field, and thus I assumed he combined the 2 values in the primary key to get the unique key. I was wrong. In adding the final subsystems into the schema this week, I noticed that the surrogate key was an auto-increment field. I also noticed he put it in every single table he dealt with. We had hours of meetings about schema, and this was NEVER put in as a requirement. Thus, today we had over three hours of agonizing discussions, including a back-and-forth of "If it's going into the database I need to understand why," and the lead developer saying "No you don't." I severely wanted to say "If I don't understand it, it does not go in the database," but resisted. I asked him to send me the ORM so I could understand why it required the field. At first he said he would, and then kept talking to me about why I don't need to understand the field; it didn't add more overhead, it didn't change the way the data relate to each other, etc. I need to understand this because there may be other, similar requirements in the future. Or perhaps I'll find a better way to do it (maybe a VIEW would work). Perhaps I'll find other places where other fields need to be added. He finally explained that the API JDBC was using was awkward -- it needs to retrieve basically the row number of any row it's looking at, and if it deletes or changes the row number it uses the row number as the key to find the row. Aha! That makes sense. However, why do the row numbers need to be in the database? Can't it just retrieve the rows and put a row number in its own copy? Apparently, not. I cannot imagine that a mature technology would require something like that. It's not that difficult to do. I said this, and the lead developer was insanely frustrated by it. So I said, "Are you using Connector/J?" He was confused, but asked, "Is that JDBC?" "Yes," I replied. "Oh, then yes, we're using it." "I don't think so. If the interface is awkward, you're not using Connector/J." He left my office. So I type in "Connector/J" into my MySQL Community Toolbar (I love it!) and find the following on http://www.mysql.com/products/connector/j/ New features from the JDBC-3.0 API in the latest production version of MySQL Connector/J include getGeneratedKeys which allows users to retrieve auto-increment fields in a non-database-specific way. Auto-increment fields now work with object-relational mapping tools, as well as Enterprise Java Beans (EJB) servers with Container Managed Persistence (CMP) that support JDBC-3.0. Hrm....retrieve auto-increment fields in a non-database-specific way? I think that solves our problem!!! [EDIT: I am, apparently wrong....but I cannot imagine that anyone using JDBC specifies an auto-increment field for EVERY SINGLE TABLE their application will touch. Do people actually do this?!?!?] |
Follow me on:SearchNavigation |
I think there are two
I think there are two different schools of thought on whether primary keys should have business value or not - we seem to be on opposite sides of the fence. Nothing wrong with that - I've just been burnt so many times by changing business data requirements causing changes that have to be cascaded down the data model and the application that depends on it that I don't go there any more. I think http://www.bcarter.com/intsurr1.htm does a good job of representing the pros and cons of either surrogate or intelligent/business primary keys.
I'm sure we can agree that
I'm sure we can agree that composite keys are a major handicap in dealing with ORM tools. The Hibernate business team is absolutely wrong, by the way -- it's not that the data model is broken, it's that Hibernate is broken and cannot deal with a perfectly reasonable data model.
The question is why does a "good" ORM require a "bad" data model? The ORM tools should be mature enough to handle good database design.
Sure, I've put surrogate keys on databases that have 5 columns as a composite primary key, because good database design in that case leads to an unacceptably slow application. However, your #2 "The primary key must be a surrogate key with no business value," makes me want to run screaming, given that the primary key is SUPPOSED to have business value.
As well, I was led to believe that a unique number wouldn't work, it *had* to be auto-increment. Our primary key is 2 numbers, which should be simple to concatenate together, either making a decimal by having a decimal point as a separator, or just smushing them together with appropriate leading 0's, as there are upper bounds on both numbers.
I would be fine with a surrogate key that actually means something. On principle, I find that a system that requires what amounts to "row numbers' seems broken to me.
Yes - I use a surrogate
Yes - I use a surrogate auto-increment (with MySQL - sequences with Oracle) primary key value on almost every database table. It's critical that ORM tools be able to define row identity. Sometimes, that is a function of unique constraints of the data, and other times, it is not. It is true that Hibernate and other tools provide support for composite primary keys for establishing identity, but the support is extremely unwieldy. My experience has been that it is not hard to get Hibernate to recognize composite keys for identity purposes, but it's downright painful to try to define object relationships using these composite keys.
My rules for defining tables in my application are:
1. Always define a primary key.
2. The primary key must be a surrogate key with no business value.
3. Never use a composite primary key.
I also try to always use numeric data types for primary key columns, where practical. There may be instances where I don't do that (say in a list of US State Codes), but it's rare.
The above tends to lead me to use auto-increment columns more often than not. They also correspond to ORM-specific issues that have hurt me in the past. If you look at the Hibernate forums, for example, you'll find all sorts of problems related to composite keys - most answered by the Hibernate development team with comments like, "fix your data model - it's broken." Whether we agree with that or not, composite keys are a major handicap in dealing with ORM tools (in my experience).
My $0.02.
Todd
Sheeri, Just curious what
Sheeri,
Just curious what ORM you're using, since the ones I'm familiar with allow composite primary keys (Hibernate, EJB, JDO). There are quite a few "patterns" out there that do suggest that sometimes surrogate keys are better, but that's usually because someone's picked a poor primary key.
-Mark