
einhverfr's Journal: Why don't Relational and Object Oriented methods get along?

As many programmers are aware, there is a conceptual mismatch between the relational and object-oriented approaches to data modeling. This mismatch causes a great many headaches for both application developers and database administrators. Unlike Chris Date (whose Third Manifesto suggests unifying database and OO schemes in a tightly bound, relationally oriented fashion), I am going to suggest that the problem cannot be solved simply by adopting a fully relational approach. Nor do object-relational mappers solve it, since they force the adoption of an object-oriented approach.

The basic issue is that, although both the relational and object-oriented approaches provide methods of partitioning data, their goals and their methods of deciding how the data is partitioned are fundamentally different. Relational models focus on breaking down data so that it can be easily managed with a minimum of duplication and error correction. Object-oriented models focus on breaking down data into portions organized by their relevance to the flow of data processing (hence the emphasis on encapsulation). Consequently, relational and object-oriented designs often cannot be mapped to each other in a straightforward way, because the models are not mathematically equivalent: a given relational model does not necessarily imply a single object model, nor does a given object model necessarily imply a single relational model. Furthermore, a relation, as a unit of semantic data storage or description, is not the same thing as an object, which is a unit of data flow. This contextual difference (partitioning the data attributes by data dependency versus flow dependency) more or less dooms attempts to treat these systems as equivalent.

Since these concepts cannot be mapped to each other in an optimal and straightforward manner, we should probably give up on that effort. ORMs and the like are probably quite useful for applications where good relational design doesn't matter (one-off small web apps which don't need to connect to other applications in any way), but they fail for any application where the re-use and central management of data is important. The only trap here is that one-off applications which are expected to remain islands forever sometimes develop into something with complex integration needs. In that case, one is unable to integrate at the data level.

In general, I see three options for developers to use in addressing this problem. The inherent tradeoff is between reporting and centralized data management on one hand and rapid application development on the other.

The first option is to use a simple object store for the application. XML files, BDB, serialized objects, OODBs, LDAP, etc. are all valid approaches here. In this approach, reporting is a lot of work (you have to do the retrieval and parsing yourself), and the information is not as manageable as it could be in an RDBMS, but the application's object model and the stored data map exactly 1:1, and programming is easy.
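As a rough illustration of this first option, here is a minimal sketch that uses a JSON file as the object store. The `Customer` class, its fields, and the helper names are hypothetical and chosen only for the example. Saving mirrors the objects directly, while even a trivial report has to load and walk every object by hand.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    # Hypothetical application object; the fields are illustrative only.
    name: str
    city: str
    country: str


def save_customers(path, customers):
    # The stored form mirrors the objects 1:1, so saving is trivial.
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(c) for c in customers], f)


def load_customers(path):
    # Rebuild the objects from the stored dictionaries.
    with open(path, encoding="utf-8") as f:
        return [Customer(**row) for row in json.load(f)]


def customers_per_country(path):
    # "Reporting" means retrieving and parsing everything ourselves.
    counts = {}
    for customer in load_customers(path):
        counts[customer.country] = counts.get(customer.country, 0) + 1
    return counts


def demo():
    path = os.path.join(tempfile.mkdtemp(), "customers.json")
    save_customers(path, [
        Customer("Alice", "Oslo", "Norway"),
        Customer("Bob", "Bergen", "Norway"),
        Customer("Carol", "Seattle", "United States"),
    ])
    return customers_per_country(path)
```

Any other application that wants this data has to know the file format and repeat the same parsing, which is the manageability cost described above.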

The second option is to do a straightforward mapping of relations to properties, essentially using an RDBMS as a set-based serialization system. In this case, the data is not normalized, reporting is a challenge but doable, and the data is not particularly reusable.
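A minimal sketch of this second option, assuming a hypothetical `Address` class stored in an in-memory SQLite table with one column per property. The table layout simply copies the object, so nothing constrains the free-text `country` column, and a simple report already shows the data drifting.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Address:
    # Hypothetical application object; one table column per property.
    street: str
    city: str
    state: str
    country: str


def open_store():
    conn = sqlite3.connect(":memory:")
    # The table is a direct copy of the class, not a normalized design.
    conn.execute(
        "CREATE TABLE address (street TEXT, city TEXT, state TEXT, country TEXT)"
    )
    return conn


def save(conn, addr):
    conn.execute("INSERT INTO address VALUES (?, ?, ?, ?)", astuple(addr))


def load_all(conn):
    rows = conn.execute(
        "SELECT street, city, state, country FROM address ORDER BY rowid"
    )
    return [Address(*row) for row in rows]


def country_report(conn):
    # Reporting is possible with plain SQL, but only as good as the data.
    return conn.execute(
        "SELECT country, COUNT(*) FROM address GROUP BY country ORDER BY country"
    ).fetchall()


def demo():
    conn = open_store()
    save(conn, Address("1 Main St", "Seattle", "WA", "United States"))
    save(conn, Address("2 Pine St", "Spokane", "WA", "USA"))
    return country_report(conn)
```

Here `demo()` reports two different countries for what is presumably the same one, because the country name is duplicated in every row instead of being stored once.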

The final option is to provide separate physical and logical data stores. The physical data store is a highly normalized relational system, while the logical store consists of either views or stored procedures designed to present the data in a way consistent with flow-based dependencies. In essence, this division avoids the mapping issues by making the model differences explicit. This means more time in database development, and marginally more time in initial application development, but a lot less time down the road making extensions. It also means that central management and reporting of data are strongest in this model.
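A minimal sketch of this third option, using SQLite through Python's `sqlite3` module. The table names, the `address_view` view, and the `load_address` helper are all hypothetical. The physical tables are normalized, and the view plays the role of the logical store by presenting the joined shape the application wants.

```python
import sqlite3

SCHEMA = """
-- Physical store: normalized, each fact recorded once.
CREATE TABLE country (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE city (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    country_id INTEGER NOT NULL REFERENCES country(id)
);
CREATE TABLE address (
    id INTEGER PRIMARY KEY,
    street TEXT NOT NULL,
    city_id INTEGER NOT NULL REFERENCES city(id)
);
-- Logical store: the flattened shape the application object expects.
CREATE VIEW address_view AS
SELECT a.id AS id, a.street AS street, ci.name AS city, co.name AS country
FROM address a
JOIN city ci ON ci.id = a.city_id
JOIN country co ON co.id = ci.country_id;
"""


def build():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO country (id, name) VALUES (1, 'Norway')")
    conn.execute("INSERT INTO city (id, name, country_id) VALUES (1, 'Oslo', 1)")
    conn.execute(
        "INSERT INTO address (id, street, city_id) VALUES (1, 'Storgata 1', 1)"
    )
    return conn


def load_address(conn, address_id):
    # The application only ever talks to the view, never the base tables.
    row = conn.execute(
        "SELECT street, city, country FROM address_view WHERE id = ?",
        (address_id,),
    ).fetchone()
    if row is None:
        return None
    return dict(zip(("street", "city", "country"), row))
```

Because the country name lives in exactly one row, correcting it in `country` changes what every address reports through the view, and other applications can define their own views over the same physical tables.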

  • The final option is to provide separate physical and logical data stores. The physical data store is a highly normalized relational system, while the logical store consists of either views or stored procedures designed to present the data in a way consistent with flow-based dependencies.

    Don't available persistence layers (e.g. Hibernate or Spring) already do this? It's true that both of the aforementioned products are Java only, but I'd imagine that there are similar products for other languages.

    • If I design a database in 5NF without prior thought to Hibernate, and say it has 200+ tables, of which maybe 6-7 may need to be joined to create a Java object, how well does Hibernate work?

      For example, suppose I have an address object (used to track customer addresses). The address object is created via aggregates and joins from the following tables (constraints not listed):
      create table country(
      id int not null unique,
      name text primary key,
      iso_code char(2)
      );

      CREATE TABLE state_province(
      id int no
      • You do raise some good points: it's very difficult to simply "drop in" a persistence layer like Hibernate or Spring. The database has to be configured around the persistence layer. The point of persistence layers is to make things easier for the application developer. Unfortunately, this sometimes carries the cost of making things harder for the DBA.

        • Also note that the problem with designing your database around the persistence layer is that you are forced to generally design the database based primarily on the data model of the application (i.e. object-oriented or flow-dependent data) rather than data dependencies. Consequently the database takes on a form where the structure has a data-flow context to it which makes it harder to re-use that database from other applications, especially where the object models may be quite different. This is especiall
  • I think you missed an option: just let the database drive. Design the database schema first, write the SQL you expect to use to access it, and then write routines around the SQL -- which can be OOP methods if you're so inclined, but don't need to be particularly.

    The question that I would ask: is there any evidence that OOP-centric thinking has ever delivered on its promises (e.g. lower maintenance costs)? And what's so hard about just writing your own SQL?

    • In that case, you are either putting those routines in the db (and using the logical/physical division) or you are incorporating the logical division into your application. So I suppose I did miss the option of providing the logical layer mappings in the application itself.

      In general, I avoid this as much as I can just because I think that a one-language-per-file approach is most easily maintainable (hence I would put the procedures in the db as SQL scripts at the cost of portability).
