Why Aren't You Using An OODMS?

Dare Obasanjo contributed this piece about a subject that probably only a very few people have ever taken the time to consider, or had to. Below he asks the musical question "Why aren't you using an Object Oriented Database Management System?" Update: 05/04 02:11 PM by H: This is also running on K5 - yes, that's on purpose, and yes, Dare, myself and Rusty all know. *grin*

Why Aren't You Using An Object Oriented Database Management System?

In today's world, Client-Server applications that rely on a database on the server as a data store while servicing requests from multiple clients are quite commonplace. Most of these applications use a Relational Database Management System (RDBMS) as their data store while using an object oriented programming language for development. This causes a certain inefficiency as objects must be mapped to tuples in the database and vice versa instead of the data being stored in a way that is consistent with the programming model. The "impedance mismatch" caused by having to map objects to tables and vice versa has long been accepted as a necessary performance penalty. This paper is aimed at seeking out an alternative that avoids this penalty.

What follows is a condensed version of the paper An Exploration of Object Oriented Database Management Systems, which I wrote as part of my independent study project under Dr. Sham Navathe.

Introduction

The purpose of this paper is to provide answers to the following questions:

  • What is an Object Oriented Database Management System (OODBMS)?
  • Is an OODBMS a viable alternative to an RDBMS?
  • What are the tradeoffs and benefits of using an OODBMS over an RDBMS?
  • What does code that interacts with an OODBMS look like?
Overview of Object Oriented Database Management Systems

An OODBMS is the result of combining object oriented programming principles with database management principles. Object oriented programming concepts such as encapsulation, polymorphism and inheritance are enforced as well as database management concepts such as the ACID properties (Atomicity, Consistency, Isolation and Durability) which lead to system integrity, support for an ad hoc query language and secondary storage management systems which allow for managing very large amounts of data. The Object Oriented Database Manifesto [Atk 89] specifically lists the following features as mandatory for a system to support before it can be called an OODBMS: Complex objects, Object identity, Encapsulation, Types and Classes, Class or Type Hierarchies, Overriding, overloading and late binding, Computational completeness, Extensibility, Persistence, Secondary storage management, Concurrency, Recovery and an Ad Hoc Query Facility.

From the aforementioned description, an OODBMS should be able to store objects that are nearly indistinguishable from the kind of objects supported by the target programming language with as little limitation as possible. Persistent objects should belong to a class and can have one or more atomic types or other objects as attributes. The normal rules of inheritance should apply with all their benefits including polymorphism, overriding inherited methods and dynamic binding. Each object has an object identifier (OID) which is used as a way of uniquely identifying a particular object. OIDs are permanent, system generated and not based on any of the member data within the object. OIDs make storing references to other objects in the database simpler but may cause referential integrity problems if an object is deleted while other objects still have references to its OID. An OODBMS is thus a full scale object oriented development environment as well as a database management system. Features that are common in the RDBMS world such as transactions, the ability to handle large amounts of data, indexes, deadlock detection, backup and restoration features and data recovery mechanisms also exist in the OODBMS world.
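As a rough illustration of how system-generated OIDs differ from value-based keys, the sketch below simulates an object store in plain Java. The `TinyStore` class and its `persist`, `fetch` and `delete` methods are hypothetical names invented for this example and are not part of any OODBMS API; a real product assigns and resolves OIDs internally.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point: two objects with identical contents still get distinct OIDs.
class OidDemo {
    public static void main(String[] args) {
        TinyStore store = new TinyStore();
        long first = store.persist("Carnage4Life");
        long second = store.persist("Carnage4Life");
        System.out.println(first != second);             // prints true

        // Deleting an object leaves any stored copy of its OID dangling.
        store.delete(first);
        System.out.println(store.fetch(first) == null);  // prints true
    }
}

// Hypothetical in-memory stand-in for an object store; not a real OODBMS API.
class TinyStore {
    private final Map<Long, Object> objects = new HashMap<>();
    private long nextOid = 1; // generated by the store, never derived from member data

    long persist(Object o) {
        long oid = nextOid++;
        objects.put(oid, o);
        return oid;
    }

    // Returns null when the OID no longer refers to a stored object.
    Object fetch(long oid) {
        return objects.get(oid);
    }

    void delete(long oid) {
        objects.remove(oid);
    }
}
```

The second half of `main` is the referential integrity problem in miniature: any object that kept `first` as a reference now points at nothing.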

A primary feature of an OODBMS is that accessing objects in the database is done in a transparent manner such that interaction with persistent objects is no different from interacting with in-memory objects. This is very different from using an RDBMS in that there is no need to interact via a query sub-language like SQL nor is there a reason to use a Call Level Interface such as ODBC, ADO or JDBC. Database operations typically involve obtaining a database root from the OODBMS, which is usually a data structure like a graph, vector, hash table, or set, and traversing it to obtain objects to create, update or delete from the database. When a client requests an object from the database, the object is transferred from the database into the application's cache. There it can be used either as a transient value that is disconnected from its representation in the database (updates to the cached object do not affect the object in the database) or as a mirror of the version in the database, in that updates to the object are reflected in the database and changes to the object in the database require that the object is refetched from the OODBMS.
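The difference between the two ways of using a cached object can be sketched in plain Java. This is only an analogy under stated assumptions: a `HashMap` stands in for the database root, and the `Account` class with its `detachedCopy` method is hypothetical. A real OODBMS would track changes and write them back at commit time rather than share a reference.

```java
import java.util.HashMap;
import java.util.Map;

class CacheModesDemo {
    public static void main(String[] args) {
        // A plain map stands in for the database root (assumption, not an OODBMS API).
        Map<String, Account> root = new HashMap<>();
        root.put("Carnage4Life", new Account("Carnage4Life", "offline"));

        // Transient use: work on a disconnected copy; the stored object is untouched.
        Account copy = root.get("Carnage4Life").detachedCopy();
        copy.status = "online";
        System.out.println(root.get("Carnage4Life").status); // prints offline

        // Mirror use: work on the stored object itself; the update is visible in the root.
        Account mirror = root.get("Carnage4Life");
        mirror.status = "online";
        System.out.println(root.get("Carnage4Life").status); // prints online
    }
}

// Hypothetical persistent class for the instant messaging example.
class Account {
    final String username;
    String status;

    Account(String username, String status) {
        this.username = username;
        this.status = status;
    }

    // Produces a value that is no longer connected to the stored object.
    Account detachedCopy() {
        return new Account(username, status);
    }
}
```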

Comparisons of OODBMSs to RDBMSs

There are concepts in the relational database model that are similar to those in the object database model. A relation or table in a relational database can be considered to be analogous to a class in an object database. A tuple is similar to an instance of a class but is different in that it has attributes but no behaviors. A column in a tuple is similar to a class attribute except that a column can hold only primitive data types while a class attribute can hold data of any type. Finally classes have methods which are computationally complete (meaning that general purpose control and computational structures are provided [McF 99]) while relational databases typically do not have computationally complete programming capabilities although some stored procedure languages come close.

Below is a list of advantages and disadvantages of using an OODBMS over an RDBMS with an object oriented programming language.

Advantages
  1. Composite Objects and Relationships: Objects in an OODBMS can store an arbitrary number of atomic types as well as other objects. It is thus possible to have a large class which holds many medium sized classes which themselves hold many smaller classes, ad infinitum. In a relational database this has to be done either by having one huge table with lots of null fields or via a number of smaller, normalized tables which are linked via foreign keys. Having lots of smaller tables is still a problem since a join has to be performed every time one wants to query data based on the "Has-a" relationship between the entities. Also an object is a better model of the real world entity than the relational tuples with regards to complex objects. The fact that an OODBMS is better suited to handling complex, interrelated data than an RDBMS means that an OODBMS can outperform an RDBMS by ten to a thousand times depending on the complexity of the data being handled.

  2. Class Hierarchy: Data in the real world usually has hierarchical characteristics. The ever popular Employee example used in most RDBMS texts is easier to describe in an OODBMS than in an RDBMS. An Employee can be a Manager or not; in an RDBMS this is usually done by having a type identifier field or creating another table which uses foreign keys to indicate the relationship between Managers and Employees. In an OODBMS, the Employee class is simply a parent class of the Manager class.

  3. Circumventing the Need for a Query Language: A query language is not necessary for accessing data from an OODBMS unlike an RDBMS since interaction with the database is done by transparently accessing objects. It is still possible to use queries in an OODBMS however.

  4. No Impedance Mismatch: In a typical application that uses an object oriented programming language and an RDBMS, a significant amount of time is usually spent mapping tables to objects and back. There are also various problems that can occur when the atomic types in the database do not map cleanly to the atomic types in the programming language and vice versa. This "impedance mismatch" is completely avoided when using an OODBMS.

  5. No Primary Keys: The user of an RDBMS has to worry about uniquely identifying tuples by their values and making sure that no two tuples have the same primary key values to avoid error conditions. In an OODBMS, the unique identification of objects is done behind the scenes via OIDs and is completely invisible to the user. Thus there is no limitation on the values that can be stored in an object.

  6. One Data Model: A data model typically should model entities and their relationships, constraints and operations that change the states of the data in the system. With an RDBMS it is not possible to model the dynamic operations or rules that change the state of the data in the system because this is beyond the scope of the database. Thus applications that use RDBMS systems usually have an Entity Relationship diagram to model the static parts of the system and a separate model for the operations and behaviors of entities in the application. With an OODBMS there is no disconnect between the database model and the application model because the entities are just other objects in the system. An entire application can thus be comprehensively modelled in one UML diagram.
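The composite object and class hierarchy points above can be sketched with the Employee example. The classes, names and salary figures below are hypothetical and only show the shape of the object model; in an OODBMS the same classes would be stored as they are, whereas an RDBMS would need a type identifier column or an extra table plus a join for the list of reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HierarchyDemo {
    public static void main(String[] args) {
        Manager boss = new Manager("Ada", 9000);
        Employee dev = new Employee("Bob", 5000);
        boss.addReport(dev);

        // Both objects are handled through the parent type; no type identifier field is consulted.
        List<Employee> staff = Arrays.asList(boss, dev);
        for (Employee e : staff) {
            System.out.println(e.describe());
        }
        // prints:
        // Ada (manager of 1)
        // Bob (employee)
    }
}

class Employee {
    final String name;
    final int salary;

    Employee(String name, int salary) {
        this.name = name;
        this.salary = salary;
    }

    String describe() {
        return name + " (employee)";
    }
}

// "Is-a" relationship: a Manager is an Employee, expressed by inheritance.
class Manager extends Employee {
    // "Has-a" relationship: the reports are held directly, with no foreign key table.
    private final List<Employee> reports = new ArrayList<>();

    Manager(String name, int salary) {
        super(name, salary);
    }

    void addReport(Employee e) {
        reports.add(e);
    }

    List<Employee> getReports() {
        return reports;
    }

    @Override
    String describe() {
        return name + " (manager of " + reports.size() + ")";
    }
}
```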

Disadvantages
  1. Schema Changes: In an RDBMS modifying the database schema either by creating, updating or deleting tables is typically independent of the actual application. In an OODBMS based application modifying the schema by creating, updating or modifying a persistent class typically means that changes have to be made to the other classes in the application that interact with instances of that class. This typically means that all schema changes in an OODBMS will involve a system wide recompile. Also updating all the instance objects within the database can take an extended period of time depending on the size of the database.

Who is currently using an OODBMS to handle mission critical data?

The following information was gleaned from the ODBMS Facts website.

  • The Chicago Stock Exchange manages stock trades via a Versant ODBMS.

  • Radio Computing Services is the world's largest radio software company. Its product, Selector, automates the needs of the entire radio station -- from the music library, to the newsroom, to the sales department. RCS uses the POET ODBMS because it enabled RCS to integrate and organize various elements, regardless of data types, in a single program environment.

  • The Objectivity/DB ODBMS is used as a data repository for system component naming, satellite mission planning data, and orbital management data deployed by Motorola in The Iridium System.

  • The ObjectStore ODBMS is used in Southwest Airlines' Home Gate to provide self-service to travelers through the Internet.

  • Ajou University Medical Center in South Korea uses InterSystems' Caché ODBMS to support all hospital functions including mission-critical departments such as pathology, laboratory, blood bank, pharmacy, and X-ray.

  • The Large Hadron Collider at CERN in Switzerland uses an Objectivity DB. The database is currently being tested in the hundreds of terabytes at data rates up to 35 MB/second.

  • As of November, 2000, the Stanford Linear Accelerator Center (SLAC) stored 169 terabytes of production data using Objectivity/DB. The production data is distributed across several hundred processing nodes and over 30 on-line servers.
Interacting With An OODBMS

Below are Java code samples for accessing a relational database and accessing an object database. Compare the size of the code in both examples. The examples are for an instant messaging application.

  1. Validating a user.

    Java code accessing an ObjectStore(TM) database

    import COM.odi.*;
    import COM.odi.util.query.*;
    import COM.odi.util.*;
    import java.util.*;

    try {

        //start database session
        Session session = Session.create(null, null);
        session.join();

        //open database and start transaction
        Database db = Database.open("IMdatabase", ObjectStore.UPDATE);
        Transaction tr = Transaction.begin(ObjectStore.READONLY);

        //get hashtable of user objects from DB
        OSHashMap users = (OSHashMap) db.getRoot("IMusers");

        //get password and username from user
        String username = getUserNameFromUser();
        String passwd = getPasswordFromUser();

        //get user object from database and see if it exists and whether password is correct
        UserObject user = (UserObject) users.get(username);

        if (user == null)
            System.out.println("Non-existent user");
        else if (user.getPassword().equals(passwd))
            System.out.println("Successful login");
        else
            System.out.println("Invalid Password");

        //end transaction, close database and terminate session
        tr.commit();
        db.close();
        session.terminate();
    }
    //exception handling would go here ...


    Java JDBC code accessing an IBM's DB2 Database(TM)

    import java.sql.*;
    import java.util.*;


    try {

        //Launch instance of database driver.
        Class.forName("COM.ibm.db2.jdbc.app.DB2Driver").newInstance();

        //create database connection
        Connection conn = DriverManager.getConnection("jdbc:db2:IMdatabase");

        //get password and username from user
        String username = getUserNameFromUser();
        String passwd = getPasswordFromUser();

        //perform SQL query; a parameterized statement keeps user input out of the SQL text
        PreparedStatement sqlQry = conn.prepareStatement("SELECT password from user_table WHERE username=?");
        sqlQry.setString(1, username);
        ResultSet rset = sqlQry.executeQuery();

        if (rset.next()) {
            if (rset.getString(1).equals(passwd))
                System.out.println("Successful login");
            else
                System.out.println("Invalid Password");
        } else {
            System.out.println("Non-existent user");
        }

        //close database connection
        sqlQry.close();
        conn.close();

    }
    //exception handling would go here ...

    There isn't much difference in the above examples although it does seem a lot clearer to perform operations on a UserObject instead of a ResultSet when validating the user.

  2. Getting the user's contact list.

    Java code accessing an ObjectStore(TM) database

    import COM.odi.*;
    import COM.odi.util.query.*;
    import COM.odi.util.*;
    import java.util.*;


    try {

        /* start session and open DB, same as in section 1a */

        //get hashmap of users from the DB
        OSHashMap users = (OSHashMap) db.getRoot("IMusers");

        //get user object from database
        UserObject c4l = (UserObject) users.get("Carnage4Life");
        UserObject[] contactList = c4l.getContactList();

        System.out.println("These are the people on Carnage4Life's contact list");

        for (int i = 0; i < contactList.length; i++)
            System.out.println(contactList[i].toString()); //toString() prints fullname, username, online status and webpage URL

        /* close session and close DB, same as in section 1a */
    }//exception handling code


    Java JDBC code accessing an IBM's DB2 Database(TM)

    import java.sql.*;
    import java.util.*;


    try {

        /* open DB connection, same as in section 1b */
        //perform SQL query (note the space before WHERE so the two string pieces join correctly)
        Statement sqlQry = conn.createStatement();
        ResultSet rset = sqlQry.executeQuery("SELECT fname, lname, user_name, online_status, webpage FROM contact_list, user_table " + "WHERE contact_list.owner_name='Carnage4Life' and contact_list.buddy_name=user_table.user_name");

        System.out.println("These are the people on Carnage4Life's contact list");


        while (rset.next())
            System.out.println("Full Name:" + rset.getString(1) + " " + rset.getString(2) + " User Name:" + rset.getString(3) + " OnlineStatus:" + rset.getString(4) + " HomePage URL:" + rset.getString(5));

        /* close DB connection, same as in section 1b*/
    }//exception handling code


    The benefits of using an OODBMS over an RDBMS in Java slowly become obvious. Consider also that if the data from the select needs to be returned to another method then all the data from the result set has to be mapped to another object (UserObject).

  3. Get all the users that are online.

    Java code accessing an ObjectStore(TM) database

    import COM.odi.*;
    import COM.odi.util.query.*;
    import COM.odi.util.*;
    import java.util.*;

    try{
        /* same as above */

        //use a OODBMS query to locate all the users whose status is 'online'
        Query q = new Query (UserObject.class, "onlineStatus.equals(\"online\")");
        Collection users = (Collection) db.getRoot("IMusers");
        Collection onlineUsers = q.select(users);

        Iterator iter = onlineUsers.iterator();

        // iterate over the results
        while ( iter.hasNext() )
        {
            UserObject user = (UserObject) iter.next();

            // send each person some announcement
            sendAnnouncement(user);

        }

        /* same as above */

    }//exception handling goes here


    Java JDBC code accessing an IBM's DB2 Database(TM)

    import java.sql.*;
    import java.util.*;

    try{
        /* same as above */

        //perform SQL query
        Statement sqlQry = conn.createStatement();
        ResultSet rset = sqlQry.executeQuery("SELECT fname, lname, user_name, online_status, webpage FROM user_table WHERE online_status='online'");

        while (rset.next()) {

            //map each row of the result set to a UserObject by hand
            UserObject user = new UserObject(rset.getString(1), rset.getString(2), rset.getString(3), rset.getString(4), rset.getString(5));
            sendAnnouncement(user);

        }


        /* same as above */
    }//exception handling goes here

List of Object Oriented Database Management Systems

Proprietary

Conclusion

The gains from using an OODBMS while developing an application using an OO programming language are many. The savings in development time by not having to worry about separate data models as well as the fact that there is less code to write due to the lack of impedance mismatch are very attractive. In my opinion, there is little reason to pick an RDBMS over an OODBMS system for new application development unless there are legacy issues that have to be dealt with.

This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    OOP and other language paradigms are in my mind useful tools, however, all tools have limitations due to the level of specialization that a good tool inherently has in its design. OOP is a very useful tool for a wide class of problems, but not every problem will get a win from OOP. However, many of OOP's fans are very zealous, and tend to present OOP as a sort of cure-all. An interesting view point is presented on the pages Object Oriented Programming is Oversold [geocities.com] and Critique of Bertrand Meyer's Object Oriented Software Construction, 2nd Edition [geocities.com]

    My impression is that the relational model of databases is more natural to most DBAs than the object oriented model. Object oriented software tends to have a large number of derived types, and furthermore operator overloading (or function overloading) makes it impossible to read a snippet of code and really know for certain what it does. For certain applications (e.g. management of large data sets and low level systems programming) these features do not provide a sufficient "win" to offset the additional complexity and overhead. The competing paradigms in programming language design that I think are most compelling are:

    1. Procedural languages (and assembly language) for precise control of the machine and mapping onto hardware.
    2. Functional Programming (e.g. LISP) for rewriting and filtering of inputs. Many scripting languages do some of this.
    3. OOP for providing transparent abstraction, encapsulation of data and programs, information hiding, and code reuse.
    4. Declarative Languages (e.g. Prolog) for goal seeking programs using logical deduction
    5. Relational databases for large data set management and manipulation.
    Remember that the art of picking the right tool is critical to doing a good job.
  • by Anonymous Coward
    You had the slickest OODBMS. Well, that's one adjective. You also had (probably) the least reliable and worst supported OODBMS around.

    I already posted in response to an earlier thread cataloguing my team's woes with Objectstore.

    OK, you faulted in objects rather than overloading -> but you have to get that right if it's not to cause major havoc. Our platform was HP-UX 10.20 and we had one helluva time with Objectstore and our C++ compiler. Coredumps on demand. An endless stream of patches and Objectstore consultants.

    Your product had real difficulties with exception handling and that HP compiler and we couldn't stop using exceptions because we also used Orbix and it needed to use them.

    I am glad you acknowledge your scalability and concurrency problems - we ran into them with 1GB stores.

    Had Objectstore been robust (let's forget about scalability for the moment) it would have had a good chance of being used throughout our company's telecoms apps dept. (This was one of the world's largest software companies). As it was, whenever anyone called us asking our advice about Objectstore they ended up using Oracle.

    Are you sure that your lack of market penetration was not fundamentally because none of your customers had a good word to say about your product?
  • by Anonymous Coward
    Okay .. let's get disclosures out of the way: I work for a RDBMS vendor. I'm a developer in the IBM DB2 engine. I was also a big contributor of our Object-Relational project, and was the team lead of that group before I moved to another project.

    I used to be an OODBMS bigot because of academic influences from grad school. Then I defected to the dark side :-) Actually I was doing compiler work, got laid off and these guys hired me to figure out how to bind their database objects to programming language objects. Finally I saw the light (in the dark side) and am now a complete Relational DB person.

    The question to be asked (as an astute poster remarked) is why not RDBMS or ORDBMS ? I invite you to read Mike Carey and David DeWitt's 1996 VLDB paper "Object Databases - A Decade of Turmoil". You can probably get it from Citeseer.

    Modern ORDBMSs are coming along very fast - they still do have some impedance mismatches, but they're going away really fast. As far as I see, "ease in programming" is the best reason for an OODBMS. DB2 UDB lets you create structured types with inheritance in the engine. You can use these types to create typed tables as well as typed columns. You can write your own methods - dynamic dispatch coming soon. You can extend our index manager with user-defined schemes and write your own predicates. So your queries don't look ugly any more.

    The idea behind an RDBMS is really to separate the "how" of getting data from the "what" the application needs. This difference is crucial. It gives the engine all the room to find the most efficient way to access data. The relational model gives sound theory and lets us reason in really complicated ways. The ODBMS paradigm is to force application developers to traverse the data and the relationships involved. This locks you into a physical database design, and again affects performance.

    Transaction management with different isolation levels is a good thing ! Relational databases are great at recovery of your data using log information. Well the good ones are :-)

    It's a red-herring to wave a flag and say "OODBMS support indexes". Really without a proper query language what they support are hash indexes. Try to do a range query on those things ? BTW, OODBMS also support an Object Query Language .. and unfortunately you can't do as much as you can with plain ol' regular SQL ..

    The big place where RDBMS wins over ODBMS is set-theoretic operations. Not everything is a simple traversal ! Consider an operation where you want to find the average salary of a bunch of employees that satisfy a particular criterion (in a salary range). Traversing a collection using pointers is terribly painful. Instead when you let our relational engine do the thing, bang it goes sucking page after page from the disk and really giving great performance. In addition you could have things like materialized views to precompute interesting results (and letting the compiler compensate over them appropriately).

    Finally a lot of people take for granted that an ODBMS is "superior" and it's stupid business types who prefer an RDBMS. Actually, academia has well and truly come to the conclusion that RDBMS technology is the way to go.

    Advice to the pure ODBMS camp: Get with the program. There are many great ideas in ODBMS technology, but the ability to support a declarative query language and separate access patterns is really crucial. Instead of beating the hoary old chestnut your best bet is to invest in an XML query engine. XML queries are coming a long way and you can do traversal operations pretty well with that.

    Okay .. the debugger calls. Later.

  • by Anonymous Coward
    Funnily enough my name isn't Anonymous Coward but given that my former employer still develops that product (although it uses Objectstore no longer) I don't want to provide any way of someone figuring out who that company was. Telcos paid millions of dollars for that product and I wouldn't want it getting a bad reputation.

    I find it hard to believe that databases over 1GB were possible on HP-UX 10.20 (32 bit address space) given the way Objectstore handled memory and also how much memory a user process had access to (only 1GB in most cases due to the four quadrant architecture of HP-UX - although I accept under certain circumstances you could get two quadrants (2GB) but those circumstances didn't apply to Objectstore).

    I remember that when we got bigger than 600MB our problems began.

    It's all coming back now. Objectstore grew from the top of the address space down didn't it? So when your store got too big there was a collision between your store and your application code. Or something like that.
  • Whenever I've researched the topic, the canonical example returns to the infamous "bill of material" problem. While that is a good OODBMS application that RDBMS have problems with (although Oracle has extensions to help here...) I've never programmed one.

    One way to look at OODBMS is the second coming of IMS, the old IBM hierarchical DBMS.

    I do Java programming (I hear boos and hisses from the peanut gallery, but I persist...) and could use a seamless way to store state of my object hierarchies, but OODBMS haven't been it. (The wag will say at this point that I should be using Smalltalk, which has this seamless storage, but I duck this brick and go on my way).

  • I used to work for a very large software company and we used Objectstore on one of our apps (telco related).

    Every single person on the team agreed that it was the biggest mistake we ever made. Had we been a small company (not funded by a massive corporation) we would have gone under very quickly. And this was a team of top notch developers.

    Objectstore is (or was at the time, around 1998) a truly buggy, awful product. Because we were backed by such a large organisation we were able to fund the massive delays Objectstore introduced (just before release, guaranteed, there would be a series of showstopper bugs in Objectstore that the support guys at Objectstore (whom we were paying a small fortune for a support contract) would take forever trying to fix (and inevitably get it wrong - we had patched patched patches)). The platform was HP-UX BTW.

    In our tests, Objectstore was fine with small stores but when we hit production (with heaps around 1GB) it couldn't handle it.

    If you make an error in your code your store is more often than not corrupted. Data integrity is not Objectstore's strong point. And IIRC you cannot restore your store online. This resulted in huge downtimes for an application that was supposed to be 24x7. I think many people would be prepared to take some performance hit in order to have data integrity.

    There was no great story on schema migration either.

    Objectstore forces the developer to understand intimately how it is going to lay out objects within the store in order for you to get even decent performance. IIRC every time it touched a segment it loaded everything from that segment (it is several years since I've looked at Objectstore so I might be forgetting the salient details).

    We dumped Objectstore after a couple of releases and moved to Persistence with Oracle. I also left the company so I don't know in detail how that went but I think it was much more robust.

    I would like to add that I do think that *orthogonal* persistence is something that is very interesting indeed (see the Java implementation with P-Jama for example).

    However, Objectstore is by no stretch of the imagination orthogonal and the quality of its implementation does no service to OODBMSs in general.

    P.S I do not and have never worked for any database vendor or company in competition with Object Design.
  • by Anonymous Coward on Friday May 04, 2001 @06:15AM (#245632)
    Ok only one disadvantage with this OODBMS, a system wide recompile EVERY time you make a schema change. Umm, that's a pretty big disadvantage in my book.
  • There are many reasons to not use an OODBMS. Some reasons are: what is "the" object model? Ans: there is no single object model, there are many. But there is only *one* relational model.

    Database Debunking [firstsql.com] has some great arguments against OODBMSs. Basically, it comes down to: data independence, no single rational object model, and the strong mathematical (set theory) foundation of the RDBMS.

    Object theory is still too... undefined.

  • by pb ( 1020 ) on Friday May 04, 2001 @07:02AM (#245634)
    Great job, Carnage4Life!

    I didn't think I'd see the day when someone got actual content posted on Slashdot.

    Or, for that matter, that you'd post a Java article that I thought was somewhat interesting and useful... :)

    Anyhow, wouldn't it be easier to integrate all this with C? Especially considering the huge body of existing code, and the well-known primitives involved.

    And are there any less proprietary OODBMSes out there that anyone would recommend?
    ---
    pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
  • Most of my applications are insulated from the actual RDBMS. I usually define all of the data-primitives I'd have in a given project, and only make RDBMS calls from this layer. The application itself only makes calls against this layer instead of being burdened with directly operating on the RDBMS.

    Besides being better for portability, it also means that the application isn't tied to the actual RDBMS. It can easily be made to speak to something that more naturally expresses its dataset, if necessary. This is no small feat. It means that IT DOESN'T MATTER WHAT THE UNDERLYING DATABASE IS. It could be a freaking plaintext file or it could be something ludicrously complicated.

    I'm totally not amused by the Objectify-Everything mindset. This is a very difficult perspective to have, FWIW. All academics teach OOP as the "one, true, way." and in practice, people believe them. In practice, they also find that it's not the silver bullet that it's trumped up to be. Regardless, saying that you think it's bullshit leads most people to the conclusion that you're an unwashed uneducated fool.

    Oh well. I never accused programmers of being open-minded.

  • I'm not so sure what makes you feel so superior, doesn't everyone do this?

    Sadly, no. I've seen countless lines of code where RDBMS calls are sprinkled liberally throughout all levels. Perhaps I just think I'm superior since I'm the only one I know personally who does it. You see it in plenty of open source projects, but those never reflect the quality of code written in commercial environments.

    I'm not sure what to make of the example you cited. That sounds great, but I'd say it's something I could probably develop with my neanderthal RDBMS if I needed to... :)

    My impression of OODBMS is that you're just integrating the storage of the data with the semantics of the language you happen to be using. Obviously, this will have benefits, but it will certainly have its own issues. For one, I bet it's a royal asspain to access this data through environments that aren't based on the primary language. RDBMS can be queried with vanilla SQL. OODBMS (probably) cannot. A good deal of my projects involve the suits being able to probe the dataset through their own comfortable tools (Access, *shudder*). It would needlessly waste my time to have to write the reports they think they'll need instead.

    I've also hated what most people's ideas of OO systems turn out to be, perhaps with the exception of Python. If Python did it well (which I'm going to look into), I think I could be a believer. But when the author mentions heaping tablespoons of Java, I quickly lose my appetite.

    Still, my policy towards unfamiliar technology is that it's bullshit until proven otherwise. I believe that I am a fairly competent person, and if I'm only hearing about it now, then I tend to disbelieve that my world has been wrong all this time. That's certainly not a bad policy considering all of the buzzwords being thrown around lately. The paper didn't change that for me, but the comments here (including yours) have prompted me to at least give it a chance.

  • by On Lawn ( 1073 ) on Friday May 04, 2001 @09:10AM (#245637) Journal
    As I understand, EROS is adding a rdbms but by its nature OODBMS would be a more logical use of its properties.

    I actually can't wait for an EROS OODBMS Network Data Storage system. I think they were meant for each other but it will take 10 years for people to comprehend it. I wonder if in 10 years, when this idea is finally reaching Linux-like momentum, someone will think back and say "We could have had this 10 years ago".


    ~^~~^~^^~~^
  • Man, did you just use speed and ObjectStore in the same sentence? ObjectStore is not even in the same class as several other OODB solutions out there.
  • Reality is object-oriented.
    Really? When the Sun rises, does it send messages to the birds to tell them to sing?

    This is to cite an oft-used example of how absurd it is to think reality conforms to a message-passing model. It doesn't. That's partially why "design patterns" exist in the first place.. they're patches on things that the language model doesn't solve yet.

    As for OODBMS', I'm an OODBMS advocate in some ways, I think they're great when in the hands of experts. But otherwise one tends to be able to get simple things done quicker with SQL, until a fully expressive ad hoc query capability is on these object systems.. (which is almost never the case).

    Ditto for getting performance out of things... does your ODBMS have associative B-Tree indices built in? that work when your query is traversing over collections? etc. I've had to write these things myself in the past, and while pleasant, I don't think other people would share that opinion.
  • Why do you think 19 out of the 20 biggest Telco companies use ObjectStore?

    They don't use just Objectstore. Lots of them use Informix [informix.com] too - like 8/10 of the world's traffic if you believe the adverts.

  • First, I'd like to say that I work for Object Design (ObjectStore), so I am probably a little biased...
    Although I had worked before on Oracle systems.

    People have more confidence that they can find developers for Oracle than ObjectStore. But the same is true about COBOL programmers vs Java or C++ ones.
    But this is why we have got consultants to help start the projects and get the design right. Once started on the right track, development usually goes a lot faster than in a traditional RDBMS environment.

    I think the main reason people stick with Oracle and co. is that they prefer a known and tested solution like an Oracle database to store their business data.
    This is the old "no one got fired for choosing IBM" argument.

    By the way, there is no dba with ObjectStore (at least not in the way people think of them).
    The optimizations are done by the programmers because the db layout is really dependent on your object model.

    But the speed factor is something real which is why ObjectStore is doing so well with telcos (where speed is of the essence) with C++ applications.
    On the EJB side, we really blow RDBMS out of the water. Oracle is nowhere near Javlin (EJB containers for ObjectStore) in terms of performance...

  • by Juju ( 1688 ) on Friday May 04, 2001 @06:51AM (#245642)
    Why do you think 19 out of the 20 biggest Telco companies use ObjectStore? The answer is speed! I have never seen an app running faster on Oracle than on ObjectStore.

    I agree about the complexity and skill availability arguments, it is still easier (and cheaper) to get several COBOL and VB programmers than Java or C++ ones.
    But then you can always get a consultant to help with the design. And as a matter of fact, it will be faster to develop that way than having a bunch of COBOL developers put together some kind of server side app while some VB coders put the client interface together...
    Having done both, I can tell you what kind of system scales and which one does not.

    Have you done some EJB programming? You would be surprised how much faster and easier it is to go the OODB route.

    My opinion on what the biggest problem really is, is mainstream recognition. OODB vendors are vulnerable to FUD from RDBMS vendors as much as Linux was suffering from Microsoft FUD two years ago. Note that for OODB systems (as for Linux 2 years ago) there are some good reasons to stick with the mainstream solution. Going the OODB route is far more risky (from a business decision making point of view).

  • Not only is there a standard but the ODMG standard is on version 3, JDO is merely a Java standard. Please know the facts before flaming.

    Yeah, there's a standard ... but what good is the standard if none of the vendors do more than implement subsets of the standard, and none of the vendors implement the same subset?


    Are you moderating this down because you disagree with it,
  • the kdb [kx.com] faq, from kx Systems [kx.com].
    1. What is Kdb ?
      Kdb is an extremely fast RDBMS extended for time-series analysis.
    2. Does Kdb support SQL92, ODBC and JDBC ?
      Yes.
    3. Is Kdb a read-only RDBMS ?
      No. Kdb is very fast for OLTP (online transaction processing). For example, it runs over 50,000 ATM-style transactions per second logged to disk with full recovery on a single cpu. This was against a database of over 100,000,000 accounts, tellers and branches. Kdb can do batch updates at several hundred thousand records per second per cpu.
    4. Is Kdb a memory resident RDBMS ?
      No. Kdb has minimal memory requirements and is very fast from disk. For example, it ran the gigabyte TPC-D (an industry standard decision support benchmark) queries and updates on a 200MHZ PC with 64 megabytes of memory, an ultrawide SCSI controller and four disk drives many times faster than the best published results at a fraction the cost.
    5. What about time series ?
      Kdb handles much more than just SQL92 tables. Online analytical processing (OLAP) on multi-dimensional arrays is done with our extended SQL language, KSQL. For example, on the 35 megabyte OLAP APB-1 benchmark queries, Kdb ran 12,000 queries per minute with no precalculation.
    6. Since Kdb is so fast, does it require more storage ?
      No. Kdb is simple and will often store just the raw data. For example, in TPC-D, the published results required storage between 3 and 10 times the raw data. The Kdb factor is a little over one. Some OLAP tools require (for fast queries) massive precalculations. For example, in APB-1 some expanded the 35 megabytes of input data to many gigabytes. Kdb aggregates relations (extended with time series fields) so fast that precalculation is often obviated. Certainly when the raw data is less than a few gigabytes.
    7. Is there a parallel version ?
      Yes. Although Kdb can handle much larger databases than other database products without requiring parallel processing, there is a parallel version for the largest applications. Kdb scales.

    --
    http://kx.com
    taylor:{+/y**\1.0,x%1+!-1+#y}
  • by freeBill ( 3843 ) on Friday May 04, 2001 @10:05AM (#245645) Homepage
    A lot depends here on what we mean by the OO in OODBMS. Even in programming, the meaning of object-orientation has changed through the years. And the problem of ambiguity is even greater with database management tools.

    I am glad there are good and smart people working on a standard for what constitutes an OODBMS. I suspect it will be a few years before a definitive standard is completely figured out.

    Consider, for example, some of the very different things people mean by an "Object-Oriented Database Management System":

    Some people use it to mean "something which will give me persistence in the OO app I'm currently working on." For them a relational database management product with a few OO tools may be fine (assuming their objects are sufficiently simple).

    Some people use it to mean "something that will give me the ability to tie behavior to persistent objects." For them good stored procedures (like Oracle with a third-party product for debugging stored procedures) may be exactly what they want.

    Some people use it to mean "a DBMS which implements all the major features of current OO theory." An OODBMS which truly implements standards (as linked to in the original article) is what's needed for these people.

    Some people use it to mean "something which will enable me to implement all ideas currently associated with advanced OO theory (including aspect-oriented programming) and anything which may be included in that theory in the future." A DBMS with a dynamic model of object-oriented-ness (along the lines of Perl's dynamic model of what OO is) would be required. I don't know if anyone's actually accomplished this, but I would be both impressed and interested if it's been done (especially if it's language-independent, assuming that's possible).

    And some people use it to mean "a DBMS which is fundamentally object-oriented in its underlying structure enabling a variety of powerful table-creation tools." This can be accomplished with some of the better OODBMSs (depending, once again, on just what you mean by "fundamentally object-oriented").

    Given all this, I suspect it will be a while before a clear definition is agreed upon. It may even come out of theoretical work in academia. Until that time, the practical reasons listed here will continue to be why many don't use OODBMSs. And the attractive features they offer will continue to be why some people will ignore those practical problems.

    Oh, no! It looks like we're back to "it depends on the problem you're working on" just like so many of these debates.
  • At first glance, I thought it said CONDOMS. I thought it was kind of a weird topic for Slashdot.
  • Actually, data loading/backend-type stuff is easier with an RDBMS, because data-entry is almost always tabular anyway. However, once entered, it is usually easier to process it through an OO layer. The best way to accomplish this is to have a CORBA layer that the applications always use for talking to the database that actually incorporates all the business logic, but have the CORBA layer talk to an RDBMS.

    I don't have a lot of experience with OODBMSs - I'd be curious exactly how they work. The closest I've worked with is PostgreSQL, which is object-relational. Are there any intro guides, especially to schema definition and stuff like that?

    Is there a free software OODBMS?
  • The article says that 'impedance mismatch' causes performance loss. I don't see this as necessarily true. If anything, I'd expect the additional features (listed at length in the article) to cost performance...are any numbers available?
  • Whether an OODBMS or RDBMS is more appropriate depends on the situation (and on how good the particular database implementations are; you can read lots of stuff by Codd on how what most people think of as RDBMSs has more to do with implementations than with what the relational model actually allows. For example, the One Data Model fits there: operations and behaviour should be part of the single model, they just aren't usually part of the database schema). As a general rule, if your data structures are strongly tied to a particular application, you want an OODBMS, and if you want free-form ad hoc queries and flexibly changing applications, you want an RDBMS.
    But anyway, I just wanted to pick up on one point.
    When you say "No Primary Keys: The user of an RDBMS has to worry about uniquely identifying tuples by their values and making sure that no two tuples have the same primary key values to avoid error conditions. In an OODBMS, the unique identification of objects is done behind the scenes via OIDs and is completely invisible to the user." that's just flat wrong (and dangerously so).
    Object identity is important. If you rely on invisible Object IDs to wave a magic wand and handle it for you, you will almost certainly end up with a single real life object having multiple inconsistent representations in the database, so to avoid screwing it up you will need to do explicit defining of keys and checking for duplicates.
    On the other hand, if you have well-defined keys in your relational database, there isn't a problem - the database won't _allow_ duplicates. In a worst case, you have to define artificial IDs in your RDBMS and you are back to the OID case except that you actually have control over them.
    Of course lots of people do make a mess of choosing keys - but OIDs _don't_ solve the problem.
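
A minimal sketch of the duplicate problem described above, under stated assumptions: Python's built-in `id()` stands in for a system-assigned OID, SQLite stands in for the RDBMS, and the `Person` class and badge numbers are invented for illustration. Real OODBMS products may offer their own uniqueness constraints, so this shows the argument rather than any particular product's behaviour.

```python
import sqlite3


class Person:
    """Plain object; Python's id() stands in for a system-assigned OID."""

    def __init__(self, badge: str, name: str) -> None:
        self.badge = badge
        self.name = name


# Object-store style: identity is assigned behind the scenes, so nothing
# stops two inconsistent objects from describing the same real person.
store = [Person("E-1001", "J. Smith"), Person("E-1001", "John Smith")]
distinct_oids = len({id(p) for p in store})       # two hidden identities
distinct_people = len({p.badge for p in store})   # one real-world person

# Relational style: the declared key makes the duplicate impossible.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (badge TEXT PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO person VALUES (?, ?)", ("E-1001", "J. Smith"))
try:
    db.execute("INSERT INTO person VALUES (?, ?)", ("E-1001", "John Smith"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The object list ends up holding two records for one person, while the second relational insert is refused, which is why keys still have to be chosen explicitly in either model.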

    --
  • the entire point of object oriented programming is creating generic reusable components

    The point of OOD/OOP is creating software that more directly reflects the problem domain of the system being constructed. It's for creating maintainable software fast (by directly implementing the model instead of having to twist the model to fit restrictions in the language paradigm). Encapsulation helps maintainability; inheritance and polymorphism help the reflection of the model.

    This code is not necessarily reusable. And any true Object Model is going to be very application specific and have very few reusable parts.

    Reusability comes from a Component-design approach. Components are not (necessarily) Objects. They can be used to implement Objects, but they can be constructed using non-OO techniques. Objects belong to a problem domain; reusable components might not.
    --
    You know, you gotta get up real early if you want to get outta bed... (Groucho Marx)

  • How do object databases perform as data warehouses? Intuitively, it seems like OO would not natively be efficient at aggregating data, and would be forced to rely on application-level code for any analytical processing. Or am I missing something?


    Why'd you say 'burma'?
  • No one ever got fired for buying oracle/db2. At the moment why would you bet the farm on a wonderful "new" (yes I know it's not really new) technology like this, especially when the previous incarnation of oodbms failed in the early 90s (I was involved with one of these, a complete disaster) when you can get nice object relational mapping tools and use a good and proven solution?
  • You wrote:
    Circumventing the Need for a Query Language: A query language is not necessary for accessing data from an OODBMS unlike an RDBMS since interaction with the database is done by transparently accessing objects. It is still possible to use queries in an OODBMS however.

    If I simply want a persistence mechanism for objects (and will access these objects via a pre-designed application), then sure, an OODBMS makes sense. However, if I understand a priori that all kinds of queries will be executed against the data, I need to design accordingly, and here, the relational model is superior to the OO model: fast access of the data depends on exposing the possible queried fields.

    Without this exposure, if I store the data as objects (that have references to other objects), then I might have to traverse ALL the objects to get the result set of some query, and this is highly inefficient. So the response to that might be, "then expose all the fields that might be queried," and, thus, you are reverting back to a relational structure...
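
The traversal cost described above can be illustrated with a small in-memory sketch. The `Order` class, the city values, and both lookup functions are hypothetical, and a plain dictionary stands in for an index on an "exposed" field; no specific OODBMS is being modelled.

```python
from collections import defaultdict


class Order:
    def __init__(self, order_id, city):
        self.order_id = order_id
        self.city = city


# 10,000 objects, of which every hundredth one is in "Oslo".
orders = [Order(i, "Oslo" if i % 100 == 0 else "Lima") for i in range(10_000)]


def find_by_traversal(city):
    """Visit every object; return the matches and how many objects were touched."""
    touched = 0
    hits = []
    for order in orders:
        touched += 1
        if order.city == city:
            hits.append(order)
    return hits, touched


# "Exposing" the field: build an index keyed on the value being queried.
by_city = defaultdict(list)
for order in orders:
    by_city[order.city].append(order)


def find_by_index(city):
    """Look the matches up directly via the index on the exposed field."""
    return by_city.get(city, [])
```

Both functions return the same 100 orders, but the traversal touches all 10,000 objects to find them, while the index lookup goes straight to the matching set.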
  • 3 Things:

    If people could learn SQL which is completely unrelated to any other aspect of their programming experience then adding OODBMS techniques as a skillset would be trivial.
    Speaking out of years of DB experience, learning OO techniques is not trivial. SQL is fairly easy to learn - even for a procedural programmer. However, inheritance and polymorphism, just to name a few, are not simple concepts to grasp.

    Of course, if people don't realize that alternatives to RDBMSs exist then they won't learn these techniques.
    Ugh. Databases should be designed for 2 main reasons really.
    1. Speed. Get in and get out fast!
    2. Redundancy or data protection.
    Everything else is just nice little utilities bolted on (cept maybe indexes).

    I've been programming in Java almost every work day for the past 4 or 5 years and while I've seen Java performance increase, to me, writing a DB in it is pretty ridiculous. Granted, an OO db could just as easily (or maybe not) be written in C++. However, after much review and time spent just learning how to write good software over the past years, I'm not quite convinced that OO is 100% the way to go. However, it has a place.

    That is more important because management can't find people who have these skills if developers don't go out and learn these techniques.
    With all due respect, I have worked with very few managers that were worth much. I've seen so many managers jump at anything OO without knowing what they're talking about. So I trust me over them to make informed decisions about software.

    Imho, make good software not hype.
  • I was a database administrator for 3 years, and I think I changed the schema of an existing table exactly once. Now, I'm not totally sold on the idea of OODBMS, but if it has genuine benefits, then making schema changes harder would not be that bad considering how often they tend to occur.

    Ben
  • Aren't you supposed to design an application before implementing it in any way including putting data in a DB? I've worked at two companies and had a ton of projects in school and none involved implementing the database before the application was designed.

    So I take it you've only worked on pristine new projects coded in a vacuum, then? While I've never known anyone (well, anyone who knew what they were doing) to start putting data into a new database before designing their app, I've encountered many cases of new apps being written to use existing databases, generally either because the new version needs to be backwards compatible with the old one or because the need has arisen to look at old data in a new way.

  • ... need to redo this "study".

    Object databases are cool, especially from a programming standpoint, but there are a lot more disadvantages than the single one you listed, including: cost, familiarity (most people are not), fewer vendors (and open source alternatives), legacy databases (migrate or build bridges?), etc.

    Whenever you compare technologies, if you only find 1 disadvantage, you have probably not looked hard enough.
  • was the ability to use already-existent tools to do data mining, reporting, & other similar (not necessarily insignificant) activities...

    in all cases that we had gone through rigorous prototypes of products and used ODBMS', it always seemed to come down to the same few things:

    1) critical mass (everyone already knew the relational databases very well)

    2) tool robustness (there are a wide variety of good tools (most 3rd party supplied) to MANAGE relational instances. i'm referring to more subtle circumstances than managing users & schema here)

    3) reporting and data-mining was ALWAYS more difficult (usually by an order of magnitude or more).

    now, my last involvement in a prototype is YEARS ago, so i'm absolutely positive things have changed...

    the reality remains that people haven't yet gotten past what they learned in their first few experiences and simply haven't re-examined the landscape, just like myself...

    a weak excuse, but i'm certain this is a more common answer than we'd all like to admit.

    just my 0.02.

    Peter
  • I'm fired... aren't I?


    You should be, if you're a software engineer. If you're just a code monkey then it doesn't really matter as you don't create the design, just bang out code from a specification.

  • Guess my ranting overrided (OO pun! HOORAY!) my Simpsons attention.
  • The schema change is the biggest drawback. Adding columns to a database is "simple". Adding fields to an object is not so simple. (OK, it is simple; it's the recompiling and ensuring all objects are correctly serialized etc. that is the pain.)

    Adding columns and design changes are a fact of life. Until that is as easy and safe as it is with relational, OODBMS apps will definitely not be widespread.

    When OODBMSs store their data in an XML document style format is when OODBMSs will take off, since the DTD can be written with versions in mind and older objects can still be understood with relative ease.
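
A rough sketch of the contrast drawn in this comment, under assumptions: SQLite stands in for the RDBMS, a JSON blob stands in for an object serialized before the class changed, and the `Account` class and both loader functions are hypothetical. The "versioned" loader only illustrates the idea of a format written with versions in mind; it does not reproduce any real OODBMS upgrade mechanism.

```python
import json
import sqlite3

# Relational side: adding a column is one statement; old rows get the default.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT NOT NULL)")
db.execute("INSERT INTO account (id, owner) VALUES (1, 'alice')")
db.execute("ALTER TABLE account ADD COLUMN currency TEXT NOT NULL DEFAULT 'USD'")
row = db.execute("SELECT id, owner, currency FROM account").fetchone()

# Object side: an object that was serialized by the old version of the class.
old_blob = json.dumps({"id": 1, "owner": "alice"})


class Account:
    """The class after the change: it now requires a currency field."""

    def __init__(self, id, owner, currency):
        self.id = id
        self.owner = owner
        self.currency = currency


def load_naive(blob):
    """Fails on old objects because the new field is missing."""
    return Account(**json.loads(blob))


def load_versioned(blob):
    """Hypothetical upgrade step: fill in fields that older versions lacked."""
    data = json.loads(blob)
    data.setdefault("currency", "USD")
    return Account(**data)
```

Here the table change needs no code beyond the `ALTER TABLE`, whereas every previously stored object has to pass through some upgrade path such as `load_versioned` before the new class can read it.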

  • Great article. At this point, it's the only article rated higher than 2 and that's fully deserved.

    But I have some additional points:

    6. The world is not object oriented. Even if OO is a useful tool, it is no silver bullet.
    7. RDBMS are proven technology and rather well standardised, OODBMS aren't. Currently there is a proposal for a standard (Java Data Objects), but even that only addresses one platform.

  • Everyone knows SQL; nobody knows OO.

    Well, I don't know about anyone else around here, but I knew OO before I'd even heard of SQL.

    I know that C, Perl, etc are still very popular languages, and deservedly so, but C++, Java, etc are also extremely popular. I think OO has been around long enough now for there to be little excuse for people not to know anything about it. They're even teaching it to the Physics students at my old university, fer chris' sake! :-)

    (Although not until after I'd been forced to learn Fortran, mind you...)

    Cheers,

    Tim
  • Ok, I have to admit I'm not a relational scholar. I've read Chris Date's An Introduction to Database Systems and Foundation for Object/Relational Databases: The Third Manifesto. Both contain excellent reasons NOT to use OODBMSs for most database applications, at least in the form you refer to. Here are some of my observations on OODBMSs:

    Composite Objects and Relationships:

    OODBMSs may be able to outperform an RDBMS for a specific data model, but ONLY using queries and transactions that are defined up front in your modeling/design process. Relational systems will often perform better when all of the possible queries/transaction are not known up front. (as is often the case)

    OODBMSs often fall down when their datasets exceed the amount of RAM in your system. Because the relational model is mathematically well defined, RDBMS implementers can build generic query analyzers that can often find an optimal data retrieval path for a large set of queries that aren't known up front. Also, you can tweak a query's performance using indexes and storage specifications. This is much harder to do in OODBMSes - which don't have a well defined mathematical model.
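
The point about tweaking performance with indexes, without touching the query, can be sketched as below. Assumptions: SQLite stands in for the RDBMS, the `phone_call` table and its data are invented, and the exact wording of the plan text varies between SQLite versions, so the sketch only checks whether the index name appears in it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE phone_call (id INTEGER PRIMARY KEY, caller TEXT, minutes INTEGER)"
)
db.executemany(
    "INSERT INTO phone_call (caller, minutes) VALUES (?, ?)",
    [(f"user{i % 500}", i % 60) for i in range(5_000)],
)

QUERY = "SELECT SUM(minutes) FROM phone_call WHERE caller = ?"


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, ("user7",)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no index yet, so a table scan is expected
answer_before = db.execute(QUERY, ("user7",)).fetchone()[0]

# The tweak is a storage-level change only; the query text is untouched.
db.execute("CREATE INDEX call_by_caller ON phone_call (caller)")

plan_after = plan(QUERY)  # the optimizer should now pick the new index
answer_after = db.execute(QUERY, ("user7",)).fetchone()[0]
```

The query string and its answer are identical before and after; only the access path chosen by the optimizer changes, which is the kind of generic optimization the comment credits to the relational model.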

    Date's Third Manifesto shows the proper future for DBMSs, IMHO - add OO to RDBMSs.

    Class Hierarchy: Data in the real world usually has hierarchical characteristics.

    In the real world, data can be modeled hierarchically, but often shouldn't be. Imposing a hierarchy limits you to viewing the world through that hierarchy.

    Improper OO modeling can cause a ton of headaches. In your Employee/Manager hierarchy, what happens when an Employee becomes a Manager? Does the object change its type on the fly? Say you have a ManagerList that contains a list of Manager objects (and only Manager objects). What happens when a Manager is demoted to Employee? Is the OODBMS smart enough to remove them from the ManagerList dynamically? Or do you have to write code to auto-magically remove them? Or do you destroy the Manager object and recreate them as an Employee?
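
For comparison, here is one way the same promotion and demotion might look when "manager" is modelled as data rather than as a subclass. This is a sketch under assumptions: SQLite stands in for the RDBMS, and the `employee` table, the `is_manager` flag, and the `manager_list` view are hypothetical design choices, not something taken from the article.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " is_manager INTEGER NOT NULL DEFAULT 0)"
)
# The "ManagerList" is a view, so it is derived afresh on every read.
db.execute(
    "CREATE VIEW manager_list AS SELECT id, name FROM employee WHERE is_manager = 1"
)
db.executemany(
    "INSERT INTO employee (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
)


def managers():
    """Names currently in the manager list, in id order."""
    return [name for (name,) in db.execute("SELECT name FROM manager_list ORDER BY id")]


before = managers()
db.execute("UPDATE employee SET is_manager = 1 WHERE id = 2")  # promotion
promoted = managers()
db.execute("UPDATE employee SET is_manager = 0 WHERE id = 2")  # demotion
demoted = managers()
```

No object is destroyed or retyped: the row for Bob keeps its identity throughout, and the list of managers follows the data without any extra removal code.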

    Circumventing the Need for a Query Language:

    Without a generic query language, any OODBMS "queries" you perform must involve massive amounts of recursive pointer chasing. It is possible to use query languages in an OODBMS but they have a much harder time optimizing the queries due to the lack of a formal mathematical model for OO and the lack of emphasis on normalization.

    No Impedance Mismatch

    This is not a problem with the relational model, it's a problem with the interface layer you are using (ODBC, ADO,..). These interface layers are designed to be generic to all RDBMSs, this causes the impedance.

    No Primary Keys

    My primary beef with OIDs is an OID defines a value that has no meaning to the object it represents. It is extra information that takes up space. Often OODBMSs have to resort to 128-bit or larger values to be able to generically identify every possible instance of every class in a system. These large values can greatly increase the storage space needed for objects. Say I have a class that just has one 32-bit integer member. In an OODBMS, the space required for an OID is many times larger than an instance of the class itself!! (Very inefficient!) Also, every lookup of an OID potentially requires a search through every object in the system, not just the primary key of one table. For large datasets this is a killer.
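
The storage arithmetic behind "many times larger" works out as follows for the example in this comment. The helper function is hypothetical, and the 128-bit OID width is the figure quoted above rather than a property of any specific product.

```python
def storage_per_object(payload_bits, oid_bits=128):
    """Return (payload_bytes, oid_bytes, oid_to_payload_ratio)."""
    payload_bytes = payload_bits // 8
    oid_bytes = oid_bits // 8
    return payload_bytes, oid_bytes, oid_bytes / payload_bytes


# A class with a single 32-bit integer member and a 128-bit OID.
payload_bytes, oid_bytes, ratio = storage_per_object(32)
```

For a 32-bit member that is 4 bytes of data carried alongside a 16-byte identifier, so the OID is four times the size of the value it identifies.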

    Also it's incorrect to say a user of an RDBMS has to worry about primary keys. It's the designer of a relational data model who has to worry about keys. A user just uses the data model.

    Defining primary keys is one of the easier tasks in relational modeling, IMHO.

    One Data Model

    I agree, current RDBM Systems don't model behavior very well but that's not a fault of the relational model, it's the fault of RDBMS vendors. Transact-SQL sucks ass, PL/SQL isn't much better.
  • Are you sure that your lack of market penetration was not fundamentally because none of your customers had a good word to say about your product?

    Well, Anonymous Coward, if that is your name, "none" is provably false. All I can say is that your experience was atypical based on the information I saw. Databases over 1gb were common, and most customers I met were very pleased with tech support.

  • There is no limit on the database size. The limit is on the amount of mapped memory space in a single transaction, 1gb in your case. You sometimes have to plan things carefully, (e.g. judicious use of ObjectStore references), to avoid running out of address space.
  • I am one of the founders of Object Design (now Excelon Corp.) We had the slickest OODBMS -- persistence was implemented by taking over memory mapping, (no "overloading the arrow operator"). It was the least obtrusive OODBMS. Other systems of the day required you to use different string libraries or forego C/C++ standard arrays (for example). Other systems arguably had better scalability or concurrency models.

    As someone else has pointed out, OODBMSs require a very different skill set. The problem isn't that your typical SQL developer didn't have these skills. The problem is that the things were ever referred to as database systems.

    If you walk into a potential customer selling a "database system", then the database guys come and hear what you have to say. They ask about SQL support and point-and-click development tools. They are going to be looking for very high levels of concurrency, at isolation levels below serializable.

    Selling a "database system" meant that once we got past the early adopters, we were selling against Oracle and we hit a wall. What we should have done from day one was to sell persistence for C++. We did start out like this, e.g. trying to convince ECAD vendors to build their products on top of ObjectStore. That had some limited success because the customers knew that they needed persistence, but they were C/C++ hackers at heart, and an RDBMS was a poor compromise. A "database for C++ with no impedance mismatch" sounds great to someone writing a 3d modeler. We then went on to apply the same logic selling to satisfied RDBMS users without changing our strategy, and that's when things stalled.

    That strategy was necessary in some ways, because we were venture-funded, and the VCs weren't going to be happy with a small niche. They wanted something that would get into every insurance company and bank. However, by aiming high and failing (by VC standards), we abandoned our natural market too soon and avoided becoming a small success in that market.

  • > If you store your data by maintenance method

    I don't really know what you mean.


    --
    Leandro Guimarães Faria Corsetti Dutra
    DBA, SysAdmin
  • > It is interesting to see that a posting with so many wrong statements receives such a high rating here.

    It seems that each of us have different ideas about what's right and what's wrong, so let's get over such self-serving statements and go over the issues at hand.


    > Reality is object-oriented. We use objects in our modern programming languages.

    As far as I know reality may be represented by objects. It makes no sense saying that reality is this or that oriented, since apart from God no one can ever know what He was oriented to when he did create.

    Seriously, reality can be represented by objects. It can be represented by relations also. Even if objects are convenient for some programming domains, they aren't for data storage and retrieval, except if your program will never change *and* it is object-oriented. It's surprising to learn how many programming gurus, and especially database gurus, steer clear of object-oriented programming, rather keeping with other models of programming like the functional or the structured ones.


    > Why should we flatten these out to tables with unnecessary keys...

    Why should we complicate relations with objects, if we can store data-independently?

    Please, you are repeating OO jargon without explaining anything to me. As I never saw much sense in OOness, I will need better teaching than this.


    > Relations between objects can be maintained transparently within the database.

    This isn't the issue. The issue is that these relationships are maintained physically in the database, thus going against the Information Principle. In contrast, relational databases have no physical links. All relationships are a result of data kept in common by different relations. I won't explain here what a relation is; you should read Chris J Date's [dbdebunk.com.] An Introduction to Database Systems [dbdebunk.com.] for that.
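
A small sketch of what "relationships are a result of data kept in common" can look like in practice. Assumptions: SQLite stands in for the RDBMS, and the tables, names, and cities are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE author (name TEXT PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE book   (title TEXT PRIMARY KEY, author_name TEXT NOT NULL);
    CREATE TABLE shop   (shop_name TEXT PRIMARY KEY, city TEXT NOT NULL);

    INSERT INTO author VALUES ('Ann', 'Springfield'), ('Bob', 'Shelbyville');
    INSERT INTO book   VALUES ('Relations 101', 'Ann');
    INSERT INTO shop   VALUES ('Corner Books', 'Shelbyville');
    """
)

# A relationship the designer anticipated: books and their authors.
books_with_city = db.execute(
    "SELECT b.title, a.city FROM book b JOIN author a ON a.name = b.author_name"
).fetchall()

# A relationship nobody planned for: authors and shops that share a city.
# No link was ever stored; the common value alone relates the two rows.
local_shops = db.execute(
    "SELECT a.name, s.shop_name FROM author a JOIN shop s ON s.city = a.city"
).fetchall()
```

Neither query follows a stored pointer. The second one relates authors to shops purely through the shared `city` value, a link that would have had to be designed in advance if relationships were kept as physical references.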


    > This is common-sense and there is no need to write scientific papers about it.

    It is common sense for uncommon OO programmers, not for DAs, DBAs, functional or other non-OO programmers and common people like me.

    The fundamental issue of OO is that it is an over-extension of some simple programming rules-of-thumb. By the time these rules reached the full extension they have today, they had become much more complicated than the surprisingly simple scientific papers that were published about relational database model theory, like E F Codd's paper [acm.org].


    > You can design data by writing classes in your respective programming language.

    Not so fast. I want my data in my database, in a well defined and theoretically sound *data* sublanguage, not only in a programming language.


    > [independence of the logical and physical layers of your database]

    > Where do you need tables here?
    > Class members define properties.
    > Methods define behaviour.

    If you want to compare relational databases to anything else, it is better to think of relations, not tables, entities or relationships.

    Seriously again, relations are a logical representation. With this logical representation of data at users' (and programmers') hands, you can lay your data physically whatever way you want, even by using pointers if you like. This way you can optimise the physical layer for performance or availability or throughput or whatever balance of whatever goals, while keeping data logically organized, easily accessible and readily available to access plans created by a good query optimizer.


    > [shifting the performance optimization issues to the DBMS' optimizer]

    > Object databases also use optimizers to analyze queries.

    But your access paths are predefined. The user has little freedom to discover relationships between different data; ad hoc queries will have weird access paths and little possibility of optimization. And if you ever need a schema change, you will have to rewrite lots of queries.


    > [any schema change not only needs an application recompilation]

    > There are object databases that manage schema versioning automatically.
    > Our product simply stores a superset of all used schemas.

    This won't fly. This is schema accumulation, not change. You will have data stored in many different ways, and users will find ways of wanting schema changes that can't be efficiently stored anyway.


    > Wrong. Our object database is not multi-user as of today

    Then you haven't even faced the very issues relational theory was created to solve. First read and understand Codd's paper above, then we can talk. Write me privately by email, or better yet read also The Third Manifesto [dbdebunk.com.], then Database Debunkings [dbdebunk.com.], and then we can talk.

    > By the way: Do you typically have different tables for different users?

    Even in quasi-relational SQL each user has its own schema. And this schema can contain base relations, derived relations including named ones (views), and synonyms.


    > [rethink the data access path]

    > Reengineering is a terrible problem with relational databases.

    I've been working for years with weak, quasi-relational SQL and even with this poor tool I've not faced this problem of "reengineering". I know you have a product to sell, so it may be hard to forget marketspeak, and even harder to actually acknowledge some fault, but what do you really mean by reengineering? AFAIK this is a management nineties' regurgitation of Operations and Methods ill digested with some Data Processing thrown in, not a CS word.


    > Strings are not typesafe, so you have to parse the entire application

    Stop! Stop! You're killing me!

    Seriously, strings are strings. If SQL doesn't do all the type checking it could, it is no fault of strings per se, much less the relational model. And it has no bearing on relational data independence. When you change a data type, you have a different relation; there's no way of shielding a typesafe language from that. Some data sublanguage may provide some shortcuts to such modifications, and even do some automatic type casting, but this is not a database model issue.


    > [which should give us practically all of the advantages of OODBMSs without their cons]

    > We are going the other way. We want to provide all the
    > relational functionality that you wish with
    > our object database. We might finally end up
    > with very similar engines.

    Not at all, because you haven't yet understood the fundamentals of the field.


    --
    Leandro Guimarães Faria Corsetti Dutra
    DBA, SysAdmin
  • The issue is that OODBMSs conform neither to current database best practices nor to theory.

    Regarding best practices, you should know already from other, better-rated comments in this thread: you should design your data before your application, and OODBMSs make that hard; you should strive for independence of the logical and physical layers of your database, keeping data independence and shifting the performance optimization issues to the DBMS' optimizer; OIDs hinder the designation of candidate keys, of which the primary key is a special case, and thus hinder a lot of data integrity checks that should be done by referential integrity. And we could go on and on.

    As for the practical implications of not conforming to these best practices, any schema change not only needs an application recompilation, but also requires that you rethink the data access path (also known as a query's access plan); you won't be able to keep several logical schemas for different users, and the identity of the user's view with the physical layout will force you to optimize only for the most common case, instead of leaving it up to the DBMS to create the best access plans.

    All this is much better explained in Database Debunkings [dbdebunk.com.], a site co-maintained by Chris J Date [dbdebunk.com.], author of the best database books I've ever read; you can find a list of his available books also there [dbdebunk.com.].

    As for theory, there is no real substitute for the relational database model theory. As Linus Torvalds thinks that microkernels were a good idea but misguided, yielding no practical nor theoretical improvements, so OODBMSs sound nice but offer no real improvements over RDBMSs. This is not to say that everything you will ever need will be handled properly by your SQL DBMS. The point is exactly that people have gone for OODBMSs because they thought that SQL was relational, and found it wanting. The problem is that SQL never was truly relational, just an approximation of it. Date has a whole book on it, called The Third Manifesto [dbdebunk.com.].

    Summing up, what I am really trying to find is some proper implementation of the relational database model ideals, which should give us practically all of the advantages of OODBMSs without their cons. I have just been informed of Suneido [suneido.com.], but have not investigated it fully... it's a pity it is Win32, not POSIX.


    --
    Leandro Guimarães Faria Corsetti Dutra
    DBA, SysAdmin
  • Developers aren't the only ones who have to query the database.
    Hear, hear! My first database-related job involved writing Crystal Reports against an Informix database. I knew zilch about CR, SQL, and databases in general when I started, but I was up to speed in about a month (enough to do the job, anyway). I was working in suitland (the 'real' developers were elsewhere in the building); there was one former coder who was my manager, and one other report writer. EVERYONE else was nontechnical - MBAs or MBAs to be. All of them had a decent understanding of the database structure.
  • In fact, zope's OODBMS, the ZODB is usable without zope, see
    http://sourceforge.net/projects/zodb/
  • We mostly use a custom OO Java layer over our database to drive our primary web application. But, occasionally, someone wants to crank out a reporting app with Perl. No problem, just load up DBI and whatever CPAN libraries you want. We have another app written in Python that hooks into the DB, equally easy. And, yes, we have our resident PHP fans who insist on using that for quick apps, so they use the mysql layer for their language and have no problem.
    And, oh, did I mention that MySQL, which we're using, is free, fast, cross platform, and well tested by thousands of users over the course of many years? It also has tons of freely-available tools (GUIs, web apps, etc.). None of the OODBMSes can touch that (yes, I actually evaluated Zope and Ozone, the two biggest open source alternatives, and they don't come close to what we're looking for).
    But, that said, I WISH we could use an OODBMS that was free or inexpensive (this is a nonprofit institution), cross-language (including scripting!), and standards compliant so that we could move to a competitor if we needed to.
    In the future, we'll probably move to Java Data Objects (JDO), which provide an object-relational mapping layer over a traditional RDBMS, but without the complexity of full EJB. See Exolab's Castor project [exolab.org] for more info.
    --JRZ
  • Firstly, they only really make sense if your applications are OO. Mine are now: I use Zope, and its OODBMS is amazing (supports transactions, versions, undo etc).

    It keeps things elegant, tidy and dev time is slashed considerably (perhaps 40% of similar things in PHP/RDBMS from my experience)

    If you don't try them, you'll never know.

    http://www.zope.org
  • Compare the TPC-C (OLTP) or TPC-W (web e-commerce) benchmark numbers for the OODB's to those of the RDB's. Well... you can't. None of the OODB vendors have done those benchmarks. They know they would suck. Or maybe they just know that the RDB's have this market sewn up, and can't justify the investment.

    Whatever reason, if TP is at the heart of what you do (as it is for MANY systems), you don't want to use a DB whose vendor can't or won't do the industry standard benchmark.
  • Actually, you can use objects with an RDBMS if you are using PHP4. The Serialize command will encode an object for storing in the database as a text string. The only problem is that you can't do any queries on the data in that form. Combines well with sessions though, I believe.
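
    A minimal sketch of that trade-off, written in Python rather than PHP, so json.dumps stands in for PHP's serialize. The Cart class and the sessions table are hypothetical names made up for illustration, and SQLite is used only because it ships with Python.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict


@dataclass
class Cart:
    # Hypothetical application object, used only for this illustration.
    owner: str
    items: list


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, payload TEXT)")

# Store: the whole object becomes one string in a single column.
cart = Cart(owner="alice", items=["book", "pen"])
conn.execute(
    "INSERT INTO sessions (id, payload) VALUES (?, ?)",
    (1, json.dumps(asdict(cart))),
)

# Load: fetch by key, then rebuild the object in application code.
(payload,) = conn.execute(
    "SELECT payload FROM sessions WHERE id = ?", (1,)
).fetchone()
restored = Cart(**json.loads(payload))

# The payload is treated as an opaque string here, so "which carts belong
# to alice?" means loading every row and filtering in the application.
alice_ids = [
    row_id
    for row_id, text in conn.execute("SELECT id, payload FROM sessions")
    if json.loads(text)["owner"] == "alice"
]
```

    Lookup by key stays cheap; anything that needs to look inside the object is done outside the database, which is the limitation mentioned above.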

    Personally, I find that there are very few things that actually work better as objects than as straight procedural code, but that's probably just a matter of coding style and language preference.

  • While OODBMS were an obvious choice to me for performance and ease of programming, my consultants told me that finding Oracle talent was so much easier than finding Versant talent (for example) that I would be wasting time and money using OODBMS. This is especially true of DBAs.

    And that's precisely the same reason it took Linux so long to catch on in the enterprise, and why it still hasn't invaded small to medium businesses with only 1-2 network-savvy people. I'd love to switch to Linux fileservers instead of upgrading our NT boxes to 2k, but since we can't find anybody with the appropriate experience to manage them when I'm not around, we stick with the point-and-click OS's. Don't flame me for the decision, I'm just stating why we don't always switch to things we all know are best. (Reminds me of OS/2 for some reason.)
  • I would have liked to see some example code for storing data into the OODMS, anyone have any?
  • by hey! ( 33014 ) on Friday May 04, 2001 @07:13AM (#245681) Homepage Journal
    This reminds me of the famous GOTO considered harmful issue.

    Relational systems are useful for a wide variety of tasks specifically because they are limited in their expressive power. This limitation in their expressive power means that certain desirable properties are maintained.

    The objects that are recognized in the relational programming model are scalars, tuples and tables. Most operations are closed on the set of all tables -- that is to say, they take tables and produce tables. This means that you can compose operations in various kinds of ways and still have more raw material for further operations.
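
    A small sketch of what "closed on the set of all tables" buys you, in Python. Tables are modelled as lists of dicts, and the select, project and join helpers are illustrative stand-ins written for this example, not any real database API.

```python
def select(table, predicate):
    """Keep the rows that satisfy the predicate; the result is again a table."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Keep only the named columns, dropping duplicate rows."""
    seen, out = set(), []
    for row in table:
        narrowed = tuple((c, row[c]) for c in columns)
        if narrowed not in seen:
            seen.add(narrowed)
            out.append(dict(narrowed))
    return out


def join(left, right, column):
    """Pair up rows that agree on the shared column; the result is again a table."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


# Made-up sample data.
employees = [
    {"name": "Ann", "dept_id": 1},
    {"name": "Bob", "dept_id": 2},
    {"name": "Cy", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept": "Sales"},
    {"dept_id": 2, "dept": "Ops"},
]

# Because every operation returns a table, the calls nest freely.
sales_names = project(
    select(join(employees, departments, "dept_id"), lambda r: r["dept"] == "Sales"),
    ["name"],
)
```

    Each intermediate result is usable as input to any other operation, which is the composition property the paragraph above describes.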

    To take a more modern view of this: relational databases are about the reuse of facts. The process of designing a database is one of analyzing factual relationships so that eventually each fact is stored in one and only one place. This, along with the closed nature of relational operations, facilitates recombining these facts in various novel ways. I believe this is the source of the relational model's sustained popularity.

    The cost is that the resultant model is not ideal for any single application. I believe this is the nature of the "impedance mismatch" -- you are dealing with an off-the-rack, one-size-fits-most-applications representation of data. Naturally, for complex applications with severe performance constraints, a more tailored representation is required.

    I've never had the cash to hack around with OO databases, so I'd like to learn more. Do they support the kind of composition of operations that you get with relational systems? Presumably objects can be re-used in different applications, but how well does this work in practice?

  • I asked the same question on K5, but got no takers: has anybody played with ZODB outside of Zope? insights/impressions?

  • Aren't you supposed to design an application before implementing
    it in any way including putting data in a DB? I've worked at two companies and
    had a ton of projects in school and none involved implementing the database
    before the application was designed.


    I'd like to know what world you are living in. In the real world, most databases
    are legacy databases and FULL of data. I've had to design applications around
    databases for years now. In my field (Programming for Engineers) the data is king
    and people need to access it in multiple ways. True, if you are designing a
    system from the ground up, then you will be able to design the DB and
    make it nice and pretty. This is seldom the case in any case but web development.

    This is simply hogwash. RDBMSs are by their nature
    non-generic, especially when one adds foreign keys and constraints to a system
    which are necessary for any decent sized application. On the other hand the
    entire point of object oriented programming
    is creating generic reusable
    components. With the ability to use inheritance and polymorphism in an ODBMS I
    see no reason why you believe an RDBMS is more generic.


    'Generic' may be the wrong word here. A better one would be 'simpler'. A lot of applications
    just don't need all the OO stuff. The reason that RDBMSs are so pervasive is
    because most data can be represented well and in an easy to understand way with
    just tables and keys.

    Learning
    OODBMS techniques is mainly learning how to use another API in your Object
    Oriented programming language of choice (well C++, Java or Smalltalk) versus
    learning SQL and relational database theory. If people could learn SQL which is
    completely unrelated to any other aspect of their programming experience then
    adding OODBMS techniques as a skillset would be trivial. Of course, if people
    don't realize that alternatives to RDBMSs exist then they won't learn these
    techniques. That is more important because management can't find people who have
    these skills if developers don't go out and learn these techniques.


    Developers aren't the only ones who have to query the database. In my shop,
    we have 10-20 people querying the same database. Many of whom have spent a lot
    of time learning SQL. Most of the people who need to look at the data are
    not able to pick up a new query language quickly enough. SQL is simple
    enough to learn. RDBMSs are simple and easy to understand. With an OODBMS,
    these people have to be trained on what the heck OO is. This is not an easy
    concept for a non-programmer. On the other hand, tell someone that the database
    is a collection of tables, and they can easily understand.

    Now I just realized you didn't read the
    article. People have measured gains in the range of ten to a thousandfold
    increase in performance, these are not incremental. Secondly the primary benefit
    is that it means you have to write less code and don't have to worry
    about multiple paradigms at once when implementing an application.


    Sure. I'll believe it when I see it. This sounds like marketing hype to me.
    Sounds like someone who didn't know how to program for an RDBMS wrote some
    crappy code. Correctly written code for an RDBMS would not experience these
    kinds of gains when converted to an OODBMS. The overhead for the conversion
    process could be this large, but only if the original code is crap.
  • I think OO has been around long enough now for there to be little excuse for people not to know anything about it.

    On the occasions I've tried picking it up, I've usually ended up with headaches. Either in terms of how people think or how computers work, it makes no sense that I've ever been able to figure.

    My most recent attempt was to pick up Visual C++ for an image-processing class I'm taking this semester (the instructor recommended it in order to access whatever bells and whistles Win32 offers). It seemed to me that you spent more time moving widgets around on the screen than you spent actually writing program code. After a few weeks, I said "screw this" and went back to gcc under Linux. While the rest of the class was running into trouble getting its software working (I'll admit that I don't know if they were struggling to get VC++ to do what they wanted or if they were running into more fundamental problems with the algorithms to be coded), I was producing working code for histogram equalization, 2-D Fourier transforms, and DCT-based lossy image compression (among other things) long before anyone else I spoke with on the subject.

  • One interesting compromise is to use O/R mapping layers; you put all your data in a traditional SQL database and describe a mapping to objects.

    A couple of interesting open-source ones are Castor [exolab.org] and Osage [sourceforge.net]. I haven't had the chance to use either one in a serious project yet, but as a NeXT refugee I'm looking forward to using a good O/R mapping layer again. Do people have any recommendations?

    For those interested in the topic, there is useful information at Scott Ambler's site [ambysoft.com], including his white paper The Design of a Robust Persistence Layer for Relational Databases [ambysoft.com].
  • It has really spiffy Java Object Projection, and is a lot faster than Oracle.

    http://www.e-dbms.com/


    - - - - -
  • by 1010011010 ( 53039 ) on Friday May 04, 2001 @06:21AM (#245697) Homepage
    Cache [e-dbms.com] solves some of the problems you point out. It's accessible relationally or via objects. New object interfaces can be added to existing ones, or to relational and non-relational data stores. So, Cache is generic. Complexity -- because Cache can also be accessed as a relational database, you can write a new Java OO app using its object interface and let older apps continue to use its SQL interface. Skills availability -- start relational, have the choice of trying OO.

    - - - - -
  • by 1010011010 ( 53039 ) on Friday May 04, 2001 @06:24AM (#245698) Homepage
    http://www.e-dbms.com/cache/components/cacheobjects/index.html

    - - - - -
  • I must admit, I'm biased by a very bad experience with ObjectStore. Here's the story.

    I used to work for Excite@Home, in their E-Business Services unit (now defunct; those left are just an engineering adjunct to Excite@Home). We created a web-based store hosting product based entirely upon ObjectStore as the back-end using Java for dynamic page generation getting results from C++ query servers.

    Unfortunately, the site became very popular, and with all the orders, order information, store products, etc. stored in the database, had hundreds of millions of objects (in some cases, very large objects) in the data store.

    We began running up against the 32-bit barrier for address space within ObjectStore. At the time, there was no 64-bit version of ObjectStore (and I don't know if there is now). We would watch performance steadily degrade on our C++ queries over the course of 2 or 3 months, until finally it would nearly grind to a halt because of lack of address space and we would be forced into a 12-14 hour defragmentation routine. Each time we went through this cycle, it would start again, but performance would erode even faster.

    Admittedly, we were doing some pretty bizarre stuff. ObjectStore didn't support on-the-fly schema changes, so we hacked some utilities which allowed us to do that (and which ate address space). We also stored all the product orders in the database, and we never fully deleted orders until we defragmented. But fundamentally, ObjectStore had a problem with scalability for extremely large databases (billions of objects).

    We went to Oracle, and the problems disappeared. Hello, 64-bit world, hello nearly unlimited address space, bye-bye constant database defragmentation. I'm not saying Oracle is a panacea -- it's not, and is quirky as hell -- but it blew the crap out of ObjectStore in this case.

    My two cents.

    Matt Barnson

  • Hmm... Some projects may be like that, but there are others that aren't.

    A very common requirement is to add a boolean or a timestamp field that is used by a maintenance batch process to determine if it needs to do something to a record. In the RDBMS world, you don't need to interrupt your interactive application to replace a batch process that is likely running on a separate machine and only at night.

    In fact, with Oracle, you can, on the fly, add a boolean flag, plus a trigger to be run on update to set the flag to 1. Then your modified batch process can set it to 0 when it does its thing.
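
    A rough sketch of that flag-plus-trigger change, using Python with SQLite standing in for Oracle (the SQL dialects differ). The orders table, the needs_batch column and the trigger name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'new'), (2, 'new')")


def app_update_status(order_id, status):
    """Interactive application code, written before the schema change."""
    conn.execute("UPDATE orders SET status = ? WHERE id = ?", (status, order_id))


# Schema change made while the application keeps running unchanged:
# a flag column plus a trigger that raises the flag on every status update.
conn.execute("ALTER TABLE orders ADD COLUMN needs_batch INTEGER NOT NULL DEFAULT 0")
conn.execute(
    """
    CREATE TRIGGER orders_mark_dirty
    AFTER UPDATE OF status ON orders
    BEGIN
        UPDATE orders SET needs_batch = 1 WHERE id = NEW.id;
    END
    """
)

# The old application code still works and knows nothing about the flag.
app_update_status(1, "shipped")


def run_batch():
    """Modified batch job: handle flagged rows, then clear the flag."""
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM orders WHERE needs_batch = 1 ORDER BY id")]
    conn.execute("UPDATE orders SET needs_batch = 0 WHERE needs_batch = 1")
    return ids


processed = run_batch()
```

    Only the batch job had to change; the function the interactive application calls is untouched.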

    Now, I would like to solve the "impedance mismatch" between 3GL code and RDBMSs, but I don't know if it is possible w/o sacrificing the generic nature of RDBMS. I would rather switch languages to something better suited to database work (Perl's DBI is pretty nice, compared to C/C++/Java interfaces, at least for what I used it for)
  • by NothingCleverToSay ( 76997 ) on Friday May 04, 2001 @08:46AM (#245706)
    I have used Versant on a medium to large scale development project. For 80-90% of the code, using an ODBMS was a dream. A simple persistent base class for objects which need to go in/out from the DB was an elegant solution that was a joy to use. I was all ready to become the new ODBMS advocate/zealot for all my future development projects.

    Then I ran into a wall. The wall was Ad Hoc query. For most of our system, traversing an object model was a very elegant way of accessing data. But for that last 10%, we really needed a fast, efficient Ad Hoc query. Here is where the ODBMS fell flat on its face. The queries were slow, and doing something akin to a "join" was mighty painful. And of course, it turned out that these operations were the most used and the slowest part of our system. Everything came crashing down around me. What had been a joy to develop was a nightmare to use.

    Our application was a series of separate distributed apps, all reading and writing to a shared datastore. Although walking between related objects was a dream, finding the "head" of the tree would always be a PITA. Our data had a good parent-child-grandchild-etc has-a set of relationships. But finding the "interesting" parent objects was very, very slow. Once the parents were found, traversing thru the related data was fast and easy, but the startup of each operation was a huge bottleneck.

    This may just be poor design experience on my and the other developers' parts. Just like a set of C developers can create a truly horrid C++ design, we RDBMS developers may have just abused the OODBMS. But the fact is we had a group of half a dozen experienced OO developers, and we all thought we had a good approach. If a group of developers with good OO programming experience, and good RDBMS experience can't figure out how to correctly use an ODBMS, then I don't have much hope for the technology. Either the technology has some serious limitations, or the learning curve is very, very steep. Either way, I've been sticking to my tried-and-true RDBMS ever since.
  • This is a good article, but it only presents 2 extremes in the examples (in Java): either use a native OODBMS, or use raw JDBC. The latter is not how anyone doing medium-to-large scale projects works. Instead of using "raw" JDBC to talk to an RDBMS, object-relational middleware (like "TopLink", etc.) is used to map the objects to the tables transparently to the programmer. So, you can have code which looks like, and is as easy to understand/use as, the OODBMS code in the example, but the back-end is an RDBMS.
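
    To make the idea concrete, here is a toy mapping layer in Python over SQLite. It is only a sketch of the general technique, not how TopLink or any real product works; the Mapper class, the Employee class and the employee table are invented for the example.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Employee:
    # Hypothetical domain object; field names double as column names.
    id: int
    name: str
    salary: float


class Mapper:
    """Toy mapping layer: one dataclass maps to one table with an id key."""

    def __init__(self, conn, cls, table):
        self.conn, self.cls, self.table = conn, cls, table
        self.columns = [f.name for f in fields(cls)]

    def save(self, obj):
        # Build the INSERT from the dataclass fields so callers never see SQL.
        marks = ", ".join("?" for _ in self.columns)
        self.conn.execute(
            f"INSERT OR REPLACE INTO {self.table} "
            f"({', '.join(self.columns)}) VALUES ({marks})",
            [getattr(obj, c) for c in self.columns],
        )

    def get(self, key):
        # Turn the fetched row back into an object, or None if absent.
        row = self.conn.execute(
            f"SELECT {', '.join(self.columns)} FROM {self.table} WHERE id = ?",
            (key,),
        ).fetchone()
        return self.cls(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")

employees = Mapper(conn, Employee, "employee")
employees.save(Employee(1, "Ann", 50000.0))
ann = employees.get(1)
```

    The calling code deals only in Employee objects, while the data stays in an ordinary table that any SQL tool can still query.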

    Just check out what google(tm) gives you for the search terms: object relational mapping [google.com].

    yabba

  • by Carnage4Life ( 106069 ) on Friday May 04, 2001 @07:05AM (#245717) Homepage Journal
    6. The world is not object oriented. Even if oo is a useful tool, it is no silver bullet.

    Which makes more sense when developing an application in an object oriented programming language? Using a database that is consistent with the programming paradigm and performs database operations transparently, or one that requires the developer to go through additional hoops to get data, is generally slower, and involves writing more code?

    7. RDBMS are proven technology and rather well standardised, OODBMS aren't. Currently there is a proposal for a standard (java data objects), but even that only addresses one platform.

    Not only is there a standard, but the ODMG standard is on version 3 [odmg.org]; JDO is merely a Java standard. Please know the facts before flaming.

    --
  • Hi, thanks for the responses, I didn't think anyone would be done reading it so quickly. :)

    Complexity. These systems are much more difficult to design than RDBMS. The application must be designed first, then the data structures must accommodate that. This kind of design is very expensive.

    Aren't you supposed to design an application before implementing it in any way including putting data in a DB? I've worked at two companies and had a ton of projects in school and none involved implementing the database before the application was designed.

    RDBMSs are generic. Since an OO system is designed for a specific application, it's difficult to use that system for anything else. A well-designed, properly normalized RDBMS can be used for many different applications. When a DB is going to fill many terabytes, you don't want to have multiple copies of it for each distinct reporting application.

    This is simply hogwash. RDBMSs are by their nature non-generic, especially when one adds foreign keys and constraints to a system which are necessary for any decent sized application. On the other hand the entire point of object oriented programming is creating generic reusable components. With the ability to use inheritance and polymorphism in an ODBMS I see no reason why you believe an RDBMS is more generic.

    Schema changes. As mentioned in the article, schema changes are a nightmare with an OO system. In a relational system, some changes can be made with no impact on existing applications. Others are relatively uncomplicated compared to similar OO changes.

    ...and in an OO system some changes can be made with no effect on the existing applications. In the general case, though, an RDBMS is more flexible than an ODBMS.

    Skills availability. Yes, the old management problem. Everyone knows SQL; nobody knows OO.

    Learning OODBMS techniques is mainly learning how to use another API in your Object Oriented programming language of choice (well C++, Java or Smalltalk) versus learning SQL and relational database theory. If people could learn SQL which is completely unrelated to any other aspect of their programming experience then adding OODBMS techniques as a skillset would be trivial. Of course, if people don't realize that alternatives to RDBMSs exist then they won't learn these techniques. That is more important because management can't find people who have these skills if developers don't go out and learn these techniques.

    It's just not worth it. Given the dramatically higher costs associated with designing and maintaining an OO system, most applications just don't need the incremental performance gains associated with it. Very specialized, very high performance systems would benefit, but smaller or more general systems would not.

    Now I just realized you didn't read the article. People have measured gains in the range of ten to a thousandfold increase in performance, these are not incremental. Secondly the primary benefit is that it means you have to write less code and don't have to worry about multiple paradigms at once when implementing an application.

    Finally, where the heck are you getting this BS that designing an application with a single data model (i.e. one set of UML diagrams) is more expensive than designing one with 2 data models (i.e. an ER model for the DB, UML for the application)?

    --
  • Once upon a time, before relational databases, there was something called the Conference on Data Systems Languages [umn.edu], an organization which developed standards for COBOL. They defined the "CODASYL DBMS", which was basically a way to put persistence into COBOL. COBOL records could be stored, indexed, and explicitly linked.

    This approach had the same advantages and disadvantages of an "object oriented database". The data was too closely coupled to the applications. Adding a new field or index required modifying and recompiling all the applications and rebuilding the database.

    The great advantage of a relational DBMS in a business environment is that it isn't closely coupled to the applications. For long-lived data, this is essential. That's why relational DBMS systems won out over explicitly-linked databases decades ago. They have the flexibility needed for long-term data storage.

    Persistent data storage of language objects is an idea that keeps recurring in academia. It can certainly be done, but the long-term operational headaches aren't worth the short-term gain.

    A related problem is the storage of data trees in databases. The current buzzword for this is XML databases [rpbourret.com], but systems for this go back a long way; check out MUMPS [mumps.org]. You can store a tree in a relational DBMS by breaking up all the nodes into rows and using serial numbers to tie them together, but retrieval takes a huge number of lookups. You can also store a tree as a BLOB (a binary object that the database system doesn't parse), but then you can't search it. There's no general agreement on how to approach this problem yet, but this, not persistent object storage, is probably the way to go.
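
    A sketch of the rows-and-serial-numbers approach in Python with SQLite, mainly to show where the lookups pile up. The node table, its columns and the sample tree are made up, and one query per step is a deliberately naive retrieval strategy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each tree node is one row; parent_id is the serial number tying it to its parent.
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, None, "order"),
        (2, 1, "customer"),
        (3, 1, "items"),
        (4, 3, "item-a"),
        (5, 3, "item-b"),
    ],
)

lookups = 0  # counts queries issued while rebuilding the tree


def load_subtree(node_id):
    """Rebuild a nested dict for node_id: two queries for every node visited."""
    global lookups
    (label,) = conn.execute(
        "SELECT label FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    lookups += 1
    child_ids = [
        r[0]
        for r in conn.execute(
            "SELECT id FROM node WHERE parent_id = ? ORDER BY id", (node_id,)
        )
    ]
    lookups += 1
    return {"label": label, "children": [load_subtree(c) for c in child_ids]}


tree = load_subtree(1)
```

    Five nodes already cost ten queries with this approach, and the count grows with the size of the tree; that is the retrieval overhead the paragraph above refers to.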

    The database community learned painfully to separate indexing from structure. In SQL, you can do any search regardless of whether indices exist to make it fast. Indexing is a performance enhancement, and indices can be created later as needed to improve performance, without impacting programs. Any new database system should have that property.
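
    A quick way to see that property, sketched in Python with SQLite (the customer table and index name are invented). The same query text runs before and after the index is created; only the plan reported by EXPLAIN QUERY PLAN changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO customer (city) VALUES (?)", [("Oslo",), ("Lima",), ("Oslo",)]
)

# The application's query never mentions an index.
QUERY = "SELECT id FROM customer WHERE city = ?"


def run():
    """Run the unchanged application query and return the matching ids."""
    return sorted(r[0] for r in conn.execute(QUERY, ("Oslo",)))


def plan():
    """Return SQLite's description of how it intends to execute QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("Oslo",))
    return " ".join(row[-1] for row in rows)


before_rows, before_plan = run(), plan()

# Added later, purely as a performance measure; no program is modified.
conn.execute("CREATE INDEX customer_city ON customer (city)")

after_rows, after_plan = run(), plan()
```

    The results are identical in both runs; the index only shows up in the plan text, which is what keeps indexing separate from the structure the programs depend on.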

  • GemStone/S [gemstone.com] is still a commercial app, but pretty much all of the Smalltalk source is included. That's something particular to Smalltalk culture, not so much the product specifically. But, if you really wanted, you could make fundamental changes to any part of the system.
  • Just to append it to the list in the article. I interned at an insurance company which is using IBM's VisualAge for Smalltalk and GemStone/S to run all of their insurance production definition. Really interesting stuff, actually. Define it in a GUI, and gen code for the mainframe.
  • That's not the case with every OODBMS. With GemStone/S, for example, you don't ever do anything remotely like that.
  • by bburcham ( 131217 ) on Friday May 04, 2001 @09:05AM (#245731) Homepage

    I've got a better question: why aren't you using the RDBMS?

    Many of us who crow about the wonders of OO programming environments don't have a firm grasp of the alternatives, nor do we fully appreciate the problems that those OO environments solve versus the good things they traded away. For building significant, long-lived, scalable, evolvable, administrable, restartable information systems, the RDBMS has not been beaten.

    If we start from the opposite side, i.e. we start with the RDBMS and ask what it is that is distasteful about programming in this environment, we might actually get somewhere. If I take Oracle as an example and compare it to e.g. Java, the only shortcoming I see with Oracle's PL/SQL is that it doesn't (to my knowledge) support polymorphism. It does support encapsulation and abstraction (functions, procedures, packages with data hiding), and the biggie: declarative, optimizable association specification. It certainly supports "structured programming". Are you willing to trade away all that RDBMS goodness just to get polymorphism? Seems like a poor tradeoff.

    I'll go even further. It is not at all obvious that the OO "model" is superior to the relational one. These observations from this paper [msu.edu] by McCarthy apply just as well now to OO models, as they did to non-relational (accounting) models back in 1982 (pp 554-555):

    (2) Its classification schemes are not always appropriate. The chart of accounts for a particular enterprise represents all of the categories into which information concerning economic affairs may be placed. This will often lead to data being left out or classified in a manner that hides its nature from non-accountants.

    (3) Its aggregation level for stored information is too high. Accounting data is used by a wide variety of decision makers, each needing differing amounts of quantity, aggregation, and focus depending upon their personalities, decision styles, and conceptual structures. Therefore information concerning economic events and objects should be kept in as elementary a form as possible to be aggregated by the eventual user.

    What McCarthy is arguing for is dis-encapsulation! Anti-OO. I think there's an important lesson there.
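    To make the "elementary form, aggregated by the eventual user" point concrete, here is a rough sketch with Python's sqlite3 module. The `sale` table and its rows are invented for illustration and are not taken from McCarthy's paper. The events are stored once at the lowest level, and each consumer picks its own aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical elementary events: one row per sale, nothing pre-aggregated.
conn.execute("CREATE TABLE sale (day TEXT, customer TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        ("2001-05-01", "acme", "widget", 100),
        ("2001-05-01", "bolt", "gadget", 250),
        ("2001-05-02", "acme", "gadget", 250),
        ("2001-05-02", "acme", "widget", 50),
    ],
)

# Different decision makers aggregate the same rows in different ways.
by_customer = dict(conn.execute("SELECT customer, SUM(amount) FROM sale GROUP BY customer"))
by_product = dict(conn.execute("SELECT product, SUM(amount) FROM sale GROUP BY product"))
by_day = dict(conn.execute("SELECT day, SUM(amount) FROM sale GROUP BY day"))

print(by_customer)
print(by_product)
print(by_day)
```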

    So the question is: can we have that flexibility along with maintainability?

    Also, be careful to avoid reasoning from an outdated view of the data type expressiveness offered by the modern RDBMS. All the major vendors are now offering so-called OO/Relational features such as object identifiers, large objects, arrays, structures, sub-tables.

  • I'm sorry, I don't understand many of your arguments. (Disclaimer: I never really used either OODBMS or RDBMS.)

    These systems are much more difficult to design than RDBMS. The application must be designed first, then the data structures must accommodate that. This kind of design is very expensive.

    Don't system designs using UML or any modelling technique used today translate quite simply to an OODB, since they are OO to start with?

    The application must be designed first...of course, on successful projects people don't immediately start coding without knowing what they're coding. I don't see how that differs depending on an RDB or OODB world...

    Since an OO system is designed for a specific application, it's difficult to use that system for anything else.

    Isn't that like saying you can't use the RDB you designed for project A on project B? It seems to me if you can move tables representing objects from project A to project B, you should be able to move the objects from the OODB used in project A to project B. The transportability of the objects OR tables depends on the relationship/similarity between the two projects.

    I don't understand what you mean by multiple copies of a multi-terabyte OODB....

    In a relational system, some changes can be made with no impact on existing applications.

    If changes can be made to the RDB tables without impacting the system, can't correspondingly similar changes be made to the OODB object models? If code has to change in one, it would seem to me that code would have to change in the other.

    Everyone knows SQL; nobody knows OO.

    Can't argue with that. Reminds me of a quote from a book that went something like this: "Like it or not, SQL is intergalactic interspeak."

    As I said, I'm not very familiar with either, and any clarification of your points would be appreciated.

  • Of course I was joking. The folks above didn't seem to catch on ; )

  • by _|()|\| ( 159991 ) on Friday May 04, 2001 @06:22AM (#245747)
    Dare has completely avoided the most important issue: compatibility. I work for a company that produces a DBMS that is marketed as "post-relational." It is, in fact, a hierarchical DBMS with a relational layer and a separate object data management layer. (The relational and object layer are linked by the conventional class-table, object-row, attribute-column mapping.) The object database is the strategic offering, with a proprietary scripting language, proprietary COM and Java bindings, and a proprietary web "server pages" technology. (If you think I'm using "proprietary" as a dirty word, you're right.) The relational database is still the money maker. We get "relational refugees" from Oracle and Sybase. We have a LAMP (Linux, Apache, MySQL, PHP) customer replacing MySQL with our ODBC interface.

    As patchy as the SQL, ODBC, and JDBC standards may be, they have commoditized the DBMS market. Until object databases can do the same (the ODMG standards [odmg.org] don't even come close), they lock you into a proprietary solution. Ultimately, if your database doesn't scale as well as you'd like, that will hurt performance.

  • It's a somewhat important point that SQL was originally designed for ad hoc reporting tasks by semi-technical users. It's got a low bar of entry by design.
  • From my point of view, there is an additional reason.
    Theoretical rudiments for relational databases do exist and are well understood by some. I am not referring to SQL here, but rather to math.
    The term 'relation' is well defined. The math behind it is pretty and simple. Simple is good.
    So, on the one hand we have a well-defined mathematical concept that can be worked on. On the other hand, the elusive 'art of objects'. Although I personally prefer the latter (makes me feel good about myself), I always appreciate the ability to define software in more solid terms. A relational database engine (what it does, what it needs to do) can be defined in terms of quite comprehensible algebra.
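    As a rough illustration of how small that algebra is, here is a sketch in Python that treats a relation as a set of tuples. The function names, the positional column indexes, and the sample data are all assumptions made for the example, not any real engine's API.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def project(relation, *indexes):
    """Projection: keep only the listed columns (duplicates collapse, as in a set)."""
    return {tuple(row[i] for i in indexes) for row in relation}


def join(left, right, left_index, right_index):
    """Equi-join: concatenate tuples whose chosen columns are equal."""
    return {l + r for l in left for r in right if l[left_index] == r[right_index]}


# Hypothetical relations: (employee, department) and (department, city).
employees = {("ann", "sales"), ("bob", "ops"), ("cy", "sales")}
departments = {("sales", "oslo"), ("ops", "lima")}

# "Names of employees whose department is in Oslo", written as composed operators.
joined = join(employees, departments, 1, 0)
in_oslo = project(select(joined, lambda row: row[3] == "oslo"), 0)
print(sorted(in_oslo))
```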
    There are similar efforts for OODBMS (evolving algebras, for instance) but these are relatively recent, and most people don't care about them.
    Conclusion: relational database engines are simple in construction. OO databases are not. Simple is good. Complex is bad. Long live RDBMS.

    -m-
  • With increasing levels of complexity you get increasing levels of functionality. You also get more things that break. You have to make sure that the underpinnings are set correctly. When they are not, then watch out.

    [Insert snide comment here] Take a look at that pinnacle of Object Oriented Programming, Microsoft Office

    That cheap shot aside, the ramp-up to a level of truly competent understanding is much longer than anticipated. The problem is that OOP can often give the appearance of competency to those not in the know, but you still have the same problems that you had before, and they can be much more difficult to find if you are not an expert.

    Check out the Vinny the Vampire [eplugz.com] comic strip

  • ...for proving my point. ;)


    --
    Scott Robert Ladd
    Master of Complexity
    Destroyer of Order and Chaos

  • Now then, if you managed to find a language whose native types were *all* expressed as objects, where my data were all most naturally expressed as an object

    Eiffel? C#? How about the original -- Smalltalk.

    (god bless Objective C and Scheme)

    I don't know much about Objective C, but Scheme? In no sense are, for example, numbers treated like OOP objects in Scheme. I mean, you can't subclass them. They are objects in a non-OOP sense, I guess -- you can query their type and all. But this sense is irrelevant to the current thread.
  • Which makes more sense when writing an application using an object oriented programming language to develop an application? Using a database that is consistent with the programming paradigm and performs database operations transparently or one that requires the developer to go through additional hoops to get data, is generally slower, and involves writing more code?

    How about using the appropriate paradigm for the application at hand (which is not always OO), the right paradigm for the data in the database (which may be relational, OO, etc.), and establishing a sensible protocol between the two?

    The point of your target article is well-taken, but don't get too religious about OO. Uniformity for uniformity's sake is seldom convincing.
  • by LionKimbro ( 200000 ) on Friday May 04, 2001 @10:36AM (#245772) Homepage

    If I understand correctly, the idea is that the RDBMS is turned into an object persistence store. You pull the object's data from the data store, manipulate the object (which may or may not update the database), and then you can store the object back away.

    The idea seems to be that we should not abstract ("essentialize") database transactions. We shouldn't have to think about transactions with the data store, and by golly, we won't. We'll make the database a 1:1 mapping with the objects, and that'll be that.

    It's a terrible idea. You want as tight control over your trips to the database as possible. Sure, if you're running a small app on one machine, you're fine. But if you've got hordes of transactions coming through, it is really important to watch those trips.

    When I worked with WebObjects/EOF, we developers were constantly doing a tug and pull with the system, trying to get just the data we wanted. Different object sets and different APIs would have different ways of presenting the very simple information we needed.

    For example, say I have 500K entries in the People table. We want to view their name and email, and a couple other things, 50 at a time.

    With an object store, even though I only want their name and email, I've got to pull out everything else in there. What an incredible waste! There are some 10-20 fields in there!

    In proper OO fashion, these folks are in a list. I can't possibly pull out 500K entries, so the API goes through twists and contortions to let me select out just the first 50, and then page through 50 at a time.

    Do you see what's happened? OO programming has "degraded" itself into what should be SQL land, though it's doing a damn poor job of it. Sure, it sounds nice to say, "Oh yeah, we'll just make an object in the OOP program an object in the database", but what happens when that object is a linked list of 500K items? Suddenly you have these lazy bindings from your linked lists, and each time you traverse another item in the list, it's making a query!
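    The difference in database trips can be sketched like this. It is not WebObjects/EOF code; it is a stand-in using Python's sqlite3 module with a made-up `people` table (500 rows instead of 500K), comparing one targeted query per page against a lazy per-object fetch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)")
conn.executemany(
    "INSERT INTO people (name, email, bio) VALUES (?, ?, ?)",
    [(f"user{i:04d}", f"user{i:04d}@example.com", "x" * 200) for i in range(1, 501)],
)

PAGE_SIZE = 50


def fetch_page(page):
    """One round trip: only the two columns needed, only one page of rows."""
    return conn.execute(
        "SELECT name, email FROM people ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    ).fetchall()


def fetch_page_lazily(page):
    """Mimics lazy object loading: one query for the ids, then one full-row query each."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM people ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    )]
    queries = 1
    rows = []
    for person_id in ids:
        full = conn.execute("SELECT * FROM people WHERE id = ?", (person_id,)).fetchone()
        queries += 1
        rows.append((full[1], full[2]))  # every column was fetched, only two are used
    return rows, queries


direct = fetch_page(1)
lazy, lazy_queries = fetch_page_lazily(1)
print(direct == lazy, lazy_queries)
```

    Both paths return the same page, but the lazy one pays one extra query per object and drags every column along with it.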

    We had to have 5-7 test database servers so that we could make sure our performance was okay, every time we made a change!

    Ug! What a terrible idea..!

    We would have the friggen SQL written down, EXACTLY like we wanted, and it was terribly frustrating to have to wrestle with this system, trying to get our fucking data out, and nothing BUT our data out..! A lot of programmers just didn't bother. "We'll just pack in more RAM." Eeeee! The database guys hated us.

  • by micromoog ( 206608 ) on Friday May 04, 2001 @06:16AM (#245776)
    From a purely academic standpoint, OO database systems seem like a better solution. They are elegant, and designed to integrate perfectly with the application. However, most of the reasons for using them are also the reasons for not using them:
    1. Complexity. These systems are much more difficult to design than RDBMS. The application must be designed first, then the data structures must accommodate that. This kind of design is very expensive.
    2. RDBMSs are generic. Since an OO system is designed for a specific application, it's difficult to use that system for anything else. A well-designed, properly normalized RDBMS can be used for many different applications. When a DB is going to fill many terabytes, you don't want to have multiple copies of it for each distinct reporting application.
    3. Schema changes. As mentioned in the article, schema changes are a nightmare with an OO system. In a relational system, some changes can be made with no impact on existing applications. Others are relatively uncomplicated compared to similar OO changes.
    4. Skills availability. Yes, the old management problem. Everyone knows SQL; nobody knows OO.
    5. It's just not worth it. Given the dramatically higher costs associated with designing and maintaining an OO system, most applications just don't need the incremental performance gains associated with it. Very specialized, very high performance systems would benefit, but smaller or more general systems would not.
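    As a hedged illustration of point 3, here is a sketch using Python's sqlite3 module (the `customer` table and `existing_report` function are invented for the example). An application that names its columns explicitly keeps working unchanged after a new column is added for someone else's use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")


def existing_report():
    """Stands in for an existing application; it names the columns it needs."""
    return conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()


before = existing_report()

# A later schema change for a new application: the old query is untouched.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
conn.execute("INSERT INTO customer (name, email) VALUES ('Bob', 'bob@example.com')")

after = existing_report()
print(before, after)
```

    This only covers additive changes; renaming or dropping a column that the old query uses would still break it.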
  • by micromoog ( 206608 ) on Friday May 04, 2001 @08:22AM (#245777)
    Why do you think 19 out of the 20 biggest Telco companies use ObjectStore?

    This is marketing hype of the worst kind. If you look, I would be willing to bet you would also discover:

    • 19 out of the 20 biggest Telco companies use Linux
    • 19 out of the 20 biggest Telco companies use Windows NT
    • 19 out of the 20 biggest Telco companies use HP printers
    • 19 out of the 20 biggest Telco companies use Dell computers
    • 19 out of the 20 biggest Telco companies use Gateway computers
    • 19 out of the 20 biggest Telco companies use IBM computers
    • 19 out of the 20 biggest Telco companies have some employees named Dave
    . . . you get the idea.
  • by stille ( 213453 ) on Friday May 04, 2001 @06:41AM (#245781)
    Because I'm too lazy to read 21846 bytes of text that explains why I should.
  • Here's another example for your case against OO: We implemented Poet, (at no small cost, I might add) only to find that within 6 months of beginning to use Poet as our major DB, a major 'Organizational Re-Adjustment' and a 'Data Center Consolidation' project wiped the whole dang thing in favor of DB2/OS390.

    Now, our DB2 performance is actually much better than Poet ever was. Perhaps this is due to having highly skilled DB2 DBAs, or something. All the same, it is not worth going to a very expensive, single application db. Always use something standard, flexible, and easy to find admins for. It's worth it in the long run.

    -WS

  • Is there a reason that Oracle wasn't in the list of OODBMS? It's only the most used database in the world...

    And the reason I'm not using it yet is because it simply hasn't been around long enough. Oracle's implementation from what I understand is still a bit buggy. The RDBMS version has been around for much, much longer, and when you're dealing with enterprise-class applications, you sure as hell don't want to use anything that's even close to bleeding-edge.

  • Actually, it is. Check the Oracle Docs [oracle.com].

  • I appreciate the merit and advantages of object databases over relational ones, and the code example was an eye-opener. Still, how do I transfer over simply? Every coder and his dog is familiar with relational databases, but how prevalent is object....anybody?

    1. what the? [nowjones.com]
  • The reason you failed is because your product sucked! We used ObjectStore 8-10 years ago, and it was a nightmare: persistence was implemented by taking over memory mapping, and causing page faults is not the most efficient way to access data.

    In an RDBMS if you have a problem with data, you just go look at the table that you have a problem with and fix the problem. In an OODBMS if you have a problem, you need code to navigate to the offending data (via those ever efficient page faults). So every fix requires software to be written.

    Another problem was the arrogance of the people from ObjectDesign, and from reading the preceding post, nothing has changed (not our fault, the world didn't understand us).

    Other than transactions, the persistence that both Borland and M$ built into their C++ compilers was superior to the snakeoil that ObjectDesign was peddling.

  • Um gee isn't that what:

    ( Read More... | 21846 bytes in body | 106 of 163 comments | Features )

    Is for? I'm sure you're joking but I had to reply to make it 164 :)

    --

  • Hey Rob, could you not post such code heavy topics on a Friday? I'm still caffeine deficient, burnt from the week, and I'd like a lighter, funnier /. to get me in the weekend mood. Save this stuff for Monday, and put up some Katz or Star Wars stuff.
  • But it's a big one. A company I worked for recently employed a RogueWave product to emulate an OODBMS on top of a relational DB, and the result was utter horror. Having to recompile everything as the result of a schema change is a major pain, especially if you have to deal with multiple versions of the codebase. Of course, the situation was worsened in this case because the RW software had to generate all of the mapping code.

    I also find that building entity-relation models for relational DBs -- that is, thinking of objects in the form of tables, rows, and columns -- is a very clear way to figure out the problem domain and evaluate different solutions. A successful development process might well include a preliminary stage working with an RDBMS, even if only in working out the conceptual kinks, and then move on to an OODBMS.

    Finally, a crucial criterion I would employ in which system to go with is the complexity of the data to be stored. The application I worked with was a horrible candidate for an OODBMS because the information itself was simple: names, contact info, and the like, which fits the relational model quite naturally. On the other hand, I'm about to start on a project of my own utilizing highly complex objects, capable of much greater sophistication than in my other example. I will most likely use an OODBMS, for instance db40 [db40.com].

  • Aren't you supposed to design an application before implementing it in any way including putting data in a DB? I've worked at two companies and had a ton of projects in school and none involved implementing the database before the application was designed.

    Uh, you're both right and wrong here. Yes, you are right that one should design the application before implementing the database. But you are mistaken about one point: The first step to designing the application is to understand the problem, and that includes understanding the data. Understanding the data means modeling -- or if you will, designing -- the data (which is not the same as implementing the database). If you design the objects first, then try to figure out which data belongs to what object, you may overlook some important data. Despite the department name (INFORMATION Technology) half my job seems to be convincing people that the data is what really matters, not the color of the web page background.

    It doesn't matter whether you are using OO techniques. I once took a class in OO design where the instructor pointed out that you can implement an OO design in just about any non-OO language. The point is how you look at the problem. If you are writing a program to track a fleet of taxis and you choose OO techniques where you model the taxis as individual objects with a set of attributes, so what? You can still implement the design in C or Cobol or assembly or whatever non-OO language you choose, and you can store the data in an RDBMS or a big flat file -- it's still an OO design. Perhaps implementation will be facilitated by using C++ or Smalltalk or whatever OO language you choose, and perhaps implementation would be easier with an OO database. Or perhaps not; as others have pointed out there are other factors to consider, such as existing skill sets.

    Also, don't OO databases store their data in RDBMS systems? Or do they store their data in a big flat file? Or do they create a small, flat file for each object? Does it matter? I submit that it doesn't matter -- the OO design is a layer of abstraction on top of the implementation method. For all you care, Oracle can store their relational data in one big fat flat file; the fact that they store it in tables is an abstraction on top of the actual storage mechanism. OO databases are, or should be, no different.

    The article says OO databases are better because they involve less code. Less code for you, perhaps, but way more code for the people who wrote the OO database. An OO database gives the application designer a layer of abstraction. That layer of abstraction makes life easier for you, and you may believe it even makes your applications run faster, but it isn't necessary to implement an OO application.

    When that OO layer of data abstraction becomes as common (both in terms of standardization and availability) as SQL, then we will see more OO designs implemented on OO databases. Until then, some of us are perfectly capable of implementing our OO designs with RDBMS's. Or, as someone once said (attributions welcome), "A good programmer can code Fortran in any language."

  • Isn't this tying too much of the DB design into the application? And it pretty much precludes reusing a DB unless you have access to the source code, doesn't it?
  • I think the main reasons people aren't even considering OODBMSs is the price, limited language bindings, and the lack of standardization. RDBMSs, in contrast, are standardized and accessible from a large number of languages with a single API, and there are numerous excellent free implementations.

    In fact, none of the open source implementations mentioned even comes close. Ozone, XL2, and Zope are quite language specific and don't even have C++ bindings. (There is a free OODBMS for C++, the Texas Persistent Store, but I think it also has many limitations.) FramerD isn't even an OODBMS because it doesn't attempt to bind language objects to database objects but introduces its own data model.

    More generally, I think the traditional OODBMS approach turns out to be too rigid and too low level for many applications. If someone is going to keep data around for a long time, they don't want it abstracted and encapsulated, they want concrete, exposed, well-defined representations with powerful operations for manipulating it. RDBMSs provide that. I think among the non-RDBMS systems, FramerD comes closest to that.

    So, there are practical reasons why people don't use OODBMSs. But I think there are also some fundamental theoretical issues having to do with how data is manipulated and data models evolve. Still, OODBMSs have many attractive features, so one can hope that they will evolve to meet people's needs more. For some applications, they are already preferable.

  • by janpod66 ( 323734 ) on Friday May 04, 2001 @11:46AM (#245812)
    The problem is that the things were ever referred to as database systems.

    Well, given that it was priced and marketed like a high-end, enterprise-grade database system, that kind of seems reasonable.

    What we should have done from day one was to sell persistence for C++.

    Indeed. But a persistence library for C++ might cost a few hundred dollars per developer and have no or minimal per copy runtime costs. ObjectStore was priced out of that market by orders of magnitude.

    In addition to the license costs itself, there are training costs, retooling, and the cost associated with the risk of picking a single vendor solution. Even if you had given ObjectStore away for free it would have been difficult to displace RDBMSs.

    The best chance for success I see these days would be to have a simple, reasonably good open source OODBMS and make money on management tools and high performance versions. Still not the stuff of billion dollar companies, but a decent living.

    That strategy was necessary in some ways, because we were venture-funded, and the VCs weren't going to be happy with a small niche. They wanted something that would get into every insurance company and bank. However, by aiming high and failing (by VC standards), we abandoned our natural market too soon and avoided becoming a small success in that market.

    It's unfortunate that good technology like ObjectStore failed, but ultimately the choice was yours when you accepted the money and the business model.

  • This is a little piece I wrote recently as an internal memo. We use Versant on an internal project, and wanted to justify the use of OODBMS. We want to begin supporting an open-source one, which is where this came in...

    ---

    Premise:

    When Java first started appearing in enterprise-wide systems, there were large existing systems containing the enterprise's data. Early on, for Java to have acceptance as a solution to problems in this domain, there had to be a way to access this data... data was not going to be 'recreated' for a new, unproven language.

    By far, most of this data was stored in relational databases, like Oracle and Microsoft's SQL Server. The shortest path to accessing this data in a way that made sense to Java's cross-platform nature was to take an existing specification, ODBC, and create a Java specification based on it. Thus, JDBC was born.

    In the years since then, JDBC has matured into a very usable API for accessing relational databases, and a lot of Java developers have had to learn how to use it. Many developers don't even realize it, but there is a mismatch between storing data as an object graph and storing it in a relational database. As a developer, you have to write a lot of code to map between the object world and the relational world. There is a better way.
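    The kind of mapping code meant here can be sketched roughly as follows. The memo is about Java and JDBC; this uses Python's sqlite3 module purely as a stand-in, and the `Customer` class, the two tables, and the `save`/`load` helpers are hypothetical. Even a two-field object with one list needs a pair of hand-written translation functions.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Customer:
    name: str
    emails: list = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE email (customer_id INTEGER REFERENCES customer(id), address TEXT)")


def save(customer):
    """Flatten the object graph into rows in two tables; returns the new row id."""
    cursor = conn.execute("INSERT INTO customer (name) VALUES (?)", (customer.name,))
    customer_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO email (customer_id, address) VALUES (?, ?)",
        [(customer_id, address) for address in customer.emails],
    )
    return customer_id


def load(customer_id):
    """Reassemble the object from its rows."""
    (name,) = conn.execute(
        "SELECT name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    emails = [row[0] for row in conn.execute(
        "SELECT address FROM email WHERE customer_id = ? ORDER BY rowid", (customer_id,)
    )]
    return Customer(name, emails)


original = Customer("Ada", ["ada@example.com", "ada@example.org"])
restored = load(save(original))
print(restored == original)
```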

    JDBC is great for accessing existing relational data... It is and should be considered a bridge to legacy systems. If you are starting a new project using Java, there should be a better way. There should be a way to store your data without thinking about it. You should be able to hand your objects to a service that will store them. You should be able to maintain complex relationships between those objects, and you should be able to query that service for objects that match certain criteria. You should not have to muddy up your domain objects with code for storing themselves, and you shouldn't have to extend any objects that a framework provides. You shouldn't have to complicate your build process with 'enhancement' steps, and your objects should not be bytecode modified, so they are still usable in debugging environments.

    Does such an object-oriented database exist? In some respects, yes. There are object-oriented databases such as Object Store and Versant. They operate very closely to the ideals listed above. But for most developers, especially those that already know JDBC, their price tags are a large barrier to entry. Until you know the technology, you won't recommend it. You won't recommend the technology until you use it. You are not going to spend $10,000 of your own money to get to know and use a product like this... there are too many other things happening in the Java community anyway.

    Conclusion:

    These are the reasons object databases have not become popular: There is a high barrier to entry for their use. The databases themselves are expensive. And Java Developers doing database access know JDBC, which is a 'good enough' solution.

    In order to overcome this and make object databases take their position in the Java community as a preferred way to store data, we need a good, Free, Open Source implementation with all the benefits of transparency of use. Once the database is Free, it must be evangelized by developers. Other developers need to know of it and learn it.
  • by MatthewNYC ( 413607 ) on Friday May 04, 2001 @06:04AM (#245814)
    I just went through this decision-making process with the consultants who are going to build my company's OSS. While OODBMS were an obvious choice to me for performance and ease of programming, my consultants told me that finding Oracle talent was so much easier than finding Versant talent (for example) that I would be wasting time and money using OODBMS. This is especially true of DBAs.
  • Of course that's a personal problem

    But there's a good reason why it stops my *good* use of an OODB.

    If you look well and hard at C++, it's mildly object oriented, you can at least *create* objects, right?

    If you look well and hard at Java (god bless it), it's *mostly* object oriented, at least there's a *root* object in the hierarchy, right?... Hmm, what are all those basic components though, is that an integer over there? What kind of object is that again?

    Now then, if you managed to find a language whose native types were *all* expressed as objects, where my data were all most naturally expressed as an object (god bless Objective C and Scheme)... well, I'd be much more likely to store my data objectively (come to think of it, I do hate most of my data more than a little).

    At any rate, I think the answer is that Object technology is only just coming into its own, and the rigor required of us the programmers to *USE* object orientation to its fullest extent is something that we don't enjoy doing for something as crunchy as DBMS access. Of course, just trying to get the data-creating departments to specify your data in an object oriented fashion might very well bite arse also...
    Nietzsche on Diku:
    sn; at god ba g
    :Backstab >KILLS< god.

