
Chacham's Journal: OLTP vs Data Warehouse, Why not just have both?

This is something that I've wondered about before. I had a small interview with some people regarding some work, and the idea came up again. So, I've been thinking about it.

The issue is having two databases: one as an OLTP, and the other as a Data Warehouse. Since I worked with Oracle, and I like a good deal of the Oracle way of doing things, I'll go along with Oracle terms. Basically, OLTP is On-Line Transaction Processing. It is a database that is designed more for input than for output, and most queries are small. A good example would be an ATM. In such a case, the database should be normalized, for authoritative answers and less work on input.

On the other hand, a Data Warehouse is made more for output than for input, and most queries are large. A good example would be a data-mining firm. In such a case, the database should be denormalized, for quick answers and less work on output.
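To make that concrete, here's a minimal sketch with made-up tables: the OLTP side keeps customers and orders in separate normalized tables, while the warehouse side flattens them into one wide table so the big reports never have to join.

    -- OLTP side: normalized, each fact stored once, built for small fast writes
    CREATE TABLE customer (
      customer_id   NUMBER PRIMARY KEY,
      customer_name VARCHAR2(100)
    );

    CREATE TABLE orders (
      order_id     NUMBER PRIMARY KEY,
      customer_id  NUMBER REFERENCES customer,
      order_date   DATE,
      amount       NUMBER(10,2)
    );

    -- Warehouse side: denormalized, the customer name repeated on every row,
    -- so report queries scan one table instead of joining
    CREATE TABLE dw_customer_orders (
      order_id      NUMBER,
      order_date    DATE,
      amount        NUMBER(10,2),
      customer_id   NUMBER,
      customer_name VARCHAR2(100)
    );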

The problem is, many companies need both. There is always information going in, and every day a number of people want some serious reports. That's when the normalization fights come in, and the line between OLTP and Data Warehouse blurs beyond recognition.

What I've been wondering is, why not have both? The OLTP could be normalized and be the authoritative database, and the Data Warehouse could be updated nightly (or weekly). The only real loss is that the reports are working on data from yesterday. But that shouldn't be much of a problem.
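As a rough sketch of what that nightly update could be (the DW schema and table names here are hypothetical), a stored procedure that wipes the warehouse copy and reloads it from the authoritative OLTP table:

    -- Hypothetical nightly refresh: DW is a separate schema holding report copies
    CREATE OR REPLACE PROCEDURE refresh_warehouse AS
    BEGIN
      -- throw away yesterday's copy (TRUNCATE needs dynamic SQL inside PL/SQL)
      EXECUTE IMMEDIATE 'TRUNCATE TABLE dw.oltp_customer';

      -- reload from the authoritative OLTP table
      INSERT INTO dw.oltp_customer
        SELECT * FROM customer;

      COMMIT;
    END refresh_warehouse;
    /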

Comments:
  • Just as an FYI, and it's not definite, but I may be able to release something under the nerdfarm name that can help you with the data synchronization.

    It's a Perl system, basically a glorified job scheduler. It needs some more work, but it does cascading job schedules with failover. One major patch that I want to do is to embed the jobs in a framework that allows shared database connectivity. I'll find out, hopefully this week or next, about the status of releasing it GPL'd.
    • Sounds interesting, but why would I need it? (Enlighten me please. :-)

      The way I see it, at midnight a job could fire off a procedure that just runs a bunch of queries to throw report-worthy data into the Warehouse. Either INSERT INTO DW.OLTP_Customer SELECT * FROM Customer; or even CREATE TABLE DW.OLTP_Customer AS SELECT * FROM Customer; (so the tables don't take up any space that they don't need). Maybe even exp/imp? The data would be loaded into temp tables, just to get the data over. Then, the Warehous
      • Sounds interesting, but why would I need it? (Enlighten me please. :-)


        To control the link: it's just a high-capability cron/job-scheduling system.

        The way I see it, at midnight a job could fire off a procedure that just runs a bunch of queries to throw report worthy data into the Warehouse.

        The way I would do it is to dump the database from a specific time range, reducing overhead on the transaction system. I'm also picturing this on two different systems, which may not be the case here.

        Job s
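        Roughly, assuming the OLTP rows carry a last-modified timestamp (a hypothetical column), the time-range pull would look like:

            -- Hypothetical incremental pull: only rows touched yesterday,
            -- assuming customer has a last_modified DATE column
            INSERT INTO dw.oltp_customer
              SELECT *
                FROM customer
               WHERE last_modified >= TRUNC(SYSDATE) - 1
                 AND last_modified <  TRUNC(SYSDATE);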
        • reducing overhead on the transaction system.

          Good point. But, given there are no real UPDATEs, other than just dumping data, is that really going to be a burden? Nothing in the original DB will be changed, so there's no need for rollback images, and the warehouse would be closed to other transactions. I'd even remove all INDEXes first, so all it does is DELETE the old data.

          But an export/import might very well be quicker. Now that I think about it, the real concern would be the space the dump would take. If
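          As a rough sketch of that load path (the index and table names are made up): strip the index, clear the old copy, reload, then put the index back for the report queries.

              -- Hypothetical: drop the index so the reload is pure data movement
              DROP INDEX dw.oltp_customer_ix;

              -- clear yesterday's copy (TRUNCATE avoids the undo cost of DELETE)
              TRUNCATE TABLE dw.oltp_customer;

              -- direct-path insert from the OLTP side
              INSERT /*+ APPEND */ INTO dw.oltp_customer
                SELECT * FROM customer;
              COMMIT;

              -- rebuild the index for the report queries
              CREATE INDEX dw.oltp_customer_ix ON dw.oltp_customer (customer_id);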
          • But why can't DBMS_JOB do that? At least in Oracle, it can run a procedure that has failover and monitoring. It could easily update a table mentioning its progress, and, being inside the DB, I'd assume that it would be at least a little bit faster.

            I'm assuming a scenario of a lightweight, insert-style database (like BerkeleyDB or something). So you should be fine w/ Oracle.

            I'm not trying to put your idea down. I'm just trying to understand it. I appreciate you helping me here.

            I know. It's not my ide
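            For reference, a minimal sketch of scheduling a nightly refresh with DBMS_JOB (the refresh_warehouse procedure name is hypothetical, as sketched in the journal above):

                -- Hypothetical: run refresh_warehouse every night at midnight
                DECLARE
                  l_job BINARY_INTEGER;
                BEGIN
                  DBMS_JOB.SUBMIT(
                    job       => l_job,
                    what      => 'refresh_warehouse;',
                    next_date => TRUNC(SYSDATE) + 1,   -- tonight at midnight
                    interval  => 'TRUNC(SYSDATE) + 1'  -- and every midnight after
                  );
                  COMMIT;
                END;
                /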
  • Too much theory; not enough code!

    I understand the need for truly enterprise stuff in some applications, but what kind of system is it that needs huge reports like this on up-to-the-minute data (i.e., not already a day old or something)? You'd have to run a horse track in Hong Kong to need something like that, and even then your data set size could probably let you get away with Oracle on big iron.

    ~GoRK

    • Too much theory; not enough code!

      That's the way databases are. Mostly theory and design, with a few things like schemas, triggers, and stored procedures to make them work together. It means the design phase is the longest, and must be pretty much perfect.

      but what kind of system is it that needs huge reports like this on up-to-the-minute data

      None, really. However, the application doesn't pay the bills; the company does. And the suits there generally want it up-to-date, even if for insane reasons. However,
