Strategies for Test Databases?
youngcfan asks: "I've been tasked with finding strategies for a test database that can be used effectively by both software developers and the QA team. We're a J2EE shop with most of the interesting pieces of the application interacting heavily with the database -- so we need to test it. We're ramping up on JUnit, but are looking for ways to test the database-driven pieces of code. Since QA needs the same database for functional testing separate from developers' unit tests, DBUnit doesn't seem to suffice. We also have the challenge of working on multiple releases at the same time, which only complicates how and when to add new data to the test database in a way that's useful and valid for everyone. We're looking for strategies for using a test database in a way that meets both the QA's and the developers' needs, works for multiple releases, and isn't a heavy burden to maintain given that the schema and code can change anytime before any of the multiple upcoming releases. Any suggestions?"
Why use only one DB? (Score:4, Insightful)
You'll find it much easier to create dedicated DBs for each test scope.
Re: (Score:2, Funny)
Simultaneously, I should add.
Re: (Score:3, Funny)
KFG
Re:Why use only one DB? (Score:5, Insightful)
As the parent alludes to, the only way to do it The Right Way (tm) is to have a Development environment, a QA environment, and a Production system.
Each of these systems should be using the same architecture when it comes to hardware and configuration.
The Development system is always in a state of flux, as its name implies.
The QA system should *at least* approximate (if not be identical to) the data and load of the production system, and it should be treated like a production system that QA tries to break.
It is only in this fashion that you will be able to test and make sure your system will work as expected. Leave nothing to chance. Expensive, yes. But it's less expensive than a downed production system, and definitely less expensive than building a complete system and realising it doesn't perform as expected.
Re: (Score:2)
Another had 1 P, 2 QA and 2 Test.
My current company has 2 Prod (one is a daily clone used for reporting), 1 QA and 2 test. And QA and test may have duplicate copies of the normal tables at any time, due to special testing.
Now if you will excuse me, the space manager just mounted that new pack for me, I
Re: (Score:2)
Each of these systems should be using the same architecture when it comes to hardware and configuration.
The Development system is always in a state of flux, as its name implies.
The QA system should *at least* approximate (if not be identical to) the data and load of the production system, and it should be treated like a production system that QA tries to break.
We
Re: (Score:3, Interesting)
You are nearly as wrong as your parent!
1st: the QA system very likely won't be the production system, but the production system as it will run in future.
2nd: DEFINITELY the development system is the same as the QA system. And no: it is not in flux!!! It is reset after each developer test, or developer access to it, either by erasing it and restoring a backup or by rolling back all transactions (which is often not possible).
How the hell should a developer figure if his actual "attempt of a new worki
Re: (Score:2)
When/how do you prove that the flat file class and the database class are equivalent? When/how do you integrate the fact that flat files and databases have different failure modes, and thus different end user error messages?
When the flat file gives a "file not found error" the test environment is not set up correctly.
The flat file class and the database class don't need to be equivalent. They are customized for each test case, so they are only equivalent with regard to that specific case.
End user messages ar
Re: (Score:2)
And no: it is not in flux!!! It is reset after each developer test, or developer access to it, either by erasing it and restoring a backup or by rolling back all transactions (which is often not possible).
How the hell do you get anything done? That would spell disaster where I work - 1000 developers split into about 80 teams, more or less, communicating via service interfaces, all using separate databases. Devo is Devo so you can trash it and nobody cares - this makes you design stuff that is resilient in
Re: (Score:2)
2 QA Systems. One for testing the next release, one set up identically to the production server so QA can reproduce problems found with the production software.
The company I'm at now is extremely small. I constitute 50% of the software engineers and 33% of the entire IT department. We have a production setup, 2 developer setups, and a QA server that we've configured to rapidly switch between our production software and release software. The transition takes about 15 minutes.
Make Sure Test Data Isn't Sensitive (Score:2)
2 cents,
QueenB
Test databases (Score:2, Informative)
Oracle, Sybase and MySQL can all be used as test databases.
Perhaps you really want to know how to test code that uses databases, which is a different question.
There are many refactorings that can be done to reduce your dependency on a particular database install... but that's a rather large topic. I'm available for consultancy, post here and I can get in touch...
Some things you might like to consider
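One such refactoring, sketched below with hypothetical names: hide persistence behind a DAO interface so that unit tests run against an in-memory fake, while production wires in a JDBC-backed implementation. This is only a minimal sketch of the pattern, not a prescription.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical DAO interface: the business logic depends only on this,
// never on JDBC directly, so tests need no database at all.
interface UserDao {
    void save(String username);
    boolean exists(String username);
}

// Test double: an in-memory fake. A production implementation would
// issue SQL through JDBC instead.
class InMemoryUserDao implements UserDao {
    private final Set<String> users = new HashSet<>();
    public void save(String username) { users.add(username); }
    public boolean exists(String username) { return users.contains(username); }
}

// Business logic under test, written against the interface.
class RegistrationService {
    private final UserDao dao;
    RegistrationService(UserDao dao) { this.dao = dao; }
    boolean register(String username) {
        if (dao.exists(username)) return false; // reject duplicates
        dao.save(username);
        return true;
    }
}

public class DaoExample {
    public static void main(String[] args) {
        RegistrationService svc = new RegistrationService(new InMemoryUserDao());
        System.out.println(svc.register("alice")); // first registration succeeds
        System.out.println(svc.register("alice")); // duplicate is rejected
    }
}
```

The fake can be customized per test case, which is exactly why it doesn't need to be equivalent to the database class in general, only for the case at hand.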
Use real data, not test data (Score:3, Informative)
Test data sucks: there are too many real-world situations the developers fail to think of.
We're a pretty small shop, but here's what we do: The production server backup is loaded to the test server daily. Every developer maintains a set of scripts which make any needed database structure modifications after the backup has loaded. All development and QA testing is done against this test database. Where the production data isn't stable enough for unit testing we force-feed a few specific rows (as few as possible). This gives us fresh, real-world data for development and testing, and when an application rolls out, the exact same set of modification scripts are usually run on the production server (i.e. the modification scripts have been indirectly but repeatedly tested themselves).
Re: (Score:1, Interesting)
Re: (Score:3, Insightful)
We do take a QA snapshot of the production server about once a week. It's confide
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
If you have to test code for handling bad data in the DB, then you are not testing the code that should be properly validating the data *before* it is inserted into the DB.
Re: (Score:2)
Re: (Score:2)
No. You should never have bad data in the database to start with. If you manually put bad data into a DB, you are of course going to be running into problems that should never exist.
Yes, because that never happens.
/furiously rolls eyes
Not for unit tests (Score:2)
Why? You can set up your JUnit failure method so it takes a snapshot of the database at the point of failure and mails it to you (as an XML attachment). This means you can run smoke tests nightly -- try doing that with a "complete" database that's been scribbled on by other tests and developers since the problem occurred.
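A rough sketch of the snapshot-on-failure idea, with the actual DBUnit export and mail step replaced by stubs (all names here are hypothetical; real code would export a FlatXml dataset and attach it to the notification):

```java
import java.util.function.Supplier;

// Sketch: run a test body; if it fails, capture the database state
// *at the moment of failure*, before any teardown destroys the evidence.
public class SnapshotOnFailure {

    // Returns the captured snapshot on failure, or null if the test passed.
    static String runWithSnapshot(Runnable testBody, Supplier<String> snapshotter) {
        try {
            testBody.run();
            return null; // test passed, nothing to capture
        } catch (AssertionError e) {
            // Real code: export the tables with DBUnit and mail the XML.
            return snapshotter.get();
        }
    }

    public static void main(String[] args) {
        String snap = runWithSnapshot(
            () -> {
                int results = 8; // simulated wrong query result
                if (results != 7) throw new AssertionError("expected 7, got " + results);
            },
            () -> "<dataset><users rows=\"8\"/></dataset>"); // stub snapshot
        System.out.println(snap);
    }
}
```

Because the snapshot is taken inside the failure handler, it reflects the state the failing test saw, not the state hours later after other tests have scribbled on it.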
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
But if you can't trust your techs, devs and sysadmins to handle sensitive data, then how are you expecting them to fix a problem on a production system?
While I do most developing and testing on test data (to simplify backup, restore and bugtracking), I *always* use a backup of the real database for fina
Re: (Score:2)
It's not the developer's business to define test data; that is the business of either the business analyst, or the test engineer in cooperation with the business analyst.
Sure, lots of business cases are so simple a developer could define the test case. But, if you have a contract with a customer to develop something
angel'o'sphere
which goals ? (Score:1)
Perfect world enviroment (Score:2, Insightful)
Re: (Score:2)
Developers have DBA rights on Dev and are locked out of our Prod instances. Developers script all changes so that their work can be reapplied with the same results on every instance. We also log object changes so we c
Re: (Score:3, Interesting)
Seconded. I'm on a project right now where we (the programmers) have finally gotten management to allocate time for us so we can get going on doing more unit testing, integration tes
Doesn't Suffice? (Score:4, Insightful)
It also occurs to me that if you can't even decide what data is 'useful and valid to everyone' then your test data is nothing like the live data you will have. Here's my suggestion: If it seems like it'll be even slightly relevant to anyone, use it. Otherwise you aren't testing everything.
The constantly changing schema is puzzling also. Did you not plan your database beforehand? I'm guessing this is an XP shop then, eh? XP doesn't stand for 'no planning'. I can understand changes to the schema in the early stages of programming, but if you're getting close to 'multiple releases' then the schema should be pretty solid by now, and the little changes you need to make to DBUnit shouldn't be a big bother.
Re: (Score:3, Insightful)
In theory I'd agree with you, but in practice I've rarely worked on a project of significan
Re: (Score:3, Insightful)
Re: (Score:1)
As to the question of a way to test the DB, the use of a test system, or possibly even multiple test schemas is the correct way to accomplish this. If it's an issue with constructed
"This word does not mean what you think it means" (Score:2)
I guess some toys would only be able to handle a single database, but I can't imagine why anyone would use one when there are so many
Do you have a DBA? (Score:3, Insightful)
DBAs are usually viewed by devs as complete assholes, because they scream and holler at devs who make gratuitous changes to schemas and stored procedures. But a good DBA will make your database issues go away.
Snapshots (Score:2)
You should use caution here. Moving your production data is never trivial. Snapshots are not free. Development machines can load the point-in-time copy to the point where it could impact the pr
My Test Dev (Score:1)
one word: virtualization! (Score:1)
proper design and planning? (Score:2, Insightful)
Listen to your customer?? (Score:1)
Automated database-building scripts... (Score:2)
Our unit tests work at the C# level, not SQL (they test the objects implemented using the database, rather than the database itself). Most tests start by running the creation script to create a fresh database, do things to it, a
For test databases... (Score:4, Insightful)
1- "Clone" whatever is most recent on production
2- Revert to "known good QA state" (ie: big red reset button)
3- Dump current state for later use.
You need to be able to clone so that ad-hoc testing can be run against production data w/o making production impact. This doesn't have to be live, but can be like a once-a-week/once-a-month activity, or rotate out a slave DB every once in a while, or have your DB people test your backups / etc.
You need the ability to revert to a known good state so that specific tests can be run and those can be more easily automated. Like: search "foo", 7 results found (not 6, not 8, not "it was 8 a few seconds ago but now it's 9 because there's a new result that was just added").
The ability to dump out DB state is a very distant third, but can be helpful for post-testing analysis or being able to modify a particular DB snapshot to fit some particular testing needs and then dump that out to the file-system for later use.
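The clone/revert/dump trio can be modelled in a few lines. This toy sketch stands in for real backup/restore machinery (the "database" here is just a map, purely for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the "big red reset button": keep a known-good snapshot
// and restore it before each automated test run. A real setup would
// restore a database backup or a filesystem snapshot instead.
class ResettableDb {
    private Map<String, String> rows = new HashMap<>();
    private Map<String, String> knownGood = new HashMap<>();

    void put(String key, String value) { rows.put(key, value); }
    String get(String key) { return rows.get(key); }
    void markKnownGood() { knownGood = new HashMap<>(rows); } // "dump" current state
    void reset() { rows = new HashMap<>(knownGood); }          // "revert" to it
}

public class SnapshotDemo {
    public static void main(String[] args) {
        ResettableDb db = new ResettableDb();
        db.put("search:foo", "7 results");
        db.markKnownGood();                  // known good QA state
        db.put("search:foo", "9 results");   // a test scribbles on the data
        db.reset();                          // big red reset button
        System.out.println(db.get("search:foo"));
    }
}
```

The point is that the revert is cheap and repeatable, so the "search foo, expect 7 results" style of automated test stays deterministic.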
QA is hard, thank you for trying to make it easier.
--Robert
HSQLDB for Unit Tests (Score:2)
Now, this can only really effectively test a few things, and generally, I find that it can only really be useful for exercising small operations, like individual DAO methods. This is act
Re: (Score:2)
Embedded DB (Score:2)
Point-in-time raw backups (Score:3, Informative)
Ideally, the developers and QA run against a smaller database that is populated from scratch with a small dataset to speed development; then for release testing use a much larger populated database or (if that's too difficult) a copy of the production database that has been appropriately scrubbed to get rid of confidential data.
The database offerings from the various major vendors allow you to "quiesce" the database, which suspends new transactions, completes all pending transactions, and then ensures that all data and logs are flushed to disk. Then, with the production system paused, take a hot point-in-time snapshot of the filesystem, effectively giving you a complete database dump in a few seconds. (This requires a storage system that allows you to make snapshots -- NetApp filers do this, for example.) Resume the database to let the production system continue, and then copy the snapshot of database files to another server and reconstruct a clone of the database.
Run the appropriate trimming/cleansing/schema update on the clone database, and then make a snapshot of THAT. You can then revert the database to a known starting point as you like. If your development requires schema changes, don't let developers make the schema changes directly -- instead, insist that schema change DDL be scripted, and reapply the script to the snapshot at each refresh.
When doing the final release testing, get the latest snapshot of the production database, run the update scripts, and run the tests. If everything looks good, make another snapshot of the production database, and apply the updates to the production database.
Done right, you can always roll back the test
DB for Unit Testing? (Score:3, Insightful)
Schema Changes: As a J2EE architect, the first time I saw Ruby on Rails' database migrations my first impulse was to wonder, "Why the !@#$ is this not in Hibernate?" I am not aware of any slick framework for J2EE apps to manage DB migrations, so you may have to use your own migration scripts. Hopefully, your schema is not changing much.
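A home-grown migration runner can be quite small. The sketch below is hypothetical (Rails-style versioned scripts, applied once each, in order); real code would execute the DDL through JDBC and persist the applied versions in a schema_version table rather than a Set:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal migration runner: apply each versioned script exactly once,
// in ascending version order, recording what has already run.
public class MigrationRunner {

    static List<String> apply(SortedMap<Integer, String> scripts, Set<Integer> applied) {
        List<String> executed = new ArrayList<>();
        for (Map.Entry<Integer, String> migration : scripts.entrySet()) {
            if (applied.contains(migration.getKey())) continue; // already run
            executed.add(migration.getValue());                 // real code: run the DDL here
            applied.add(migration.getKey());
        }
        return executed;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> scripts = new TreeMap<>();
        scripts.put(1, "CREATE TABLE users (id INT)");
        scripts.put(2, "ALTER TABLE users ADD name VARCHAR(50)");
        Set<Integer> applied = new HashSet<>(Collections.singleton(1)); // v1 already applied
        System.out.println(apply(scripts, applied)); // only v2 runs
    }
}
```

Because the runner is idempotent, the same script set can be replayed against developer, QA, and production databases and each instance ends up at the same schema version.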
Getting Data In There: This totally depends on how much data you need. My "favorite" reply to you was to have one snapshot of your production data per developer. That works great, as long as you don't have much data. My last project had I don't even remember how many terabytes of data in prod. Do you really think the client was going to spring for that much storage and that many Oracle licenses to get one instance per developer? Yeah right. We had a full snapshot for performance testing, but regular integration testing was done on a representative subset of data.
DBUnit is a great way to initialize a small amount of data. For larger datasets, you cannot get away with things like DBUnit, as it would take hours, if not days, to get the data in there. For our performance testing databases, we had the prod data snapshot stored on a RAID-1. Before testing started, we broke the mirror and did testing against the degraded array. When it came time to reset the data, we shut down Oracle and rebuilt the array to the good snapshot. That wound up being very fast for us. For medium amounts of data, you could probably get away with using SQL*Loader.
Functional DB testing versus unit-testing... (Score:3, Insightful)
1) You maintain a canonical "test" database (or multiple ones). This database has the same functionality as the production database but generally contains much less data. No one touches this database unless they need to permanently modify the test data. After each release, you make a backup of the database and release that backup to everyone who needs a test database. They restore it to their own environment.
2) You always write changes to the database as scripts so that you can run them against your test database and your production database. Your release process has to change to include running any database modification scripts on the canonical test database as well as the production database. This ensures that your new test database matches the production database for that release.
3) You need to modify your test process so that it runs a database restore at the appropriate points. In our case, we always restore before QA functional tests (because they leave the database in an altered state) but we don't restore for unit-tests (because we insist they leave the database in the same state they started.)
The advantage of this approach is that everyone has a copy of an actual database and you get to see all the funkiness of your real environment. The downside is that you have to be very disciplined in keeping the backups for all releases, and in running modification scripts against both the test and production databases appropriately.
-Peter
Like a Forest Fire (Score:3, Interesting)
When our master build runs test cases, it drops all tables and creates them all fresh using the XML definitions. Each JUnit test case is responsible for ensuring it has the data it needs. In some cases, this is done by setting up a facade on the regular service so that the test can worry about semantics and not data storage. In other cases, the test (or a utility) creates test data. You could presumably also copy part of your live data, though that makes it much more difficult to know what the correct answer is in advance.
If you follow this structure, multiple releases with different schemas is trivial. Just have a parameter for the DB URL in your test suite and let it build the correct database version for you when it checks your schema out of your source repository.
(Incidentally, keeping your database schema in your source repository also allows easy comparison of database structure between code versions, making it easier to figure out what must happen when you upgrade.)
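The drop-and-recreate step can be sketched as follows (a toy model, not the real build: table definitions here are plain strings standing in for the XML definitions, and the "database" is a map):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the "forest fire" build step: drop everything and recreate
// the schema from checked-in definitions, so every run starts clean
// and each test creates exactly the data it needs.
public class FreshSchema {

    static Map<String, List<String>> rebuild(List<String> tableDefs) {
        Map<String, List<String>> database = new LinkedHashMap<>();
        for (String def : tableDefs) {
            database.put(def, new ArrayList<>()); // fresh, empty table -- no stale rows
        }
        return database;
    }

    public static void main(String[] args) {
        Map<String, List<String>> db = rebuild(Arrays.asList("users", "orders"));
        db.get("users").add("alice"); // a test case sets up its own data
        System.out.println(db.keySet() + " users=" + db.get("users"));
    }
}
```

Since the schema definitions come out of source control, checking out an old release and rebuilding gives you that release's database version with no manual steps.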
DBUnit can be very useful (Score:4, Interesting)
So you have this legacy code base - you want to make changes, but how can you validate the result? One approach is to compare database states - one from a known good codebase, one from a modified codebase. DBUnit can be tremendously useful here - this is what I've done (perhaps too complex for explaining on Slashdot):
Create a common Unit Test base class that extends DBUnit's DatabaseTestCase. It will:
a) receive a list of modified table names from the concrete test class
b) if a system property is set, export a pristine copy of these tables prior to running the test - 'reference data'.
c) execute the use case (register a user, perform a transaction, whatever) - this just makes a 'blind' call into the
code proper.
d) if a system property is set, export the modified table data ('known good results')
The idea is you run this test twice:
1) With the original codebase, with result exporting enabled to generate known good results.
2) With the codebase under test - the results generated will be compared against known good results and DBUnit will flag any differences. You can get it to ignore stuff like sequences and dates that will differ between runs.
The reference data generated in (b) is reloaded prior to running the second test, so you start from the same point. Each concrete test class just has to:
* figure out what tables change within the test
* provide the test code itself
Everything else is managed by DBUnit - exporting/importing datasets, comparing datasets, etc.
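The core comparison can be sketched without DBUnit at all. In this hedged toy version a table row is just a column-to-value map, and volatile columns (sequences, timestamps) are excluded from the diff, which is what DBUnit's column-filtered dataset comparison does for real tables:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Diff the table state from the known-good codebase ("reference")
// against the state from the code under test ("candidate"),
// ignoring columns expected to differ between runs.
public class StateDiff {

    static List<String> diff(Map<String, String> reference,
                             Map<String, String> candidate,
                             Set<String> ignoredColumns) {
        List<String> differences = new ArrayList<>();
        for (String column : reference.keySet()) {
            if (ignoredColumns.contains(column)) continue; // volatile column
            if (!Objects.equals(reference.get(column), candidate.get(column))) {
                differences.add(column); // genuine behavioural difference
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        Map<String, String> ref = new LinkedHashMap<>();
        ref.put("id", "1"); ref.put("balance", "100"); ref.put("updated_at", "2007-01-01");
        Map<String, String> got = new LinkedHashMap<>();
        got.put("id", "1"); got.put("balance", "90"); got.put("updated_at", "2007-06-01");
        // updated_at differs but is ignored; balance is flagged
        System.out.println(diff(ref, got, Collections.singleton("updated_at")));
    }
}
```

Any column that shows up in the diff is a real behavioural change between the two codebases, which is exactly the signal you want when refactoring legacy code.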
What about SQLUnit? (Score:2, Informative)
Spring (Score:1)
Why not try SQLite? (Score:1)
It works like MS Access (file-based) but supports most of the SQL92 standard.
http://sqlite.org/ [sqlite.org]