Tuesday, 16 Mar 2004
You're Going To Need Persistence
As far as I can tell, the "You Aren't Gonna Need It" principle
(abbreviated YAGNI) advocated by the Extreme Programming crowd isn't
really a principle at all, but rather a catch-phrase to end an
argument. It actually means, "We should do that later. Don't argue."
I think the test-first folks have good reasons for implementing things
in a certain order, but the real reasons are not necessarily the ones
they use to justify it.
Take persistence, for example. The claim is that you might not need a
database, so don't bother implementing it until later. I haven't
tried it myself, but it looks like what actually happens is that you
implement the persistence layer three times: in-memory, using flat
files, and as a database. Implementing three persistence layers is
not really the "simplest thing that could possibly work" (another
catch-phrase). But the result is a more flexible application with a
better architecture.
The in-memory layer is implemented first because you need it to make
unit tests run fast (as a "mock object"), and it's the easiest way to
prototype. This is for the developer, not the customer, but it will
quickly pay for itself for anyone who is serious about unit tests.
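As a rough illustration of the in-memory layer, here is a minimal Python sketch. The NoteStore interface, the InMemoryNoteStore class, and the rename_note function are all hypothetical names made up for this example; the point is only that application code written against a small interface can be unit-tested without touching disk or a database.

```python
from typing import Dict, Optional, Protocol


class NoteStore(Protocol):
    """Hypothetical persistence interface; the names are illustrative only."""

    def save(self, key: str, text: str) -> None: ...

    def load(self, key: str) -> Optional[str]: ...


class InMemoryNoteStore:
    """Keeps everything in a dict, so tests need no disk or database."""

    def __init__(self) -> None:
        self._notes: Dict[str, str] = {}

    def save(self, key: str, text: str) -> None:
        self._notes[key] = text

    def load(self, key: str) -> Optional[str]:
        return self._notes.get(key)


def rename_note(store: NoteStore, old_key: str, new_key: str) -> bool:
    """Application logic that depends only on the interface.

    Copies the note to the new key; returns False if the old key is missing.
    """
    text = store.load(old_key)
    if text is None:
        return False
    store.save(new_key, text)
    return True
```

A unit test would create an InMemoryNoteStore, save a note, and call rename_note on it. A flat-file or database store that offers the same two methods could be swapped in later without changing rename_note.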
Flat files come next because they are useful for small, single-user
datasets, and in the early phases of a new software project, all
datasets are small. Both developers and customers will build many
datasets while trying out the new application. Flat files are very convenient:
everyone who has a computer already knows how to handle flat files,
and they already have all the tools they need. The operating system
stores them directly, they can be exchanged as email attachments,
shared on the web, checked into source control, backed up using any
number of methods, and so on. Furthermore, text files can be edited by
hand and compared for differences, and there are even more tools for
XML files.
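A flat-file layer can be very small. The sketch below assumes a dataset that is just a mapping of string keys to string values, stored as JSON rather than XML for brevity; save_dataset and load_dataset are hypothetical names. Writing with sorted keys and fixed indentation keeps the file stable from one save to the next, which is what makes diffs and source control useful.

```python
import json
import os
from typing import Dict


def save_dataset(path: str, records: Dict[str, str]) -> None:
    """Write the whole dataset as one human-readable text file."""
    # Sorted keys and fixed indentation keep diffs between versions small.
    text = json.dumps(records, indent=2, sort_keys=True) + "\n"
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_dataset(path: str) -> Dict[str, str]:
    """Read the dataset back; a missing file is treated as an empty dataset."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

This rewrites the whole file on every save, which is fine for the small, single-user datasets described above and is one of the reasons a database eventually becomes attractive.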
The next step (depending on the application) is the database, which
allows customers to build larger datasets and allows multiple users to
modify the same dataset. Putting this last means you already have a
lot of code written without a database in mind, which might need to be
refactored. But you also have sample data, an easy-to-use
import/export/backup format, some real-world experience, and a pretty
good idea what the schema should be, before designing the first table.
Plus, the import/export format is independent of any particular
database technology, making it easier to redesign the schema or switch
database vendors, especially in the early stage when the databases are
still fairly small.
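To make the vendor-independence point concrete, here is a sketch using Python's built-in sqlite3 module. The notes table, its columns, and the function names are assumptions invented for this example. The exported value is a plain dictionary that could be written out as a flat file, so moving to a new schema or a different database is a matter of exporting from the old one and importing into the new one.

```python
import sqlite3
from typing import Dict


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (hypothetical) notes table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "key TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def import_records(conn: sqlite3.Connection, records: Dict[str, str]) -> None:
    """Load a database-neutral dataset into the table in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO notes (key, body) VALUES (?, ?)",
            records.items(),
        )


def export_records(conn: sqlite3.Connection) -> Dict[str, str]:
    """Dump the table back into the database-neutral format."""
    rows = conn.execute("SELECT key, body FROM notes ORDER BY key")
    return dict(rows)
```

Copying data from one database to another is then import_records(new_conn, export_records(old_conn)), with nothing in between that depends on either database.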
A DBA can be pretty effective under these working conditions. Don't
let the cheesy slogans put you off.