Ramblings on Librarianship, Technology, and Academia

I never metadiscourse I didn't like

4/2/08 10:22 am - RFC: Operational Preservation Matrix

[Tagged as, among other things, otw, because even though I am dealing with these issues as a professional I think that The Organization for Transformative Works is very well-placed to be one of the few organizations prepared to confront operational preservation from the outset. After all, the OTW has to deal with one even more frightening aspect of operational preservation: it is an entirely volunteer-run organization which promises perpetual preservation. It takes a lot of planning and commitment to be prepared to follow through on a commitment like that. Luckily, the OTW has both.]

Introductory thoughts on Operational Preservation

I would love to get comments from the community on this, because I truly believe that this could be a very useful model for organizations designing digitization projects. I know I'm going to prompt my institution to follow this matrix for all new digitization efforts.

Problem Statement: When an archivist deposits material in a digital archive, he or she often assumes that the object is preserved in perpetuity, just as it would be were it a physical object. Depositors of digital material often have the same assumptions, as do institutional administrators. However, the software development and maintenance community does not assume permanence on the same scale on which archivists are accustomed to providing it. Moreover, administrators (and archivists) often have unrealistic assumptions about the labor and costs involved in daily operational maintenance to provide digital preservation, which are -- if not higher -- certainly different from the operational maintenance costs for providing physical preservation. Even worse, many digital preservation projects are funded by limited-duration soft money instead of out of an operational budget.

Or, in a nutshell, we need to remember that digital preservation has an ongoing operational cost which cannot be provided within the archive.

Operational Preservation: To that end, I am proposing this matrix for new preservation and archival projects to see if they have thought of the requirements necessary for permanent preservation.

Anything calling itself a digital preservation project has to be prepared, in perpetuity, to provide every item in the top row for every task down the left-hand column. Funding is really a redundant item -- by "Labor", I mean funding for staff to provide all of the work involved, and "Physical facility" is really something which can be provided by funding -- but the fact that digital preservation requires ongoing operational money is too important to ignore. By "Bureaucratic support" I mean policies and procedures in place which support the operational business of preservation at an organizational level.

Operational Preservation Matrix

Task | Labor | Physical facility | Bureaucratic support | Funding
Existence of the datastream in a file system or database | . | . | . | .
Object access via handle/doi/uri | . | . | . | .
Maintenance, repair, and upgrade of hardware (server, disk, etc.) | . | . | . | .
Maintenance, patching, and upgrade of operating system | . | . | . | .
(The following tasks are not as essential, but still very important)
Rolling forward file formats | . | . | . | .
Transferring data to more modern repository and software tools when appropriate | . | . | . | .
Modernizing user interface as appropriate | . | . | . | .
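
For anyone who would rather track the matrix in code than on paper, here is a minimal sketch, assuming each cell is a simple yes/no for "we have this covered in perpetuity". The names RESOURCES, ESSENTIAL_TASKS, IMPORTANT_TASKS, empty_matrix, and gaps are my own illustrative choices, not part of any existing tool.

```python
# Hypothetical encoding of the Operational Preservation Matrix.
# Columns are the resources a project must provide; rows are the tasks.
RESOURCES = ["Labor", "Physical facility", "Bureaucratic support", "Funding"]

ESSENTIAL_TASKS = [
    "Existence of the datastream in a file system or database",
    "Object access via handle/doi/uri",
    "Maintenance, repair, and upgrade of hardware",
    "Maintenance, patching, and upgrade of operating system",
]

IMPORTANT_TASKS = [
    "Rolling forward file formats",
    "Transferring data to more modern repository and software tools",
    "Modernizing user interface",
]


def empty_matrix():
    """Return a matrix with every cell unticked (False)."""
    return {
        task: {resource: False for resource in RESOURCES}
        for task in ESSENTIAL_TASKS + IMPORTANT_TASKS
    }


def gaps(matrix, tasks=None):
    """List the (task, resource) cells that are still unticked."""
    tasks = list(matrix) if tasks is None else tasks
    return [
        (task, resource)
        for task in tasks
        for resource in RESOURCES
        if not matrix[task][resource]
    ]
```

A project would tick cells as commitments are secured (for example, matrix["Object access via handle/doi/uri"]["Funding"] = True) and then call gaps(matrix, ESSENTIAL_TASKS) to see what is still unaccounted for among the essential rows.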


(Of course, traditional preservation of physical objects is also an ongoing operational cost. Physical objects require extensive physical facilities with narrow environmental limitations, they require re-housing and repair, they require maintenance and supervision. But these ongoing operational tasks can be performed by archivists with traditional skills. The technological operational tasks of a digital archive often can't be performed even by technologically-trained archivists, because the institution will have specific requirements about who is able to, say, maintain the network.)

2/22/08 02:07 pm - real preservation

I've been getting increasingly concerned about what I see as a too-shallow view of sustainability in digital preservation. There's been a lot of lip service paid over the last few years to preservation, and I have certainly heard talks by grant-funding agencies in which they explained that they are now only funding grants which have sustainability written into the grant structure. Yet time and time again, I see soft money being awarded to projects for which the project administrators clearly have only the vaguest idea of what sustainability really means in a software environment.

I don't see this as anyone's fault, mind you. Software developers and IT folks aren't used to thinking of software projects in terms of Permanence. In the traditional software world, the only way something is going to be around forever is if it's going to be used all that time -- for example, a financial application which is in constant use needs to be constantly up. But archival digital preservation has a very different sense of permanence. For us, permanence might mean that you build a digital archival collection once, don't touch its content again for 10 years, but can still discover all of its preserved content at the end of those 10 years.

Meanwhile, in Internet time, a project which has been around for two years is clearly well past its prime and ready to be retired.

Repository managers are putting all of this great work into the repository layer* of preservation: handles and DOIs, PRESERV and PRONOM, JHOVE and audit trails and the RLG checklist. But meanwhile, all of these collections of digital objects -- many of them funded by limited-duration soft money -- are running on operating systems which will need to be upgraded and patched as time passes, on hardware which will need to be upgraded and repaired as time passes, on networks which require maintenance. Software requires sustenance and maintenance, and no project can succeed as perpetual preservation unless it takes into account that such maintenance requires skilled technical people in perpetuity. Real sustainability means commitment from and communication with the programmers and sysadmins. It requires that the techies understand an archivist's notion of "permanence", and that the librarians and archivists (and grant agencies) understand that a computer needs more than electricity to keep running -- it needs regular care and feeding.

(This, by the way, is one of the reasons I'm so excited by the OTW Archive of One's Own and the Transformative Works and Cultures journal. The individuals responsible for the archive and the journal *do* have a real understanding of and commitment to permanence down to the hardware and network provider level. Admittedly, it's a volunteer-run, donation-supported organization, so its sustainability is an open question. But it's a question the OTW Board is wholeheartedly investigating, because they understand its importance.)

*I'm somewhat tempted to make an archival model of preservation that follows the layered structure of the OSI model of network communication. Collection policy layer, Accession layer, Content layer, Descriptive Metadata layer, Preservation Metadata layer, Application layer, Operating System layer, Hardware layer. Then you could make sure any new preservation project has all of those checkboxes ticked. Sort of an uber-simplification of the RLG Checklist, in a nice, nerd-friendly format.
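
A rough sketch of that checkbox idea, assuming the layers are nothing more than an ordered list of names and a project simply reports which ones it has addressed. LAYERS and unticked_layers are hypothetical names for illustration only.

```python
# Hypothetical layered preservation model, ordered from policy down to metal.
LAYERS = [
    "Collection policy",
    "Accession",
    "Content",
    "Descriptive metadata",
    "Preservation metadata",
    "Application",
    "Operating system",
    "Hardware",
]


def unticked_layers(ticked):
    """Return, in stack order, the layers a project has not yet addressed."""
    ticked = set(ticked)
    unknown = ticked - set(LAYERS)
    if unknown:
        # Guard against typos so a misspelled layer is not silently ignored.
        raise ValueError(f"unknown layer(s): {sorted(unknown)}")
    return [layer for layer in LAYERS if layer not in ticked]
```

A project that has only thought about the repository side might call unticked_layers(["Content", "Descriptive metadata", "Preservation metadata", "Application"]) and get back the policy, accession, operating system, and hardware layers as the ones still to plan for.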

8/23/06 01:21 pm - this is why accessibility is such a big issue for me. My day is difficult enough with no roadblocks.

I'd forgotten how much I completely hate doing systems administration-type things by voice. Basically I'm spelling in the alpha-bravo alphabet at extremely high speeds, and anytime I want to say anything lengthy it's easiest to do it in a dictation box or DragonPad window because correction hasn't worked in xterm for several versions of NaturallySpeaking. And, as I remember from the last time I went through this, it's my hands which know how to do programming and system administration, and all of those pathways are burned from brain directly to fingers. My mouth doesn't know how to do these things. Is this what it feels like, at a certain level, to be split brain? Honestly, put a keyboard in front of me and a heavy dose of painkillers and I can program adequately and administer systems with the best of them, but make my voice be the interface and suddenly I'm stupid, I've forgotten everything. And my brain is just incredibly, strangely tired, like I've just run a marathon and my brain did all the work.

In any case, I finally have a default installation of DSpace up and running, so that's something that will be fun to play with. What should I put in it? Just copies of the same files and metadata we currently have in our existing repository (which is Ex Libris DigiTool)? Arguably that would be the best first test (not only of DSpace, but of DigiTool's ability to export in standard formats).

<violin class="world's smallest">

You know, I don't regret being a librarian. Hell, I love being a librarian, and I know I would've never come to this if I hadn't hurt my hands. But it's really so frustrating to know that there was something I was once very good at, and now I can neither do it well nor enjoy the process of relearning.

</violin>