Ramblings on Librarianship, Technology, and Academia

I never metadiscourse I didn't like

4/2/08 10:22 am - RFC: Operational Preservation Matrix

[Tagged as, among other things, otw, because even though I am dealing with these issues as a professional I think that The Organization for Transformative Works is very well-placed to be one of the few organizations prepared to confront operational preservation from the outset. After all, the OTW has to deal with one even more frightening aspect of operational preservation: it is an entirely volunteer-run organization which promises perpetual preservation. It takes a lot of planning and commitment to be prepared to follow through on a commitment like that. Luckily, the OTW has both.]

Introductory thoughts on Operational Preservation )

I would love to get comments from the community on this, because I truly believe that this could be a very useful model for organizations designing digitization projects. I know I'm going to prompt my institution to follow this matrix for all new digitization efforts.

Problem Statement: When an archivist deposits material in a digital archive, he or she often assumes that the object is preserved in perpetuity, just as it would be were it a physical object. Depositors of digital material often make the same assumption, as do institutional administrators. However, the software development and maintenance community does not assume permanence on the scale that archivists are accustomed to providing. Moreover, administrators (and archivists) often have unrealistic assumptions about the labor and costs involved in the daily operational maintenance that digital preservation requires, which are -- if not higher -- certainly different from the operational maintenance costs of physical preservation. Even worse, many digital preservation projects are funded by limited-duration soft money instead of out of an operational budget.

Or, in a nutshell, we need to remember that digital preservation has an ongoing operational cost which cannot be met from within the archive alone.

Operational Preservation: To that end, I am proposing this matrix for new preservation and archival projects to see if they have thought of the requirements necessary for permanent preservation.

Anything calling itself a digital preservation project has to be prepared, in perpetuity, to provide every item in the top row for every task down the left-hand column. Funding is really a redundant item -- by "Labor", I mean funding for staff to provide all of the work involved, and "Physical facility" is really something which can be provided by funding -- but the fact that digital preservation requires ongoing operational money is too important to ignore. By "Bureaucratic support" I mean policies and procedures in place which support the operational business of preservation at an organizational level.

Operational Preservation Matrix

Task                                                                            | Labor | Physical facility | Bureaucratic support | Funding
Existence of the datastream in a file system or database                        |   .   |         .         |          .           |    .
Object access via handle/doi/uri                                                |   .   |         .         |          .           |    .
Maintenance, repair, and upgrade of hardware (server, disk, etc.)               |   .   |         .         |          .           |    .
Maintenance, patching, and upgrade of operating system                          |   .   |         .         |          .           |    .
(The following tasks are not as essential, but still very important)
Rolling forward file formats                                                    |   .   |         .         |          .           |    .
Transferring data to more modern repository and software tools when appropriate |   .   |         .         |          .           |    .
Modernizing user interface as appropriate                                       |   .   |         .         |          .           |    .
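For the programmers in the audience, the matrix can be read as a checklist: every task needs every kind of support. Here is a minimal Python sketch of that reading. The function name `missing_cells`, the idea of a set of "commitments", and the example project are all my own assumptions for illustration, not part of any existing tool.

```python
from itertools import product

# Column headings from the matrix: the kinds of support a project must supply.
SUPPORTS = ["Labor", "Physical facility", "Bureaucratic support", "Funding"]

# Row headings from the matrix, split the same way the matrix splits them.
ESSENTIAL_TASKS = [
    "Existence of the datastream in a file system or database",
    "Object access via handle/doi/uri",
    "Maintenance, repair, and upgrade of hardware",
    "Maintenance, patching, and upgrade of operating system",
]
IMPORTANT_TASKS = [
    "Rolling forward file formats",
    "Transferring data to more modern repository and software tools",
    "Modernizing user interface",
]


def missing_cells(commitments, tasks=ESSENTIAL_TASKS):
    """Return the (task, support) cells that have no standing commitment.

    `commitments` is a hypothetical set of (task, support) pairs that the
    institution has promised to provide on an ongoing operational basis.
    """
    return [cell for cell in product(tasks, SUPPORTS) if cell not in commitments]


# Hypothetical project: labor and funding are covered for every essential
# task, but nobody has committed facilities or policy support.
example = {(t, s) for t in ESSENTIAL_TASKS for s in ("Labor", "Funding")}
gaps = missing_cells(example)
print(f"{len(gaps)} of {len(ESSENTIAL_TASKS) * len(SUPPORTS)} essential cells unfilled")
for task, support in gaps:
    print(f"  {support}: {task}")
```

A project passes only when `missing_cells` comes back empty for the essential tasks; running it again with `tasks=IMPORTANT_TASKS` covers the second half of the matrix.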


(Of course, traditional preservation of physical objects is also an ongoing operational cost. Physical objects require extensive physical facilities with narrow environmental limitations, they require re-housing and repair, they require maintenance and supervision. But these ongoing operational tasks can be performed by archivists with traditional skills. The technological operational tasks of a digital archive often can't be performed even by technologically-trained archivists, because the institution will have specific requirements about who is able to, say, maintain the network.)

4/1/08 12:15 pm - Open Repositories 2008, part 1.

All these papers will eventually be available in the Open Repositories 2008 conference repository. I'm linking to all of the placeholders; papers should be up soon.

This will be very limited liveblogging, because I'm typing during the conference and dictating between sessions, so I can't say much. Hopefully I'll get some good fodder for my upcoming sustainability post.

Keynote:

Repositories for Scientific Data, Peter Murray-Rust )

Session 1 – Web 2.0

Adding Discovery to Scholarly Search: Enhancing Institutional Repositories with OpenID and Connotea, Ian Mulvany, David Kane )

The margins of scholarship: repositories, Web 2.0 and scholarly practice, Richard Davis )

Rich Tags: Cross-Repository Browsing, Daniel Smith, Joe Lambert, mc schraefel )

Ow. I'm not doing this for the next session. I can blog at the breaks.

3/12/08 03:52 pm - mostly links, a few thoughts. Lawrence Lessig, librarything, Major league baseball, Stephen Colbert

Some library, book, archives, records, baseball fandom, and government information musings and links just so I can clear the tabs out of my browser again: Cut to save your screen real estate )

2/22/08 02:07 pm - real preservation

I've been getting increasingly concerned about what I see as a too-shallow view of sustainability in digital preservation. There's been a lot of lip service paid over the last few years to preservation, and I have certainly heard talks by grant-funding agencies in which they explained that they are now only funding grants which have sustainability written into the grant structure. Yet time and time again, I see soft money being awarded to projects for which the project administrators clearly have only the vaguest idea of what sustainability really means in a software environment.

I don't see this as anyone's fault, mind you. Software developers and IT folks aren't used to thinking of software projects in terms of Permanence. In the traditional software world, the only way something is going to be around forever is if it's going to be used all that time -- for example, a financial application which is in constant use needs to be constantly up. But archival digital preservation has a very different sense of permanence. For us, permanence might mean that you build a digital archival collection once, don't touch its content again for 10 years, but can still discover all of its preserved content at the end of those 10 years.

Meanwhile, in Internet time, a project which has been around for two years is clearly well past its prime and ready to be retired.

Repository managers are putting all of this great work into the repository layer* of preservation: handles and DOIs, PRESERV and PRONOM, JHOVE and audit trails and the RLG checklist. But meanwhile, all of these collections of digital objects -- many of them funded by limited-duration soft money -- are running on operating systems which will need to be upgraded and patched as time passes, on hardware which will need to be upgraded and repaired as time passes, on networks which require maintenance. Software requires sustenance and maintenance, and no project can succeed as perpetual preservation unless it takes into account that such maintenance requires skilled technical people in perpetuity. Real sustainability means commitment from and communication with the programmers and sysadmins. It requires that the techies understand an archivist's notion of "permanence", and that the librarians and archivists (and grant agencies) understand that a computer needs more than electricity to keep running -- it needs regular care and feeding.

(This, by the way, is one of the reasons I'm so excited by the OTW Archive of One's Own and the Transformative Works and Cultures journal. The individuals responsible for the archive and the journal *do* have a real understanding of and commitment to permanence down to the hardware and network provider level. Admittedly, it's a volunteer-run, donation supported organization, so its sustainability is an open question. But it's a question the OTW Board is wholeheartedly investigating, because they understand its importance.)

*I'm somewhat tempted to make an archival model of preservation that follows the layered structure of the OSI model of network communication. Collection policy layer, Accession layer, Content layer, Descriptive Metadata layer, Preservation Metadata layer, Application layer, Operating System layer, Hardware layer. Then you could make sure any new preservation project has all of those checkboxes ticked. Sort of an uber-simplification of the RLG Checklist, in a nice, nerd-friendly format.
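In that nerd-friendly spirit, the checkbox idea could be sketched in a few lines of Python. This is only an illustration under my own assumptions: the name `unticked_layers` and the example project are hypothetical, and the layer list is simply the one from the footnote, in the order given.

```python
# Layers as listed in the footnote, from policy at the top to hardware at the bottom.
LAYERS = [
    "Collection policy",
    "Accession",
    "Content",
    "Descriptive metadata",
    "Preservation metadata",
    "Application",
    "Operating system",
    "Hardware",
]


def unticked_layers(ticked):
    """Return the layers nobody has planned for, in the footnote's order."""
    unknown = set(ticked) - set(LAYERS)
    if unknown:
        # Guard against typos so a misspelled layer is not silently ignored.
        raise ValueError(f"not a layer in this model: {sorted(unknown)}")
    return [layer for layer in LAYERS if layer not in ticked]


# Hypothetical project that planned the repository layers but not the platform.
planned = {
    "Collection policy",
    "Accession",
    "Content",
    "Descriptive metadata",
    "Preservation metadata",
}
print(unticked_layers(planned))
```

In this made-up case the unticked boxes are exactly the ones the post worries about: application, operating system, and hardware.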

6/22/07 11:40 am - jcdl post 2: digital curation and preservation

The first panel I went to was digital curation and preservation. My notes from these sessions are sparser.

how to choose a digital preservation strategy )

factors affecting website reconstruction from the web infrastructure )

defining what digital curators do and what they need to know, the DigCCurr project )

generating best-effort preservation metadata for web resources at time of dissemination )

6/13/06 01:56 pm - JCDL detailed panel notes

Just a quick note: I have a big girly crush on Brewster Kahle, and he's not even here.

Opening Plenary on getting books online )

Interoperability panel )

Jonathan Zittrain on privacy )

Joanne Kaczmarek on the RLG Audit checklist )

This isn't every presentation that I liked, but most of the others I enjoyed were displays of clever software products, hardware displays, or metadata tools (though I am fascinated by the project of "Exploring Erotics in Emily Dickinson’s Correspondence with Text Mining and Visual Interfaces") and I'm not sure how much there is to blog on them.

Oh, also, to my fellow presenters. If you are going to do a demo, get some capture software and make a video of yourself doing the demo. You're all either computer or library professionals, and should know better than to trust internet connections, computers, and A/V systems to work on demand. The demos that were pre-recorded went smoothly, and for many of the live demos we lost any real understanding of the software because you got hung up on the failing demo.