Project Overview

Long-term preservation of digital information is a critical challenge facing organizations such as libraries, archives, government and state agencies, and academic institutions. The problem spans three primary technologies: storage hardware and media, computer hardware, and system software and file formats. The threat to aging digital information now goes beyond unstable media and obsolete hardware: the most pressing problems confronting these organizations are software and data format obsolescence.

As technology evolves rapidly, the question arises of what to do with digital content that was created using old and now obsolete hardware and software. Unless action is taken now, there is no guarantee that archived content will be accessible and readable in future computing environments, in either the short or the long term. Such uncertainty leaves most digital information at high risk unless effective strategies are developed to manage the evolution of the underlying technology.

The digital format obsolescence problem is international and has attracted the attention of specialized groups in many countries, especially in the government and archiving communities. A number of international efforts are under way to develop such strategies, and they can be grouped roughly into three main approaches. The first consists of bringing various communities to agree on standardizing their digital content on a few common formats and structures, and on developing related open specifications. The second relies on techniques for migrating content from obsolete to current or emerging applications. The third is based on emulating obsolete platforms or applications on current ones.

There is currently no single best solution to format obsolescence. In this work, we develop a methodology for preserving reliable information about the various strategies used to handle formats. This approach revolves around an international effort to establish a format registry, called the Global Digital Format Registry (GDFR). Our work builds on this concept by developing an efficient, scalable, and secure prototype format registry that captures the essential features of GDFR and offers a flexible framework for incorporating advances achieved through the approaches mentioned above. Our prototype, called FOCUS (Format CUration Service), is platform independent and is built on top of proven web technologies. FOCUS leverages existing software tools for dealing with format obsolescence (e.g. identification, verification, and transformation) by integrating them into our digital format registry and making them available through a web service architecture.
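
To make the idea of a format registry more concrete, the following is a minimal in-memory sketch in Python. It is not the FOCUS implementation and does not follow the GDFR data model: the class names, the method names, the record fields, and the "example/..." identifiers are all assumptions made for illustration. The sketch only shows how a registry might pair format records with an identification step (matching well-known leading signature bytes) and with recorded migration targets.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormatRecord:
    """Hypothetical registry entry; not the GDFR or FOCUS data model."""
    format_id: str                      # illustrative identifier, e.g. "example/gif"
    name: str                           # human-readable format name
    signatures: list[bytes]             # leading "magic" bytes that mark the format
    migration_targets: list[str] = field(default_factory=list)


class FormatRegistry:
    """Toy registry: stores records and identifies data by signature."""

    def __init__(self) -> None:
        self._records: dict[str, FormatRecord] = {}

    def register(self, record: FormatRecord) -> None:
        # Later registrations with the same identifier replace earlier ones.
        self._records[record.format_id] = record

    def identify(self, data: bytes) -> Optional[FormatRecord]:
        # Return the first record whose signature prefixes the data, else None.
        for record in self._records.values():
            if any(data.startswith(sig) for sig in record.signatures):
                return record
        return None

    def migration_targets(self, format_id: str) -> list[str]:
        # Unknown identifiers yield an empty list rather than an error.
        record = self._records.get(format_id)
        return list(record.migration_targets) if record else []


registry = FormatRegistry()
registry.register(
    FormatRecord("example/gif", "GIF", [b"GIF87a", b"GIF89a"], ["example/png"])
)
registry.register(FormatRecord("example/png", "PNG", [b"\x89PNG\r\n\x1a\n"]))
registry.register(FormatRecord("example/pdf", "PDF", [b"%PDF-"]))

match = registry.identify(b"GIF89a\x01\x00\x01\x00")
if match is not None:
    print(match.name, registry.migration_targets(match.format_id))
```

A real registry would need far richer records (versions, specifications, provenance) and more robust identification than a prefix match. In an architecture like the one described above, operations of this kind would presumably be exposed as web services rather than called in-process.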

We believe that our approach is comprehensive and, more importantly, provides a scalable, robust, extensible, and secure architecture for dealing with the problem of format obsolescence in the long-term preservation of large-scale collections of digital objects.




© Copyright 2004, Institute for Advanced Computer Studies, University of Maryland. All rights reserved.