Friday, October 28, 2005

Preservation Readings 28 October 2005

Japanese holographic storage firm to ship 200GB drives in '06. Lucas Mearian. Computerworld. October 24, 2005.

Optware Corporation is planning to ship three versions of its product by the end of next year, with capacities of up to 200 GB. The company expects to release a 1 TB disk by the end of 2008. A holographic disk can store more information because data is stored inside the disk as well as on the surface. The cost of the disk is less than that of a hard drive. “Both Optware and InPhase are targeting their initial products at the data archival market because their holographic disk technology is removable and can be kept for decades without deterioration of data, which is stored within the disk and not on the surface.” Optware also plans to release a holographic disk product for streaming video, and a consumer disk about the size of a credit card that can hold 30 GB.

Companies hope to extend open-source movement to data storage. Brian Bergstein. Detroit News Technology. October 25, 2005.

IBM is leading a group of companies that hope to extend the open-source movement to data storage. Generally, each storage system has its own management software. The group hopes to develop new open-source data-management software that would allow data to be moved seamlessly within an organization.

Cheap DLT on the way. Martin MC Brown. Computerworld. October 17, 2005.

A 1.6TB SuperDLT tape is still under development but should appear by the end of this year. The gap between successive tape generations will increase.

Cheap DLT pitched to outpace DAT. Bryan Betts. The Register. 17 October 2005.

Hard drives have been increasing in size quite rapidly, but backup tapes have not kept pace, so more tapes are needed to back up a drive, which makes the process slower and more expensive. Quantum has announced a 320 GB tape (with compression).

Robots and sensors are in IT's future, Gartner says. Patrick Thibodeau. Computerworld. October 20, 2005.

Hewlett-Packard discussed "lights-out" or "humanless" data centers at a recent conference. The company believes that management technologies will lead to fully automated data centers in the years ahead. The Gartner research group has predicted that as many as half of all hands-on data center jobs may disappear over the next two decades because of automation. But the concept of a fully automated data center is still hard for some IT managers to accept.

Science and technology based institu[t]e in Chennai stress on digital libraries. Digital Opportunity Channel. October 24, 2005.

"Digital libraries are the only solution to the problem of pages disappearing from library books." The paper Preservation of Electronic Theses and Dissertations: A case study of SRM Institute of Science and Technology was presented at 8th International Symposium of Electronic Theses and Dissertations.

The full paper is at

Friday, October 21, 2005

Preservation Readings 21 October 2005

Urgent Action Needed to Preserve Scholarly Electronic Journals. Donald J. Waters. Association of Research Libraries. October 15, 2005

Digital preservation is a major challenge facing higher education. Yet organizations have been slow to invest in the infrastructure to maintain electronic journals and files over the long term. The industry is shifting to electronic resources, and print resources are being scaled back or canceled. Because journals are licensed rather than owned, no local copy is being retained. Four actions are essential:

1. Preservation of electronic journals must be seen as a kind of insurance, and not just as access.

2. Qualified preservation archives should provide a well-defined minimal set of services.

3. Libraries must invest in a qualified archiving solution.

4. Libraries must demand archival deposit by publishers as a condition of licensing electronic journals.

Publishers' archiving methods must be described publicly.

Gates cheers on computer museum. BBC News. 17 October 2005.

Bill Gates has pledged $15 million to the Computer History Museum in California. The museum displays the history of computing as well as its impact. The museum currently houses a collection of more than 4,000 artifacts, 10,000 images, 4,000 linear feet of catalogued documentation and many gigabytes of software. "It's our responsibility to collect the artefacts and stories today that will explain this incredible change to future generations."

Caring for your collections: Cylinder, Disc and Tape Care in a Nutshell. Library of Congress. 7 October 2005.

This is part of the Library of Congress preservation web site. It contains very good information on topics such as handling, storage, packaging, equipment, and supply sources. This is more for analog materials. It does have some information on tapes and cassettes. The second site gives information on the principles and specifications for preservation digital reformatting. Some of the principles include:

· Retain an analog version of digitally-reformatted items until you are confident that the life-cycle management of digital data will ensure access for as long as, or longer than, the analog version.

· Minimize handling of originals in the digital reformatting work to assure the best digital capture of an undamaged original, as well as the longevity of the original item

· Ensure that the digital master file will allow a broad range of future use

· Capture the highest quality digital image technically possible and economically feasible for large-scale production, while optimizing the potential for longevity

· Archive a digital master file that is free of, or minimizes, artifacts introduced by the reformatting process, whenever possible

· Employ standards and best practices for structural, administrative, and descriptive metadata that will optimize interoperability

· Document digital master file contents with MD5 checksums (or a similar tool) and use them to ensure the data integrity of master files through back-up and migration
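The last principle above, documenting master files with checksums and rechecking them after back-up or migration, can be sketched in a few lines of Python. This is only an illustrative sketch, not the Library of Congress's own tooling: the function names (`file_checksum`, `build_manifest`, `verify_manifest`) and the idea of keeping the checksums in a simple path-to-digest mapping are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, algorithm="md5"):
    """Map each file's path (relative to directory) to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): file_checksum(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest, algorithm="md5"):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(directory)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_checksum(target, algorithm) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch, a manifest would be built once when the master files are created, stored alongside them, and passed to `verify_manifest` after every copy or migration; an empty list means every file still matches. The `algorithm` parameter accepts any name known to `hashlib`, which covers the "or a similar tool" part of the principle.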

Video Format Identification Guide. Website. 2005.

This is a useful site to help archivists, librarians, curators and conservators identify the videotapes in their collections. The site has formats broken down by time period: 1956-1970; 1970-1985; 1985 to present. Each format type has an image and brief information about it, as well as an obsolescence designation: Extinct; Critically endangered; Endangered; Threatened; Vulnerable; or Lower risk. The site also contains an explanation of video terms.

Digital Preservation Topics in Google Groups. October 2005.

An interesting discussion of many topics dealing with audio preservation. It includes information ranging from software for recording records and cassettes, to equipment, to the format and media challenges of preserving CDs and other digital files.

Friday, October 14, 2005

Preservation Readings 14 October 2005

The Future of E-Mail Archiving. Jennifer LeClaire. TechNewsWorld. October 13, 2005.

Recent high-profile scandals illustrate the importance of email and the consequences of misuse. Recently 20% of employers have had email records subpoenaed, and 13% have fought lawsuits that were triggered by email. Email and other records are the “electronic equivalent of DNA evidence.” Email archiving is growing rapidly, and there is great demand for system administrators to address the volume of email sent and stored. Email archiving must consider policies and the decision points that turn into policy. Companies want to make decisions based on long-term objectives and how the policies will fit into the operational model, “which includes policies for backup, restoring, disaster recovery, business continuity, security, flexibility and scalability.” "Archiving is a new concept, [!] and its growth has been fueled by new technologies that assist IT users in implementing this valuable strategy." Vendors are looking to improve the intelligence of the archiving and retention functions, and to find ways to use the information effectively.

Ground Broken for New Church History Library. Press Release. 7 October 2005.

A new Church History Library is being built in downtown Salt Lake City by the Church of Jesus Christ of Latter-day Saints. The library will incorporate updated technology and will significantly increase archival storage capacity to preserve various types of materials, including print materials, manuscripts, photographs, microfilm, audiovisual items and others. They have consulted with international experts in records preservation and archival design to ensure it has the best lighting, humidity and temperature controls, as well as fire and seismic protection.

Archivists are already addressing the transition to handling digital materials. “Documents that are digitized and made available online are handled less frequently, extending the life of the original document. Creating digital documents isn't without challenges. Every 10 years advancing technology dictates that digitized documents be moved to a more current electronic medium.”

Holograph? Schmolograph... Larry Medina. Computerworld. October 4, 2005.

The article raises concerns about holographic storage and its permanence. So far there has been no information about the permanence of holographic storage. Is there any information about accelerated aging tests? The Norsam technology was long term and stable; it may be time to look at this technology again. The point is that there is no standard for these new technologies.

More Eggs in One Basket: Will Blu-ray and HD-DVD Be Archival? D.W. Leitner. Video Systems. Oct 13, 2005.

There has been a lot of interest in the longevity of CDs and DVDs and their suitability for archiving. How do HD DVD and Blu-ray fit into this? Both have higher density than DVDs. HD DVD uses the same construction as DVDs, with the data layer between polycarbonate layers. With Blu-ray, the data layer is on the disc surface closest to the laser, with only a 0.1 mm protective coating, which avoids reading through thicker layers that could cause optical distortion of the laser. With CDs, the data layer is near the surface on the side opposite the laser. Disc scratches would be a concern to archivists, but professional versions would have a cartridge for protection (which means two different Blu-ray drives). The higher data density of these discs is a concern because more is lost if one goes bad. And holographic discs have even higher density than Blu-ray.

Thursday, October 13, 2005

Ingest Guide for University Electronic Records

"One of the key challenges to preserving electronic records in a meaningful way is preserving the authenticity and integrity of records during their movement from a recordkeeping system to a preservation system. This Ingest Guide describes the actions needed for a trustworthy ingest process. This process enables an Archive and Producer to move records from a recordkeeping system to a preservation system in a manner that allows a presumption of authenticity."

ProQuest Creates Digital Archive of British Periodicals

"ProQuest Information and Learning will digitize nearly 6 million pages of British periodicals from the seventeenth, eighteenth, nineteenth and early twentieth centuries, creating direct access for humanities scholars to the breadth of texts that captured both daily life and landmark thought of the time."

The Future of E-Mail Archiving

"E-mail archiving as an industry is growing rapidly and it is interesting to examine the underlying trends driving the growth," Nick Mehta, senior director of product development for Symantec, told TechNewsWorld.

"With regard to storage, Geis said companies want to make technology decisions based on what is going to last for the long haul. It is not just a simple matter of technology, he said, but how it will fit into the company's operational model, which includes policies for backup, restoring, disaster recovery, business continuity, security, flexibility and scalability."

Researchers to develop China-only version of HD-DVD

SEPTEMBER 20, 2005 (IDG NEWS SERVICE) - BEIJING -- In a bid to cut costs for local electronics makers, Chinese researchers plan to develop a version of the next-generation HD-DVD optical disc format specifically for China that will include support for a locally developed video compression technology, called AVS (Audio Video Coding Standard), according to a researcher involved with the project.

Automated Video Tape Preservation

Library of Congress Selects Automated Videotape Preservation and Digitization System for Audio-Video Project

SAMMA to Migrate Library’s Audio-Visual Collection

Washington DC – October 12, 2005 – The Library of Congress has contracted to purchase the System for the Automated Migration of Media Archives, or SAMMA, to migrate its massive collection of audio-visual material in preparation for its move to the National Audio-Visual Conservation Center in Culpeper, VA. Over the next several years, the Library will use SAMMA to migrate and digitize many of the hundreds of thousands of recordings in its collection.

The Library realized that it would take many decades and be prohibitively expensive to migrate and digitize the audio-visual collections manually. To have the material available at the Culpeper facility when it opens in 2007, a more practical, cost-effective, and efficient method had to be found. In examining the alternatives, the Library concluded that Media Matters’ innovative migration automation system would provide the high quality necessary to preserve the recordings, while meeting the required cost and time restraints.

SAMMA combines robotic tape handling systems with proprietary tape cleaning and signal analysis technologies. SAMMA’s expert system automatically supervises quality control of each media item’s migration. From a thorough examination of the physical tape for damage, to real-time monitoring of video and audio signal parameters during migration, SAMMA ensures that magnetic media is migrated with the highest degree of confidence and the least amount of human intervention. SAMMA uses specially-designed components to gather technical metadata throughout the entire migration process, ensuring that the process is documented in depth while gathering important metrics about the health of an entire collection. The modular, portable system will be installed on-site at the Library and run 24/7. The final product will be a lossless compressed Motion JPEG 2000 digital file copy of each master tape at preservation quality, and the technical metadata describing the condition of the media item and the migration process.

Upon completion, the National Audio-Visual Conservation Center of the Library of Congress will be the first centralized facility in America especially planned and designed for the acquisition, cataloging, storage and preservation of the nation’s heritage collections of moving images and recorded sounds. It is expected to be the largest facility of its kind in the world. The NAVCC, funded by the Packard Humanities Institute and the U.S. Congress, will open fully in 2007.

About Media Matters LLC

Media Matters LLC has extensive expertise with magnetic media migration, and is dedicated to taking traditional migration strategies into the 21st century by researching, developing, and deploying cutting-edge digital media technology. Media Matters recently completed the development of hardware for creating real-time Motion JPEG 2000 files with synchronized uncompressed audio files. As the exclusive American partner in the EU’s PrestoSpace consortium, and through involvement in other international organizations, Media Matters is developing next-generation processes and standards for automated media migration.

For further information:

Contact Steve Kwartek at

Phone: 212-268-5528 X113

Monday, October 10, 2005

Digitizing Old Photos

Short but interesting article. Here is a good quote:

"Once you have your photographs digitized, make extra copies of those files on quality CDs and/or DVDs. Then store the extra copies in a safe deposit box, with family and/or with friends. Compared to the cost of losing those precious memories, the cost of digitizing them and making extra discs for storage is a small price to pay."

Click on the title above or use this link:

Friday, October 07, 2005

Preservation Readings 7 October 2005

New ISO standard will ensure long life for PDF documents. ISO Press Release. 7 October 2005. [Updated link, August 7, 2015.]
The archival PDF file format (PDF/A) has been approved as an ISO standard. The standard “enables organizations to archive documents electronically in a way that will ensure the preservation of content and visual appearance over an extended period of time. It also allows documents to be retrieved and rendered with a consistent and predictable result in the future, independent of the tools and systems used for creating, storing and rendering the files.” This will have a significant impact on the digital preservation community. It will allow documents to be delivered in a standard way for a long time. "PDF/A files will be more self-contained, self-describing, device-independent than generic PDF 1.4 files, and should allow information to be retained longer as PDF." It is estimated that over 9% of the surface web consists of PDF documents. The current standard is ISO 19005, Document management – Electronic document file format for long-term preservation – Part 1, Use of PDF 1.4 (PDF/A-1). Future updates will provide compatibility with additional changes to the PDF specification, but will still remain compatible with earlier standards and applications. AIIM and NPES The Association for Suppliers of Printing, Publishing and Converting Technologies have also issued an announcement.

Digital History: A Guide to Gathering, Preserving, And Presenting the Past on the Web. Daniel J. Cohen, Roy Rosenzweig. University of Pennsylvania Press. 2005.
This website has a free online version of the book. It looks at the qualities of digital media and networks that potentially allow us to do things better: capacity, accessibility, flexibility, diversity, manipulability, interactivity, and hypertextuality, as well as the hazards of quality, durability, readability, passivity, and inaccessibility. “One vision of the digital future involves the preservation of everything—the dream of the complete historical record. The current reality, however, is closer to the reverse of that—we are rapidly losing the digital present that is being created because no one has worked out a means of preserving it.”
One chapter specifically deals with digital preservation, with the fragility of digital materials, technical considerations, websites, selection of materials, and the future of digital materials. Future preservation should be a part of the planning of any digital project. Readers may now understand that digital preservation may require as much work or more than preserving paper. Any web project requiring a great deal of time to produce also needs a great deal of time to preserve. “It would be a shame to ‘print’ your website on the digital equivalent of the acidic paper.”
The Library of Congress estimates that possibly as much as 10% of its disc collection already contains serious data errors. “No acceptable methods exist today to preserve complex digital objects that contain combinations of text, data, images, audio, and video and that require specific software applications for reuse.”
Archivists who have studied the problem of constant technological change have realized that “the ultimate solution to digital preservation will come less from specific hardware and software than from methods and procedures related to the continual stewardship of these resources.” The book talks about various technologies and software, such as DSpace and Fedora. “Because digital copies are so cheap, it does not hurt to have copies of digital documents and images in a variety of formats; if you are lucky, one or more will be readable in the distant future.” Backing up files is not preservation. Preservation also involves dealing with technological change. Digitization is not preservation, because currently digital copies cannot be perfect copies of analog materials. But digitization may be the best solution in some cases. “Digital preservation is here to stay.” It is not the total answer, but it is another tool to use. “For now, you are the best preserver of your own materials.” Back up your work and create good documentation.

Microsoft says Office beta coming in November. Ina Fried. CNET. October 3, 2005.
Microsoft to support PDF in Office 12. Martin LaMonica. CNET. October 3, 2005.
Microsoft has been under pressure to provide open formats for Office. It has announced that the next version of Office (version 12, due in the second half of 2006) will provide support for the PDF format: it will let users convert an Office document to PDF, but PDF files will not be readable within Office applications. The Microsoft XML-based document format will be the default setting. Office 12 does not support OpenDocument. Windows Vista will have a format, called Metro, that will offer features similar to PDF. Microsoft has said that it has been getting 120,000 requests a month for PDF support. Office currently supports the RTF and HTML formats.

Seagate exec: Hard disks anything but obsolete. Martyn Williams. Computerworld. October 5, 2005.
“Hard disk drive technology is anything but dead and isn't in danger of being replaced by memory chips anytime soon” said a Seagate executive in response to a Samsung announcement. This may be more of a reflection of the battle for the storage market.

A new way to stop digital decay. The Economist. September 15, 2005.
The digital documents of today face a serious threat: the threat of disappearing. Even simple files may not be readable in the future if the software or hardware needed to read them is obsolete. One strategy is to migrate copies to new hardware and software, but that may be difficult and may introduce problems of its own. The National Library of the Netherlands is exploring the possibility of a Universal Virtual Computer that is being developed by IBM. It will have the ability to run programs that can read different file formats. In the future, libraries will have to write software that emulates the virtual computer on each new generation of computer systems. But when that is done, the programs will be able to read the documents using decoding programs that can be written and tested today, while the format is still readable. Decoding programs have been written for JPEG and GIF, and the PDF format will be added.

Descriptive metadata for copyright status. Karen Coyle. First Monday. 3 October 2005.
One of the main characteristics of digital materials is that they can be reproduced easily. In a networked world, this has caused a near crisis in terms of intellectual property rights. Two approaches to resolving the problem have been to 1) change the copyright law, and 2) protect the digital format. This paper tries to define the metadata needed to provide the copyright information that determines the use of the item. The metadata must be able to capture copyright status and to assert what copyright information is unknown. It must also be able to provide contact information for those who need more information. Currently there is little copyright information in a MARC record. Besides typical information, the metadata needs elements to show the copyright information taken from the piece only, and any additional research undertaken to determine the copyright if it is unknown. Adding copyright information is a burden for those who create the metadata; the lack of information, though, creates an even larger burden for those who would like to use the material. “Copyright–related metadata, therefore, should be seen as an essential component of the resource description.” This should be kept with the work itself.
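To make the requirements in this summary concrete, here is a minimal sketch of what such a copyright record might look like as a data structure. It is purely illustrative: the class name, field names, and status values are hypothetical and are not the element set proposed in Coyle's paper or any MARC field.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CopyrightStatus(Enum):
    # Hypothetical status values; "unknown" is explicit so it can be asserted.
    COPYRIGHTED = "copyrighted"
    PUBLIC_DOMAIN = "public domain"
    UNKNOWN = "unknown"


@dataclass
class CopyrightMetadata:
    """Hypothetical copyright record kept alongside a digital resource."""
    status: CopyrightStatus = CopyrightStatus.UNKNOWN
    # Copyright information transcribed from the piece itself, if any.
    statement_on_piece: Optional[str] = None
    # Notes on additional research undertaken to determine the status.
    research_notes: List[str] = field(default_factory=list)
    # Who to contact for more information or permission.
    contact: Optional[str] = None

    def needs_research(self) -> bool:
        """True when the status is unknown and no research has been recorded."""
        return self.status is CopyrightStatus.UNKNOWN and not self.research_notes


record = CopyrightMetadata(statement_on_piece="(c) 1998 Example Press")
print(record.needs_research())
```

The design choice worth noting is that "unknown" is a recorded value rather than an absent field, which is what lets a record state that the copyright status has not been determined.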

Yahoo Works With 2 Academic Libraries and Other Archives on Project to Digitize Collections. Scott Carlson, Jeffrey Young. The Chronicle of Higher Education. October 3, 2005.
Yahoo will be working with a number of partners to digitize millions of volumes. These include the University of California, the University of Toronto, the Internet Archive, Adobe, the European Archive, the National Archives of England, O'Reilly Media, and Hewlett Packard Labs. The project will not include copyrighted books unless the partners have permission. The texts will be available to be searched by other search engines as well as Yahoo. The project is modeled on open source software projects. The Internet Archive has been working on a pilot project with the University of Toronto for about a year. So far, about 2,000 books have been scanned.