Friday, March 20, 2009

Digital Preservation Matters - 20 March 2009

International Data curation Education Action (IDEA) Working Group: A Report from the Second Workshop of the IDEA. Carolyn Hank, Joy Davidson. D-Lib Magazine. March 2009.
This is a report of the workshops held in December, with links to programs and resources. In general, the article acknowledges that curation of digital assets is a central challenge and opportunity for libraries and other data organizations. Meeting this challenge requires skilled professionals who are trained “to perform, manage, and respond to a range of procedures, processes and challenges across the life-cycle of digital objects.” The presentations discuss developing a graduate-level curriculum to prepare master's students to work in the field of digital curation. Topics covered by the curricula at the institutions include preparing faculty to research and teach in the field, data collection and management, knowledge representation, digital preservation and archiving, data standards, and policy. Collaboration between schools is important since they all recognize that no school can do it all. One item of particular note: The skills, role and career structure of data scientists and curators: An assessment of current practice and future needs.


Report on the 2nd Ibero-American Conference on Electronic Publishing in the Context of Scholarly Communication (CIPECC 2008). Ana Alice Baptista. D-Lib Magazine. March 2009.
Some notes from this article:
  • IR (institutional repository) initiatives occur mostly in public universities
  • the main motivation for implementing an IR is to answer specific demands and needs to digitally store the institution's scientific memory, rather than to support Open Access principles.
  • 40% of the analyzed IRs are maintained and coordinated by two or more sectors within each university
  • the databases with more than 3,000 documents are, in practice, OPACs with links to the full text versions.
  • the next step forward: provide new metrics on the impact factor (Scientometrics)

Items in this newsletter include:
  • CBS program on “Bye, Tech: Dealing with Data Rot.” Looks at obsolescence of computer hardware, software, and formats. “So the basic lesson is: Look after your own data and make sure that you take steps to keep it moving onto new formats about once every ten years.” There are links where you can both read and watch the program. Their conclusions:
1. You should convert whatever you can afford to digital.
2. Store your tapes and films in a cool, dry place.
3. And above all, remain vigilant. As you now know, every ten years or so, you're going to have to transfer all your important memories to whatever format is current at the time, because there never has been, and there never will be, a recording format that lasts forever.
  • Federal Agencies Collaborate on Digitization Guidelines. A working group is developing best practices for digitizing recorded sound and moving images.


Got Data? A Guide to Data Preservation in the Information Age. (Updated link, August 2015.) Francine Berman. Communications of the ACM. December 2008.
Digital data is fragile, even though we all assume it will be there when we want it. “The management, organization, access, and preservation of digital data is arguably a "grand challenge" of the information age.” This article looks at the key trends and issues with preservation:
  1. More digital data is being created than there is storage to host it.
  2. Increasingly more policies and regulations require the access, stewardship, and/or preservation of digital data.
  3. Storage costs for digital data are decreasing (but costs in other areas are increasing).
  4. Increasing commercialization of digital data storage and services.
These four trends point to the need to take a comprehensive and coordinated approach to data cyberinfrastructure. The greatest challenge in this is to develop an economically sustainable model. One approach is to use a data pyramid to represent the stewardship options. This shows that multiple solutions for sustainable digital preservation must be devised. There is also a need for ongoing research into and development of solutions that address these technical challenges as well as the economic and social aspects of digital preservation. The author adds 10 guidelines:
Top 10 Guidelines for Data Stewardship
1. Make a plan.
2. Be aware of data costs and include them in your overall IT budget.
3. Associate metadata with your data.
4. Make multiple copies of valuable data. Store some off-site and in different systems.
5. Plan ahead of time for the transition of digital data to new storage media, and for the cost of that transition.
6. Plan for transitions in data stewardship.
7. Determine the level of "trust" required when choosing how to archive data.
8. Tailor plans for preservation and access to the expected use.
9. Pay attention to security and the integrity of your data.
10. Know the regulations.
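
Guidelines 4 and 9 (multiple copies, data integrity) can be illustrated with a simple fixity check. The following is only a minimal sketch, not something taken from the article: it assumes the copies are ordinary files reachable by path, it picks SHA-256 as the checksum, and the function names (sha256_of, check_copies, copies_agree) are hypothetical names chosen for this illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_copies(copies):
    """Group the given copy paths by digest.

    A single group means every copy has identical content; more than one
    group means at least one copy differs and should be investigated.
    """
    groups = {}
    for path in copies:
        groups.setdefault(sha256_of(path), []).append(str(path))
    return groups


def copies_agree(copies):
    """Return True when all copies share the same digest."""
    return len(check_copies(copies)) == 1
```

Running copies_agree on the on-site and off-site copies of a file at regular intervals is one way to notice silent corruption early. Note that a mismatch only shows that the copies differ; deciding which copy is still good requires a digest recorded when the file was first stored.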

The Library of Congress has been moving into the digital world, and one way is through a scanning project with the Internet Archive that has put 25,000 books online to date. "To preserve book knowledge and book culture means preserving every word of every sentence in the right sequence of pages in the right edition, within the appropriate historical, scholarly and bibliographical context. You must respect what you scan and treat it as an organic whole, not just raw bits of slapdash data." A lot of items that have literally not seen the light of day are being downloaded. The cost is just 10 cents a page.
