Saturday, March 30, 2013

'Oh, you wanted us to preserve that?!' Statements of Preservation Intent for the National Library of Australia's Digital Collections

'Oh, you wanted us to preserve that?!' Statements of Preservation Intent for the National Library of Australia's Digital Collections. Colin Webb, David Pearson, Paul Koerbin. National Library of Australia. D-Lib Magazine. January/February 2013.
Clarifying preservation intentions is likely to be a good starting point for preservation planning for diverse digital collections. This applies both in terms of identifying what needs to be kept and what does not warrant the use of limited preservation resources, and in terms of opening up conversations about what is required in order to achieve preservation intentions. This paper describes an approach being explored by the National Library of Australia to negotiate formal and reviewable statements of 'preservation intent' for each of the digital collections in its care with those responsible for those collections. The paper looks at the relationship with the widely discussed concept of 'significant properties', as well as the other benefits that the approach is delivering. The paper also looks at the preservation intent statements for archived web collections at the NLA as an illustrative case.

The approach described in this paper is based on a conviction that methods and solutions in digital preservation do not exist in a policy vacuum. Rather, they are only meaningfully discussed as solutions to problems that threaten or frustrate the organisation's explicit access intentions.
This approach still refers to significant properties as a critical part of the preservation planning process. We believe it puts the full identification and evaluation of significant properties into a more useful context: that is, being later in the decision-making process.
At the very least, we aim to be in a position to know whether preservation action will be needed and whether a bit-level preservation copy will be good enough when access to content is lost.
As with most preservation processes, evaluating and articulating preservation intentions is likely to be an ongoing process requiring proactive management and periodic review. Our current conception of it still requires further development but it has given the NLA a useful start in real preservation planning.

Without some kind of understanding between curators and preservers, we are doomed to recurring nasty surprises based on mismatched expectations.

Thursday, March 28, 2013

Challenges of Dumping/Imaging old IDE Disks

Challenges of Dumping/Imaging old IDE Disks. Dirk von Suchodoletz. Open Planets.
Full system preservation through imaging, like processing in digital forensics, depends on reliable hardware-software stacks for identical system disk migrations. A number of pitfalls can prevent the original components from being copied authentically to an image file. The article discusses issues with disk recognition, reading and verification. The tool of choice to produce identical copies of block devices in Linux/Unix systems is dd.
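The read-and-verify step mentioned above can be sketched roughly as follows. This is not the article's procedure and not a replacement for dd; it is a minimal Python stand-in that copies a source block by block and checks the image against a SHA-256 digest. The function names, the block size, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 512 * 1024  # assumed read size; not taken from the article


def image_device(source_path, image_path, block_size=BLOCK_SIZE):
    """Copy source to image block by block; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of device or file
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def file_sha256(path, block_size=BLOCK_SIZE):
    """Re-read a file from disk and return its SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_image(source_digest, image_path):
    """Return True if the written image matches the digest taken while reading."""
    return source_digest == file_sha256(image_path)
```

In practice the source would be a block device path on the imaging machine (reading it usually requires elevated privileges), and a failed verification would point to the kind of reading problems the article describes.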

The Copyright Rule We Need to Repeal If We Want to Preserve Our Cultural Heritage

The Copyright Rule We Need to Repeal If We Want to Preserve Our Cultural Heritage. Benj Edwards. The Atlantic. Mar 15 2013.
The anti-circumvention section (1201) of the Digital Millennium Copyright Act needs to be repealed. The law not only threatens consumer control over the electronic devices we buy; if the DMCA remains unaltered, cultural scholarship will soon be conducted only at the behest of corporations, and public libraries may disappear entirely. The DMCA prevents sharing information, and sharing is vital to preserving information. To properly preserve digital works, libraries and archives must be able to copy and media-shift them without fear of legal problems. The provisions in the DMCA are unacceptable and must change, or as a society we must be willing to say goodbye to libraries and the concept of universal public access to knowledge.

Training in Digital Preservation - Alliance for Permanent Access

Training in Digital Preservation - Alliance for Permanent Access. William Kilbride, Chiara Cirinnà, Sharon McMeekin. 21 February 2013. [PDF]
This paper summarizes the current digital preservation needs based on the APARSEN project. The need for training is great and the resources available are relatively meagre, so there is an opportunity to collaborate in order to maximise impact. In the training courses that have taken place, four themes were consistently expressed in feedback:

  1. There is a great demand for training from staff already engaged in library and archive settings, especially for introductory material.
  2. Audiences welcomed practical, case-study based training that matched their needs over theoretical knowledge. Tools and services beyond their level of knowledge or which lacked practical application were also less popular.
  3. The audiences wanted practical interaction with preservation processes, including trying out the tools for themselves.
  4. Audiences did not feel the need to have a complete overview of preservation before they got started, and were less interested in theory, which they saw as a hindrance.
However, training can be popular and still leave significant gaps, so it should not be based on feedback alone. The report gives many recommendations for training for Operational Staff, Operational Managers, and Senior Managers regarding standards, object life-cycles, practical experience, legal and policy frameworks, ingest, provenance, metadata, financial planning, user communities, and succession planning. They point to a very large unmet training need and a long list of topics which training providers can actually provide.

By developing training that meets proven needs we can provide a strong foundation to an ever larger and ever more diverse community.


Wednesday, March 27, 2013

The cost of doing nothing

The cost of doing nothing. Barbara Sierman. Digital Preservation Seeds. March 11, 2013.

There has been much discussion of the fact that over the years the digital preservation community has created more than a dozen cost models, which may make the confusion in digital preservation even bigger. Maybe this is part of the way things are going: everyone sees their own situation as something special with special needs.

The solution? Adapting an existing model or developing a new one. We can expect help from the recently started European project 4C, “The Collaboration to Clarify the Costs of Curation”. In their introduction they state that “4C reminds us that the point of this investment [in digital preservation] is to realise a benefit”. That means less emphasis on the complexity of digital preservation, and more on the benefits. If we have better figures on the benefits of preserving digital material, we are in a better position to estimate what it will cost us if digital materials are not preserved.

Libraries, Hackspaces and E-waste: how libraries can be the hub of a young maker revolution

Libraries, Hackspaces and E-waste: how libraries can be the hub of a young maker revolution. Cory Doctorow. Raincoast Books.  February 24, 2013.
The problem with people who say "What do we need libraries for? We've got the Internet now!" is that they have confused a library with a book depository. "Now, those are useful, too, but a library isn't just (or even necessarily) a place where you go to get books for free." Libraries have always been places where skilled information professionals have helped people understand the world.

Librarians have been selecting credible books, cataloging and shelving them, and then assisting patrons in understanding how to synthesize the material in them. Libraries have been hubs where the curious, the entrepreneurs, the scholarly and others could gather in the company of one another, surrounded by untold information-wealth and by professionals who could lend technical assistance where needed. "All these people were using the library as a place, a resource, and a community. Because that's what libraries are. And we've never needed that more than we need it today." Now we're *drowning* in information. We live in a "publish, then select" world: everyone can reach everything, all the time, and the job of experts is to collect and annotate that material, to help others navigate its worth and truthfulness. Society has never needed its librarians, and its libraries, more. The major life-skill of the information age is information literacy, and no one's better at that than librarians. It's what they train for. It's what they live for.

But there's another group of information-literate people out there who are a natural ally of libraries and librarians: the makers who build physical stuff. "They make robots, flying drones, 3D printers (and 3D printed stuff), jewelry, tools, printing presses, clothes,... Today's tinkerers work in vast, distributed communities where information sharing is the norm, where the ethics and practices of the free/open source software movement has gone physical." We need to master computers to master the systems of information, so that we can master information itself. Why not take surplused computers and components and make libraries "book-lined, computer-filled information-dojos where communities come together to teach each other black-belt information literacy, where initiates work alongside noviates to show them how to master the tools of the networked age from the bare metal up"?
"Only through understanding the tools of information can we master them, and only by mastering them can we use them to make our lives better, rather than destroying them."


Supporting the Changing Research Practices of Chemists.

Supporting the Changing Research Practices of Chemists.  February 25, 2013. Matthew P. Long, Roger C. Schonfeld. Ithaka S+R. February 26, 2013. [PDF]

This report, intended for those who support chemists, including librarians, is about the latest research methods, practices, and information services needs of academic chemists. Chemists need services to make their lives easier and their research groups more productive; this includes minimizing paperwork and administrative tasks. They value academic libraries primarily for the access that they provide to electronic journals and other online resources. Researchers are often frustrated by an inability to share large amounts of data with a collaborator. Few chemists visit the physical library, but they use the library digital collections heavily.

In the survey, fewer than 10% reported a research consultation with a librarian, asked for help with data management, or asked for assistance on an issue related to publishing in the past year; they rarely reach out to the library to discuss issues or request support. The main search sites for chemists are Web of Knowledge/Web of Science, SciFinder, and PubMed. It would be helpful to have tools that help them process all of this information, pre-scan announcements from journals, and organize their materials. Electronic Lab Notebooks (ELNs) make it easy to share, archive, and search through past lab notes, but are at risk in the lab. Labs generally do not have good data management infrastructure or proper external support for developing it, especially in sharing and preserving files.

It is difficult for academic chemists to coordinate the recording and preservation of data after the completion of a project. When data are saved, they are often held in unstable or at-risk formats or in formats where no one else can access or interpret them. Sometimes a large amount of potentially useful data is not shared or preserved in any durable way. One chemist invited the library to come and speak to the department about preservation and access. Chemists have a general lack of awareness of effective data curation and preservation. Data management and preservation is time-consuming and rarely straightforward; it requires expert advice and constant monitoring.

The findings:
  1. Chemists need better support in data management, sharing and preservation.
  2. Many researchers remain anxious about keeping up with the newest literature.
  3. They need new tools to stay aware of new research and to support serendipitous discovery.
  4. Chemists require greater support in disseminating their research, including articles, data, and other materials.
Other areas of concern for academic chemists: laboratory management, gaining access to industrial funding, and teaching support.
We see some real potential for the academic library to stretch the definition of the services it offers to the academic chemist. The library may also have a role in working with other service providers and ensuring that academics are aware of the latest research tools. It is clear from this project that libraries must think strategically about whether and how to invest in services for chemists.

Saturday, March 23, 2013

Digital Curation Bibliography

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, 2012 Supplement. Charles W. Bailey, Jr. March 2013.

The 2012 Supplement presents over 130 English-language articles, books, and technical reports published in 2012 about digital curation and preservation, copyright issues, digital formats (e.g., media, e-journals, research data), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns.
It is a supplement to the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, which covers over 650 works published from 2000 through 2011.

Adding Value to Electronic Theses and Dissertations in Institutional Repositories

Adding Value to Electronic Theses and Dissertations in Institutional Repositories. Joachim Schöpfel. D-Lib Magazine. March/April 2013.

This paper looks at how institutional repositories that contain electronic theses and dissertations (ETDs) differ, particularly regarding metadata, policy, access restrictions, representativeness, file format, status, quality and related services. The intent is to improve the "quality of content and service provision in an open environment, in order to increase impact, traffic and usage". This paper shows five ways in which institutions can add value to the deposit and dissemination of electronic theses and dissertations:
  1. Quality of content. A good IR not only defines a set of standards and criteria for the selection and validation of deposits but also communicates and promotes this editorial policy.
  2. Metadata. The description of the content and context of the ETD files will make a difference. 
  3. Format. The IR should contain full text, offer different file formats, and have deposit formats that are searchable, open, and appropriate for long-term preservation and use of the content.
  4. Repositories should network and interconnect.
  5. Provide needed services beyond basic searching, viewing, and downloading. Some possibilities are discussion forums, usage statistics and metrics, citations, Print On Demand in book format, copyright protection or Creative Commons licensing, and preservation. 
Institutional Repositories must also be future-oriented and anticipate future transformation of scientific communication. "It is crucial for the success of a repository that the institution clearly defines its objectives in line with its scientific strategy and environment."

Score Model

Score Model. Website. Digitaal Erfgoed Nederland, PACKED. 2013.

This score model guides you through the risks and threats to digital materials. It has a series of questions that create a report that points out the strong and weaker points of your digital organization. Where possible, the report provides recommendations to minimise the risks.

The tool is intended especially for small and medium heritage organizations, but anyone who is managing a digital collection and has a concern about its sustainability can use the Score model.

These risks are grouped in seven clusters:
  1. Organisation and policy: does the preservation of digital files fit the structure and policy of your organisation?
  2. Preservation strategy: is it correctly recorded what is being preserved, for whom and in what manner?
  3. Expertise and organisation: is the right expertise present in your institution and is it put to good use?
  4. Storage management: is the physical storage of the digital files also reliable?
  5. Ingest: are the right measures taken whenever a digital object is ingested into your storage system?
  6. Planning and control: is the management well prepared? Are all actions retraceable?
  7. Access: is access to the digital files properly regulated?

How big is the sound of music?

How big is the sound of music?  Lucas Mearian. Computerworld. March 21, 2013. 

Recently audiophiles and musicians have been moving toward master-quality music that's playable from a hard drive. That has led to greater use of lossless file formats. A popular file format, or codec, is MP3. It is a compressed format referred to as "lossy," meaning data is lost in the translation from the original master to the compressed format. For a CD, analog audio is recorded by sampling it 44,100 times per second, and the samples are then used to reconstruct the audio signal on playback. An uncompressed file on a CD, for example, uses a 44.1 kHz sampling rate, or 1,411 kbits of data per second (kbps), while a compressed file may only offer a bit rate of up to 256 kbps, which results in lower quality. Within file formats, there are many sampling rates or frequencies; the higher the sampling rate, the higher the sound quality.
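The 1,411 kbps figure can be reproduced with a little arithmetic. The sketch below assumes 16 bits per sample and two (stereo) channels, which are the usual CD audio parameters but are not stated in the summary above; the function name is made up for illustration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed CD parameters: 16-bit samples, 2 channels (not given in the text above).
cd_bps = pcm_bit_rate(sample_rate_hz=44_100, bits_per_sample=16, channels=2)
cd_kbps = cd_bps / 1000          # about 1,411 kbps, matching the figure quoted
mp3_kbps = 256                   # upper MP3 bit rate mentioned in the text
ratio = cd_kbps / mp3_kbps       # how many times more data per second the CD carries

print(f"CD audio: {cd_kbps:.1f} kbps, about {ratio:.1f}x a {mp3_kbps} kbps MP3")
```

Under these assumptions the CD stream carries roughly five and a half times as much data per second as a 256 kbps MP3.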

Uncompressed files are often stored as WAV files (Waveform Audio File Format), which can typically be 10 to 40 times larger than MP3 files. Other lossless formats include FLAC (Free Lossless Audio Codec), AIFF (Audio Interchange File Format) and Apple's ALAC (Apple Lossless Audio Codec). These lossless file formats require less storage space than WAV files. These new lossless formats can save disk space and offer high-fidelity music playback. For example, an album in the WAV format may take up 640 MB of space; the same album would take up about 300 MB in the lossless FLAC format. A lossy MP3 file would take up about 60 MB. It is estimated that CDs only offer about 15% of the data that was in a master sound track; when that CD is further compressed into a lossy MP3, even more depth and quality of a recording is lost.

A relatively new high-definition file format, called Direct-Stream Digital (DSD), was created by Sony and Philips. DSD uses a sample rate of 2.8224 MHz, or 64 times that of a CD's 44.1 kHz.
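As a quick check on the DSD figure, multiplying the CD sampling rate by 64 gives the 2.8224 MHz rate. The constant names in this small sketch are illustrative only.

```python
CD_SAMPLE_RATE_HZ = 44_100   # CD sampling rate
DSD_MULTIPLIER = 64          # DSD rate as a multiple of the CD rate

dsd_sample_rate_hz = DSD_MULTIPLIER * CD_SAMPLE_RATE_HZ
dsd_sample_rate_mhz = dsd_sample_rate_hz / 1_000_000

print(f"DSD sample rate: {dsd_sample_rate_hz:,} Hz ({dsd_sample_rate_mhz} MHz)")
```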

Monday, March 18, 2013

Establishing a Digital Preservation Policy

Establishing a Digital Preservation Policy. JISC. March 2013.
Rapid advances in technology can lead to digital collections becoming obsolete very quickly and a digital preservation policy is a crucial part of managing this risk. Digital preservation can be a costly process and will need continual attention well after all materials have been digitised and ingested into a collection. The digital preservation policy should highlight an organisation's ongoing commitment to digitally preserving valuable collections.

Digital preservation and digitization, though related, are distinct activities. The preservation of digital resources continues long after a digitization project has been completed. Digital preservation is not a time-limited process.

A preservation policy should be directly connected to the aims and goals of the institution. Clearly establishing the benefits of a digital preservation strategy at an early stage will allow these benefits to be measured and show the need for commitment by the institution. Implementing a preservation policy may only be possible by first raising awareness of the benefits of digital preservation and the potential dangers of ignoring it.

Strong policies should also be inclusive and cross-departmental. Creating a policy at an early stage may provide a basic digital preservation policy, which can then be developed as required. Tying in high-level policy documents can be especially beneficial when quantifying the benefits of preservation.

When an institution is digitizing content, it is important that a digital preservation policy is implemented as soon as possible. A best practice is to have a preservation strategy in place before any content is digitized so that standards are followed. However, a phased introduction may be necessary, perhaps beginning with the needs of the digitization project and evolving to embrace the needs of the institution.

A digital preservation policy should include:
  • An explanation of how the policy relates to other organisational goals, objectives and mission statements. This section should also quantify the benefits of a sustainable digital collection.
  • How the digital preservation policy sits alongside other institutional policies, such as records management, IT or digitisation work. It should also highlight the use of agreed upon and interoperable standards.
  • The objectives of preservation activities. This section should outline how activities mentioned in the principle statement will be undertaken and by whom. Will preservation actions be carried out in-house or outsourced? For how long will materials be preserved?
  • Detail of just how digital preservation will be implemented. Which department will undertake what activities and when? Objectives of the policy should be spelled out in practical terms.
  • The scope of preservation activities should also be made clear. What will be preserved? Will you undertake to store ‘archival masters' only or multiple versions of a file? In detailed policies, preferred file formats should also be listed.
  • Accountability. Who, ultimately, will be responsible for digital preservation within an organisation? How will the organisation fund staff training, equipment, outsourcing, and storage? Who will be responsible for future changes to the digitisation strategy? Signing-off an agreed policy could help its long-term prospects.
  • Glossary: Anyone unfamiliar with digital preservation may require a detailed glossary.
  • Version Control: Date of policy. Its status and review date should also be included.
A preservation policy must be aware of ongoing digital preservation costs and that the costs will vary by collection. 

The sooner the issues associated with digital preservation are addressed, the easier it will be to develop hands-on preservation procedures to ensure preservation objectives are met. A digital preservation policy is required for digitization projects, but more so for the long-term management and maintenance of digital collections.