Monday, October 31, 2016

MIT task force releases preliminary “Future of Libraries” report

MIT task force releases preliminary “Future of Libraries” report. Peter Dizikes. MIT News Office. October 24, 2016.
    An MIT task force released a preliminary report about making MIT’s library system an “open global platform” enabling the “discovery, use, and stewardship of information and knowledge” for future generations. It contains general recommendations to develop “a global library for a global university,” yet strengthen the library’s relationship with the local academic community and public sphere.  “For the MIT Libraries, the better world we seek is one in which there is abundant, equitable, meaningful access to knowledge and to the products of the full life cycle of research. Enduring global access to knowledge requires sustainable models for ensuring that past and present knowledge are available long into the future.”

The MIT task force arranged ideas into four “pillars”:
  1. Community and Relationships: interactions with local and global users
  2. Discovery and Use: the provision of information
  3. Stewardship and Sustainability: management and protection of scholarly resources
  4. Research and Development: library practices and needs
The report suggests a flexible approach that simultaneously serves students, faculty, staff, alumni, cooperating scholars, and the local and global scholarly community. It recommends studying changes to library spaces that would allow quiet study as well as new types of instruction and collaboration. The library system needs to enhance its ability to disseminate MIT research, provide better digital access to content, and create open platforms for sharing and preserving knowledge. The report encourages the institution to help find solutions for the “preservation of digital research,” which it calls a “major unsolved problem.”

The report advocates finding the right balance between analog and digital resources, since “the materiality of certain physical resources continues to matter for many kinds of research and learning.” The task force sees this as a high priority.

The Future of Libraries site has a link to the full PDF report.

Copyright is Not Inevitable, Divine, or Natural Right

Copyright is Not Inevitable, Divine, or Natural Right. Kenneth Sawdon. ALA Intellectual Freedom Blog. October 19, 2016.
     A copyright lawsuit decided in India allows universities to create unlicensed coursepacks and allows students to photocopy portions of textbooks used in their classes. The Court dismissed the case brought by publishers and "held that coursepacks and photocopies of chapters from textbooks are not infringing copyright, whether created by the university or a third-party contractor, and do not require a license or permission". Unlicensed custom coursepacks are not covered under fair use in the U.S., but they are in India.

The ruling included this quote about what copyright is:
"Copyright, specially in literary works, is thus not an inevitable, divine, or natural right that confers on authors the absolute ownership of their creations. It is designed rather to stimulate activity and progress in the arts for the intellectual enrichment of the public. Copyright is intended to increase and not to impede the harvest of knowledge. It is intended to motivate the creative activity of authors and inventors in order to benefit the public."
This ruling doesn’t suggest that everything is fair game, only that the use of textbook excerpts in India is fair use. "Stopping a university or third-party from providing coursepacks or textbook excerpts merely prevents the students from getting the most convenient source for information that they are free to use." The Court held that when texts are used for imparting education rather than commercial sale, they do not infringe the publishers’ copyright. In the United States, by contrast, the fair use defense for coursepacks failed a legal challenge.

Saturday, October 29, 2016

Beta Wayback Machine – Now with Site Search!

Beta Wayback Machine – Now with Site Search! Vinay Goel. Internet Archive Blogs. October 24, 2016.
     The Wayback Machine has provided access to the Internet Archive's archived websites for 15 years. Previously, the URL was the main means of access. A new beta keyword search returns a list of relevant archived websites with additional information.

Friday, October 28, 2016

A Method for Acquisition and Preservation of Emails

A Method for Acquisition and Preservation of Emails. Claus Jensen, Christen Hedegaard. iPres 2016. (Proceedings p. 72-6/ PDF p. 37-39).
     The paper describes new methods for the acquisition of emails from a broad range of sources not directly connected with the responsible organization, as well as for ingesting them into the repository. Some of the requirements:

Non-technical requirements
  • Maximum emulation of traditional paper-based archiving criteria, procedures
  • High level of security against loss, degradation, falsification, and unauthorized access
  • A library record should exist, even if documents are not publicly available 
  • Simple procedure for giving access to third-party by donor
  • Maximum degree of auto-archiving
  • Minimum degree of curator involvement after Agreement
Technical-oriented requirements
  • No new software programs for the donor to learn
  • No need for installation of software on the donor’s machine
  • As much control over the complete system  as possible
  • Automated workflows as much as possible
  • Independence from security restrictions on the donor system imposed by others 
New requirements for the second prototype
  • The system should be based on standard email components
  • Easy to use for both curator and donors
  • Donors’ self-deposit
  • System based on voluntary/transparent deposit 
  • It should be independent of technical platforms  
  • Donor ability to transfer emails to the deposit area at any time
  • Donor should always have access to donated emails
  • Varying levels of access for external use 
  • Donors must be able to organize and reorganize emails.
  • Donors must be allowed to delete saved emails within a certain time-frame
  • The original email header metadata must be preserved
  • The donors must be able to deposit other digital content besides emails
The Royal Library created two areas for each donor: the deposit area and the donation area. The repository supports linked data and uses RDF in its data model to create relations between the objects. By ingesting the different email representations, the system is able to perform file characterization on the email container files, individual emails, and attachments.

"The email project is still active, and there is still time to explore alternative or supplementing methods for the acquisition of emails. Also the task of finding good ways of disseminating the email collections has not yet begun."
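
The requirements above (preserve the original header metadata, characterize container files, individual emails, and attachments) can be sketched with Python's standard `mailbox` and `email` modules. This is an illustrative sketch, not the Royal Library's actual implementation; the record layout is my own assumption.

```python
import hashlib
import mailbox

def characterize_mbox(mbox_path):
    """Walk an mbox container, keeping each message's original header
    metadata verbatim and characterizing every attachment found."""
    records = []
    for msg in mailbox.mbox(mbox_path):
        record = {
            # preserve the original email header metadata (a stated requirement)
            "headers": dict(msg.items()),
            "attachments": [],
        }
        for part in msg.walk():
            filename = part.get_filename()
            if filename:  # only parts that carry an attached file
                payload = part.get_payload(decode=True) or b""
                record["attachments"].append({
                    "filename": filename,
                    "content_type": part.get_content_type(),
                    "size": len(payload),
                    "sha256": hashlib.sha256(payload).hexdigest(),
                })
        records.append(record)
    return records
```

A real ingest would additionally characterize the container file itself and map these records onto the repository's RDF data model.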

Thursday, October 27, 2016

Exit Strategies and Techniques for Cloud-based Preservation Services

Exit Strategies and Techniques for Cloud-based Preservation Services. Matthew Addis. iPres 2016. (Proceedings p. 276-7/ PDF p. 139).
   This poster discusses the need for an exit strategy for organisations that use cloud-based preservation services, and what is involved in migrating to or from a cloud-hosted service. It looks specifically at Arkivum and Archivematica. Topics include contractual agreements, data escrow, open source software licensing, use of independent third-party providers, and tested processes and procedures to mitigate risks. The top two issues are:
  1. the need for an exit strategy when using a cloud preservation service, and
  2. the need to establish trust and perform checks on the quality of the service
It mentions that “full support for migrating between preservation environments has yet to be implemented in a production preservation service.” The approach used in the poster includes:
  • Data escrow
  • Log files of the software versions and updates
  • Ability to export database and configuration
  • Ability to test a migration
 It is important to remember in a migration test that “production pipelines may contain substantial amounts of data and hence doing actual migration tests of the whole service on a regular basis will typically not be practical”. “Hosted preservation services offer many benefits but their adoption can be hampered by concerns over vendor lock-in and inability to migrate away from the service, i.e. lack of exit-plan.”

Wednesday, October 26, 2016

Research data is different

Research data is different. Simon Wilson. Digital Archiving blog. 5 August 2016.
     A blog post about some born digital archives at Hull.  It is not academic research data but instead comes from a variety of sources. By using DROID to look at 270,867 accessioned files they discovered the following:
  • 97.96% of files were identified by DROID 
  • There were 228 different format types identified
  • The most common format is fmt/40 (MS Word 97-2003) with 120,595 files (44.5%).  
  • The top formats they found were:
    Microsoft Word Document (97-2003)                 44.52%
    Microsoft Word for Windows (2007 and later)     5.63%
    Microsoft Excel 97 Workbook                              5.08%
    Graphics Interchange Format                              4.15%
    Acrobat PDF 1.4 - Portable Document Format     3.12%
    JPEG File Interchange Format (1.01)                    2.72%
    Microsoft Word Document (6.0 / 95)                    2.46%
    Acrobat PDF 1.3 - Portable Document Format     2.39%
    JPEG File Interchange Format (1.02)                    1.83%
    Hypertext Markup Language (v4)                         1.67%
 The number and type of formats found in their collections differed from other institutions that held research data. An important next step is to examine the identified file formats and determine a migration strategy for each. Knowing the number and frequency of the formats in the collections allows preservation efforts to be prioritized.
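
A format profile like the one above can be produced from a DROID CSV export with a few lines of Python. This is a hedged sketch: the column names (PUID, FORMAT_NAME, TYPE) follow DROID's standard export profile, but verify them against your own export before relying on it.

```python
import csv
from collections import Counter

def format_profile(droid_csv_path):
    """Tally PUIDs from a DROID CSV export; returns (counts, total files)."""
    counts = Counter()
    total = 0
    with open(droid_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("TYPE") == "Folder":  # skip directory rows
                continue
            total += 1
            counts[(row.get("PUID") or "unidentified",
                    row.get("FORMAT_NAME") or "")] += 1
    return counts, total

def report(counts, total, top=10):
    """Top formats with percentages, in the spirit of the table above."""
    return [(puid, name, round(100.0 * n / total, 2))
            for (puid, name), n in counts.most_common(top)]
```

The resulting ranked list is exactly what is needed to prioritize migration effort by frequency.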

Tuesday, October 25, 2016

Checksum 101: A bit of information about Checksums

Checksum 101: A bit of information about Checksums. Ross Spencer. Archives NZ Workshop. 2 October 2016.
    A slide presentation providing very good information on checksums. Why do we use checksums:
  • Policy: Provides Integrity
  • Moving files: Validation after the move
  • Working with files: Uniquely identifying what we’re working with
  • Security:  a by-product of file integrity
An algorithm does the actual computation, and there are a variety of types: MD5, CRC32, SHA, etc. A checksum algorithm is a one-way function that can't be reversed. DROID can handle MD5, SHA1, and SHA256. Why use multiple checksums? This helps to avoid potential collisions, though the probabilities are low. The presentation shows the different types of checksums and how they are generated.

Checksums can, for practical purposes, ensure uniqueness. We can automate processes better with file checksums. Some people may have a preference for which checksums to use. Using checksums will help future-proof the systems and provide greater security.
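
The "moving files: validation after the move" use case can be sketched with Python's hashlib, computing several digests in one pass (MD5, SHA1, and SHA256, the three DROID supports) and re-checking them at the destination. A minimal sketch, not tied to any particular tool:

```python
import hashlib

def file_checksums(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Compute several checksums in a single pass over the file."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}

def validate_after_move(source_digests, dest_path):
    """Re-hash the moved file and compare against the recorded digests."""
    current = file_checksums(dest_path, algorithms=tuple(source_digests))
    return current == source_digests
```

Computing multiple algorithms in one pass costs one read of the file while giving the collision resistance benefit the slides mention.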

Monday, October 24, 2016

Our Preservation Levels

Our Preservation Levels. Chris Erickson. October 24, 2016.
     After looking at the levels used by various groups, we have decided on 4 levels for our preservation plan. We want to keep it simple, so that the level is easy to determine and meaningful for our workflows. Our Rosetta preservation system is a dark archive that can harvest digital materials from several publicly accessible content management systems. The curator or subject specialist for the collection will determine the level of preservation together with the preservation priority and will indicate that on the Digital Preservation Decision Form.

The Preservation Levels
0.   No preservation. Regular backups only (for example: Shared network drive that is  backed up regularly by IT)
1.   Basic preservation. A copy on M-Disc in Special Collections besides an access copy in our CMS, which is backed up by IT. No other preservation processing
2.   Full preservation. A master copy in Rosetta, with format migration, descriptive and preservation metadata, fixity checks, multiple copies (tape, data center, Granite Mountain Vault)
3.  Extended preservation. Full preservation services plus either DPN or remote/internet storage copy for materials that are appropriate for DPN
The intention is to recognize that some materials do not need full preservation services, nor long term storage in DPN. We will evaluate the levels next year and see if they are working the way we expect.

Thursday, October 20, 2016

Digital Storage In Space Rises Above The Cloud

Digital Storage In Space Rises Above The Cloud.  Tom Coughlin. Forbes. October 13,  2016.
     A start-up company (Cloud Constellation) plans to build an array of earth-orbiting data center satellites that would provide a space-based infrastructure for cloud service providers. The satellites would form a private network, communicating directly with the ground via tight-beam radio, with no traffic crossing the Internet and hence no public data transmission headers. The company says that latencies will be lower than those of conventional Internet transmission.

The digital storage in these orbiting data centers will be solid-state drives, and the internal temperature of the satellites will be kept at about 70 degrees Fahrenheit. The budget to build the initial phase of this satellite network is estimated at $400 M, much less than the cost of building an equivalent terrestrial global data center network with an equivalent level of security. Data is encrypted on the way to the satellite chain, inside the satellite storage, and when transmitted back to earth. This should provide secure storage and transport of data without interruption or exposure on public networks. It could protect critical and sensitive data for potential clients, including university archives and libraries. The first phase is planned to be operational in 2018 or 2019. Soon many companies and organizations will have an option to store their data securely in outer space.

Wednesday, October 19, 2016

Filling the Digital Preservation Gap. Phase Three report - October 2016.

Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase Three report - October 2016. 19 October 2016. Jenny Mitcham, et al. [PDF]
     This is a report of phase 3 of the Filling the Digital Preservation Gap project. It is important to consider how we incorporate digital preservation functionality into our Research Data Management workflows.
  • Phase 1: addressed the need for digital preservation as part of the research data management infrastructure
  • Phase 2: practical steps to enhance their preservation system for research data 
  • Phase 3 has the following aims:
    • To establish proof of concept implementations of Archivematica at the Universities of Hull and York, integrated with other research data systems at each institution
    • To investigate the problem of unidentified research data file formats and consider practical steps for increasing the representation of research data formats in PRONOM
    • To continue to disseminate the outcomes of the project both nationally and internationally and to a variety of different audiences

"Preserving digital data isn’t solely reliant on the implementation of a digital preservation system, it is also necessary to think about related challenges that will be encountered and how they may be addressed."  In working with formats it was clear that DROID does not look inside zip files, and not all files were assigned a file format identification. Of the 3752 files analysed at York, only 1382 (37%) were assigned a file format identification by DROID. At the University of Hull a similar exercise had quite different results, with 89% of files assigned an identification by DROID. At Lancaster University the identification rate was 46%. Of the identified files, 70% were TIFF images. Of the files that were not automatically identified, files with no extension made up 26% of the total.
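
Since DROID does not look inside zip files, a complementary step is to enumerate a container's members directly and tally their extensions, giving at least a rough picture of what a container-unaware identification run reports as a single zip. An illustrative workaround, not part of the project's actual tooling:

```python
import os
import zipfile
from collections import Counter

def extensions_inside_zip(zip_path):
    """Tally member-file extensions inside a zip container."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():  # skip directory entries
                continue
            ext = os.path.splitext(info.filename)[1].lower()
            counts[ext or "(no extension)"] += 1
    return counts
```

Extensions are a weaker signal than PRONOM signatures, but they flag containers worth unpacking for a proper DROID run.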

"One possible solution to the file format problem as described would be to limit the types of files that would be accepted within the digital repository. This is a tried and tested approach for certain disciplines and data archives" and follows the NDSA level one recommendations, to “... encourage use of a limited set of known open formats ...”. This may be a problem with preserving research data, since researchers use a wide range of specialist hardware and software and it will be "hard for the repository and research support staff to provide appropriate advice on suitable formats. For much of the data there will be no obvious preservation format for that data."

The University of York encourages researchers (in training sessions and on webpages) to consider file formats throughout their project, along with the longevity and accessibility of the formats they select, but the researcher decides which formats to deposit their data in. The university accepts these formats and will preserve them on a best-efforts basis. "Understanding the file format moves us one step closer to preservation and reuse over the longer term." To help the research data community, their recommendations include:
  • For data curators: 
    • Greater engagement with researchers on the value and necessity of recognising and recording the file formats they will use/generate to inform effective data curation.
  • For researchers:
    • Supply adequate metadata about submitted datasets. Clear and accurate metadata about file formats and hardware/software dependencies will aid file format identification and future preservation work. 
    • Be open to sharing sample files for testing and to aid signature development where appropriate.

Appendix 2 contains A Draft PCDM-based Data Model for Datasets

Tuesday, October 18, 2016

When Archivists and Digital Asset Managers Collide: Tensions and Ways Forward

When Archivists and Digital Asset Managers Collide: Tensions and Ways Forward. Anthony Cocciolo. The American Archivist. Spring/Summer 2016. [PDF]
     The article looks at tensions in an organization between archivists and digital asset managers. Archivists maintain the inactive records (paper or electronic) of permanent value for an organization. A records manager’s role is to manage active records, and records with permanent value are transferred to the archives when they become inactive. Digital asset managers often see their role as creating repositories of assets that can be easily and efficiently reused by staff, which accompanies the attitude that digital files will never become inactive.

This study is limited because it examines a single instance that may not apply to other organizations that have both archivists and digital asset managers. It looks at tensions that can exist between the two groups, which mostly come from digital asset managers and archivists not recognizing the different role each plays.

For archives, the unit being managed is a record (“data or information in a fixed form that is created or received in the course of individual or institutional activity and set aside (preserved) as evidence of that activity for future reference"). In digital asset management, the unit being managed is an asset (a kind of record that individuals can readily reuse in future work products). Archivists are interested in the record not only for its content but also for aspects of the record itself, such as its historical and social implications. Digital asset managers are more focused on the content and the legal rights to reuse it, and are more like libraries in their approach.

One tension between the two groups is that if a file was deposited and permanently preserved in the DAM, there would be no reason to deposit it in the archives. Other tensions are
  1. Users, Files, and Where They Get Stored
  2. Differing Work Practices
  3. Approaches to Digital Preservation
  4. Communication
  5. Differing Approaches to Planning
The article states that archivists and digital asset managers differ in their views of preservation planning, fixity checking, formats accepted, and how to respond to file formats once they become obsolete. [Not all digital asset managers are as 'short term' as implied. cle] However, digital asset or content management systems are “not adequate for long-term digital preservation because [they include] no mechanisms for reliably assuring authenticity and intelligibility of digital documents for fifty years or longer.” Another problem is that many things are called an “archives,” which can be troubling for archivists, who must contend with staff who believe that they are keeping archives and may view the DAM as yet another archives.

The article recommends that items deemed assets be deposited both in the DAM system and in the digital archives. In the digital archives, the asset will be grouped with other records of the same provenance, and metadata will be attached to the file to make it more findable. The archivists will document the activity of the institution for researchers. Since the purposes are not the same and the user groups do not overlap entirely, it is sensible for assets to appear in both places. This is not wasteful because, as digital preservationists note, multiple copies increase object safety. At a minimum, references to the assets in the DAM should be added to the archives intellectually if not physically. Asset management systems should not replace the need to create digital archives that document institutional activity.

It is also essential that digital asset managers and archivists respect the different roles they play and not try to undermine each other. Each should focus on their own missions:
  • digital asset managers: creating a collection of digital assets for effective and efficient reuse by staff members. 
  • archivists: documenting institutional activity through records of permanent value in whatever format they may occur for use by staff and public researchers.

Monday, October 17, 2016

Digital Preservation through Digital Sustainability

Digital Preservation through Digital Sustainability. Matthias Stuermer, Gabriel Abu-Tayeh. iPres 2016.  (Proceedings p. 18 - 22/ PDF p. 10-12).
     The concept of digital sustainability examines how to maximize the benefits of digital resources. They specify nine basic conditions for digital sustainability which also contribute to potential solutions to the challenges of digital preservation:

    Conditions regarding the digital artifact:
1. Elaborateness: data quality requires characteristics such as accuracy, relevancy, timeliness, completeness, and more. Quality of data plays a significant role within digital preservation.
2. Transparent structures: technical openness of content and software is essential for digital sustainability. Open standards and open file formats are particularly important for digital preservation.
3. Semantic data: adding meaningful information about the data to make it more easily comprehensible
4. Distributed location: redundant storage of information in different locations decreases the risk of loss

    Conditions regarding the ecosystem:
5. Open licensing regime: the legal framework plays a crucial role for digital artifacts. Objects are protected by rights, but this protection hinders the use of digital assets and decreases their potential for society as a whole.
6. Shared tacit knowledge: enables individuals and groups to understand and apply technologies and create further knowledge, which all needs to be updated and adapted continuously
7. Participatory culture: an active ecosystem leads to significant contributions from outsiders such as volunteers. The expertise from an international set of contributors can lead to high-quality peer-reviewed processes of knowledge creation.
8. Good governance: While technology companies and innovative business models are considered part of sustainable digital resources, they should remain independent from self-serving commercial interests and control by a few individuals.
9. Diversified funding: this reduces control by a single organization, which increases the independence of the endeavor.

Saturday, October 15, 2016

DPTP: Introduction to Digital Preservation Planning for Research Managers

DPTP: Introduction to Digital Preservation Planning for Research Managers. Ed Pinsent, Steph Taylor. ULCC. 15 October 2016.
     Today I saw this course offered and thought it looked interesting (wish I were in London to attend). It is a one-day introduction to digital preservation, designed specifically to look at preservation planning from the perspective of the research data manager. Digital preservation, the management and safeguarding of digital content for the long term, is becoming more important for research data managers to make sure content remains accessible and authentic over time. The learning outcomes are:
  • Understand what digital preservation means and how it can help research managers
  • How to assess content for preservation
  • How to integrate preservation planning into a research data management plan
  • How to plan for preservation interventions
  • How to identify reasons and motivations for preservation for individual projects
  • What storage means, and the storage options that are available
  • How to select appropriate approaches and methods to support the needs of projects
  • How to prepare a business case for digital preservation
The course contains eight modules, which are:
  1. Find out about digital preservation and how and why it is important in RDM.
  2. Assessing research data and understanding how to preserve them for the longer term, and understanding your users.
  3. Learn how a RDM plan can include preservation actions. 
  4. Managing data beyond the life of projects, planning the management of storage and drafting a selection policy.
  5. Understanding individual institutions, stakeholders and requirements and risk assessment.
  6. Understand why preservation storage has extra requirements, considering ‘the Cloud’.
  7. The strategy of migrating formats, including databases; risks and benefits, and tools you can use. 
  8. Making a business case (Benefits; Risks; Costs) to persuade your institution why digital preservation is important

Friday, October 14, 2016

Digital Preservation Program: Levels of Digital Preservation Support

Digital Preservation Program. South Dakota State Historical Society. 2015.
     A look at the South Dakota State Archives webpage concerning the levels of digital preservation.  They are committed to collecting, preserving, and providing access to their materials.

Levels of Digital Preservation Support:  The Archives has established three distinct levels of preservation support for digital archival materials that will be applied to digital materials at the time of accession. The levels are:
  • Full Support:  The Archives will take all reasonable actions to maintain usability including migration, emulation, or normalization and will ensure data fixity for all original and transformed files and will provide access to transformed files.
  • Limited Support:  The Archives will take limited steps to maintain usability and undertake strategic monitoring. They may actively transform a file from one format to another to mitigate format obsolescence, and will ensure data fixity for all original and transformed files and will provide access to transformed files.
  • Basic Support: The Archives will provide access to the item in its submission file format only and will work to ensure data fixity of the submitted file. No transformations will be enacted on these files for preservation purposes.
The archives also has created a chart that outlines the preservation tasks associated with each level of preservation support. The tasks are:
  • Create preservation metadata for accessibility, provenance, and management
  • Perform fixity checks on a regular basis using proven checksum methods
  • Periodically refresh storage media
  • Provide for discovery of objects via online descriptive finding aid  
  • Undertake strategic monitoring of file format
  • Plan and perform file normalization if necessary
  • Plan and perform migration to succeeding format upon obsolescence
  • Offer long-term storage in a trusted preservation-worthy format
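
The "perform fixity checks on a regular basis using proven checksum methods" task can be sketched as a manifest-based audit: record a SHA-256 digest for every file at accession, then periodically re-hash and report anything missing or altered. A minimal illustration (the JSON manifest format is my own choice, not the Archives' practice):

```python
import hashlib
import json
import os

def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 without loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root, manifest_path):
    """Record a digest for every file under root, keyed by relative path."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    with open(manifest_path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)
    return manifest

def audit(root, manifest_path):
    """Re-hash everything in the manifest; report missing or altered files."""
    with open(manifest_path, encoding="utf-8") as f:
        manifest = json.load(f)
    problems = {"missing": [], "changed": []}
    for rel, digest in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.exists(full):
            problems["missing"].append(rel)
        elif sha256_of(full) != digest:
            problems["changed"].append(rel)
    return problems
```

Running the audit on a schedule (e.g. from cron) is one way to satisfy the "regular basis" requirement.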

Thursday, October 13, 2016

Generating public interest in digital preservation

Born Digital 2016: Generating public interest in digital preservation. Sarah Slade. Poster, iPres 2016.  (Proceedings p. 262 / PDF p. 132).
     This poster describes the development and delivery of a national media and communications campaign by the National and State Libraries of Australasia Digital Preservation Group in order to broaden public awareness of what digital preservation is and why it matters.  The campaign focused on the benefits to the wider community of collecting and preserving digital material, rather than on the concept of loss, which forms the usual arguments about why digital preservation is important.

Their Digital Preservation Group identified best practice and collaborative options for the preservation of born-digital and digitised materials. Earlier they had identified six priority themes, and their poster addresses priority 5 (Collaboration and Partnership).
  1. What is it and why? A statement on digital preservation and set of principles.
  2. How well? A Digital Preservation Environment Maturity Matrix.
  3. Who? A Digital Preservation Organisational Capability and Skills Maturity Matrix. 
  4. Nuts and Bolts: A technical registry of file formats with software/hardware dependencies.
  5. Collaboration and Partnership: Opportunities for promotion and collaboration.
  6. Confronting the Abyss: A business case for research on preserving difficult object types.
While it is true that digital material is being lost to future generations due to inadequate digital collecting practices and the lack of resources and systems, they felt that it was important to reframe the discussion with a more positive focus in order to involve the public and traditional media in this campaign. They decided the most effective way to do this was with a collaborative, coordinated communications strategy, and they chose a theme for each of the five days:  Science and Space; Indigenous Voices; Truth and History; Digital Lifestyles; and Play. These would provide an opportunity for "national and local engagement with audiences through traditional and social media, and for individual libraries to hold events". The themes would target  a broad range of community sectors and ages, as well as a different focus for the public to think about reasons why digital material should be collected and preserved. A high-profile expert speaker was chosen for each of the themes and included scientists, journalists, academics, and gaming and media personalities.

Wednesday, October 12, 2016

Q&A with CNI’s Clifford Lynch: Time to re-think the institutional repository?

Q&A with CNI’s Clifford Lynch: Time to re-think the institutional repository?  Richard Poynder. Blog: Open and Shut? September 22, 2016.
     In 1999, a meeting was held to discuss scholarly archives and repositories and ways in which to make them interoperable and to avoid needlessly replicating each other’s content. This led to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). One notion was that the individual archives "would be given easy-to-implement mechanisms for making information about what they held in their archives externally available".  Open access advocates saw OAI-PMH as a way of aggregating content hosted in local archives, or institutional repositories. This would "encourage universities to create their own repositories and then instruct their researchers to deposit in them copies of all the papers they published in subscription journals."

The interoperability promised by OAI-PMH has not really materialised, and author self-archiving "has remained a minority sport, with researchers reluctant to take on the task of depositing their papers in their institutional repository". Some believe the "IR now faces an existential threat". The interview and additional information are available in a separate PDF. This file looks at whether the IR will survive, be "captured by commercial publishers", or whether "the research community will finally come together, agree on the appropriate role and purpose of the IR, and then implement a strategic plan that will see repositories filled with the target content."
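For context, an OAI-PMH harvest is just a plain HTTP request with a small set of standard verbs and parameters. A minimal sketch in Python (the repository base URL below is a hypothetical example; the verb names, `metadataPrefix`, `resumptionToken`, and XML namespace follow the OAI-PMH 2.0 specification):

```python
# Sketch: building an OAI-PMH ListRecords request and extracting record
# identifiers from a response page. The endpoint URL used in the comments
# is illustrative; real repositories expose their own base URL.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (per the OAI-PMH 2.0 spec)."""
    if resumption_token:
        # Continuation requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def record_identifiers(xml_text):
    """Extract record identifiers from one ListRecords response page."""
    root = ET.fromstring(xml_text)
    return [h.findtext(OAI_NS + "identifier")
            for h in root.iter(OAI_NS + "header")]

# Usage (hypothetical endpoint):
#   url = list_records_url("https://repository.example.edu/oai")
#   # fetch url with any HTTP client, then:
#   # ids = record_identifiers(response_body)
```

This "easy-to-implement" shape — one URL scheme, one XML response format — is precisely what was meant to let aggregators pull metadata from many local archives.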

Tuesday, October 11, 2016

Digital Preservation of Photo Books

Digital Preservation of Photo Books. Mark Mizen. All About Images Blog. September 20, 2016.
The post is a follow-on to the paper Long-Term Digital Preservation of Photo Books, presented at the International Symposium on Technologies for Digital Photo Fulfillment in Manchester, England. The presentation highlights the need to think of a photo book not just as a printed book but as a combination of the printed book and the related electronic files, both of which are resources to be preserved.

Photo books give context to photos and provide an unparalleled source of information about life as it is happening. They are today’s scrapbooks and give a glimpse into everyday life. Preserving the digital file that created the photo book is important, but unfortunately most manufacturers do not provide that file, and it is lost as soon as the book is printed. The Forever company allows the PDF file to be saved and preserved. Always ask the photo book supplier for the files.

Monday, October 10, 2016

Secure cloud doesn’t always mean your stuff in it is secure too

Secure cloud doesn’t always mean your stuff in it is secure too. Gareth Corfield. The Register. 6 Oct 2016.
Workflows are moving to the cloud, and security technology is helping to build customer confidence. “Picking a secure cloud partner is not as trivial as it may seem. Don't assume that because the cloud is secure, your business within the cloud is secure." The public cloud can provide better security monitoring and analysis, management, redundancy, and resilience, but you still have to choose a secure cloud platform. Microsegmentation can help secure the platform against malware and other security threats, and can also improve operational efficiency. The cloud provides many services beyond storage.

Saturday, October 08, 2016

Preserving & Curating ETD Research Data & Complex Digital Objects

Preserving & Curating ETD Research Data & Complex Digital Objects. Katherine Skinner, Sam Meister. ETDplus project, Educopia Institute. October 7, 2016.
     The ETDplus project is funded by IMLS and led by the Educopia Institute, in collaboration with many others.  The project helps ensure the longevity and availability of ETD research data and complex digital objects (e.g., software, multimedia files) that are part of student theses and dissertations. The project has just published a set of six Guidance Briefs to help students understand how to prepare, manage, and store the research files associated with their ETDs.

The Guidance Briefs are short “how-to” oriented briefs "designed to help ETD programs build and nurture supportive relationships with student researchers. These briefs are written for a student audience. They are designed to assist student researchers in understanding how their approaches to data and content management impact credibility, replicable research, and general long-term accessibility: knowledge and skills that will impact the health of their careers for years to come."

The Guidance Briefs can be downloaded at the site, and cover the following topics:
1. Copyright
2. Data Structures
3. File Formats
4. Metadata
5. Storage
6. Version Control

Institutions can use the guides as fits their local audiences. Each Brief includes information about the topic and a “Local Practices” section where an institution can highlight its own activities.

Friday, October 07, 2016

Proceedings of the 13th International Conference on Digital Preservation: iPres 2016

Proceedings of the 13th International Conference on Digital Preservation. iPRES 2016. October 3 – 6, 2016. 169 pp.  PDF  (Link updated)
The proceedings of the conference, along with other presenters' posts, including slides and images. A wealth of information to read in the weeks ahead.

‘We’re going backward!’

‘We’re going backward!’ Vinton G. Cerf. Communications of the ACM. October 2016.  HTML   PDF
An update from Vinton Cerf's blog post on media longevity. "Perhaps by now you are noticing a trend in the narrative. As we move toward the present, the media of our expression seems to have decreasing longevity." It is not just digital media but physical media as well: photographs may not last more than 150–200 years, and ordinary books may not last more than 100 years. He is concerned for the "longevity of digital media and our ability to correctly interpret digital content, absent the software that produced it". He reflects on the ephemeral nature of our artifacts, and suggests that the centuries before ours may end up better known than our own unless we are persistent about preserving digital content.

"Just as the monks and Muslims of the Middle Ages preserved content by copying into new media, won’t we need to do the same for our modern content? These thoughts immediately raise the question of financial support for such work."  In the past, patrons, religious orders and centers of Islamic science underwrote the preservation costs. Our society must find a way to underwrite the cost of preserving knowledge in media that will have some permanence and the executable software for their rendering. Unless we face this challenge the knowledge we have produced may simply evaporate with time.

Thursday, October 06, 2016

Judging a book through its cover

Judging a book through its cover. Larry Hardesty. MIT News Office. September 9, 2016.
MIT researchers and colleagues are designing an imaging system that can read closed books, particularly antique books too fragile to touch. The system uses terahertz radiation emitted in short bursts that can gauge the distance to individual pages of the book and can distinguish between ink and blank paper in a way that X-rays cannot. The technology is still new, but they are working to improve both the depth of penetration into a book and the accuracy.

Wednesday, October 05, 2016

How many copies are needed for preservation?

How many copies are needed for preservation? Chris Erickson. 4 October 2016.
An important component of preservation is having multiple copies. The specific questions are: how many copies, how should they be stored, and where should they be located? Many people advocate the 3-2-1 rule for digital storage: three copies, stored on two different media, with one copy located off-site, preferably in an area with different disaster threats (NARA; Library of Congress). The NDSA Levels of Digital Preservation also incorporate this rule in their storage section.

The copies we have been looking at are:
     Copy 1: Rosetta storage on spinning disk in the campus data center
     Copy 2: Tape copies of our archive in the Granite Mountain Record Vault
                    (annual tape archive plus incremental transactional backups)
     Copy 3: Internet copy, with DPN or Amazon Glacier
     Copy 4: Access copy within Special Collections, on M-Discs or in our CMS
What we choose to put in DPN will affect the third copy. We need to determine if these copies are adequate, and if not, then find different storage methods that are cost effective and fit within our workflow.
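The 3-2-1 test above can be written down as a simple check over an inventory of copies. A sketch, with the copy list mirroring the one above; the field names (`medium`, `offsite`) are illustrative assumptions, not fields from any particular repository system:

```python
# Sketch: checking an inventory of preservation copies against the
# 3-2-1 rule (3+ copies, 2+ media types, 1+ off-site copy).

def meets_3_2_1(copies):
    """True if >= 3 copies, on >= 2 distinct media, with >= 1 off-site."""
    media = {c["medium"] for c in copies}
    offsite = [c for c in copies if c["offsite"]]
    return len(copies) >= 3 and len(media) >= 2 and len(offsite) >= 1

copies = [
    {"name": "Rosetta (campus data center)", "medium": "disk",    "offsite": False},
    {"name": "Granite Mountain vault tapes", "medium": "tape",    "offsite": True},
    {"name": "DPN / Amazon Glacier",         "medium": "cloud",   "offsite": True},
    {"name": "Special Collections M-Disc",   "medium": "optical", "offsite": False},
]
```

Framed this way, the DPN decision is visible as a concrete question: removing or changing the third copy can change whether a collection still satisfies the rule.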

Monday, October 03, 2016

Digital Preservation Priorities: What to preserve?

Digital Preservation Priorities: What to preserve? Chris Erickson. 3 October 2016.
Recently we have been reviewing the digital preservation policies we have been working under. The current policy states that the subject specialists (curators, subject librarians, faculty members) responsible for a collection should decide what will be preserved in the Rosetta digital archive. They should know the library collection and the collecting policies, as well as the faculty and the university curriculum, and be able to decide what is worth preserving long term. We provide the Digital Preservation Decision Form to help them with these decisions. Currently the choices are to preserve, not to preserve, and the order in which collections need to be processed.

The amount of content in our digital archive is increasing rapidly. As we plan for the future of the archive, there are questions raised about the number of archival copies, particularly when discussing what content should go into DPN. Those questions in turn raise other questions, including the question of preservation priorities. Are all objects equally important? If not, what are the most important objects or collections to preserve? Should we periodically revisit what is in the archive and deaccession content that is less important? In a world of finite resources we decided that we need to determine our preservation priorities in order to better preserve the important content.

Our goal is to preserve the important digital resources created, acquired, or managed by the University Library and Archives. The proposed change is that content will be preserved according to the following guidelines, in descending order of importance:
1. Unique university-created content with no physical copy
2. Unique university-owned items that are at risk
3. Digital content in the library with a physical copy that may be at risk
4. Digital content that would be difficult or costly to reproduce
5. Content digitized for convenience
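One way to make guidelines like these operational is a small tiering function that walks the list in descending order of importance and returns the first tier that matches. A sketch; the tier logic follows the numbered list above, but the item attribute names are illustrative assumptions, not fields from Rosetta or the Decision Form:

```python
# Sketch: assigning a preservation priority tier (1 = highest) to a
# digital item, following the descending-importance guidelines above.
# Attribute names are hypothetical, for illustration only.

def preservation_priority(item):
    """Return the priority tier (1-5) for a digital item."""
    if item["unique"] and not item["has_physical_copy"]:
        return 1  # unique university-created content, no physical copy
    if item["unique"] and item["at_risk"]:
        return 2  # unique university-owned items that are at risk
    if item["has_physical_copy"] and item["at_risk"]:
        return 3  # content with a physical copy that may be at risk
    if item["costly_to_reproduce"]:
        return 4  # difficult or costly to reproduce
    return 5      # digitized for convenience
```

The ordering of the `if` branches matters: an item that matches several guidelines (for example, a unique at-risk item with no physical copy) lands in the highest tier that applies, which is exactly what "descending order of importance" asks for.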

We will be reviewing our digital collections and deciding if these priorities will help our selection and preservation processes, if they need to be revised, or if we need to go in another direction. We are also looking at implementing levels of preservation along with these priorities.

Saturday, October 01, 2016

What happens when the Internet and digital preservation coincide

What happens when the Internet and digital preservation coincide. Jay Gattuso. jaygattuso's Blog, Open Preservation Foundation. 25th Sep 2016.
A very thought-provoking post that uses a job advertisement as the basis for a discussion about the library's ongoing digital preservation program, and about what happens when a gap in capability is identified that can't be ignored. The primary purpose of the Digital Preservation Web Engineer is to "define, implement and support the efficient acquisition, preservation and discovery/delivery of web based digital content subject to the Library’s legislated mandate". They understand that there is digital, and there is “online”, and sometimes digital is online. It is important to be able to confidently collect online digital content and maintain a sense of content, context, and structure, but there is a capability gap that they have been working around for a while. There are still many questions about how to deliver content to a readership that is still establishing its own needs, and there is the challenge of doing this on a large scale.

They want the two processes, digital collecting and digital preservation, to dovetail into a well-considered, unified workflow. While they are all about collecting, storing, and preserving important things that are precious to New Zealand, the same concepts hold true for others who collect to their own mission. "We don’t believe this point can be understated. We are slowly starting to understand the cultural and research impact of web content, and this new post is a direct response to the challenge that sits behind national level collection building and the rapid uptake of Internet based content and information."

The content collected has an extremely important role in their National memory, and they have an obligation to operate with the care and expertise that this content demands. The collections help people understand their sense of place and history as well as informing research and creative outputs alike.

The post addresses one of the problems facing digital preservation today. Digital preservation is "an emergent discipline, finding our way through new challenges, and without specifically crafted routes into the work we expect to undertake. We are only just starting to see the edges of what’s possible, and unless we repeatedly open the door to complementary professions we are going to struggle to address the contemporary challenge of collecting fast moving content, regardless of the ongoing care required when today’s harvests become tomorrow’s Preservation Masters with all the attendant questions of technical sustainability."