Digital collections are administered by the Weinberg Memorial Library's Digital Services Department, established in 2008, in collaboration with the Library's Systems, Cataloging, and Media Resources departments and the University's Division of Information Resources.
Our work includes the acquisition, appraisal, description, publication, and long-term preservation of digital content, and our collections include both digitized and born-digital materials of all media types.
We are continually digitizing materials to increase access to our unique collections. While some small-scale, item-level digitization is performed upon researcher request, we generally aim to digitize full collections or series. Due to significant limitations on staff time for digitization, processing, and description, we prioritize our digitization projects based on several factors, such as:
The Library digitizes small-format items such as photographs, documents, slides, and negatives in-house. Our equipment includes several models of Epson scanners (Expression 10000XL, GT-2500, and V500), a Nikon Coolscan, and a Konica Minolta PS7000.
Large format items (such as maps), bound volumes, and audio/visual media are outsourced to professional digitization vendors. Several of our digitization projects were completed by Internet Archive via the LYRASIS Mass Digitization Collaborative.
Our digitization standards vary by collection and researcher needs. Generally, however, our in-house digitization standards are as follows:
| Original | Bit depth | Resolution | Master File Format |
|---|---|---|---|
| Text-based materials without images (documents) | 8-bit grayscale or 24-bit color | 300 ppi | Uncompressed TIFF or PDF |
| Text-based materials with images | 8-bit grayscale or 24-bit color | 400 ppi | Uncompressed TIFF |
| Artifactual text or manuscript (rare books, medieval manuscripts) | 24-bit or 48-bit color | 400-800 ppi | Uncompressed TIFF |
| Film (slides, negatives) | 8-bit grayscale or 24-bit color | 800-2800 ppi | Uncompressed TIFF |
| Photographs | 24-bit color | 400-600 ppi | Uncompressed TIFF |
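As an illustration only, the table above could be encoded as a lookup that checks a scan's settings before it is accepted as a master file. This is a minimal sketch, not a description of our actual workflow: the material keys, the `ScanStandard` class, and the `meets_standard` function are hypothetical names, and treating the lower resolution figure as a minimum is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanStandard:
    bit_depths: tuple      # acceptable bit depths, as worded in the table
    min_ppi: int           # lower bound of the resolution column
    max_ppi: int           # upper bound (equal to min_ppi for single values)
    master_formats: tuple  # acceptable master file formats


# Hypothetical keys; the values are copied from the table above.
STANDARDS = {
    "text": ScanStandard(("8-bit grayscale", "24-bit color"), 300, 300, ("TIFF", "PDF")),
    "text_with_images": ScanStandard(("8-bit grayscale", "24-bit color"), 400, 400, ("TIFF",)),
    "artifactual": ScanStandard(("24-bit color", "48-bit color"), 400, 800, ("TIFF",)),
    "film": ScanStandard(("8-bit grayscale", "24-bit color"), 800, 2800, ("TIFF",)),
    "photograph": ScanStandard(("24-bit color",), 400, 600, ("TIFF",)),
}


def meets_standard(material, ppi, bit_depth, file_format):
    """Return True if a scan meets the listed standard for its material type.

    Assumption: the listed resolution is a minimum, so a higher-resolution
    scan still passes.
    """
    standard = STANDARDS[material]
    return (
        ppi >= standard.min_ppi
        and bit_depth in standard.bit_depths
        and file_format in standard.master_formats
    )
```

For example, `meets_standard("film", 400, "24-bit color", "TIFF")` is false because 400 ppi falls below the 800 ppi lower bound for film.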
Master archival files are preserved in our digital preservation repository. We generally prepare derivative copies (often in JPEG 2000, JPEG, or PDF format) for display and web publication.
For digitized, typed documents, we generate searchable text using automated optical character recognition (OCR). In some cases, glue stains, tears, or other damage to the original may result in obscured text and poor OCR. In general, automatically generated OCR is inexact; an OCR transcript may include symbols or extra characters (known as "dirty OCR").
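One rough way to spot a "dirty" transcript is to measure how much of it consists of unexpected symbols. The sketch below is an illustration under stated assumptions, not a tool we use: the `noise_ratio` function and the `ALLOWED` character set are hypothetical, and the set assumes English-language text.

```python
import string

# Characters treated as ordinary text. Anything else counts as noise.
# Assumption: English-language documents with common punctuation.
ALLOWED = set(string.ascii_letters + string.digits + string.whitespace + ".,;:'\"!?()-")


def noise_ratio(text):
    """Return the fraction of characters in text that fall outside ALLOWED."""
    if not text:
        return 0.0
    noisy = sum(1 for ch in text if ch not in ALLOWED)
    return noisy / len(text)
```

A transcript with a high ratio could be flagged for manual review. Any cutoff would need to be tuned against real transcripts.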
For handwritten documents and oral history recordings, full-text transcriptions may be generated by staff members, students, donors, or volunteers. Not all of our materials have been transcribed. Please contact us if you are interested in helping us transcribe a collection!
Born-digital materials are accessioned when they are submitted to the University Archives by University departments and offices or donated to the McHugh Special Collections.
Digital collections materials are described in collaboration with the Weinberg Memorial Library's Cataloging department.
We use a standardized set of metadata fields, mapped to the Dublin Core schema, across all of our collections. Some collections include customized fields to document specialized information. Our metadata practices are documented in a data dictionary.
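To illustrate what mapping local fields to Dublin Core can look like, here is a minimal sketch. The local field names in `FIELD_TO_DC` and the `to_dublin_core` function are hypothetical and are not taken from our data dictionary. The target names (`title`, `creator`, `date`, `subject`, `rights`) are standard Dublin Core elements.

```python
# Hypothetical local field names mapped to Dublin Core elements.
FIELD_TO_DC = {
    "Title": "title",
    "Creator": "creator",
    "Date": "date",
    "Subject": "subject",
    "Rights": "rights",
}


def to_dublin_core(record):
    """Crosswalk a record (dict of local field -> value) to Dublin Core.

    Fields without a mapping, such as collection-specific custom fields,
    are skipped. Values are collected in lists because Dublin Core
    elements are repeatable.
    """
    dc = {}
    for field, value in record.items():
        element = FIELD_TO_DC.get(field)
        if element is None:
            continue  # customized field with no Dublin Core mapping
        dc.setdefault(element, []).append(value)
    return dc
```

In this sketch, a custom field such as a hypothetical "Scrapbook Page" is simply left out of the Dublin Core output.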
In order to maximize the interoperability of our collections, we have implemented several controlled vocabulary standards. These include:
We also work to align our metadata practices with the Pennsylvania Digital Partnership (PA Digital) Metadata Guidelines.
Descriptive metadata from our digital resources is available for harvesting by search interfaces, catalogs, and other aggregation services via OAI-PMH. Harvesting is enabled at a collection level.
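As a sketch of how a harvester might request one collection's records, the example below builds an OAI-PMH `ListRecords` request and reads titles out of a response. The verb, the `metadataPrefix` and `set` parameters, and the XML namespace come from the OAI-PMH and Dublin Core specifications. The base URL, the set name, and the sample response are made up for illustration and do not describe our actual endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of the simple Dublin Core elements carried in oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build a ListRecords request limited to one set (collection)."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    )
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return every dc:title value found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC + "title")]


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample yearbook</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

url = list_records_url("https://example.org/oai/oai.php", "hypothetical_set")
titles = extract_titles(SAMPLE_RESPONSE)
```

A real harvester would also fetch the URL and follow the `resumptionToken` that OAI-PMH uses to page through large result sets. Both are omitted here to keep the sketch short.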
At this time, descriptive information from nearly all of our collections is searchable via the Weinberg Memorial Library catalog, the Pennsylvania Digital Library, and WorldCat. Efforts are currently underway to include our metadata in the Digital Public Library of America via the PA Digital Partnership.
We use CONTENTdm digital collection management software to provide access to our digital collections.
Our goal is to make our collections as publicly available as possible. However, in some cases, we are obligated to apply access restrictions. At this time, we provide three basic levels of access:
In the future, we hope to provide more nuanced, granular permissions, such as:
Over 2 terabytes of digital collections materials are stored, monitored, and maintained in our digital preservation repository. We use a managed storage service called DuraCloud to support this work. In addition to maintaining local copies, we sync our digital master files across two different cloud storage services (Amazon S3 and Amazon Glacier) to ensure geographically distributed backup. DuraCloud's platform also allows us to monitor the fixity of our digital files, so that we can easily detect any changes or integrity issues.
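Fixity monitoring, in general terms, means recomputing a file's checksum and comparing it with a previously recorded value. The sketch below shows that idea with Python's standard library. It does not describe how DuraCloud implements its checks; the `checksum` and `verify` functions, the manifest structure, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute a file's checksum, reading in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest):
    """Return the paths whose current checksum differs from the recorded one.

    manifest is a hypothetical dict mapping file path -> recorded checksum.
    """
    return [path for path, expected in manifest.items() if checksum(path) != expected]
```

An empty list from `verify` means every file still matches its recorded checksum. Any path in the result points to a file that has changed since the manifest was written.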
Questions about our digital collections and practices may be addressed to the Digital Services Department at email@example.com.