Technical & Operational Overview

Project Timeline

Original proposal 12/11/2003
Planning & "hiatus" 2004
Budget approved 12/02/2004
Preprocessing begins 1/2006
Preprocessing complete 8/2006
Vendor selected 8/2006
Scanning begins 1/2007
Scanning completed 6/2007
Post-processing completed 1/2008
Application/web site development begins 1/2008
Application/web site launched 4/15/2008

Collection Statistics

Collection Size:

  • ca. 500 linear feet scanned
  • 32,890 complete documents scanned
  • 43,479 page images scanned

Authorship:

  • HH Lehman & staff = 15,362 (47%)
  • Lehman & family = 1,095 (3%)
  • All other = 16,433 (50%) 
  • TOTAL = 32,890
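
The percentages above can be re-derived from the document counts. The following is a minimal Python sketch; the dictionary labels and the function name `authorship_shares` are illustrative and do not come from the project spreadsheet.

```python
# Document counts by author type, copied from the Authorship list above.
COUNTS = {
    "HH Lehman & staff": 15362,
    "Lehman & family": 1095,
    "All other": 16433,
}


def authorship_shares(counts):
    """Return (total, {label: whole-number percentage}) for the given counts."""
    total = sum(counts.values())
    shares = {label: round(100 * n / total) for label, n in counts.items()}
    return total, shares


total, shares = authorship_shares(COUNTS)
for label, pct in shares.items():
    print(f"{label}: {COUNTS[label]:,} ({pct}%)")
print(f"TOTAL = {total:,}")
```

The three counts sum to the 32,890 documents reported under Collection Size, and the rounded shares match the 47%, 3% and 50% listed.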

Project Teams

  • Libraries Digital Program Division: Robbie Blitz, Terry Catapano, Joanna DiPasquale, Stuart Marquis, Stephen Davis
  • Preservation & Digital Conversion Division: Dave Ortiz, Dina Sokolova, Emily Holmes, Janet Gertz
  • Curatorial & Administrative Staff: Jean Ashton, Tamar Dougherty, Susan Hamson, Michael Ryan, Janet Gertz, Jane Winland
  • Pre-Processing: Ann Young (2006), Annie Grunow (2006)
  • Scanning Vendor: Backstage Library Works, Provo, Utah (1/2007 - 7/2007)

Pre-Processing & Metadata

  • Item Numbering: All files, documents and pages were collated & numbered in the form
    • [file #]-[document #]-[page #]
    • e.g., 0002-0001-001
  • Collection Reprocessing: Duplicates were marked as not to be scanned; poor-quality photocopies were re-copied; items needing conservation were identified and referred to the Conservation Lab; the entire collection was relabeled and refoldered; and 'separated material' was reintegrated. Because of security and operational concerns, original documents in the "VIP Files" were not scanned directly; photocopies of them were scanned instead.
  • Descriptive Metadata: Recorded file ID, file title, folder ID, document ID, document date, number of pages in document, genre, author type (i.e., HHL / Staff, HHL Family, Other). (See master project spreadsheet.)
  • Conservation Information: Recorded pre-scanning conservation needs and photocopy status (if not the original document)
  • Technical Metadata Recorded: EXIF standard data plus
    • Image Producer (vendor name)
    • OS version
    • Scanner or Digital Camera
    • Scanner/Digital Camera Software
    • Lens (if applicable)
    • Focal Length (if applicable)
    • Scene Illuminant (if applicable)
    • Sampling Frequency Plane (in this case, direct capture)
    • Sampling Frequency Unit (in this case, inches)
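
The item numbering scheme under Pre-Processing & Metadata lends itself to mechanical validation. The following Python sketch parses and re-formats identifiers of the form [file #]-[document #]-[page #]. The field widths (4, 4 and 3 digits) are an assumption inferred from the single example 0002-0001-001, and the names `ItemId` and `parse_item_id` are hypothetical, not taken from the project's own tools.

```python
import re
from typing import NamedTuple

# Pattern for [file #]-[document #]-[page #], e.g. 0002-0001-001.
# Assumption: widths of 4, 4 and 3 digits, inferred from that one example.
ITEM_ID = re.compile(r"(\d{4})-(\d{4})-(\d{3})")


class ItemId(NamedTuple):
    file: int
    document: int
    page: int

    def __str__(self):
        # Re-apply the zero padding used in the example identifier.
        return f"{self.file:04d}-{self.document:04d}-{self.page:03d}"


def parse_item_id(text):
    """Split an identifier into numeric parts; raise ValueError if malformed."""
    match = ITEM_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid item identifier: {text!r}")
    return ItemId(*(int(part) for part in match.groups()))


item = parse_item_id("0002-0001-001")
print(item.file, item.document, item.page)
print(str(item))
```

Keeping the three parts as integers makes it straightforward to sort pages within a document or to group documents by file, while `str()` restores the padded form.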

Scanning Information

  • Scanning Equipment:
    • BSLW Hasselblad H2D-39
  • Scanning Specifications:
    • Items measuring up to 10” x 13.5” scanned at 400 ppi, 24-bit color
    • Items measuring 13.5” x 18” to 18” x 24” scanned at 300 ppi, 24-bit color
  • Scanning Deliverables:
    • One set of unaltered original TIFF images on DVD
    • One set of cropped and de-skewed 24-bit TIFF images on DVD
    • One set of scanned Macbeth color charts for each scanning session
    • One set of text-searchable PDF files
    • OCR-converted text (raw OCR)
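
To give a sense of what the scanning specifications above imply, the following Python sketch computes pixel dimensions and approximate uncompressed image size from page size, resolution and bit depth. It is an estimate only: it assumes uncompressed pixel data, ignores TIFF headers and embedded metadata, and the function name `scan_dimensions` is illustrative.

```python
def scan_dimensions(width_in, height_in, ppi, bits_per_pixel=24):
    """Return (width_px, height_px, uncompressed_bytes) for one page image."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # 24-bit color is 3 bytes per pixel; headers and metadata are not counted.
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


# Largest page size in each of the two resolution tiers listed above.
for label, w_in, h_in, ppi in [
    ('10" x 13.5" at 400 ppi', 10, 13.5, 400),
    ('18" x 24" at 300 ppi', 18, 24, 300),
]:
    w_px, h_px, size = scan_dimensions(w_in, h_in, ppi)
    print(f"{label}: {w_px} x {h_px} px, about {size / 1e6:.1f} MB uncompressed")
```

Under these assumptions, the largest item in the 400 ppi tier comes to 4000 x 5400 pixels (about 64.8 MB) and the largest in the 300 ppi tier to 5400 x 7200 pixels (about 116.6 MB).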

Rights & Permissions

Application & Web Presentation (to come)

  • METS
  • Lucene
  • SOLR

User Testing

  • (to come)