Digital Library Data Services

Parity provides comprehensive services for migrating and transforming
publisher content into digital libraries and related systems.
These are turnkey solutions that cover every requirement from content
collection through delivery of high-quality, comprehensive data products.
For each project, Parity can supply all required project design and
project management. Services are tailored to each customer, usually as
a mix of the services described below.
Data Cleansing and Deduplication
When integrating diverse data sources, or working with content data
that has been developed over a long period of time or with diverse methods,
the data sometimes does not adhere to a uniform standard of quality, consistency,
and completeness. Parity's data cleansing services bring all content up
to a uniformly high standard of quality. Services include
inspection and correction, completion of missing data, establishing consistent
relationships and ID schemes, and sophisticated de-duplication for databases
that may contain duplicate records.
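As an illustration only, the following minimal Python sketch shows one common way de-duplication and completion of missing data can be combined. It is not Parity's actual process: the record fields (`title`, `year`, `pages`), the match key, and the function names are all assumptions made for this example.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def dedup_key(record):
    """Hypothetical match key: normalized title plus publication year."""
    return (normalize(record.get("title", "")), record.get("year"))


def deduplicate(records):
    """Merge records sharing a key; earlier values win, gaps are filled."""
    merged = {}
    for record in records:
        target = merged.setdefault(dedup_key(record), {})
        for field, value in record.items():
            # Skip empty values so a later duplicate can complete the record.
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())


# Invented sample data: the first two records describe the same item.
records = [
    {"title": "Digital Libraries: An Overview", "year": 1999, "pages": ""},
    {"title": "digital libraries - an overview", "year": 1999, "pages": "1-12"},
    {"title": "Metadata Capture", "year": 2001, "pages": "13-20"},
]
unique = deduplicate(records)
```

Exact matching on a normalized key is the simplest approach. Real collections usually also need fuzzy matching and manual review of borderline cases.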
Scanning and OCR
Parity Computing has extensive experience in producing high-quality
page scans at a low cost. Strict quality control ensures uniformly clean
images. Pages are collated into documents, integrated with other document
sources (scanned or non-scanned), processed as required (OCR, text extraction,
etc.), and converted to a uniform target resolution and format. Parity
has experience with handling varying page quality across large collections
of publications spanning many years. Scanned and collated documents can
be delivered as PDF or other industry-standard formats, with a variety
of options for file size and image quality. Parity has produced digital
libraries incorporating hundreds of thousands of pages of scanned documents,
with consistently high quality.
Metadata Extraction and Tagging
Parity Computing has extensive experience in designing and using metadata
structures, informed by detailed domain understanding and specific customer
requirements. Parity has developed a scalable, flexible, high-accuracy,
low-cost workflow for metadata capture. Integrated metadata databases
are built from available paper-based and electronic sources for periodicals,
conferences, books, and other publications. These include metadata at
the publication, volume, issue, section/chapter, and article levels, with
metadata structures adapted to the requirements of the content and customer.
In addition to standard fields such as title, author, and pagination, further metadata
such as abstracts, author affiliations and biographies, subject keywords,
bibliographic reference links, and other available data can be captured
through Parity's automated processes. Parity has created hundreds of thousands
of comprehensive metadata records under rigorous schedule and quality
constraints.
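To make the levels described above concrete, here is a hedged Python sketch of how such a hierarchy might be modeled. The class and field names are assumptions chosen for illustration and do not describe Parity's actual schema. The section/chapter level is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Article:
    title: str
    authors: List[str]
    pages: str
    abstract: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)


@dataclass
class Issue:
    number: str
    articles: List[Article] = field(default_factory=list)


@dataclass
class Volume:
    number: str
    issues: List[Issue] = field(default_factory=list)


@dataclass
class Publication:
    title: str
    volumes: List[Volume] = field(default_factory=list)

    def article_count(self):
        """Count articles across every volume and issue."""
        return sum(len(i.articles) for v in self.volumes for i in v.issues)


# Placeholder data, invented for the example.
article = Article(
    title="Sample Article",
    authors=["A. Author"],
    pages="1-10",
    keywords=["digital libraries"],
)
journal = Publication(
    title="Sample Journal",
    volumes=[Volume(number="1", issues=[Issue(number="1", articles=[article])])],
)
total = journal.article_count()
```

Optional fields default to empty values, so a record can be created from whatever the source provides and completed later.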