File Formats: Long Term Archival with PDF/A

By Dick Weisinger

The PDF file format created by Adobe is almost universally used for sharing fixed-format documents. Adobe released the PDF specification for publication as an open standard, ISO 32000-1, in 2008, and the format is now maintained by an ISO committee.

PDF/A (“A” for long-term archival) is an ISO standard file format for storing documents in a form that can be preserved permanently. Technology changes rapidly, and file formats quickly become outdated and unsupported. The intent is for PDF/A to be an open standard format that allows documents to be viewed in their original form far into the future.

The history of updates to the PDF/A standard is as follows:

  • PDF/A-1 [2005] – First release, based on the PDF 1.4 format
  • PDF/A-2 [2011] – Added support for JPEG 2000 compression, transparency, embedding of OpenType fonts, and digital signatures
  • PDF/A-3 [2012] – Specifies how to embed arbitrary file formats, such as XML, CSV, spreadsheet and CAD files, into a PDF/A document
  • PDF/A-4 [due 2018] – Updates to align the format with PDF 2.0

A report by the National Digital Stewardship Alliance (NDSA), however, was critical of the PDF/A-3 format: “The introduction of arbitrary embedded files to PDF/A-3 introduces significant concerns for memory institutions. A PDF/A-3 file may have any other type of file embedded within it and the only statement that the standard makes in relation to preservation intent is that a compliant PDF/A-3 reader should not render embedded files, but merely support their extraction. The standard is silent as to whether the embedded content may be considered essential to full understanding or use of the primary document whose visual appearance is preservable. The result is that, accepting a PDF/A-3 file without additional rules or active negotiation, may lead an archival institution to acquire embedded content in a format that it did not expect and cannot deal with and whose relationship to the primary document may be unclear.”
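One way an archive might apply "additional rules" at intake is to flag files that claim PDF/A-3 or appear to carry attachments, so a person can look at them before they are accepted. The sketch below is a rough triage heuristic only, not a validator. It assumes the XMP metadata is stored as uncompressed UTF-8 text, and it will miss attachment markers that sit inside compressed object streams. The function names `triage_pdf` and `needs_review` are hypothetical and do not come from the NDSA report or from any library.

```python
import re

# Matches the XMP PDF/A identification in element or attribute form,
# e.g. <pdfaid:part>3</pdfaid:part> or pdfaid:part="3".
PDFA_PART = re.compile(rb"pdfaid:part(?:>|\s*=\s*[\"'])\s*(\d)")


def triage_pdf(data: bytes) -> dict:
    """Heuristically inspect raw PDF bytes. This is not a conformance check."""
    match = PDFA_PART.search(data)
    return {
        # The part number the file claims for itself, if any.
        "claimed_pdfa_part": int(match.group(1)) if match else None,
        # Names that usually signal attachments; this plain byte search
        # cannot see them when they are stored in compressed object streams.
        "may_embed_files": b"/EmbeddedFiles" in data or b"/Filespec" in data,
    }


def needs_review(data: bytes) -> bool:
    """Flag files that an intake policy might route to a person."""
    report = triage_pdf(data)
    return report["claimed_pdfa_part"] == 3 or report["may_embed_files"]
```

In practice the bytes would come from something like `pathlib.Path(name).read_bytes()`, and any flagged file would still need a real PDF/A validator and a decision about what to do with its embedded content.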

One comment on “File Formats: Long Term Archival with PDF/A”
  1. Julien SERVAJEAN says:

    So what is the recommended way to deal with that issue? What do you mean by “active negotiation”?