File Formats: The Controversy Of PDF/A

By Dick Weisinger

The PDF/A file format is a standardized version of PDF that is specially designed for archiving and storing electronic documents for long periods. The standard also specifies how PDF/A file viewers should handle color management, embedded fonts, and annotations.

The goal is that documents stored in PDF/A format will still be readable and true to their original formatting decades, or even centuries, later.

But some argue that it is unrealistic to expect a file format to survive that long. Technology changes quickly, and within just a few years both the data and the expectations for how it should be rendered can change. Problems cited with the PDF/A format include layout, searchability, support for multiple languages, and redaction.

For example, Marco Klindt, a researcher at the Zuse Institute Berlin, said, “Even today, with the internet, the expectations of how to access information, how it’s organized, structured, and connected to other pieces of information of relevance are different from the common practice of just some years ago.”

Jakob Nielsen, a usability expert, wrote, “PDF is good for printing, but that’s it. Don’t use it for online presentation.”

Despite the controversy, PDF/A currently has no challenger that offers a viable alternative as an archival file format.

3 comments on “File Formats: The Controversy Of PDF/A”
  1. Duff Johnson says:

    Neither Klindt’s nor Nielsen’s comment speaks to the ability of PDF/A files to “survive”. There’s no real controversy on that point at all.

    Whether the information contained in a given PDF/A file is useful to a given downstream user is a different question, and doesn’t pertain to the choice of format but choices made in authoring the file.

    Likewise, and if chosen by the author, PDF/A can be searchable, support multiple languages and may be readily redacted. Is there another option, even in principle, that has these capabilities?

    • dweisinger says:

      You’re right, and as noted in the post, PDF/A is not perfect, but it may be the best thing going right now. Decades from now it’s likely that all of today’s file formats, including PDF/A, will be viewed as antiquated.

      • Duff Johnson says:

        Thanks for the reply. You’re certainly right that today’s PDF/A will one day seem antiquated… but the specification advances to meet current needs.

        What I don’t see as ever becoming antiquated is the value of fixed, self-contained content. If that’s true, then PDF/A’s purpose will live forever.
