Accessibility of PDF documents and misconceptions about PAC

Many providers make their digital content available as PDF documents. It is therefore important that these documents are accessible, so that as many users as possible can use them. However, PDF often seems more cumbersome than other formats, both because accessible PDFs are difficult to produce and because of the misunderstandings that persist around PDF accessibility in particular. The criteria that accessible PDFs must fulfill, and how they can be tested, are explained below.

What is the PDF Accessibility Checker (PAC)?

The PDF Accessibility Checker (PAC) is a tool used for automated accessibility checking of PDF documents. While using PAC is a good way to test, it does not replace a manual review of PDFs. The requirements and how PAC can be reasonably used will be described below. The current version of PAC can be found at https://pdfua.foundation/en.

What is required for accessible PDFs?

Whatever hopes are placed in other, possibly more accessible formats (e.g. EPUB), a large amount of online content continues to be available primarily as PDF documents. For these, as for all other digital formats, the requirements of the Web Content Accessibility Guidelines (WCAG) apply, in Switzerland as in many other countries. In Switzerland, the eGov standard eCH-0059 requires level AA compliance with WCAG 2.1 for public sector PDF documents.

The main requirements that accessible PDFs must meet are the following (see also Federal PDF accessibility requirements; link, retrieved 05/01/2023):

  • the PDF has a meaningful document title;
  • the document has a tag structure with correct semantics: paragraphs are correctly formatted as separate paragraphs; visual headings are also semantically formatted as such, with correct hierarchy or logical structure; bulleted lists are formatted as lists; tables have column and/or row headings, etc.;
  • language declarations of the entire document, and of parts in other languages, if any, are correctly defined;
  • the reading order is logically correct, so that screen readers, for example, do not output incomprehensible confusion even when texts have multiple columns;
  • informative graphics have correct text alternatives;
  • the document has sufficient contrasts (font-background reaches at least 4.5:1 for normal text, 3:1 for large text);
  • links are keyboard-operable, i.e. can be reached by tabbing.
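
The contrast thresholds above can be checked by hand for a single colour pair. The following is a minimal sketch of the contrast ratio calculation as defined in WCAG 2.1 (relative luminance of the lighter colour plus 0.05, divided by that of the darker colour plus 0.05). The function names are illustrative assumptions and are not part of PAC or any other tool; colours are assumed to be opaque 8-bit sRGB values.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with values from 0 to 255."""
    r, g, b = (_linearize(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours; the order of the arguments does not matter."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground, background, large_text=False):
    """Check the level AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    grey, white = (118, 118, 118), (255, 255, 255)
    print(round(contrast_ratio(grey, white), 2), meets_aa(grey, white))
```

Black text on a white background gives the maximum ratio of 21:1. A mid-grey such as (118, 118, 118) on white only just passes the 4.5:1 threshold for normal text, while a slightly lighter grey already fails it.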

It is best to take all these requirements into account in the source format itself (e.g. in Word or InDesign). This is often quite easy to do, e.g. by using paragraph styles correctly in a word processor, by adding alternative texts, by specifying meta information in the document, etc.

In the next step, conversion to PDF, it is important that the preliminary measures taken are transferred to the PDF. For example, a ‘print to file’ should be avoided, because this can cause the semantic structure to be lost. In addition, the tag structure from the word processing software must be transferred to the PDF.

Post-processing in the resulting PDF document is a stopgap measure and should be avoided. One problem, for example, is that the post-processing has to be repeated every time the document is changed in the source format. However, if only the PDF is available, if aspects are involved that cannot be defined in the source format, or if the authors lack the necessary basic knowledge, then there is often no alternative. A PDF production process that is sustainably geared towards accessibility, however, does not rely on post-processing.

Ideally, PDF documents are accessible, so that their content can be used by as many people as possible, including those with various forms of disability.

The sensible use of PAC when checking PDFs

Website operators who are faced with an extensive legacy of PDF documents, or clients who receive new PDF documents from agencies, usually do not know whether these documents are accessible. This is where PAC comes into play, and the following fundamental testing process is well supported by this tool:

  1. Test whether the PDF contains correct metadata (document title, language declaration) – the result is immediately displayed in the test summary;
  2. Determine whether a tag structure is present – the result is also immediately displayed in the test summary: if there is no tag structure, the test can be terminated, as the document cannot be used efficiently with a screen reader due to the lack of structure; if the structure is simple (e.g. single-column, without tables, etc.), which is often the case with documents from Word, etc., then the contents are at least sequentially readable and understandable;
  3. If a structure is present, it is necessary to check in detail whether this tag structure is correct, using the ‘Screen reader preview’ function: in direct comparison with the original document, you can visually verify whether visual headings have been correctly converted as headings (at the correct, logical level), lists as lists, paragraphs as paragraphs, tables with column and row headings (th), etc.; it is equally important that all content is found in this structure, that the reading order is correct, and that text alternatives for graphics are recognisable;
  4. Contrasts: PAC also checks contrasts quite reliably under the ‘WCAG’ tab;
  5. Keyboard operability: PAC can provide information for this test point, even if experience shows that this test is still rather unreliable. It is best to test directly using the keyboard in a commonly used PDF reader.
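
For step 3, a few purely mechanical structure problems can be flagged before the visual comparison begins. The sketch below is a hypothetical helper, not a PAC feature: it assumes the tag names have already been extracted from the PDF in reading order (for instance by copying them from a tag tree view) and only looks at the numbered heading tags H1 to H6. It cannot tell whether a visual heading was tagged as a paragraph, which is exactly why the visual check remains necessary.

```python
import re

# Matches the numbered heading tags H1..H6; the generic "H" tag is ignored in this sketch.
HEADING_TAG = re.compile(r"H([1-6])")


def heading_issues(tags):
    """Return a list of warnings for structure tag names given in reading order."""
    levels = []
    for tag in tags:
        match = HEADING_TAG.fullmatch(tag)
        if match:
            levels.append(int(match.group(1)))

    if not levels:
        # Typical symptom of a document that consists of paragraphs only.
        return ["no H1-H6 tags found"]

    issues = []
    for previous, current in zip(levels, levels[1:]):
        # Going deeper by more than one level at a time skips a heading level.
        if current > previous + 1:
            issues.append(f"heading level jumps from H{previous} to H{current}")
    return issues


if __name__ == "__main__":
    print(heading_issues(["P", "P", "L", "P"]))
    print(heading_issues(["H1", "P", "H3", "P"]))
```

An empty result only means that none of these simple heuristics fired; it says nothing about whether the headings match what a sighted reader sees on the page.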

The third step in particular, the ‘manual visual check’ of the logical structure of the document using the screen reader preview, is often the most important test step. Many documents have a structure, but it can be completely wrong (e.g. there are only paragraphs and no headings). However, because this cannot be checked automatically, PAC cannot flag it clearly enough. Therefore, non-professionals shouldn’t give too much weight to the automatically generated test result in PAC (the ‘red crosses’), at least in the ‘PDF/UA’ tab, as the requirements placed on PDF documents are mostly those of WCAG, not PDF/UA. Moreover, the effort involved in resolving these issues is often disproportionate to the actual gain in practical accessibility of a PDF document.

Misconceptions about the use of test tools in general, and about PAC in particular

Automated testing of the accessibility of digital content is currently limited. Even providers of such tools usually claim that no more than a quarter of the WCAG requirements can be tested reliably. Given the large number of potential WCAG violations in a complex system, this can still be very helpful, and warnings about potential violations can indicate where manual review by experts is needed. This assessment of the limitations of automated accessibility testing currently seems to be quite widely accepted.

However, there are significant misunderstandings about PDF documents in particular. In many cases, ‘PAC conformity’, i.e. the automated test using the PDF Accessibility Checker PAC, is defined as the standard for PDF documents, even though it doesn’t reflect the actual accessibility of a document! This means that the automated test result of a single testing tool is simply accepted as proof of the accessibility of documents, which is completely wrong.

This is because automated testing is prone to errors and any testing tool can be fooled. For example, we often come across PDF documents that have 0 errors in the automated PAC check but are completely inaccessible. In many cases, the logical structure of the content (paragraphs, headings, lists, etc.) is incorrect, so that the document can no longer be used efficiently, or even understood, with a screen reader. PAC cannot test this because it lacks text comprehension. If clients only consider this parameter, and agencies only work towards ‘PAC conformity’ and no longer towards accessibility, it becomes difficult to achieve good practical accessibility or WCAG compliance.

Conclusion

PDF accessibility remains relevant, and the basic requirements for accessible PDFs can usually be implemented with reasonable effort. However, the objective must be practical accessibility and WCAG compliance, and not conformity with a single automated testing tool with limited reliability. With this understanding, the PDF Accessibility Checker PAC can be used sensibly, especially to support the visual testing of the document structure.
