The Importance of EPUB

Introduction

EPUB has become a fundamental technology for the global publishing ecosystem. It is the preferred format for a broad range of publication types, and it is considered essential for accessibility. It has also become embedded in systems and workflows, not just as a distribution file format but also as the basis for content development and management.

It is important to this ecosystem that the specificity, portability, and predictability provided by EPUB be maintained and advanced as a profile of the more general, flexible, and accommodating Web Publication format.

As the convergence of EPUB and Web Publications moves forward in the proposed Publications WG in the W3C, it is critical to the publishing ecosystem that EPUB 3 be maintained and refined in the meantime (which will be done in the EPUB 3 CG). It is even more important that the next generation of EPUB, currently referred to as EPUB 4, retain the specificity, portability, and predictability required by the publishing ecosystem while benefitting from the improved features and functionality offered by full alignment with the Open Web Platform as a profile of Web Publications and as a well-defined type of Portable Web Publication.

EPUB 4 must not be in conflict with Web Publications; it must be a type of Web Publication that provides the predictability and interoperability that this ecosystem has come to rely on.

Trade Books

The first and still the most common use of EPUB is for the distribution of ebooks. Because it has become so widely accepted in this space, it is now possible for trade book publishers to create a single EPUB file that can be provided to all the retailers and aggregators for whom they previously had to create separate versions. Although the biggest recipient, Amazon, still delivers a proprietary format to consumers, the single EPUB that a trade book publisher sends to the rest of its partners is also the preferred format to send to Amazon, which converts it into that proprietary format.

The ability to send a single EPUB file to multiple recipients in the book supply chain is an important business requirement for publishers, removing significant friction and maintenance overhead from production and distribution workflows. That ability is based on the specificity and consistency provided by the EPUB format, which remove ambiguity and unpredictability as files move between systems.

Although EPUB was used at first mainly for books with relatively simple formats—fiction and trade nonfiction—it is now used for almost all types of trade books, including books with complex layouts (e.g., cookbooks, travel guides) and books for which the graphics and page layout are essential to how the book “works,” such as many children’s books. As another example, EPUB has become the standard format for the distribution of e-manga in Japan.

Education

EPUB and the EPUB for Education profile are used not so much for distribution to the retail supply chain as to provide a framework for the content infrastructures and platforms by which many large educational publishers develop, deliver, and disseminate their content to the learning management systems (LMSs) and virtual learning environments (VLEs) used in the classroom.

While these implementations are essentially built on Open Web Technologies, this is an example of the added value that the EPUB format provides: an enhanced vocabulary, containing publication- and education-specific terms not available in HTML or WAI-ARIA; the ability to create a complex publication consisting of many documents, media, and interactive features as a single well-organized entity; and the ability to extract “chunks” of content (distributable objects) such as tests, quizzes, exercises, and scripted components, and distribute them as valid EPUBs as well. EPUBs used in education also have stricter accessibility requirements than those of the web in general, although those requirements are all consistent with WAI, WCAG, and ARIA.

The ability to create arbitrarily complex, interactive, and media-rich publications as consistent, coherent, identifiable entities is an important business requirement for publishers that the EPUB format provides.

Government and Corporate Documents

EPUB is also not just for book content. IBM, for example, has moved from PDF to EPUB as the standard format by which its documents are delivered. Japanese official documents are distributed as EPUBs. The EU Publications Office (EU OP) has created EPUBs for the extremely diverse set of publications it distributes, ranging from legal, parliamentary, and judicial documents to instructional and informational documents from the EU agencies in all countries of the European Union, in all the EU languages. The EU OP is a strong supporter of the continued evolution of EPUB and Web Publications because its mission is the wide and free distribution of content by all means possible throughout the EU. Finally, as an indication of how ubiquitous EPUB has become for document publishing, Google Docs now provides automatic export to EPUB.

The ability to disseminate publications in a form that can adapt to any rendering environment, online or offline, in any orientation and dimension, and that is well understood and adopted throughout the world, is an important business requirement for publishers that EPUB provides.

Scholarly Journals

Because scholarly journals were early to see the benefits of digital distribution, the use of PDFs for journal articles became the norm years ago. This is a problem today because PDFs are not reflowable or sufficiently accessible. This situation is about to change: Atypon, one of the leading hosts of scholarly journal content (40% of the world’s peer-reviewed journal literature is on its Literatum platform), has announced that its next release, coming later in 2017, will create EPUBs as a standard output, requiring no changes to submitted content on the part of the publisher. This will make it possible for millions of journal articles to be available as EPUBs.

The ability to automatically generate a reflowable file that renders adaptively, online or offline, in a web-conformant format, from arbitrary source files such as the NLM/JATS/BITS XML format universally used in scholarly journals, is an important business requirement that EPUB provides.

EPUB as Embedded Technology

As a further indication of how ubiquitous EPUB has become, it is part of iBooks, which is embedded in iOS. This means that all users of current iOS devices can render EPUBs natively. Similarly, EPUB is natively supported by Google Play, which is available in Android. Even more significantly, the late beta of Microsoft Edge incorporates EPUB natively in the browser, as does the Windows 10 Creators Update. These are all indications of how fundamental EPUB has become—and how close it is to supplanting PDF as the default publication viewing format.

The ability to create and disseminate publications in a format that renders natively in browsers, authoring environments, and other widely used systems is an important business requirement that EPUB provides.

Accessibility

The publication of EPUB Accessibility 1.0 in January 2017 was a watershed event in the publishing ecosystem. It provides the long-needed “baseline specification” for what is meant by “an accessible publication.” Based on and fully conformant with all Web accessibility guidelines, EPUB Accessibility provides publication-specific requirements that will enable the creation of authoritative, referenceable specifications for use both in legal contexts and in procurement documents, especially in government and educational contexts. It also provides the basis for accessibility certification, which is actively being developed by the DAISY Consortium under a Google Impact Grant. EPUB is now widely preferred as the format for the distribution of accessible content.

The ability to make standard publication formats, created by standard publishing workflows, natively accessible, rather than producing accessible publications in a separate, purpose-built form based on remediation of those formats, is an important business requirement provided by EPUB.

Why EPUB 4?

The convergence of the Web and Publishing, which is the main motivation behind the recent combination of the IDPF into the W3C, means that future publications will be able to make use of all the features available on the Web and can be displayed, without any specific actions, in any Web browser. This evolution is essential for some of the aforementioned publishing areas, such as educational documents and scholarly journals and books. It leads to the concept put forward by recent work at W3C and now planned to be a core development for Publishing@W3C: Web Publications, and its subset, Portable Web Publications.

Web Publications need to be able to use any and all available web technologies, whether online or offline. The Web Publication format needs to be extremely accommodating and agnostic. For example, when a Web Publication is packaged, it must be possible to use any packaging format available on the Web, now or in the future. And the Web Publication specification needs to align completely, down to the specifics of “may,” “should,” and “must,” with the Web in general.

However, the publishing ecosystem requires specificity, portability, and predictability that may mean, in some respects, limiting such choices and requiring things that may not be required by Web Publications in general. For example, while a Web Publication may be packaged in any valid way, it is useful for the publishing ecosystem to know that all EPUBs are packaged in a certain way (e.g., as a .zip). Likewise, the Web does not require WCAG AA conformance; this is only recommended for web content. EPUB 4, on the other hand, may require WCAG AA conformance.
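
The packaging point above can be made concrete with a small sketch. It assumes the container rules of EPUB 3 as commonly described (a ZIP archive whose first entry is an uncompressed file named mimetype containing application/epub+zip, plus a META-INF/container.xml entry); how EPUB 4 will be packaged is not settled here. The function names looks_like_epub and build_minimal_container are hypothetical, chosen only for illustration, and the check is far from a full conformance test.

```python
import io
import zipfile

# Assumption: the media type string used by the EPUB 3 container format.
EPUB_MIMETYPE = b"application/epub+zip"


def looks_like_epub(data: bytes) -> bool:
    """Rough check that the bytes resemble an EPUB 3 ZIP container.

    Hypothetical helper; it inspects only a few predictable features
    and does not validate the publication itself.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            infos = zf.infolist()  # entries as listed in the archive directory
            if not infos:
                return False
            first = infos[0]
            return (
                first.filename == "mimetype"
                and first.compress_type == zipfile.ZIP_STORED  # uncompressed
                and zf.read(first) == EPUB_MIMETYPE
                and "META-INF/container.xml" in zf.namelist()
            )
    except zipfile.BadZipFile:
        # Not a ZIP archive at all.
        return False


def build_minimal_container() -> bytes:
    """Build an in-memory archive with just the entries checked above.

    The container.xml body is a placeholder, not a valid document.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", EPUB_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", "<container/>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


if __name__ == "__main__":
    print(looks_like_epub(build_minimal_container()))
    print(looks_like_epub(b"plain text, not an archive"))
```

Because every EPUB is expected to share this shape, a receiving system can make such a check before doing anything else with the file, which is the kind of predictability a more permissive packaging rule would not offer.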

The recognizable and widely implemented EPUB format can, and should, continue to evolve. But it is important that its identity as a specific type of Web Publication, one that provides the specificity, portability, and predictability required by the publishing ecosystem, be maintained in its next, fully Web-conformant generation.