Interoperable global standard for “smart” e-Textbooks
2014-03-31
B. MCCOY
(International Digital Publishing Forum (IDPF))
1 High-level requirements for next-generation digital learning content
It is clear that monolithic digitized equivalents of print textbooks cannot meet emerging requirements of digital learning environments. E-Textbooks must ultimately be reconceived as dynamic collections of intelligent concept-based components, which may be orchestrated and integrated (with respect to assessments and analytics) at the publication, course, and/or curriculum level. And such “smart” learning content must also be interoperable: deployable across a wide range of devices, and utilizable across disparate learning platforms.
All learning platforms require supporting instructional content, which is increasingly expected to be digitally delivered and consumed, rather than paper-based. In some cases this content may be similar to traditional textbooks, and may exist side-by-side with continued use of paper materials. In other cases this content may need to be integrated at a more fine-grained level, as collections of concept-based components through which students pursue customized learning paths to master an overall curriculum. In still other cases, the content itself may be the orchestration of the overall learning process, with integrated assessments and analytics and with the learning path(s) provided in the “smart” e-Textbook itself.
Digital learning content is often expected to be consumed online, and from a user perspective it will be increasingly important not merely to be guided through distilled courses but also to have the opportunity to connect out to broader sources of information. But at the same time, in the global context, occasionally-connected and limited-bandwidth computing is still the rule rather than the exception, so the ability to consume content offline remains important.
Compelling user experiences are critical to engaging and delighting learners, and differentiated experiences will distinguish premium products as much as differentiated content. And while content may in many cases become more granular, the fine-grained digital assets (pieces of artwork and video, paragraphs of text) are only the raw ingredients of the overall experience. As ingredients can be assembled in many ways into dishes and meals, so with digital learning content we need to think about the assets as well as about their orchestration into effective learning modules, courses, and curricula. Content needs to be well-structured in order to be meaningful as well as to be reusable in these varied contexts.
2 Learning platforms and interoperability
The concept of “learning platform” is problematic thanks to rapid evolution in the development of e-Learning environments and ongoing exploration of multiple pedagogies. Any ICT platform is an ecosystem, or shared architecture with multiple participants (users, implementers, and solution providers) [1]. In some ICT vertical markets the boundaries and points of integration between platforms are relatively clear: for example, as between the PC hardware and PC operating system platforms. However, platforms often overlap, and in education environments this is particularly evident, with integration points that are not yet completely defined. Learning platforms may include Learning Management Systems (LMSs), school-wide digital gradebooks and student portfolios, course-specific platforms provided by publishers or specialized solutions, and providers supporting massive open online courses (MOOCs). Particular platforms may support traditional teacher-student roles, completely online distance learning, and/or intermediate approaches such as “flipped classrooms”. The interactive book itself may be the primary platform or at least its manifestation to the reader.
In a learning environment, education-specific platforms also intersect with larger platform ecosystems including scholarly and research platforms, library platforms, and general e-Commerce and device ecosystems.
A complicating factor is the natural desire of platform providers to achieve lock-in to their own proprietary architectures. Platforms often exhibit network effects, where the more adopters a platform has, the more others are incentivized to adopt it, which in turn makes the platform still more valuable to others. Platforms often also exhibit switching costs, which means that adopters of a platform will incur costs in migrating to another platform. And proprietary platforms often exhibit implicit or explicit barriers to entry by competitors. While closed, proprietary platforms can innovate quickly, particularly in the earliest stages of development of a market, closed platforms also inhibit innovation because they do not facilitate competitive innovation and they “lock in” adopters. As a market matures, innovation theory suggests [2] that proprietary platforms will generally be replaced by interoperable modular architectures that can ultimately deliver more innovation and lower overall costs. The alternative scenario is a single commercial vendor dominating a key market with a closed proprietary platform to which customers are locked via switching costs, from which competitors are excluded due to barriers to entry, and with network effects that, while salutary, accrue primarily to the profit of the platform provider. By contrast, interoperable open platforms are level playing fields open to all: customers are not “locked in” and network effects accrue to the benefit of all participants.
Open vs. closed platforms is not necessarily a bright-line distinction. Cloud services, for example, are often positioned as open, yet they can foster lock-in even if their user manifestations are websites that work across browsers and platforms and even expose APIs for third-party extension: content and data can be locked in to cloud-based vendor silos as easily as to solutions that rely on custom applications on the client side.
It is intrinsically desirable that next-generation digital content be usable across platforms and devices in order to minimize costs of development and distribution and maximize availability and accessibility. Research has also demonstrated that higher levels of interoperability generally lead to higher levels of innovation [3]. So interoperability is a prima facie goal.
In addition, interoperability of digital content is an important strategic objective for the industry and society, so that intermediaries do not capture a monopoly position and increase costs at the expense of cost-effectively serving learners and adequately compensating the authors and publishers of premium content.
While multiple interoperable content formats could coexist, if the education industry can standardize on a single global format, that would further reduce costs and increase content availability and accessibility, particularly if fragmentation, a frequent bane of open architectures, can be avoided.
3 Standards for content interoperability
While not a complete answer to the need for platform interoperability for digital learning environments, an open standard for structured content is clearly a key building block. What format(s) are suitable to fill this need?
PDF is by far the most widely adopted format for portable documents. However, PDF has a key architectural limitation: it is based on final-form page images that replicate paper. This renders PDF inherently unsuitable for delivering content to a wide array of device screen sizes and formats. As well, the “typeset at the factory” nature of PDF makes it far less suitable to deliver accessible content, and makes remixing and other downstream reuse of PDF content much more difficult. Content increasingly needs to be able to be machine-processed for a wide variety of use cases including aggregation, search, discovery, data analysis, and automatic summarization. An arbitrary PDF file can be reliably divided only at page boundaries, and its internal logical structure is not reliably determinable.
In addition, while PDF over time added support for forms, scripting, and rich media, these additions were proprietary and unique to PDF, are not widely supported beyond the implementations of its proprietary vendor, Adobe, and have been repeatedly associated with security exposures. This speaks to a more general problem with PDF: while it is no longer explicitly the property of its original developer, Adobe, it is not a collaboratively developed open standard and its de jure standardization has been something of a rubber-stamp process. All in all, while PDF will clearly remain an important format, particularly for print/prepress workflows, it is inherently unsuitable to the needs of next-generation interactive content, as evidenced by the fact that Adobe itself is utilizing another proprietary format of its own devising, rather than PDF, for interactive digital magazines.
The genesis of the Web was a delivery format for digital publications, originally physics papers. Websites using HTML and CSS quickly became a viable way to deliver hypertext content that, unlike PDF, was not limited to mimicking paper. However, HTML and CSS did not provide strong support for interactivity, rich media, or high-design content, and proprietary solutions including Adobe Flash, Microsoft Silverlight, and Sun’s Java were delivered as browser plug-ins. Flash in particular featured strong cross-platform support and designer-friendly tooling that could be used on websites (via plug-in), in installed applications (via solutions like Adobe AIR), and even for documents (via solutions like FlashPaper). A number of interactive e-textbook solutions have been deployed via Flash.
However, these proprietary vendor-specific technologies proved challenging to scale across the wide array of emerging mobile devices, and in the meantime the Web platform significantly advanced based on collaborative development by many parties. HTML5 in particular came into being in large part to create an open alternative to these proprietary solutions, and its core capabilities including video, audio, and animation are now supported by all modern browsers. CSS styling has also advanced considerably, supporting features such as multiple columns and custom fonts. And related Open Web Platform standards are also adopted on all modern browsers, including SVG (scalable vector graphics), MathML (structured markup for mathematics), and Canvas (an API for 2D drawing from JavaScript). The modern Open Web Platform is still maturing, with tools and runtimes that in some ways have yet to surpass those of Flash, much less equal those of native app development. But the pace of evolutionary change is rapid, with four browsers having large market share, thousands of tool and solution developers, and millions of Web developers, and Flash, Silverlight, and client-side Java are now viewed as niche/legacy technologies.
XML is a widely utilized format for representing structured document content that shares a common origin with HTML (and several Web Standards like HTML, SVG, and MathML support XML-based serializations). However, there is a wide range of XML-based publication schemas in use (e.g. DITA, DocBook, and NLM). Given this proliferation of specialized schemas, it would be challenging to establish a universal horizontal standard. Overall, XML is only a building-block format and generally more suitable for storage of textual assets “upstream” in content management systems than for delivery of rich, structured content experiences. And XML has no inherent support for rich media, interactivity, styling, or layout, and adding these to a custom XML-based format would “reinvent the wheel” of the Web.
It seems clear based on this overall landscape that the Open Web Platform is the appropriate foundational technology for “smart” e-Textbooks, as it supports rich media, interactivity, and separation of content structure and layout styles, and the browser stack is becoming the ubiquitous cross-platform “experience delivery runtime”. However, online websites are not the only way content needs to be delivered, and arbitrary websites, while being more structured than PDF files in some ways, are less structured in others: a website, for example, has no clear linear reading order. And even the internal structure of the content elements making up a website may be obscured by JavaScript-based navigational affordances and other “web app” constructs, and further confused by client-server entanglements. And the very nature of the Open Web Platform is somewhat amorphous: over 100 distinct specifications are considered by W3C to comprise the platform [4], but there is no over-arching profile of how these specifications fit together, so developers can only proceed based on how they are integrated de facto in popular Web browsers.
The EPUB 3 format [5] fills this gap by defining a reliable portable document abstraction for the Open Web Platform. EPUB specifies a concrete profile of the multiplicity of moving-target building-block Web Standards, and adds navigational structure, metadata, packaging, synchronized media playback, text-to-speech enhancements, and global language support. These features enable EPUB 3 publications to have useful properties including being accessible, usable offline, and remixable.
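To make the packaging, metadata, and navigation layers concrete, the following is a minimal sketch in Python of how an EPUB 3 style container might be assembled. It assumes the commonly documented container layout: a ZIP archive whose first entry is an uncompressed `mimetype` file, a `META-INF/container.xml` that points to the package document, and a package document listing metadata, a manifest, and a spine. All file names, titles, and the identifier are illustrative placeholders, the result has not been run through a validator, and a real publication would need more than is shown here.

```python
import io
import zipfile

# Everything below (paths, title, identifier, dates) is a placeholder
# chosen for illustration, not taken from any real publication.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Learning Module</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2013-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml"
          properties="nav"/>
    <item id="concept1" href="concept1.xhtml"
          media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="concept1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="concept1.xhtml">Concept 1</a></li></ol>
    </nav>
  </body>
</html>
"""

CONCEPT_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Concept 1</title></head>
  <body><h1>Concept 1</h1><p>Sample content.</p></body>
</html>
"""


def build_minimal_epub() -> bytes:
    """Return the bytes of a small EPUB-style ZIP container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry is written first and stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # container.xml tells a reading system where the package document is.
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        # The package document carries metadata, manifest, and spine.
        archive.writestr("EPUB/package.opf", PACKAGE_OPF)
        archive.writestr("EPUB/nav.xhtml", NAV_XHTML)
        archive.writestr("EPUB/concept1.xhtml", CONCEPT_XHTML)
    return buffer.getvalue()


# List the entries of the generated archive in the order they were written.
print(zipfile.ZipFile(io.BytesIO(build_minimal_epub())).namelist())
```

Because everything a reading system needs travels inside one archive, the same file can be opened offline, and its parts can be located mechanically through the manifest.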
4 EPUB Background and Adoption
The International Digital Publishing Forum (IDPF) is a non-profit trade and standards organization made up of over 300 member organizations from 41 countries. Our membership is drawn from the whole value chain of publishing, including publishers, eBook retailers, software and device manufacturers, service providers, libraries, government agencies, educational institutions, and regional publishing associations. IDPF’s mission is to foster global adoption of an open, accessible, interoperable digital publishing ecosystem.
The IDPF has since 2000 developed a standard file format (originally OEBPS, since 2007 called EPUB) for eBooks and other digital publications based on XHTML, CSS, and related Web Standards. Unlike inherently pre-paginated formats such as PDF, EPUB is designed to adapt gracefully to different sized displays. With eBook adoption accelerating in recent years, EPUB has rapidly become the prevalent open standard format. All of the top “trade” book publishers in the United States now deliver a single EPUB file per title to all of their digital distribution channels. Many eBook retailers and device/software manufacturers distribute EPUB files to consumers, including Apple iBooks, Barnes & Noble Nook, Kobo, Google Play Books, OverDrive, Sony Reader, and many others. EPUB is also used as an interchange format: publishers submit EPUB files to vendors such as Amazon, who convert EPUB into proprietary distribution formats, and to web-based service providers like Safari Books Online, who ingest EPUB content into their cloud database and give consumers browser-based reading experiences.
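As an illustration of the interchange use case, the sketch below shows how an ingesting service might recover the linear reading order of an EPUB file: it reads `META-INF/container.xml` to find the package document, then resolves each spine entry against the manifest. The function name `reading_order` and the in-memory fixture are hypothetical, and the fixture is a stripped-down stand-in (no metadata or content documents), not a valid publication. Real ingestion pipelines would also handle errors, metadata, and media.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from typing import List

NAMESPACES = {
    "ocf": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def reading_order(epub_bytes: bytes) -> List[str]:
    """Return archive paths of content documents in spine (reading) order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        # Step 1: container.xml names the package document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("ocf:rootfiles/ocf:rootfile", NAMESPACES)
        package_path = rootfile.attrib["full-path"]
        # Step 2: parse the package document itself.
        package = ET.fromstring(archive.read(package_path))

    # Step 3: the manifest maps item ids to hrefs relative to the package.
    manifest = {
        item.attrib["id"]: item.attrib["href"]
        for item in package.findall("opf:manifest/opf:item", NAMESPACES)
    }
    # Step 4: the spine, not the manifest, defines the reading order.
    base = posixpath.dirname(package_path)
    return [
        posixpath.join(base, manifest[ref.attrib["idref"]])
        for ref in package.findall("opf:spine/opf:itemref", NAMESPACES)
    ]


def sample_epub() -> bytes:
    """Build a tiny, deliberately incomplete fixture for demonstration only."""
    container = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="EPUB/package.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        '</container>'
    )
    # The manifest lists "quiz" before "intro"; the spine reverses that.
    package = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        '<manifest>'
        '<item id="quiz" href="quiz.xhtml" media-type="application/xhtml+xml"/>'
        '<item id="intro" href="intro.xhtml" media-type="application/xhtml+xml"/>'
        '</manifest>'
        '<spine><itemref idref="intro"/><itemref idref="quiz"/></spine>'
        '</package>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", container)
        archive.writestr("EPUB/package.opf", package)
    return buffer.getvalue()


print(reading_order(sample_epub()))
# prints ['EPUB/intro.xhtml', 'EPUB/quiz.xhtml']
```

The explicit spine is what gives an EPUB the clear linear reading order that an arbitrary website lacks, which is why a converter or cloud service can process the content without rendering it first.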
The latest version of EPUB, EPUB 3.0, was approved by the IDPF membership in October 2011. EPUB 3 is based on HTML5, CSS3 modules, and other elements of the modern Open Web Platform, enhancing layout options and accessibility and adding support for fixed layout, rich media, interactivity, and global languages.
EPUB 3 has been rapidly adopted in Japan and other markets that require global language support and, as the eBook market in North America and Europe migrates from E Ink based dedicated eReaders to tablets and smartphones, is gradually replacing EPUB 2. Among the general-purpose reading systems in the market with EPUB 3 support are Apple iBooks, Kobo, Google Play Books, Sony Reader, Voyager Japan, and (via conversion) Amazon Kindle. Over two dozen organizations have formed an independent open source consortium, Readium Foundation, to pursue common implementation technology for EPUB 3 for native apps and browser-based consumption. The Association of American Publishers (AAP) has formed an EPUB 3 Implementation Project to migrate to EPUB 3 for “trade” eBooks by Q2 2014.
In addition to use for general eBooks, EPUB 3 is being rapidly adopted for e-textbooks in North America, with the two largest providers, VitalSource and CourseSmart, both moving from proprietary formats to EPUB 3. An IDPF workshop on the adoption and next steps for EPUB 3 and the Open Web Platform for education, EDUPUB, is taking place in Boston, USA on October 29-30, 2013 [6].
5 EPUB Development Roadmap
IDPF develops specifications through Working Groups that are open to participation by all IDPF members and Invited Experts. New working groups can be formed by interested IDPF members through a well-defined process.
Presently the umbrella EPUB Working Group is finalizing a minor revision to EPUB, EPUB 3.0.1, that will be completed by late 2013, and is beginning an initiative to develop interoperable annotations for EPUB (unfinished business from the EPUB 3.0 Charter) that aims to complete in 1H 2014. Separate working groups are developing modular specifications for indexes, dictionaries/glossaries, and advanced fixed layout (supporting samples, selection between multiple renditions in a single .epub file, mapping between renditions, and enhanced sub-page navigation). These specifications are also expected to complete by 1H 2014.
Additionally, there is already one compatible specialized profile of EPUB 3, for the Japanese market, and several others are likely to be developed in the next year. IDPF will establish an overall framework for coordinating such profiles and avoiding fragmentation.
6 Key Organizational Collaborations
W3C: IDPF has had an established liaison with W3C for several years and has collaborated particularly closely with the W3C CSS Working Group, with EPUB 3.0 development helping to reinvigorate W3C work on global language support features in CSS such as vertical writing and ruby, and to instigate new work such as CSS Regions and Exclusions. Following a joint workshop with IDPF in February 2013, the W3C has formed a new Digital Publishing Activity with an interest group co-chaired by the CTO of IDPF, to work to improve base features of the Open Web Platform for publishers and enhance overall alignment of EPUB with that broader platform. W3C is co-sponsoring the IDPF EDUPUB workshop on October 29-30, 2013, as well as an IDPF workshop on EPUB 3 and the Open Web Platform on November 30, 2013.
DAISY Consortium: DAISY focuses on accessibility for people with blindness and other print disabilities, and has developed a custom format, DAISY DTBook, for accessible digital editions. DAISY and IDPF collaborated closely to ensure that EPUB 3 was a superset of DTBook accessibility functionality, and DAISY’s technology development efforts are now focused on EPUB 3 as the mainstream accessibility-enabled format. Accessibility mandates that presently specify DAISY content, such as the NIMAS in the USA, are expected to migrate to EPUB 3 over time. DAISY’s CTO is jointly appointed CTO of IDPF and DAISY’s CEO serves as the elected Board President of IDPF.
IMS Global Learning: IMS focuses on e-Learning standards. IMS and IDPF are collaborating on the necessary infrastructure to connect e-Textbook content and LMS environments, and IMS is an organizational co-sponsor of the EDUPUB workshop.
IDEAlliance: IDEAlliance focuses on magazine industry standards. IDEAlliance and IDPF collaborated to ensure that EPUB 3.0 could support IDEAlliance PRISM metadata, and are presently collaborating on a potential digital magazine profile of EPUB 3.
BISG (Book Industry Study Group): BISG focuses on best practices and education for the publishing industry supply chain. IDPF and BISG are collaborating closely as the BISG Content Structure Committee is developing a variety of work products related to EPUB 3, including a field guide to fixed-layout EPUB 3 and a database of EPUB 3 feature support in reading systems.
EDItEUR: EDItEUR defines metadata standards for the publishing industry, including ONIX. EDItEUR and IDPF collaborated to ensure that EPUB 3.0 could support ONIX metadata, and are presently collaborating together with DAISY on various accessibility initiatives, including an enabling technologies framework supported by the United Nations WIPO (World Intellectual Property Organization).
ISO/IEC: EPUB 3 is in the process of being approved as an ISO-level Technical Specification under the auspices of JTC1/SC34, based on a fast-track submission by the Republic of Korea (which has adopted EPUB 3.0 as a Korean National Standard). IDPF is granting permission for this and will participate as a liaison to a Joint Working Group being established that will initially include JTC1/SC34, ISO TC46, and IEC TC100/TA 10. A full International Standard is expected to be pursued in the future (pending stabilization of building-block W3C specifications such as HTML5 and CSS3 modules).
7 Conclusions
Publishers in general, including education publishers, need to deliver digital book content as packaged eBooks, websites, and native applications built with Web technologies. It is not economically viable for publishers to create and manage many distinct versions of each title. Therefore, it’s critical for publishers to have a consistent platform that enables them, wherever possible, to deliver a single publication format that can be utilized everywhere and, even where specialization is called for, to reuse tools and components across these different modes of distribution.
As well, notwithstanding the growing importance of mobile apps and application runtimes, a core strength of the Web lies in declarative document representations. Fundamental content structure is an area of central interest to the publishing industry, an interest which transcends how content is packaged and delivered. And in helping educational publishing migrate to digital learning materials, handling content that is both rich and structured will be particularly critical.
EPUB delivers on these core requirements and has been far more broadly adopted than any alternative. While there is still significant risk of fragmentation and divergence into proprietary silos, by universally adopting EPUB 3 as the accessible, global standard, and working to further enhance it to meet education publishing industry requirements, we can ensure an interoperable environment for next-generation digital content.
IDPF solicits continued involvement of the global community in the process of advancing the collaboratively developed EPUB open standard to better serve emerging best practices for next-generation learning environments and effectively deliver both the core capabilities and real-world interoperability needed to spark widespread global adoption of e-Learning and “smart” e-Textbooks.
[1] EVANS D. Invisible Engines: How Software Platforms Drive Innovation and Transform Industries (2006).
[2] CHRISTENSEN C. Disruption, disintegration and the dissipation of differentiability. ICC (2002) 11(5): 955-993.
[3] PALFREY J, GASSER U. Interop: The Promise and Perils of Highly Interconnected Systems (2012).
[4] W3C. http://www.w3.org/QA/2011/01/100_specifications_for_the_ope.html.
[5] IDPF. EPUB 3. http://idpf.org/epub/30.
[6] IDPF. EDUPUB. http://idpf.org/edupub-2013.