As an author of stories for language learners, it’s always important to me to create materials that are both fun and useful on a day-to-day basis. So perhaps it will not come as much of a surprise that many of these works are based on my own experiences (and frustrations) of learning languages myself.
A couple of weeks ago, for example, in an effort to bolster my Hebrew reading-comprehension skills, I was delving into an Ephraim Kishon story and really wished that there was a way to have the text read aloud to me while I was parsing it. As you may know, modern Hebrew is written without vowels which can make the pronunciation of unknown words really tricky sometimes. This paragraph would read something like this:
cpl f wks g, fr xmpl, n n ffrt t blstr my Hbrw rdng-cmprhnsn sklls, ws dlvng nt n phrm Kshn stry nd rlly wshd tht thr ws wy t hv th txt b rd ld t m whl ws prsng t. s y my knw, mdrn Hbrw s wrttn wtht vwls whch cn mk th prnnctn f nknwn wrds rlly trcky smtms. Ths prgrph wld rd lk ths:
Fun times, right?
The Bewildering World Of Talking Ebooks
Since I couldn’t force the publisher to integrate narration into the text, I decided to look into various ways of creating something like this for my own learning materials, since I have both the text and the audio (each published separately) for many of my books. Ideally I wanted an ebook that would highlight each sentence as it was being read aloud and also let readers tap on any sentence to hear it read.
Now, some of you might say: but Amazon already has something like that. No need to reinvent the wheel!
And it’s true. If you own the Kindle edition of a title and also have the Audible edition, Amazon will automagically bring them together as part of a feature called “WhisperSync for Voice”. Since most of my books are available on both Kindle and Audible, I figured it would be no big feat to just bring the two together.
Turns out that
a) Amazon doesn’t make this available for all titles, and
b) the conditions for making it available are really opaque.
Apparently the Kindle edition and the Audible edition need to be word-by-word duplicates, which makes sense since the app has to align the audio with the text but really is a bummer for books such as mine that contain extra text like vocabulary and exercises.
So Amazon and its Kindle devices & apps unfortunately were off the table.
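To get a feeling for why the extra material gets in the way, here is a minimal sketch that compares the word sequence of an ebook with the script of its narration. This is purely an illustration: Amazon’s actual matching criteria are not public, and the function names and the sample sentences are made up for this example.

```python
import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Lowercase the text and split it into bare words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def word_match_ratio(ebook_text: str, narration_script: str) -> float:
    """Return a similarity score between 0 and 1 for the two word sequences.

    1.0 means both texts contain exactly the same words in the same order.
    """
    matcher = SequenceMatcher(
        None, words(ebook_text), words(narration_script), autojunk=False
    )
    return matcher.ratio()


# Hypothetical sample data: the narration covers only the story,
# while the ebook also contains a vocabulary section.
story = "Dino kommt in Frankfurt an. Er sucht eine Wohnung."
ebook = story + " Vokabeln: ankommen, suchen, die Wohnung"

print(word_match_ratio(story, story))  # identical texts
print(word_match_ratio(ebook, story))  # lower, because of the vocabulary section
```

The more vocabulary lists and exercises a book contains, the further the score drops below a perfect match, which is presumably the kind of mismatch that keeps such titles out of the program.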
Next, I started looking into the open EPUB standard, which literally everyone (except Amazon!) uses, and lo and behold: EPUB3 supports a feature called “Media Overlays”. I built a quick prototype (with the help of Alberto Pettarin’s excellent guide), tested it in Adobe Digital Editions on Windows and it was working!
Unfortunately, when I put the resulting EPUB on my iPad and tried to open it with iBooks, the narration was *poof* gone, non-existent. After a bit of research I found out that Apple apparently only supports narration for “fixed layout” EPUBs, not reflowable texts. Why, Apple?! The reflowable nature of ebooks is what allows for font-resizing and other customizations, whereas fixed layouts are like glorified PDFs, i.e. the number of words and their format on each page are set in stone.
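For the curious: an EPUB3 Media Overlay is essentially a small SMIL file that pairs each text fragment (identified by its element id in the XHTML) with a clip of the audio file. The following is a minimal sketch of how such a file could be generated, not the exact workflow I used. The file names, fragment ids and timings are invented for illustration, and a complete EPUB additionally needs the overlay referenced from the package document.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_smil(text_file: str, audio_file: str, fragments: list) -> str:
    """Build a Media Overlay document as a string.

    fragments is a list of (element_id, start_seconds, end_seconds) tuples,
    one per sentence or phrase that should be highlighted while it is read.
    """
    smil = ET.Element("smil", {"xmlns": SMIL_NS, "version": "3.0"})
    body = ET.SubElement(smil, "body")
    for index, (element_id, start, end) in enumerate(fragments, start=1):
        # Each <par> plays one text fragment and one audio clip in parallel.
        par = ET.SubElement(body, "par", {"id": f"par{index}"})
        ET.SubElement(par, "text", {"src": f"{text_file}#{element_id}"})
        ET.SubElement(
            par,
            "audio",
            {
                "src": audio_file,
                "clipBegin": f"{start:.3f}s",
                "clipEnd": f"{end:.3f}s",
            },
        )
    return ET.tostring(smil, encoding="unicode")


# Hypothetical example: two sentences with ids "s1" and "s2" in chapter1.xhtml.
smil = build_smil(
    "chapter1.xhtml",
    "audio/chapter1.mp3",
    [("s1", 0.0, 3.2), ("s2", 3.2, 7.85)],
)
print(smil)
```

The tedious part is not writing this file but finding the start and end time of every sentence in the recording, which is why tools that align text and audio automatically are so helpful.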
Open Standards For An Open World
So two of the biggest ebook ecosystems on the planet, Kindle and iBooks, dropped out of the race before I had even really started. No biggie, right? But I liked the concept too much to just give up on it. I simply had to find a way to work around the giants and put this book into the hands of readers directly.
In other words, I had to come up with different user scenarios and find a solution which is the most comfortable for casual readers. So, Adobe Digital Editions works with EPUB3 overlays on PC, for example, but not on Android or iOS!
In the end I found an excellent app called Menestrello, developed by Alberto Pettarin, which runs perfectly on iOS and Android and thus covers most mobile devices (see updates below). On Chrome there’s the excellent Readium extension, which covers most laptops and desktops.
However, I still wanted something simpler, something that people could just use immediately, without downloading or installing anything. That’s when I found that the same people behind the Readium extension (update: discontinued) also made an open-source web app for embedding EPUBs directly in your browser.
So that’s what I settled on in the end:
- offering the EPUB as a direct download
- embedding it in a “cloud reader” for immediate enjoyment
It’s been more than 2 years since I wrote this blog post. Here’s what has changed since then:
- Menestrello for iOS is no longer available. (I first suggested Cloudshelf Reader as an alternative, but it has since been discontinued as well; see here for alternatives.)
- Menestrello is no longer available on Google Play either (but you can still get the APK from the developer)
- I’ve created two more TalkingBooks, for Ferien in Frankfurt and Karneval in Köln
Put simply, EPUB3 audio support has become even less commonplace in apps since 2017 (especially due to the loss of Menestrello). So while Amazon is continuing to loop more and more people into its proprietary WhisperSync for Voice program, the open source world in general and EPUB in particular are surprisingly lacking in innovation.
This seems like such a basic feature, and EPUB3 has been supporting it for years, at least technically. It’s easy to complain about Amazon, but why aren’t other platforms building open source alternatives? All the tools and foundations are there. Honestly, it’s mind-boggling. Personally, I just love this feature too much to give up on it. Especially for language learners this is so helpful.
Currently I’m looking into ways to develop my own simple Android and iOS apps based on Readium, but this is all going to take a lot more time. Fortunately Readium Web has been going strong ever since, and it just runs directly in the browser (for some reason on iOS it will not work in Chrome, only Safari).
For my latest TalkingBook I’ve finally used Tobi, “a free, open source, multimedia book production authoring tool”, to iron out some issues where phrases were chopped or not aligned perfectly. Tobi is really helpful for making quick adjustments to phrasings. And if you don’t have any audio narration yet for your text, you can record and edit directly within Tobi. It hasn’t replaced aeneas, the forced-alignment tool, for me, but it has become invaluable for improving the results aeneas provides.
It’s been another 2 years since the last update. I just released a new TalkingBook for Ahoi aus Hamburg and wanted to take a few minutes to talk about the state of EPUB3 audio overlays in the year 2022.
- For this edition I relied solely on Tobi, skipping aeneas entirely, which made development much speedier (and less technical).
- Since the demise of both Menestrello and Cloudshelf Reader, iOS support for these types of books seems to be getting slimmer and slimmer. The only two apps that I’ve found so far on iOS which support EPUB3 with audio overlays are Adobe Digital Editions (App Store link) and Kobo Books (App Store link). Kudos to Kobo for allowing sideloading EPUB3 with a full feature set.
- Thorium is probably the best way to read and enjoy TalkingBooks on desktop at this point. Cross-platform compatibility (Linux, Mac, Windows), a beautiful clean UI and open-source.
I’ve been using Thorium extensively in this development cycle for proofing, finally replacing the (eternally buggy) Adobe Digital Editions (which for whatever weird reason severely degrades audio quality). In a perfect world, Thorium Reader would also exist on Android and iOS. EDRLab, the developer behind it, does offer R2 Reader iOS, but for whatever reason it doesn’t seem to support audio. (Update: looks like it’s on the roadmap.) Also, an integrated lookup/dictionary function would be awesome, but it doesn’t seem to be a priority at this point.
My own attempts at creating an iOS/Android app around Readium have not yielded anything substantial yet, but I’ll keep on looking into it as time allows. If you’re an iOS and/or Android developer with EPUB3 experience and would like to collaborate, just let me know.