I am considering writing an Android reader app that can open and display ePubs. I checked the ePub standard documents, but they contain a lot of information, so I am wondering what the process of implementing a file-format standard looks like. What are the steps to get a working implementation without skipping parts of the standard? Are there any best practices?
Also, is it even possible to program this alone in a reasonable time?
From what I have already found out, ePub is basically a zip archive. That means I could probably use zlib to decompress it. The content is in XHTML and CSS, so I believe it should be possible to display it in a WebView. What is still missing is the code that reads the metadata and handles the non-standard XHTML extensions.
Let me try to answer your questions one point at a time.
-
I’m not sure I understand your question (I’m not a native English speaker…). If you are asking how a new file format like ePub gets defined, then the answer is: gather a few seasoned specialists around a table, feed them hamburgers and problems, and wait for a document. If you are asking how a working ePub reader is developed to serve as a reference implementation, then the answer is: have the specialists write one alongside the reference document. Most of the time, this implementation actually arrives before the reference document. If you are asking how to implement your own reader, see below.
-
Yes, there are best practices for developing tools for XML/HTML and ePub. They are mostly the same as those used to develop any other kind of software. Any good book describing the best practices for your preferred language (Java on Android) can give you all the info you need. To see which ones are actually used in this specific field, you have to explore the world of XML and XML tools (parsers, converters and so on).
-
Yes, it is certainly possible to write an ePub reader by yourself in a reasonable amount of time. As you have already noticed, an ePub reader is essentially a program that decompresses a zip file into a temporary storage area (in memory or on disk) and shows the pages in a web browser component (WebView on Android). Note that the starting point is not a fixed index.html: the file META-INF/container.xml names the package (.opf) document, and the spine in that document lists the content files in reading order. Just do not try to (re)write the web browser yourself… Writing an ePub editor would be a completely different story.
-
Well, XHTML is actually an “application” of XML. There is nothing in XHTML that is “non-standard”, and there could not be: XML is extensible by definition and can accommodate any new tag or structure. Sure, part of an ePub is not XML at all (CSS and images, for example) and so does not follow the XML standard, but that does not make it non-standard. It is just another type of data. As a consequence, you can use any standard XML tool to deal with the XML/XHTML part of an ePub.
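To make the "unzip, then use standard XML tools" idea concrete, here is a minimal sketch in plain Java (no Android APIs). It assumes a typical ePub layout: META-INF/container.xml points at the package document, and the package document's manifest and spine give the reading order. The class and method names (EpubSpine, packagePath, readingOrder, readingOrderOf) are made up for this example; only java.util.zip and the JDK's DOM parser are real APIs. Treat it as a starting point rather than a complete or hardened reader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: finds the reading order of an ePub's content files.
final class EpubSpine {
    static final String CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container";
    static final String OPF_NS = "http://www.idpf.org/2007/opf";

    // Parses XML with a namespace-aware DOM parser; wraps parser errors.
    static Document parse(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Returns the archive path of the package document named in container.xml.
    static String packagePath(InputStream containerXml) {
        NodeList roots = parse(containerXml).getElementsByTagNameNS(CONTAINER_NS, "rootfile");
        if (roots.getLength() == 0) {
            throw new IllegalArgumentException("container.xml has no rootfile");
        }
        return ((Element) roots.item(0)).getAttribute("full-path");
    }

    // Returns the hrefs of the content files in spine (reading) order.
    // The hrefs are relative to the folder that holds the package document.
    static List<String> readingOrder(InputStream packageXml) {
        Document doc = parse(packageXml);
        Map<String, String> hrefById = new HashMap<>();
        NodeList items = doc.getElementsByTagNameNS(OPF_NS, "item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            hrefById.put(item.getAttribute("id"), item.getAttribute("href"));
        }
        List<String> order = new ArrayList<>();
        NodeList refs = doc.getElementsByTagNameNS(OPF_NS, "itemref");
        for (int i = 0; i < refs.getLength(); i++) {
            String href = hrefById.get(((Element) refs.item(i)).getAttribute("idref"));
            if (href != null) {
                order.add(href);
            }
        }
        return order;
    }

    // Opens the ePub as a zip archive and chains the two steps above.
    static List<String> readingOrderOf(String epubPath) throws IOException {
        try (ZipFile zip = new ZipFile(epubPath)) {
            ZipEntry container = zip.getEntry("META-INF/container.xml");
            if (container == null) {
                throw new IOException("META-INF/container.xml is missing");
            }
            String opfPath;
            try (InputStream in = zip.getInputStream(container)) {
                opfPath = packagePath(in);
            }
            ZipEntry opf = zip.getEntry(opfPath);
            if (opf == null) {
                throw new IOException("package document not found: " + opfPath);
            }
            try (InputStream in = zip.getInputStream(opf)) {
                return readingOrder(in);
            }
        }
    }
}
```

Calling EpubSpine.readingOrderOf with the path to a book should give you the list of XHTML files to hand to the WebView, one after another. Keeping the parsing methods on plain InputStreams means they can be tried out on small XML strings before any real archive is involved.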
By far, the best way to write your own ePub reader is:
-
Find a high-level description of the file format (even the ePub Wikipedia page should be enough).
-
Study some existing, open source reader’s code. Just Google around for “java or android open source epub reader”.
-
Try to reimplement the program’s features one at a time (and keep asking for help/info).
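As an example of what "one feature at a time" could look like, a small first feature is reading the book's title, author and language from the package (.opf) document. The sketch below is only an illustration: the class name EpubMetadata and its read method are invented for this answer, and it assumes the metadata uses the usual Dublin Core elements (dc:title, dc:creator, dc:language).

```java
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: extracts a few Dublin Core fields from a package document.
final class EpubMetadata {
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Returns the first title, creator and language found, keyed by element name.
    // Fields that are absent from the document are simply left out of the map.
    static Map<String, String> read(InputStream packageXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match the dc: namespace
            doc = factory.newDocumentBuilder().parse(packageXml);
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable package document", e);
        }
        Map<String, String> meta = new LinkedHashMap<>();
        for (String name : new String[] {"title", "creator", "language"}) {
            NodeList nodes = doc.getElementsByTagNameNS(DC_NS, name);
            if (nodes.getLength() > 0) {
                meta.put(name, nodes.item(0).getTextContent().trim());
            }
        }
        return meta;
    }
}
```

Once this works on a couple of real books, the next feature (a table of contents, for instance) can be added and tested the same way.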
Good luck.