|
The process of converting these texts to HTML can be an arduous one. Even if we already
have a copy of a novel in plain text form, it is not merely a matter of adding some
<P> tags to it.
Depending on the complexity, quality and length of the text, it can currently take as much as a
week to convert a single novel. Adding paragraph tags is just one of the many processes a
single text goes through before it is ready for display.
Some of these are:
- Adding blockquotes
- Adding non-breaking spaces between sentences
- Removing surplus space characters
- Reconstructing words split across lines
- Adding emphasis
- Converting special characters
- Other formatting
- Splitting into chapters
- Creating a table of contents and links
- Proofreading
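As a rough illustration of how a few of the more mechanical tasks in this list might be automated, here is a minimal Python sketch. It covers only three of them: removing surplus spaces, reconstructing words split across lines, and converting special characters. The function name clean_text and the exact rules are assumptions made for the example, not a description of the process actually used.

```python
import html
import re


def clean_text(raw: str) -> str:
    """Apply a few mechanical clean-up steps to plain text (illustrative only)."""
    # Remove spaces and tabs left dangling at the end of each line.
    text = re.sub(r"[ \t]+\n", "\n", raw)
    # Rejoin a word hyphenated across a line break: "jum-\nped" -> "jumped".
    # Assumption: every such hyphen is a line-break hyphen, which is not
    # always true ("well-\nknown" would wrongly become "wellknown").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Convert the characters that are special in HTML: &, < and >.
    return html.escape(text, quote=False)


if __name__ == "__main__":
    print(clean_text("The  quick fox jum-\nped over   Smith & Co.  \n"))
```

Even these simple rules need a human check afterwards: the hyphen rule in particular cannot tell a line-break hyphen from a genuine hyphenated word.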
We are looking into ways to automate some of the more repetitive tasks, but this is complicated
by the many forms that plain text files, and indeed the original novels, can take.
Emphasis, for example, can be marked with CAPITALS, *asterisks* or _underscores_
within the running text. The convention is not always consistent, even between texts from
the same source or within a single document.
Checking against a printed edition of the book is advisable, but not always possible.
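The emphasis conventions described above could be handled along the following lines. This is a sketch under stated assumptions: the function names mark_emphasis and find_capitals are hypothetical, the <EM> tag is our choice for the example, and emphasis is assumed never to span a line break. Capitalised words are only listed for review rather than converted, since capitals may equally be an acronym or a heading.

```python
import re


def mark_emphasis(text: str) -> str:
    """Wrap *asterisk* and _underscore_ emphasis in <EM> tags (illustrative)."""
    # *phrase* -> <EM>phrase</EM>; the phrase must not start or end with a space.
    text = re.sub(r"\*([^\s*](?:[^*\n]*[^\s*])?)\*", r"<EM>\1</EM>", text)
    # _phrase_ -> <EM>phrase</EM>, only when the underscores sit at word
    # boundaries, so that names such as file_name_here are left alone.
    text = re.sub(r"(?<!\w)_([^\s_](?:[^_\n]*[^\s_])?)_(?!\w)", r"<EM>\1</EM>", text)
    return text


def find_capitals(text: str) -> list[str]:
    """List words written entirely in capitals so a person can review them."""
    return re.findall(r"\b[A-Z]{2,}\b", text)


if __name__ == "__main__":
    print(mark_emphasis("She was *very* cross, and _quite_ tired."))
    print(find_capitals("I said NO to the BBC."))
```

The second call shows why capitals cannot safely be converted automatically: NO is probably emphasis, while BBC is not.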
Even formatting the text into paragraphs is not without difficulty. For example, a
block of text that forms a paragraph would usually have <P> at the start and
</P> at the end. If, however, it were a verse from a song or poem, each line would
also need a <BR> tag at its end.
Some novels include verses between paragraphs, and telling one from the other needs more
intelligence than the average text-to-HTML utility has.
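To make the difficulty concrete, here is a deliberately crude Python sketch of the paragraph and verse markup described above. It assumes blocks are separated by blank lines and guesses that a block is verse when it has more than one line and every line is short. The 50-character threshold and the function names are assumptions for the example, and the guess will be wrong for some texts, such as a very short prose paragraph wrapped over two lines.

```python
import re

# Assumed cut-off: lines this short throughout a block suggest verse.
VERSE_MAX_LINE = 50


def block_to_html(block: str) -> str:
    """Mark up one blank-line-separated block as a paragraph or a verse."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    looks_like_verse = len(lines) > 1 and all(
        len(line) <= VERSE_MAX_LINE for line in lines
    )
    if looks_like_verse:
        # Verse: keep the line breaks and end every line with <BR>.
        return "<P>" + "\n".join(line + "<BR>" for line in lines) + "</P>"
    # Prose: the line breaks are only wrapping, so join with spaces.
    return "<P>" + " ".join(lines) + "</P>"


def text_to_html(text: str) -> str:
    """Split text on blank lines and mark up each block in turn."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return "\n".join(block_to_html(block) for block in blocks if block.strip())


if __name__ == "__main__":
    print(text_to_html("Roses are red,\nViolets are blue.\n\nShort."))
```

A heuristic like this only narrows down the blocks that need checking by eye; it does not replace the check.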
Why are we telling you all this? Well, there are several reasons:
- So that you realise we won't be producing texts very quickly at first
- So that you realise how much effort is involved
- In the hope that you will credit us if using our texts elsewhere
- In case you know of any shortcuts
|