Converting documents from .pdf to .html

My books have been stored in Apple Pages for a few years now, converted from there to PDF and published to my site. I like the formatting and font choices of PDF, but PDFs don’t work well for me on mobile devices (they look good on laptop screens but not on phone screens). I am moving to markdown, saved as .txt, as the native format for most documents, and converting that to .html to publish to my site. A few documents are still stuck as PDFs generated from LaTeX, and a few are stuck as PDFs generated from Apple Pages (like TRANS!, whose formatting can’t be moved to HTML). Moving forward I will write in .txt (markdown), and eventually I plan to convert my handful of old Pages and LaTeX documents to markdown or whatever I’m using in the future. My goal is to not change my document format often (after Emacs and Google Docs, they’ve been in Pages for a handful of years now) and to not have to change it much in the future. I think markdown with a .txt extension will accomplish this.

At least one of my files (Brattleboro Stories) is too big for the conversion programs, so I put in a request that they support files of 400,000+ words. The conversion also changes a few things in the text: I am losing smart quotes, I am changing line-based section separators to three-dot separators, and I’m giving up double spaces between sentences. Those will all be single-space sentence separators for now.
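
For anyone doing a similar cleanup, here is a rough Python sketch of those three text changes. It is not the conversion program I use, just an illustration. The function name `normalize` is made up, and it assumes that a "line-based" separator is a line containing nothing but three or more dashes, underscores, asterisks, or equals signs, and that a three-dot separator is a line containing only `...`.

```python
import re

# Map curly quotes to their plain ASCII equivalents.
SMART_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
})

# Assumption: a line-based separator is a line made only of 3+ of - _ * =
RULE_LINE = re.compile(r"^[ \t]*[-_*=]{3,}[ \t]*$", re.MULTILINE)

# Two or more spaces following sentence-ending punctuation or a closing quote.
SENTENCE_GAP = re.compile(r"(?<=[.!?\"']) {2,}")


def normalize(text):
    """Apply the three cleanups: quotes, separators, sentence spacing."""
    text = text.translate(SMART_QUOTES)   # smart quotes -> straight quotes
    text = RULE_LINE.sub("...", text)     # rule lines -> three-dot separators
    text = SENTENCE_GAP.sub(" ", text)    # double spaces -> single space
    return text


if __name__ == "__main__":
    sample = "\u201cHello.\u201d  She left.  \u2018Bye!\u2019\n-----\nNext part.\n"
    print(normalize(sample))
```

One caveat with this sketch: a markdown heading underlined with `===` or `---` would also match the separator pattern, so a file that uses that heading style would need a stricter rule.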

All that should make my books more readable than they have been.
