This is the second post on Pelican. It’s not exactly in the order I’ve been doing things, but it’s easier than jumping back and forth.

Now that we have a generic, functional Pelican setup, it’s time to migrate the content from WordPress. The first step of converting the content is easy and automated, and it gets you about 95% of the way: it generates your Markdown or reStructuredText files for you. It won’t pull in pictures, and if you’re using any WordPress plugins to display pictures, it won’t do anything with those either. I’ll go over what I ended up doing.

Initial WP Conversion

We’ll be following the documentation on the Pelican Docs site. I decided that I wanted to use Markdown, because I’ve been using it for documentation in other places. With a small bit of trial and error, I arrived at the following pelican-import command:

pelican-import --wpfile -m markdown -o /path/to/output /path/to/wp-export.xml

This produced a dump of Markdown files in my output directory. Not every post converted smoothly. Some of them were drafts and got converted as empty documents. Others had odd names. Both were minor fixes.
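
To find the posts that need this kind of cleanup, a short script can flag them. This is a minimal sketch, not part of pelican-import: it assumes each converted file starts with `Key: value` metadata lines followed by a blank line and then the body, and that drafts carry a `Status: draft` line. The function names are hypothetical.

```python
from pathlib import Path


def split_post(text):
    """Split a Pelican Markdown post into (metadata dict, body string).

    Assumes metadata lines come first and the body starts after the
    first blank line.
    """
    header, _, body = text.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip().lower()] = value.strip()
    return meta, body.strip()


def find_suspect_posts(output_dir):
    """Return (filename, reason) pairs for posts that look empty or are drafts."""
    suspects = []
    for path in sorted(Path(output_dir).glob("*.md")):
        meta, body = split_post(path.read_text(encoding="utf-8"))
        if not body:
            suspects.append((path.name, "empty body"))
        elif meta.get("status") == "draft":
            suspects.append((path.name, "draft"))
    return suspects
```

Calling `find_suspect_posts("/path/to/output")` gives a list to work through by hand; it only reports, it does not modify or delete anything.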

Images

Posts that have regular embedded images get converted properly. You may not like the way it formats things, though: it creates one very long line of inline markup instead of generating a reference link. I found a collection of Markdown services for OS X, one of which is ‘Links - To References’. It iterates over the content of the document and converts the inline links to reference links.
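
If you are not on OS X, the same idea can be approximated in a few lines of Python. This is a rough sketch of the technique, not the code behind the ‘Links - To References’ service: it assumes simple `[text](url)` and `![alt](url)` forms without link titles or parentheses in the URL, and it does not skip code samples. The function name is hypothetical.

```python
import re

# Matches [text](url) and ![alt](url); titles and nested parentheses are not handled.
INLINE_LINK = re.compile(r"(!?\[[^\]]*\])\(([^)\s]+)\)")


def links_to_references(markdown):
    """Rewrite inline links as numbered reference links.

    Reference definitions are appended to the end of the document, and a
    URL that appears more than once reuses the same number.
    """
    refs = {}  # url -> reference number, in first-seen order

    def replace(match):
        label, url = match.group(1), match.group(2)
        number = refs.setdefault(url, len(refs) + 1)
        return f"{label}[{number}]"

    body = INLINE_LINK.sub(replace, markdown)
    if not refs:
        return markdown
    definitions = "\n".join(f"[{n}]: {url}" for url, n in refs.items())
    return f"{body.rstrip()}\n\n{definitions}\n"
```

Run it on a copy of a post first and compare the result, since a regular expression like this will also rewrite anything link-shaped inside code blocks.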

If you have some WordPress plugin that produces a gallery, you’ll have to convert that manually. I’m happy with the ‘photos’ plugin (available from GitHub as part of the pelican-plugins repo). If you want to display the images, the plugin already has integrated support for the Magnific Popup JavaScript library. The ‘photos’ plugin requires a couple of directives to be added to pelicanconf.py, and the images then need to be placed in the appropriate directory. The pelicanconf.py directives are:

PHOTO_LIBRARY = "~/Pictures"
PHOTO_GALLERY = (1024, 768, 80)
PHOTO_ARTICLE = (760, 506, 80)
PHOTO_THUMB = (192, 144, 60)

These are documented in the README for the plugin. Once you have that all set up, you can simply add the following metadata line to the upper portion of the Markdown file:

Gallery: {photo}/path/to/gallery

If you want to just inline an image, you can place this anywhere in the content of the document:

{photo}/path/to/image.png

Most of my posts use inline photos rather than galleries, mostly because a given graphic tends to have some amount of verbiage that goes with it, and it’s difficult to arrange things when it’s in a gallery.
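
To see which converted posts still need a gallery converted by hand, you can search for leftover shortcodes. This is a minimal sketch that only looks for WordPress’s built-in `[gallery ...]` shortcode; whether a shortcode survives the conversion verbatim, and which shortcode names your own gallery plugin used, are assumptions you will need to check. The function names are hypothetical.

```python
import re
from pathlib import Path

# WordPress's built-in gallery shortcode looks like [gallery ids="1,2,3"].
# Shortcodes from other gallery plugins would need their names added here.
SHORTCODE = re.compile(r"\[gallery\b[^\]]*\]")


def find_gallery_shortcodes(text):
    """Return every gallery shortcode found in a post's text."""
    return [match.group(0) for match in SHORTCODE.finditer(text)]


def posts_needing_galleries(content_dir):
    """Map each Markdown filename to the gallery shortcodes it still contains."""
    hits = {}
    for path in sorted(Path(content_dir).glob("*.md")):
        found = find_gallery_shortcodes(path.read_text(encoding="utf-8"))
        if found:
            hits[path.name] = found
    return hits
```

Each file this reports is a candidate for either a Gallery metadata line or a handful of inline {photo} references.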

Other plugins

Other plugins that I’ve found useful include:

  • assets - This plugin allows you to use the ‘Webassets’ module to manage assets such as CSS and JS files.
  • better_code_samples - This plugin wraps all table blocks with a class attribute .codehilitetable in an additional div of class .hilitewrapper. This makes code blocks easier to style, especially if you want to make them scrollable.
  • better_codeblock_line_numbering - Pelican uses Python-Markdown’s built-in code highlighting extension when processing Markdown. This extension, called CodeHilite, can add line numbers to any code that is enclosed in triple backticks (and, by default, has a shebang as a first line).
  • extract_toc - A Pelican plugin to extract table of contents (ToC) from article.content and place it in its own article.toc variable for use in templates.
  • gzip_cache - The gzip_cache plugin compresses all common text type files into a .gz file within the same directory as the original file.
  • liquid_tags - This plugin allows liquid-style tags to be inserted into Markdown within Pelican documents. Liquid uses tags bounded by {% ... %}, and is used to extend Markdown in other blogging platforms such as Octopress.
  • liquid_tags.graphviz - This is a sub-plugin that allows inline graphviz text to be used to generate dynamic graphics.
  • liquid_tags.b64img - This is needed because both the graphviz and blockdiag modules generate inline base64 images.
  • liquid_tags.diag - This is a sub-plugin that allows inline blockdiag text to be used to generate dynamic graphics.
  • neighbors - This plugin adds next_article (newer) and prev_article (older) variables to the article’s context.
  • optimize_images - This plugin applies lossless compression to JPEG and PNG images, with no effect on image quality. It uses jpegtran and OptiPNG, and assumes that both of these tools are installed on the system path.
  • pelican-open_graph - This plugin adds Open Graph Protocol tags to your articles.
  • pin_to_top - Pins one or more articles to the top of the article list as “sticky” articles. It is useful when you want to publish new articles while keeping one or more articles at the top of your articles list.
  • post_stats - A Pelican plugin to calculate various statistics about a post and store them in an article.stats dictionary.
  • render_math - This plugin gives Pelican the ability to render mathematics. It accomplishes this by using the MathJax JavaScript engine.
  • series - The series plugin allows you to join different posts into a series.
  • sitemap - The sitemap plugin generates plain-text or XML sitemaps. You can use the SITEMAP variable in your settings file to configure the behavior of the plugin.
  • sub_parts - Use sub-parts to break a very long article in parts, without polluting the timeline with lots of small articles. Sub-parts are removed from timelines and categories, but remain in tag and author pages.
  • summary - This plugin allows easy, variable length summaries directly embedded into the body of your articles.
  • tipue_search - A Pelican plugin to serialize generated HTML to JSON that can be used by the Tipue Search jQuery plugin.
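
Enabling plugins like these comes down to a few lines in pelicanconf.py. The fragment below is a sketch rather than my exact configuration: the `pelican-plugins` path is an assumption about where you cloned the repo, the plugin selection is just a sample from the list above, and the SITEMAP values are illustrative.

```python
# pelicanconf.py (fragment)

# Assumed location of the cloned pelican-plugins repo, relative to this file.
PLUGIN_PATHS = ["pelican-plugins"]

# A sample of the plugins listed above; add or remove names to taste.
PLUGINS = ["photos", "neighbors", "series", "sitemap"]

# Settings read by the sitemap plugin (illustrative values).
SITEMAP = {
    "format": "xml",
    "priorities": {"articles": 0.5, "indexes": 0.5, "pages": 0.5},
    "changefreqs": {"articles": "monthly", "indexes": "daily", "pages": "monthly"},
}
```

Some plugins need extra template changes or external tools before they do anything visible, so check each plugin’s README after adding it to PLUGINS.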

*[reStructured Text]: reStructuredText is a file format for textual data used primarily in the Python programming language community for technical documentation. - Wikipedia
*[Markdown]: Markdown is a lightweight markup language with plain text formatting syntax designed so that it can be converted to HTML and many other formats using a tool by the same name. Markdown is often used to format readme files, for writing messages in online discussion forums, and to create rich text using a plain text editor. - Wikipedia