Moving from WordPress to Hugo

grahams - 5 mins read

Back in the last millennium, my website was pretty basic, as most were. To be honest, I don’t even really remember what was there beyond some links to projects and other sites. In June 2000, I joined LiveJournal and began a decade or so of being clinically online and posting. In the back half of the ’00s, I started to get more of a life and posted less frequently. By 2010, after 10 years, I decided LJ was quickly becoming an abandoned alleyway on the internet and moved to self-hosted WordPress.

This seemed like a good idea at the time, and as I needed additional sites (the film club and radio show, for example), I continued to turn to WordPress as the tool of choice. But over the past few years, it has come to feel like a massive amount of software to support a handful of sites that get nearly no (legitimate) traffic and are updated, at most, once a week.

Replacing WordPress with a statically generated site had been on my radar for a while. The complexity of WordPress, combined with keeping it updated and maintaining a database, Redis, etc., was massive overkill with little upside. And since I was laid off a few weeks ago and have some newfound time on my hands (hire me!), it seemed like the time had finally arrived.

I decided on using Hugo to statically generate these sites. I can’t say that I engaged in a thorough amount of due diligence before choosing it, but it seemed popular and I’m not looking for something niche.

I poked around for migration tools and settled on wp2hugo. For several of my blogs, it just worked. The radio show blog, however, was using Mixcloud embeds, so I had to hack on wp2hugo to make it extract the Mixcloud links and insert a Hugo shortcode for them into the posts.
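A minimal sketch of that kind of rewrite, written standalone rather than as the actual wp2hugo change. It assumes the exported post HTML contains a Mixcloud widget iframe whose src carries the show path in a feed query parameter, and it emits a hypothetical shortcode named mixcloud; both the regular expression and the shortcode name are illustrative assumptions.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
	"regexp"
)

// mixcloudIframe matches an <iframe> whose src points at the Mixcloud widget
// and captures the src URL. The pattern is an assumption about what the
// exported post HTML looks like.
var mixcloudIframe = regexp.MustCompile(
	`(?is)<iframe[^>]*\ssrc="(https?://(?:www\.)?mixcloud\.com/widget/iframe/[^"]*)"[^>]*>\s*</iframe>`)

// replaceMixcloudEmbeds swaps each matching iframe for a hypothetical
// "mixcloud" shortcode that takes the show path as its only argument.
// Iframes that cannot be parsed are left untouched.
func replaceMixcloudEmbeds(content string) string {
	return mixcloudIframe.ReplaceAllStringFunc(content, func(m string) string {
		sub := mixcloudIframe.FindStringSubmatch(m)
		if sub == nil {
			return m
		}
		// Attribute values are HTML-escaped (&amp;), so unescape before parsing.
		src, err := url.Parse(html.UnescapeString(sub[1]))
		if err != nil {
			return m
		}
		feed := src.Query().Get("feed")
		if feed == "" {
			return m
		}
		return fmt.Sprintf("{{< mixcloud %q >}}", feed)
	})
}

func main() {
	post := `<p>Listen:</p><iframe width="100%" height="120" ` +
		`src="https://www.mixcloud.com/widget/iframe/?hide_cover=1&amp;feed=%2Fexampleuser%2Fexample-show%2F" ` +
		`frameborder="0"></iframe>`
	fmt.Println(replaceMixcloudEmbeds(post))
}
```

The Hugo side would still need a matching shortcode template in the site's layouts; that part is not shown here.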

Another of my blogs used the PowerPress podcast plugin for WordPress. This was trickier to figure out, since the set of people who use WordPress, use PowerPress, and want to migrate to Hugo can’t be that large. Unsurprisingly, wp2hugo didn’t support it at all, and I really dreaded hacking support into wp2hugo for something I would only need once. Did I mention I don’t really know Go at all?

I decided to be like all the cool kids and use Cursor to see if its coding agent could help me get closer to the finish line. I opened the wp2hugo repo in Cursor, and my first idea was to ask it “where is wordpress postmeta handled”. It “went away” for a while and came back with a nice report for me.

WordPress Postmeta Handling in wp2hugo

WordPress postmeta is primarily handled in the src/wp2hugo/internal/wpparser/wp_parser_setup.go file. Here’s how it works:

1. Main Postmeta Processing Functions

The codebase currently handles postmeta through two specific functions:

getFootnotes() function (lines 640-680):

func getFootnotes(item *rss.Item) []Footnote {
	if len(item.Extensions["wp"]["postmeta"]) == 0 {
		return nil
	}

[…]

I then fed it the WordPress XML for a single post, and it reported “What’s Missing for Your Use Case: Your podcast posts contain important metadata that would be valuable to preserve”. Correct, robot!

Looking at your WordPress export XML, I can see that your blog posts contain several postmeta entries that are currently not handled by wp2hugo. Let me analyze what’s in your export and what’s currently supported:

Your WordPress Postmeta Entries

Your export contains these postmeta keys:

  1. _g_feedback_shortcode_419e419ecf775a20ef20956a5c1fbb37cc16376a - Contact form shortcode
  2. _g_feedback_shortcode_atts_419e419ecf775a20ef20956a5c1fbb37cc16376a - Contact form attributes (PHP serialized array)

[…]

Recommendation

Since your posts are podcast episodes with important audio metadata, you might want to:

  1. Fork wp2hugo and add podcast metadata support
  2. Create a custom Hugo shortcode to handle podcast episodes
  3. Use Hugo’s data files to store podcast metadata separately
  4. Consider using a Hugo podcast theme that’s designed for this content type

The current wp2hugo implementation would miss all your podcast-specific metadata, which would be a significant loss for a podcast site.

[…]

I then asked “can you implement this?”. The agent went off for a while, making changes, and to my astonishment, when I came back from fixing a drink a few minutes later, it was done, and it actually worked! It kind of “made up” the output format, adding a bunch of front matter to the Hugo posts, but given that

  1. There doesn’t seem to be a “standard” for podcasts in Hugo
  2. I didn’t actually specify the output at all in my four-word request to the agent

I was definitely impressed. Every piece of metadata I needed was parsed and put into the Hugo markdown files; I just needed to refine the output formatting a bit. I did some of that the old-fashioned way and engaged the agent for the rest, but this was all home-stretch stuff, as I could look at the git diffs and quickly figure out what I needed to do to massage the output.
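To give a flavor of the postmeta-to-front-matter step, here is a minimal standalone sketch, not the code the agent produced. It assumes the episode audio details sit in a postmeta key named enclosure whose value is newline-separated (URL, size in bytes, MIME type, then optional extra data); check your own export before relying on that layout. The PostMeta and Enclosure types, the function names, and the podcast front matter key are all hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PostMeta is one key/value pair taken from a <wp:postmeta> element.
type PostMeta struct {
	Key   string
	Value string
}

// Enclosure holds the audio details for one episode.
type Enclosure struct {
	URL    string
	Length int64
	Type   string
}

// parseEnclosure assumes the value is newline-separated: URL, size in bytes,
// MIME type, then optional extra lines, which are ignored here.
func parseEnclosure(value string) (Enclosure, bool) {
	lines := strings.Split(strings.ReplaceAll(value, "\r\n", "\n"), "\n")
	if len(lines) < 3 {
		return Enclosure{}, false
	}
	size, err := strconv.ParseInt(strings.TrimSpace(lines[1]), 10, 64)
	if err != nil {
		return Enclosure{}, false
	}
	return Enclosure{
		URL:    strings.TrimSpace(lines[0]),
		Length: size,
		Type:   strings.TrimSpace(lines[2]),
	}, true
}

// podcastFrontMatter returns YAML lines for the first usable enclosure entry,
// or an empty string when the post has none. The "podcast" key is made up;
// %q yields double-quoted strings, which are valid YAML for plain ASCII values.
func podcastFrontMatter(meta []PostMeta) string {
	for _, m := range meta {
		if m.Key != "enclosure" {
			continue
		}
		enc, ok := parseEnclosure(m.Value)
		if !ok {
			continue
		}
		return fmt.Sprintf("podcast:\n  url: %q\n  length: %d\n  type: %q\n",
			enc.URL, enc.Length, enc.Type)
	}
	return ""
}

func main() {
	meta := []PostMeta{
		{Key: "_edit_last", Value: "1"},
		{Key: "enclosure", Value: "https://example.com/ep1.mp3\n12345\naudio/mpeg\n"},
	}
	fmt.Print(podcastFrontMatter(meta))
}
```

Since Hugo passes unknown front matter keys through to templates as page params, a theme or custom layout can pick these values up however it likes.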

I know that there remains all kinds of controversy around LLMs, but being able to solve this one-time need in less than an hour, without knowing much about wp2hugo or Go, was pretty nifty. There is not a doubt in my mind I could have solved this problem myself, but having a robot intern I could turn to do it for me was also real nice. And there is no way it would have taken only an hour if I had been reading the source code and the Go manual myself.

I do want to learn Go, as it seems like a nice, useful language that is turning up in more places, but I also wanted to solve this immediate problem without consuming several days in the endeavor.

Once all that was handled, I had to figure out how I wanted to deploy the sites. Since I already have my own Gitea server, I decided to host the repos for the blog sources there and use Gitea Actions to publish them. Gitea Actions are mostly compatible with GitHub Actions, so I was able to leverage my existing knowledge there…

I put this migration off for the better part of a year, and all it took was unemployment to knock it out in a few days…