It’s been a while since I’ve had a chance to work on Publ, but the great thing is that I actually had a reason to work on it for my day job. Which is to say I’m finally being paid to work on Publ. ;)
Changes since 0.3.14:
- Add requirement for Arrow 0.13.0 (issue 41)
- Fix a dumb tpyo that was the cause of issue 158
- Don’t rewrite DRAFT files; fixes 137
- Move sample-site files back to the library repo rather than in the doc repo
- Fix the way we map malformed category URLs (issue 156)
- Update upstream library versions
- Move version number to publ module
- Allow empty slug-text in entry route (fixes 161)
- Process HTML entries, to finally handle issues 136 and 154.
Some more information about that last one under the cut!
So, my day job is currently doing programming, IT, systems support, and general technology stuff for a research lab. This lab has a somewhat large website with a lot of specialized resources on it. This website used to be managed by Movable Type, but at some point it broke, and rather than fix MT, the site was just getting incrementally updated via hand-editing the HTML, which is… not particularly ideal.
Being a large Movable Type site, there’s a lot of HTML-specific hacks that go into making it work with MT – this was, of course, the exact reason I wrote Publ to begin with, to have something like Movable Type but more flexible and less fragile.
But being so large means that a lot of the purpose-specific HTML stuff is not particularly easy to migrate over to Markdown in one fell swoop. There’s a lot of special-case <div> tags as well as a lot of tables involved. Pandoc would not be able to deal with this en masse.
But also, because there’s a lot of stuff with image links and PDFs and so on, it seemed like using plain ol' HTML entries was not going to really cut it either.
So looking at my options, I saw two paths forward:
1. Do the same thing I did on beesbuzz.biz and write a bunch of special-purpose import scripts to try to massage things into working with Publ, or
2. Finally implement HTML entry processing to make links and images work right in an HTML context

Option 2 seemed like a lot less work, so that’s what I did.
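To give a rough idea of what "processing" an HTML entry means, here's a minimal sketch using only the Python standard library. This is not Publ's actual implementation; `LinkRewriter`, `rewrite_links`, and the `base_url` parameter are hypothetical names for illustration, and it only resolves relative `href` and `src` values against a base URL while passing the rest of the markup through.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Attributes that carry a URL, keyed by tag name (illustrative, not exhaustive).
URL_ATTRS = {"a": "href", "img": "src"}


class LinkRewriter(HTMLParser):
    """Hypothetical helper: re-emit HTML, rewriting only relative link/image URLs."""

    def __init__(self, base_url):
        # convert_charrefs=False keeps entities in text (e.g. &amp;) as written.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _rewrite(self, url):
        parts = urlsplit(url)
        # Leave absolute URLs, site-rooted paths, and bare fragments alone.
        if parts.scheme or parts.netloc or url.startswith(("/", "#")):
            return url
        return urljoin(self.base_url, url)

    def _emit_tag(self, tag, attrs, closer):
        pieces = [tag]
        for name, value in attrs:
            if value is None:
                pieces.append(name)  # attribute with no value
                continue
            if URL_ATTRS.get(tag) == name:
                value = self._rewrite(value)
            # The parser hands us unescaped values, so escape them again.
            pieces.append('{}="{}"'.format(name, escape(value, quote=True)))
        self.out.append("<" + " ".join(pieces) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&{};".format(name))

    def handle_charref(self, name):
        self.out.append("&#{};".format(name))

    def handle_comment(self, data):
        self.out.append("<!--{}-->".format(data))

    def handle_decl(self, decl):
        self.out.append("<!{}>".format(decl))


def rewrite_links(html_text, base_url):
    """Return html_text with relative href/src values resolved against base_url."""
    parser = LinkRewriter(base_url)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling `rewrite_links(entry_html, "https://example.org/lab/")` would leave `mailto:` links, absolute URLs, and `#fragment` links untouched and resolve everything else relative to the (made-up) base. A real implementation would presumably resolve links against the entry's own location and hand images to whatever rendering pipeline the site uses, rather than doing a plain URL join.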
Of course, I’m still going to need to do a lot of massaging to the various files (replacing template boilerplate with entry headers, fixing internal links, etc.), but that’s fairly easy to do with BeautifulSoup and the like. I also have a working-ish database dump of Movable Type from before it broke, so I might be able to simply re-export the site in a Publ entry format instead. I did some of that in the beesbuzz migration as well, but at the time I had basically no plain-HTML support, so it was only a partial solution. I think it would work a lot better now.
Anyway, my hope is that this will finally prove out Publ as a collaborative site-management platform, as there are others in the lab who will be managing site content. Hopefully this will also become a means for me to see what the weak spots are for semi-technical users (who are good at git but aren’t programmers).
There’s a few other “fun” things to worry about in this case because the Apache configuration is a bit of a mess and there are URL mappings to a bunch of different apps running in the same domain root. But I don’t think this will be a problem.