S3 without the Cloud - blog engine update

I dockerized my blog. It only took three weeks.

published: August 25, 2023

I am slowly reworking my homelab setup (keep an eye out for more homelab-focused posts here on the blog, hopefully soon™), and part of that was transitioning some services out of a Proxmox container (a custom LXC-ish container) and onto "standard" Docker/OCI containers. Of course writing the Dockerfile and calling it a day isn't what took three weeks, but to understand what did, we have to go aaall the way baaack into the depths of the Kalahari desert...

Strap yourself in, Dorothy, we are going back to Africa

A large part of the current blog engine got hacked together on the back seat of a Land Cruiser somewhere in South Africa. The Africa blogpost was ambitious, with tons of photos, videos and animated "slice-of-life" media to accompany the written account of our adventures. It also came with specific requirements: the images and videos were compressed down locally using standard Linux tools and uploaded using limited mobile data connections while we were still on the road.

Once I had successfully foiled multiple attempts by Jan-Erik to feed me to the lions, we returned in one piece and the requirements changed. Some new questions came up:

How do I make the build environment (ImageMagick, ffmpeg, etc.) reproducible?
How do I store the originals safely, but ideally privately, while still deploying built artifacts publicly?

Docker would eventually answer the first, but what of the second?

Git LFS, as in "Light on Free Storage"

The first versions of the blog were hosted on now.sh, which pushed all built artifacts (just static HTML, some assets for the fonts, CSS, etc. and the images/media of course) to a web server. This worked quite well: the deployed bundle is less than a hundred megabytes thanks to being conservative with image resolutions and video encoding.

Now (aka Zeit, now Vercel) kept shifting around and at some point I moved on to deploying straight to GitHub Pages. Storing this many images (mostly static, that is, not changing too often) in Git wasn't a problem, and this was a static site after all.

Throughout all this the storage of the uncompressed original sources (some few hundred megabytes of media, mostly for the Africa blogpost) remained unsolved.

After moving to GitHub Pages I made a brief foray into Git LFS, transitioning the repo to use LFS to store media, but it quickly became apparent that I would exhaust the free tier (1GB of stored files) and would have to pay monthly for extra storage. This seemed like a silly thing to buy into an indefinite monthly subscription for, so the status quo remained.

That is, until one day...

Fantastic buckets and where to find them

I run my own Mastodon instance. Mastodon can store media locally, or on an "S3-compatible storage bucket". A couple of years into my self-hosting journey, I decided this was a sign to give Minio, an S3-compatible open source storage server, a go. Minio worked out great for Mastodon, and I had it chugging along quietly with plenty of storage to spare, and that gave me an idea: what if I stored all sources in S3?

Assets would be accessible everywhere, using an access token, so a deployment of the blog would just need to connect and download all assets, then generate the media, the static pages and push it to a web server or serve it straight up.

Screenshot of the "musings" bucket displayed in the Minio admin panel, 213 objects are using 332.8 MiB of storage

I ended up deciding to even serve the media from the S3 bucket, because why not (in theory the whole blog, being static html and all, can be served out of a bucket, if needed).

Compared to being locked into the git/GitHub and LFS ecosystem (e.g. Actions for builds), with the clunky tooling that comes with LFS and the free-tier limitations, I now have full control over the deployment and can even manage the files in S3 through the Minio UI or the local filesystem.

While there are tools to sync files with S3, Minio has a Node.js client library, so I made syncing an integral part of the build process in the latest engine update.
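To give an idea of what the "decide what to fetch" half of such a sync step could look like, here is a minimal sketch. It is not the engine's actual code: the `RemoteObject` shape, the `planSync` name and the size-only comparison are all assumptions made for illustration, and the real listing would come from the Minio client rather than a hard-coded array.

```typescript
// Hypothetical shapes for illustration, not taken from the blog engine.
interface RemoteObject {
  name: string; // object key inside the bucket
  size: number; // size in bytes as reported by the bucket listing
}

interface SyncPlan {
  download: string[]; // keys missing locally, or present with a different size
  remove: string[]; // local files that no longer exist in the bucket
}

// Compare the bucket listing with a local manifest (key -> size in bytes).
// Size-only comparison is a simplification; a checksum would be stricter.
function planSync(remote: RemoteObject[], local: Map<string, number>): SyncPlan {
  const remoteNames = new Set(remote.map((obj) => obj.name));

  const download = remote
    .filter((obj) => local.get(obj.name) !== obj.size)
    .map((obj) => obj.name)
    .sort();

  const remove = Array.from(local.keys())
    .filter((name) => !remoteNames.has(name))
    .sort();

  return { download, remove };
}

// Example with made-up file names and sizes.
const plan = planSync(
  [
    { name: "africa/lions.jpg", size: 2048 },
    { name: "africa/dunes.mp4", size: 9000 },
  ],
  new Map<string, number>([
    ["africa/lions.jpg", 2048],
    ["africa/dunes.mp4", 100],
    ["old/unused.png", 10],
  ]),
);
console.log(plan);
```

Keeping the planning step a pure function means it can be tested without a running bucket; the actual downloads and deletions would then just walk the two lists.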

Containerized builds, containerized deployment

No, this is not a Kubernetes post.

For now the deployment is pretty minimal: I have a VM dedicated to Docker deployments, which builds, syncs and deploys the blog.

Another cool thing with Minio is that it supports webhook events, so when code updates (in Git) or asset updates (in S3) happen, the deployment can receive webhooks and automatically regenerate the deployed site. Support for this is planned for a future update, as well as some quality-of-life enhancements like dedicated drafts and local watch-rebuild for writing new posts.
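Since none of this exists yet, the following is only a sketch of how the asset side of that webhook handling might decide whether a rebuild is needed. It assumes the payload follows the S3-style notification layout (a `Records` array with `eventName` and `s3.bucket.name`); the `BucketEvent` type and the `shouldRebuild` function are hypothetical names, and the exact fields should be checked against a real event before relying on them.

```typescript
// Assumed minimal subset of an S3-style bucket notification payload.
interface BucketEventRecord {
  eventName: string; // e.g. "s3:ObjectCreated:Put"
  s3: {
    bucket: { name: string };
    object: { key: string };
  };
}

interface BucketEvent {
  Records?: BucketEventRecord[];
}

// Returns true when at least one record reports an object being created or
// removed in the given bucket. Everything else (other buckets, other event
// types, empty payloads) is ignored.
function shouldRebuild(event: BucketEvent, bucket: string): boolean {
  return (event.Records ?? []).some(
    (record) =>
      record.s3.bucket.name === bucket &&
      (record.eventName.includes("ObjectCreated:") ||
        record.eventName.includes("ObjectRemoved:")),
  );
}

// Example payload with a made-up object key.
const uploaded: BucketEvent = {
  Records: [
    {
      eventName: "s3:ObjectCreated:Put",
      s3: { bucket: { name: "musings" }, object: { key: "africa/lions.jpg" } },
    },
  ],
};
console.log(shouldRebuild(uploaded, "musings"));
```

An HTTP endpoint in the deployment would parse the request body into this shape and kick off the usual build-sync-deploy cycle whenever the function returns true.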

Cover photo by Luke Southern on Unsplash.