I mentioned a while ago that I'd built a Media Diary; this article explains the process in more detail.
For a while now I've been scattered across a handful of apps. Letterboxd for films. BookHive for reading. Spotify for music. Bluesky for thoughts. Each one doing its job, but nothing connecting them. No single place to document how I consumed media and say — what was I into in March of 2024?
So, I built one.
The Idea
I wanted a simple site that would show everything in one place. Films I've watched with my ratings and reviews. Books I've read and what I thought of them. Music I've liked. My own Bluesky posts organized so I can actually browse them.
My approach: instead of setting up a database I would have to manage, I built it on top of the AT Protocol, the open standard that powers Bluesky. Records live in a repository tied to my Bluesky account, publicly readable by any app that knows where to look. It made sense given that half my data was already there.
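That "publicly readable" part is the whole trick. Any app can list records from a repository with one unauthenticated XRPC call. A minimal sketch in plain JavaScript — the collection NSID below is a made-up placeholder, not my actual lexicon:

```javascript
// Build the URL for AT Protocol's public listRecords endpoint.
// No auth token needed for public repositories.
function listRecordsUrl(pds, repo, collection, limit = 50) {
  const params = new URLSearchParams({ repo, collection, limit: String(limit) });
  return `${pds}/xrpc/com.atproto.repo.listRecords?${params}`;
}

// Fetch one page of records from a repository.
async function listRecords(pds, repo, collection) {
  const res = await fetch(listRecordsUrl(pds, repo, collection));
  if (!res.ok) throw new Error(`listRecords failed: ${res.status}`);
  const { records } = await res.json();
  return records; // each record: { uri, cid, value: {…} }
}
```

Swap in a real repo DID and collection name and this returns the raw record values the site renders.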
How the Data Gets There
Each data source needed its own pipeline to get records into my repository.
Films were the most work. I had 270 films logged on Letterboxd going back years, so I started with a CSV export and wrote a script to process them all at once, enriching each record with posters and metadata looked up from The Movie Database (more forgiving than IMDb). Now an automated workflow running in Pipedream watches my Letterboxd RSS feed. Every time I log a new film, the entry gets picked up automatically and a record is added to my repository. I don't have to think about it.
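The parsing half of that RSS workflow is small. Letterboxd's feed packs the title, year, and star rating into each item's title (at least it did when I built this), so the extraction step looks something like:

```javascript
// Parse a Letterboxd RSS item title of the form "Title, Year - ★★★½".
// The format is inferred from the feed itself, not from documentation.
function parseLetterboxdTitle(itemTitle) {
  const m = itemTitle.match(/^(.*), (\d{4})(?: - (★*)(½)?)?$/);
  if (!m) return null;
  const stars = (m[3] ? m[3].length : 0) + (m[4] ? 0.5 : 0);
  return { title: m[1], year: Number(m[2]), rating: stars || null };
}
```

From there the workflow looks the film up on The Movie Database and writes the record.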
Books were easy, mainly because the legwork was already done. I had previously exported my reading history from Goodreads as a CSV and imported it into BookHive, a Bluesky/AT Protocol-aware site, so by the time this project started the records were already sitting in my repository, just in a different location. All the site had to do was read them once it knew where to look. I did write a small script afterward to add published dates for each book, pulling from the Open Library API.
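That published-date backfill leaned on Open Library's public search API. Roughly like this — the endpoint and the first_publish_year field come from Open Library, while the helper names are mine:

```javascript
// Pull the first publication year out of an Open Library search
// response; null when nothing matched.
function firstPublishYear(searchResponse) {
  return searchResponse.docs?.[0]?.first_publish_year ?? null;
}

// Look up a book's first publication year by ISBN. No API key needed.
async function lookupFirstPublishYear(isbn) {
  const res = await fetch(`https://openlibrary.org/search.json?isbn=${isbn}`);
  if (!res.ok) return null;
  return firstPublishYear(await res.json());
}
```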
Music came from Spotify. I wrote a Pipedream workflow that authenticates with Spotify, parses my liked songs, and writes each one as a record. I tried getting genre and release year from Last.fm first, but the coverage was spotty. A second pass using Spotify's own artist API filled in most of the gaps.
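The Spotify pass pages through the liked-songs endpoint and maps each saved track to a record. The endpoint and the saved-track shape are Spotify's; the record field names are my own invention:

```javascript
// Map one Spotify saved-track item to a record value.
function toMusicRecord(item) {
  const t = item.track;
  return {
    title: t.name,
    artist: t.artists.map(a => a.name).join(', '),
    album: t.album.name,
    releaseYear: t.album.release_date ? Number(t.album.release_date.slice(0, 4)) : null,
    coverUrl: t.album.images?.[0]?.url ?? null,
    likedAt: item.added_at,
  };
}

// Page through the user's liked songs; Spotify paginates via a
// `next` URL that is null on the final page.
async function fetchLikedTracks(token) {
  const tracks = [];
  let url = 'https://api.spotify.com/v1/me/tracks?limit=50';
  while (url) {
    const res = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
    const page = await res.json();
    tracks.push(...page.items.map(toMusicRecord));
    url = page.next;
  }
  return tracks;
}
```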
Posts needed nothing at all. Bluesky posts are already AT Protocol records — same system, same structure. The site just reads them directly.
What Pipedream Does
If you're not familiar with it, Pipedream is a cloud automation tool that runs code in response to triggers. Think of it as a programmable IFTTT: it watches for an event, then takes whatever action you've written.
One workflow watches Letterboxd for new films and creates a record from each entry. Another handled the large one-time Spotify import. Others backfill missing metadata. They all run quietly, and I mostly forget they exist, which is exactly what I want.
The Site Itself
The whole frontend is a single HTML file. No framework. No build step. No server. It loads in the browser, makes a few API calls to fetch the records, and renders everything.
There are four tabs:
Films — poster grid, ratings, reviews, filters by year watched and rating tier
Books — covers, status badges (Finished / Reading / Want to Read / DNF), filters by published year
Music — album art, artist, genre, release year, link to Spotify
Posts — my Bluesky posts, defaulting to the 9 most recent, with an "On This Day" shortcut that shows what I was posting on this date from any year
Every filter you apply gets reflected in the URL, so you can bookmark or share any view. Something like ?tab=films&rating=bad&yearWatched=2025 gives you a direct link to my worst-rated films from last year.
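Keeping filters in the URL takes only a few lines with URLSearchParams. A sketch, with parameter names matching the example above — replaceState keeps every filter click out of the back-button history:

```javascript
// Serialize the active filters to a query string, skipping empties.
function filtersToQuery(filters) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    if (value != null && value !== '') params.set(key, String(value));
  }
  return params.toString();
}

// Reflect the current filters in the address bar without adding
// a history entry per click.
function applyFilters(filters) {
  const query = filtersToQuery(filters);
  history.replaceState(null, '', query ? `?${query}` : location.pathname);
}
```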
The "On This Day" feature for posts is probably my favorite part. It pulls up every post I made on today's date, going back as far as my Bluesky history goes. I have been missing this functionality since Timehop went away.
Hosting
The site is hosted on Netlify, whose free tier covers exactly what I need. Deploying an update means dragging a single file into their dashboard. That's it.
Because all the data comes from public AT Protocol APIs, I have nothing to maintain on the backend. The data updates whenever I log something new, and the site just reads it.
What I Plan To Do Next
A few things still on the to-do list:
Auto-sync new Spotify likes the same way Letterboxd syncs (Pipedream trigger on new liked songs)
Stats page — films per year, books per year, top genres
Notes field for music so I can add a few words about why I liked something
Where Claude AI Came In
Full disclosure: I did not write most of this code myself. The tools are open and the APIs are documented, but actually connecting them — writing the import scripts, structuring the Pipedream workflows, building the frontend from scratch — that was done in collaboration with Claude.
The way it worked: I would describe what I wanted, Claude would write the code, I would run it and report back what happened, and we would iterate from there step by step. Some things worked on the first try, but not always.
Technical Challenges
(Warning: heavy technical jargon from this point forward.)
The Bluesky posts tab took several attempts to get right. The first version tried to use the getAuthorFeed API, which requires authentication — so it silently returned nothing. Switching to listRecords (the same unauthenticated approach used for films and books) fixed the fetch, but then the date filtering was off because the code was comparing dates in local time while Bluesky stores everything in UTC. Posts from late evening would show up on the wrong day. After some back and forth, Claude caught the issue once I described what I was seeing.
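The fix was to do every date comparison in UTC. The simplest version of the idea (the helper name is mine):

```javascript
// A post made late evening UTC lands on the next calendar day in any
// timezone east of UTC, so local-time methods like getDate() can
// disagree with the stored timestamp. Deriving the day key from the
// ISO string keeps everything in UTC.
function utcDayKey(date) {
  return date.toISOString().slice(0, 10); // "YYYY-MM-DD", always UTC
}
```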
Genre data for music was a two-pass problem. The plan was to get it from Last.fm, which I had previously synced with my Spotify account. Unfortunately, Last.fm's data turned out to be inconsistent: several tracks came back with no data at all. Rather than leave the gaps, Claude wrote another enrichment script that used Spotify's own artist API to backfill any missing data. That script had to be careful not to overwrite records in my repository that already had good data, and to run in small batches to avoid hitting rate limits.
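Two of those guardrails are easy to sketch: skip records that already have a genre, and chunk the work into small batches. The record shape here is hypothetical:

```javascript
// Only records with no genre should be touched by the backfill,
// so a good Last.fm result is never overwritten.
function needsGenre(record) {
  return !record.value.genre;
}

// Split the work into fixed-size chunks; pause between chunks when
// calling a rate-limited API.
function batches(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}
```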
The Letterboxd CSV import required some detective work. Letterboxd's export format is not documented anywhere obvious, so Claude had to infer the field structure from the data itself and make some assumptions about edge cases — films with no rating, films with duplicate entries, reviews that contained special characters that would break the JSON.
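The trickiest edge case was quoting: a review containing a comma or a quote character breaks a naive split(','). The import ended up with a small field splitter along these lines — a sketch of the idea, not Letterboxd's documented format:

```javascript
// Split one CSV line into fields. Quoted fields may contain commas,
// and a doubled quote ("") inside a quoted field is a literal quote.
function splitCsvLine(line) {
  const fields = [];
  let field = '', inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"') {
        if (line[i + 1] === '"') { field += '"'; i++; } // escaped quote
        else inQuotes = false;                           // closing quote
      } else field += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ',') { fields.push(field); field = ''; }
    else field += ch;
  }
  fields.push(field);
  return fields;
}
```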
From my perspective, the main challenge was learning enough about each piece of the puzzle to describe problems clearly. When something was not working, I needed to be able to pull a console error or a sample API response. The more specific I could be, the faster things got resolved. Vague descriptions of broken behavior tended to produce fixes for the wrong problem.
This proved to be a different kind of project than writing everything myself. I understood every decision that was made, but I was not the one typing most of the code. That felt a little strange at first, but the end result is something I genuinely would not, and probably could not, have built on my own.
Honestly, the thing that surprised me most was how little infrastructure this needed. I kept waiting for the part where I had to set up a database or a backend API, and that never happened. The AT Protocol handles the storage. Pipedream handles the automation. Netlify hosts a single file. That's the full picture.
There is something about having my data in one place that makes me want to open it. The "On This Day" posts feature especially — I'll check it and find something I wrote two years ago that I'd completely forgotten about. It has become less of a dashboard and more of a scrapbook.
There are still rough edges. Some of my older music records are missing genre data. The book covers don't always load. A few Bluesky posts have broken image links. But for a side project that I built in pieces over a few weeks, it works well enough that I keep coming back to it — and that feels like the right solution for me.