Running Bluesky PDS undockered

A bit over a year ago, in the first week of January 2025, I migrated my main Bluesky account to my own PDS on a Netcup VPS. It’s been quite easy to set up using the official installer, and it’s been running pretty much without any problems or maintenance the whole year.

Despite that, I haven’t been 100% happy with this setup for one reason: Docker. So I decided to try to take it out of the box: I made it run first on the same VPS, installed separately, and then this month I moved it to another machine with a clean install. This blog post is a guide to how I did it, if you’re interested. There are a few existing posts about this already.

But I figured it doesn’t hurt to make another one that does things slightly differently again. “There are many like this, but this one is mine”. (I mostly followed the benharri.org version.)

Note: I’m describing what I did to migrate an existing PDS from in-Docker to outside-Docker, so I already had existing data and pds.env config; if you wanted to install one from scratch this way, you’d probably need to also set up the config manually.
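For reference, here’s roughly the kind of config the official installer generates in pds.env – the variable names come from the installer’s template, the values below are placeholders, and the paths match the layout used later in this post:

PDS_HOSTNAME=example.com
PDS_JWT_SECRET=<random hex, e.g. from: openssl rand -hex 16>
PDS_ADMIN_PASSWORD=<random password>
PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX=<secp256k1 private key in hex>
PDS_DATA_DIRECTORY=/var/lib/pds
PDS_BLOBSTORE_DISK_LOCATION=/var/lib/pds/blocks
PDS_DID_PLC_URL=https://plc.directory
PDS_BSKY_APP_VIEW_URL=https://api.bsky.app
PDS_BSKY_APP_VIEW_DID=did:web:api.bsky.app
PDS_CRAWLERS=https://bsky.network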

You might be asking: why? And that’s a good question. I mostly wouldn’t recommend this setup over the standard Docker one by default, unless you know what you’re doing. The standard installation is literally running one command and answering some questions, and then it auto-updates and manages everything.

My reason is that I’m generally pretty familiar with installing things on Linux servers manually, but I’m completely unfamiliar with Docker. I always wanted to do some modifications on the PDS, but I didn’t know how, because the Docker setup basically takes over the whole server for itself. I don’t know where it pulls code from, I don’t know where it puts it, and I don’t know when it can overwrite any changes I make. I don’t feel in control. (And to be clear, this is likely a me problem.)

So here’s what I did (this setup is for Ubuntu 24.04 Noble):


Install Nginx

The standard PDS distribution uses Caddy, but I use Nginx everywhere and I have configs built for it, so I’ve set up Nginx:

# install Nginx
sudo apt-get install --no-install-recommends nginx-light

# enable HTTP on the firewall
sudo ufw allow http/tcp
sudo ufw allow https/tcp

# if you haven't enabled ufw before:
sudo ufw limit log ssh/tcp
sudo ufw enable

Also here’s a standard thing I do on VPSes to let me install webapps in /var/www from my account:

# set up environment for webapps

sudo groupadd deploy
sudo adduser psionides deploy
sudo chown root:deploy /var/www
sudo chmod 775 /var/www

I also need Certbot for Let’s Encrypt:

# install Certbot

sudo apt-get install --no-install-recommends certbot python3-certbot-nginx
sudo certbot plugins --nginx --prepare

certbot plugins --nginx --prepare does the initial setup of some config files in /etc/letsencrypt that aren’t tied to any specific certificate.

One thing to note is that the standard setup with Caddy uses a .well-known route to verify any handles under *.yourdomain.com, and it automatically creates HTTPS certificates for those subdomains; this wouldn’t be as simple here, but I don’t need to be able to mass create new handles under my domain. If I ever need one or two, I’ll just set them up manually.
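For the record, verifying a single handle manually is just one static route that returns the account’s DID – a minimal sketch, with a made-up subdomain and a placeholder DID:

# hypothetical server block for a handle at alice.lab.martianbase.net
server {
  server_name alice.lab.martianbase.net;
  listen 443 ssl http2;
  # (ssl_certificate lines for this subdomain go here)

  location = /.well-known/atproto-did {
    default_type text/plain;
    return 200 "did:plc:replace-with-the-real-did";
  }
}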

Email

We’ll also need email – you can use an external SMTP, but I like to use local sendmail that’s configured to forward emails to my main account:

# install Postfix
sudo apt-get install --no-install-recommends postfix

  # choose: "Internet site"
  # enter domain: "lab.martianbase.net"

# set up the email forwarding
sudo nano /etc/postfix/virtual

  # add a line like this:
  # kuba@lab.martianbase.net my.real.email@domain.com

sudo nano /etc/postfix/main.cf

  # add:
  # virtual_alias_domains = lab.martianbase.net
  # virtual_alias_maps = hash:/etc/postfix/virtual

sudo postmap /etc/postfix/virtual
sudo service postfix reload

# enable SMTP on firewall
sudo ufw allow smtp/tcp

If you set things up this way, you need to set the PDS_EMAIL_SMTP_URL in pds.env to smtp:///?sendmail=true.
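For reference, the relevant pds.env entries then look like this (the from-address is just my example – PDS_EMAIL_FROM_ADDRESS is the sender address the PDS uses):

PDS_EMAIL_SMTP_URL=smtp:///?sendmail=true
PDS_EMAIL_FROM_ADDRESS=admin@lab.martianbase.net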

Note: you will probably need to set up a few things the right way in DNS, and you might also need more tweaks in /etc/postfix/main.cf for the email sending and forwarding to work. I honestly don’t understand the Postfix config well enough – I just have a setup that works and I don’t touch it, but it’s old and I don’t know how correct it is, so I won’t share it here. From my experience, it’s generally not super hard to configure self-hosted email in such a way that it works for sending emails to you and only to you (like I have with my PDS and my Mastodon instance). If it goes to spam, you know where to look, and if you move it to the inbox, the email service should generally remember and whitelist the sender. (Sending emails to others is a whole different story, of course.)
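A quick way to test the forwarding path is to push a message through the local Postfix directly (the recipient is the virtual alias from above; the log path may vary on your system):

# send a test message via local sendmail
printf "Subject: PDS mail test\n\nIt works.\n" | sendmail kuba@lab.martianbase.net

# if nothing arrives, check the mail log
sudo tail -n 50 /var/log/mail.log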

NodeJS

Next, NodeJS. I used asdf to install it (I pinned version 0.15 because 0.16 was a complete rewrite in Go of the original Bash version, and some things don’t work the same as before).

# install asdf
git clone https://github.com/asdf-vm/asdf.git ~/.asdf --branch v0.15.0
nano ~/.bashrc

  # add:
  # source "$HOME/.asdf/asdf.sh"

# ---
# log out & back in here - this is also needed for the deploy group to take effect
# ---

# install node
asdf plugin add nodejs

NODE_VER=`asdf latest nodejs 22`
asdf install nodejs $NODE_VER
asdf global nodejs $NODE_VER

corepack enable
asdf reshim nodejs

The Docker version runs on 20.x, but that’s almost EOL, so I’ve upgraded to the latest 22.x. 24.x doesn’t work at the moment – I get some ugly errors during installation.

Install the PDS

Now the actual PDS code. I’ve decided to keep the code in /var/www, fetch it from git and use git pull for updates, and keep the data separately in /var/lib.

cd /var/www
git clone https://github.com/bluesky-social/pds
cd pds/service

pnpm install --production --frozen-lockfile

Migrate the data

Now it’s time to copy over the data from the previous setup.

On the first server I did it like this (remember to also turn off and disable the old PDS service in Docker):

sudo rsync -rlpt /pds /var/lib/
sudo chown -R psionides:psionides /var/lib/pds

For the second one, I made a temporary SSH key on the old server:

ssh-keygen -t ed25519 -f ~/.ssh/migration -N "" -C "pds migration"

Added it to authorized keys on the new one:

nano ~/.ssh/authorized_keys

Prepared an empty directory:

sudo mkdir /var/lib/pds
sudo chown psionides:psionides /var/lib/pds

And rsynced the data from the old one to the new one:

# install if missing on the target server:
sudo apt-get install rsync

rsync -rltv -e "ssh -i ~/.ssh/migration" /var/lib/pds/ newserver:/var/lib/pds/

I tried to sync it from a local machine first, but I realized it would take much longer, and between two servers on the same network it took less than a minute for many GBs.
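As an optional sanity check after the copy, a checksum-based dry run should report no differing files:

# compare checksums without transferring anything
rsync -rlt --checksum --dry-run --itemize-changes -e "ssh -i ~/.ssh/migration" \
    /var/lib/pds/ newserver:/var/lib/pds/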

Start the service

With the data copied, it’s time to finish the installation. First, an HTTPS certificate – for now, I made one using a manual DNS challenge, before switching the DNS records:

sudo certbot certonly --manual --preferred-challenges dns --key-type ecdsa \
    -m $myemail --agree-tos --no-eff-email -d lab.martianbase.net

The way it works is that it gives me a kind of verification token, and I need to put it in a TXT DNS record at _acme-challenge.lab.martianbase.net before pressing continue.
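Before confirming in Certbot, it’s worth checking that the record is actually visible (dig is in the dnsutils package if you don’t have it):

dig +short TXT _acme-challenge.lab.martianbase.net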

Then, I added a systemd service at /etc/systemd/system/pds.service:

[Unit]
Description=Bluesky PDS
After=network.target

[Service]
Type=simple
User=psionides
WorkingDirectory=/var/www/pds/service
ExecStart=/home/psionides/.asdf/shims/node --enable-source-maps index.js
Restart=on-failure
EnvironmentFile=/var/lib/pds/pds.env
Environment="NODE_ENV=production"
TimeoutSec=15
RestartSec=1
StandardOutput=append:/var/lib/pds/pds.log

[Install]
WantedBy=default.target

I also told the PDS what port to run on in pds.env:

PDS_PORT=3000

And then enabled it like this:

sudo systemctl daemon-reload
sudo systemctl enable --now pds

Test?

$ curl http://localhost:3000

         __                         __
        /\ \__                     /\ \__
    __  \ \ ,_\  _____   _ __   ___\ \ ,_\   ___
  /'__'\ \ \ \/ /\ '__'\/\''__\/ __'\ \ \/  / __'\
 /\ \L\.\_\ \ \_\ \ \L\ \ \ \//\ \L\ \ \ \_/\ \L\ \
 \ \__/.\_\\ \__\\ \ ,__/\ \_\\ \____/\ \__\ \____/
  \/__/\/_/ \/__/ \ \ \/  \/_/ \/___/  \/__/\/___/
                   \ \_\
                    \/_/
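You can also hit the health endpoint, which should return the PDS version, something like:

curl http://localhost:3000/xrpc/_health

  # {"version":"0.4.x"}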

It’s working! Now, the Nginx config.

Nginx config for the PDS

In /etc/nginx/sites-available/pds.site:

# this is needed to proxy the relevant HTTP headers for websocket
map $http_upgrade $connection_upgrade {
  default upgrade;
  ''      close;
}

upstream pds {
  server 127.0.0.1:3000 fail_timeout=0;
}

server {
  server_name lab.martianbase.net;
  listen 80;
  listen [::]:80;  # ipv6

  # redirect any http requests to https
  location / {
    return 301 https://$host$request_uri;
  }

  # except for certbot challenges
  location /.well-known/acme-challenge/ {
    root /var/www/html;
  }
}

server {
  server_name lab.martianbase.net;
  listen 443 ssl http2;
  listen [::]:443 ssl http2;

  ssl_certificate /etc/letsencrypt/live/lab.martianbase.net/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/lab.martianbase.net/privkey.pem;

  access_log /var/log/nginx/pds-access.log combined buffer=16k flush=10s;
  error_log /var/log/nginx/pds-error.log;

  client_max_body_size 100M;

  location / {
    include sites-available/proxy.inc;
    proxy_pass http://pds;
  }
}

In /etc/nginx/sites-available/proxy.inc:

proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Origin "";
proxy_set_header Proxy "";

proxy_buffering off;
proxy_redirect off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;

tcp_nodelay on;

Then:

sudo ln -s /etc/nginx/sites-available/pds.site /etc/nginx/sites-enabled/
sudo nginx -t
sudo service nginx reload

DNS

At this point it was time to update the DNS and wait…

Once it started working for me, I also had to wait a bit more for the existing relays to notice the IP change and I needed to poke them a few times to reconnect to me again, like this:

curl https://bsky.network/xrpc/com.atproto.sync.requestCrawl \
  --json '{"hostname": "lab.martianbase.net"}'

Restarting the service with sudo service pds restart also sends a requestCrawl, at least to bsky.network. What you want to see in pds.log after the PDS has started is a line saying request to com.atproto.sync.subscribeRepos. There was a brief moment when I could post something, but after reloading bsky.app I wouldn’t see my post, because the AppView hadn’t indexed it yet… thankfully, once the relay finally reconnected, it backfilled the missing events.
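An easy way to watch for that line (the log path comes from the systemd unit above):

tail -f /var/lib/pds/pds.log | grep subscribeRepos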

I also re-issued the certificate the normal way, with webroot verification, so I wouldn’t forget about it later:

sudo certbot certonly --webroot --webroot-path /var/www/html \
    --key-type ecdsa -d lab.martianbase.net

(There’s also a more automated Nginx verification method, where it adds the required certificate lines to your Nginx configs automatically, but I don’t like it because it makes a terrible mess of the configs…)
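One small addition I’d suggest here (my own tweak, not part of the original setup): Certbot runs any script from its renewal-hooks/deploy directory after a successful renewal, so you can make Nginx pick up renewed certificates automatically:

sudo tee /etc/letsencrypt/renewal-hooks/deploy/reload-nginx > /dev/null <<'EOF'
#!/bin/sh
service nginx reload
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-nginx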

Gatekeeper

There was one more thing I did while I was already messing with the PDS, which was to add Bailey Townsend’s Gatekeeper service, which restores support for email-based 2FA. For some reason, the email 2FA was implemented in the “entryway” PDS bsky.social and not in the main code, and as of today it still hasn’t been added to the self-hosted PDS distribution… so Bailey decided that “we can just do things” and went and added it himself. You just need to install it separately (it’s written in Rust).

First, we install the 🦀:

# install Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --profile minimal
# (press enter)

source ~/.cargo/env

I used rustup, because Gatekeeper requires a fairly new version of Rust and the one I had in Ubuntu LTS repo didn’t cut it.

We also need some additional dependencies for building:

# some general compilers and stuff
sudo apt-get install --no-install-recommends build-essential

# openssl and pkg-config
sudo apt-get install --no-install-recommends libssl-dev pkg-config

Now, time to clone and build the service:

git clone https://tangled.org/baileytownsend.dev/pds-gatekeeper /var/www/gatekeeper
cd /var/www/gatekeeper/
cargo build --release
sudo cp target/release/pds_gatekeeper /usr/local/bin/

(It takes a bit of time to compile, go make yourself a coffee ☕️)

The current version is missing support for sendmail email transport, so I had to make some tweaks to make it work (see PR).

There are a couple of ENV entries we need to add to the pds.env:

PDS_BASE_URL=http://localhost:3000
GATEKEEPER_PORT=8000

And again, we need to write a systemd service at /etc/systemd/system/gatekeeper.service:

[Unit]
Description=Bluesky Gatekeeper
After=network.target

[Service]
Type=simple
User=psionides
ExecStart=/usr/local/bin/pds_gatekeeper
Restart=on-failure
Environment="PDS_ENV_LOCATION=/var/lib/pds/pds.env"
TimeoutSec=15
RestartSec=1
StandardOutput=append:/var/lib/pds/gatekeeper.log

[Install]
WantedBy=default.target

And launch it:

sudo systemctl daemon-reload
sudo systemctl enable --now gatekeeper

Finally, a few updates to the Nginx config:

upstream gatekeeper {
  server 127.0.0.1:8000 fail_timeout=0;
}

server {
  ...

  location /xrpc/com.atproto.server.createSession {
    include sites-available/proxy.inc;
    proxy_pass http://gatekeeper;
  }

  location /xrpc/com.atproto.server.getSession {
    include sites-available/proxy.inc;
    proxy_pass http://gatekeeper;
  }

  location /xrpc/com.atproto.server.updateEmail {
    include sites-available/proxy.inc;
    proxy_pass http://gatekeeper;
  }

  location /@atproto/oauth-provider/~api/sign-in {
    include sites-available/proxy.inc;
    proxy_pass http://gatekeeper;
  }
}

And reload it again:

sudo nginx -t && sudo service nginx reload

At this point the PDS should be fully operational, including email 2FA when you try to log in 🎉

For creating accounts and other admin stuff, you can use the pdsadmin scripts from the source code dir, but you need to pass a PDS_ENV_FILE env var with the path to pds.env:

cd /var/www/pds
PDS_ENV_FILE=/var/lib/pds/pds.env bash pdsadmin/account.sh list
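If you use these often, a tiny wrapper function saves some typing – my own convenience sketch, e.g. for ~/.bashrc:

# wrapper that fills in the env var and the script path
pdsadmin() {
  (cd /var/www/pds && PDS_ENV_FILE=/var/lib/pds/pds.env bash "pdsadmin/$1.sh" "${@:2}")
}

# usage:
pdsadmin account list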

Backups

There’s one last thing – it would also be nice to have backups of your nearly 3 years’ worth of Bluesky posts…

I do it like this:

# create a backup user
sudo adduser archivist

# add a separate SSH key from local Mac
sudo mkdir /home/archivist/.ssh
echo "ssh-ed25519 ... archivist@martianbase.net" | sudo tee /home/archivist/.ssh/authorized_keys
sudo chown -R archivist:archivist /home/archivist/.ssh

sudo nano /usr/local/sbin/backup_pds:

#!/bin/bash

set -e

service pds stop
rsync -rlpt --exclude="pds.log*" /var/lib/pds/ /var/backups/pds
chmod -R g+rX /var/backups/pds
chown -R root:archivist /var/backups/pds
service pds start

sudo chmod a+x /usr/local/sbin/backup_pds

sudo nano /etc/cron.d/backup_pds:

10 10 * * *   root   /usr/local/sbin/backup_pds

And then I have a cron job on the local Mac which runs this backup script:

#!/bin/bash

SSH_COMMAND="ssh -i ~/.ssh/archivist"

rsync -rltq -e "$SSH_COMMAND" "archivist@lab.martianbase.net:/var/backups/pds/" pds

This way, at one point during the day, the PDS data is copied to another folder on the server (shutting down the PDS for a moment to make sure the .sqlite files don’t end up corrupted), and then at some point later, my Mac separately rsyncs that copy of the data from the second folder, while the files aren’t being actively written to. This obviously doubles the data size requirements, so it wouldn’t be feasible for a larger PDS, but it’s totally fine on mine.
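For completeness, restoring from the on-server copy would be roughly the reverse – an untested sketch, using the same commands as the backup script:

sudo service pds stop
sudo rsync -rlpt /var/backups/pds/ /var/lib/pds/
sudo chown -R psionides:psionides /var/lib/pds
sudo service pds start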

And that’s it. A bit more than curl | bash, but I now control every piece of it and can change things as I please. The old-school way, as God intended 😎