Running Bluesky PDS undockered
A bit over a year ago, in the first week of January 2025, I migrated my main Bluesky account to my own PDS on a Netcup VPS. It was quite easy to set up using the official installer, and it’s been running pretty much without any problems or maintenance the whole year.
Despite that, I haven’t been 100% happy with this setup for one reason: Docker. So I decided to take it out of the box: first I got it running on the same VPS, installed separately, and then this month I moved it to another machine with a clean install. This blog post is a guide to how I did it, if you’re interested. There are a few existing posts about this already:
- https://benharri.org/bluesky-pds-without-docker/
- https://char.lt/blog/2024/10/atproto-pds/
- https://cprimozic.net/notes/posts/notes-on-self-hosting-bluesky-pds-alongside-other-services/
But I figured it doesn’t hurt to make another one that does things slightly differently again. “There are many like this, but this one is mine”. (I mostly followed the benharri.org version.)
Note: I’m describing what I did to migrate an existing PDS from in-Docker to outside-Docker, so I already had existing data and pds.env config; if you wanted to install one from scratch this way, you’d probably need to also set up the config manually.
You might be asking: why? And that’s a good question. I mostly wouldn’t recommend this setup over the standard Docker one by default, unless you know what you’re doing. The standard installation is literally running one command and answering some questions, and then it auto-updates and manages everything.
My reason is that I’m generally pretty familiar with installing things on Linux servers manually, but I’m completely unfamiliar with Docker. I always wanted to do some modifications on the PDS, but I didn’t know how, because the Docker setup basically takes over the whole server for itself. I don’t know where it pulls code from, I don’t know where it puts it, and I don’t know when it can overwrite any changes I make. I don’t feel in control. (And to be clear, this is likely a me problem.)
So here’s what I did (this setup is for Ubuntu 24.04 Noble):
Install Nginx
The standard PDS distribution uses Caddy, but I use Nginx everywhere and I have configs built for it, so I’ve set up Nginx:
# install Nginx
sudo apt-get install --no-install-recommends nginx-light

# enable HTTP on the firewall
sudo ufw allow http/tcp
sudo ufw allow https/tcp

# if you haven't enabled ufw before:
sudo ufw limit log ssh/tcp
sudo ufw enable
Also here’s a standard thing I do on VPSes to let me install webapps in /var/www from my account:
# set up environment for webapps
sudo groupadd deploy
sudo adduser psionides deploy
sudo chown root:deploy /var/www
sudo chmod 775 /var/www
I also need Certbot for LetsEncrypt:
# install Certbot
sudo apt-get install --no-install-recommends certbot python3-certbot-nginx
sudo certbot plugins --nginx --prepare
certbot plugins --nginx --prepare does some initial setup of config files in /etc/letsencrypt that are unrelated to specific certificates.
One thing to note is that the standard setup with Caddy uses a .well-known route to verify any handles under *.yourdomain.com, and it automatically creates HTTPS certificates for those subdomains; this wouldn’t be as simple here, but I don’t need to be able to mass create new handles under my domain. If I ever need one or two, I’ll just set them up manually.
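For reference, a single extra handle doesn’t need any of that wildcard machinery: atproto resolves handles through either a DNS TXT record or an HTTPS well-known endpoint. Here’s a sketch of checking both, with a placeholder handle and DID:

# option 1: a TXT record at _atproto.<handle> containing the DID
dig +short TXT _atproto.alice.yourdomain.com
# expected output: "did=did:plc:abc123"

# option 2: the DID served as plain text over HTTPS
curl https://alice.yourdomain.com/.well-known/atproto-did
# expected output: did:plc:abc123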
We’ll also need email – you can use an external SMTP, but I like to use local sendmail that’s configured to forward emails to my main account:
# install Postfix
sudo apt-get install --no-install-recommends postfix
# choose: "Internet site"
# enter domain: "lab.martianbase.net"

# set up the email forwarding
sudo nano /etc/postfix/virtual
# add a line like this:
# kuba@lab.martianbase.net my.real.email@domain.com

sudo nano /etc/postfix/main.cf
# add:
# virtual_alias_domains = lab.martianbase.net
# virtual_alias_maps = hash:/etc/postfix/virtual

sudo postmap /etc/postfix/virtual
sudo service postfix reload

# enable SMTP on firewall
sudo ufw allow smtp/tcp
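Once Postfix is reloaded, a quick sanity check of the forwarding might look like this (the address is mine, substitute yours):

# send a test message through the local sendmail
printf "Subject: test\n\nhello from the PDS box\n" | sendmail kuba@lab.martianbase.net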
If you set things up this way, you need to set the PDS_EMAIL_SMTP_URL in pds.env to smtp:///?sendmail=true.
Note: you will probably need to set up a few things the right way in the DNS, and you might also need more tweaks in /etc/postfix/main.cf for the email sending and forwarding to work. I honestly don’t understand the Postfix config well enough; I just have a setup that works and I don’t touch it, but it’s old and I don’t know how correct it is, so I won’t share it here. From my experience, it’s generally not super hard to configure self-hosted email in such a way that it works for sending emails to you and only you (like I have with my PDS and my Mastodon instance). If a message goes to spam, you know where to look, and if you move it to the inbox, the email service should generally remember and whitelist the sender. (Sending emails to others is a whole different story, of course.)
NodeJS
Next, NodeJS. I used asdf to install it (version 0.15, because 0.16 was a complete rewrite in Go of the original Bash version, and some things no longer work the same way).
# install asdf
git clone https://github.com/asdf-vm/asdf.git ~/.asdf --branch v0.15.0
nano .bashrc
# add:
# source "$HOME/.asdf/asdf.sh"

# ---
# log out & back in here - this is also needed for the deploy group to take effect
# ---

# install node
asdf plugin add nodejs
NODE_VER=`asdf latest nodejs 22`
asdf install nodejs $NODE_VER
asdf global nodejs $NODE_VER
corepack enable
asdf reshim nodejs
The Docker version runs on 20.x, but that’s almost EOL, so I’ve upgraded to the latest 22.x. 24.x doesn’t work at the moment; I got some ugly errors during installation.
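A quick check that the right version is active:

node --version
# should print something like v22.x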
Install the PDS
Now the actual PDS code. I’ve decided to keep the code in /var/www, fetch it from git and use git pull for updates, and keep the data separately in /var/lib.
cd /var/www
git clone https://github.com/bluesky-social/pds
cd pds/service
pnpm install --production --frozen-lockfile
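With this layout, updating to a new PDS release should be roughly a pull and a reinstall – a sketch (the pds systemd service is set up later in this post):

cd /var/www/pds
git pull
cd service
pnpm install --production --frozen-lockfile
sudo systemctl restart pds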
Migrate the data
Now it’s time to copy over the data from the previous setup.
On the first server I did it like this (remember to also turn off and disable the old PDS service in Docker):
sudo rsync -rlpt /pds /var/lib/
sudo chown -R psionides:psionides /var/lib/pds
For the second one, I made a temporary SSH key on the old server:
ssh-keygen -t ed25519 -f ~/.ssh/migration -N "" -C "pds migration"
Added it to authorized keys on the new one:
nano ~/.ssh/authorized_keys
Prepared an empty directory:
sudo mkdir /var/lib/pds
sudo chown psionides:psionides /var/lib/pds
And rsynced the data from the old one to the new one:
# install if missing on the target server:
sudo apt-get install rsync

rsync -rltv -e "ssh -i ~/.ssh/migration" /var/lib/pds/ newserver:/var/lib/pds/
I tried to sync it from a local machine first, but I realized it would take much longer, and between two servers on the same network it took less than a minute for many GBs.
Start the service
With the data copied, it’s time to finish the installation. First, an HTTPS certificate – for now, I made one using a manual DNS challenge before switching the DNS records:
sudo certbot certonly --manual --preferred-challenges dns --key-type ecdsa \
-m $myemail --agree-tos --no-eff-email -d lab.martianbase.net
The way it works is that it gives me a kind of verification token, and I need to put it in a TXT DNS record at _acme-challenge.lab.martianbase.net before pressing continue.
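Before pressing continue, you can check that the record has actually propagated:

dig +short TXT _acme-challenge.lab.martianbase.net
# should print the token certbot asked for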
Then, I added a systemd service at /etc/systemd/system/pds.service:
[Unit]
Description=Bluesky PDS
After=network.target
[Service]
Type=simple
User=psionides
WorkingDirectory=/var/www/pds/service
ExecStart=/home/psionides/.asdf/shims/node --enable-source-maps index.js
EnvironmentFile=/var/lib/pds/pds.env
Environment="NODE_ENV=production"
TimeoutSec=15
Restart=on-failure
RestartSec=1
StandardOutput=append:/var/lib/pds/pds.log
[Install]
WantedBy=default.target
I also told the PDS what port to run on in pds.env:
PDS_PORT=3000
And then enabled it like this:
sudo systemctl daemon-reload
sudo systemctl enable --now pds
Test?
$ curl http://localhost:3000
__ __
/\ \__ /\ \__
__ \ \ ,_\ _____ _ __ ___\ \ ,_\ ___
/'__'\ \ \ \/ /\ '__'\/\''__\/ __'\ \ \/ / __'\
/\ \L\.\_\ \ \_\ \ \L\ \ \ \//\ \L\ \ \ \_/\ \L\ \
\ \__/.\_\\ \__\\ \ ,__/\ \_\\ \____/\ \__\ \____/
\/__/\/_/ \/__/ \ \ \/ \/_/ \/___/ \/__/\/___/
\ \_\
\/_/
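There’s also a machine-friendly health endpoint that returns the running version:

curl http://localhost:3000/xrpc/_health
# prints something like {"version":"0.4.x"}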
It’s working! Now, the Nginx config.
Nginx config for the PDS
In /etc/nginx/sites-available/pds.site:
# this is needed to proxy the relevant HTTP headers for websocket
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}
upstream pds {
server 127.0.0.1:3000 fail_timeout=0;
}
server {
server_name lab.martianbase.net;
listen 80;
listen [::]:80; # ipv6
# redirect any http requests to https
location / {
return 301 https://$host$request_uri;
}
# except for certbot challenges
location /.well-known/acme-challenge/ {
root /var/www/html;
}
}
server {
server_name lab.martianbase.net;
listen 443 ssl http2;
listen [::]:443 ssl http2;
ssl_certificate /etc/letsencrypt/live/lab.martianbase.net/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/lab.martianbase.net/privkey.pem;
access_log /var/log/nginx/pds-access.log combined buffer=16k flush=10s;
error_log /var/log/nginx/pds-error.log;
client_max_body_size 100M;
location / {
include sites-available/proxy.inc;
proxy_pass http://pds;
}
}
In /etc/nginx/sites-available/proxy.inc:
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Origin "";
proxy_set_header Proxy "";
proxy_buffering off;
proxy_redirect off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
tcp_nodelay on;
Then:
sudo ln -s /etc/nginx/sites-available/pds.site /etc/nginx/sites-enabled/
sudo nginx -t
sudo service nginx reload
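My DNS still pointed at the old server at this point; one way to test the new Nginx setup ahead of the switch is curl’s --resolve flag, which pins the hostname to an IP of your choosing (the IP below is a placeholder):

curl --resolve lab.martianbase.net:443:203.0.113.10 https://lab.martianbase.net/xrpc/_health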
DNS
At this point it was time to update the DNS and wait…
Once it started working for me, I still had to wait a bit more for the existing relays to notice the IP change, and I needed to poke them a few times to get them to reconnect, like this:
curl https://bsky.network/xrpc/com.atproto.sync.requestCrawl \
--json '{"hostname": "lab.martianbase.net"}'
Restarting the server with sudo service pds restart also sends a requestCrawl, at least to bsky.network. What you want to see in pds.log is a line saying request to com.atproto.sync.subscribeRepos after the PDS has started. There was a brief moment when I could post something but wouldn’t see my post after reloading bsky.app, because the AppView hadn’t indexed it yet… thankfully, once the relay finally reconnected, it backfilled the missing events.
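A quick way to watch for that, assuming the log path from the systemd unit above:

tail -f /var/lib/pds/pds.log | grep subscribeRepos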
I also updated the Certbot certificate again the normal way so I wouldn’t forget about it later:
sudo certbot certonly --webroot --webroot-path /var/www/html \
--key-type ecdsa -d lab.martianbase.net
(There’s also a more automated Nginx verification method, where it adds the required certificate lines to your Nginx configs automatically, but I don’t like it because it makes a terrible mess of the configs…)
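With the webroot method in place, the standard Certbot timer should be able to renew the certificate on its own; a dry run verifies that:

sudo certbot renew --dry-run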
Gatekeeper
There was one more thing I did while I was already messing with the PDS, which was to add Bailey Townsend’s Gatekeeper service, which restores support for email-based 2FA. For some reason, the email 2FA was implemented in the “entryway” PDS bsky.social and not in the main code, and as of today it still hasn’t been added to the self-hosted PDS distribution… so Bailey decided that “we can just do things” and went and added it himself. You just need to install it separately (it’s written in Rust).
First, we install the 🦀:
# install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --profile minimal
# (press enter)
source ~/.cargo/env
I used rustup, because Gatekeeper requires a fairly new version of Rust and the one I had in Ubuntu LTS repo didn’t cut it.
We also need some additional dependencies for building:
# some general compilers and stuff
sudo apt-get install --no-install-recommends build-essential
# openssl and pkg-config
sudo apt-get install --no-install-recommends libssl-dev pkg-config
Now, time to clone and build the service:
git clone https://tangled.org/baileytownsend.dev/pds-gatekeeper /var/www/gatekeeper
cd /var/www/gatekeeper/
cargo build --release
sudo cp target/release/pds_gatekeeper /usr/local/bin/
(It takes a bit of time to compile, go make yourself a coffee ☕️)
The current version is missing support for sendmail email transport, so I had to make some tweaks to make it work (see PR).
There are a couple of ENV entries we need to add to the pds.env:
PDS_BASE_URL=http://localhost:3000
GATEKEEPER_PORT=8000
And again, we need to write a systemd service at /etc/systemd/system/gatekeeper.service:
[Unit]
Description=Bluesky Gatekeeper
After=network.target
[Service]
Type=simple
User=psionides
ExecStart=/usr/local/bin/pds_gatekeeper
Environment="PDS_ENV_LOCATION=/var/lib/pds/pds.env"
TimeoutSec=15
Restart=on-failure
RestartSec=1
StandardOutput=append:/var/lib/pds/gatekeeper.log
[Install]
WantedBy=default.target
And launch it:
sudo systemctl daemon-reload
sudo systemctl enable --now gatekeeper
Finally, a few updates to the Nginx config:
upstream gatekeeper {
server 127.0.0.1:8000 fail_timeout=0;
}
server {
...
location /xrpc/com.atproto.server.createSession {
include sites-available/proxy.inc;
proxy_pass http://gatekeeper;
}
location /xrpc/com.atproto.server.getSession {
include sites-available/proxy.inc;
proxy_pass http://gatekeeper;
}
location /xrpc/com.atproto.server.updateEmail {
include sites-available/proxy.inc;
proxy_pass http://gatekeeper;
}
location /@atproto/oauth-provider/~api/sign-in {
include sites-available/proxy.inc;
proxy_pass http://gatekeeper;
}
}
And reload it again:
sudo nginx -t && sudo service nginx reload
At this point the PDS should be fully operational, including email 2FA when you try to log in 🎉
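As a smoke test, a login attempt through the API should now hit the gatekeeper first; if your account has email 2FA enabled, something like this should come back asking for an email token instead of returning a session (credentials are placeholders):

curl https://lab.martianbase.net/xrpc/com.atproto.server.createSession \
  --json '{"identifier": "your.handle", "password": "..."}'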
For creating accounts and other admin stuff, you can use the pdsadmin scripts from the source code dir, but you need to pass a PDS_ENV_FILE env var with the path to pds.env:
cd /var/www/pds
PDS_ENV_FILE=/var/lib/pds/pds.env bash pdsadmin/account.sh list
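For example, creating a new account should look something like this (email and handle are placeholders):

cd /var/www/pds
PDS_ENV_FILE=/var/lib/pds/pds.env bash pdsadmin/account.sh create alice@example.com alice.lab.martianbase.net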
Backups
There’s one last thing – it would also be nice to have backups of your nearly 3 years’ worth of Bluesky posts…
I do it like this:
# create a backup user
sudo adduser archivist

# add a separate SSH key from local Mac
sudo mkdir /home/archivist/.ssh
echo "ssh-ed25519 ... archivist@martianbase.net" | sudo tee /home/archivist/.ssh/authorized_keys
sudo chown -R archivist:archivist /home/archivist/.ssh
sudo nano /usr/local/sbin/backup_pds:
#!/bin/bash
set -e

service pds stop
rsync -rlpt --exclude="pds.log*" /var/lib/pds/ /var/backups/pds
chmod -R g+rX /var/backups/pds
chown -R root:archivist /var/backups/pds
service pds start
sudo chmod a+x /usr/local/sbin/backup_pds
sudo nano /etc/cron.d/backup_pds:
10 10 * * * root /usr/local/sbin/backup_pds
And then I have a cron job on the local Mac which runs this backup script:
#!/bin/bash
SSH_COMMAND="ssh -i ~/.ssh/archivist"
rsync -rltq -e "$SSH_COMMAND" "archivist@lab.martianbase.net:/var/backups/pds/" pds
This way, at one point during the day the PDS data is copied to another folder on the server – shutting the PDS down for a moment to make sure the .sqlite files don’t end up corrupted – and then at some point later, my Mac separately rsyncs that copy while the files aren’t being actively written to. This obviously doubles the data size requirements, so it wouldn’t be feasible for a larger PDS, but it’s totally fine on mine.
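For completeness, a restore would be roughly the reverse – a rough sketch (untested):

# on the server: stop the PDS first
sudo service pds stop
# on the Mac: copy the backup back
rsync -rlpt -e "ssh -i ~/.ssh/archivist" pds/ psionides@lab.martianbase.net:/var/lib/pds/
# on the server: fix ownership and start again
sudo chown -R psionides:psionides /var/lib/pds
sudo service pds start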
And that’s it. A bit more than curl | bash, but I now control every piece of it and I can change them as I please. The old-school way, as God intended 😎