This diagram shows the current changelog.com setup:
%% https://fontawesome.com/search
graph TD
classDef link stroke:#59b287,stroke-width:3px;
%% Code & assets
subgraph GitHub
repo{{ fab:fa-github thechangelog/changelog.com }}:::link
click repo "https://github.com/thechangelog/changelog.com"
cicd[/ fa:fa-circle-check GitHub Action - Ship It \]:::link
click cicd "https://github.com/thechangelog/changelog.com/actions/workflows/ship_it.yml"
automation[\ fab:fa-golang Dagger Go SDK /]:::link
click automation "https://github.com/thechangelog/changelog.com/blob/master/magefiles/magefiles.go"
registry(( fab:fa-github ghcr.io )):::link
click registry "https://github.com/orgs/thechangelog/packages"
chat(( fab:fa-slack Slack )):::link
click chat "https://changelog.slack.com/archives/C03SA8VE2"
repo -.-> |.github/workflows/ship_it.yml| cicd
cicd --> |magefiles/magefiles.go| automation
cicd --> |success #dev| chat
end
repo -.- |2022.fly| app
registry --> |ghcr.io/thechangelog/changelog-prod| app
container --> |flyctl deploy| app
repo -.- |fly.io/dagger-engine-2023-05-20| container
%% PaaS - https://fly.io/dashboard/changelog
subgraph Fly.io
proxy{fa:fa-globe Proxy}
proxy ==> |https| app
container([ fa:fa-project-diagram Dagger Engine 2023-05-20 ]):::link
click container "https://fly.io/apps/dagger-engine-2023-05-20"
app(( fab:fa-phoenix-framework App changelog-2022-03-13.fly.dev )):::link
style app fill:#488969;
click app "https://fly.io/apps/changelog-2022-03-13"
dbw([ fa:fa-database PostgreSQL Leader 2023-07-31 ]):::link
click dbw "https://fly.io/apps/changelog-postgres-2023-07-31"
dbr1([ fa:fa-database PostgreSQL Replica 2023-07-31 ])
app <==> |pgsql| dbw
dbw -.-> |replication| dbr1
automation --> |wireguard| container
container --> |ghcr.io/thechangelog/changelog-runtime| registry
container --> |ghcr.io/thechangelog/changelog-prod| registry
metricsdb([ fa:fa-chart-line Prometheus ])
metrics[ fa:fa-columns Grafana fly-metrics.net ]:::link
click metrics "https://fly-metrics.net"
metrics --- |promql| metricsdb
metricsdb -.- |metrics| app
metricsdb -.- |metrics| dbw
metricsdb -.- |metrics| container
end
%% Secrets
secrets(( fa:fa-key 1Password )):::link
click secrets "https://changelog.1password.com/"
secrets -.-> |secrets| app
secrets -.-> |secrets| repo
%% Search
search(( fa:fa-magnifying-glass Typesense ))
app -...-> |search| search
%% Exceptions
exceptions(( fa:fa-car-crash Sentry )):::link
click exceptions "https://sentry.io/organizations/changelog-media/issues/?project=5668962"
app -...-> |exceptions| exceptions
%% CDN - https://manage.fastly.com/configure/services/7gKbcKSKGDyqU7IuDr43eG
subgraph Fastly
apex[ changelog.com ]:::link
click apex "https://changelog.com"
subgraph Ashburn
cdn[ cdn.changelog.com ]
end
end
subgraph AWS.S3
logs[ fab:fa-aws changelog-logs ]
end
apex & cdn-.-> |logs| logs
%% Observability
observability(( fa:fa-bug Honeycomb )):::link
click observability "https://ui.honeycomb.io/changelog/datasets/changelog_opentelemetry/home"
app -....-> |traces| observability
logs -.-> |logs| observability
%% Object storage
apex ==> |https| proxy
subgraph Cloudflare.R2
assets[ fab:fa-cloudflare changelog-assets changelog.place ]
end
cdn ==> |https| assets
%% Monitoring
subgraph BetterStack
status[ fa:fa-layer-group status.changelog.com ]:::link
click status "https://status.changelog.com"
monitoring(( fa:fa-table-tennis Uptime )):::link
click monitoring "https://uptime.betterstack.com/team/133302/monitors"
monitoring -....-> |monitors| apex
monitoring -.-> |monitors| cdn
monitoring -.-> |monitors| proxy
monitoring -.-> |monitors| status
end
Let's dig into how all the above pieces fit together.
TL;DR:
- Front-end
  - Fastly
  - Fly.io Proxy
  - Cloudflare R2
- Application
  - Elixir / Phoenix
- Database
  - PostgreSQL
changelog.com is a monolithic Elixir application built with the Phoenix web framework. It uses PostgreSQL for persistence & Node.js to digest & compile static assets (CSS & JS).
Static assets, including all our mp3 episodes, are stored on Cloudflare R2. They are served via Fastly, specifically https://cdn.changelog.com. In summary:
Fastly (cdn.changelog.com)
↓
Cloudflare R2 (changelog.place)
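To see whether a given asset was answered from Fastly's cache or fetched from the origin, one option is to inspect the response headers. The sketch below assumes that the service leaves Fastly's default `X-Cache` header in place, which may not be the case for this configuration; `cache_status` is a hypothetical helper name, and `<ASSET_PATH>` is a placeholder for a real asset path.

```shell
# Read HTTP response headers on stdin and print the X-Cache value
# (typically HIT or MISS). Prints nothing if the header is absent.
cache_status() {
  # Strip carriage returns, then match the header name case-insensitively.
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print $2 }'
}

# Usage (placeholder path, not a real asset):
#   curl --silent --head "https://cdn.changelog.com/<ASSET_PATH>" | cache_status
```

Keeping the parsing in a small function means it can be fed saved headers as well as live `curl` output.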
The production instance of our application is running on Fly.io. All https://changelog.com requests are served via Fastly. Each Fastly request gets proxied to our application instance via the Fly.io Proxy. In summary:
Fastly (changelog.com)
↓
Fly.io Proxy
↓
Application (changelog-2022-03-13.fly.dev)
The production database - PostgreSQL - is running on Fly.io too. It is a replicated setup, with one leader & one replica. In summary:
Application (changelog-2022-03-13.fly.dev)
↓
PostgreSQL Leader
↓
PostgreSQL Replica
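To confirm which role a given database machine currently has, PostgreSQL's `pg_is_in_recovery()` function can be queried: it returns `t` on a replica and `f` on the leader. The following is a minimal sketch, assuming a shell on the database machine with `psql` available; `pg_role` is a hypothetical helper name.

```shell
# Map the output of `SELECT pg_is_in_recovery();` to a role name.
pg_role() {
  case "$1" in
    t) echo replica ;;  # in recovery: streaming from the leader
    f) echo leader ;;   # not in recovery: accepts writes
    *) echo unknown ;;
  esac
}

# Usage, from a shell on a database machine:
#   pg_role "$(psql --host localhost --username postgres \
#     --tuples-only --no-align --command 'SELECT pg_is_in_recovery();')"
```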
Each commit made against our primary branch gets deployed straight into production. The "Ship It!" GitHub Actions workflow is responsible for this. From the workflow jobs perspective, it is fairly standard:
- 1/2. CI/CD
  - Uses the Dagger Go SDK so that it works exactly the same locally as it does in GitHub Actions
  - Spins up a Dagger Engine as a Fly.io machine on demand, then connects to it so that caching is reliable & persistent between workflow runs
  - A successful run publishes a container image to https://ghcr.io/thechangelog/changelog-runtime & https://ghcr.io/thechangelog/changelog-prod
  - Deploys to Fly.io
- 2/2. Notify
  - Notifies the #dev channel in changelog.slack.com if CI/CD succeeds
  - Notifies
All our secrets are stored in 1Password, in the Shared Vault. Currently, they are manually declared in Fly.io via flyctl. They are pasted manually in GitHub Actions secrets.
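The manual steps could be approximated from a terminal as in the sketch below. This is an illustration, not the process actually used: it assumes the `op` (1Password), `flyctl` and `gh` CLIs are installed and signed in. `sync_secret`, the `DRY_RUN` switch, the secret name and the `op://` reference are all hypothetical.

```shell
# Copy one secret from 1Password into Fly.io and GitHub Actions.
# $1: secret name, $2: 1Password secret reference (op://vault/item/field)
sync_secret() {
  name="$1"
  ref="$2"
  # With DRY_RUN=1, only report what would happen; no CLI is invoked.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would set $name from $ref"
    return 0
  fi
  value="$(op read "$ref")" || return 1
  flyctl secrets set "$name=$value" --app changelog-2022-03-13
  gh secret set "$name" --body "$value" --repo thechangelog/changelog.com
}

# Usage (hypothetical name and reference):
#   DRY_RUN=1 sync_secret EXAMPLE_TOKEN "op://Shared/example/password"
```

Note that passing a secret as a command-line argument makes it visible to other processes on the same machine while the command runs, so this is only suitable on a trusted workstation.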
Since our application & database are running on Fly.io, we benefit from free infrastructure metrics: https://fly-metrics.net
All logs from Fastly are streamed into Honeycomb.io. This lets us ask questions we did not anticipate about how various HTTP clients interact with our content. It also helps us explore how Fastly interacts with Fly.io.
We also send app traces via OpenTelemetry to Honeycomb.io.
App errors - e.g. Plug.Conn.InvalidQueryError - show up in Sentry.io.
BetterStack.com monitors our public HTTPS endpoints & alerts us when they become unhealthy.
We use Typesense for search. It's near-instant & it just works.
The above is what we have so far. While we like to keep things simple, our setup is a constant work in progress. We keep making small improvements all the time, and we talk about them every 10 weeks in the context of our Ship It! Kaizen episodes. For example, this diagram and document were created in the context of 🎧 Kaizen 8: 24 improvements & a lot more. If you would prefer to stay in reading mode, check out GitHub discussion #433.
If anything on this page is missing, or could be clearer, please open an issue. Thank you very much!
- Provision a new PostgreSQL instance
flyctl postgres create \
--org changelog --region iad \
--name changelog-postgres-2023-07-31 \
--initial-cluster-size 2 \
--vm-size performance-2x \
--volume-size 10
- Connect to the newly created instance (we want to use the new pg_dump, with the latest improvements)
flyctl ssh console --app changelog-postgres-2023-07-31
- Create new db
createdb changelog --host localhost --username postgres
- Dump database to local file
pg_dump --host postgres-2022-03-12.internal --username postgres changelog > changelog.sql
- Restore database from local file
psql --host localhost --username postgres --single-transaction changelog < changelog.sql
psql --host localhost --command 'ANALYZE VERBOSE;' changelog postgres
Note: If a previous restore failed, run dropdb --force --host localhost --username postgres changelog, then createdb ... again.
- Configure app to use new PostgreSQL instance
flyctl secrets set DB_HOST=changelog-postgres-2023-07-31.flycast DB_PASS=<NEW_DB_PASSWORD> --app changelog-2022-03-13
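Before running the last step, it may be worth spot-checking that the restore is complete by comparing row counts between the old and new instances. This is a rough sketch rather than part of the documented procedure: `row_count` and `same_count` are hypothetical helpers, and the table name in the usage example is an assumption.

```shell
# Print the number of rows in a table of the changelog database.
# $1: database host, $2: table name
row_count() {
  psql --host "$1" --username postgres --tuples-only --no-align \
    --command "SELECT count(*) FROM $2;" changelog
}

# Compare two counts and print "match" or "mismatch".
same_count() {
  if [ "$1" = "$2" ]; then
    echo match
  else
    echo mismatch
  fi
}

# Usage, from the new instance (table name is hypothetical):
#   same_count "$(row_count postgres-2022-03-12.internal episodes)" \
#              "$(row_count localhost episodes)"
```

Counts will only agree if nothing wrote to the old database after the dump was taken, so a mismatch on a busy table is not necessarily a failed restore.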