Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Federation outbound proxy #15773

Merged
merged 47 commits into from
Jul 5, 2023
Merged

Federation outbound proxy #15773

merged 47 commits into from
Jul 5, 2023

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Jun 13, 2023

Federation outbound proxy. Allow configuring the set of workers to proxy outbound federation traffic through (outbound_federation_restricted_to).

This is useful when you have a worker setup with federation_sender instances responsible for sending outbound federation requests and want to make sure all outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the federation_sender's to contact the outside world.

The original code is from @erikjohnston's branches which I've gotten in-shape to merge.

Switched from matrix:// to matrix-federation:// because matrix:// is actually a registered scheme nowadays and doesn't make sense for our internal to Synapse use case anymore. (discussion)

Testing strategy

Testing with Complement

  1. Add some logging to MatrixFederationAgent.request(...) and ProxyAgent.request(...) to verify requests flying around
    logger.info("MatrixFederationAgent.request(uri=%s)", uri)
    ...
    logger.info("MatrixFederationAgent.request -> uri=%s", uri)
    
    ...
    
    logger.info("ProxyAgent.request(uri=%s)", uri)
    logger.info(
        "ProxyAgent.request -> endpoint=%s parsed_uri=%s request_path=%s",
        endpoint,
        parsed_uri,
        request_path,
    )
  2. Modify docker/configure_workers_and_start.py#L399-L400 so it looks like the following:
        if "federation_sender" in worker_types_set:
            shared_config.setdefault("federation_sender_instances", []).append(worker_name)
    
            # Restrict outbound federation traffic only to the federation_sender workers
            shared_config.setdefault("outbound_federation_restricted_to", []).append(
                worker_name
            )
            instance_map[worker_name] = {
                "host": "localhost",
                "port": worker_port,
            }
  3. Run a Complement test that utilizes outbound federation requests:
    WORKERS=1 WORKER_TYPES=federation_sender COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1 COMPLEMENT_DIR=../complement ./scripts-dev/complement.sh -run TestOutboundFederationProfile -p 1
  4. Check the logs to verify that MatrixFederationAgent.request(...) only happens on the federation_sender

Testing manually

  1. Setup a Redis instance for inter-worker communication: docker run -d -p 6379:6379 --name redis redis

  2. To see if Redis is working, run exec 3<>/dev/tcp/127.0.0.1/6379 && echo -e "PING\r\n" >&3 && head -c 7 <&3 (via https://stackoverflow.com/a/39214806/796832) and you should see +PONG

  3. Modify your homeserver.yaml as instructed on https://matrix-org.github.io/synapse/latest/workers.html#shared-configuration

    # extend the existing `listeners` section. This defines the ports that the
    # main process will listen on.
    listeners:
      # The HTTP replication port
      - port: 9093
        bind_address: '127.0.0.1'
        type: http
        resources:
         - names: [replication]
    
    # Add a random shared secret to authenticate traffic.
    worker_replication_secret: ""
    
    redis:
        enabled: true
    
    instance_map:
        main:
            host: 'localhost'
            port: 9093
        generic_worker1:
            host: 'localhost'
            port: 9094 # Seems to work if you use the replication (9094) or HTTP (8084) port
  4. Additionally add the following to your homeserver.yaml

    outbound_federation_restricted_to:
     - generic_worker1 # Make sure this worker is also listed in your `instance_map` as shown above
    
    # Allow talking directly to localhost
    ip_range_blacklist: []
  5. Create a new homeserver_generic_worker1.yaml

    worker_app: synapse.app.generic_worker
    worker_name: generic_worker1
    
    worker_listeners:
      - type: http
        port: 8084
        x_forwarded: true
        resources:
          - names: [client, federation]
      - port: 9094
        bind_address: '127.0.0.1'
        type: http
        resources:
         - names: [replication]
    
    worker_log_config: "/home/eric/Documents/github/element/synapse/my.synapse.linux.server_generic_worker1.log.config"
  6. Copy your existing logging config cp my.synapse.linux.server.config my.synapse.linux.server_generic_worker1.log.config and adjust filename log path so they don't conflict

  7. Add some logging to MatrixFederationAgent.request(...) and ProxyAgent.request(...) to verify requests flying around

    logger.info("MatrixFederationAgent.request(uri=%s)", uri)
    ...
    logger.info("MatrixFederationAgent.request -> uri=%s", uri)
    
    ...
    
    logger.info("ProxyAgent.request(uri=%s)", uri)
    logger.info(
        "ProxyAgent.request -> endpoint=%s parsed_uri=%s request_path=%s",
        endpoint,
        parsed_uri,
        request_path,
    )
  8. Modify MatrixFederationAgent.request(...) to always override method and uri with our own dummy request (add the following at the top of the function):

    method = b"GET"
    uri = b"matrix-federation://localhost:4444/foo.json"
  9. Setup a dummy server that responds to https://localhost:4444/foo.json (note that it needs to support HTTPS/TLS):

    node server.js
    // Setup:
    // 1. npm install express
    // 2. Setup the self-signed certificate (see below)
    
    const assert = require('assert');
    const path = require('path');
    const fs = require('fs');
    
    const express = require('express');
    const https = require('https');
    
    // 1. Generate a self-signed certificate:
    // openssl req -x509 -sha256 -nodes -newkey rsa:2048 -days 365 -subj '/CN=localhost' -keyout localhost.key -out localhost.crt
    //
    // via https://stackoverflow.com/a/32169444/796832 and https://stackoverflow.com/a/52007971/796832
    //
    // 2. Trust the certificate:
    // For Manjaro Linux:
    // sudo trust anchor --store localhost.crt
    //
    // via https://warlord0blog.wordpress.com/2021/01/17/trusting-ca-certificates-manjaro/ and https://wiki.archlinux.org/title/User:Grawity/Adding_a_trusted_CA_certificate#System-wide_%E2%80%93_Arch,_Fedora_(p11-kit)
    //
    // When generating the self-signed cert,
    // > The most important line is the one that requests the `Common Name`. You need to
    // > enter either the hostname you’ll use to access the server by, or the public IP of
    // > the server. It’s important that this field matches whatever you’ll put into your
    // > browser’s address bar to access the site, as a mismatch will cause more security
    // > errors.
    // >
    // > -- https://www.digitalocean.com/community/tutorials/how-to-create-a-self-signed-ssl-certificate-for-apache-in-ubuntu-16-04
    const key = fs.readFileSync(path.join(__dirname, './localhost.key'));
    const cert = fs.readFileSync(path.join(__dirname, './localhost.crt'));
    
    function setupServer({ port, enableHttps }) {
      assert(port !== undefined);
      assert(enableHttps !== undefined);
    
      const app = express();
    
      app.get('/', (req, res) => {
        res.send('Hello World!');
      });
    
      app.get('/foo', (req, res) => {
        res.send('bar');
      });
    
      app.get('/foo.json', (req, res) => {
        res.json({ foo: 'bar' });
      });
    
      const serverOpts = {};
      if (enableHttps) {
        serverOpts.key = key;
        serverOpts.cert = cert;
      }
    
      const server = https.createServer(serverOpts, app);
    
      server.listen(port, () => {
        console.log(`Example app listening on port ${port}`);
      });
    }
    
    setupServer({
      port: 4444,
      enableHttps: true,
    });
  10. Start the main process poetry run synapse_homeserver --config-path homeserver.yaml

  11. Start the worker process poetry run synapse_worker --config-path homeserver.yaml --config-path homeserver_generic_worker1.yaml

  12. POST http://localhost:8008/_matrix/client/v3/join/%23matrix%3Amatrix.org

  13. Check the logs to verify that MatrixFederationAgent.request(...) only happens on the federation_sender

  14. Check the logs to verify that the main process still got the dummy response (if you really want to be sure, you can modify synapse/handlers/directory.py#L270) to log the fed_result.

Todo

  • Add tests
  • Update docs

Dev notes

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO python -m twisted.trial tests.http.test_matrixfederationclient.FederationClientProxyTests

Related tests:

  • tests/http/test_matrixfederationclient.py
  • tests/http/test_proxyagent.py -> MatrixFederationAgentTests
  • tests/federation/test_federation_client.py

Twisted docs:

Twisted source:

What headers to copy over (hop-by-hop vs end-to-end headers):


Manually messing matrix-federation:// against the workers:

$ telnet localhost 8084
GET matrix-federation://gitter.im/_matrix/federation/v1/query/directory?room_alias=%23MadLittleMods_new-room1%3Agitter.im HTTP/1.1


# needs two CRLF at the end (press enter twice)
$ { echo "GET matrix-federation://gitter.im/_matrix/federation/v1/query/directory?room_alias=%23MadLittleMods_new-room1%3Agitter.im HTTP/1.1

"; cat -; } | telnet localhost 8084

Pull Request Checklist

@MadLittleMods MadLittleMods added A-Federation Z-Time-Tracked Element employees should track their time spent on this issue/PR. labels Jun 13, 2023
Maybe fix #15773 (comment)

```
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
```
Conflicts:
	tests/http/test_matrixfederationclient.py
@erikjohnston
Copy link
Member

It'd be useful if someone else could take a quick pass through this to make sure I'm not being a crank.

@erikjohnston erikjohnston requested a review from a team June 28, 2023 07:13
@H-Shay H-Shay requested a review from a team June 30, 2023 16:41
Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.

Only point that seemed dubious to me was that we ignore the federation control lists when using a federation proxy.
I guess we may be expecting the federation proxy to enforce that for us, but even if this is the case I think it would be worth spelling it out.

Comment on lines 411 to 412
hs.config.server.federation_ip_range_whitelist,
hs.config.server.federation_ip_range_blacklist,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably want to assert that these options are not in use, if we are using the proxy, since it seems like we do not respect them (right now).

Copy link
Contributor Author

@MadLittleMods MadLittleMods Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those options will be respected. Workers will delegate outbound federation federation traffic to whatever worker is configured for outbound_federation_restricted_to. And that outbound_federation_restricted_to worker will use the federation_ip_range_allowlist/federation_ip_range_blocklist options.

You have to trust that whatever worker you proxy through but in the Synapse case, I think that's a given (other workers behave as expected).

synapse/config/workers.py Show resolved Hide resolved
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @erikjohnston, @H-Shay, and @reivilibre 🦒


if federation_proxy.tls:
tls_connection_creator = self._policy_for_https.creatorForNetloc(
federation_proxy.host,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@realtyem mentioned that we probably need to copy what https://github.com/matrix-org/synapse/pull/15746/files did for TLS to work properly. host comes in as a str and creatorForNetloc(...) is expecting bytes.

Suggested change
federation_proxy.host,
federation_proxy.host.encode("utf-8"),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #15886

MadLittleMods added a commit that referenced this pull request Jul 6, 2023
Same fix as #15746

Thanks to @realtyem for pointing it out!

`creatorForNetloc(...)` doesn't come with typing and expects `host`
to be `bytes` instead of a `str`.

`ProxyAgent` was introduced with the federation outbound proxy:
#15773
MadLittleMods added a commit that referenced this pull request Jul 6, 2023
MadLittleMods added a commit that referenced this pull request Jul 10, 2023
MadLittleMods added a commit that referenced this pull request Jul 10, 2023
Revert "Federation outbound proxy (#15773)"

This reverts commit b07b14b.
MadLittleMods added a commit that referenced this pull request Jul 10, 2023
Allow configuring the set of workers to proxy outbound federation traffic through (`outbound_federation_restricted_to`).

This is useful when you have a worker setup with `federation_sender` instances responsible for sending outbound federation requests and want to make sure *all* outbound federation traffic goes through those instances. Before this change, the generic workers would still contact federation themselves for things like profile lookups, backfill, etc. This PR allows you to set more strict access controls/firewall for all workers and only allow the `federation_sender`'s to contact the outside world.

The original code is from @erikjohnston's branches which I've gotten in-shape to merge.
MadLittleMods added a commit that referenced this pull request Jul 10, 2023
Same fix as #15746

Thanks to @realtyem for pointing it out!

`creatorForNetloc(...)` doesn't come with typing and expects `host`
to be `bytes` instead of a `str`.

`ProxyAgent` was introduced with the federation outbound proxy:
#15773
MadLittleMods added a commit that referenced this pull request Jul 11, 2023
**Before:**
```
Error retrieving alias
```

**After:**
```
Error retrieving alias #foo:bar -> 401 Unauthorized
```

*Spawning from creating the [manual testing strategy for the outbound federation proxy](#15773
Fizzadar pushed a commit to beeper/synapse-legacy-fork that referenced this pull request Aug 29, 2023
**Before:**
```
Error retrieving alias
```

**After:**
```
Error retrieving alias #foo:bar -> 401 Unauthorized
```

*Spawning from creating the [manual testing strategy for the outbound federation proxy](matrix-org#15773
Fizzadar pushed a commit to beeper/synapse-legacy-fork that referenced this pull request Aug 31, 2023
**Before:**
```
Error retrieving alias
```

**After:**
```
Error retrieving alias #foo:bar -> 401 Unauthorized
```

*Spawning from creating the [manual testing strategy for the outbound federation proxy](matrix-org#15773
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Federation Z-Time-Tracked Element employees should track their time spent on this issue/PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants