Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare for authenticated media freeze #17433

Merged
merged 8 commits into from
Jul 22, 2024
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/17433.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Prepare for authenticated media freeze.
10 changes: 10 additions & 0 deletions docs/usage/configuration/config_documentation.md
Original file line number Diff line number Diff line change
Expand Up @@ -1863,6 +1863,16 @@ federation_rr_transactions_per_room_per_second: 40
## Media Store
Config options related to Synapse's media store.

---
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(on the fence; do you think we ought to add an upgrade notes entry about this change, so people can get ahead of the curve and set the option to false explicitly if they want to make sure their special client won't be bitten by the upcoming freeze? It'd also be a good place to just lay out the plan a bit, since I think for most who don't read TWIM etc, the upcoming changes may not be obvious?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to get @turt2live's opinion on this - perhaps it would make sense to link to the blog post announcing the freeze? I think for now though we can merge this (I'd like to get it into the next rc so people can test against it on beta.matrix.org) and figure out how to handle the upgrade notes, etc soon.

Copy link
Member

@turt2live turt2live Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Server operators are going to be increasingly incentivized to ensure this option is actually set to true rather than false, regardless of client breakage. When we change the default to true we should definitely call out matrix.org's plan as reference, but I wouldn't include language about setting it to false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From brief discussion in backend internal: we should provide users with instructions on how to set it to false, but we can have the changelog read as eventually removing the option (making the feature inevitable)

### `enable_authenticated_media`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this would be clearer as new_media_requires_authentication or something like that? It seems like once the media gets the flag, it stays in that state forever, so the current name doesn't give quite the right impression.

Thanks for writing this up though, this is way clearer about what we're hoping this feature will do, it's worth it just for that alone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mrh, actually, it looks like the two options were merged and this option is named properly, but it'd probably be worth pointing out that disabling this option means all media, including those flagged as 'authenticated media', will be available over the unauthenticated endpoint? Since we disable the auth checks when this is turned off.


When set to true, all subsequent media uploads will be marked as authenticated, and will not be available over legacy
unauthenticated media endpoints (`/_matrix/media/(r0|v3|v1)/download` and `/_matrix/media/(r0|v3|v1)/thumbnail`) - requests for authenticated media over these endpoints will result in a 404. All media, including authenticated media, will be available over the authenticated media endpoints `_matrix/client/v1/media/download` and `_matrix/client/v1/media/thumbnail`. Media uploaded prior to setting this option to true will still be available over the legacy endpoints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Defaults to false, but this will change to true in a future Synapse release.

Example configuration:
```yaml
enable_authenticated_media
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enable_authenticated_media
enable_authenticated_media: true

```
---
### `enable_media_repo`

Expand Down
5 changes: 3 additions & 2 deletions synapse/_scripts/synapse_port_db.py
Original file line number Diff line number Diff line change
Expand Up @@ -119,18 +119,19 @@
"e2e_room_keys": ["is_verified"],
"event_edges": ["is_state"],
"events": ["processed", "outlier", "contains_url"],
"local_media_repository": ["safe_from_quarantine"],
"local_media_repository": ["safe_from_quarantine", "authenticated"],
"per_user_experimental_features": ["enabled"],
"presence_list": ["accepted"],
"presence_stream": ["currently_active"],
"public_room_list_stream": ["visibility"],
"pushers": ["enabled"],
"redactions": ["have_censored"],
"remote_media_cache": ["authenticated"],
"room_stats_state": ["is_federatable"],
"rooms": ["is_public", "has_auth_chain_index"],
"users": ["shadow_banned", "approved", "locked", "suspended"],
"un_partial_stated_event_stream": ["rejection_status_changed"],
"users_who_share_rooms": ["share_private"],
"per_user_experimental_features": ["enabled"],
}


Expand Down
4 changes: 4 additions & 0 deletions synapse/config/repository.py
Original file line number Diff line number Diff line change
Expand Up @@ -272,6 +272,10 @@ def read_config(self, config: JsonDict, **kwargs: Any) -> None:
remote_media_lifetime
)

self.enable_authenticated_media = config.get(
"enable_authenticated_media", False
)

def generate_config_section(self, data_dir_path: str, **kwargs: Any) -> str:
assert data_dir_path is not None
media_store = os.path.join(data_dir_path, "media_store")
Expand Down
39 changes: 38 additions & 1 deletion synapse/media/media_repository.py
Original file line number Diff line number Diff line change
Expand Up @@ -430,6 +430,7 @@ async def get_local_media(
media_id: str,
name: Optional[str],
max_timeout_ms: int,
allow_authenticated: bool = True,
federation: bool = False,
) -> None:
"""Responds to requests for local media, if exists, or returns 404.
Expand All @@ -442,6 +443,7 @@ async def get_local_media(
the filename in the Content-Disposition header of the response.
max_timeout_ms: the maximum number of milliseconds to wait for the
media to be uploaded.
allow_authenticated: whether media marked as authenticated may be served to this request
federation: whether the local media being fetched is for a federation request

Returns:
Expand All @@ -451,6 +453,10 @@ async def get_local_media(
if not media_info:
return

if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
if media_info.authenticated:
raise NotFoundError()

self.mark_recently_accessed(None, media_id)

media_type = media_info.media_type
Expand Down Expand Up @@ -481,6 +487,7 @@ async def get_remote_media(
max_timeout_ms: int,
ip_address: str,
use_federation_endpoint: bool,
allow_authenticated: bool = True,
) -> None:
"""Respond to requests for remote media.

Expand All @@ -495,6 +502,8 @@ async def get_remote_media(
ip_address: the IP address of the requester
use_federation_endpoint: whether to request the remote media over the new
federation `/download` endpoint
allow_authenticated: whether media marked as authenticated may be served to this
request

Returns:
Resolves once a response has successfully been written to request
Expand Down Expand Up @@ -526,6 +535,7 @@ async def get_remote_media(
self.download_ratelimiter,
ip_address,
use_federation_endpoint,
allow_authenticated,
)

# We deliberately stream the file outside the lock
Expand All @@ -548,6 +558,7 @@ async def get_remote_media_info(
max_timeout_ms: int,
ip_address: str,
use_federation: bool,
allow_authenticated: bool,
) -> RemoteMedia:
"""Gets the media info associated with the remote file, downloading
if necessary.
Expand All @@ -560,6 +571,8 @@ async def get_remote_media_info(
ip_address: IP address of the requester
use_federation: if a download is necessary, whether to request the remote file
over the federation `/download` endpoint
allow_authenticated: whether media marked as authenticated may be served to this
request

Returns:
The media info of the file
Expand All @@ -581,6 +594,7 @@ async def get_remote_media_info(
self.download_ratelimiter,
ip_address,
use_federation,
allow_authenticated,
)

# Ensure we actually use the responder so that it releases resources
Expand All @@ -598,6 +612,7 @@ async def _get_remote_media_impl(
download_ratelimiter: Ratelimiter,
ip_address: str,
use_federation_endpoint: bool,
allow_authenticated: bool,
) -> Tuple[Optional[Responder], RemoteMedia]:
"""Looks for media in local cache, if not there then attempt to
download from remote server.
Expand All @@ -619,6 +634,11 @@ async def _get_remote_media_impl(
"""
media_info = await self.store.get_cached_remote_media(server_name, media_id)

if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
# if it isn't cached then don't fetch it or if it's authenticated then don't serve it
if not media_info or media_info.authenticated:
raise NotFoundError()

# file_id is the ID we use to track the file locally. If we've already
# seen the file then reuse the existing ID, otherwise generate a new
# one.
Expand Down Expand Up @@ -792,6 +812,11 @@ async def _download_remote_file(

logger.info("Stored remote media in file %r", fname)

if self.hs.config.media.enable_authenticated_media:
authenticated = True
else:
authenticated = False

return RemoteMedia(
media_origin=server_name,
media_id=media_id,
Expand All @@ -802,6 +827,7 @@ async def _download_remote_file(
filesystem_id=file_id,
last_access_ts=time_now_ms,
quarantined_by=None,
authenticated=authenticated,
)

async def _federation_download_remote_file(
Expand Down Expand Up @@ -915,6 +941,11 @@ async def _federation_download_remote_file(

logger.debug("Stored remote media in file %r", fname)

if self.hs.config.media.enable_authenticated_media:
authenticated = True
else:
authenticated = False

return RemoteMedia(
media_origin=server_name,
media_id=media_id,
Expand All @@ -925,6 +956,7 @@ async def _federation_download_remote_file(
filesystem_id=file_id,
last_access_ts=time_now_ms,
quarantined_by=None,
authenticated=authenticated,
)

def _get_thumbnail_requirements(
Expand Down Expand Up @@ -1030,7 +1062,12 @@ async def generate_local_exact_thumbnail(
t_len = os.path.getsize(output_path)

await self.store.store_local_thumbnail(
media_id, t_width, t_height, t_type, t_method, t_len
media_id,
t_width,
t_height,
t_type,
t_method,
t_len,
)

return output_path
Expand Down
48 changes: 43 additions & 5 deletions synapse/media/thumbnailer.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@

from PIL import Image

from synapse.api.errors import Codes, SynapseError, cs_error
from synapse.api.errors import Codes, NotFoundError, SynapseError, cs_error
from synapse.config.repository import THUMBNAIL_SUPPORTED_MEDIA_FORMAT_MAP
from synapse.http.server import respond_with_json
from synapse.http.site import SynapseRequest
Expand Down Expand Up @@ -274,13 +274,20 @@ async def respond_local_thumbnail(
m_type: str,
max_timeout_ms: int,
for_federation: bool,
allow_authenticated: bool = True,
) -> None:
media_info = await self.media_repo.get_local_media_info(
request, media_id, max_timeout_ms
)
if not media_info:
return

# if the media the thumbnail is generated from is authenticated, don't serve the
# thumbnail over an unauthenticated endpoint
if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
if media_info.authenticated:
raise NotFoundError()

thumbnail_infos = await self.store.get_local_media_thumbnails(media_id)
await self._select_and_respond_with_thumbnail(
request,
Expand All @@ -307,14 +314,20 @@ async def select_or_generate_local_thumbnail(
desired_type: str,
max_timeout_ms: int,
for_federation: bool,
allow_authenticated: bool = True,
) -> None:
media_info = await self.media_repo.get_local_media_info(
request, media_id, max_timeout_ms
)

if not media_info:
return

# if the media the thumbnail is generated from is authenticated, don't serve the
# thumbnail over an unauthenticated endpoint
if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
if media_info.authenticated:
raise NotFoundError()

thumbnail_infos = await self.store.get_local_media_thumbnails(media_id)
for info in thumbnail_infos:
t_w = info.width == desired_width
Expand Down Expand Up @@ -381,14 +394,27 @@ async def select_or_generate_remote_thumbnail(
max_timeout_ms: int,
ip_address: str,
use_federation: bool,
allow_authenticated: bool = True,
) -> None:
media_info = await self.media_repo.get_remote_media_info(
server_name, media_id, max_timeout_ms, ip_address, use_federation
server_name,
media_id,
max_timeout_ms,
ip_address,
use_federation,
allow_authenticated,
)
if not media_info:
respond_404(request)
return

# if the media the thumbnail is generated from is authenticated, don't serve the
# thumbnail over an unauthenticated endpoint
if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
if media_info.authenticated:
respond_404(request)
return

thumbnail_infos = await self.store.get_remote_media_thumbnails(
server_name, media_id
)
Expand Down Expand Up @@ -446,16 +472,28 @@ async def respond_remote_thumbnail(
max_timeout_ms: int,
ip_address: str,
use_federation: bool,
allow_authenticated: bool = True,
) -> None:
# TODO: Don't download the whole remote file
# We should proxy the thumbnail from the remote server instead of
# downloading the remote file and generating our own thumbnails.
media_info = await self.media_repo.get_remote_media_info(
server_name, media_id, max_timeout_ms, ip_address, use_federation
server_name,
media_id,
max_timeout_ms,
ip_address,
use_federation,
allow_authenticated,
)
if not media_info:
return

# if the media the thumbnail is generated from is authenticated, don't serve the
# thumbnail over an unauthenticated endpoint
if self.hs.config.media.enable_authenticated_media and not allow_authenticated:
if media_info.authenticated:
raise NotFoundError()

thumbnail_infos = await self.store.get_remote_media_thumbnails(
server_name, media_id
)
Expand Down Expand Up @@ -485,8 +523,8 @@ async def _select_and_respond_with_thumbnail(
file_id: str,
url_cache: bool,
for_federation: bool,
server_name: Optional[str] = None,
media_info: Optional[LocalMedia] = None,
server_name: Optional[str] = None,
) -> None:
"""
Respond to a request with an appropriate thumbnail from the previously generated thumbnails.
Expand Down
3 changes: 2 additions & 1 deletion synapse/rest/media/download_resource.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ async def on_GET(

if self._is_mine_server_name(server_name):
await self.media_repo.get_local_media(
request, media_id, file_name, max_timeout_ms
request, media_id, file_name, max_timeout_ms, allow_authenticated=False
)
else:
allow_remote = parse_boolean(request, "allow_remote", default=True)
Expand All @@ -106,4 +106,5 @@ async def on_GET(
max_timeout_ms,
ip_address,
False,
allow_authenticated=False,
)
5 changes: 4 additions & 1 deletion synapse/rest/media/thumbnail_resource.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,7 @@ async def on_GET(
m_type,
max_timeout_ms,
False,
allow_authenticated=False,
)
else:
await self.thumbnail_provider.respond_local_thumbnail(
Expand All @@ -107,6 +108,7 @@ async def on_GET(
m_type,
max_timeout_ms,
False,
allow_authenticated=False,
)
self.media_repo.mark_recently_accessed(None, media_id)
else:
Expand Down Expand Up @@ -134,6 +136,7 @@ async def on_GET(
m_type,
max_timeout_ms,
ip_address,
False,
use_federation=False,
allow_authenticated=False,
)
self.media_repo.mark_recently_accessed(server_name, media_id)
Loading
Loading