Merge pull request #1 from lsst-dm/tickets/DM-40565
DM-40565: Initial code for S3 daemon.
Showing 9 changed files with 283 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
name: lint | ||
|
||
on: | ||
  push: | ||
    branches: | ||
      - main | ||
  pull_request: | ||
|
||
jobs: | ||
  call-workflow: | ||
    uses: lsst/rubin_workflows/.github/workflows/lint.yaml@main | ||
  ruff: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - uses: actions/checkout@v3 | ||
      - uses: chartboost/ruff-action@v1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
name: Check Python formatting | ||
|
||
on: | ||
  push: | ||
    branches: | ||
      - main | ||
  pull_request: | ||
|
||
jobs: | ||
  call-workflow: | ||
    uses: lsst/rubin_workflows/.github/workflows/formatting.yaml@main |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Copyright 2023 The Board of Trustees of the Leland Stanford Junior University, through SLAC National Accelerator Laboratory |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,53 @@ | ||
# s3daemon | ||
Client/server for pushing to objects to S3 storage. | ||
s3daemon | ||
======== | ||
|
||
Client/server for pushing objects to S3 storage. | ||
|
||
The server is intended to maintain long-lived TCP connections, avoiding both authentication delays and TCP slow start on network segments with a high bandwidth-delay product. | ||
Allowing multiple transfers to run in parallel is also intended to maximize usage of the network. | ||
|
||
The client is intended to allow "fire-and-forget" submissions of transfer requests by file-writing code. | ||
|
||
Prerequisites | ||
------------- | ||
|
||
As specified in ``requirements.txt``, the ``aiobotocore`` Python package must be available for the server. | ||
This can be installed into a system Python with ``pip``, or it can be installed into a Conda environment that is activated when the server is started. | ||
``micromamba`` (https://mamba.readthedocs.io/en/latest/micromamba-installation.html) or ``mambaforge`` (https://mamba.readthedocs.io/en/latest/mamba-installation.html) are the recommended ways of obtaining a Conda Python. | ||
|
||
The client requires only a vanilla Python 3.x. | ||
|
||
TCP port 15555 on ``localhost`` is used for client/server communication. | ||
This port number is currently hard-coded but could be made configurable if needed. | ||
|
||
s3daemon.py | ||
----------- | ||
|
||
``s3daemon.py`` is the server that maintains a connection pool to the object store and mediates the transmissions. | ||
|
||
It is configured via three environment variables: | ||
|
||
- ``S3_ENDPOINT_URL``: The URL to the object store endpoint (e.g. ``https://s3dfrgw.slac.stanford.edu``). | ||
- ``AWS_ACCESS_KEY_ID``: The access key credential for the object store. | ||
- ``AWS_SECRET_ACCESS_KEY``: The secret key credential for the object store. | ||
|
||
Note that having the credentials in the environment may allow them to be visible to other users of the host on which the server runs. | ||
|
||
The server accepts client connections, receives filename/destination pairs, opens the file, and uses ``put_object`` to send it to the object store. | ||
Note that the file must be readable by the UID that the server runs under. | ||
A connection pool (currently hard-coded to 25 connections, but can be made configurable) and the ``tcp_keepalive`` parameter are used to minimize connection overhead. | ||
Python ``asyncio`` is used to allow potential interleaving of multiple transmissions at the same time. | ||
|
||
Currently, the start time (in Unix seconds since the epoch) and total send time for each object are written to standard output for performance monitoring. | ||
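A minimal sketch of how the timing lines described above could be consumed for performance monitoring. It assumes each line has the form `<start> <elapsed> sec`, which matches the `print` call in `s3daemon.py`; the names `Timing` and `parse_timing` are hypothetical and not part of this commit.

```python
from typing import NamedTuple


class Timing(NamedTuple):
    """One timing record written by the server (hypothetical helper type)."""

    start: float  # Unix seconds since the epoch
    elapsed: float  # total send time in seconds


def parse_timing(line: str) -> Timing:
    """Parse one '<start> <elapsed> sec' line from the server's standard output."""
    start, elapsed, unit = line.split()
    if unit != "sec":
        raise ValueError(f"unexpected unit: {unit!r}")
    return Timing(float(start), float(elapsed))
```

Feeding the server's standard output through `parse_timing` line by line would give records that can be aggregated, for example to compute a mean send time.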
|
||
send.py | ||
------- | ||
|
||
``send.py`` is the client. | ||
It takes two command line parameters: the filename to be transmitted and the object store destination in ``{alias}/{bucket}/{key}`` format. | ||
The filename must not contain spaces. | ||
The alias is present for compatibility with the ``mc`` client and is ignored. | ||
The key may contain slashes. | ||
|
||
If a successful result is returned from the server, there is no output, and the exit status is 0. | ||
Otherwise, the error will be output to standard error, and the process exit status will be 1. |
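The destination format described above can be illustrated with a short sketch. It mirrors the `split("/", maxsplit=2)` call used by the server, but adds validation that the commit itself does not perform; the function name `parse_dest` is hypothetical.

```python
def parse_dest(dest: str) -> tuple[str, str]:
    """Split '{alias}/{bucket}/{key}' into (bucket, key), ignoring the alias."""
    # maxsplit=2 keeps any further slashes as part of the key.
    parts = dest.split("/", maxsplit=2)
    if len(parts) != 3 or not parts[1] or not parts[2]:
        raise ValueError(f"expected alias/bucket/key, got {dest!r}")
    _alias, bucket, key = parts
    return bucket, key
```

For example, a destination of `myalias/mybucket/raw/2023/file.fits` would be split into the bucket `mybucket` and the key `raw/2023/file.fits`.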
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
[tool.black] | ||
line-length = 110 | ||
target-version = ["py311"] | ||
|
||
[tool.isort] | ||
profile = "black" | ||
line_length = 110 | ||
|
||
[tool.pydocstyle] | ||
convention = "numpy" | ||
# Our coding style does not require docstrings for magic methods (D105) | ||
# Our docstyle documents __init__ at the class level (D107) | ||
# We allow methods to inherit docstrings and this is not compatible with D102. | ||
# Docstring at the very first line is not required | ||
# D200, D205 and D400 all complain if the first sentence of the docstring does | ||
# not fit on one line. We do not require docstrings in __init__ files (D104). | ||
add-ignore = ["D107", "D105", "D102", "D100", "D200", "D205", "D400", "D104"] | ||
|
||
[tool.ruff] | ||
ignore = [ | ||
"N802", | ||
"N803", | ||
"N806", | ||
"N812", | ||
"N815", | ||
"N816", | ||
"N999", | ||
"D107", | ||
"D105", | ||
"D102", | ||
"D104", | ||
"D100", | ||
"D200", | ||
"D205", | ||
"D400", | ||
] | ||
line-length = 110 | ||
select = [ | ||
    "E", # pycodestyle | ||
    "F", # pyflakes | ||
    "N", # pep8-naming | ||
    "W", # pycodestyle | ||
    "D", # pydocstyle | ||
] | ||
target-version = "py311" | ||
# Commented out to suppress "unused noqa" in jenkins which has older ruff not | ||
# generating E721. | ||
#extend-select = [ | ||
# "RUF100", # Warn about unused noqa | ||
#] | ||
|
||
[tool.ruff.pycodestyle] | ||
max-doc-length = 79 | ||
|
||
[tool.ruff.pydocstyle] | ||
convention = "numpy" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# This file is part of s3daemon. | ||
# | ||
# Developed for the LSST Data Management System. | ||
# This product includes software developed by the LSST Project | ||
# (https://www.lsst.org). | ||
# See the COPYRIGHT file at the top-level directory of this distribution | ||
# for details of code ownership. | ||
# | ||
# This program is free software: you can redistribute it and/or modify | ||
# it under the terms of the GNU General Public License as published by | ||
# the Free Software Foundation, either version 3 of the License, or | ||
# (at your option) any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU General Public License | ||
# along with this program. If not, see <https://www.gnu.org/licenses/>. | ||
|
||
import asyncio | ||
import os | ||
import time | ||
|
||
import aiobotocore.session | ||
import botocore.config | ||
|
||
config = botocore.config.Config( | ||
    max_pool_connections=25, | ||
    tcp_keepalive=True, | ||
    s3=dict( | ||
        payload_signing_enabled=False, | ||
        addressing_style="path", | ||
    ), | ||
) | ||
|
||
PORT = 15555 | ||
endpoint_url = os.environ["S3_ENDPOINT_URL"] | ||
access_key = os.environ["AWS_ACCESS_KEY_ID"] | ||
secret_key = os.environ["AWS_SECRET_ACCESS_KEY"] | ||
|
||
|
||
async def handle_client(client, reader, writer): | ||
    """Handle a client connection to the server socket. | ||
|
||
    Parameters | ||
    ---------- | ||
    client : `S3` | ||
        The S3 client to use to talk to the object store. | ||
    reader : `asyncio.StreamReader` | ||
        A stream connected to the socket to read the filename/destination pair. | ||
    writer : `asyncio.StreamWriter` | ||
        A stream connected to the socket to write back status information. | ||
    """ | ||
    line = (await reader.readline()).decode("UTF-8").rstrip() | ||
    start = time.time() | ||
    try: | ||
        # Parse and open inside the try block so that a malformed request or | ||
        # an unreadable file is reported to the client instead of being lost. | ||
        filename, dest = line.split(" ") | ||
        # Ignore the alias. | ||
        _, bucket, key = dest.split("/", maxsplit=2) | ||
        with open(filename, "rb") as f: | ||
            await client.put_object(Body=f, Bucket=bucket, Key=key) | ||
        writer.write(b"Success") | ||
    except Exception as e: | ||
        writer.write(bytes(repr(e), "UTF-8")) | ||
    finally: | ||
        # Flush the reply and close the connection so that sockets do not leak. | ||
        await writer.drain() | ||
        writer.close() | ||
    print(start, time.time() - start, "sec") | ||
|
||
|
||
async def main(): | ||
    """Start the server.""" | ||
    session = aiobotocore.session.get_session() | ||
    async with session.create_client( | ||
        "s3", | ||
        aws_access_key_id=access_key, | ||
        aws_secret_access_key=secret_key, | ||
        endpoint_url=endpoint_url, | ||
        config=config, | ||
    ) as client: | ||
|
||
        async def client_cb(reader, writer): | ||
            await handle_client(client, reader, writer) | ||
|
||
        server = await asyncio.start_server(client_cb, "localhost", PORT) | ||
        async with server: | ||
            await server.serve_forever() | ||
|
||
|
||
if __name__ == "__main__": | ||
    asyncio.run(main()) |
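The request/reply exchange handled by the server can be exercised without an object store. The following is a sketch only: `fake_upload` is a hypothetical stand-in for the `put_object` call, `handle` and `roundtrip` are illustrative names, and an ephemeral port is used instead of the hard-coded 15555.

```python
import asyncio


async def fake_upload(filename: str, bucket: str, key: str) -> None:
    """Hypothetical stand-in for the S3 upload; rejects an empty bucket."""
    if not bucket:
        raise ValueError("empty bucket")


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Read one 'filename dest' line and reply with a status string."""
    try:
        line = (await reader.readline()).decode("UTF-8").rstrip()
        filename, dest = line.split(" ")
        _alias, bucket, key = dest.split("/", maxsplit=2)
        await fake_upload(filename, bucket, key)
        writer.write(b"Success")
    except Exception as e:
        # Any failure is sent back as the repr of the exception.
        writer.write(bytes(repr(e), "UTF-8"))
    finally:
        await writer.drain()
        writer.close()
        await writer.wait_closed()


async def roundtrip(request: str) -> str:
    """Start a throwaway server on an ephemeral port and send one request."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(request.encode("UTF-8"))
        await writer.drain()
        # Read until the server closes its side of the connection.
        reply = (await reader.read()).decode("UTF-8")
        writer.close()
        await writer.wait_closed()
    return reply
```

Calling `asyncio.run(roundtrip("/tmp/x.fits alias/bucket/a/b\n"))` sends one well-formed request through the stub. A request without a space is answered with the `repr` of the resulting exception rather than a success marker.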
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# This file is part of s3daemon. | ||
# | ||
# Developed for the LSST Data Management System. | ||
# This product includes software developed by the LSST Project | ||
# (https://www.lsst.org). | ||
# See the COPYRIGHT file at the top-level directory of this distribution | ||
# for details of code ownership. | ||
# | ||
# This program is free software: you can redistribute it and/or modify | ||
# it under the terms of the GNU General Public License as published by | ||
# the Free Software Foundation, either version 3 of the License, or | ||
# (at your option) any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU General Public License | ||
# along with this program. If not, see <https://www.gnu.org/licenses/>. | ||
|
||
import socket | ||
import sys | ||
|
||
PORT = 15555 | ||
|
||
|
||
def send(filename, dest): | ||
    """Send a filename/destination pair to the server. | ||
|
||
    Parameters | ||
    ---------- | ||
    filename : `str` | ||
        Name of the file to be uploaded. May not contain spaces. | ||
    dest : `str` | ||
        Destination in ``{alias}/{bucket}/{key}`` format. | ||
        The key may contain slashes. | ||
    """ | ||
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock: | ||
        sock.connect(("localhost", PORT)) | ||
        sock.sendall(bytes(f"{filename} {dest}\n", "UTF-8")) | ||
        received = str(sock.recv(4096), "utf-8") | ||
    # Leaving the ``with`` block closes the socket; no explicit close needed. | ||
    if received.startswith("Success"): | ||
        sys.exit(0) | ||
    else: | ||
        print(received, file=sys.stderr) | ||
        sys.exit(1) | ||
|
||
|
||
if __name__ == "__main__": | ||
    send(sys.argv[1], sys.argv[2]) |
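File-writing code that wants to call the client from within Python, rather than as a separate process, might prefer a variant that returns a status instead of calling `sys.exit`. The following is a sketch under that assumption; `send_request` and its `port` and `timeout` parameters are hypothetical and not part of this commit.

```python
import socket


def send_request(filename: str, dest: str, port: int = 15555, timeout: float = 60.0) -> tuple[bool, str]:
    """Send one request to the daemon and return (ok, message)."""
    if " " in filename:
        # The wire format separates filename and destination with a space.
        raise ValueError("filename must not contain spaces")
    with socket.create_connection(("localhost", port), timeout=timeout) as sock:
        sock.sendall(f"{filename} {dest}\n".encode("UTF-8"))
        received = sock.recv(4096).decode("UTF-8")
    return received.startswith("Success"), received
```

The timeout guards against a server that accepts the connection but never replies; the value of 60 seconds is arbitrary and would need tuning for large files.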
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
aiobotocore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
[flake8] | ||
max-line-length = 110 | ||
max-doc-length = 79 | ||
ignore = W503, E203, N802, N803, N806, N812, N815, N816 |