DM-46586: Modify embargo-butler auto-ingest to handle photodiode files #60

Merged 2 commits on Nov 1, 2024

8 changes: 4 additions & 4 deletions README.md
@@ -8,14 +8,14 @@ and automatically ingest them into an appropriate Butler repository.
Containers
----------

-Three containers are built from this repo: enqueue, ingest, and idle.
+Four containers are built from this repo: enqueue, ingest, idle, and presence.

Building
--------

-Pull requests will build versions of all three containers tagged with the PR's "head ref".
-If a tag (which should always be on main and prefixed with "v") is pushed, versions of all three containers with that tag's version number will be built.
-Otherwise, merges to main will result in versions of all three containers tagged with "latest".
+Pull requests will build versions of all four containers tagged with the PR's "head ref".
+If a tag (which should always be on main and prefixed with "v") is pushed, versions of all four containers with that tag's version number will be built.
+Otherwise, merges to main will result in versions of all four containers tagged with "latest".

If the code in this repo has not changed but a new version of the Science Pipelines stack is needed for the ingest container, the ["On-demand ingest build" workflow](https://github.com/lsst-dm/embargo-butler/actions/workflows/build-manually.yaml) should be executed.
It takes a ref — which should be the latest tag, not usually a branch — a rubin-env version, and the Science Pipelines release tag (e.g. `w_2023_41`).
5 changes: 4 additions & 1 deletion src/info.py
@@ -146,9 +146,12 @@ def __init__(self, path):
                 self.instrument = f"{csc}/{generator}"
             elif len(components) == 6:
                 self.bucket, self.instrument, year, month, day, self.filename = components
+            elif len(components) == 5:  # photodiode data
Review comment (Collaborator):
Hm. This really says that the Camera is not following salobj precedent in generating its LFA data. I guess it may be too late to get that changed.

+                self.bucket, self.instrument, data_type, self.obs_day, self.filename = components
             else:
                 raise ValueError(f"Unrecognized number of components: {len(components)}")
-            self.obs_day = f"{year}{month}{day}"
+            if not self.obs_day:
+                self.obs_day = f"{year}{month}{day}"
         except Exception:
             logger.exception("Unable to parse: %s", path)
             raise
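
The two path layouts handled in this hunk can be illustrated with a standalone sketch. It is not the repository's `Info` class: `ParsedPath` and `parse_lfa_path` are hypothetical names, the example paths are invented, splitting on `/` is an assumption, and the branch that builds `instrument` from `csc` and `generator` is left out because its path length is not visible in the diff.

```python
from dataclasses import dataclass


@dataclass
class ParsedPath:
    """Hypothetical container for the fields the diff extracts."""

    bucket: str
    instrument: str
    obs_day: str
    filename: str


def parse_lfa_path(path: str) -> ParsedPath:
    """Split an object path and derive obs_day (illustrative only)."""
    components = path.split("/")  # assumption: "/"-separated path
    obs_day = None
    if len(components) == 6:
        # bucket/instrument/year/month/day/filename
        bucket, instrument, year, month, day, filename = components
    elif len(components) == 5:
        # bucket/instrument/data_type/obs_day/filename (photodiode layout)
        bucket, instrument, _data_type, obs_day, filename = components
    else:
        raise ValueError(f"Unrecognized number of components: {len(components)}")
    if not obs_day:
        # Only the six-component layout needs obs_day assembled from parts.
        obs_day = f"{year}{month}{day}"
    return ParsedPath(bucket, instrument, obs_day, filename)


# Invented example paths, one per layout.
six = parse_lfa_path("some-bucket/SomeCSC/2024/10/30/file.fits")
five = parse_lfa_path("some-bucket/MTCamera/photodiode/20241030/x_photodiode.ecsv")
print(six.obs_day, five.obs_day)
```

Both layouts end up with the same eight-digit `obs_day`, which is why the merged change only builds it from `year`, `month`, and `day` when it has not already been set.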
41 changes: 40 additions & 1 deletion src/ingest.py
@@ -24,13 +24,16 @@
 """
 import json
 import os
+import re
 import socket
 import time

 import astropy.io.fits
 import requests
 from lsst.daf.butler import Butler
+from lsst.pipe.base import Instrument
 from lsst.obs.base import DefineVisitsTask, RawIngestTask
+from lsst.obs.lsst import PhotodiodeIngestTask
 from lsst.resources import ResourcePath

 from info import Info
@@ -200,7 +203,18 @@ def main():
         on_metadata_failure=on_metadata_failure,
     )

-    if not is_lfa:
+    if is_lfa:
+        # LSSTCam photodiode is copy mode only.
+        instrument = Instrument.from_string("LSSTCam", butler.registry)
+        lsstcam_photodiode_ingester = PhotodiodeIngestTask(
+            config=PhotodiodeIngestTask.ConfigClass(),
+            butler=butler,
+            instrument=instrument,
+            on_success=on_success,
+            on_ingest_failure=on_ingest_failure,
+            on_metadata_failure=on_metadata_failure,
+        )
+    else:
         define_visits_config = DefineVisitsTask.ConfigClass()
         define_visits_config.groupExposures = "one-to-one"
         visit_definer = DefineVisitsTask(config=define_visits_config, butler=butler)
@@ -220,6 +234,15 @@ def main():

                 logger.info("Ingesting %s", resources)
                 refs = None
+                if is_lfa:
+                    resources_photodiode = []
+                    resources_others = []
+                    for resource in resources:
+                        if re.search(r"MTCamera/photodiode.*_photodiode.ecsv", resource):
+                            resources_photodiode.append(resource)
+                        else:
+                            resources_others.append(resource)
+                    resources = resources_others
                 try:
                     refs = ingester.run(resources)
                 except Exception:
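
The photodiode split added in this hunk can be sketched in isolation. This is an illustration under assumptions, not the merged code: `split_photodiode` is a hypothetical helper, it takes plain URL strings (the diff passes `resource` directly to `re.search`), the sample URLs are invented, and the pattern escapes the dot before `ecsv`, which the merged pattern does not.

```python
import re

# Same shape as the pattern in the diff, with the final dot escaped
# (an assumption made for this sketch only).
PHOTODIODE_RE = re.compile(r"MTCamera/photodiode.*_photodiode\.ecsv")


def split_photodiode(urls):
    """Partition URL strings into (photodiode, others), preserving order."""
    photodiode, others = [], []
    for url in urls:
        if PHOTODIODE_RE.search(url):
            photodiode.append(url)
        else:
            others.append(url)
    return photodiode, others


# Invented sample URLs.
sample = [
    "s3://some-bucket/MTCamera/photodiode/20241030/x_photodiode.ecsv",
    "s3://some-bucket/SomeCSC/2024/10/30/file.fits",
]
photodiode, others = split_photodiode(sample)
print(photodiode)
print(others)
```

Each list can then go to its own ingest task, which is how the merged change routes photodiode files to `PhotodiodeIngestTask` and everything else to the existing ingester.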
@@ -233,6 +256,22 @@ def main():
                             info = Info.from_path(resource.geturl())
                             r.lrem(worker_queue, 0, info.path)

+                if is_lfa and resources_photodiode:
Review comment (Collaborator):
Since this is the same structure as the previous section, it might be nice to abstract that out into a function, in case we have more ingesters later.

Reply (Contributor Author):
I'm also bothered by the repeated structure and want to abstract it out. But as the camera team wants this asap I'll defer the refactoring to a future ticket.

+                    try:
+                        refs = lsstcam_photodiode_ingester.run(resources_photodiode)
+                    except Exception:
+                        logger.exception(
+                            "Error while ingesting %s, retrying one by one", resources_photodiode
+                        )
+                        refs = []
+                        for resource in resources_photodiode:
+                            try:
+                                refs.extend(lsstcam_photodiode_ingester.run([resource]))
+                            except Exception:
+                                logger.exception("Error while ingesting %s", resource)
+                                info = Info.from_path(resource.geturl())
+                                r.lrem(worker_queue, 0, info.path)
+
                 # Define visits if we ingested anything
                 if not is_lfa and refs:
                     ids = [ref.dataId for ref in refs]
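
The refactoring suggested in review, and deferred to a future ticket, could look roughly like the sketch below. `ingest_with_retry` and its `on_failure` callback are hypothetical names, not part of the repository; `run` stands in for an ingest task's `run` method, and `on_failure` for the `Info.from_path` plus `r.lrem` cleanup shown in the diff.

```python
import logging

logger = logging.getLogger(__name__)


def ingest_with_retry(run, resources, on_failure=None):
    """Run a batch ingest; if it raises, retry each resource on its own.

    Returns the refs collected from whichever calls succeeded.
    """
    try:
        return list(run(resources))
    except Exception:
        logger.exception("Error while ingesting %s, retrying one by one", resources)
    refs = []
    for resource in resources:
        try:
            refs.extend(run([resource]))
        except Exception:
            logger.exception("Error while ingesting %s", resource)
            if on_failure is not None:
                # Hypothetical hook, e.g. removing the entry from the worker queue.
                on_failure(resource)
    return refs
```

With a helper of this shape, the raw and photodiode sections would each reduce to one call that passes the relevant task's `run` method, and a further ingester would need no new retry block.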