Ops error with OPERA_L3_DSWX-HLS_V1 browse images #36

Open
frankinspace opened this issue Aug 5, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@frankinspace
Member

Starting with the release of bignbit 0.1.1 on 2024-07-31, GIBS delivery of browse images has been interrupted, so no browse images are available through Worldview for this collection.

frankinspace added the bug label Aug 5, 2024
@frankinspace
Member Author

@ymchenjpl discovered that the old SNS response topic was removed during deployment of bignbit 0.1.1, causing no responses to be received from GIBS.

Took corrective action in ops:

  1. Manually recreated svc-pobit-podaac-ops-cumulus-gibs-response-topic in the OPS AWS Console. Confirmed that permissions are set to allow GIBS to publish to that topic.
  2. Subscribed svc-bignbit-podaac-ops-cumulus-gibs-response-queue to the manually created svc-pobit-podaac-ops-cumulus-gibs-response-topic.

This should restore bignbit operations now.
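
For reference, the two manual console steps could be scripted roughly as follows. This is only a sketch, not what was actually run: the region and both account IDs are placeholders I made up, the policy shape is one plausible way to grant cross-account publish, and `recreate_and_subscribe` needs boto3 plus OPS credentials. Note that setting the `Policy` attribute replaces the topic's whole access policy, and that the queue's own access policy must already allow this topic to send to it.

```python
import json

# Topic and queue names come from the steps above.
TOPIC_NAME = "svc-pobit-podaac-ops-cumulus-gibs-response-topic"
QUEUE_NAME = "svc-bignbit-podaac-ops-cumulus-gibs-response-queue"

# Everything below is a hypothetical placeholder, not a real value.
REGION = "us-west-2"
PODAAC_ACCOUNT = "111111111111"
GIBS_ACCOUNT = "222222222222"


def arn(service, name, account=PODAAC_ACCOUNT, region=REGION):
    """Build an SNS or SQS ARN for a named resource."""
    return f"arn:aws:{service}:{region}:{account}:{name}"


def topic_policy(topic_arn, publisher_account):
    """Access policy letting another AWS account publish to the topic."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGibsPublish",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{publisher_account}:root"},
            "Action": "sns:Publish",
            "Resource": topic_arn,
        }],
    }


def recreate_and_subscribe():
    """Step 1 (recreate topic, set permissions) and step 2 (subscribe queue)."""
    import boto3  # third-party; imported lazily so the helpers above run without it

    sns = boto3.client("sns", region_name=REGION)
    sqs = boto3.client("sqs", region_name=REGION)

    # Step 1: create_topic returns the existing topic if the name is already taken.
    topic_arn = sns.create_topic(Name=TOPIC_NAME)["TopicArn"]
    sns.set_topic_attributes(
        TopicArn=topic_arn,
        AttributeName="Policy",
        AttributeValue=json.dumps(topic_policy(topic_arn, GIBS_ACCOUNT)),
    )

    # Step 2: look up the queue ARN and subscribe it to the topic.
    queue_url = sqs.get_queue_url(QueueName=QUEUE_NAME)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]
    return sns.subscribe(
        TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn
    )["SubscriptionArn"]


if __name__ == "__main__":
    # Dry run: only print the policy that would be attached to the topic.
    print(json.dumps(topic_policy(arn("sns", TOPIC_NAME), GIBS_ACCOUNT), indent=2))
```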

Then, next steps would be:

  1. Update the GIBS ICD to change the response topic from svc-pobit-podaac-ops-cumulus-gibs-response-topic to the new svc-bignbit-podaac-ops-cumulus-gibs-response-topic.
  2. Once GIBS updates to the new topic and we confirm we are still receiving the responses, remove the manually created svc-pobit-podaac-ops-cumulus-gibs-response-topic.

@frankinspace
Member Author

GIBS ops has reported:

Last successful pull from our side was 7/31/24 at 06:52:26 (local time EDT). And nothing since
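
When lining this up against our own logs, which are usually in UTC, the reported local time can be converted like this. A small sketch that assumes EDT means a fixed UTC-4 offset:

```python
from datetime import datetime, timedelta, timezone

# EDT is UTC-4; a fixed offset avoids depending on a timezone database.
EDT = timezone(timedelta(hours=-4), "EDT")

# Timestamp as reported by GIBS ops: 7/31/24 at 06:52:26 local time.
last_pull_local = datetime.strptime(
    "7/31/24 06:52:26", "%m/%d/%y %H:%M:%S"
).replace(tzinfo=EDT)

last_pull_utc = last_pull_local.astimezone(timezone.utc)
print(last_pull_utc.isoformat())  # 2024-07-31T10:52:26+00:00
```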

@frankinspace
Member Author

Still have not received any responses from GIBS after re-establishing the correct response topic, which indicates there is another problem going on.

May need to consider rolling back bignbit update.

@frankinspace
Member Author

Plan is to roll back to big v0.3.3 and pobit v0.4.1 in UAT and retry sending an OPERA granule to GIBS UAT. If that works, we can also roll back ops; if it doesn't, we will need further debugging.

@voxparcxls

Rollback PR for big v0.3.3 and pobit v0.4.1:
https://github.jpl.nasa.gov/podaac/cumulus-deploy-tf/pull/360

@frankinspace
Member Author

PO.DAAC has re-deployed the UAT venue with the v0.3.3 BIG and v0.4.1 POBIT components as a dry run for fixing the OPS venue.

Three OPERA_L3_DSWX-HLS_V1 granules were sent to GIBS UAT:

OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0
OPERA_L3_DSWx-HLS_T01KAA_20240730T220003Z_20240804T020052Z_L8_30_v1.0
OPERA_L3_DSWx-HLS_T01KBU_20240731T221939Z_20240802T034200Z_S2B_30_v1.0

GIBS confirmed they processed the following in UAT:

OPERA_L3_DSWx-HLS_T01KBU_001063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KBU_002063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KAA_320064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01KBU_002064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01KAA_001064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01FBE_319036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209
OPERA_L3_DSWx-HLS_T01KAA_001065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01FBE_320036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209
OPERA_L3_DSWx-HLS_T01KAA_320065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212
OPERA_L3_DSWx-HLS_T01KBU_001064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213
OPERA_L3_DSWx-HLS_T01FBE_001036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209

PO.DAAC also confirmed that responses were received for the 3 granules in UAT. Will gain consensus and then apply the rollback to ops.
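
To cross-check the list GIBS sent back against the three granules we submitted, something like the following works. It is a sketch based only on the names in this comment: the pattern assumes a browse name is the granule name with an extra six-digit token after the tile ID (its meaning is not stated here) and a `_BROWSE_YYYYDDD` suffix, which in these samples matches the day of year of the first timestamp.

```python
import re
from collections import Counter
from datetime import datetime

GRANULES = [
    "OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0",
    "OPERA_L3_DSWx-HLS_T01KAA_20240730T220003Z_20240804T020052Z_L8_30_v1.0",
    "OPERA_L3_DSWx-HLS_T01KBU_20240731T221939Z_20240802T034200Z_S2B_30_v1.0",
]

BROWSE = [
    "OPERA_L3_DSWx-HLS_T01KBU_001063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KBU_002063_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KAA_320064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01KBU_002064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01KAA_001064_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01FBE_319036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
    "OPERA_L3_DSWx-HLS_T01KAA_001065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01FBE_320036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
    "OPERA_L3_DSWx-HLS_T01KAA_320065_20240730T220003Z_20240804T020052Z_L8_30_v1.0_BROWSE_2024212",
    "OPERA_L3_DSWx-HLS_T01KBU_001064_20240731T221939Z_20240802T034200Z_S2B_30_v1.0_BROWSE_2024213",
    "OPERA_L3_DSWx-HLS_T01FBE_001036_20240727T215911Z_20240803T144035Z_S2A_30_v1.0_BROWSE_2024209",
]

# Assumed layout: <head>_<6-digit token>_<tail>_BROWSE_<YYYYDDD>
BROWSE_RE = re.compile(
    r"^(?P<head>OPERA_L3_DSWx-HLS_T[0-9A-Z]{5})_(?P<extra>\d{6})_"
    r"(?P<tail>(?P<acq>\d{8})T\d{6}Z_\d{8}T\d{6}Z_[0-9A-Z]+_\d+_v[\d.]+)"
    r"_BROWSE_(?P<yyyyddd>\d{7})$"
)


def granule_of(browse_name):
    """Return (granule name, whether the YYYYDDD suffix matches the first date)."""
    m = BROWSE_RE.match(browse_name)
    if m is None:
        raise ValueError(f"unexpected browse name: {browse_name}")
    first_day = datetime.strptime(m["acq"], "%Y%m%d").date()
    browse_day = datetime.strptime(m["yyyyddd"], "%Y%j").date()
    return f"{m['head']}_{m['tail']}", first_day == browse_day


counts = Counter()
day_mismatches = []
for name in BROWSE:
    granule, days_agree = granule_of(name)
    counts[granule] += 1
    if not days_agree:
        day_mismatches.append(name)

# Submitted granules with no browse image in the GIBS list.
missing = [g for g in GRANULES if g not in counts]

for granule in GRANULES:
    print(counts[granule], granule)
print("missing:", missing)
```

For the names above, every submitted granule has at least one browse image in the GIBS list and none are missing.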

@frankinspace
Member Author

The rollback was applied in OPS. Confirmed success of OPERA_L3_DSWx-HLS_T01FBE_20240727T215911Z_20240803T144035Z_S2A_30_v1.0 in OPS with GIBS. The count of responses returned from GIBS increased from the 0 it was showing previously.

@viviant100

Deployed to ops on 8/6/24.

@torimcd

torimcd commented Sep 9, 2024

The issue was in the GITC configuration; a fix is in place in UAT. Testing with bignbit 0.1.1 in the 24.3 IP sprint via podaac/bignbit#4.


4 participants