Skip errors during harvest and report to Sentry #551

Merged
merged 5 commits into from
Nov 20, 2023

Commits on Nov 16, 2023

  1. Pin vcrpy versions

    Why these changes are being introduced:
    vcrpy 5.x introduces changes that break our test suite and pipenv setup. This will
    be addressed during maintenance, not as part of the current work.
    
    How this addresses that need:
    * pins vcrpy and urllib3
    
    Side effects of this change:
    * pinned versions until updates and maintenance
    
    Relevant ticket(s):
    * https://mitlibraries.atlassian.net/browse/TIMX-258
    ghukill committed Nov 16, 2023
    01bf65b
  2. Gracefully skip records for GetRecords harvest

    Why these changes are being introduced:
    During a harvest that uses the GetRecords approach, if a particular record throws
    an error, the whole harvest fails. The former workaround was an external skip list,
    but maintaining it carried overhead.
    
    By gracefully skipping records and reporting them to Sentry, the harvest can continue
    even for known problematic records.
    
    How this addresses that need:
    * the GetRecords harvest skips errors and sends logs to Sentry
    * a new MAX_ALLOWED_ERRORS config value is added to set the upper limit of failed records allowed
    
    Side effects of this change:
    * Harvests will complete WITHOUT an external skip list even with problematic records
    * DataEng is responsible for monitoring Sentry and triaging skipped records
    * We can stop maintaining an external skip list as an SSM parameter
    
    Relevant ticket(s):
    * https://mitlibraries.atlassian.net/browse/TIMX-258
    * https://mitlibraries.atlassian.net/browse/TIMX-257
    ghukill committed Nov 16, 2023
    4d7bcc8
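
The skip-and-report behavior described in this commit can be illustrated with a minimal sketch. This is not the code from the pull request: the names `get_records_skipping_errors`, `fetch_record`, and `MaxAllowedErrorsReachedError`, and the default limit of 10, are hypothetical. It also assumes that a Sentry logging integration is configured elsewhere so that error-level log records are forwarded to Sentry, and that the harvest aborts once the number of failed records exceeds `MAX_ALLOWED_ERRORS`.

```python
import logging
from typing import Callable, Iterable, Iterator

logger = logging.getLogger(__name__)

# Hypothetical default; the real value would come from configuration.
MAX_ALLOWED_ERRORS = 10


class MaxAllowedErrorsReachedError(Exception):
    """Raised when more records fail than the configured limit allows."""


def get_records_skipping_errors(
    identifiers: Iterable[str],
    fetch_record: Callable[[str], str],
    max_allowed_errors: int = MAX_ALLOWED_ERRORS,
) -> Iterator[str]:
    """Yield fetched records, skipping any record whose fetch raises."""
    error_count = 0
    for identifier in identifiers:
        try:
            record = fetch_record(identifier)
        except Exception as exc:
            error_count += 1
            # Error-level log with traceback; assumed to reach Sentry through
            # a logging integration configured outside this sketch.
            logger.exception("Skipping record '%s' after error", identifier)
            if error_count > max_allowed_errors:
                raise MaxAllowedErrorsReachedError(
                    f"{error_count} records failed, limit is {max_allowed_errors}"
                ) from exc
            continue
        yield record
```

Fetching inside the `try` and yielding outside it keeps exceptions raised by the consumer of the generator from being counted as record failures.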

Commits on Nov 17, 2023

  1. 21bebad
  2. reformatting docstring

    ghukill committed Nov 17, 2023
    89c2d2a

Commits on Nov 20, 2023

  1. 7948e35