add instructions for doing a complete reset of QA or stage (including…
… wiping out preserved SDR content), other touchups

* remove some redundancy from the README
* fix some ambiguous/unnecessary Rails terminology re-use in our own util methods and documentation, see #1154
jmartin-sul committed Aug 25, 2023
1 parent 27b321d commit 3345fac
Showing 3 changed files with 66 additions and 49 deletions.
66 changes: 42 additions & 24 deletions README.md
@@ -169,26 +169,25 @@ See wiki, in particular
- [Audits (basic info)](https://github.com/sul-dlss/preservation_catalog/wiki/Audits-(basic-info))
- [Audits (how to run as needed)](https://github.com/sul-dlss/preservation_catalog/wiki/Audits-(how-to-run-as-needed))

## Seed the Catalog
## Populate the Catalog

Seed the Catalog with data about the Moabs on the storage roots the catalog tracks -- presumes `rake db:seed` already performed
Populate the Catalog with data about the Moabs already written to the storage roots the catalog tracks -- presumes `rake db:seed` already performed

_<sub>Note: "seed" might be slightly confusing terminology here, see https://github.com/sul-dlss/preservation_catalog/issues/1154</sub>_

Seeding the catalog presumes an empty or nearly empty database -- otherwise seeding will throw `druid NOT expected to exist in catalog but was found` errors for each found object.
Seeding does more validation than regular M2C.
Populating the catalog presumes an empty or nearly empty database -- otherwise walking the storage roots to populate it will throw
`druid NOT expected to exist in catalog but was found` errors for each found object. Catalog population does more validation than the
similar `MoabToCatalogJob`/`MoabRecordService::CheckExistence`.

From console:
```ruby
Audit::MoabToCatalog.seed_catalog_for_all_storage_roots
CatalogUtils.populate_catalog_for_all_storage_roots
```

#### Reset the catalog for re-seeding
### Reset the catalog for re-population (without wiping preserved Moabs from storage roots)

**DANGER!** this will erase the catalog, and thus require re-seeding from scratch. It is mostly intended for development purposes, and it is unlikely that you'll _ever_ need to run this against production once the catalog is in regular use.
**DANGER!** this will erase the catalog, and thus require re-population from scratch. It is mostly intended for development purposes, and it is unlikely that you'll _ever_ need to run this against production once the catalog is in regular use.

* Deploy the branch of the code with which you wish to seed, to the instance which you wish to seed (e.g. main to stage).
* Reset the database for that instance. E.g., on production or stage: `RAILS_ENV=production bundle exec rake db:reset`
* Deploy the branch of the code with which you wish to populate the catalog, to the instance which you wish to populate (e.g. `main` to stage).
* Reset the database for that instance. E.g., on production or stage: `RAILS_ENV=production bundle exec rake db:reset` (this will wipe the DB and seed it with the necessary storage root and endpoint info from the app settings, to which other catalog records will refer)
* note that if you do this while `RAILS_ENV=production` (i.e. production or stage), you'll get a scary warning along the lines of:

```
@@ -198,9 +197,11 @@ DISABLE_DATABASE_ENVIRONMENT_CHECK=1
```

This is essentially an especially inconvenient confirmation dialogue. For safety's sake, the full command that skips the warning is left for the user to construct as needed, to prevent unintentional copy/paste dismissal when the user might be administering multiple deployment environments simultaneously. Inadvertent database wipes are no fun.
* `db:reset` will make sure the db is migrated and seeded. If you want to be extra sure: `RAILS_ENV=[environment] bundle exec rake db:migrate db:seed`

### Run `rake db:seed` on remote servers

Useful if storage roots or cloud endpoints are added to configuration. Will not delete entries that have been removed from configuration; that must be done manually using `ActiveRecord`.

These require the same credentials and setup as a regular Capistrano deploy.

```sh
@@ -213,33 +214,26 @@ or
bundle exec cap prod db_seed # for the prod servers
```

### Populate the catalog
### Populate the catalog for a single storage root

In console, start by finding the storage root.

```ruby
msr = MoabStorageRoot.find_by!(name: name)
Audit::MoabToCatalog.seed_catalog_for_dir(msr.storage_location)
```

Or for all roots:
```ruby
MoabStorageRoot.find_each { |msr| Audit::MoabToCatalog.seed_catalog_for_dir(msr.storage_location) }
CatalogUtils.populate_catalog_for_dir(msr.storage_location)
```

## Deploying

Capistrano is used to deploy. You will need SSH access to the targeted servers, via `kinit` and VPN.

```sh
bundle exec cap stage deploy # for the stage servers
bundle exec cap ENV deploy # e.g. bundle exec cap qa deploy
```

Or:
## Jobs

```sh
bundle exec cap prod deploy # for the prod servers
```
Sidekiq is run on one or more worker VMs to handle queued jobs.

### Sidekiq

@@ -304,3 +298,27 @@ ZipPart.all.annotate(caller).where.not(status: 'ok').count
ZipPart.annotate(caller).where.not(status: 'ok').count
ZipPart.where.not(status: 'ok').count.annotate(caller)
```

## Resetting the preservation system

This procedure should reset both the catalog and the Moab storage roots, as one would do when resetting the entire stage or QA environment.

### Requirements

These instructions assume that SDR will be quiet during the reset, and that all other SDR apps will also be reset. Since related services (like DSA) and activity (like `accessionWF` and `preservationIngestWF`) won't be making requests to preservation services, and since e.g. the Cocina data store will be reset, it's not necessary to forcibly make sure that no robots are active, and it's not necessary to worry about syncing content from other apps with preservation.

### Steps

You can view the raw markdown and copy the following checklist into a new issue description for tracking the work (if you notice any necessary changes, please update this README):

- [ ] Quiet Sidekiq workers for preservation_catalog and preservation_robots in the environment being reset. For both apps, dump any remaining queue contents manually.
- [ ] preservation_catalog: stop the web services
- _NOTE_: stopping pres bots workers and pres cat web services will effectively halt accessioning for the environment
- [ ] Work with ops to delete archived content on cloud endpoints for the environment being reset. If Suri is not reset, then druids won't be re-used, and this can be done async from the rest of this process. _But_, if the reset is completed and accessioning starts up again before the old cloud archives are purged, you should dump a list of the old druids before the preservation_catalog DB reset, and give those to ops, so that only old content is purged. You can query for the full druid list (and e.g. redirect or copy and save the output to a file) with the following query from pres cat: `PreservedObject.pluck(:druid)` (to be clear, this will return tens of thousands of druids).
- [ ] Delete all content under the `deposit` dirs and the `sdr2objects` dirs in each of the storage roots (`sdr2objects` is the configured "storage trunk", and `deposit` is where the bags that are used to build Moabs are placed). Storage root paths are listed in shared_configs for preservation_catalog (and should be the same as what's configured for presbots/techmd in the same env). So, e.g., `rm -rf /services-disk-stage/store2/deposit/*`, `rm -rf /services-disk-stage/store2/sdr2objects/*`, etc., if the storage roots are `/services-disk-stage/store2/`, etc. Deletions must be done from a preservation_robots machine for the env, as techMD and pres cat mount pres storage roots as read-only.
- [ ] From a preservation_robots VM for the environment: Delete all content under preservation_robots' `Settings.transfer_object.from_dir` path, e.g. `rm -rf /dor/export/*`
- [ ] preservation_catalog: _NOTE: this step likely not needed for most resets:_ for the environment being reset, `cap shared_configs:update` (to push the latest shared_configs, in case e.g. storage root or cloud endpoint locations have been updated)
- [ ] preservation_catalog: from any one host in the env to be reset, run `rake db:reset`. This should clear the DB, migrate to the current schema, and run the `rake db:seed` command (which should recreate the current storage root and cloud endpoint configs). See [this comment about DB reset approach for SDR resets](https://github.com/sul-dlss/argo/issues/4116#issuecomment-1688690735).
- [ ] re-deploy preservation_catalog, preservation_robots, and techMD. This will bring the pres cat web services back online, bring the Sidekiq workers for all services back online, and pick up any shared_configs changes.

Once the various SDR services are started back up, preserved content on storage roots and in the cloud will be rebuilt by regular accessioning. Since preservation just preserves whatever flows through accessioning, there's no need to worry about e.g. having particular APOs in place.
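For the cloud-purge step in the checklist above, saving the druid list before the DB reset might look like this console sketch (the output path is just an example):

```ruby
# Sketch: dump all druids to a file so ops can purge only pre-reset cloud
# content. Run from a preservation_catalog console before `rake db:reset`.
druids = PreservedObject.pluck(:druid) # tens of thousands of strings
File.write('/tmp/druids_before_reset.txt', druids.join("\n"))
puts "wrote #{druids.size} druids"
```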
13 changes: 6 additions & 7 deletions app/services/catalog_utils.rb
@@ -34,8 +34,8 @@ def self.check_existence_for_druid_list(druid_list_file_path)
end
end

def self.seed_catalog_for_dir(storage_dir)
logger.info "#{Time.now.utc.iso8601} Seeding starting for '#{storage_dir}'"
def self.populate_catalog_for_dir(storage_dir)
logger.info "#{Time.now.utc.iso8601} Starting to populate catalog for '#{storage_dir}'"
results = []
ms_root = MoabStorageRoot.find_by!(storage_location: storage_dir)
MoabOnStorage::StorageDirectory.find_moab_paths(storage_dir) do |druid, path, _path_match_data|
@@ -45,16 +45,15 @@ def self.seed_catalog_for_dir(storage_dir)
end
results
ensure
logger.info "#{Time.now.utc.iso8601} Seeding ended for '#{storage_dir}'"
logger.info "#{Time.now.utc.iso8601} Ended populating catalog for '#{storage_dir}'"
end

# TODO: If needing to run several seed jobs in parallel, convert seeding to queues.
def self.seed_catalog_for_all_storage_roots
MoabStorageRoot.pluck(:storage_location).each { |location| seed_catalog_for_dir(location) }
def self.populate_catalog_for_all_storage_roots
MoabStorageRoot.pluck(:storage_location).each { |location| populate_catalog_for_dir(location) }
end

def self.populate_moab_storage_root(name)
ms_root = MoabStorageRoot.find_by!(name: name)
seed_catalog_for_dir(ms_root.storage_location)
populate_catalog_for_dir(ms_root.storage_location)
end
end
36 changes: 18 additions & 18 deletions spec/services/catalog_utils_spec.rb
@@ -37,22 +37,22 @@
end
end

describe '.seed_catalog_for_all_storage_roots' do
it 'calls seed_catalog_for_dir with the right argument once per root' do
allow(described_class).to receive(:seed_catalog_for_dir).exactly(MoabStorageRoot.count).times
describe '.populate_catalog_for_all_storage_roots' do
it 'calls populate_catalog_for_dir with the right argument once per root' do
allow(described_class).to receive(:populate_catalog_for_dir).exactly(MoabStorageRoot.count).times
MoabStorageRoot.pluck(:storage_location) do |path|
allow(described_class).to receive(:seed_catalog_for_dir).with("#{path}/#{Settings.moab.storage_trunk}")
allow(described_class).to receive(:populate_catalog_for_dir).with("#{path}/#{Settings.moab.storage_trunk}")
end

described_class.seed_catalog_for_all_storage_roots
expect(described_class).to have_received(:seed_catalog_for_dir).exactly(MoabStorageRoot.count).times
described_class.populate_catalog_for_all_storage_roots
expect(described_class).to have_received(:populate_catalog_for_dir).exactly(MoabStorageRoot.count).times
MoabStorageRoot.pluck(:storage_location) do |path|
expect(described_class).to have_received(:seed_catalog_for_dir).with("#{path}/#{Settings.moab.storage_trunk}")
expect(described_class).to have_received(:populate_catalog_for_dir).with("#{path}/#{Settings.moab.storage_trunk}")
end
end

it 'does not ingest more than one Moab per druid (first ingested wins)' do
described_class.seed_catalog_for_all_storage_roots
described_class.populate_catalog_for_all_storage_roots
expect(PreservedObject.count).to eq 17
expect(MoabRecord.count).to eq 17
expect(MoabRecord.by_druid('bz514sm9647').count).to eq 1
@@ -122,21 +122,21 @@
end
end

describe '.seed_catalog_for_dir' do
describe '.populate_catalog_for_dir' do
let(:storage_dir_a) { 'spec/fixtures/storage_rootA/sdr2objects' }
let(:druid) { 'bz514sm9647' }

it "calls 'find_moab_paths' with appropriate argument" do
allow(MoabOnStorage::StorageDirectory).to receive(:find_moab_paths).with(storage_dir)
described_class.seed_catalog_for_dir(storage_dir)
described_class.populate_catalog_for_dir(storage_dir)
expect(MoabOnStorage::StorageDirectory).to have_received(:find_moab_paths).with(storage_dir)
end

it 'gets moab size and current version from Moab::StorageObject' do
allow(moab).to receive(:size).at_least(:once)
allow(moab).to receive(:current_version_id).at_least(:once)
allow(Moab::StorageServices).to receive(:new)
described_class.seed_catalog_for_dir(storage_dir)
described_class.populate_catalog_for_dir(storage_dir)
expect(moab).to have_received(:size).at_least(:once)
expect(moab).to have_received(:current_version_id).at_least(:once)
expect(Moab::StorageServices).not_to have_received(:new)
@@ -156,7 +156,7 @@
end

it 'calls #create_after_validation' do
described_class.seed_catalog_for_dir(storage_dir)
described_class.populate_catalog_for_dir(storage_dir)
expected_argument_list.each do |arg_hash|
expect(MoabRecordService::CreateAfterValidation).to have_received(:execute).with(
druid: arg_hash[:druid],
@@ -169,20 +169,20 @@
end

it 'returns correct number of results' do
expect(described_class.seed_catalog_for_dir(storage_dir).count).to eq 3
expect(described_class.populate_catalog_for_dir(storage_dir).count).to eq 3
end

it 'will not ingest a MoabRecord for a druid that has already been cataloged' do
expect(MoabRecord.by_druid(druid).count).to eq 0
expect(described_class.seed_catalog_for_dir(storage_dir).count).to eq 3
expect(described_class.populate_catalog_for_dir(storage_dir).count).to eq 3
expect(MoabRecord.by_druid(druid).count).to eq 1
expect(MoabRecord.count).to eq 3

storage_dir_a_seed_result_lists = described_class.seed_catalog_for_dir(storage_dir_a)
expect(storage_dir_a_seed_result_lists.count).to eq 1
storage_dir_a_population_result_lists = described_class.populate_catalog_for_dir(storage_dir_a)
expect(storage_dir_a_population_result_lists.count).to eq 1
expected_result_msg = 'db update failed: #<ActiveRecord::RecordNotSaved: Failed to remove the existing associated moab_record. ' \
'The record failed to save after its foreign key was set to nil.>'
expect(storage_dir_a_seed_result_lists.first).to eq([{ db_update_failed: expected_result_msg }])
expect(storage_dir_a_population_result_lists.first).to eq([{ db_update_failed: expected_result_msg }])
expect(MoabRecord.by_druid(druid).count).to eq 1
# the Moab's original location should remain the location of record in the DB
expect(MoabRecord.by_druid(druid).take.moab_storage_root.storage_location).to eq(storage_dir)
@@ -192,7 +192,7 @@
end

describe '.populate_moab_storage_root' do
before { described_class.seed_catalog_for_all_storage_roots }
before { described_class.populate_catalog_for_all_storage_roots }

it "won't change objects in a fully seeded db" do
expect { described_class.populate_moab_storage_root('fixture_sr1') }.not_to change(MoabRecord, :count).from(17)