Write a simple importer
- Write a very simple CSV importer
- Be able to point to the parts of the importer
If you have uncommitted changes in your current branch (you can check with git status), save them before starting this lesson, which uses a separate branch:
git checkout -b your_branch_name
git add .
git commit -m 'checkpoint before beginning simple importer'
git checkout importer_setup
NOTE: If you make experimental changes and want to get back to the minimal code state necessary to run this lesson, you can check the starting code out again using:
git checkout importer_setup
As you've come to expect by now, we're going to write a test for our simple importer first. Make a directory in the spec folder for our importer tests:
mkdir spec/importers
Now, make a file in that folder called simple_importer_spec.rb and paste the following content into it:
# frozen_string_literal: true
require 'rails_helper'
require 'active_fedora/cleaner'

RSpec.describe SimpleImporter do
  let(:one_line_example) { 'spec/fixtures/csv_files/one_line_example.csv' }
  let(:three_line_example) { 'spec/fixtures/csv_files/three_line_example.csv' }

  before do
    DatabaseCleaner.clean
    ActiveFedora::Cleaner.clean!
  end

  it "imports a csv" do
    expect { SimpleImporter.new(three_line_example).import }.to change { Image.count }.by 3
  end

  it "puts the title into the title field" do
    SimpleImporter.new(one_line_example).import
    expect(Image.where(title: 'A Cute Dog').count).to eq 1
  end

  it "puts the url into the source field" do
    SimpleImporter.new(one_line_example).import
    expect(Image.where(source: 'https://www.pexels.com/photo/animal-blur-canine-close-up-551628/').count).to eq 1
  end

  it "creates publicly visible objects" do
    SimpleImporter.new(one_line_example).import
    imported_image = Image.first
    expect(imported_image.visibility).to eq 'open'
  end

  it "attaches files" do
    allow(AttachFilesToWorkJob).to receive(:perform_later)
    SimpleImporter.new(one_line_example).import
    expect(AttachFilesToWorkJob).to have_received(:perform_later).exactly(1).times
  end
end
Run your test: rspec spec/importers/simple_importer_spec.rb. You should see an error that says something like:
NameError:
uninitialized constant SimpleImporter
Let's write just enough of our importer to make that error message change. Make a folder called importers in the app folder (mkdir app/importers), and within that make a file called simple_importer.rb. Paste this into it:
class SimpleImporter
  def initialize(file)
    @file = file
    @user = ::User.batch_user
  end

  def import
  end
end
Now run your test again (rspec spec/importers/simple_importer_spec.rb). The test will still fail, but for a different reason: it can now find a class called SimpleImporter, but calling it does not produce the expected results. Writing the test first, then making small changes and re-running the test to observe how its behavior changes, is a good TDD habit that we're practicing here.
Ideally we would make just one test pass at a time; to save time in this lesson, we're showing the completed code that passes four of our five tests. Replace your simple_importer.rb file with this one:
require 'csv'

class SimpleImporter
  def initialize(file)
    @file = file
    @user = ::User.batch_user
  end

  def import
    CSV.foreach(@file) do |row|
      image = Image.new
      image.depositor = @user.email
      image.title << row[1]
      image.source << row[2]
      image.visibility = Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PUBLIC
      image.save
    end
  end
end
Run your tests again, and most of them should pass. The only one that still fails is the file attachment.
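To make the row[1] and row[2] indexing in the import method more concrete, here is a minimal sketch of what CSV.foreach hands to its block. It runs without Rails or Hyrax. The sample rows are assumptions for illustration: the title and URL of the first row come from the spec above, but the file names and the second row are made up and may not match the real fixture files. read_rows is a hypothetical helper, not part of the lesson code.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data; the real fixture files may differ.
SAMPLE_ROWS = <<~DATA
  dog.jpg,A Cute Dog,https://www.pexels.com/photo/animal-blur-canine-close-up-551628/
  cat.jpg,An Example Cat,https://example.com/cat
DATA

# Reads a CSV file the same way the importer does: each row arrives as a
# plain Array, so columns are picked out by position.
def read_rows(path)
  rows = []
  CSV.foreach(path) do |row|
    rows << { file: row[0], title: row[1], source: row[2] }
  end
  rows
end

Tempfile.create(['sample', '.csv']) do |f|
  f.write(SAMPLE_ROWS)
  f.flush
  read_rows(f.path).each { |r| puts "#{r[:file]} -> #{r[:title]}" }
end
```

Because the columns are looked up by position, this style of parsing depends on every CSV file keeping the same column order.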
Replace simple_importer.rb again, with this version of the code:
require 'csv'

class SimpleImporter
  def initialize(file)
    @file = file
    @user = ::User.batch_user
  end

  def import
    CSV.foreach(@file) do |row|
      image = Image.new
      image.depositor = @user.email
      image.title << row[1]
      image.source << row[2]
      image.visibility = Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PUBLIC
      # Attach the image file and run it through the actor stack
      # Try entering Hyrax::CurationConcern.actor on a console to see all of the
      # actors this object will run through.
      image_binary = File.open("#{::Rails.root}/spec/fixtures/images/#{row[0]}")
      uploaded_file = Hyrax::UploadedFile.create(user: @user, file: image_binary)
      attributes_for_actor = { uploaded_files: [uploaded_file.id] }
      env = Hyrax::Actors::Environment.new(image, ::Ability.new(@user), attributes_for_actor)
      Hyrax::CurationConcern.actor.create(env)
      image_binary.close
    end
  end
end
Now run your tests again and they should all pass.
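One detail worth noticing in the import method: it opens each image with File.open and closes it on the last line of the loop, so the handle would stay open if anything in between raised an error. A possible alternative is the block form of File.open, which closes the file when the block exits, even on an exception. The sketch below shows that behavior in isolation. with_open_file and the simulated failure are hypothetical stand-ins for the Hyrax calls, not part of the lesson code.

```ruby
require 'tempfile'

# Hypothetical helper: opens the file with the block form of File.open,
# yields the handle, and returns the handle afterwards so the caller can
# check whether it was closed.
def with_open_file(path)
  handle = nil
  begin
    File.open(path) do |file|
      handle = file
      yield file
    end
  rescue StandardError
    # Swallowed only so this sketch can report on the handle; real code
    # would normally log the error or let it propagate.
  end
  handle
end

Tempfile.create('image') do |tmp|
  handle = with_open_file(tmp.path) { |_file| raise 'simulated actor failure' }
  puts handle.closed?
end
```

Whether this matters for a three-line sample import is a judgment call; it becomes more relevant when importing many files in one run.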
Now that we have an importer, let's actually make it run in our development environment. Make a rake task so we can invoke it easily. Make a file called lib/tasks/simple_import.rake and paste this content into it:
CSV_FILE = "#{::Rails.root}/spec/fixtures/csv_files/three_line_example.csv"

namespace :csv_import do
  desc 'Import the three line sample CSV'
  task :simple_import => [:environment] do |_task|
    SimpleImporter.new(CSV_FILE).import
  end
end
Now invoke the rake task (rake csv_import:simple_import) and go to http://localhost:3000/catalog to see the objects that were created.
Note: You can see the changes we made in this section on GitHub.
- What is Hydra::AccessControls::AccessRight::VISIBILITY_TEXT_VALUE_PUBLIC? Why use that instead of just saying "open"? What happens if you enter a different value?
- What happens if we add a header row to a future version of our CSV file?
- What happens if we change the order of the columns in our CSV file?
- What happens if we want to attach more than one file per object?
- What do you need to do if you want to add another of the core Hyrax metadata fields to the data?
- What is the actor stack? What are some of the things that it does?
- Can you identify the parts of an importer we talked about? Where is the:
  - top level kickoff?
  - parser?
  - mapper?
  - record importer?
  - logger?
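One way to start exploring the header-row and column-order questions above is to compare positional access with header-based access in plain Ruby. This sketch is illustrative only: the column names (title, file, source) and the sample row are assumptions, not taken from the lesson's fixture files, and the two helper methods are hypothetical.

```ruby
require 'csv'

# Hypothetical CSV text with a header row and a column order that differs
# from what the importer expects.
WITH_HEADERS = <<~DATA
  title,file,source
  A Cute Dog,dog.jpg,https://example.com/dog
DATA

# Positional parsing treats the header row as data and depends on column order.
def titles_by_position(csv_text)
  CSV.parse(csv_text).map { |row| row[1] }
end

# Header-based parsing skips the header row and looks columns up by name.
def titles_by_header(csv_text)
  CSV.parse(csv_text, headers: true).map { |row| row['title'] }
end

p titles_by_position(WITH_HEADERS)
p titles_by_header(WITH_HEADERS)
```

The headers: true option is also accepted by CSV.foreach, so a future version of the importer could use it without changing how the file is read line by line.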