Zero-downtime Elasticsearch tooling for managing indices and indexing from ActiveRecord with PostgreSQL to Elasticsearch.
Add this line to your application's Gemfile:
gem 'zelastic'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install zelastic
For each ActiveRecord scope you want to index, you'll need a configuration:
class MyModel < ApplicationRecord
...
end
MyModelIndex = Zelastic::Config.new(
client: Elasticsearch::Client.new(...),
mapping: {
...
},
data_source: MyModel.some_scope
) do |my_model|
# this block transforms an instance of MyModel into the hash which goes into Elasticsearch
{
attr_1: my_model.attr_1,
attr_2: my_model.attr_2,
attr_3: my_model.attr_3
}
end
You can also override some defaults, if you wish:
- index_settings: by default there aren't any, but you can provide, for example, custom analysers here
- read_alias: by default this is the table name of the data_source
- write_alias: by default this is the read_alias, with _write appended
If you pass an array as the client argument, all writes will be applied to every client in the array.
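To make the alias defaults concrete, here is a minimal sketch of the naming rule described above. The `default_aliases` helper is hypothetical and only mirrors the documented behaviour; it is not part of Zelastic and is not how the gem is implemented.

```ruby
# Illustration only: mirrors the default alias rules described above.
# `default_aliases` is a hypothetical helper, not part of Zelastic.
def default_aliases(table_name, read_alias: nil, write_alias: nil)
  read = read_alias || table_name          # default: the data_source's table name
  write = write_alias || "#{read}_write"   # default: read_alias with _write appended
  { read_alias: read, write_alias: write }
end

# No overrides: both aliases derive from the table name.
default_aliases("my_models")

# Overriding read_alias also changes the default write_alias.
default_aliases("my_models", read_alias: "search_models")
```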
You'll need to make sure the following gets run whenever an instance of MyModel is updated:
indexer = Zelastic::Indexer.new(MyModelIndex)
indexer.index_record(my_model)
And when an instance of MyModel gets deleted:
indexer = Zelastic::Indexer.new(MyModelIndex)
indexer.delete_by_id(my_model.id)
There are also some bulk-change methods which may be useful:
indexer = Zelastic::Indexer.new(MyModelIndex)
indexer.index_batch(MyModel.where(id: [...]))
indexer.delete_by_ids([1, 2, 3])
indexer.delete_by_query(elasticsearch_query)
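One way to keep these calls in a single place is a small wrapper object, sketched below. `SearchSync` and its method names are hypothetical; it only relies on the indexer responding to `index_record`, `delete_by_id` and `delete_by_ids` as shown above. The slice size is an arbitrary choice, and the Rails wiring in the trailing comment assumes ActiveRecord's `after_commit` callback.

```ruby
# Hypothetical wrapper around the indexer calls shown above.
# `indexer` is anything responding to index_record, delete_by_id and
# delete_by_ids, such as Zelastic::Indexer.new(MyModelIndex).
class SearchSync
  def initialize(indexer)
    @indexer = indexer
  end

  # Call after a record is created or updated.
  def record_saved(record)
    @indexer.index_record(record)
  end

  # Call after a record is destroyed.
  def record_destroyed(record)
    @indexer.delete_by_id(record.id)
  end

  # Remove many documents, one slice of ids at a time.
  def records_destroyed(ids, slice_size: 1000)
    ids.each_slice(slice_size) { |slice| @indexer.delete_by_ids(slice) }
  end
end

# Possible wiring inside a Rails model (an assumption, not from Zelastic):
#
#   after_commit :sync_to_search, on: %i[create update]
#
#   def sync_to_search
#     SearchSync.new(Zelastic::Indexer.new(MyModelIndex)).record_saved(self)
#   end
```

Passing the indexer in, instead of building it inside the class, makes it possible to exercise the wrapper with a stand-in object in tests.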
Sometimes you'll need to do a full reindex - maybe because of a bug which left the index in a bad state, or because of a new index definition, or...anything else.
We use index aliases to make zero-downtime reindexing easy. The actual indices are named <read_alias>_<random>. The read_alias points to the single "current" index. The write_alias usually points at the same index as the read_alias, except during reindexing, when it points at both the old and new indices, so both receive writes. The following steps run a full reindex:
new_name = SecureRandom.hex(3)
index_manager = Zelastic::IndexManager.new(MyModelIndex, client: Elasticsearch::Client.new(...))
index_manager.create_index(new_name)
index_manager.populate_index(new_name, batch_size: 3000)
- Check that the new index is looking alrightish
index_manager.switch_read_index(new_name)
- Probably do some more checks, then
index_manager.stop_dual_writes
index_manager.cleanup_old_indices
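The alias states during the steps above can be pictured with a toy in-memory model. This sketch does not talk to Elasticsearch, and `AliasModel` is hypothetical. Which step changes which alias is an assumption inferred from the description, not taken from the gem's source. The model also uses full index names for brevity, whereas the real `create_index` takes only the random suffix.

```ruby
# Toy in-memory model of alias state during a reindex.
# Assumption: create_index starts dual writes, and cleanup_old_indices
# removes every index the read alias no longer points at.
class AliasModel
  attr_reader :read, :write, :indices

  def initialize(current)
    @indices = [current]
    @read = current      # read alias: exactly one "current" index
    @write = [current]   # write alias: one index, or two while reindexing
  end

  def create_index(name)
    @indices << name
    @write << name       # old and new indices now both receive writes
  end

  def switch_read_index(name)
    @read = name         # readers move to the new index
  end

  def stop_dual_writes
    @write = [@read]     # only the current index receives writes
  end

  def cleanup_old_indices
    @indices = [@read]   # indices no longer in use are dropped
  end
end

model = AliasModel.new("my_models_old")
model.create_index("my_models_new")
model.write                      # both indices while the new one is populated
model.switch_read_index("my_models_new")
model.stop_dual_writes
model.cleanup_old_indices
model.indices                    # only the new index remains
```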
The client keyword argument to Zelastic::IndexManager.new is optional. It defaults to the client passed to Zelastic::Config.new, if one client is passed, or the first client in the array, if an array is passed to Zelastic::Config.new.
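If you run full reindexes regularly, the steps above could be wrapped in a single method, roughly as sketched here. `full_reindex` and its two check hooks are hypothetical names. The sketch only assumes that `index_manager` responds to the IndexManager methods listed above and that each hook returns true when it is safe to continue.

```ruby
require "securerandom"

# Sketch of the reindex steps as one method. The two callables are
# hypothetical hooks for your own checks; each receives the new name.
def full_reindex(index_manager, batch_size: 3000,
                 check_new_index: ->(_name) { true },
                 check_after_switch: ->(_name) { true })
  new_name = SecureRandom.hex(3)
  index_manager.create_index(new_name)
  index_manager.populate_index(new_name, batch_size: batch_size)
  # Stop before switching reads if the new index does not look right.
  raise "new index #{new_name} failed checks" unless check_new_index.call(new_name)

  index_manager.switch_read_index(new_name)
  # Stop before ending dual writes if the post-switch checks fail.
  raise "index #{new_name} failed post-switch checks" unless check_after_switch.call(new_name)

  index_manager.stop_dual_writes
  index_manager.cleanup_old_indices
  new_name
end
```

When a check fails, the method raises before the next step runs, so nothing further is changed and recovery is left to whoever is running the reindex.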
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/carwow/zelastic.
The gem is available as open source under the terms of the MIT License.