
ess-dive/dataone-indexer

 
 


DataONE Indexer

This component processes index tasks created by other components. It consists of three main subsystems, each defined by its own Helm chart:

  • index-worker: a subsystem implementing a Worker class to process index jobs in parallel
  • rabbitmq: a deployment of the RabbitMQ queue management system
  • solr: a deployment of the SOLR full-text search system

Clients are expected to register index task messages in the RabbitMQ queue for processing. On startup, the index workers register themselves with RabbitMQ as handlers of index task messages. As messages enter the queue, RabbitMQ dispatches them to the registered workers in parallel; each worker processes the associated object and inserts a new index entry into SOLR.
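
The message flow described above can be illustrated with a minimal, self-contained Java sketch. It does not use the RabbitMQ or SOLR client libraries and is not the project's Worker implementation: an in-memory BlockingQueue stands in for the RabbitMQ queue, a ConcurrentHashMap stands in for the SOLR index, and the names IndexFlowSketch, runWorkers, STOP, and the pid-N identifiers are hypothetical, chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class IndexFlowSketch {
    // Hypothetical marker telling a worker to stop; not part of the real system.
    private static final String STOP = "__STOP__";

    /** Runs workerCount workers over the task ids and returns the stand-in index. */
    public static Map<String, String> runWorkers(List<String> taskIds, int workerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(); // stand-in for the RabbitMQ queue
        Map<String, String> index = new ConcurrentHashMap<>();     // stand-in for the SOLR index
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);

        // On startup, each worker begins consuming messages from the queue.
        for (int i = 0; i < workerCount; i++) {
            final String workerName = "worker-" + i;
            workers.execute(() -> {
                try {
                    for (String id = queue.take(); !STOP.equals(id); id = queue.take()) {
                        // "Process the object" and insert an index entry.
                        index.put(id, "indexed by " + workerName);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            // Clients register index task messages in the queue.
            for (String id : taskIds) {
                queue.put(id);
            }
            // One stop marker per worker, queued after all tasks.
            for (int i = 0; i < workerCount; i++) {
                queue.put(STOP);
            }
            workers.shutdown();
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> index = runWorkers(List.of("pid-1", "pid-2", "pid-3"), 2);
        System.out.println(index.size() + " entries indexed");
    }
}
```

Which worker handles a given task varies between runs, but every task ends up in the index exactly once. In a real deployment the queue and index are separate services, so workers can be scaled independently of both.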

See LICENSE.txt for the details of distributing this software.

Building Docker image

The image can be built with either docker or nerdctl, depending on which container environment you have installed. The example below uses Rancher Desktop configured to use nerdctl.

mvn clean package -DskipTests
nerdctl build -t dataone-index-worker:2.4.0 -f docker/Dockerfile --build-arg TAG=2.4.0 .

Running the IndexWorker in the docker container

nerdctl run --rm dataone-index-worker:2.4.0

History

This is a refactored version of the original DataONE d1_cn_index_processor that runs completely independently of other DataONE Coordinating Node services. It is intended to be deployed in a Kubernetes cluster environment, but is written so that it can also be deployed in other environments as needed.
