Installation Guide

Prerequisites

To develop and run CERN Analysis Preservation (CAP) you will need the services and tools described below installed and configured on your system.

There are two ways to set up your own development version of CAP: a development installation using Python and virtualenvwrapper, and a Docker installation.

CAP depends on PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ.

If you are only interested in running CAP locally, follow the Docker installation guide below. If you plan to develop CAP code, continue to the Development installation section to find out how to set up a local instance for easy code development.

For this guide you will need to install docker along with the docker-compose tool.

Docker installation is not necessary, although highly recommended.

If you can't use docker, you can run CAP and all of the required services directly on your system. Take a look at the docker-compose.yml file to find out what is required and what the configuration looks like. For development you will need to set up and configure four services: PostgreSQL (db), Elasticsearch (es), Redis (cache) and RabbitMQ (mq).

Docker installation

The easiest way to run CAP locally is to use the provided docker-compose configuration, which contains the full CAP stack. First check out the source code, build all docker images and boot them up using docker-compose:

$ git clone https://github.com/cernanalysispreservation/analysispreservation.cern.ch.git
$ cd analysispreservation.cern.ch
$ docker-compose -f docker-compose.full.yml build
$ docker-compose -f docker-compose.full.yml up -d

Once the containers are up (the -d flag runs them in the background), run the init script from the project directory. It creates the database tables, search indexes and some data fixtures:

$ docker-compose -f docker-compose.full.yml run web sh scripts/init.sh

Now visit the following URL in your browser:

https://<docker ip>

Note

If you're running docker on Linux or newer Mac OS X systems, the <docker ip> is usually localhost. For older Mac OS X and Windows systems running docker through docker-machine, you can find the IP with:

$ docker-machine ip <machine-name>

Development installation

For the development setup we will reuse the CAP docker image from the previous section to run only the essential CAP services, and run the application code and the Celery worker outside docker, so that you have easy access to the code and the virtual environment in which it will be installed.

Since docker will be mapping the services to the default system
ports on localhost, make sure you are not running PostgreSQL,
Redis, RabbitMQ or Elasticsearch on those ports in your system.
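
One way to check the ports before starting the containers is sketched below. It assumes bash (for the /dev/tcp pseudo-device) and the default ports that also appear in the port-forwarding example later in this guide. The helper names service_port and port_in_use are illustrative and not part of CAP.

```shell
#!/usr/bin/env bash
# Report whether the default service ports are already taken on localhost.
# Assumption: the services use their default ports (5432, 9200, 6379, 5672).

# Map a docker-compose service name to its default port.
service_port() {
  case "$1" in
    db)    echo 5432 ;;  # PostgreSQL
    es)    echo 9200 ;;  # Elasticsearch
    cache) echo 6379 ;;  # Redis
    mq)    echo 5672 ;;  # RabbitMQ
    *)     return 1 ;;   # unknown service name
  esac
}

# Succeeds if something accepts TCP connections on 127.0.0.1:<port>.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for svc in db es cache mq; do
  port=$(service_port "$svc")
  if port_in_use "$port"; then
    echo "$svc: port $port is already in use"
  else
    echo "$svc: port $port is free"
  fi
done
```

If any port is reported as in use, stop the corresponding local service before running docker-compose.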

Similarly to how we previously ran docker-compose -f docker-compose.full.yml up -d to run full-stack CAP, this time we run only four docker nodes with the database, Elasticsearch, Redis and RabbitMQ:

$ docker-compose up -d

Keep the containers running and, in a separate shell, create a new Python virtual environment using virtualenvwrapper, in which we will install the CAP code and its dependencies:

$ mkvirtualenv cap
(cap)$

Note

CAP works on both Python 2.7 and 3.5+. However, if you need to use the XRootD storage interface, you will need Python 2.7, as the underlying libraries don't support Python 3.5+ yet.

Next, go into the CAP directory and install CAP together with its Python dependencies:

(cap)$ cd cap

(cap)$ pip install -r requirements.txt
(cap)$ pip install -e .[all]
(cap)$ pip install -r requirements-local-forks.txt

Now, go to the React SPA directory and install the UI dependencies:

(cap)$ cd ./ui
(cap)$ yarn install

Running services

To run CAP locally, you will need to have some services running on your machine. At minimum you must have PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ. You can either install all of those from your system package manager and run them directly or, better, use the provided docker image as before.

The docker image is the recommended method for development.

Note

If you run the services locally, make sure you're running Elasticsearch 5.x.
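
A quick way to confirm the Elasticsearch version is to read the version number that Elasticsearch reports at its root URL. The sketch below assumes Elasticsearch listens on the default localhost:9200 and that curl and sed are available. The function name es_major_version is illustrative.

```shell
# Extract the major version from the JSON that Elasticsearch serves at its root URL.
# Reads the JSON on stdin and prints e.g. "5" for "number" : "5.6.16".
es_major_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Assumption: Elasticsearch is reachable on the default localhost:9200.
major=$(curl -s http://localhost:9200 2>/dev/null | es_major_version || true)
if [ "$major" = "5" ]; then
  echo "Elasticsearch 5.x detected"
else
  echo "Expected Elasticsearch 5.x, found: ${major:-no response}"
fi
```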

To run only the essential services using docker, execute the following:

$ cd <to-project-dir>
$ docker-compose up -d

This should bring up four docker nodes with PostgreSQL (db), Elasticsearch (es), RabbitMQ (mq), and Redis (cache). Keep this shell session alive.

Initialization

Now that the services are running, it's time to initialize the CAP database and the Elasticsearch index.

Create the database, Elasticsearch indices, message queues and various fixtures for schemas, users and roles in a new shell session:

$ cd <to-project-dir>
$ workon cap
(cap)$ sh ./scripts/init.sh

Let's also run the Celery worker in a different shell session:

$ cd <to-project-dir>
$ workon cap
(cap)$ celery worker -A cap.celery -l INFO --purge

Note

Here we assume all four services (db, es, mq, cache) are bound to localhost (see cap/config.py). If you fail to connect to those services, it is likely that you are running docker through docker-machine and the services are bound to other IP addresses. In this case, you can redirect localhost ports to docker ports as follows.

ssh -L 6379:localhost:6379 -L 5432:localhost:5432 -L 9200:localhost:9200 -L 5672:localhost:5672 docker@$(docker-machine ip)

The problem usually occurs among Mac and Windows users. A better solution, if possible, is to install the native apps Docker for Mac or Docker for Windows (available since Docker v1.12), which bind docker to localhost by default.

Loading data

Next, let's load some external data. Loading this data requires the corresponding access rights and internet access, since it involves harvesting external databases or REST APIs.

See `cap.samples.env` for the environment variables that need to be exported in your shell session.
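
If you keep those variables in a file, one possible way to export them all at once is sketched below. It assumes the file contains plain KEY=value lines in shell syntax, which may not match the actual layout of `cap.samples.env`. The function name load_env_file is illustrative.

```shell
# Export every variable assigned in an env file into the current shell session.
# Assumption: the file contains plain KEY=value lines in shell syntax.
load_env_file() {
  file=$1
  if [ ! -f "$file" ]; then
    echo "env file not found: $file" >&2
    return 1
  fi
  # A bare filename would be looked up on PATH by '.', so anchor it to the current directory.
  case "$file" in
    */*) ;;
    *) file="./$file" ;;
  esac
  set -a        # auto-export every variable assigned from here on
  . "$file"
  set +a        # stop auto-exporting
}
```

Call it in the shell session that will run the data loading commands, passing the path to your own copy of the file with real values filled in.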

Make sure you keep the session with Celery worker alive. Launch the data loading commands in a separate shell:

$ cd <to-project-dir>
$ workon cap
(cap)$ cap fixtures cms sync-cadi

Finally, run the CAP development server and the React SPA app in debug mode:

$ cd <to-project-dir>
$ workon cap
(cap)$ export FLASK_DEBUG=True
(cap)$ export DEBUG_MODE=True
(cap)$ cap run --reload

And in another shell, start the React SPA development server, also in debug mode, so that requests point to the above server at http://localhost:5000:

$ cd <to-project-dir>/ui
$ export ENABLE_BACKEND_PROXY=true
$ yarn start

If you go to http://localhost:3000, you should see an instance of CAP, similar to the production instance at https://analysispreservation.cern.ch.

Recipes

Additional recipes exist to accommodate some of your use cases.

To run a recipe do:

# Using local dev environment
sh scripts/<recipe-file.sh>

# Using docker environment
docker-compose -f docker-compose-dev.yml run web sh scripts/<recipe-file.sh>

Existing recipes list:

build-assets.sh // Collect and build assets
clean-and-init.sh // Drop, destroy everything and re-init DB, ES, data location, redis
create-demo-users.sh // Create demo users for Admin, ALICE, ATLAS, CMS, LHCb
init.sh // Init DB, ES, data location, redis
init-db.sh // clean-and-init.sh + create-demo-users.sh
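
The two invocation styles can be wrapped in a small helper that prints the command for a given recipe without running it. This is only a sketch: the function name recipe_cmd and its local/docker mode argument are illustrative, and the recipe names are the ones listed above.

```shell
# Print the command that runs a recipe, without executing it.
# Usage: recipe_cmd <local|docker> <recipe-file.sh>
recipe_cmd() {
  mode=$1
  recipe=$2
  # Only accept the recipes listed in this guide.
  case "$recipe" in
    build-assets.sh|clean-and-init.sh|create-demo-users.sh|init.sh|init-db.sh) ;;
    *) echo "unknown recipe: $recipe" >&2; return 1 ;;
  esac
  case "$mode" in
    local)  echo "sh scripts/$recipe" ;;
    docker) echo "docker-compose -f docker-compose-dev.yml run web sh scripts/$recipe" ;;
    *) echo "unknown mode: $mode (expected local or docker)" >&2; return 1 ;;
  esac
}

recipe_cmd local init.sh
recipe_cmd docker create-demo-users.sh
```

Printing the command first makes it easy to double-check destructive recipes such as clean-and-init.sh before running them.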

Additional information

For a more detailed guide on how to install CAP on Mac OS X check here

If you are working on Linux, you may need these additional libraries for python-ldap:

sudo apt-get install libsasl2-dev python-dev libldap2-dev

To use git hooks shared by our team:

# Git version 2.9 or greater
git config core.hooksPath .githooks

# older versions
find .git/hooks -type l -exec rm {} \;
find .githooks -type f -exec ln -sf ../../{} .git/hooks/ \;

The UI steps in this guide use yarn (yarn install and yarn start). You can also use npm with the same syntax, i.e. npm install and npm start.

Database Migrations

We use Alembic as a migration tool. Alembic stores all changes as revisions under specific branches. Changes for CERN Analysis Preservation are under the cap branch.

To make sure that your database is up to date with all the changes, run:

cap alembic upgrade heads

If you have changed one of the CAP models, Alembic can generate a migration file for you. Keep in mind that you need to specify a parent revision for each new revision (it should be the latest revision on the cap branch).

# To check parent revision
cap alembic heads | grep cap

# To create a new revision in cap branch
cap alembic revision "Add some field" -b cap -p <parent-revision>

Missing Requirements

If you have trouble with the setup, check if you are missing one of the following requirements, e.g. on Debian GNU/Linux:

sudo apt-get install npm ruby gcc python-virtualenvwrapper

The version of Python 2 given by python --version or python2 --version should be greater than 2.7.10.
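
The version requirement can be checked with a small comparison helper, sketched below. It assumes purely numeric, dot-separated version fields. The function name version_gt is illustrative. Note that if python on your system points to Python 3, the check passes trivially and says nothing about Python 2.

```shell
# Succeeds if dotted version $1 is strictly greater than $2.
# Assumption: every field is a plain non-negative integer (e.g. 2.7.15).
version_gt() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}      # leading field of each version
    x=${x:-0};  y=${y:-0}       # a missing field counts as 0
    if [ "$x" -gt "$y" ]; then return 0; fi
    if [ "$x" -lt "$y" ]; then return 1; fi
    # Fields are equal: drop them and compare the next pair.
    case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
    case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 1                      # versions are equal
}

if command -v python >/dev/null 2>&1; then
  # Python 2 prints its version to stderr, hence 2>&1.
  found=$(python --version 2>&1 | awk '{print $2}')
  if version_gt "$found" 2.7.10; then
    echo "python $found: OK"
  else
    echo "python $found: too old, need a version greater than 2.7.10"
  fi
else
  echo "python not found on PATH"
fi
```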

Database Indexing Problems

If you have trouble indexing the database try:

cap db destroy
cap db init

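and if that does not work, delete all Elasticsearch indices and re-initialize (note that this removes every index on the local Elasticsearch instance, not only those created by CAP):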

curl -XDELETE 'http://localhost:9200/_all'
cap db init