To be able to develop and run CERN Analysis Preservation (CAP) you will need the following installed and configured on your system:
- Docker v1.18+ and Docker Compose v1.23+
- NodeJS v6.x+ and NPM v4.x+
- Enough virtual memory for Elasticsearch (when running in Docker).
- python-ldap and its installation prerequisites (see the Linux note below)
There are two ways to set up your own development version of CERN Analysis Preservation (CAP): a development installation using Python virtualenvwrapper, and a Docker installation.
CAP depends on PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ.
If you are only interested in running CAP locally, follow the Docker installation guide below. If you plan to eventually develop CAP code, continue to the Development installation to find out how to set up the local instance for easy code development.
For this guide you will need to install docker along with the docker-compose tool. Using docker is not strictly necessary, although it is highly recommended.
If you can't use docker you can run CAP and all of the required services directly on your system. Take a look at the docker-compose.yml file to find out what is required and how the configuration looks. For development you will need to set up and configure four services: PostgreSQL (db), Elasticsearch (es), Redis (cache) and RabbitMQ (mq).
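If you just want to see which services the file defines, one quick way (assuming docker-compose is available) is:
$ docker-compose config --services
This should list entries such as db, es, cache and mq.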
The easiest way to run CERN Analysis Preservation locally is to use the provided docker-compose configuration containing the full CERN Analysis Preservation stack. First check out the source code, then build all docker images and boot them up using docker-compose:
$ git clone https://github.com/cernanalysispreservation/analysispreservation.cern.ch.git
$ cd analysispreservation.cern.ch
$ docker-compose -f docker-compose.full.yml build
$ docker-compose -f docker-compose.full.yml up -d
Keep the docker-compose session above alive. In a new shell, go to the project directory and run the init script, which creates the database tables, search indexes and some data fixtures:
$ docker-compose -f docker-compose.full.yml run web-api sh scripts/init.sh
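Before opening the site, you can check that all containers came up correctly, for example with:
$ docker-compose -f docker-compose.full.yml ps
Each service should be listed with state Up.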
Now visit the following URL in your browser:
https://<docker ip>
Note
If you're running docker on Linux or newer Mac OS X systems, the <docker ip> is usually localhost. For older Mac OS X and Windows systems running docker through docker-machine, you can find the IP with:
$ docker-machine ip <machine-name>
It is also possible to use a docker development environment for testing purposes. You can create and run it by following these steps:
Build the UI by running yarn build after installing its dependencies. The resulting index.html file should end up inside the ui/cap-react/dist folder, so that it can be mounted and used by the containers.
Build and start the containers using the following command:
$ docker-compose -f docker-compose.dev.yml up
In order to initialize the necessary services (e.g. build and connect to the db, etc.), open another shell and use the following command while the services are running:
$ docker-compose -f docker-compose.dev.yml run web-api sh scripts/clean-and-init.sh
Now you have a dev environment that automatically reloads changed backend code, and also picks up frontend changes after the index.html file is rebuilt.
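For example, after making frontend changes, a rebuild might look like this (assuming the dependencies were already installed in the ui root, as described in the development installation further below):
$ cd ui/cap-react
$ yarn build
The rebuilt index.html in ui/cap-react/dist is the file mounted by the containers.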
If you want to debug your backend code, you will need to attach a new shell to the web-api container. Find the container id for web-api by running docker ps and copying the CONTAINER ID of the image. Now do:
$ docker attach <CONTAINER_ID>
and use a Python debugger, e.g. import pdb; pdb.set_trace(), somewhere in your code. The project will be reloaded with the breakpoint now set. The next time the debugger is triggered, you will be able to debug inside the attached shell.
Using Redirect URLs
You may need to use a redirect URL, for OAuth testing or similar purposes. To do that, you first need to create an OAuth app, and then change the following environment variables in the docker-services.yml file:
- INVENIO_CERN_APP_CREDENTIALS_KEY (the app id/key)
- INVENIO_CERN_APP_CREDENTIALS_SECRET (the app secret)
- DEV_HOST (the host that will be used for testing; it could be ngrok, localhost, or a name added via the /etc/hosts file)
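For illustration only, with placeholder values the entries might look roughly like this (the exact form depends on how the environment section of docker-services.yml is written):
INVENIO_CERN_APP_CREDENTIALS_KEY=<your-app-key>
INVENIO_CERN_APP_CREDENTIALS_SECRET=<your-app-secret>
DEV_HOST=localhost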
For the development setup we will reuse the CAP docker image from the previous section to run only the essential CAP services, and run the application code and the Celery worker outside docker, since you will want easy access to the code and to the virtual environment in which it is installed.
Since docker will be mapping the services to the default system ports on localhost, make sure you are not running PostgreSQL, Redis, RabbitMQ or Elasticsearch on those ports in your system.
Similarly to how we previously ran docker-compose -f docker-compose.full.yml up -d to run the full-stack CAP, this time we run only four docker nodes with the database, Elasticsearch, Redis and RabbitMQ:
$ docker-compose up -d
Keep the docker-compose session above alive. In a separate shell, create a new Python virtual environment using virtualenvwrapper, in which we will install the CAP code and its dependencies:
$ mkvirtualenv cap
(cap)$
Note
CAP works on both Python 2.7 and 3.5+. However, if you need to use the XRootD storage interface, you will need Python 2.7, as the underlying libraries don't support Python 3.5+ yet.
Next, install CAP and code dependencies:
Go into the CAP directory and install the Python requirements:
cd cap
(cap)$ pip install -r requirements.txt
(cap)$ pip install -e .[all]
(cap)$ pip install -r requirements-local-forks.txt
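To confirm that the cap CLI entry point was installed into the virtual environment, you can, for instance, run:
(cap)$ cap --help
which should print the list of available commands.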
Now, go to the React SPA directory and install the UI dependencies:
(cap)$ cd ./ui
(cap)$ yarn install
To run CAP locally, you will need to have some services running on your machine. At minimum you must have PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ. You can either install all of those from your system package manager and run them directly or better - use the provided docker image as before.
The docker image is the recommended method for development.
Note
If you run the services locally, make sure you're running Elasticsearch 5.x.
To run only the essential services using docker, execute the following:
$ cd <to-project-dir>
$ docker-compose up -d
This should bring up four docker nodes with PostgreSQL (db), Elasticsearch (es), RabbitMQ (mq) and Redis (cache). Keep this shell session alive.
Note
For monitoring CAP locally with statping, make sure to set the following variable in the shell where you run the server:
$ export "DEV_HOST=host.docker.internal"
Now that the services are running, it's time to initialize the CAP database and the Elasticsearch index.
Create the database, Elasticsearch indices, message queues and various fixtures for schemas, users and roles in a new shell session:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ sh ./scripts/init.sh
Let's also run the Celery worker in a different shell session:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ celery worker -A cap.celery -l INFO --purge
Note
Here we assume all four services (db, es, mq, cache) are bound to localhost (see cap/config.py). If you fail to connect to those services, it is likely you are running docker through docker-machine and those services are bound to other IP addresses. In this case, you can redirect localhost ports to docker ports as follows:
ssh -L 6379:localhost:6379 -L 5432:localhost:5432 -L 9200:localhost:9200 -L 5672:localhost:5672 docker@$(docker-machine ip)
The problem usually occurs for Mac and Windows users. A better solution, if possible, is to install the native Docker for Mac or Docker for Windows apps (available since Docker v1.12), which bind docker to localhost by default.
Next, let's load some external data. Loading this data requires the appropriate access rights and an internet connection, since it involves harvesting external DBs or REST APIs.
See `cap.samples.env` for an indication of the environment variables that need to be exported in your shell session.
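For example, assuming the file contains plain KEY=value lines, you could export all of them in the current shell with:
$ set -a; source cap.samples.env; set +a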
Make sure you keep the session with the Celery worker alive. Launch the data loading commands in a separate shell:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ cap fixtures cms sync-cadi
Finally, run the CAP development server and the React SPA app in debug mode:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ export FLASK_DEBUG=True
(cap)$ export DEBUG_MODE=True
(cap)$ cap run --reload
In another shell, run the React SPA application development server, also in debug mode, so that requests point to the server above at http://localhost:5000.
First, install all the dependencies in the ui root directory:
$ cd <to-project-dir>/ui
$ yarn install
Since the project uses a monorepo approach with yarn workspaces, it is advised to install the dependencies in the ui root directory. Afterwards, each workspace can be started from its own directory.
$ cd <to-project-dir>/ui/cap-react
$ export ENABLE_BACKEND_PROXY=true
$ yarn start
If you go to http://localhost:3000, you should see an instance of CAP, similar to the production instance at https://analysispreservation.cern.ch.
More recipes exist to accommodate some of your use cases.
To run a recipe do:
// Using local dev environment
sh scripts/<recipe-file.sh>
// Using docker environment
docker-compose -f docker-compose-dev.yml run web sh scripts/<recipe-file.sh>
Existing recipes list:
build-assets.sh // Collecting and Building Assets
clean-and-init.sh // Drop, destroy everything and re-init DB, ES, data location, redis
create-demo-users.sh // Creates demo users for Admin, ALICE, ATLAS, CMS, LHCb
init.sh // Init DB, ES, data location, redis
init-db.sh // clean-and-init.sh + create-demo-users.sh
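For example, to create the demo users in a local development environment you would run:
sh scripts/create-demo-users.sh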
Setup using the default services template. The script requires the following arguments:
To add a new service to the template:
More documentation about CLI recipes exists here.
For a more detailed guide on how to install CAP on Mac OS X, check here.
If you are working in Linux, you may need those additional libraries for python-ldap:
sudo apt-get install libsasl2-dev python-dev libldap2-dev
To use git hooks shared by our team:
# Git version 2.9 or greater
git config core.hooksPath .githooks
# older versions
find .git/hooks -type l -exec rm {} \;
find .githooks -type f -exec ln -sf ../../{} .git/hooks/ \;
You can also use yarn instead of npm, with the exact same syntax, i.e. yarn install and yarn start.
Database Migrations
We use Alembic as a migration tool. Alembic stores all changes as revisions under specific branches. Changes for CERN Analysis Preservation live under the cap branch.
To make sure that your database is up to date with all the changes, run:
cap alembic upgrade heads
If you made changes to one of the CAP models, Alembic can generate a migration file for you. Keep in mind that you need to specify the parent revision for each new revision (it should be the latest revision on the cap branch).
# To check parent revision
cap alembic heads | grep cap
# To create a new revision in cap branch
cap alembic revision "Add some field" -b cap -p <parent-revision>
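After generating the new revision, apply it to your database the same way as before:
cap alembic upgrade heads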
Missing Requirements
If you have trouble with the setup, check if you are missing one of the following requirements, e.g. on Debian GNU/Linux:
sudo apt-get install npm ruby gcc python-virtualenvwrapper
The version of Python 2 given by python --version or python2 --version should be greater than 2.7.10.
Database Indexing Problems
If you have trouble indexing the database try:
cap db destroy
cap db init
and if that does not work try:
curl -XDELETE 'http://localhost:9200/_all'
cap db init
Open the project in a terminal window and run the following command:
docker-compose -f docker-compose.dev.yml up --remove-orphans
When all the services are running, you can navigate in your browser to:
https://localhost
Open a second terminal window in the project directory and run the following commands in this exact order:
docker-compose -f docker-compose.dev.yml run web-api curl -XDELETE es:9200/_all
docker-compose -f docker-compose.dev.yml run web-api sh scripts/clean-and-init.sh
docker-compose -f docker-compose.dev.yml run web-api cap files location local var/data --default
docker-compose -f docker-compose.dev.yml run web-api cap fixtures cms index-datasets --file /opt/cap/demo/das.txt
docker-compose -f docker-compose.dev.yml run web-api cap fixtures cms index-triggers --file /opt/cap/demo/cms-triggers.json
Follow the recipe below to create a user and generate a token.
- Bash into the cap-web pod of our prod, qa, dev or test environment:
kubectl exec -it cap-web-<pod> -- bash
- Create a user
cap users create statping-qa@cern.ch -a --password <password>
cap access allow cms-access user statping-qa@cern.ch
- Create a token for the created user (run the following in an application shell, e.g. cap shell):
from invenio_accounts.models import User
from invenio_db import db
from invenio_oauth2server.models import Token

user = User.query.filter_by(email="statping-qa@cern.ch").first()
token_ = Token.create_personal(<token_name>, user.id, scopes=['deposit:write'])
db.session.add(token_)
db.session.commit()
token_.access_token  # the token string to use