To be able to develop and run CERN Analysis Preservation (CAP) you will need the following installed and configured on your system:
- Docker v1.18+ and Docker Compose v1.23+
- NodeJS v6.x+ and NPM v4.x+
- Enough virtual memory for Elasticsearch (when running in Docker).
- python-ldap and its installation prerequisites (see the Linux note below)
There are two ways to set up your own development version of CERN Analysis Preservation (CAP): a development installation using Python virtualenvwrapper, and a Docker installation.
CAP depends on PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ.
If you are only interested in running CAP locally, follow the Docker installation guide below. If you plan to eventually develop CAP code, continue to the Development installation to find out how to set up the local instance for easy code development.
For this guide you will need to install docker along with the docker-compose tool. Using docker is not strictly necessary, although it is highly recommended.
If you can't use docker you can run CAP and all of the required services directly on your system. Take a look at the docker-compose.yml file to find out what is required and how the configuration looks. For development you will need to set up and configure four services: PostgreSQL (db), Elasticsearch (es), Redis (cache) and RabbitMQ (mq).
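If you just want to see which services the file defines, one quick way (assuming docker-compose is available) is:
$ docker-compose config --services
This should list entries such as db, es, cache and mq.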
The easiest way to run CERN Analysis Preservation locally is to use the provided docker-compose configuration containing the full CERN Analysis Preservation stack. First check out the source code, then build all docker images and boot them up using docker-compose:
$ git clone https://github.com/cernanalysispreservation/analysispreservation.cern.ch.git
$ cd analysispreservation.cern.ch
$ docker-compose -f docker-compose.full.yml build
$ docker-compose -f docker-compose.full.yml up -d
Keep the docker-compose session above alive. In a new shell, go to the project directory and run the init script, which creates the database tables, search indexes and some data fixtures:
$ docker-compose -f docker-compose.full.yml run web-api sh scripts/init.sh
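Before opening the site, you can check that all containers came up correctly, for example with:
$ docker-compose -f docker-compose.full.yml ps
Each service should be listed with state Up.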
Now visit the following URL in your browser:
https://<docker ip>
Note
If you're running docker on Linux or newer Mac OS X systems, the <docker ip> is usually localhost. For older Mac OS X and Windows systems running docker through docker-machine, you can find the IP with:
$ docker-machine ip <machine-name>
It is also possible to use a docker development environment for testing purposes. You can create and run it by following these steps:
Build the UI by running yarn build after installing its dependencies. The resulting index.html file should end up inside the ui/cap-react/dist folder, so that it can be mounted and used by the containers.
Build and start the containers using the following command:
$ docker-compose -f docker-compose.dev.yml up
In order to initialize the necessary services (e.g. build and connect to the db, etc.), open another shell and use the following command while the services are running:
$ docker-compose -f docker-compose.dev.yml run web-api sh scripts/clean-and-init.sh
Now you have a dev environment that automatically reloads changed backend code, and also picks up frontend changes after the index.html file is rebuilt.
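For example, after making frontend changes, a rebuild might look like this (assuming the dependencies were already installed in the ui root, as described in the development installation further below):
$ cd ui/cap-react
$ yarn build
The rebuilt index.html in ui/cap-react/dist is the file mounted by the containers.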
If you want to debug your backend code, you will need to attach a new shell to the web-api container. Find the container id for web-api by running docker ps and copying the CONTAINER ID of the image. Now do:
$ docker attach <CONTAINER_ID>
and use a Python debugger, e.g. import pdb; pdb.set_trace(), somewhere in your code. The project will be reloaded with the breakpoint now set. The next time the debugger is triggered, you will be able to debug inside the attached shell.
Using Redirect URLs
You may need to use a redirect URL, for OAuth testing or similar purposes. To do that, you first need to create an OAuth app, and then change the following environment variables in the docker-services.yml file:
- INVENIO_CERN_APP_CREDENTIALS_KEY (the app id/key)
- INVENIO_CERN_APP_CREDENTIALS_SECRET (the app secret)
- DEV_HOST (the host that will be used for testing; it could be ngrok, localhost, or a name added via the /etc/hosts file)
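For illustration only, with placeholder values the entries might look roughly like this (the exact form depends on how the environment section of docker-services.yml is written):
INVENIO_CERN_APP_CREDENTIALS_KEY=<your-app-key>
INVENIO_CERN_APP_CREDENTIALS_SECRET=<your-app-secret>
DEV_HOST=localhost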
For the development setup we will reuse the CAP docker image from the previous section to run only the essential CAP services, and run the application code and the Celery worker outside docker, since you will want easy access to the code and to the virtual environment in which it is installed.
Since docker will be mapping the services to the default system ports on localhost, make sure you are not running PostgreSQL, Redis, RabbitMQ or Elasticsearch on those ports in your system.
Similarly to how we previously ran docker-compose -f docker-compose.full.yml up -d to run the full-stack CAP, this time we run only four docker nodes with the database, Elasticsearch, Redis and RabbitMQ:
$ docker-compose up -d
Keep the docker-compose session above alive. In a separate shell, create a new Python virtual environment using virtualenvwrapper, in which we will install the CAP code and its dependencies:
$ mkvirtualenv cap
(cap)$
Note
CAP works on both Python 2.7 and 3.5+. However, if you need to use the XRootD storage interface, you will need Python 2.7, as the underlying libraries don't support Python 3.5+ yet.
Next, install CAP and code dependencies:
Go into the CAP directory and install the Python requirements:
cd cap
(cap)$ pip install -r requirements.txt
(cap)$ pip install -e .[all]
(cap)$ pip install -r requirements-local-forks.txt
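To confirm that the cap CLI entry point was installed into the virtual environment, you can, for instance, run:
(cap)$ cap --help
which should print the list of available commands.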
Now, go to the React SPA directory and install the UI dependencies:
(cap)$ cd ./ui
(cap)$ yarn install
To run CAP locally, you will need to have some services running on your machine. At minimum you must have PostgreSQL, Elasticsearch 5.x, Redis and RabbitMQ. You can either install all of those from your system package manager and run them directly or better - use the provided docker image as before.
The docker image is the recommended method for development.
Note
If you run the services locally, make sure you're running Elasticsearch 5.x.
To run only the essential services using docker, execute the following:
$ cd <to-project-dir>
$ docker-compose up -d
This should bring up four docker nodes with PostgreSQL (db), Elasticsearch (es), RabbitMQ (mq) and Redis (cache). Keep this shell session alive.
Note
For monitoring CAP locally with statping, make sure to set the following variable in the shell where you run the server:
$ export "DEV_HOST=host.docker.internal"
Now that the services are running, it's time to initialize the CAP database and the Elasticsearch index.
Create the database, Elasticsearch indices, message queues and various fixtures for schemas, users and roles in a new shell session:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ sh ./scripts/init.sh
Let's also run the Celery worker in a different shell session:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ celery worker -A cap.celery -l INFO --purge
Note
Here we assume all four services (db, es, mq, cache) are bound to localhost (see cap/config.py). If you fail to connect to those services, it is likely you are running docker through docker-machine and those services are bound to other IP addresses. In this case, you can redirect localhost ports to docker ports as follows:
ssh -L 6379:localhost:6379 -L 5432:localhost:5432 -L 9200:localhost:9200 -L 5672:localhost:5672 docker@$(docker-machine ip)
The problem usually occurs for Mac and Windows users. A better solution, if possible, is to install the native Docker for Mac or Docker for Windows apps (available since Docker v1.12), which bind docker to localhost by default.
Next, let's load some external data. Loading this data requires the appropriate access rights and an internet connection, since it involves harvesting external DBs or REST APIs.
See `cap.samples.env` for an indication of the environment variables that need to be exported in your shell session.
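For example, assuming the file contains plain KEY=value lines, you could export all of them in the current shell with:
$ set -a; source cap.samples.env; set +a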
Make sure you keep the session with the Celery worker alive. Launch the data loading commands in a separate shell:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ cap fixtures cms sync-cadi
Finally, run the CAP development server and the React SPA app in debug mode:
$ cd <to-project-dir>
$ workon cap (or activate your virtual environment)
(cap)$ export FLASK_DEBUG=True
(cap)$ export DEBUG_MODE=True
(cap)$ cap run --reload
In another shell, run the React SPA application development server, also in debug mode, so that requests point to the server above at http://localhost:5000.
First, install all the dependencies in the ui root directory:
$ cd <to-project-dir>/ui
$ yarn install
Since the project uses a monorepo approach with yarn workspaces, it is advised to install the dependencies in the ui root directory. Afterwards, each workspace can be started from its own directory.
$ cd <to-project-dir>/ui/cap-react
$ export ENABLE_BACKEND_PROXY=true
$ yarn start
If you go to http://localhost:3000, you should see an instance of CAP, similar to the production instance at https://analysispreservation.cern.ch.
More recipes exist to accommodate some of your use cases.
To run a recipe do:
// Using local dev environment
sh scripts/<recipe-file.sh>
// Using docker environment
docker-compose -f docker-compose-dev.yml run web sh scripts/<recipe-file.sh>
Existing recipes list:
build-assets.sh // Collecting and Building Assets
clean-and-init.sh // Drop, destroy everything and re-init DB, ES, data location, redis
create-demo-users.sh // Creates demo users for Admin, ALICE, ATLAS, CMS, LHCb
init.sh // Init DB, ES, data location, redis
init-db.sh // clean-and-init.sh + create-demo-users.sh
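For example, to create the demo users in a local development environment you would run:
sh scripts/create-demo-users.sh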
Setup using the default services template. The script requires the following arguments:
To add a new service to the template:
More documentation about CLI recipes exists here.
For a more detailed guide on how to install CAP on Mac OS X, check here.
If you are working in Linux, you may need those additional libraries for python-ldap:
sudo apt-get install libsasl2-dev python-dev libldap2-dev
To use git hooks shared by our team:
# Git version 2.9 or greater
git config core.hooksPath .githooks
# older versions
find .git/hooks -type l -exec rm {} \;
find .githooks -type f -exec ln -sf ../../{} .git/hooks/ \;
You can also use yarn instead of npm, with the exact same syntax, i.e. yarn install and yarn start.
Database Migrations
We use Alembic as a migration tool. Alembic stores all changes as revisions under specific branches. Changes for CERN Analysis Preservation live under the cap branch.
To make sure that your database is up to date with all the changes, run:
cap alembic upgrade heads
If you made changes to one of the CAP models, Alembic can generate a migration file for you. Keep in mind that you need to specify the parent revision for each new revision (it should be the latest revision on the cap branch).
# To check parent revision
cap alembic heads | grep cap
# To create a new revision in cap branch
cap alembic revision "Add some field" -b cap -p <parent-revision>
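After generating the new revision, apply it to your database the same way as before:
cap alembic upgrade heads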
Missing Requirements
If you have trouble with the setup, check if you are missing one of the following requirements, e.g. on Debian GNU/Linux:
sudo apt-get install npm ruby gcc python-virtualenvwrapper
The version of Python 2 given by python --version or python2 --version should be greater than 2.7.10.
Database Indexing Problems
If you have trouble indexing the database try:
cap db destroy
cap db init
and if that does not work try:
curl -XDELETE 'http://localhost:9200/_all'
cap db init
Open the project in a terminal window and run the following command:
docker-compose -f docker-compose.dev.yml up --remove-orphans
When all the services are running, you can navigate in your browser to:
https://localhost
Open a second terminal window in the project directory and run the following commands in this exact order:
docker-compose -f docker-compose.dev.yml run web-api curl -XDELETE es:9200/_all
docker-compose -f docker-compose.dev.yml run web-api sh scripts/clean-and-init.sh
docker-compose -f docker-compose.dev.yml run web-api cap files location local var/data --default
docker-compose -f docker-compose.dev.yml run web-api cap fixtures cms index-datasets --file /opt/cap/demo/das.txt
docker-compose -f docker-compose.dev.yml run web-api cap fixtures cms index-triggers --file /opt/cap/demo/cms-triggers.json
Follow the recipe below to create a user and generate a token.
- Bash into the cap-web pod of our prod, qa, dev or test environment:
kubectl exec -it cap-web-<pod> -- bash
- Create a user
cap users create statping-qa@cern.ch -a --password <password>
cap access allow cms-access user statping-qa@cern.ch
- Create a token for the created user (run the following in an application shell, e.g. cap shell):
from invenio_accounts.models import User
from invenio_db import db
from invenio_oauth2server.models import Token

user = User.query.filter_by(email="statping-qa@cern.ch").first()
token_ = Token.create_personal(<token_name>, user.id, scopes=['deposit:write'])
db.session.add(token_)
db.session.commit()
token_.access_token  # the token string to use