Maria Paola Ferri edited this page Oct 28, 2024 · 21 revisions

openvre

What is openVRE

openVRE is a cloud-based working environment that allows you to rapidly build your own computational platform. It offers:

  • a user-friendly web-based interface that integrates a number of pluggable resources:
    • Analysis tools and pipelines
    • Interfaces to external data repositories
    • Visualization tools
  • a scalable backend for cloud computing, compatible with OCCI middleware such as OpenNebula or OpenStack.

Architecture

The main repository contains the source code of openVRE, including a customizable web front-end and core code that manages data, metadata, processing units (analysis and visualization tools), and computational resources. The following diagram illustrates a possible layout for a computational infrastructure based on openVRE:

arch_1

Different components work together to provide a computational environment for analysis or visualization tools initiated by users via the VRE web interface. User requests are processed by one of two available schedulers: PMES or SGE-oneflow. These schedulers trigger the tools within the appropriate virtual machines (VMs) where the selected tools are encapsulated. Resources are dynamically allocated based on user demand, with the number of dedicated VMs scaling according to job requests, as allowed by cloud computational resources. These VMs access various data volumes where input files are stored—either private (from the user's workspace) or public datasets—and where output files are written.
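
The demand-based scaling described above can be illustrated with a minimal sketch. It is not openVRE's or the schedulers' actual policy: the function name `desired_vm_count` and its parameters (`jobs_per_vm`, `min_vms`, `max_vms`) are hypothetical, and the rule shown (one VM per fixed number of pending jobs, capped by what the cloud allows) is only one plausible way to express "scale with job requests, within available resources".

```python
import math


def desired_vm_count(pending_jobs: int, jobs_per_vm: int,
                     min_vms: int, max_vms: int) -> int:
    """Return how many worker VMs to keep running for one tool.

    Hypothetical policy: enough VMs to cover the pending jobs,
    never fewer than min_vms and never more than max_vms
    (the ceiling imposed by the cloud resources).
    """
    if pending_jobs < 0:
        raise ValueError("pending_jobs must not be negative")
    if jobs_per_vm <= 0:
        raise ValueError("jobs_per_vm must be positive")
    if min_vms > max_vms:
        raise ValueError("min_vms must not exceed max_vms")

    needed = math.ceil(pending_jobs / jobs_per_vm)
    # Clamp the demand-driven value into the allowed range.
    return max(min_vms, min(needed, max_vms))


if __name__ == "__main__":
    for jobs in (0, 7, 100):
        print(jobs, "pending jobs ->", desired_vm_count(jobs, 2, 1, 5), "VMs")
```

Under these assumptions, an idle tool keeps its minimum number of VMs, and a burst of requests grows the pool only up to the configured ceiling.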

Once a job request reaches the allocated VM, tool execution begins. The integrated VRE tools are independent software components developed by third parties (e.g., bioinformaticians, researchers, statisticians) and are deployed on the on-premises VMs of the compute platform. All tools adhere to the common structure defined by the openvre-tool-api, allowing the openVRE core to trigger them uniformly.
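
To make the idea of a common tool structure concrete, here is a small sketch of a uniform entry point. It does not reproduce the real openvre-tool-api: the `Tool` base class, the `run` signature, the `LineCounter` example tool, and the `launch` helper are all hypothetical names chosen for illustration. The point is only that the caller invokes every tool the same way, regardless of what the tool wraps.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict


class Tool(ABC):
    """Hypothetical uniform tool interface (illustrative only)."""

    @abstractmethod
    def run(self, input_files: Dict[str, str], arguments: Dict[str, str],
            output_dir: str) -> Dict[str, str]:
        """Process the inputs and return a mapping of output name -> path."""


class LineCounter(Tool):
    """Toy third-party tool: counts the lines of its single input file."""

    def run(self, input_files, arguments, output_dir):
        with open(input_files["text"]) as handle:
            count = sum(1 for _ in handle)
        out_path = os.path.join(output_dir, "line_count.txt")
        with open(out_path, "w") as handle:
            handle.write(f"{count}\n")
        return {"line_count": out_path}


def launch(tool: Tool, input_files: Dict[str, str],
           arguments: Dict[str, str], output_dir: str) -> Dict[str, str]:
    """Trigger any tool through the same call, after basic input checks."""
    missing = [path for path in input_files.values()
               if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    return tool.run(input_files, arguments, output_dir)
```

With this shape, adding a new tool means writing another `Tool` subclass; the code that launches jobs does not change.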

Components

  • Queue system + OneFlow: The Grid Engine, commonly known as SGE (Sun Grid Engine), serves as the core server responsible for launching and managing jobs in a cluster-based infrastructure. In this context, the openVRE core acts as a submitter node, dispatching jobs to VMs that host the tool code (the workers). Each tool has a dedicated queue, with several VM replicas associated with each queue. The availability of these replicas is managed by OneFlow, an OpenNebula service that dynamically provisions VMs based on configurable system metrics, such as VM load. Consequently, the number of queue workers adjusts automatically to meet demand, so that a VM is available to handle incoming job requests.

  • PMES (Programming Model Enactment Service): This service remotely controls job execution on the SGE core server in the underlying cloud platform via the Open Cloud Computing Interface (OCCI), independent of the cloud middleware used (e.g., OpenNebula, OpenStack). PMES oversees VM creation, contextualization, application execution, and VM termination, facilitating the efficient execution of jobs submitted to the SGE.

  • MongoDB: The backbone of the openVRE architecture, MongoDB is a NoSQL database that stores all data, including user information, job metadata, and tool configurations. Its flexible schema allows for dynamic data modeling, making it ideal for the varying requirements of computational tasks. MongoDB ensures efficient data retrieval and management, supporting the scalability needed for large-scale analyses and visualizations.

  • openvre-tool-api: This is a VRE tool skeleton written in Python, including parsing and modeling classes that facilitate batch and asynchronous execution. It provides a common command-line client for openVRE tools, acting as an adapter and uniform entry point between the openVRE core and the application code.

  • Keycloak Server: Keycloak serves as the authorization and authentication component for openVRE. It can be installed locally or configured to refer to a remote domain. Keycloak enables secure user authentication and role-based access control, ensuring that only authorized users can access specific functionalities and data within the openVRE environment.

  • HashiCorp Vault: HashiCorp Vault is used to securely store keys and credentials that users need to launch jobs or access computational environments and databases. Vault provides a robust security model for managing sensitive data, enabling dynamic secrets, and controlling access to various resources.
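
The "one dedicated queue per tool" arrangement from the queue system entry above can be sketched in memory as follows. This is a stand-in for illustration, not SGE or openVRE code: the `ToolQueues` class and its methods are hypothetical, and real job submission goes through the Grid Engine rather than a Python object.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class ToolQueues:
    """In-memory stand-in for one-queue-per-tool scheduling (not SGE itself)."""

    def __init__(self) -> None:
        # Each tool identifier maps to its own FIFO queue of job identifiers.
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)

    def submit(self, tool_id: str, job_id: str) -> None:
        """Submitter side: enqueue a job on the queue of the requested tool."""
        self._queues[tool_id].append(job_id)

    def next_job(self, tool_id: str) -> Optional[str]:
        """Worker side: a VM replica takes the oldest job for its tool, if any."""
        queue = self._queues.get(tool_id)
        return queue.popleft() if queue else None

    def pending(self, tool_id: str) -> int:
        """Number of jobs still waiting for the given tool."""
        return len(self._queues.get(tool_id, ()))


if __name__ == "__main__":
    queues = ToolQueues()
    queues.submit("aligner", "job-1")
    queues.submit("aligner", "job-2")
    queues.submit("plotter", "job-3")
    print(queues.next_job("aligner"), queues.pending("aligner"))
```

Because each tool has its own queue, a backlog for one tool does not delay jobs for another, and a per-tool pending count is the kind of signal an elasticity service could use when deciding how many replicas to run.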

Docker-based architecture

The Dockerized branch of the main repository provides a containerized version of openVRE along with the essential components needed to deploy a minimal yet functional computational cloud infrastructure. While the underlying infrastructure components remain the same, they are packaged in a Docker environment for ease of deployment and scalability. The following diagram outlines these components:

arch_2

Installation

openVRE provides a central manager and a user interface to the on-premises compute platform. Other components (see the following section) are also required to build an operational infrastructure.

Manual installation

Use the step-by-step installation guide from the openVRE source code repository. It provides instructions on how to build and start the PHP-based application service.

Docker-based deployment

To simplify the installation process, the following repository contains the code for deploying an operational openVRE-based computational platform out of the box. It bundles the openVRE service with the minimal set of components it depends on (e.g., an authentication service and a local SGE queue system).

Containerized Deployment Install.md

Pluggable resources

openVRE supports three distinct types of pluggable software components that can be integrated into a functional VRE in a modular fashion by a VRE administrator. Depending on the implementation method, different tutorials and guidelines are available for integrating these resources.

Tools

This installation guide provides instructions for implementing tools in openVRE. Depending on your preference and requirements, you can choose between two different methods: Manual Installation and Dockerized Deployment. Below are the details for each approach. You can find the step-by-step instructions on how to integrate a tool into the platform here.

Visualizers
Repository Interfaces