From 8445d8143d0e6650b9b1ecc49d93ec5b24eea16f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Manuel=20Dom=C3=ADnguez?= Date: Fri, 19 Jul 2024 14:47:12 +0200 Subject: [PATCH 1/5] Add section on high availability setups to Galaxy Interactive Tools training --- .../tutorials/interactive-tools/tutorial.md | 189 +++++++++++++++++- 1 file changed, 184 insertions(+), 5 deletions(-) diff --git a/topics/admin/tutorials/interactive-tools/tutorial.md b/topics/admin/tutorials/interactive-tools/tutorial.md index 91696ed5a58304..c8885173a4161a 100644 --- a/topics/admin/tutorials/interactive-tools/tutorial.md +++ b/topics/admin/tutorials/interactive-tools/tutorial.md @@ -22,6 +22,7 @@ contributors: - slugger70 - hexylena - abretaud + - kysrpex tags: - ansible - interactive-tools @@ -224,7 +225,7 @@ When an Interactive Tool's Docker container starts, it will be assigned a random ![Galaxy Interactive Tools Proxy Diagram](../../images/interactive-tools/gxit-proxy-diagram.png "Galaxy Interactive Tools Proxy Diagram") -As you can see, the client only ever speaks to nginx on the Galaxy server running on the standard https port (443), never directly to the interactive tool (which may be running on a node that does not even have a public IP address). The mapping of GxIT invocation and its corresponding host/port is kept in a SQLite database known as the *Interactive Tools Session Map*, and the path to this database is important, since both Galaxy and the proxy need access to it. +As you can see, the client only ever speaks to nginx on the Galaxy server running on the standard https port (443), never directly to the interactive tool (which may be running on a node that does not even have a public IP address). By default, the mapping of GxIT invocation and its corresponding host/port is kept in a SQLite database known as the *Interactive Tools Session Map*, and the path to this database is important, since both Galaxy and the proxy need access to it. 
The GIE Proxy is written in [Node.js][nodejs] and requires some configuration. Thankfully there is an Ansible role, [usegalaxy_eu.gie_proxy][usegalaxy_eu-gie_proxy], that can install the proxy and its dependencies, and configure it for you. As usual, have a look through the [README][usegalaxy_eu-gie_proxy-readme] and [defaults][usegalaxy_eu-gie_proxy-defaults] to investigate which variables you might need to set before continuing. @@ -258,7 +259,7 @@ The GIE Proxy is written in [Node.js][nodejs] and requires some configuration. T > gie_proxy_git_version: main > gie_proxy_setup_nodejs: nodeenv > gie_proxy_virtualenv_command: "{{ pip_virtualenv_command }}" -> gie_proxy_nodejs_version: "10.13.0" +> gie_proxy_nodejs_version: "14.21.3" > gie_proxy_virtualenv: /srv/galaxy/gie-proxy/venv > gie_proxy_setup_service: systemd > gie_proxy_sessions_path: "{{ galaxy_mutable_data_dir }}/interactivetools_map.sqlite" @@ -295,7 +296,7 @@ The GIE Proxy is written in [Node.js][nodejs] and requires some configuration. T > > > > > > 1. A new Python venv was created at `/srv/galaxy/gie-proxy/venv` -> > 2. Node.js version 10.13.0 was installed in to the venv +> > 2. Node.js version 14.21.3 was installed in to the venv > > 3. The proxy was cloned to `/srv/galaxy/gie-proxy/proxy` > > 4. The proxy's Node dependencies were installed to `/srv/galaxy/gie-proxy/proxy/node_modules` using the venv's `npm` > > 5. A systemd service unit was installed at `/etc/systemd/system/galaxy-gie-proxy.service` @@ -893,8 +894,186 @@ Once the playbook run is complete and your Galaxy server has restarted, run the > {: .question } -# Final Notes +## High availability setup with PostgreSQL (Optional) -As mentioned at the beginning of this tutorial, Galaxy Interactive Tools are a relatively new and rapidly evolving feature. At the time of writing, there is no official documentation for Interactive Tools. 
Please watch the [Galaxy Release Notes][galaxy-release-notes] for updates, changes, new documentation, and bug fixes. +> +> This section is **only relevant if you are running a high-availability** setup, meaning that you have multiple copies of Galaxy running behind a load balancer. +> +> If you have installed Galaxy following the [Galaxy Installation with Ansible]({% link topics/admin/tutorials/ansible-galaxy/tutorial.md %}) tutorial, or are completing this tutorial as part of a [Galaxy Admin Training][gat] course, please skip this section, as you are then _not_ running a high-availability setup. +{: .comment} + +In a _high availability_ setup, multiple redundant copies of Galaxy run simultaneously behind a load balancer to minimize downtime and service interruptions. + +As explained in [one of the previous sections](#installing-the-interactive-tools-proxy), the Galaxy Interactive Tools Proxy redirects requests to each Interactive Tool's host and port. By default, the mapping of GxIT invocations to their corresponding host/port is kept in a SQLite database known as the _Interactive Tools Session Map_. + +By design, [SQLite is the wrong choice for high availability setups][sqlite_situations_where_a_client_server_rdbms_may_work_better], the showstopper being that the SQLite database file would have to be shared over a network filesystem, which usually has latencies too high for RDBMS use. For this reason, Galaxy and the Interactive Tools Proxy can also store the **Session Map in a PostgreSQL database**. + +[sqlite_situations_where_a_client_server_rdbms_may_work_better]: https://www.sqlite.org/whentouse.html#situations_where_a_client_server_rdbms_may_work_better + +> Preparing the database +> +> First, you need to create a database for the Interactive Tools Proxy. +> +> > +> > Do **not** use the Galaxy database for this purpose. The main Galaxy database is reserved for Galaxy's core functionality, and Interactive Tools have not yet reached this stage. 
Since Galaxy does not expect to find the Interactive Tools Session Map in this database, storing it there can lead to errors. +> {: .warning } +> +>
+> +> 1. Access PostgresSQL **in your database server**. +> +> > Bash +> > ```bash +> > sudo -iu postgres psql +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)) +> > Type "help" for help. +> > +> > postgres=# +> > ``` +> {: .code-out} +> +> 2. Create a `gxitproxy` database to store the Interactive Tools Session Map. +> +> > SQL +> > ```sql +> > CREATE DATABASE gxitproxy; +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > CREATE DATABASE +> > ``` +> {: .code-out} +> +> 3. For simplicity, the same user that operates on the Galaxy main database, typically named `galaxy`, is also going to operate on this one. Make this user the owner of the new database. +> +> > SQL +> > ```sql +> > ALTER DATABASE gxitproxy OWNER TO galaxy; +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > ALTER DATABASE +> > ``` +> {: .code-out} +> +> 4. Sign out of the `postgres` database using `exit`. Then connect to the `gxitproxy` database as `galaxy`. +> +> > SQL +> > ```sql +> > exit +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > Bash +> > ```bash +> > sudo -iu galaxy psql -d gxitproxy +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)) +> > Type "help" for help. +> > +> > gxitproxy=# +> > ``` +> {: .code-out} +> +> 5. Create a `gxitproxy` table in the new database. +> +> > SQL +> > ```sql +> > CREATE TABLE IF NOT EXISTS gxitproxy (key TEXT, key_type TEXT, token TEXT, host TEXT, port INTEGER, info TEXT, PRIMARY KEY (key, key_type)); +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > CREATE TABLE +> > ``` +> {: .code-out} +> +> This is enough to let Galaxy and the Interactive Tool Proxy store the Interactive Tools Session Map in PostgreSQL. 
But there is a catch: when the Interactive Tool Proxy uses SQLite, it knows the database has changed because it watches the file for changes. When using Postgres, this mechanism is not available. By default, the proxy simply polls the database at regular intervals. To let the user access interactive tools as fast as possible, the proxy can also be notified of updates via [PostgreSQL asynchronous notifications](https://www.postgresql.org/docs/16/libpq-notify.html). To enable them, you have to create a PostgreSQL trigger that sends a NOTIFY message to the channel `gxitproxy` every time the table `gxitproxy` changes. +> +>
+> +> {:start="6"} +> 6. Run the following commands to create to create a function that sends a NOTIFY message to the channel `gxitproxy` and a trigger that runs the function every time the table `gxitproxy` changes. +> +> > SQL +> > ```sql +> > CREATE OR REPLACE FUNCTION notify_gxitproxy() +> > RETURNS trigger AS $$ +> > BEGIN +> > PERFORM pg_notify('gxitproxy', 'Table "gxitproxy" changed'); +> > RETURN NEW; +> > END; +> > $$ LANGUAGE plpgsql; +> > +> > CREATE TRIGGER gxitproxy_notify +> > AFTER INSERT OR UPDATE OR DELETE ON gxitproxy +> > FOR EACH ROW EXECUTE FUNCTION notify_gxitproxy(); +> > ``` +> > {: data-cmd="true"} +> {: .code-in} +> +> > SQL +> > ``` +> > CREATE FUNCTION +> > CREATE TRIGGER +> > ``` +> {: .code-out} +> +{: .hands_on} + +The next step is configuring Galaxy and the Interactive Tool Proxy to use the new database. + +> Configure Galaxy and the Interactive Tool Proxy +> +> 1. Adjust your `group_vars/galaxyservers.yml` file as follows. +> +> {% raw %} +> ```yaml +> # ... existing configuration options ... # +> +> galaxy_config: +> galaxy: +> # ... existing configuration options in the `galaxy` section ... +> # interactivetools_map: "{{ gie_proxy_sessions_path }}" # comment, remove or leave this line in place (it will be overridden by the option below) +> interactivetools_map_sqlalchemy: "{{ gie_proxy_sessions_path }}" +> # ... other existing configuration options in the `galaxy` section ... +> +> # ... other existing configurations ... # +> +> gie_proxy_sessions_path: "postgresql:///gxitproxy?host=/var/run/postgresql" +> ``` +> {% endraw %} +> +> 2. Run the playbook: +> +> ``` +> ansible-playbook galaxy.yml +> ``` +> +{: .hands_on} + +That's it, once the playbook run is complete, both Galaxy and the Interactive Tools Proxy will be storing the Interactive Tools Session Map in PostgreSQL. +# Final Notes +As mentioned at the beginning of this tutorial, Galaxy Interactive Tools are a relatively new and rapidly evolving feature. 
At the time of writing, there is no official documentation for Interactive Tools. Please watch the [Galaxy Release Notes][galaxy-release-notes] for updates, changes, new documentation, and bug fixes. [galaxy-release-notes]: https://docs.galaxyproject.org/en/master/releases/index.html From 04a54a9d99bf43c7ae1341f786c3cdb0b49c1048 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Manuel=20Dom=C3=ADnguez?= <43052541+kysrpex@users.noreply.github.com> Date: Mon, 22 Jul 2024 11:01:18 +0200 Subject: [PATCH 2/5] Include PostgreSQL asynchronous notifications explanation under step 6 of HA setup section hands-on of the Galaxy Interactive Tools training Co-authored-by: Helena --- topics/admin/tutorials/interactive-tools/tutorial.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/topics/admin/tutorials/interactive-tools/tutorial.md b/topics/admin/tutorials/interactive-tools/tutorial.md index c8885173a4161a..1d75058607ed5e 100644 --- a/topics/admin/tutorials/interactive-tools/tutorial.md +++ b/topics/admin/tutorials/interactive-tools/tutorial.md @@ -1008,12 +1008,9 @@ By design, [SQLite is the wrong choice for high availability setups][sqlite_situ > > ``` > {: .code-out} > -> This is enough to let Galaxy and the Interactive Tool Proxy store the Interactive Tools Session Map in PostgreSQL. But there is a catch: when the Interactive Tool Proxy uses SQLite, it knows the database has changed because it watches the file for changes. When using Postgres, this mechanism is not available. By default, the proxy simply polls the database at regular intervals. To let the user access interactive tools as fast as possible, the proxy can also be notified of updates via [PostgreSQL asynchronous notifications](https://www.postgresql.org/docs/16/libpq-notify.html). To enable them, you have to create a PostgreSQL trigger that sends a NOTIFY message to the channel `gxitproxy` every time the table `gxitproxy` changes. +> 6. 
This is enough to let Galaxy and the Interactive Tool Proxy store the Interactive Tools Session Map in PostgreSQL. But there is a catch: when the Interactive Tool Proxy uses SQLite, it knows the database has changed because it watches the file for changes. When using Postgres, this mechanism is not available. By default, the proxy simply polls the database at regular intervals. To let the user access interactive tools as fast as possible, the proxy can also be notified of updates via [PostgreSQL asynchronous notifications](https://www.postgresql.org/docs/16/libpq-notify.html). To enable them, you have to create a PostgreSQL trigger that sends a NOTIFY message to the channel `gxitproxy` every time the table `gxitproxy` changes. > ->
-> -> {:start="6"} -> 6. Run the following commands to create to create a function that sends a NOTIFY message to the channel `gxitproxy` and a trigger that runs the function every time the table `gxitproxy` changes. +> Run the following commands to create to create a function that sends a NOTIFY message to the channel `gxitproxy` and a trigger that runs the function every time the table `gxitproxy` changes. > > > SQL > > ```sql From 72e272f103662f6ece4970196e6f6e63a10b285a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Manuel=20Dom=C3=ADnguez?= Date: Mon, 22 Jul 2024 14:12:22 +0200 Subject: [PATCH 3/5] Unify steps 1, 2 and 3 under database preparation hands-on on HA setup section of the Galaxy Interactive Tools training --- .../tutorials/interactive-tools/tutorial.md | 57 ++----------------- 1 file changed, 6 insertions(+), 51 deletions(-) diff --git a/topics/admin/tutorials/interactive-tools/tutorial.md b/topics/admin/tutorials/interactive-tools/tutorial.md index 1d75058607ed5e..d4562838d524da 100644 --- a/topics/admin/tutorials/interactive-tools/tutorial.md +++ b/topics/admin/tutorials/interactive-tools/tutorial.md @@ -920,62 +920,17 @@ By design, [SQLite is the wrong choice for high availability setups][sqlite_situ > >
> -> 1. Access PostgresSQL **in your database server**. +> 1. **On your database server**, access PostgreSQL and create a `gxitproxy` database to store the Interactive Tools Session Map. For simplicity, the same user that operates on the Galaxy main database, typically named `galaxy`, is also going to operate on this one and will own the new database. > > > Bash > > ```bash -> > sudo -iu postgres psql +> > # one-liner that connects to Postgres, creates the database and assigns ownership +> > sudo -u postgres createdb -O galaxy gxitproxy > > ``` > > {: data-cmd="true"} > {: .code-in} > -> > SQL -> > ``` -> > psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)) -> > Type "help" for help. -> > -> > postgres=# -> > ``` -> {: .code-out} -> -> 2. Create a `gxitproxy` database to store the Interactive Tools Session Map. -> -> > SQL -> > ```sql -> > CREATE DATABASE gxitproxy; -> > ``` -> > {: data-cmd="true"} -> {: .code-in} -> -> > SQL -> > ``` -> > CREATE DATABASE -> > ``` -> {: .code-out} -> -> 3. For simplicity, the same user that operates on the Galaxy main database, typically named `galaxy`, is also going to operate on this one. Make this user the owner of the new database. -> -> > SQL -> > ```sql -> > ALTER DATABASE gxitproxy OWNER TO galaxy; -> > ``` -> > {: data-cmd="true"} -> {: .code-in} -> -> > SQL -> > ``` -> > ALTER DATABASE -> > ``` -> {: .code-out} -> -> 4. Sign out of the `postgres` database using `exit`. Then connect to the `gxitproxy` database as `galaxy`. -> -> > SQL -> > ```sql -> > exit -> > ``` -> > {: data-cmd="true"} -> {: .code-in} +> 2. Connect to the `gxitproxy` database as `galaxy`. > > > Bash > > ```bash @@ -993,7 +948,7 @@ By design, [SQLite is the wrong choice for high availability setups][sqlite_situ > > ``` > {: .code-out} > -> 5. Create a `gxitproxy` table in the new database. +> 3. Create a `gxitproxy` table in the new database. 
> > > SQL > > ```sql @@ -1008,7 +963,7 @@ By design, [SQLite is the wrong choice for high availability setups][sqlite_situ > > ``` > {: .code-out} > -> 6. This is enough to let Galaxy and the Interactive Tool Proxy store the Interactive Tools Session Map in PostgreSQL. But there is a catch: when the Interactive Tool Proxy uses SQLite, it knows the database has changed because it watches the file for changes. When using Postgres, this mechanism is not available. By default, the proxy simply polls the database at regular intervals. To let the user access interactive tools as fast as possible, the proxy can also be notified of updates via [PostgreSQL asynchronous notifications](https://www.postgresql.org/docs/16/libpq-notify.html). To enable them, you have to create a PostgreSQL trigger that sends a NOTIFY message to the channel `gxitproxy` every time the table `gxitproxy` changes. +> 4. This is enough to let Galaxy and the Interactive Tool Proxy store the Interactive Tools Session Map in PostgreSQL. But there is a catch: when the Interactive Tool Proxy uses SQLite, it knows the database has changed because it watches the file for changes. When using Postgres, this mechanism is not available. By default, the proxy simply polls the database at regular intervals. To let the user access interactive tools as fast as possible, the proxy can also be notified of updates via [PostgreSQL asynchronous notifications](https://www.postgresql.org/docs/16/libpq-notify.html). To enable them, you have to create a PostgreSQL trigger that sends a NOTIFY message to the channel `gxitproxy` every time the table `gxitproxy` changes. > > Run the following commands to create to create a function that sends a NOTIFY message to the channel `gxitproxy` and a trigger that runs the function every time the table `gxitproxy` changes. 
> From 6b902063c761ead8fe748dfca8743ced1acadf66 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Manuel=20Dom=C3=ADnguez?= Date: Thu, 5 Sep 2024 09:48:22 +0200 Subject: [PATCH 4/5] Rename `interactivetools_map_sqlalchemy` to `interactivetoolsproxy_map` --- topics/admin/tutorials/interactive-tools/tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/admin/tutorials/interactive-tools/tutorial.md b/topics/admin/tutorials/interactive-tools/tutorial.md index d4562838d524da..265f943e2f5a65 100644 --- a/topics/admin/tutorials/interactive-tools/tutorial.md +++ b/topics/admin/tutorials/interactive-tools/tutorial.md @@ -1007,7 +1007,7 @@ The next step is configuring Galaxy and the Interactive Tool Proxy to use the ne > galaxy: > # ... existing configuration options in the `galaxy` section ... > # interactivetools_map: "{{ gie_proxy_sessions_path }}" # comment, remove or leave this line in place (it will be overridden by the option below) -> interactivetools_map_sqlalchemy: "{{ gie_proxy_sessions_path }}" +> interactivetoolsproxy_map: "{{ gie_proxy_sessions_path }}" > # ... other existing configuration options in the `galaxy` section ... > > # ... other existing configurations ... 
# From fa2d197696d295a4528d052ef47aee6bda6c5f50 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Manuel=20Dom=C3=ADnguez?= <43052541+kysrpex@users.noreply.github.com> Date: Tue, 29 Oct 2024 09:38:22 +0100 Subject: [PATCH 5/5] Explicitly declare contributors as authors --- .../admin/tutorials/interactive-tools/tutorial.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/topics/admin/tutorials/interactive-tools/tutorial.md b/topics/admin/tutorials/interactive-tools/tutorial.md index 76341239069dd6..58759af6b17dbb 100644 --- a/topics/admin/tutorials/interactive-tools/tutorial.md +++ b/topics/admin/tutorials/interactive-tools/tutorial.md @@ -19,12 +19,13 @@ key_points: - nginx routes GxIT requests to the GxIT(/GIE) Proxy, which routes them to the node/port on which the GxIT is running - GxITs require wildcard SSL certificates - GxITs expose your Galaxy server's user datasets unless configured to use Pulsar -contributors: - - natefoo - - slugger70 - - hexylena - - abretaud - - kysrpex +contributions: + authorship: + - natefoo + - slugger70 + - hexylena + - abretaud + - kysrpex tags: - ansible - interactive-tools