Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support high-availability setups for the interactive tools proxy #18481

Merged
merged 24 commits into from
Sep 4, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
b54bf35
Support using a table in any database supported by SQLAlchemy as mapp…
kysrpex Jun 28, 2024
bf6b6ed
Fix bug when using a mapping for interactive tools from a SQLAlchemy …
kysrpex Jul 4, 2024
967b591
Merge branch 'refs/heads/dev' into interactivetoolssqlalchemy
kysrpex Jul 4, 2024
d85e1b5
Replace raw SQL queries with SQLAlchemy Core SQL expressions
kysrpex Jul 17, 2024
ce6d0f7
Configure SQLAlchemy interactive tool maps via new `interactivetools_…
kysrpex Jul 18, 2024
647fd4c
Revert changes to description for setting `sessions` in galaxy.yml.sa…
kysrpex Jul 18, 2024
678bea1
Forbid setting `interactivetools_map_sqlalchemy` from matching `datab…
kysrpex Jul 18, 2024
dddf5ba
Add note in interactive tools docs page informing about `interactivet…
kysrpex Jul 18, 2024
6be9b60
workaround `mypy` error on statement using `sqlalchemy.select`
kysrpex Jul 22, 2024
e7dbe42
Create `gxitproxy` database table when using interactive tools for th…
kysrpex Jul 23, 2024
c1fd2b5
Merge branch 'dev' into interactivetoolssqlalchemy
kysrpex Jul 23, 2024
b6ad695
Add `sqlalchemy` to config package requirements
kysrpex Aug 2, 2024
5e8ca55
Test that Galaxy raises `ConfigurationError` when setting `interactiv…
kysrpex Aug 2, 2024
26fd210
Set `interactivetools_map` to `None` when `interactivetools_map_sqlal…
kysrpex Aug 9, 2024
ffa8ea4
Fix typos in descriptions of `interactivetools_map` and `interactivet…
kysrpex Aug 9, 2024
018e541
Mention that `interactivetools_map_sqlalchemy` overrides `interactive…
kysrpex Aug 9, 2024
e040750
Refactor variable name `query` to `stmt` for calls to `select()`, `in…
kysrpex Aug 9, 2024
8305016
Access individual table columns via `__getattr__` rather than `__geti…
kysrpex Aug 9, 2024
6fc0d46
Restore original error handling for `InteractiveToolPropagatorSQLAlch…
kysrpex Aug 9, 2024
4e6688c
Workaround to have original error handling for `get()` and pass mypy …
kysrpex Aug 9, 2024
0b6ad17
Remove dead code `InteractiveToolPropagatorSQLAlchemy.get()`
kysrpex Sep 3, 2024
9ee6636
Rename `interactivetools_map_sqlalchemy` to `interactivetoolsproxy_map`
kysrpex Sep 3, 2024
5c0b5d6
Use `urlparse` to compare `interactivetoolsproxy_map` with `database_…
kysrpex Sep 3, 2024
7510372
Revert "Add `sqlalchemy` to config package requirements"
kysrpex Sep 3, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions config/galaxy.yml.interactivetools
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@ gravity:

galaxy:
interactivetools_enable: true
# either a path to a SQLite database file or a SQLAlchemy database URL, in both cases the table "gxitproxy" on
# the target database is used
interactivetools_map: database/interactivetools_map.sqlite

# outputs_to_working_directory will provide you with a better level of isolation. It is highly recommended to set
Expand Down
9 changes: 6 additions & 3 deletions doc/source/admin/galaxy_options.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2128,9 +2128,12 @@
~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
Map for interactivetool proxy.
The value of this option will be resolved with respect to
<data_dir>.
Map for interactivetool proxy. It may be either a path to a SQLite
database or a SQLAlchemy database URL (see
https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls).
In either case, mappings will be written to the table "gxitproxy" within the
database. If it is a path, the value of this option will be resolved with
respect to <data_dir>.
:Default: ``interactivetools_map.sqlite``
:Type: str

Expand Down
2 changes: 2 additions & 0 deletions doc/source/admin/special_topics/interactivetools.rst
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,8 @@ Set these values in ``galaxy.yml``:

galaxy:
interactivetools_enable: true
# either a path to a SQLite database file or a SQLAlchemy database URL, in both cases the table "gxitproxy" on the
# target database is used
interactivetools_map: database/interactivetools_map.sqlite

# outputs_to_working_directory will provide you with a better level of isolation. It is highly recommended to set
Expand Down
9 changes: 6 additions & 3 deletions lib/galaxy/config/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -1103,9 +1103,12 @@ def _process_config(self, kwargs: Dict[str, Any]) -> None:
self.proxy_session_map = self.dynamic_proxy_session_map
self.manage_dynamic_proxy = self.dynamic_proxy_manage # Set to false if being launched externally

# InteractiveTools propagator mapping file
self.interactivetools_map = self._in_root_dir(
kwargs.get("interactivetools_map", self._in_data_dir("interactivetools_map.sqlite"))
# InteractiveTools propagator mapping
interactivetools_map = kwargs.get("interactivetools_map", self._in_data_dir("interactivetools_map.sqlite"))
self.interactivetools_map = (
interactivetools_map
if urlparse(interactivetools_map).scheme
else "sqlite:///" + self._in_root_dir(interactivetools_map)
)

# Compliance/Policy variables
Expand Down
4 changes: 2 additions & 2 deletions lib/galaxy/config/sample/galaxy.yml.sample
Original file line number Diff line number Diff line change
Expand Up @@ -189,8 +189,8 @@ gravity:
# Public-facing port of the proxy
# port: 4002

# Routes file to monitor.
# Should be set to the same path as ``interactivetools_map`` in the ``galaxy:`` section. This is ignored if
# Database to monitor.
# Should be set to the same value as ``interactivetools_map`` in the ``galaxy:`` section. This is ignored if
# ``interactivetools_map is set``.
# sessions: database/interactivetools_map.sqlite

Expand Down
6 changes: 4 additions & 2 deletions lib/galaxy/config/schemas/config_schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1518,9 +1518,11 @@ mapping:
interactivetools_map:
type: str
default: interactivetools_map.sqlite
path_resolves_to: data_dir
path_resolves_to: data_dir # does not apply to SQLAlchemy database URLs
desc: |
jdavcs marked this conversation as resolved.
Show resolved Hide resolved
Map for interactivetool proxy.
Map for interactivetool proxy. It may be either a path to a SQLite database or a SQLALchemy database URL (see
https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls). In either case, mappings will be written
to the table "gxitproxy" within the database.

interactivetools_prefix:
type: str
Expand Down
201 changes: 105 additions & 96 deletions lib/galaxy/managers/interactivetool.py
Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
import json
import logging
import sqlite3
from urllib.parse import (
urlsplit,
urlunsplit,
)

from sqlalchemy import (
create_engine,
or_,
select,
text,
)

from galaxy import exceptions
Expand All @@ -18,125 +19,133 @@
)
from galaxy.model.base import transaction
from galaxy.security.idencoding import IdAsLowercaseAlphanumEncodingHelper
from galaxy.util.filelock import FileLock
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I may be missing something regarding the FileLock. What role was it actually playing? Isn't SQLite already taking care of locks?

The pager module effectively controls access for separate threads, or separate processes, or both. Throughout this document whenever the word "process" is written you may substitute the word "thread" without changing the truth of the statement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Galaxy has a long history of concurrency issues with respect to SQLite. I cannot promise this FileLock was needed but experience taught me it was better to be safe than sorry here.

Copy link
Contributor Author

@kysrpex kysrpex Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some information that may help us understand whether the lock is needed in this case.

However, no matter what locking modes are used, SQLite will still always lock the database file once a transaction is started and DML (e.g. INSERT, UPDATE, DELETE) has at least been emitted, and this will block other transactions at least at the point that they also attempt to emit DML. By default, the length of time on this block is very short before it times out with an error.

This behavior becomes more critical when used in conjunction with the SQLAlchemy ORM. SQLAlchemy’s Session object by default runs within a transaction, and with its autoflush model, may emit DML preceding any SELECT statement. This may lead to a SQLite database that locks more quickly than is expected. The locking mode of SQLite and the pysqlite driver can be manipulated to some degree, however it should be noted that achieving a high degree of write-concurrency with SQLite is a losing battle.

2. Situations Where A Client/Server RDBMS May Work Better

  • [...]

  • High-volume Websites

    SQLite will normally work fine as the database backend to a website. But if the website is write-intensive or is so busy that it requires multiple servers, then consider using an enterprise-class client/server database engine instead of SQLite.

  • [...]

  • High Concurrency
    SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.

From both excerpts I understand that it is likely that the long history of concurrency issues that Galaxy has with SQLite probably has to do with using the SQLAlchemy ORM under high load (maybe you can either support/refute this?).

As far as I know, the interactive tools mapping file is written only by InteractiveToolSqlite (or InteractiveToolPropagatorSQLAlchemy in this refactor) and read only by Galaxy (the same class) and the proxy. I understand that issues would appear if interactive tools are in extremely high demand (Galaxy constantly writes keys and tokens to the file). However, in that case, wouldn't it just make better sense, as the SQLite docs claim, that the Galaxy admin uses PostgreSQL rather than SQLite? Another solution would be increasing the SQLite lock timeout.

Maybe if @jdavcs agrees to go forward with a new config property for mappings based on arbitrary SQLAlchemy database URLs, it may even make sense to discourage the use of the original property interactivetools_map in the Galaxy docs to favour the use of Postgres? After all, SQLite use is already discouraged.

In that case though, I think we should also have a look together at the proxy-side implementation, because it also has it's drawbacks regarding database setup (maybe it should just be done automatically by Galaxy).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From both excerpts I understand that it is likely that the long history of concurrency issues that Galaxy has with SQLite probably has to do with using the SQLAlchemy ORM under high load (maybe you can either support/refute this?).

I think the core of the issue here is just SQLite: "SQLite will still always lock the database file once a transaction is started" - it's just suboptimal for write-intensive scenarios. Using the ORM makes it worse: the autoflush model mentioned in the docs ensures that the state of objects loaded from the db is kept in sync with the data in the db by issuing DML statements before retrieving new data (when needed) - so the db file gets locked much more frequently if we rely on the Session. However, we don't use the ORM here.

If we go with this proposed approached, we can still keep SQLite as an option (it may be still a good enough option for smaller instances or dev purposes, etc.; after all we support SQLite as an option for the main database too). We may consider adding a note to Galaxy's documentation and updating the IT tutorial accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a note (dddf5ba) to the documentation and opened galaxyproject/training-material#5178 and galaxyproject/training-material#5179


log = logging.getLogger(__name__)

DATABASE_TABLE_NAME = "gxitproxy"


class InteractiveToolSqlite:
def __init__(self, sqlite_filename, encode_id):
self.sqlite_filename = sqlite_filename
self.encode_id = encode_id
class InteractiveToolPropagatorSQLAlchemy:
"""
Propagator for InteractiveToolManager implemented using SQLAlchemy.
"""

def __init__(self, database_url, encode_id):
"""
Constructor that sets up the propagator using a SQLAlchemy database URL.

:param database_url: SQLAlchemy database URL, read more on the SQLAlchemy documentation
https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls.
:param encode_id: A helper class that can encode ids as lowercase alphanumeric strings and vice versa.
"""
self._engine = create_engine(database_url)
self._encode_id = encode_id

def get(self, key, key_type):
with FileLock(self.sqlite_filename):
conn = sqlite3.connect(self.sqlite_filename)
try:
c = conn.cursor()
select = f"""SELECT token, host, port, info
FROM {DATABASE_TABLE_NAME}
WHERE key=? and key_type=?"""
c.execute(
select,
(
key,
key_type,
),
)
try:
token, host, port, info = c.fetchone()
except TypeError:
log.warning("get(): invalid key: %s key_type %s", key, key_type)
return None
return dict(key=key, key_type=key_type, token=token, host=host, port=port, info=info)
finally:
conn.close()
with self._engine.connect() as conn:
query = text(
f"""
SELECT token, host, port, info
FROM {DATABASE_TABLE_NAME} WHERE key=:key AND key_type=:key_type
"""
)
parameters = dict(
key=key,
key_type=key_type,
)
result = conn.execute(query, parameters).fetchone()
return (
None
if result is None
else dict(
key=key,
key_type=key_type,
token=result[0],
host=result[1],
port=result[2],
info=result[3],
)
)

def save(self, key, key_type, token, host, port, info=None):
"""
Writeout a key, key_type, token, value store that is can be used for coordinating
with external resources.
Write out a key, key_type, token, value store that is can be used for coordinating with external resources.
"""
assert key, ValueError("A non-zero length key is required.")
assert key_type, ValueError("A non-zero length key_type is required.")
assert token, ValueError("A non-zero length token is required.")
with FileLock(self.sqlite_filename):
conn = sqlite3.connect(self.sqlite_filename)
try:
c = conn.cursor()
try:
# Create table
c.execute(
f"""CREATE TABLE {DATABASE_TABLE_NAME}
(key text,
key_type text,
token text,
host text,
port integer,
info text,
PRIMARY KEY (key, key_type)
)"""
)
except Exception:
pass
delete = f"""DELETE FROM {DATABASE_TABLE_NAME} WHERE key=? and key_type=?"""
c.execute(
delete,
(
key,
key_type,
),
)
insert = f"""INSERT INTO {DATABASE_TABLE_NAME}
(key, key_type, token, host, port, info)
VALUES (?, ?, ?, ?, ?, ?)"""
c.execute(
insert,
(
key,
key_type,
token,
host,
port,
info,
),
)
conn.commit()
finally:
conn.close()
with self._engine.connect() as conn:
# create gx-it-proxy table if not exists
query = text(
f"""
CREATE TABLE IF NOT EXISTS {DATABASE_TABLE_NAME} (
key TEXT,
key_type TEXT,
token TEXT,
host TEXT,
port INTEGER,
info TEXT,
PRIMARY KEY (key, key_type)
)
"""
)
conn.execute(query)

# delete existing data with same key
query = text(
f"""
DELETE FROM {DATABASE_TABLE_NAME} WHERE key=:key AND key_type=:key_type
"""
)
parameters = dict(
key=key,
key_type=key_type,
)
conn.execute(query, parameters)

# save data
query = text(
f"""
INSERT INTO {DATABASE_TABLE_NAME} (key, key_type, token, host, port, info)
VALUES (:key, :key_type, :token, :host, :port, :info)
"""
)
parameters = dict(
key=key,
key_type=key_type,
token=token,
host=host,
port=port,
info=info,
)
conn.execute(query, parameters)

conn.commit()

def remove(self, **kwd):
"""
Remove entry from a key, key_type, token, value store that is can be used for coordinating
with external resources. Remove entries that match all provided key=values
"""
assert kwd, ValueError("You must provide some values to key upon")
delete = f"DELETE FROM {DATABASE_TABLE_NAME} WHERE"
value_list = []
for i, (key, value) in enumerate(kwd.items()):
if i != 0:
delete += " and"
delete += f" {key}=?"
value_list.append(value)
with FileLock(self.sqlite_filename):
conn = sqlite3.connect(self.sqlite_filename)
try:
c = conn.cursor()
try:
# Delete entry
c.execute(delete, tuple(value_list))
except Exception as e:
log.debug("Error removing entry (%s): %s", delete, e)
conn.commit()
finally:
conn.close()
with self._engine.connect() as conn:
query_template = f"DELETE FROM {DATABASE_TABLE_NAME} WHERE {{conditions}}"
conditions = (
"{key}=:{value}".format(
key=key,
value=f"value_{i}",
)
for i, key in enumerate(kwd)
)
query = text(query_template.format(conditions=" AND ".join(conditions)))
parameters = {f"value_{i}": value for i, value in enumerate(kwd.values())}
conn.execute(query, parameters)
conn.commit()

def save_entry_point(self, entry_point):
"""Convenience method to easily save an entry_point."""
return self.save(
self.encode_id(entry_point.id),
self._encode_id(entry_point.id),
entry_point.__class__.__name__.lower(),
entry_point.token,
entry_point.host,
Expand All @@ -151,7 +160,7 @@ def save_entry_point(self, entry_point):

def remove_entry_point(self, entry_point):
"""Convenience method to easily remove an entry_point."""
return self.remove(key=self.encode_id(entry_point.id), key_type=entry_point.__class__.__name__.lower())
return self.remove(key=self._encode_id(entry_point.id), key_type=entry_point.__class__.__name__.lower())


class InteractiveToolManager:
Expand All @@ -166,7 +175,7 @@ def __init__(self, app):
self.sa_session = app.model.context
self.job_manager = app.job_manager
self.encoder = IdAsLowercaseAlphanumEncodingHelper(app.security)
self.propagator = InteractiveToolSqlite(app.config.interactivetools_map, self.encoder.encode_id)
self.propagator = InteractiveToolPropagatorSQLAlchemy(app.config.interactivetools_map, self.encoder.encode_id)

def create_entry_points(self, job, tool, entry_points=None, flush=True):
entry_points = entry_points or tool.ports
Expand Down
Loading