Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add connection pooling #483

Open
josephmancuso opened this issue Aug 5, 2021 · 10 comments
Open

Add connection pooling #483

josephmancuso opened this issue Aug 5, 2021 · 10 comments
Labels
enhancement A feature that exists, works as intended but needs to be improved hard These issues are geared for people who have contributed to the project a lot

Comments

@josephmancuso
Copy link
Member

Right now, Masonite ORM makes a connection and drops a connection on every query call. This is fine in most instances but if you have several application making the same connections to the same database, you'll want to set a connection_pool_limit. Like 100 connections. Then we can pull from the connection pool rather than making a new connection every time

@josephmancuso josephmancuso added the enhancement A feature that exists, works as intended but needs to be improved label Aug 5, 2021
@josephmancuso
Copy link
Member Author

we're also going to need some kind of timer delay for waiting for a connection to be open

@josephmancuso josephmancuso added the hard These issues are geared for people who have contributed to the project a lot label Dec 20, 2021
@federicoparroni
Copy link
Contributor

Hello @josephmancuso! Another enhancement I can suggest is to make Masonite thread-safe. If I open a transaction in a thread and another thread is doing queries, I get InterfaceError from pymysql. Do you have a workaround at the moment?
I can see that ConnectionResolver is storing the open connections as a class field indexed by connection name:

class ConnectionResolver:
_connection_details = {}
_connections = {}
_morph_map = {}

Is there any way to segregate the open connections in order that each thread can access only its own connections?

@josephmancuso
Copy link
Member Author

If you can give some code examples I can test it out and figure out the issue. The connection resolver just holds the available connection classes. I don't think that should be a thread issue but I could be wrong

@federicoparroni
Copy link
Contributor

I will try to provide a minimal example as soon as possible since I have a quite complex application.
Looking at the code, I think that the issue is the shared _connections field. When a transaction is opened, the connection is stored in that dictionary (line 60):

def begin_transaction(self, name=None):
if name is None:
name = self.get_connection_details()["default"]
driver = self.get_connection_details()[name].get("driver")
connection = (
self.connection_factory.make(driver)(
**self.get_connection_information(name)
)
.make_connection()
.begin()
)
self.__class__._connections.update({name: connection})
return connection

Then, the connection is reused when a connection is required (line 68-69):
def make_connection(self):
"""This sets the connection on the connection class"""
if self._dry:
return
try:
import pymysql
except ModuleNotFoundError:
raise DriverNotFound(
"You must have the 'pymysql' package installed to make a connection to MySQL. Please install it using 'pip install pymysql'"
)
try:
import pendulum
import pymysql.converters
pymysql.converters.conversions[
pendulum.DateTime
] = pymysql.converters.escape_datetime
except ImportError:
pass
if self.has_global_connection():
return self.get_global_connection()
self._connection = pymysql.connect(
cursorclass=pymysql.cursors.DictCursor,
autocommit=True,
host=self.host,
user=self.user,
password=self.password,
port=self.port,
db=self.database,
**self.options
)
self.open = 1
return self

In summary, I suppose that connections are shared between different threads...

@josephmancuso
Copy link
Member Author

Hmm if you are using transactions than that's possible it's trying to access a connection in another thread

@circulon
Copy link
Contributor

circulon commented Jun 19, 2022

Hmm if you are using transactions than that's possible it's trying to access a connection in another thread

Would this effect be due to every call to the db when creating a model or querybuilder instance or just getting DB (connectionresolver with config) for Transaction use generates a new instance of the ConnectionResolver?

Shouldn't the ConnectionResolver be cached in some way to utilise the same connection during multiple consecutive queries?

Or am I just looking at this wrong?

@josephmancuso
Copy link
Member Author

josephmancuso commented Jun 19, 2022

So theres a few definitions that need to be defined. Theres a few different "connection" references here.

  1. Connection Resolver. This is kind of a god object in the library responsible as a central store that parts of the ORM can use to store and retrieve things.

  2. Connection Classes - These are the classes with the logic responsible for calling the specific database. These are standalone classes that get registered to the connection resolver. These are instantiated on every new connection to the database.

  3. Connections - These would be the actual connections that exist once a connection is made to the database. For example, you go to make a query to a MYSQL database and it opens a connection for the duration of the query and then closes it after.

With queries we open and close a connection on every single query. I would assume this would be thread safe since the thread doing the query also should be the one creating and closing the connection class (3.).

With transactions the current implementation is to store the "connections" (the 3. here) and then save the connection globally so we can keep the transaction open for the duration of the transaction and then once the transaction is committed or rolled back we close this open transaction from the connection.

I think what @federicoparroni is suggesting is that doing that can result in 2 threads sharing the same "connection" class (3.) and then one thread closing the connection while another thread still is using it

@federicoparroni
Copy link
Contributor

Yes @josephmancuso, it's exactly that situation. My current workaround for this issue is to create ad-hoc connections with unique names for each thread, so that when a transaction begins, it is only used by that single thread. I'm using the on method to force all the queries in the transaction context to use that unique connection, but I am also experiencing some problems with this approach (see #726)

@josephmancuso
Copy link
Member Author

Yes @josephmancuso, it's exactly that situation. My current workaround for this issue is to create ad-hoc connections with unique names for each thread, so that when a transaction begins, it is only used by that single thread. I'm using the on method to force all the queries in the transaction context to use that unique connection, but I am also experiencing some problems with this approach (see #726)

I think the solution is to probably use SAVEPOINTS instead of using the cursor on the connection class but i'll have to research that a bit.

@stoutput
Copy link
Contributor

sqlalchemy uses something called a "session factory" which creates a new sqla session if one does not exist, or returns the current session if it does. Session instantiation involves obtaining a new connection. This session object provides thread-safe property access to things like the current transaction and its db connection. Something akin to that could be of use here – but I think thread safety should probably be another issue altogether though it's important to think about in this context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A feature that exists, works as intended but needs to be improved hard These issues are geared for people who have contributed to the project a lot
Projects
None yet
Development

No branches or pull requests

4 participants