
Database is locked when running in highly concurrent environment #221

Open
0xGhostCasper opened this issue Oct 30, 2024 · 1 comment

Comments

@0xGhostCasper

Hi, I am running twscrape in a highly concurrent environment with hundreds of separate processes. After a while, the SQLite database used by the twscrape library gets locked, and the following exception is thrown in my application code:

b3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./web3socialgraph/scraper/scraper.py", line 93, in scrape_user_followers
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | followers = await gather(self._api.followers(user_id, limit=limit))
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ │ └ 500
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ └ '1421397882666635271'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ <function API.followers at 0x7a5773b9f6d0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ <twscrape.api.API object at 0x7a5773bb0910>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <web3socialgraph.scraper.scraper.TwscrapeTwitterScraper object at 0x7a5773bebfd0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <function gather at 0x7a5773d0d000>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/utils.py", line 27, in gather
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | async for x in gen:
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <async_generator object API.followers at 0x7a5772740d40>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/api.py", line 273, in followers
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | async for rep in gen:
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <async_generator object API.followers_raw at 0x7a5772742d40>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/api.py", line 268, in followers_raw
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | async for x in gen:
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <async_generator object API._gql_items at 0x7a5772743e40>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/api.py", line 110, in _gql_items
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | async with QueueClient(self.pool, queue, self.debug, proxy=self.proxy) as client:
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ │ │ │ └ None
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ │ │ └ <twscrape.api.API object at 0x7a5773bb0910>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ │ └ False
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ └ <twscrape.api.API object at 0x7a5773bb0910>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ 'Followers'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ <twscrape.accounts_pool.AccountsPool object at 0x7a5773a539a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <twscrape.api.API object at 0x7a5773bb0910>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <class 'twscrape.queue_client.QueueClient'>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/queue_client.py", line 77, in __aenter__
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | await self._get_ctx()
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <function QueueClient._get_ctx at 0x7a5773b9eb90>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <twscrape.queue_client.QueueClient object at 0x7a577312ae30>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/queue_client.py", line 105, in _get_ctx
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | acc = await self.pool.get_for_queue_or_wait(self.queue)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ └ 'Followers'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ <twscrape.queue_client.QueueClient object at 0x7a577312ae30>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ <function AccountsPool.get_for_queue_or_wait at 0x7a5773b9e200>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <twscrape.accounts_pool.AccountsPool object at 0x7a5773a539a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <twscrape.queue_client.QueueClient object at 0x7a577312ae30>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/accounts_pool.py", line 289, in get_for_queue_or_wait
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | account = await self.get_for_queue(queue)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ 'Followers'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <function AccountsPool.get_for_queue at 0x7a5773b9e170>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <twscrape.accounts_pool.AccountsPool object at 0x7a5773a539a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/accounts_pool.py", line 284, in get_for_queue
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | return await self._get_and_lock(queue, q)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ "\n SELECT username FROM accounts\n WHERE active = true AND (\n locks IS NULL\n OR json_e...
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ 'Followers'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <function AccountsPool._get_and_lock at 0x7a5773b9e0e0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <twscrape.accounts_pool.AccountsPool object at 0x7a5773a539a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/accounts_pool.py", line 255, in _get_and_lock
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | rs = await fetchone(self._db_file, qs)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ "\n UPDATE accounts SET\n locks = json_set(locks, '$.Followers', datetime('now', '+15 minutes')),\n...
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ 'accounts.db'
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <twscrape.accounts_pool.AccountsPool object at 0x7a5773a539a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <function lock_retry.<locals>.decorator.<locals>.wrapper at 0x7a5773b66830>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/db.py", line 27, in wrapper
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | raise e
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/db.py", line 24, in wrapper
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | return await func(*args, **kwargs)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ {}
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ ('accounts.db', "\n UPDATE accounts SET\n locks = json_set(locks, '$.Followers', datetime('now', '+...
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <function fetchone at 0x7a5773b667a0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./twscrape/db.py", line 142, in fetchone
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | async with db.execute(qs, params) as cur:
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ None
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ "\n UPDATE accounts SET\n locks = json_set(locks, '$.Followers', datetime('now', '+15 minutes')),\n...
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <function Connection.execute at 0x7a5773b65480>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <Connection(Thread-41598, started 134515999639232)>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./aiosqlite/context.py", line 39, in __aenter__
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | self._obj = await self._coro
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ <member '_coro' of 'Result' objects>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ <aiosqlite.context.Result object at 0x7a5774082bc0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <member '_obj' of 'Result' objects>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <aiosqlite.context.Result object at 0x7a5774082bc0>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./aiosqlite/core.py", line 193, in execute
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | cursor = await self._execute(self._conn.execute, sql, parameters)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ │ └ []
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ │ └ "\n UPDATE accounts SET\n locks = json_set(locks, '$.Followers', datetime('now', '+15 minutes')),\n...
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ │ └ <property object at 0x7a5773b5d710>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ │ └ <Connection(Thread-41598, started 134515999639232)>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | │ └ <function Connection._execute at 0x7a5773b64e50>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <Connection(Thread-41598, started 134515999639232)>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./aiosqlite/core.py", line 132, in _execute
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | return await future
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ <Future finished exception=OperationalError('database is locked')>
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | File "/usr/local/lib/python3.10/site-packages/./aiosqlite/core.py", line 115, in run
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | result = function()
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | └ functools.partial(<built-in method close of sqlite3.Connection object at 0x7a5771a30440>)
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph |
web3socialgraph_twitter_scraper_service.1.e69c5xln3mht@web3socialgraph | sqlite3.OperationalError: database is locked

I'm running this in a FastStream RabbitBroker app with Python 3.10 and twscrape 0.14.

Any ideas? Thank you for your effort and for supporting the library; it's really useful!

@vladkens
Owner

Hi, @0xGhostCasper.

Do you run it from a single Python process or from many? SQLite allows only one writer at a time, so it does not cope well with many processes writing to the same database file.
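
To make the failure mode concrete: several connections can open the same SQLite file, but only one of them can hold the write lock at a time. Below is a minimal sketch using only the standard `sqlite3` module. It is not twscrape's code; the `open_conn` and `demo` helpers, the table layout, and the zero busy timeout are assumptions chosen to reproduce the error immediately.

```python
import os
import sqlite3
import tempfile


def open_conn(path: str, busy_timeout_s: float) -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below control the transactions.
    conn = sqlite3.connect(path, timeout=busy_timeout_s, isolation_level=None)
    # WAL lets readers proceed during a write, but still allows one writer only.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def demo() -> str:
    """Return the error text a second writer sees while the first holds the lock."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # A timeout of 0 means "do not wait for the lock", which surfaces the error at once.
    writer_a = open_conn(path, busy_timeout_s=0)
    writer_b = open_conn(path, busy_timeout_s=0)
    writer_a.execute("CREATE TABLE accounts (username TEXT, locks TEXT)")

    writer_a.execute("BEGIN IMMEDIATE")  # writer A takes the write lock
    writer_a.execute("INSERT INTO accounts VALUES ('a', NULL)")
    try:
        writer_b.execute("BEGIN IMMEDIATE")  # writer B asks for the same lock
        return "no error"
    except sqlite3.OperationalError as e:
        return str(e)
    finally:
        writer_a.execute("COMMIT")
        writer_a.close()
        writer_b.close()


if __name__ == "__main__":
    print(demo())
```

Raising the `timeout` argument makes the second writer wait for the lock instead of failing immediately, which can help under moderate contention but only postpones the error when hundreds of processes write at once.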

twscrape wraps each database operation in lock_retry, but that is not a real solution for high concurrency.
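
For readers unfamiliar with the pattern, a retry wrapper of this kind could look roughly like the sketch below. This is an illustration under stated assumptions, not the actual `lock_retry` from twscrape: the name `retry_on_locked`, its parameters, and the backoff policy are all hypothetical.

```python
import asyncio
import functools
import random
import sqlite3


def retry_on_locked(max_retries: int = 5, base_delay: float = 0.05):
    """Retry an async database call when SQLite reports 'database is locked'."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except sqlite3.OperationalError as e:
                    is_lock_error = "database is locked" in str(e)
                    is_last_attempt = attempt == max_retries - 1
                    # Re-raise unrelated errors, and give up after the last attempt.
                    if not is_lock_error or is_last_attempt:
                        raise
                    # Exponential backoff with jitter, so that competing
                    # processes do not all retry at the same moment.
                    delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
                    await asyncio.sleep(delay)

        return wrapper

    return decorator
```

Retries like this smooth over short bursts of contention. With hundreds of writer processes the retry budget is eventually exhausted and the original `OperationalError` propagates, which matches the traceback in this issue.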
