Stack Brainstorming

Daniel Reeves edited this page Jun 7, 2020 · 4 revisions

Task queue

What it is: Runs tasks asynchronously, independent of the code that triggered them. For example, refreshing a page can trigger a task that runs in the background.

Why we need it: In the very long run, we want our database to do one of two things: (1) run once an hour and update the model, or (2) run up to once an hour, updating whenever someone refreshes the web page. Both designs, but especially the first, would benefit from a task queue. The first design benefits because a task queue lets us set cron jobs; the second design benefits because a task queue handles the case where someone refreshes the page multiple times before results are cached (i.e. the task runs only once).
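To make the second design concrete, here is a minimal in-process sketch of the "only run the task once" behavior, using only the standard library. It is not how any of the frameworks below implement it; the class name `DedupedTask`, its `request` method, and the one-hour TTL constant are all hypothetical names chosen for illustration.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor

CACHE_TTL_SECONDS = 3600  # assumption: "up to once an hour"


class DedupedTask:
    """Run `func` at most once per TTL; callers share a pending or fresh run."""

    def __init__(self, func, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._func = func
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._future = None       # most recent run, pending or finished
        self._finished_at = None  # completion time of the most recent run

    def request(self) -> Future:
        """Call on every page refresh; returns a Future holding the result."""
        with self._lock:
            if self._future is not None:
                if not self._future.done():
                    return self._future  # a run is already in flight
                fresh = (
                    self._finished_at is not None
                    and self._clock() - self._finished_at < self._ttl
                )
                if fresh:
                    return self._future  # cached result is still valid
            # No run yet, the last run failed, or the cache expired.
            self._finished_at = None
            self._future = self._executor.submit(self._run)
            return self._future

    def _run(self):
        result = self._func()
        # Recorded only on success, so a failed run is retried next time.
        self._finished_at = self._clock()
        return result
```

Five rapid calls to `request()` would all receive the same `Future`, so the wrapped function runs once. A real task queue would additionally move the work to a separate worker process and persist state in a broker such as Redis.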

Comparison of task queue frameworks:

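|                        | Celery                      | Huey                  | RQ                 | Dramatiq                                          |
|------------------------|-----------------------------|-----------------------|--------------------|---------------------------------------------------|
| Windows support        | OK                          | Good                  | Bad                | OK to Good                                        |
| Unix support           | Good                        | Good                  | Good               | Good                                              |
| API                    | Decorators                  | Decorators            | Procedural calls   | Decorators                                        |
| Resource intensiveness | Medium                      | Low                   | Low                | Low                                               |
| GitHub stars           | 14.9k                       | 2.9k                  | 6.9k               | 2.0k                                              |
| Native cron scheduling | Yes                         | Yes                   | Yes                | No                                                |
| I/O manager (broker)   | Redis, RabbitMQ, Amazon SQS | Redis, SQLite, memory | Redis              | Redis, RabbitMQ, memory                           |
| Complexity             | High                        | Low                   | Medium             | Low                                               |
| Monitoring UI          | Yes (Flower)                | No                    | Yes (rq-dashboard) | Yes, but not well-maintained (dramatiq_dashboard) |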
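The "API" row distinguishes decorator-style from procedural-style task submission. The toy in-memory queue below sketches the difference. `ToyQueue`, `task`, `delay`, `enqueue`, and `run_pending` are hypothetical names for illustration only; each real framework has its own names and semantics, and real workers run in separate processes.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a task queue; not any real library's API."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, func, *args, **kwargs):
        """Procedural style: pass a plain function plus its arguments."""
        self._pending.append((func, args, kwargs))

    def task(self, func):
        """Decorator style: attach a .delay() method that enqueues func."""
        def delay(*args, **kwargs):
            self.enqueue(func, *args, **kwargs)

        func.delay = delay
        return func

    def run_pending(self):
        """Stand-in for a worker: run everything queued so far, in order."""
        results = []
        while self._pending:
            func, args, kwargs = self._pending.popleft()
            results.append(func(*args, **kwargs))
        return results


queue = ToyQueue()


@queue.task
def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


add.delay(2, 3)                # decorator style: the task knows its queue
queue.enqueue(multiply, 2, 3)  # procedural style: the caller picks the queue
print(queue.run_pending())     # [5, 6]
```

With decorators, the task definition is coupled to the queue at import time; with procedural calls, any plain function can be enqueued without modification.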

Object Relational Mapping ("ORM")

What it is: Translates database tables and rows into Python objects and vice versa.

Why we need it: ORMs handle the logic of connecting your database to your Python code. ORMs let you write code in Python's idiom, as opposed to your database's idiom. That said, we probably don't need to rely too heavily on an ORM. In a sense, Pandas is our "ORM": it is very likely that in most cases we will load the database into Pandas DataFrames and interact with the data that way. There are a few cases where this might not be true, and in any case it doesn't hurt to pass an ORM's database connection through pd.read_sql, which is why we still want to decide on an ORM.
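As a rough illustration of what "Python's idiom" means here, the sketch below maps rows to Python objects by hand using only the standard library. A real ORM generates this mapping (and the SQL) for you. The `predictions` table, the `Prediction` class, and `load_predictions` are hypothetical and not part of our actual schema.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical Python-side view of one row in `predictions`."""
    id: int
    label: str
    score: float


# In-memory SQLite database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions (id INTEGER PRIMARY KEY, label TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO predictions (label, score) VALUES (?, ?)",
    [("a", 0.25), ("b", 0.75)],
)


def load_predictions(connection):
    """Database idiom in, Python idiom out: rows become Prediction objects."""
    rows = connection.execute(
        "SELECT id, label, score FROM predictions ORDER BY id"
    )
    return [Prediction(*row) for row in rows]


predictions = load_predictions(conn)
best = max(predictions, key=lambda p: p.score)  # plain Python from here on
print(best.label)  # b
```

For the DataFrame-centric workflow described above, the same `conn` object could instead be handed to `pd.read_sql` along with a query string.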

Comparison of ORM frameworks:

|              | SQLAlchemy                               | Peewee |
|--------------|------------------------------------------|--------|
| GitHub stars | 3.1k (depends if you count other repos)  | 7.5k   |
| Complexity   | Medium                                   | Low    |

SQL Database

The choices are:

  • SQLite
  • Postgres