Exclusive lock during migration #366
Replies: 10 comments
-
This is a good feature request; I've discussed it with others in person too. Happy to accept a PR adding this.
-
Sorry, but I need to pass; I have zero knowledge of Golang :)
-
No problem, I think someone else is already working on this. If not, I will keep it on the backlog.
-
I miss this also. Is somebody working on it already? Otherwise I'd be happy to give it a go (pun intended, hehe). If so, I see a few paths that could be taken, and I'd just like to check your opinion before moving forward :)

Firstly, do you think this would be better as a default or as an opt-in (e.g. behind a command-line flag)?

An approach I'm thinking of is two-phase locking based on hostname, where one host acquires the lock, runs pending migrations, and releases the lock again, so the next host can do the same after waiting for the first lock to be released. In the case of a deadlock, this makes it easy to introspect the database and find which host caused it, as opposed to using something like one-time generated ids. Possibly one could make a hybrid, for the case of someone running concurrent migrations on the same host? Or maybe that's just over-engineering it? There's a rough sketch of the acquire/release cycle below.

Finally, I'm thinking about how to solve testing. This feature would need two dbmate instances running synchronised if it should be tested against a real database, I guess? So maybe we could add another testing phase in Docker that tests this particular feature, in addition to some unit tests.

What are your thoughts on this? Cheers!
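To make that concrete, here is a minimal, hypothetical sketch of the two-phase cycle against a Postgres lock table. None of this is dbmate code; the `schema_migrations_lock` table and its columns are illustrative assumptions.

```go
package main

import (
	"database/sql"
	"os"
	"time"
)

// acquireLock inserts this host's name into a single-row lock table,
// retrying until the current holder deletes its row.
func acquireLock(db *sql.DB) error {
	host, err := os.Hostname()
	if err != nil {
		return err
	}
	for {
		// A primary-key constraint on id guarantees at most one holder;
		// the INSERT fails while another host holds the lock.
		_, err := db.Exec(
			`INSERT INTO schema_migrations_lock (id, hostname, locked_at)
			 VALUES (1, $1, now())`, host)
		if err == nil {
			return nil
		}
		time.Sleep(time.Second) // lock held elsewhere (or transient error); retry
	}
}

// releaseLock removes our row so the next waiting host can proceed.
func releaseLock(db *sql.DB) error {
	host, err := os.Hostname()
	if err != nil {
		return err
	}
	_, err = db.Exec(
		`DELETE FROM schema_migrations_lock WHERE id = 1 AND hostname = $1`,
		host)
	return err
}
```

Storing the hostname in the row is what makes a stuck lock easy to introspect: a `SELECT` on the table immediately shows which host is holding it.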
-
Hey, contribution sounds like a great idea! I think it would be fine to make this behavior the default. I can't even think of a good reason to support opt-out: it should be completely transparent to people unless you happen to try to run multiple dbmates at the same time. Let me know if you can think of a good reason to expose this to users, but my initial reaction is that it should just be default behavior to prevent concurrent migrations.

For the implementation: I think it's going to be database specific. I'll talk about Postgres because that's what I'm most familiar with. There are a couple of primitives you could look into, such as advisory locks (`pg_advisory_lock`) or an explicit `LOCK TABLE`.

I don't follow your question about naming the locks based on hostname. I would think you can just grab a lock under a single global static key, since migrations should never run concurrently no matter which host triggers them. Since the above are Postgres-specific features, you would need to see what options exist for SQLite and MySQL, but I assume similar features are available.

Testing: the highest priority would be some unit tests to ensure the lock is acquired and released when expected. There's a separate question about having integration tests which actually exercise dbmate concurrently to check for issues. I think this would be nice to have, but it's probably something along the lines of a script which can be run manually to look for concurrency errors. Long term it would also be nice to have a suite of integration tests which work against a dbmate binary and cover all functionality, but we don't have that yet, so I wouldn't worry as much about solving this for the initial implementation.
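For illustration, a hedged sketch of the advisory-lock idea in Go, assuming Postgres and the `lib/pq` driver. The key `366` is an arbitrary constant chosen here, not anything dbmate defines; any fixed bigint works as long as every instance uses the same one.

```go
package main

import (
	"context"
	"database/sql"

	_ "github.com/lib/pq" // Postgres driver, chosen for the example
)

// migrationLockKey is an arbitrary static key shared by all migrators.
const migrationLockKey = 366

// withMigrationLock runs fn while holding a session-level advisory lock,
// so concurrent migrators block until the current one finishes.
func withMigrationLock(ctx context.Context, db *sql.DB, fn func() error) error {
	// Pin one connection: session-level advisory locks belong to the
	// connection that acquired them, and *sql.DB is a pool.
	conn, err := db.Conn(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	// pg_advisory_lock blocks until the lock becomes available.
	if _, err := conn.ExecContext(ctx,
		`SELECT pg_advisory_lock($1)`, migrationLockKey); err != nil {
		return err
	}
	defer conn.ExecContext(ctx,
		`SELECT pg_advisory_unlock($1)`, migrationLockKey)

	return fn()
}
```

One nice property: Postgres releases a session-level advisory lock automatically if the holding connection dies, which sidesteps the stale-lock cleanup a table-based approach has to handle explicitly.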
-
Sounds great! Sorry for the lack of clarity; I was thinking of building the lock mechanism around a dedicated lock table that records the hostname of the current holder (illustrative DDL below). As for testing: alright, amazing!
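Hypothetical DDL for such a table, matching the earlier sketch; the names are assumptions, not dbmate's schema. The fixed-id primary key is what makes the `INSERT` act as a mutex.

```go
package main

// createLockTable: CHECK (id = 1) plus the primary key means the table
// can hold at most one row, i.e. one lock holder at a time.
const createLockTable = `
CREATE TABLE IF NOT EXISTS schema_migrations_lock (
    id        integer PRIMARY KEY CHECK (id = 1),
    hostname  text NOT NULL,
    locked_at timestamptz NOT NULL DEFAULT now()
)`
```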
-
Would this project be open to a PR that adds this?
-
I would love to see this feature. A couple of ideas to add:

If you go with a custom lock-management setup, you probably want some form of timeout tracking, because the process running the migrations could die, lose internet connectivity, or who knows what else. Other concurrent processes could then see that the lock has been held beyond some reasonable (configurable?) timeout and take over the lock.

Instead of the hostname, which should be unique but might not be due to misconfiguration, just have each process generate a UUID. It doesn't actually matter what the id is, as long as it's unique throughout the life of the process.

To avoid a specific lock table, though, you could give each migration a status. So you have:

- "running": a process owns the migration and is currently executing it
- "done": the migration completed successfully

When a second process connects, it should query for any migrations that are not in the "done" state. If there are migrations in the "running" state, the process should wait for them to finish. Once all running migrations are complete, the process should acquire ownership of any new migrations that it is aware of but that are not yet present in the database, and run those. Additionally, if there are migrations in the running state that are not known to the second process, and all migrations known to the second process are done, it should consider its migration job done. A sketch of this claim/complete flow follows below.
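A hedged sketch of that flow. The `status`, `owner`, and timestamp columns are assumptions for illustration; dbmate's real `schema_migrations` table stores only versions.

```go
package main

import (
	"database/sql"

	"github.com/google/uuid"
)

// processID is unique for the life of this process, as suggested above.
var processID = uuid.NewString()

// claimMigration atomically takes ownership of a migration. It returns
// false if another process already claimed (or finished) that version.
func claimMigration(db *sql.DB, version string) (bool, error) {
	res, err := db.Exec(`
		INSERT INTO schema_migrations (version, status, owner, started_at)
		VALUES ($1, 'running', $2, now())
		ON CONFLICT (version) DO NOTHING`, version, processID)
	if err != nil {
		return false, err
	}
	n, err := res.RowsAffected()
	return n == 1, err
}

// markDone records success; a separate reaper could reset rows stuck in
// 'running' past a configurable timeout, per the takeover idea above.
func markDone(db *sql.DB, version string) error {
	_, err := db.Exec(`
		UPDATE schema_migrations
		SET status = 'done', finished_at = now()
		WHERE version = $1 AND owner = $2`, version, processID)
	return err
}
```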
-
Would also love to see this. I don't have a ton of input on the proposed approaches, but I'd like to ask about ways to support this when using dbmate as a library; a couple of things might be needed to make that nice. One stopgap is acquiring the lock yourself around the library's migrate call (hedged sketch below).

I'm not sure this is best, and I would still prefer it being built in, but it might be a quicker way to achieve this while we work through the implementation details. Thoughts?
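A sketch of that stopgap, assuming dbmate's v2 Go package layout and reusing the `withMigrationLock` helper sketched earlier in this thread.

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"net/url"

	"github.com/amacneil/dbmate/v2/pkg/dbmate"
	_ "github.com/amacneil/dbmate/v2/pkg/driver/postgres"
	_ "github.com/lib/pq"
)

func main() {
	dsn := "postgres://user:pass@localhost:5432/app?sslmode=disable"

	u, err := url.Parse(dsn)
	if err != nil {
		log.Fatal(err)
	}
	migrator := dbmate.New(u)

	// Separate connection used only for the advisory lock.
	lockDB, err := sql.Open("postgres", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer lockDB.Close()

	// Hold the lock across the whole migration run.
	if err := withMigrationLock(context.Background(), lockDB, migrator.Migrate); err != nil {
		log.Fatal(err)
	}
}
```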
-
We have solved this for our use case in Postgres with advisory locks: #596
-
Hi,

as far as I can tell, there is currently no explicit lock applied during migrations. That would mean it is dangerous to run migrations in a concurrent setting, e.g. when several instances of some API server embedding dbmate are running.

It would be advisable to obtain an exclusive lock on the migrations table before migrating.
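For Postgres, the simplest form of this proposal might look like the sketch below: take an exclusive table lock inside the migration transaction so a second concurrent migrator blocks until the first commits. This is an illustration rather than dbmate's implementation; `schema_migrations` is dbmate's default table name.

```go
package main

import "database/sql"

// migrateExclusively runs migrations inside a transaction that holds an
// exclusive lock on the migrations table.
func migrateExclusively(db *sql.DB, run func(*sql.Tx) error) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit succeeds

	// Blocks every other accessor of the table, including a second
	// migrator taking the same lock, until commit or rollback.
	if _, err := tx.Exec(
		`LOCK TABLE schema_migrations IN ACCESS EXCLUSIVE MODE`); err != nil {
		return err
	}
	if err := run(tx); err != nil {
		return err
	}
	return tx.Commit()
}
```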