Postgresのレースコンディションを「同期バリア」で攻略する：再現困難なバグを確実に潰すテスト手法

lirbank

約19時間前

ディスカッション (8件)

lirbankOP👍 54

約19時間前

データベースの並行処理で発生する「レースコンディション（競合状態）」は、実行タイミングがわずかにズレるだけで発生しなくなるため、再現が非常に困難な厄介なバグです。この記事では、Postgresにおいて「同期バリア（Synchronization Barriers）」を活用し、複数のスレッドやプロセスの実行を特定のポイントで意図的に待ち合わせることで、競合状態を確実にシミュレーションしてテストする高度なデバッグ手法について解説します。

🔗 リンク先:https://www.lirbank.com/harnessing-postgres-race-conditions

haliliceylan

約18時間前

Thats not postgresql problem, thats your code

IMHO you should never write code like that, you can either do
UPDATE employees
SET salary = salary + 500
WHERE employee_id = 101;

Or if its more complex just use STORED PROCEDURE, there is no point of using database if you gonna do all transactional things in js

scottlamb

約17時間前

It'd be interesting to see a version of this that tries all the different interleavings of PostgreSQL operations between the two (or N) tasks. https://crates.io/crates/loom (https://crates.io/crates/loom) does something like this for Rust code that uses synchronization primitives.

anvevoice

約17時間前

One pattern I've found useful alongside this: Postgres advisory locks (pg_advisory_xact_lock) for cases where the contention isn't row-level but logic-level. For example, if two requests try to create the "first" resource of a type - there's no existing row to SELECT FOR UPDATE against.

Advisory locks let you serialize on an arbitrary key (like a hash of the entity type + parent ID) without needing a dummy row or separate lock table. They auto-release on transaction end, so no cleanup.

The barrier testing approach from the article would work nicely here too - inject the barrier between acquiring the advisory lock and the subsequent insert, then verify the second transaction blocks until the first commits.

HackerThemAll

約17時間前

Javascript developers learn kindergarten basics of transactions and SQL. LOL. Is it the camp "we don't need a degree to be programmers"?

throwaway2ge5hg

約17時間前

Postgres has SERIALIZABLE transaction isolation level. Just use it and then you never have to worry about any of these race conditions.

And if for some reason you refuse to, then this "barrier" or "hooks" approach to testing will in practice not help. It requires you to already know the potential race conditions, but if you are already aware of them then you will already write your code to avoid them. It is the non-obvious race conditions that should scare you.

To find these, you should use randomized testing that runs many iterations of different interleavings of transaction steps. You can build such a framework that will hook directly into your individual DB query calls. Then you don't have to add any "hooks" at all.

But even that won't find all race condition bugs, because it is possible to have race conditions surface even within a single database query.

You really should just use SERIALIZABLE and save yourself all the hassle and effort and spending hours writing all these tests.

piskov

約15時間前

That whole article should have been:

Use transactions table (just a name, like orders)

On it have an Insert trigger.

It should make a single update with simple “update … set balance += amount where accoundId = id”. This will be atomic thanks to db engine itself.

Also add check constraint >= 0 for balance so it would never become negative even if you have thousands of simultaneous payments. If it becomes negative, it will throw, insert trigger will rethrow, no insert will happen, your backend code will catch it.

—

That’s it: insert-trigger and check constraint.

No need for explicit locking, no stored procedures, no locks in you backend also, nada. Just a simple insert row. No matter the load and concurrent users it will work like magic. Blazingly fast too.

That’s why there is ACID in DBs.

—

Shameless plug: learn your tool. Don’t approach Postgresql/Mssql/whathaveyousql like you’re a backend engineer. DB is not a txt file.

jijji

約15時間前

to avoid these conditions i have usually inserted a row into a lock table used for this purpose to create a lock with a unique key for that row with a few minute timer, the once the transaction is complete it will delete the lock row. This way, simultaneous users will only get the first lock, all other requests would fail, and then if the timer expired, we would assume the transaction never completed and it could try again after a few minutes