Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race during insertion of key/values #23

Open
AKKamath opened this issue Aug 11, 2021 · 8 comments
Open

Race during insertion of key/values #23

AKKamath opened this issue Aug 11, 2021 · 8 comments

Comments

@AKKamath
Copy link

I noticed that there were races in the code.
For example, in line 45 of src/concurrent_map/warp:

uint32_t src_unit_data = (next == SlabHashT::A_INDEX_POINTER)
? *(getPointerFromBucket(src_bucket, laneId))
: *(getPointerFromSlab(next, laneId));

Here a weak load operation is used to read data from the bucket.
Later on (line 65/line 83), an atomicCAS is performed to the same location.

Different threads may read and atomicCAS on the same location, creating a race.

@jowens
Copy link

jowens commented Aug 24, 2021

@sashkiani @maawad

@jowens
Copy link

jowens commented Aug 24, 2021

@AKKamath did you see other races of this type? Thank you for this report!

@AKKamath
Copy link
Author

This seems repeated in many locations.
For example src/concurrent_set/cset_warp_operations.cuh has a similar code snippet at lines 41 - 43.

@maawad
Copy link
Member

maawad commented Aug 24, 2021

Hi @AKKamath, although the bucket read operation doesn't perform any synchronization, the atomicCAS operation (where we perform the write) will only succeed if the pointer points to an EMPTY_PAIR_64 (i.e., it will fail if we read an empty value, but before the atomicCAS operation a different thread modified that memory location). If different threads try to perform a CAS operation, only one thread will succeed.

Could you explain what scenario would produce a race condition?

I'm also curious about how you found these multiple race conditions.

@AKKamath
Copy link
Author

Since the read operation is "weak" it has no consistency guarantees.
While I don't see any obvious race conditions, there has been prior work that shows data races in CUDA code breaks assumptions.
In the same work, they change a weak store to an atomic operation as it was racing with an atomicCAS.

A simple fix would be changing the weak loads to a "strong" operation by typecasting them to volatile, or using an atomic operation instead, e.g. atomicAdd(getPointerFromBucket(src_bucket, laneId), 0)

@maawad
Copy link
Member

maawad commented Aug 24, 2021

Thanks for sharing that paper. I read it before.
But as I mentioned, even though the read operation is weak, the operation that matters is the atomic CAS operation.

@AKKamath
Copy link
Author

So by definition (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#conflict-and-data-races), there is a data race between the weak load and atomic.
I guess it could be a benign data race in this case.

@jowens
Copy link

jowens commented Aug 28, 2021

I continue to be interested in getting to the bottom of this. @maawad is there a data race / is it benign, or is this not a race?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants