- Out-of-order stores to different cache lines
- All stores to the same cache line are sequentially consistent
- That is, they are written into the line immediately.
- Trade-off for using a granularity of cache lines
- PMDK's pmat has granularity of individual stores
- Bookkeeping for individual stores requires far more time and space (linear)
- Bookkeeping for cache lines requires only a constant amount of time and space.
- Might be possible to use a hybrid approach that tracks both individual stores and cache lines
- Just attach a set of store metadata to each individual cache line
- All stores to the same cache line are sequentially consistent
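The cache-line granularity described above can be sketched as a fixed-size record per line. This is only an illustration under stated assumptions: the 64-byte line size, the `CacheLine` struct, and the `line_base` and `record_store` helpers are hypothetical names, not part of Valgrind or PMDK. The point is that the record stays the same size no matter how many stores land in the line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE_SIZE 64u /* assumed line size in bytes */

/* Hypothetical per-line record: constant size regardless of store count. */
typedef struct {
    uintptr_t base;                  /* address of the first byte of the line */
    uint64_t  dirty;                 /* one bit per byte: 1 = not yet written back */
    uint8_t   data[CACHE_LINE_SIZE]; /* most recent value of every byte */
} CacheLine;

/* Round an address down to the start of its cache line. */
static uintptr_t line_base(uintptr_t addr) {
    return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/* Apply a store that lies entirely inside `line`.
   Returns 0 on success, -1 if the store does not fit in this line. */
static int record_store(CacheLine *line, uintptr_t addr,
                        const void *src, size_t size) {
    uintptr_t off = addr - line->base;
    if (line_base(addr) != line->base || size == 0 ||
        off + size > CACHE_LINE_SIZE)
        return -1;
    /* Stores to one line take effect immediately and in program order. */
    memcpy(line->data + off, src, size);
    uint64_t mask = (size == CACHE_LINE_SIZE)
                        ? ~(uint64_t)0
                        : ((((uint64_t)1 << size) - 1) << off);
    line->dirty |= mask;
    return 0;
}
```

A second store to the same bytes overwrites the data and leaves the dirty mask unchanged, which is why the per-line cost is constant. Tracking each store separately would need one record per store.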
- Aligned pointers declared to the Valgrind host as being 'persistent' get their own shadow heap
- Shadow heap represents a file that is written to in a way that simulates the out-of-order nature of the CPU
- Due to constructs such as `mmap` and `shm*` being unavailable in Valgrind, we use `lseek` and `write`
- Valgrind builds without linking in standard libraries, likely for portability reasons
- Makes many assumptions that would 'break' Valgrind if circumvented without non-trivial effort to remedy
- Replicated Client State that is written to file is used by a child process
- Valgrind does offer the ability to `fork` and then `execv` a verification process
- Verification will be provided the file name, and is free to `mmap` it into memory as it pleases to verify.
- When it is time to verify, a 'fork' of the file must be created by copying it into a newer file
- Older file is given to the verification process, which can then begin immediately.
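Writing a cache line back to the shadow-heap file with `lseek` and `write` might look like the sketch below. It uses the plain POSIX calls so that it stands alone. A real Valgrind tool cannot link libc and would have to go through Valgrind's own file wrappers instead. The helper name `shadow_write` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes at byte offset `off` of the shadow-heap file.
   Returns 0 on success, -1 on any error. */
static int shadow_write(int fd, off_t off, const void *buf, size_t len) {
    const uint8_t *p = buf;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)              /* treat errors and zero-length writes as failure */
            return -1;
        p += n;                  /* continue after a partial write */
        len -= (size_t)n;
    }
    return 0;
}
```

Because each write-back seeks to the line's own offset, lines can reach the file in any order, which is what lets the file mimic out-of-order write-back.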
- Stores to persistent memory are checked at runtime via Valgrind instrumentation
- Instrumentation occurs the first time a portion of code is about to be executed, i.e. Just-In-Time
- Instrumentation allows certain 'hooks' to be called whenever some event occurs
- Unlike static analysis tools, Valgrind can intercept calls to all code, including dynamic libraries like glibc
- Stores are managed at the granularity of cache lines...
- Mapping of cache line of store to a data structure representing cache line...
- Cache line keeps a list of store metadata that includes line number, function name, etc.
- Enables tracing the origin of a store that did not persist after a program crash.
- Each cache line also keeps a bitmap of 'dirty bits' that determine what should and should not be written back...
- A flush of a cache line places that cache line in a write-buffer
- Write-buffer is randomly flushed when full, to simulate out-of-order write-back of cache lines
- Each write buffer entry keeps track of the thread id and origin of flush
- Fences will flush all cache lines for a particular thread
- Ensures that all write-buffer entries are written back for that thread
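The flush and fence behaviour above can be sketched as a small write-buffer. Everything here is an assumption made for illustration: the capacity of 4, the `WbEntry` and `WriteBuffer` types, the `write_back` hook, and the reading of "randomly flushed when full" as "drain every entry in random order".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define WB_CAPACITY 4 /* assumed; kept tiny for illustration */

typedef struct {
    uintptr_t line_base; /* which cache line was flushed */
    int       tid;       /* thread that issued the flush */
} WbEntry;

typedef struct {
    WbEntry entries[WB_CAPACITY];
    int     count;
    void  (*write_back)(const WbEntry *); /* hypothetical write-back hook */
} WriteBuffer;

/* Example hook that only counts write-backs. */
static int written_back;
static void count_write_back(const WbEntry *e) { (void)e; written_back++; }

/* Write entry i back, then fill its slot with the last entry. */
static void wb_evict(WriteBuffer *wb, int i) {
    wb->write_back(&wb->entries[i]);
    wb->entries[i] = wb->entries[--wb->count];
}

/* Flush: queue the line. A full buffer is first drained in random order. */
static void wb_flush_line(WriteBuffer *wb, uintptr_t line_base, int tid) {
    if (wb->count == WB_CAPACITY)
        while (wb->count > 0)
            wb_evict(wb, rand() % wb->count);
    wb->entries[wb->count++] = (WbEntry){ line_base, tid };
}

/* Fence: write back every pending entry that belongs to thread `tid`. */
static void wb_fence(WriteBuffer *wb, int tid) {
    for (int i = 0; i < wb->count; ) {
        if (wb->entries[i].tid == tid)
            wb_evict(wb, i); /* slot i now holds a different entry; recheck it */
        else
            i++;
    }
}
```

Entries from other threads stay in the buffer after a fence, so a flush that never meets a fence on its own thread remains pending and can be reported as a leak.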
- A process is created based on user arguments and is passed a file name
- This file can then be `mmap`'d and verified
- Not as fast as actual persistent memory, but necessary compromise
- Multiple verification processes can be performed concurrently.
- Given a system with N threads, at most N - 2 verification processes will occur concurrently
- Thread serialization in Valgrind makes it 'not so bad' to do so since only one thread runs at any given time
- Child will keep an array of `(pid, fileName)` associated with the currently running grandchildren and their files
- When a grandchild finishes, can collect return value for verification; if verification failed, keep the bad `fileName` so it can be analyzed
- Child will poll on pipe for data, and while not busy, will check results of verification
- Further requests for verification are queued up for later
- Saving all files provides option to attempt 'recovery' from each individual file, to further test verification.
- IDEA: Experiment with `cp --reflink=auto` to implement a Copy-on-Write scheme
- Verification should be called based on a combination of the time since last verification and random chance...
- A static random chance triggers verification far too often
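One way to combine elapsed time with random chance is to let the probability of verifying grow with the time since the last verification. The linear ramp, the `interval_ms` parameter, and both function names below are hypothetical choices for illustration, not a policy taken from these notes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy: the chance of verifying rises linearly with the time
   since the last verification and reaches 1 after `interval_ms`. */
static double verify_probability(double elapsed_ms, double interval_ms) {
    if (elapsed_ms <= 0.0)
        return 0.0;
    if (elapsed_ms >= interval_ms)
        return 1.0;
    return elapsed_ms / interval_ms;
}

/* `draw` is a uniform random number in [0, 1),
   for example rand() / (RAND_MAX + 1.0). */
static bool should_verify(double elapsed_ms, double interval_ms, double draw) {
    return draw < verify_probability(elapsed_ms, interval_ms);
}
```

Right after a verification the probability is near zero, so back-to-back verifications become rare, while a long gap makes the next one almost certain. Passing the random draw in as an argument keeps the policy deterministic to test.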
- When a verification process fails, we do the following...
- Do not delete the faulty binary file that is the 'shadow heap'...
- Associate with the prefix of the faulty binary file, the state of...
- The cache, including the location of all stores that have not yet been written back
- The write-buffer, including the location of all flushes and affiliated stores...
- Perhaps the state of all the other threads when the power-failure occurred?
- Should be possible to determine "where", "when", and let the programmer infer "why"
- A `store` at `program:L126` did not persist (show value written)
- A `flush` at `program:L127` did not reach a `fence` and was not written back (show flush and store info)
- Showing the time may be helpful, as it can identify how long this leak has existed (microseconds, that's okay... minutes? That's bad!)
- At program exit...
- Print out all leaked cache lines, as well as leaked stores
- Cache and write-buffer entries will store only their last pending `ExeContext`
- Coalescing of multiple cache-entries needed to prevent having a ton of duplicate information.
- Maintain only the parent; parent writes directly to a file; forks to spawn child process to handle verification on file
- Experiment with `fork` to create a child with current write-buffer and cache
- Child will `fork` to create grandchild verification process and will monitor it
- Child still has access to write-buffer and cache and so can handle reporting errors
- Parent has to handle creating copy of 'file' and updating it, but is more natural and intuitive
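The spawn-and-monitor step can be sketched with plain POSIX `fork`, `execv`, and `waitpid`. A real Valgrind tool would use Valgrind's own wrappers rather than libc. The `Verification` record mirrors the `(pid, fileName)` pair mentioned earlier. The function names, and the convention that exit status 0 means the file verified, are assumptions.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical (pid, fileName) record kept for each running verifier. */
typedef struct {
    pid_t       pid;
    const char *file_name;
} Verification;

/* Start `verifier_path` with the shadow-heap file name as its only argument.
   Returns 0 on success, -1 if fork failed. */
static int spawn_verifier(Verification *v, const char *verifier_path,
                          const char *file_name) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *const argv[] = { (char *)verifier_path, (char *)file_name, NULL };
        execv(verifier_path, argv);
        _exit(127); /* only reached if execv itself failed */
    }
    v->pid = pid;
    v->file_name = file_name;
    return 0;
}

/* Non-blocking check. Returns -1 while the verifier is still running (or on
   error), otherwise its exit status; assumed convention: 0 = verified. */
static int poll_verifier(const Verification *v) {
    int status;
    if (waitpid(v->pid, &status, WNOHANG) != v->pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128;
}
```

Polling with `WNOHANG` lets the monitoring process keep servicing its pipe between checks. When a verifier returns non-zero, `file_name` is still on hand so the faulty file can be kept for analysis.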