barrucadu / dejafu Goto Github PK

Systematic concurrency testing meets Haskell.

Home Page: https://dejafu.docs.barrucadu.co.uk/

Haskell 99.89% Shell 0.11%

dejafu's Introduction

dejafu

[Déjà Fu is] A martial art in which the user's limbs move in time as well as space, […] It is best described as "the feeling that you have been kicked in the head this way before"

-- Terry Pratchett, Thief of Time

Installation
Quick start guide
Why Déjà Fu?
Contributing
Release notes
Questions, feedback, discussion
Bibliography
The website!

Déjà Fu is a unit-testing library for concurrent Haskell programs. Tests are deterministic and expressive, making it easy and convenient to test your threaded code.

Available on GitHub, Hackage, and Stackage.

Features:

An abstraction over the concurrency functionality in IO
Deterministic testing of nondeterministic code
Both complete (slower) and incomplete (faster) modes
A unit-testing-like approach to writing test cases
A property-testing-like approach to comparing stateful operations
Testing of potentially nonterminating programs
Integration with HUnit and tasty

There are a few different packages under the Déjà Fu umbrella:

	Version	Summary
concurrency	1.11.0.3	Typeclasses, functions, and data types for concurrency and STM.
dejafu	2.4.0.5	Systematic testing for Haskell concurrency.
hunit-dejafu	2.0.0.6	Deja Fu support for the HUnit test framework.
tasty-dejafu	2.1.0.1	Deja Fu support for the Tasty test framework.

See the latest package documentation.

Each package has its own README and CHANGELOG in its subdirectory.

There is also dejafu-tests, the test suite for dejafu. This is in a separate package due to Cabal being bad with test suite transitive dependencies.

Installation

Install from Hackage globally:

$ cabal install dejafu

Or add it to your cabal file:

build-depends: ...
             , dejafu

Or to your package.yaml:

dependencies:
  ...
  - dejafu

Quick start guide

Déjà Fu supports unit testing, and comes with a helper function called autocheck to look for some common issues. Let's see it in action:

import Control.Concurrent.Classy

myFunction :: MonadConc m => m String
myFunction = do
  var <- newEmptyMVar
  fork (putMVar var "hello")
  fork (putMVar var "world")
  readMVar var

That MonadConc is a typeclass abstraction over concurrency, but we'll get onto that shortly. First, the result of testing:

> autocheck myFunction
[pass] Never Deadlocks
[pass] No Exceptions
[fail] Consistent Result
    "hello" S0----S1--S0--

    "world" S0----S2--S0--
False

There are no deadlocks or uncaught exceptions, which is good; but the program is (as you probably spotted) nondeterministic!

Along with each result, Déjà Fu gives us a representative execution trace in an abbreviated form. Sn means that thread n started executing, and Pn means that thread n pre-empted the previously running thread.

Why Déjà Fu?

Testing concurrent programs is difficult, because in general they are nondeterministic. This leads to people using work-arounds like running their testsuite many thousands of times; or running their testsuite while putting their machine under heavy load.

These approaches are inadequate for a few reasons:

How many runs is enough? When you are just hopping to spot a bug by coincidence, how do you know to stop?
How do you know if you've fixed a bug you saw previously? Because the scheduler is a black box, you don't know if the previously buggy schedule has been re-run.
You won't get that much scheduling variety! Operating systems and language runtimes like to run threads for long periods of time, which reduces the variety you get (and so drives up the number of runs you need).

Déjà Fu addresses these points by offering complete testing. You can run a test case and be guaranteed to find all results with some bounds. These bounds can be configured, or even disabled! The underlying approach used is smarter than merely trying all possible executions, and will in general explore the state-space quickly.

If your test case is just too big for complete testing, there is also a random scheduling mode, which is necessarily incomplete. However, Déjà Fu will tend to produce much more schedule variety than just running your test case in IO the same number of times, and so bugs will tend to crop up sooner. Furthermore, as you get execution traces out, you can be certain that a bug has been fixed by simply following the trace by eye.

Contributing

See the "contributing" page on the website.

If you'd like to get involved with Déjà Fu, check out the "good first issue" label on the issue tracker.

Release notes

See the CHANGELOG.markdown file.

Questions, feedback, discussion

For general help talk to me in IRC (barrucadu in #haskell) or shoot me an email ([email protected])
For bugs, issues, or requests, please file an issue.

Bibliography

These libraries wouldn't be possible without prior research, which I mention in the documentation. Haddock comments get the full citation, whereas in-line comments just get the shortened name:

[BPOR] Bounded partial-order reduction, K. Coons, M. Musuvathi, and K. McKinley (2013)
[RDPOR] Dynamic Partial Order Reduction for Relaxed Memory Models, N. Zhang, M. Kusano, and C. Wang (2015)
[Empirical] Concurrency Testing Using Schedule Bounding: an Empirical Study, P. Thompson, A. Donaldson, and A. Betts (2014)
[RMMVerification] On the Verification of Programs on Relaxed Memory Models, A. Linden (2014)

There are also a few papers on dejafu itself:

Déjà Fu: A Concurrency Testing Library for Haskell, M. Walker and C. Runciman (2015)

This details dejafu-0.1, and was presented at the 2015 Haskell Symposium.
Déjà Fu: A Concurrency Testing Library for Haskell, M. Walker and C. Runciman (2016)

This is a more in-depth technical report, written between the dejafu-0.2 and dejafu-0.3 releases.
Revealing Behaviours of Concurrent Functional Programs by Systematic Testing, M. Walker (2018)

This is my Ph.D thesis, which discusses dejafu and my other research projects.

dejafu's People

Contributors

Stargazers

Watchers

Forkers

jwaldmann silky gamazeps platonic-io spacejam gdeest larskuhtz aherrmann qrilka gip dfinity-lab pepeiborra d3v3l0 typechecker jacobgawel nicolast advancedalpaka the-dr-lazy

dejafu's Issues

Rename "take"

"take" steals a name from the Prelude, meaning that to use Control.Monad.Conc either requires hiding take, which is uses quite often, or importing qualified, which is annoying.

Control.Concurrent.MVar uses "takeMVar", could something similar be done? (yes)

Make autocheck' functions take the schedule bounds.

For consistency with dejafu' et al.

Relaxed memory

Relaxed memory is needed for accurately testing primitives like compare-and-swap and modifyIORef (NOT atomicModifyIORef).

As relaxed memory only shows itself in reads, it seems that the best way to do this would be to record a set of possible values for fields, and return one of them on a read. The value returned would also need to be recorded in the trace. When a barrier is encountered, all of these sets would need to be resolved to a consistent state.

Multiple memory models could be available, and affect how these sets change when loads and stores are performed. This would allow keeping the current behaviour by specifying a sequentially-consistent model. Other interesting models are TSO and PSO (and perhaps others?).

Better POR in the presence of exceptions

POR for exceptions is really bad, as throwing an exception to a thread conflicts with everything. This is definitely the worst case, but surely this can be improved upon with knowledge of what types of exceptions a thread is prepared to catch, and its masking state.

For example, throwing an exception to a thread with an uninterruptible mask: it doesn't matter where exactly inside the masked chunk it gets delivered, as the same thing will happen in every case, the throwing thread is blocked and the receiving thread continues as normal.

Remove potentially-harmful type synonyms

Type synonyms which could result in breakage should be newtypes.

Detection of non-global deadlock

Global deadlock is detected when all threads are blocked, but more generally a collection of threads are deadlocked when they are all blocked on the same CVar which no thread outside of the collection has a reference to. This is probably impossible to solve at the level of Haskell, as it's a GC problem.

Add warning to documentation
Maybe have a deadlock-timeout as a parameter to the runner, where if a collection of threads are blocked on the same CVar for longer than it deadlock is assumed?

Tree of executions, rather than list

This allows a potentially huge shortcut to be taken when determining outcome of testing: if an execution with n pre-emptions exhibits a bug, we can skip all child executions of that one on the assumption that they will exhibit the same bug.

Need to see how this behaves in practice, but I expect it will vastly speed up testing where a bug is present.

Automated instrumentation

One of the major design motivators has been keeping the interface similar to what already exists, so it would definitely be good to implement some sort of automated instrumentation of existing code, with warnings printed where automated translation is not possible.

Random, execution-limited, DPOR

Two changes:

Don't pick the first "todo" in findSchedulePrefix, instead take an index and interpret that modulo the size of the (partitioned) possibilities.
Take an optional execution limit. Terminate when this is reached, or when there are no "todo"s remaining.

Implement random scheduling

Like PCT, but worse. However, implementing this with the dpor machinery will at least prevent duplicate schedules from being tried.

Support unbounded DPOR

Combination of bug and enhancement because there are two issues involved here:

cBacktrack (https://github.com/barrucadu/dejafu/blob/master/dejafu/Test/DejaFu/SCT.hs#L176) doesn't work if there are no bounds set; it won't add backtracking points
Need to write a new backtracking function for unbounded DPOR

Can probably also factor out the backtrack-adding code from pbBacktrack, fBacktrack, and lBacktrack to achieve this.

More examples

Currently the only not-immediately-trivial example is the logger, it would be nice to have more (and maybe rename "tests" to "examples", and drop the trivial ones).

Integrate bounding into `bporSched`

Avoid making decisions which would exceed the bound.

Separate test suite package

As stack test dejafu and cabal test dejafu don't work becasue Cabal-the-library is naff (commercialhaskell/stack#1122), there should really be a separate dejafu-tests package to avoid confusing people.

test-framework integration?

As I'm now pulling that in for the test suite so that QuickCheck can be used, it might be nice to provide a test-framework-dejafu package as a first-class citizen, rather than go via HUnit.

Haddock module headers

`IORef` equivalent

Add an equivalent of IORef: non-blocking, always-full, mutable state.

Unfortunately, the documentation states that IORef operations can appear out-of-order on some processor architectures (eg, x86), and testing this would be rather tricky. There are a few solutions:

Ignore the problem completely, and add a warning to the documentation that out-of-order cases just aren't tested for.
Test up to some number of actions offset, and add a warning to the documentation that out-of-order cases may be found (makes testing of IORef-intensive things slower).
Test with full generality, only constraining things by the barrier imposed by atomicModifyIORef (makes testing of IORef-intensive things much slower).

(note: talk to Mike Dodds about this)

Reduce code duplication between `CVar`s and `CTVar`s

CTMVars can be used to implement CVars if the transactions are small enough, so given that the MonadSTM machinery is now in place and working, it seems silly to have a separate CVar implementation. Unfortunately, they can't be dropped entirely, as otherwise we lose the ability to distinguish between CVar deadlock and CTVar deadlock, but that should be able to be resolved by slightly extending the STM blocking code.

`Eq` for `CVar`s.

MVars can be equated, CVars can't (necessarily). It would be nice to make this possible.

Sadly, this doesn't work:

class (Monad m, Eq (CVar m a)) => MonadConc m where
  type CVar m :: * -> *

Because the a isn't in scope. Is there some way to have universally-quantified type variables in constraints?

dejafu-0.1.0.0 release tarball lacks modules required to build the test suite

Citing from http://hydra.cryp.to/build/1128096/log/raw:

Preprocessing test suite 'tests' for dejafu-0.1.0.0...

tests/Tests.hs:6:18:
    Could not find module ‘Tests.Cases’
    Use -v to see a list of the files searched for.

tests/Tests.hs:7:18:
    Could not find module ‘Tests.Logger’
    Use -v to see a list of the files searched for.

STM traces

The STM constructor in ThreadAction could include a trace of what the transaction did. This would probably be very useful for debugging transaction-heavy applications.

Better POR for invisible actions.

The _concMessage action is purely for adding information to the test execution trace and cannot influence the actual results in any way, so we can probably make use of that fact to improve the POR somehow.

Drop `_concKnows` stuff

It's not a great solution to the problem, clutters execution traces, and I've not actually had a case where it was essential yet. It was more useful when I only did schedule bounding, as then anything to prune extra schedules was good, but the current partial-order reduction is good enough to render it unnecessary.

Add missing modules

Every module in the Control.Concurrent hierarchy in base and stm should be available in classy form.

Incompleteness issue with a combination of daemon threads and sleep sets.

This only has one execution:

{-# LANGUAGE FlexibleContexts #-}

module Terminal where

import Control.Monad.Monitor
import Control.Monad.Monitor.Property
import Control.Concurrent.Classy
import Control.Monad

data Event = Waiting Int | Trying Int deriving (Eq, Ord, Show)

prop :: Int -> Property Event
prop i = Exists (globally waiting) `SAnd`
         Exists (finally trying)   `SAnd`
         All (globally waiting `POr` finally trying)
  where
    waiting = Event (Waiting i)
    trying  = Event (Trying i)

server :: (MonadConc m, MonadMonitor Event m) => MVar m Char -> Lock m -> Int -> m ()
server charVar lock i = do
  addProperty (show i) (prop i)
  forever $ do
    -- Wait for input
    char <- withEvent (Waiting i) (takeMVar charVar)

    -- Wait for the lock
    withEvent (Trying i) (takeLock lock)

    -- Print the output
    -- liftIO (putChar char)

    -- Release the lock
    releaseLock lock

testCase :: (MonadConc m, MonadMonitor Event m) => m ()
testCase = do
  charVar <- newEmptyMVar
  lock    <- newLock

  -- Start some servers
  mapM_ (fork . server charVar lock) [0]

  -- Send some input
  mapM_ (putMVar charVar) "a"

type Lock m = MVar m ()

newLock :: MonadConc m => m (Lock m)
newLock = newMVar ()

takeLock :: MonadConc m => Lock m -> m ()
takeLock = takeMVar

releaseLock :: MonadConc m => Lock m -> m ()
releaseLock lock = putMVar lock ()

Surely there are more: the main thread stopping should be preemptable by the server thread. That doesn't exceed any bounds, so it should show up. Need to add tests to dpor to see if the breakage is there, or somewhere in dejafu.

Tutorial

A nice tutorial in the README, showing how to write a thing with (or port a thing to) MonadConc and test with DejaFu.

Unify IO and ST test instances

Now that I'm using a typeclass abstraction for mutable references, all the code to actually run things should be polymorphic over the monad.

This also fixes an issue where I keep needing unsafeCoerce to work around the universal quantification in ST t because I am already doing the runST elsewhere. Everything should be in ST t with the caller doing runST where necessary.

STM abstraction

This is just a fanciful idea, floating in a nice blue sky, but it would be rather nice.

A MonadSTM class and CTVar type.
Every MonadConc has an associated MonadSTM, and can run transactions atomically.
Testing handles getting blocked on STM actions, rolling them back, and reports STM deadlock differently to CVar deadlock.

Why? Well, STM is probably the main go-to for new concurrent Haskell stuff now. It's just so much more convenient and easy than using IORefs or MVars directly, so the natural complaint about DejaFu is "well, this is nice, but what if I'm using atomic transactions in the STM monad instead?"

Remove penultimate `Lift`

This is just a wart.

`yield`

Gives rise to a slight complication when doing pre-emption bounding, as yield lets you have one free pre-emption.

atomic-primops integration

This one is tricky, as it requires changes to atomic-primops to make this possible. Specifically, a type-class abstraction. @rrnewton seems ok with this, and I have a preliminary implementation here, although I'm sure things could be improved.

Ties into #15 very intimately.

rtsSupportsBoundThreads

We should have this operation, and have it return false.

Available thread IDs.

forkIO returns the ID of the thread produced, and there are a few actions on thread IDs (eg, throwing an exception to a thread). So, the MonadConc interface should support this.

This would require adding a ThreadId type to the MonadConc typeclass. Unfortunately, GHC-ThreadIds support Eq, Ord, and Show, but there isn't a way to add constraints to type families. Would this require the use of MPTCs and FunDeps? Seems a bit of a shame as I got rid of those.

Fix "free preemption" after CRef commits.

Currently, for ease of implementation, a context switch after a commit is not counted as a preemption in any circumstances. This is bad, because it means in very relaxed test cases, threads get lots of free preemptions, which is defeating the point of preemption bounding. More seriously, I wonder if this is what introduces the extra results seen in f3457b2 and 29150d0.

The fix is to count the context switch as a preemption if it is not a switch back to the thread which was executing before the commit.

Better POR in the presence of blocking

Currently POR doesn't take blocking into account. For example, if we have the situation where two actions are dependent, but the thread which performs the later action is blocked, then there is no point adding a backtracking point if the former thread is the one which unblocks it, I think.

The Great Renaming

Rename everything to be more in line with the IO variants. This is mostly the case for functions, but the type families could do with some renaming.
Remodularise everything so it's easier to change existing imports.

Exceptions

An uncaught exception only kills its thread, and exceptions can be thrown to a thread, killing it. Furthermore, actions can be registered which are called when a thread is killed.

In DejaFu, however, an exception during testing would kill everything. This should be fixed, and the missing thread operations should be added to MonadConc.

Depends on #3.

(note: read Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell - SPJ)

Avoid re-running computations in `dejafus`

dejafus runs the computation for every test, if it's expensive this takes some time. It should instead run the computation once, producing a lazy list of all results up to the bound, which the predicates are then applied to.

doc request: can dejafu find deadlocks in code that runs `forever`?

E.g., this straightforward dining philosophers implementation: https://github.com/jwaldmann/dejafu/blob/master/dejafu/examples/Diner.hs has

 forM [1..n] $ \ p ->
    (if p < n then void . forkIO else id ) $ forever $ do

and when I run it under dejavu, nothing happens for a long time (for n >= 2), while I would hope the deadlock is detected.

isLocked doesn't work for allKnown

No other runnable threads with a reference to the thing blocked on is too loose. The condition should be: no other threads with a reference to the thing which are not blocked on (a) the same variable, or (b) a thread blocked on the same variable (eg, in throwTo)

Implement PCT

The probabilistic concurrency testing (PCT) algorithm [Burckhardt et al. 2010] is an-
other controlled scheduling technique that repeatedly executes a multithreaded pro-
gram using a randomized priority-based scheduler, where the highest priority thread
that is enabled is scheduled. The initial thread priorities are chosen randomly, as are
d − 1 priority change points. Importantly, the depth of each change point (the number
of execution steps before the change point causing the currently executed thread to be
lowered) is chosen uniformly over the length of the execution. Given a program with
n threads, at most k execution steps and a bug that requires d ordering constraints to
manifest, PCT guarantees that it will find the bug in a single execution with proba-
bility 1/nk d−1 [Burckhardt et al. 2010]. We refer to PCT as being non-systematic (i.e.
non-exhaustive), because the same schedule could be explored more than once and
schedules are not exhaustively enumerated in some order. However, PCT is still a con-
trolled technique: because all scheduling nondeterminism is controlled, the schedule to
be executed is precisely determined by the PCT algorithm and not by the OS scheduler.
Thus, buggy schedules can still be recorded and replayed precisely, as with SCT.

Two tricky aspects:

dpor library currently has no facility for randomness.
The threads aren't known up-front (maybe the paper covers this), so priorities would need to be assigned at fork time.

Move "Deterministic" modules to "Conc", and factor out common internal stuff.

This way the MonadConc and MonadSTM stuff will be in "Test.DejaFu.{Conc,STM}", and "Test.DejaFu.Common" (or whatever) will have all the common things.

Split out generic SCT impl to separate library

The current SCT implementation is fairly tightly coupled with the rest of dejafu, but this doesn't need to be the case. Given execution/dependency functions, the actual algorithm can be (should be able to be) fairly generic and suitable for re-use in other cases. This would make it possible to re-use this machinery in other contexts: like the execution of distributed systems.

It would also allow for more experimentation on the SCT front (eg: adding optional randomness) without either (a) sacrificing completeness even more in dejafu or (b) adding yet more variants of the dejafu/autocheck functions to Test.DejaFu.

Missing capability operations

setNumCapabilities
threadCapability

threadCapability might be tricky, as it introduces another source of nondeterminism. Might be best to not support that one.

HUnit and tasty integration

Basically done, but filing an issue so I remember to check documentation and READMEs

Split concurrency/stm classes out of dejafu?

One big disadvantage of dejafu is the additional load of dependencies. Control.Concurrent is in base, it's hard to get more lightweight than that. And yet, if you want to use MonadConc, and not even do any testing, you incur the following additional dependencies:

dejafu
array
atomic-primops
containers
deepseq
dpor
exceptions
monad-control
monad-loops
mtl
semigroups
stm
template-haskell
transformers
transformers-base

And all of their dependencies (some of which are massive, like text for semigroups). Some of these are used in the typeclass modules, but a lot of this is only used by the testing code. It's unreasonably heavyweight. It's fine for a testsuite to opt in to additional dependencies, but this is a bit much for everyday library and executable use.

So, should I make a separate "concurrency" module which (in addition to claiming a very sweet name on hackage) would have a far lighter load of dependencies?

async equivalent

Basically done, but filing an issue to remember to check docs and README.

To include in docs: differences from async and why.

Pre-empt `return` in testing

This will always return Right ():

test :: MonadConc m => m (Either SomeException ())
test = do
  v <- newEmptyCVar
  t <- mask $ \restore -> fork $ try (restore $ return ()) >>= putCVar v
  killThread t
  readCVar v

This is because return is invisible as far as dejafu is concerned, there isn't a corresponding primitive action, as the monad instances for Conc and ConcIO are built on that of Cont.

In order to be able to pre-empt a return, a new primitive action is needed and the monad instance must be changed.

Fair bounding

Kind of essential with #16

Change default memory model to TSO

No point using a memory model which is less accurate!