Giter Club home page Giter Club logo

ra.hs's Introduction

ra.hs

ra.hs is Reactive Analytics for Haskell. It is a non-recursive symbolic execution engine that understands threads and thread-safe shared variables. It steps through the AST of a program to find possible non-recursive values at points in the program, including those that come from things like IORef. ra.hs then tracks forks to identify what threads are writing where.

Try it at http://stacker.lam.io/ra.hs/

Symbolic analysis can be thought of as sitting somewhere between typechecking and execution. It is able to reduce some regions of the typechecked AST by brute-forcing all possible branches and guards, provided that there exist values in the execution context/stack to fulfill pattern matches. On the other hand, the analysis is totally static, and in this engine, it is also non-recursive. By design, ra.hs should also always halt.

Basic examples

Here are a few simple but illustrative examples:

  • f = const g h reduces to f = g
  • f = do { v <- newEmptyMVar; putMVar v (); readMVar v } reduces to f = IO ()*.

For use on practical programs, the top-level function under test is saturated where necessary with undefined, then the program is executed symbolically. All writes to shared variables are logged and all possible non-recursive results for the function under test are returned.

*ra.hs handles IO and STM monad laws implicitly and "transparently", so results are actually unwrapped, without IO or STM constructors around them. Thus in this example you would actually see f = () directly.

For a slightly more involved example, consider this web server that responds to a request with a cache vs. database race:

listenHTTP :: Int -> IO Socket
read :: Socket -> IO String
send :: Socket -> String -> IO ()
toRequest :: String -> Request
type Route u = (u, Req -> String)
dispatch :: URI u => Request -> [Route u]
db : Database
cache :: Cache

main = do
	sock <- listenHTTP 80
	forever $ do -- :: IO ()
		req <- toRequest <$> read sock
		send sock $ dispatch req $ [
			("/public/*", \req' -> do
					assets <- newEmptyMVar
					forkIO $ do 
						assets' <- getDBAssets db req'
						(fromMaybe assets' <$> tryReadMVar assets) >>= putMVar assets
					forkIO $ do
						assets' <- getCacheAssets cache req'
						(fromMaybe assets' <$> tryReadMVar assets) >>= putMVar assets
						
					(send . render) <$> readMVar assets
				)
			-- ...
		]

The forked threads initiate the race condition to putMVar on assets :: MVar Assets. Notably, this is done at two separate locations in the program. Race conditions of this sort are amenable to analysis by ra.hs, as opposed to, say a race from threads spawned by iterating a list (a recursive data type).

The race is contained within a lambda \req' -> ... which will only be explored by ra.hs when it is called by the body of dispatch, and provided ra.hs has access to its source. This will be discussed further in limitations.

Use

ra.hs analyzes base functions with purebase, a custom Prelude that can be typechecked by the GHC API and doesn't conflict with uses of Prelude in the target source. So that the GHC API resolves purebase, you need to import qualified C.Union to the top-level module. ra.hs employs type matching to find all top-level declarations, so while this is a no-op since the qualified import means purebase won't be compiled into your code, it will be available to ra.hs.

ra.hs itself includes a GHC frontend plugin so it can find the Cabal modules of the code under test (following a great suggestion of Edward Yang's @ezyang.com). It's called through the --with-ghc/-w option of cabal repl, which points to its executable after building, somewhere in dist or new-dist depending on the version of cabal. A proper call looks something like this:

cabal repl -w dist-newstyle/build/x86_64-osx/ghc-8.6.5/ra.hs-0.1.0.0/x/ra.hs/build/ra.hs/ra <module-under-test>

Limits to analysisย 

If the body for a function (e.g. dispatch above) is not accessible to ra.hs, then the block representing its application to arguments is symbolized and passed flows through the rest of the program as a normal form. This has two consequences:

  1. Unsaturated lambda arguments are never executed.
  2. All pattern matches other than top-level variable and wildcard patterns will fail.

It is worth emphasizing that ra.hs does not synthesize arguments and combinations of arguments to generate behaviors for functions it failed to resolve. This allows it to be perfectly selective for pattern matches at the expense of sensitivity. In the case of dispatch for example, the unsaturated lambda argument \req' -> ... is already a normal form, so the whole block dispatch req (\req' -> ...) will be substituted wherever it's referenced, in this case in the second argument of send.

This approach is not related to another behavior of ra.hs to examine all IO side effects that it can (obeying the above behavior) regardless of if they contribute to the final return value: this is what is meant by ra.hs "understanding" shared state.

Trajectory

ra.hs was originally started to detect race conditions in Reactive programs, by evaluating the rule that different threads can't write values from the same source to the same shared variable.

For now, this analysis is a first-class citizen to the ra.hs ecosystem, although a non-intrusive, additive one as it does not impact any core functionality. It adds a state variable to track value sources (arguments to Consumer types) and an HsVar guard to identify when expressions are incident on Consumer types. This extra state is combined with the core thread and write tracking to identify violations to this rule. This may be temporary, depending on how flexible the ra.hs engine can be and what kinds of generalization can be applied.

ra.hs's People

Contributors

acrylic-origami avatar

Watchers

 avatar

ra.hs's Issues

Type equality logic is wrong

The type equality routine inst_subty was initially designed for finding instance methods, which could only be found by type comparison. Now, purebase is designed around the concept of type comparison rather than strict Unique matching because this way the types of built-in classes, especially Monad (and do blocks by association), are correct. Then, it's no longer accurate that a more concrete function is an admissible one. This is due to the interaction between contravariance and higher-rank types, such that something like forall a. (a -> a) -> a will work for a call to (Int -> Int) -> Int, but not (forall a. a -> a) -> b.

In general, this problem is hard. How do we know the variance of an arbitrary type constructor, like (forall a. c a)? How does a vary? Of course, between modules there may be many well-typed functions for callsites too, although outside of matching module names, this problem is unsolvable anyways.

I will admit that this is something of a corner case, as it assumes that there are many functions with the same name and very similar types that would fail this analysis. Even things like Data.Map.{...}.map vs. Prelude.map are different enough in signature for this current test to work at picking them apart.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.