Comments (19)
I recall that (possibly in the paper?), the authors of monad-bayes were aware that ListT
is not a valid monad transformer, but not that their semantics does not require it to be. But as you say, there are other effects of it being not a monad transformer.
Is there a reason we can't using one of the existing ListT
replacement libraries?
If not, shipping our own seems reasonable to me.
And re. performance, I agree. I briefly tried refactoring with Vector, but the performance benchmarking was too hard for me.
from monad-bayes.
Is there a reason we can't using one of the existing
ListT
replacement libraries?
They generally work differently than what we want here. They allow for an effect after every element of the list, so it's more like a side-effectful stream. In fact, ListT m a = MSF (MaybeT m) () a
. And I guess, Sequential
is also very similar to these ListT
s. So they definitely have their use case, but it's not Population
.
If not, shipping our own seems reasonable to me.
I'll also see next week whether I can write a "set monad transformer" that is useful for us.
And re. performance, I agree. I briefly tried refactoring with Vector, but the performance benchmarking was too hard for me.
👍 I might look into that in the future.
from monad-bayes.
With help of the Berlin Haskell Users Group, I figured out that I didn't understand the situation about why ListT
isn't lawful completely. The issue is not so much that one mustn't be able to observe the final order of the list elements, but rather that the order of lists on which we later bind matters subtly.
For example, if you have:
newtype ListT m a = ListT (m [a])
-- insert instance implementations here
failure :: Monad m => String -> ListT (Either String) a
failureList :: ListT (Either String) a
failureList = do
foo <- pure True <|> pure False
bar <- if foo then pure "late" else failure "early"
failure bar
This can go along two paths, and on the first path it will fail late, while on the second path it will fail earlier.
Now should this fail with "early"
or "late"
? It depends on whether we first compute bind on foo
or on bar
.
Bind on foo
first
failureList = do
bar <- pure "late" <|> failure "early"
failure bar
= do
bar <- failure "early"
failure bar
= failure "early"
Bind on bar
first
failureList = do
foo <- pure True <|> pure False
if foo then failure "late" else failure "early"
= do
failure "late" <|> failure "early"
= failure "late"
Intuition
One imagines the whole program as a tree, where every node is a list containing the different branches. Also, every node is decorated with an effect. We wish to enumerate all the leaf nodes. Now the question is whether we traverse this tree width first or depth first. Depending on this choice, the order of effects is different.
What that means for monad-bayes
The same works for other noncommutative monads, not just Either
. For example, sampling, Density
(and thus Traced
). These will all break the laws when combined with Population
.
How else to fix ListT
One really simple fix that doesn't need a set monad transformer is simply a free list monad transformer.
newtype ListT m a = ListT (FreeT m [] a)
There is a simple interpretation function ListT m a -> m [a]
, and everything is lawful. The interpreted program will always bind early.
This fixes the bit rot issue and the lawfulness. It doesn't fix the assumed performance, I'd even assume that if anything it worsens them, so we should look at at benchmark before merging such a change.
from monad-bayes.
The best free list transformer for this is probably https://hackage.haskell.org/package/free-5.1.3/docs/Control-Monad-Trans-Free-Ap.html#t:FreeT, which makes use of the Applicative
instance of lists. Together with ApplicativeDo
, this should be more performant in usual use cases because it requires fewer layers of m
.
from monad-bayes.
There is also the Church transformed FreeT
(https://hackage.haskell.org/package/free-5.1.10/docs/Control-Monad-Trans-Free-Church.html) but that is not making use of the Applicative
instance.
Is the lawlessness an actual problem here? That is, does the package either claim or require that the distribution representations be lawful monads? (similar question re. the Lazy sampler).
Might be interesting to benchmark performance of FreeT
(having globally enabled ApplicativeDo
) using the current benchmarks.
What about FreeT
over e.g. Vector
. Probably not fast...
from monad-bayes.
@reubenharry 's reply is correct: monad-bayes only assumes (and provides) a monad structure, and the monad laws don't hold for the resulting structure.
Shipping our own copy of ListT
will avoid the bit-rot, and the issue with laws is a red-herring.
It would be cool to have more performant implementations in the future --- I agree!
from monad-bayes.
and the monad laws don't hold for the resulting structure.
Yes, they certainly don't. For reproduceability when debugging a sampler, or also a formal implementation like delayed sampling (#224), this can matter.
and the issue with laws is a red-herring.
I don't agree. While it's true that the issue isn't as big as with, say, ListT IO
, it definitely exists. But I agree that the maintainability aspects are more important.
Is the lawlessness an actual problem here? That is, does the package either claim or require that the distribution representations be lawful monads? (similar question re. the Lazy sampler).
Well, as a Haskell developer, I will always assume that of course anything that has a Monad
instance is lawful. A library that sweeps that under the rug without big red warning letters I wouldn't regard as production ready. So I see fixing the laws as a waypoint on the road to industry strength.
Might be interesting to benchmark performance of FreeT (having globally enabled ApplicativeDo) using the current benchmarks.
Yes, I'll work on that next week.
What about FreeT over e.g. Vector. Probably not fast...
It's hard to say yet. Part of a more extensive benchmark would be finding out what the typical usages of Population
are. <*>
and >>=
will probably not play such a great role, instead fmap
will be used a lot I guess. In FreeT
, fmap
is relatively cheap as far as I know. So if fmap
is faster for vectors than for lists, we might see a speedup. But maybe I'm misjudging vectors here and they're really only faster for indexed lookup.
If there is much more <*>
than >>=
then Ap.FreeT
is cheap, because it preserves the applicative structure and only adds a free >>=
.
from monad-bayes.
Well, as a Haskell developer, I will always assume that of course anything that has a
Monad
instance is lawful. A library that sweeps that under the rug without big red warning letters I wouldn't regard as production ready. So I see fixing the laws as a waypoint on the road to industry strength.
Maybe the most appropriate thing to do is to put those big red warning letters then. In the paper we were quite explicit about using a monad structure rather than a monad.
What concrete thing should we do in the library?
Type-classes don't really allow for re-use, in the sense that if we now need a completely separate type-class hierarchy for monad structure transformers, that will bloat up the code and make it a lot less maintainable and susceptible to bit-rot. And since Haskell doesn't really require, enforce, or use the monad laws for monad interfaces, the issue feels more ideological than pragmatic. In that case, perhaps it's better to hash out this issue on a Haskell mailing list or GitHub repo, rather than here, since it's a much broader issue than monad-bayes.
from monad-bayes.
Well, as a Haskell developer, I will always assume that of course anything that has a Monad instance is lawful. A library that sweeps that under the rug without big red warning letters I wouldn't regard as production ready. So I see fixing the laws as a waypoint on the road to industry strength.
I also feel that when something is a Monad, it should obey the monad laws. Wouldn't making use of qualified do-notation be a viable alternative than breaking monad laws?
from monad-bayes.
I'm not sure how the qualified do-notation helps here when what we're using is the existing monad-structure transformer ecosystem that postulates a Monad interface.
from monad-bayes.
I tried replacing the deprecated ListT
by a free monad over the applicative list. In fact it speeds up the benchmarks significantly!
See #253 for details. So I think it's uncontroversial to replace ListT
by the free construction because it's maintained, lawful, and faster.
from monad-bayes.
The issue of looking into Vector
or other replacements for lists is a separate one.
from monad-bayes.
Is it correct though?
from monad-bayes.
Good question, I'm pretty sure that yes.
Old ListT
Let's look at how the deprecated ListT m a
worked. It's isomorphic to m [a]
, and join :: m [(m [a])] -> m [a]
is implemented in the straightforward way.
The issue with it is that it's not associative, i.e. if there is a tower m [(m [(m [a])])]
, it matters if we join the inner layer with the middle layer first (join . fmap join
) or the outer layer with the middle layer first (join . join
). Why does it matter?
One can think of a "join-tower" m [(m [(.... m [a])])])...]
as a computation tree where the leaves are labelled with effectful a
s and the nodes with effects m
. (It's a bit more complicated because the topology of the subtrees can depend on these effects.) Now the issue is that when we join this tree, join . fmap join
does depth-first traversal over these effects, and join . join
does width-first. If m
isn't commutative, this isn't the same. Which choice is actually taken in a program depends on coding style and refactorings. (So to be honest I'm not even sure how one can argue that the old version was correct, because its semantics isn't very well-defined.)
Free monad
PopulationT m a = FreeT m [] a = m (a + [m (a + [m (a + ...)]...])
The free monad is defined (up to an extra m
at the leaf nodes) as the sum of all "join-towers", or computation trees of any height. It's associative by definition because it doesn't make a choice of whether to join innermost or outermost. We don't usually observe values from PopulationT m a
directly. Instead, one recovers them by applying:
runPopulationT :: PopulationT m a -> m [a]
This joins the tree systematically width-first, like join . join . join . ... . join
. So it has the same semantics as the deprecated ListT
for one particular, consistent, choice of bracketing. So if the previous choice was correct for every possible bracketing, then it's still correct for this particular bracketing.
Free monad over an applicative
The free monad over an applicative is the same type, the same functor, and the same monad as the ordinary free monad. But it has a different implementation for Applicative (FreeT m [])
, one that makes use of Applicative []
. It seems that it breaks the law <*> = ap
though, and it's unclear whether the law is recovered after interpretation. Thanks for pointing me towards looking into correctness more deeply, I wouldn't have noticed this otherwise.
I thought at first that together with ApplicativeDo
, this free construction should be faster than the standard one, but the speedup seems to be much smaller than the one we get from switching to a free construction in the first place, so I'm happy to use the standard free construction instead.
from monad-bayes.
Thanks for the detailed explanation. First, can we please correct some imprecise assertions in it?
I'm not even sure how one can argue that the old version was correct, because its semantics isn't very well-defined.
The semantics is well-defined, it is based on this paper that formally defines these data structures, the semantics of the inference algorithm, and proves that the result is an unbiased sampler.
So it has the same semantics as the deprecated
ListT
for one particular, consistent, choice of bracketing. So if the previous choice was correct for every possible bracketing, then it's still correct for this particular bracketing.
I don't really follow this argument. Can you make it more precise please, for example, along the lines of the paper?
The paper and its follow-up used the monad-structure transformer ListT
because the computation it produces makes the correct samples and in the correct order. @adscib (and later the Tweag team) have tested that the resulting inference algorithms produces the expected distributions when run on models, and the paper proves the algorithms, when working on these data structures, are correct.
In general, free monads may do something different, for example, suspend random samples that some of the monad-bayes combinators do in crucial places. This might mean, for example, that using the same combinations of monad-bayes's building blocks to implement algorithms such as smc, smc^2 etc. are not longer correct.
I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.
from monad-bayes.
My point is that runPopulation $ (a >=> b) >=> c = runPopulation $ a >=> (b >=> c)
in the new implementation, but not in the old. Furthermore, runPopulation $ a >=> (b >=> c)
agrees in the old and the new implementation, so it has correct semantics. I'm not saying that FreeT m [] a
represents a population, but I'm saying that if you apply runPopulation
, it does, and it has the same semantics as before. I think this is clear to see from how free monads are defined, or am I overlooking something here?
Maybe the confusion is that runPopulation
is not a monad morphism, but noone requires it to be.
I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.
Note that all the existing test suites pass. Do you have suggestions for a more extensive test suite that we should implement first?
from monad-bayes.
Thanks for implementing additional tests! (Something is currently failing in the build, though.)
I'm not saying that
FreeT m []
a represents a population, but I'm saying that if you applyrunPopulation
, it does, and it has the same semantics as before.
When we combine different inference building blocks, we do it before running runPopulation
.
So if we change the data structure we use for populations, it's not clear that the way we combine these building blocks to implement, say, smc^2, still gives smc^2.
Put differently, you argue that the monad laws used to fail and now they hold.
So something has changed.
Before the change the algorithm was doing the correct sampling based on quite a subtle semantic argument on the one hand, and careful implementation of various inference algorithms.
We should hesitate to merge this kind of change without a more detailed analysis for how it affects the algorithms.
from monad-bayes.
Ah ok, I think I'm getting it now. Your last explanation helped me. Put yet differently, the way the other transformers push around the population can result in all sorts of bracketing of joins and binds, not necessarily coinciding with the canonical one chosen by runPopulation
. Even intentionally so. Yes, I was assuming that such a situation would have been a bug in the first place, but in fact this extra expressivity is used in RMSMC (and thus in SMC2) in the intermediate MH steps. This particular place can be fixed easily, but yes, now we need to look whether there are more such places.
from monad-bayes.
I was mostly afraid of meddling with the semantics when I made changes to the library, but there are a couple of things that did change, that are perhaps worth noting:
- Dominic and I updated the
Sampler
implementation to allow for other RNGs. - I added Sam Staton's lazy sampler, and also a "measure" monad (i.e.
ContT m Double
) which can be used for quadrature, to make plots of simple distributions. Neither of these affect existing code, but I can't vouch for the correctness of their interaction with the other inference transformers. - I implemented a simpler alternative to the Church transformed
FreeT
used for sampling, which just usesStateT
. This isn't actually used in the code anywhere, partly because it's slower, and partly because I was scared to make a semantic change. - I wrapped the parameters for SMC and MCMC in SMCConfig and MCMCConfig datatypes. This is a superficial change.
- I introduced an implementation of the mcmc chain using the streaming library
pipes
. Again, this isn't used anywhere by default, but I think it's convenient to think of the chain in this way.
I will say that subtleties like this make me wish the library was in Idris, and that the proofs of correctness were in the types. (I suspect that is something Ohad would like?). Even having worked with monad-bayes for a while, the correctness proofs of the inference algorithms is still beyond me.
from monad-bayes.
Related Issues (20)
- Use parallelism for Population HOT 1
- nix flake check doesn't work on linux? HOT 2
- Extend benchmarks again HOT 3
- nix develop silently adds .git/hooks/pre-commit HOT 1
- Support GHC 9.4 HOT 2
- Consider moving pipes dependency to separate library? HOT 4
- Question: Free density monad vs. state density monad HOT 3
- Nix cache on CI doesn't work properly HOT 1
- Test "Integrator Variance/variance numerically" is flaky HOT 10
- Test "Pipes: HMM pipe model is equivalent to standard model" is flaky HOT 7
- Support GHC 9.6 HOT 1
- nix develop broken in multiple ways HOT 3
- hamilton not buildable anymore HOT 1
- Netlify bot does not report anymore
- Is there a reason for no MonadFail instance? HOT 4
- Can't access `mcmcP` output because `MHResult` is in a hidden module HOT 1
- Link to upcoming rhine-bayes blog post
- Lazy sampler is not a monad transformer, strict sampler misses MonadTrans instance
- Support GHC 9.8
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from monad-bayes.