Probabilistic Programming

Rapid, easy model development

BAli-Phy contains a simple language for expressing probabilistic models as programs; inference on the parameters can then be performed automatically. Such languages are called probabilistic programming languages (PPLs). Other well-known PPLs include BUGS, RevBayes, and Stan. The goal of the language is to let researchers spend their time designing models instead of designing new inference programs: once the model is specified, inference should take care of itself.

Language properties

The modeling language is a functional language, and uses Haskell syntax. Features currently implemented include:

1. Functions work, and can be used to define random variables.
2. Modules work, and allow code to be factored in a clean manner.
3. Packages work, and allow researchers to distribute their work separately from the BAli-Phy architecture.
4. Optimization works, and speeds up the code written by the user by using techniques such as inlining.
5. Composite Objects work, and can be used to define random data structures.
6. Random control flow works, allowing if-then-else and loops that depend on random variables.
7. Recursive random variables work, allowing random processes on trees that are not known in advance.
8. JSON logging works, enabling the logging of inferred parameters whose dimension and number are not fixed.

Features that are expected to be completed during 2020 include:

• Rooted trees. Rooted trees implemented as a data structure within the language (partially implemented).
• Random numbers of random variables. Random variables can be conditionally created, without the need for reversible-jump methods.
• Lazy random variables. Infinite lists of random variables can be created; a random variable is only instantiated if it is accessed.
• Type checking. Type checking will enable polymorphism and give useful error messages for program errors.

Examples

These examples work with the current git version of bali-phy.

Linear regression

Here is a short program that performs linear regression, assuming (i) normally-distributed errors and (ii) fixed values of x.

Changing the error distribution simply involves replacing normal with another distribution, such as cauchy.
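For intuition, the structure of this model can be sketched in plain Python (an illustrative analogue only, not BAli-Phy code; the function names here are hypothetical):

```python
import random, math

# Hypothetical Python analogue of the model: sample the priors,
# then score data under normally-distributed errors at fixed x values.
def sample_model(rng):
    b = rng.gauss(0.0, 1.0)       # slope prior:     normal 0.0 1.0
    a = rng.gauss(0.0, 1.0)       # intercept prior: normal 0.0 1.0
    s = rng.expovariate(1.0)      # error sd prior:  exponential 1.0
    return a, b, s

def log_likelihood(a, b, s, xs, ys):
    # independent [ normal (f x) s | x <- xs ] scored against ys
    f = lambda x: b * x + a
    return sum(-0.5 * ((y - f(x)) / s) ** 2
               - math.log(s * math.sqrt(2 * math.pi))
               for x, y in zip(xs, ys))

rng = random.Random(0)
a, b, s = sample_model(rng)
ll = log_likelihood(a, b, s, [0.0, 1.0, 2.0], [0.1, 0.9, 2.1])
```

In BAli-Phy the sampling and scoring are handled by the inference engine; this sketch only shows how the priors and the error distribution fit together.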

import           Probability
import           System.Environment

sample_model = do
    b <- normal 0.0 1.0
    a <- normal 0.0 1.0
    s <- exponential 1.0
    return (a, b, s)

main = do
    (a, b, s) <- random $ sample_model
    let f x = b * x + a
    observe (independent [ normal (f x) s | x <- xs ]) ys
    return ["b" %=% b, "a" %=% a, "s" %=% s]

• The priors look like b <- normal 0.0 1.0.
• The prediction function let f x = b*x + a defines the best-fit line.
• The error distribution normal (f x) s indicates how far points might fall from the line.
• The likelihood is given by observe distribution data.
• The parameters are logged as JSON and are given by the return ["b" %=% b, ...] command.

You can find this file in bali-phy/tests/prob_prog/regression/ and run it as bali-phy -m LinearRegression.hs --iter=1000.

Trait evolution on a random tree

Graphical models are a natural way to describe the evolution of a trait on a fixed tree, because the value of the trait at each node depends on the value at its parent node. This dependence can be encoded in a fixed graph with the same structure as the tree. However, graphical models cannot easily describe the evolution of a trait on a random tree, because the parent of each node is not known in advance, and is in fact random.

One way to solve this problem is to allow recursive random sampling. In the model below, the random variable xs recursively depends on itself. In practice this avoids infinite loops because each element of xs only depends on other elements, and not on itself.

{-# LANGUAGE RecursiveDo #-}

import Probability
import Tree

main = random $ do
    tree <- uniform_topology 5
    let rtree = add_root tree 0

    let ps = map (show . parentNode rtree) [0 .. 5]

    rec let mu node = case parentNode rtree node of
                          Nothing   -> 0.0
                          Just node -> xs !! node
        xs <- independent [ normal (mu node) 1.0 | node <- nodes rtree ]

    return ["tree" %=% write_newick rtree, "xs" %=% xs, "ps" %=% ps]


The keyword rec introduces a recursive block inside the do block.
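For intuition, here is a plain-Python analogue of this recursive sampling. The tree construction below is a hypothetical stand-in for uniform_topology and add_root, not BAli-Phy code; the point is that each node's value is drawn around its parent's value, with the recursion memoized so every node is sampled exactly once:

```python
import random

rng = random.Random(1)

# Hypothetical rooted tree on nodes 0..7: node 0 is the root, and each
# later node picks its parent among earlier nodes (a random tree shape).
parent = {0: None}
for node in range(1, 8):
    parent[node] = rng.randrange(node)

# xs "recursively depends on itself": a node's value is normal(mu, 1)
# where mu is the parent's value, mirroring the rec block in the model.
xs = {}
def value(node):
    if node not in xs:
        mu = 0.0 if parent[node] is None else value(parent[node])
        xs[node] = rng.gauss(mu, 1.0)
    return xs[node]

trait = [value(n) for n in range(8)]
```

The recursion terminates because each node only depends on nodes closer to the root, just as each element of xs only depends on other elements.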

Random data structures

The iid distribution returns a list of random values from another distribution. We can apply the map and sum operations to such lists to sample a sum of squares.
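The same pipeline — draw an iid list, square each element, sum — looks like this in plain Python (a simulation-only analogue, not BAli-Phy code):

```python
import random

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10)]   # iid 10 (normal 0.0 1.0)
ys = [x * x for x in xs]                        # map (\x -> x*x) xs
total = sum(ys)                                 # sum ys
```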

import           Probability

main = random $ do
    xs <- iid 10 (normal 0.0 1.0)
    let ys = map (\x -> x * x) xs
    return ["xs" %=% xs, "squares" %=% ys, "sum" %=% sum ys]

Here (\x -> x*x) describes an un-named function that takes an argument x and returns x*x.

Random control flow I: if-then-else

The modeling language can handle graphs that change automatically when the value of a random variable changes. One thing that leads to a changing graph is control-flow statements like if-then-else:

import Probability

main = random $ do
    i <- bernoulli 0.5
    y <- normal 0.0 1.0
    let x = if (i == 1) then y else 0.0
    return ["i" %=% i, "x" %=% x]
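The same if-then-else dependence can be sketched in plain Python (a simulation-only analogue, not BAli-Phy code):

```python
import random

rng = random.Random(0)
i = 1 if rng.random() < 0.5 else 0   # bernoulli 0.5
y = rng.gauss(0.0, 1.0)              # normal 0.0 1.0
x = y if i == 1 else 0.0             # the graph for x depends on i
```

When i flips between 0 and 1 during inference, the dependency of x on y appears and disappears — this is what the dynamic graph has to track.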



Random control flow II: arrays with random subscripts

Traditional graphical modeling languages, like BUGS, allow arrays of random variables, but they do not allow selecting a random element of these arrays. Dynamic graphs make random subscripts possible. This can be used to divide observations into categories. Here, different elements of ys will be exactly equal if they belong to the same category:
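The effect of a random subscript can be sketched in ordinary Python (an illustrative analogue, not BAli-Phy code):

```python
import random

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10)]        # iid 10 (normal 0.0 1.0)
categories = [rng.randrange(10) for _ in range(10)]  # 10 categories, equal weights
ys = [xs[categories[i]] for i in range(10)]          # array with random subscripts
```

Two observations with the same category index pick out the same element of xs, so they are exactly equal.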

import           Probability

main = random $ do
    xs <- iid 10 (normal 0.0 1.0)
    categories <- iid 10 (categorical (replicate 10 0.1))
    let ys = [ xs !! (categories !! i) | i <- [0 .. 9] ]
    return ["ys" %=% ys]

Haskell uses the !! operator to subscript a list. The C equivalent of xs !! (categories !! i) would be something like xs[categories[i]].

Random sampling via recursive functions: Brownian bridge

We can use recursive functions to randomly sample lists of random values. Here, we define a function random_walk that produces a list of random values starting from x0.

• The argument next is a function: next x0 gives the distribution of the point after x0.
• In Haskell, we write next x0 instead of next(x0) to apply a function.

import Probability

random_walk next x0 = do
    x1 <- next x0
    xs <- random_walk next x1
    return (x0 : xs)

-- 20 element brownian bridge from 0.0 to 2.0
main = do
    walk <- random $ random_walk (\mu -> normal mu 1.0) 0.0
    let zs = take 19 walk
    observe (normal (last zs) 1.0) 2.0
    return ["zs" %=% zs]
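A finite Python analogue of the unconditioned random walk looks like this (illustrative only; the endpoint conditioning done by observe is handled by the inference engine and is not simulated here):

```python
import random

rng = random.Random(0)

def random_walk(next_dist, x0, n):
    # finite analogue of the recursive random_walk: each new point is
    # drawn from a distribution centred on the previous point
    xs = [x0]
    for _ in range(n - 1):
        xs.append(next_dist(xs[-1]))
    return xs

zs = random_walk(lambda mu: rng.gauss(mu, 1.0), 0.0, 19)
```

In the BAli-Phy version the list is conceptually infinite and lazy, and take 19 forces only the first 19 elements; observing the last element near 2.0 turns the walk into a bridge.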


comments and suggestions: benjamin . redelings * gmail + com