
Bayes' theorem, also known as the theorem of the probability of causes, follows immediately from the symmetry of the multiplication rule:

P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}

Bayesian inference is a method for computing the probabilities of various hypothetical causes from observations of their known consequences, and even for revising those probabilities as data accumulate. Bayesian reasoning interprets probability as the degree of confidence to be placed in a hypothetical cause. It is used in computer science for self-learning software.

It relies mainly on Bayes' theorem. This is how a diagnosis indicates that one disease, rather than another, is the likely origin of a patient's symptoms.
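
As a toy illustration of this diagnostic reading of Bayes' theorem (my addition, with made-up numbers rather than real clinical data), here is a short Python sketch:

# Hypothetical figures: 1% disease prevalence, a test with 95% sensitivity
# and a 10% false-positive rate. All values are illustrative.
p_disease = 0.01                # P(A): prior probability of the disease
p_pos_given_disease = 0.95      # P(B|A): sensitivity
p_pos_given_healthy = 0.10      # P(B|not A): false-positive rate

# Total probability of a positive test, P(B)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)      # ~0.088: a positive test lifts 1% to about 9%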

Bayesian reasoning is concerned with cases where a proposition may be true or false, not because of its logical relation to axioms held to be certainly true, but in light of observations subject to uncertainty. Every proposition is assigned a value in the open interval from 0 (certainly false) to 1 (certainly true). When an event has more than two possible outcomes, a probability distribution over those outcomes is considered.

https://fr.wikipedia.org/wiki/Inf%C3%A9rence_bay%C3%A9sienne
https://brohrer.github.io/how_bayesian_inference_works.html

http://www.bayespy.org/user_guide/quickstart.html

BayesPy provides tools for Bayesian inference with Python. The user constructs a model as a Bayesian network, observes data and runs posterior inference. The goal is to provide a tool which is efficient, flexible and extendable enough for expert use but also accessible for more casual users.

Currently, only variational Bayesian inference for conjugate-exponential-family models (variational message passing) has been implemented. Future work includes variational approximations for other types of distributions and possibly other approximate inference methods such as expectation propagation, Laplace approximations, and Markov chain Monte Carlo (MCMC). Contributions are welcome.

Quick start guide

This short guide shows the key steps in using BayesPy for variational Bayesian inference by applying it to a simple problem. The key steps are the following:

  • Construct the model
  • Observe some of the variables by providing the data in a proper format
  • Run variational Bayesian inference
  • Examine the resulting posterior approximation

To demonstrate BayesPy, we’ll consider a very simple problem: we have a set of observations from a Gaussian distribution with unknown mean and variance, and we want to learn these parameters. In this case, we do not use any real-world data but generate some artificial data. The dataset consists of ten samples from a Gaussian distribution with mean 5 and standard deviation 10. This dataset can be generated with NumPy as follows:

>>> import numpy as np
>>> data = np.random.normal(5, 10, size=(10,))

Constructing the model

Now, given this data we would like to estimate the mean and the standard deviation as if we didn’t know their values. The model can be defined as follows:

\begin{split}
p(\mathbf{y}|\mu,\tau) &= \prod^{9}_{n=0} \mathcal{N}(y_n|\mu,\tau) \\
p(\mu) &= \mathcal{N}(\mu|0,10^{-6}) \\
p(\tau) &= \mathcal{G}(\tau|10^{-6},10^{-6})
\end{split}

where \mathcal{N} is the Gaussian distribution parameterized by its mean and precision (i.e., inverse variance), and \mathcal{G} is the gamma distribution parameterized by its shape and rate parameters. Note that we have given quite uninformative priors for the variables \mu and \tau. This simple model can also be shown as a directed factor graph:

[Figure: directed factor graph of the example model. The observed node y_n lies inside a plate n = 0, \ldots, 9 and receives a Gaussian factor \mathcal{N} with parents \mu and \tau; \mu has a Gaussian prior factor \mathcal{N}, and \tau a gamma prior factor \mathcal{G}. The original page renders this figure with TikZ.]

Directed factor graph of the example model.
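
One detail worth making concrete (my note, not the guide's): \mathcal{N} here takes a precision, not a variance, so the data-generating distribution with standard deviation 10 corresponds to precision \tau = 1/10^2 = 0.01. In NumPy terms:

# The data were generated with mean 5 and standard deviation 10, which in
# the precision parameterization of the model means tau = 1 / 10**2 = 0.01.
import numpy as np

tau_true = 1 / 10**2             # precision = inverse variance
std_true = tau_true ** -0.5      # back to standard deviation: 10.0
samples = np.random.normal(5, std_true, size=10)  # same as np.random.normal(5, 10, size=(10,))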

This model can be constructed in BayesPy as follows:

>>> from bayespy.nodes import GaussianARD, Gamma
>>> mu = GaussianARD(0, 1e-6)
>>> tau = Gamma(1e-6, 1e-6)
>>> y = GaussianARD(mu, tau, plates=(10,))

This is quite self-explanatory given the model definitions above. We have used two types of nodes, GaussianARD and Gamma, to represent Gaussian and gamma distributions, respectively. There are many more distributions in bayespy.nodes, so you can construct quite complex conjugate-exponential-family models. The node y uses the keyword argument plates to define the plates n=0,\ldots,9.
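
As a quick sanity check (my addition; I believe BayesPy nodes expose a plates attribute, though the original quick start does not show this), you can inspect the plates of y:

>>> y.plates
(10,)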

Performing inference

Now that we have created the model, we can provide our data by setting y as observed:

>>> y.observe(data)

Next we want to estimate the posterior distribution. In principle, we could use different inference engines (e.g., MCMC or EP), but currently only the variational Bayesian (VB) engine is implemented. The engine is initialized by giving it all the nodes of the model:

>>> from bayespy.inference import VB
>>> Q = VB(mu, tau, y)

The inference algorithm can be run for as long as desired (at most 20 iterations in this case):

>>> Q.update(repeat=20)
Iteration 1: loglike=-6.020956e+01 (... seconds)
Iteration 2: loglike=-5.820527e+01 (... seconds)
Iteration 3: loglike=-5.820290e+01 (... seconds)
Iteration 4: loglike=-5.820288e+01 (... seconds)
Converged at iteration 4.

The algorithm converged after four iterations, before reaching the requested 20. VB approximates the true posterior p(\mu,\tau|\mathbf{y}) with a distribution which factorizes with respect to the nodes: q(\mu)q(\tau).
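
Written out (a standard statement of mean-field VB, added here for clarity rather than taken from the guide), the approximation chooses q(\mu) and q(\tau) to maximize a lower bound on the marginal log-likelihood:

\begin{split}
p(\mu,\tau|\mathbf{y}) &\approx q(\mu)\,q(\tau) \\
\log p(\mathbf{y}) &\geq \mathbb{E}_{q}\left[\log \frac{p(\mathbf{y},\mu,\tau)}{q(\mu)\,q(\tau)}\right]
\end{split}

As far as I understand, the loglike values printed during the iterations above are this lower bound, which is why they increase monotonically until convergence.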

Examining posterior approximation

The resulting approximate posterior distributions q(\mu) and q(\tau) can be examined, for instance, by plotting the marginal probability density functions:

>>> import bayespy.plot as bpplt
>>> bpplt.pyplot.subplot(2, 1, 1)
<matplotlib.axes...AxesSubplot object at 0x...>
>>> bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu')
[<matplotlib.lines.Line2D object at 0x...>]
>>> bpplt.pyplot.subplot(2, 1, 2)
<matplotlib.axes...AxesSubplot object at 0x...>
>>> bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau')
[<matplotlib.lines.Line2D object at 0x...>]
>>> bpplt.pyplot.tight_layout()
>>> bpplt.pyplot.show()
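
A rough, library-free sanity check (my addition): with such flat priors, the posterior of \mu should concentrate near the sample mean, and the posterior of \tau near the inverse of the sample variance, so the plotted densities should peak roughly at these values:

# Rough reference points for the plots above (not from the guide):
print("sample mean:", np.mean(data))                 # mu should peak near this
print("inverse sample variance:", 1 / np.var(data))  # tau should peak near this
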
For reference, here is the complete script from the steps above, with a fixed random seed so the run is reproducible:

import numpy as np
np.random.seed(1)
data = np.random.normal(5, 10, size=(10,))
from bayespy.nodes import GaussianARD, Gamma
mu = GaussianARD(0, 1e-6)
tau = Gamma(1e-6, 1e-6)
y = GaussianARD(mu, tau, plates=(10,))
y.observe(data)
from bayespy.inference import VB
Q = VB(mu, tau, y)
Q.update(repeat=20)
import bayespy.plot as bpplt
bpplt.pyplot.subplot(2, 1, 1)
bpplt.pdf(mu, np.linspace(-10, 20, num=100), color='k', name=r'\mu')
bpplt.pyplot.subplot(2, 1, 2)
bpplt.pdf(tau, np.linspace(1e-6, 0.08, num=100), color='k', name=r'\tau')
bpplt.pyplot.tight_layout()
bpplt.pyplot.show()
 
