## What is a Bayesian Network?

A Bayesian Network is a graphical model that represents the joint probability distribution of a set of random variables. By the chain rule of probability, this distribution can be factorized into a product of conditional distributions in several ways, one for each ordering of the variables. Each factor can be simplified further when the distribution satisfies conditional independencies between the variables, because a variable that is conditionally independent of the one being predicted can be dropped from the conditioning set.
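As a small worked illustration, assume three hypothetical variables $A$, $B$, and $C$ (they are not part of any specific model). With the ordering $A, B, C$, the chain rule gives:

$$
P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid A, B)
$$

Choosing the ordering $C, B, A$ instead gives an equally valid factorization:

$$
P(A, B, C) = P(C)\, P(B \mid C)\, P(A \mid B, C)
$$

If we additionally assume that $C$ is conditionally independent of $A$ given $B$, then $P(C \mid A, B) = P(C \mid B)$, and the first factorization simplifies to:

$$
P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid B)
$$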

## Factorization and Joint Distribution

A joint distribution can be factorized in several ways, and a Bayesian Network visualizes one such factorization. Each variable becomes a node, and for each factor a directed edge is drawn from every variable in the conditioning set to the conditioned variable. In a factor such as P(X | Y, Z), the variable to the left of the conditioning bar (X) is the “child,” while the variables to the right (Y and Z) are its “parents.” The outcome of this process is a directed acyclic graph (DAG), also known as the Bayesian Network structure.
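The construction can be sketched in a few lines of Python. This is only an illustration under assumptions: the variable names (`Rain`, `Sprinkler`, `WetGrass`), the chosen factorization, and the helper functions `build_edges` and `is_acyclic` are all hypothetical and not taken from any particular library.

```python
# Each factor P(child | parents) is written as (child, [parents]).
# The variables and this particular factorization are illustrative assumptions.
factors = [
    ("Rain", []),                             # P(Rain)
    ("Sprinkler", ["Rain"]),                  # P(Sprinkler | Rain)
    ("WetGrass", ["Rain", "Sprinkler"]),      # P(WetGrass | Rain, Sprinkler)
]


def build_edges(factors):
    """Return one directed edge (parent, child) per conditioning relationship."""
    return [(parent, child) for child, parents in factors for parent in parents]


def is_acyclic(nodes, edges):
    """Kahn's algorithm: the graph is a DAG iff all nodes can be removed in topological order."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while frontier:
        node = frontier.pop()
        removed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return removed == len(nodes)


nodes = [child for child, _ in factors]
edges = build_edges(factors)
print(edges)
print(is_acyclic(nodes, edges))
```

Because a chain-rule factorization only ever conditions a variable on variables that come earlier in the ordering, the resulting graph cannot contain a cycle, which is what the check confirms.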

## Reading the DAG and Conditional Independencies

Once a DAG is established, the factorization of the original joint distribution can be read off directly: multiply together the probability of each node conditioned on its parent nodes. Conditional independencies can also be inferred from the structure. For instance, each node is conditionally independent of its non-descendants given its parents.
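Both readings of the DAG can be sketched in plain Python. The four-variable network below and the helper names (`factorization`, `descendants`, `local_independencies`) are hypothetical, chosen only so that some independencies actually appear; the DAG is stored as a mapping from each node to its parents.

```python
# DAG as a mapping node -> list of parents (a made-up example network).
parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}


def factorization(parents):
    """Write the joint distribution as a product of P(node | parents)."""
    terms = []
    for node, pa in parents.items():
        terms.append(f"P({node} | {', '.join(pa)})" if pa else f"P({node})")
    return " * ".join(terms)


def descendants(parents, node):
    """Collect every node reachable from `node` by following parent -> child edges."""
    found = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child, pa in parents.items():
            if current in pa and child not in found:
                found.add(child)
                stack.append(child)
    return found


def local_independencies(parents):
    """For each node, list the non-descendants (other than its parents) it is
    conditionally independent of, together with the parents to condition on."""
    result = {}
    for node, pa in parents.items():
        others = set(parents) - {node} - set(pa) - descendants(parents, node)
        if others:
            result[node] = (sorted(others), list(pa))
    return result


print(factorization(parents))
for node, (others, given) in local_independencies(parents).items():
    print(f"{node} is independent of {others} given {given}")
```

In this example, `Sprinkler` and `Rain` come out as conditionally independent of each other given `Cloudy`, and `WetGrass` as conditionally independent of `Cloudy` given `Sprinkler` and `Rain`.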

## Bayesian Network Structure vs. Parameters

It is essential to distinguish between the structure and the parameters of a Bayesian Network. The DAG provides the structure: it shows which variables depend directly on which. The parameters, known as Conditional Probability Distributions (CPDs), specify for each node the probability of its values given the values of its parents. The DAG combined with the CPDs makes the Bayesian Network a complete representation of the joint probability distribution.
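To make the split between structure and parameters concrete, here is a minimal sketch with binary variables. The network, the probability values, and the function name `joint_probability` are all made up for illustration; a real application would estimate or elicit the CPDs.

```python
from itertools import product

# Structure: parents of each binary variable (hypothetical example network).
parents = {
    "Rain": [],
    "Sprinkler": ["Rain"],
    "WetGrass": ["Sprinkler", "Rain"],
}

# Parameters: P(node = True | parent values), keyed by the tuple of parent values
# in the same order as listed in `parents`. The numbers are invented.
cpds = {
    "Rain": {(): 0.2},
    "Sprinkler": {(True,): 0.01, (False,): 0.4},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment."""
    prob = 1.0
    for node, pa in parents.items():
        p_true = cpds[node][tuple(assignment[p] for p in pa)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


# Structure plus parameters define a full joint distribution,
# so the probabilities of all assignments should sum to 1.
names = list(parents)
total = sum(
    joint_probability(dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)
print(total)
print(joint_probability({"Rain": True, "Sprinkler": False, "WetGrass": True}))
```

The dictionary `parents` alone says nothing about how likely any outcome is, and `cpds` alone cannot be interpreted without knowing which parent each key position refers to; only the pair determines the joint distribution.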

In summary, a Bayesian Network serves as both a visualization and a functional tool for understanding how a joint probability distribution factorizes. The DAG shows which variables are conditioned on which, the CPDs quantify those dependencies, and together they give a complete description of the underlying probabilistic relationships.