# Probability space

The definition of the probability space is the foundation of probability theory. It was introduced by Kolmogorov in the 1930s. For an algebraic alternative to Kolmogorov's approach, see algebra of random variables.

## Definition

A probability space $(\Omega, \mathcal F, P)$ is a measure space with a measure $P$ that satisfies the probability axioms.

### Sample space

The sample space $\Omega$ is a nonempty set whose elements are known as outcomes or states of nature and are often denoted $\omega.$

The set of all the possible outcomes of an experiment is known as the sample space of the experiment.

### Events

The second item, $\mathcal F$, is a σ-algebra of subsets of $\Omega$. Its elements are called events: sets of outcomes to which a probability can be assigned.

Because $\mathcal F$ is a σ-algebra, it contains $\Omega$; also, the complement of any event is an event, and the union of any (finite or countably infinite) sequence of events is an event.
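These closure properties can be checked directly for a finite collection of sets. The sketch below is illustrative (the function name is made up, not a standard API); on a finite sample space, closure under complements and pairwise unions suffices:

```python
def is_sigma_algebra(omega, family):
    """Check the sigma-algebra axioms for a family of subsets of a
    finite sample space (pairwise unions suffice in the finite case)."""
    omega = frozenset(omega)
    family = {frozenset(E) for E in family}
    return (omega in family
            and all(omega - E in family for E in family)   # complements
            and all(A | B in family for A in family for B in family))  # unions

# The four events of a single coin flip form a sigma-algebra.
print(is_sigma_algebra({"H", "T"}, [set(), {"H"}, {"T"}, {"H", "T"}]))  # True
```

Dropping {T} from the family breaks closure under complementation, and the check correctly fails.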

When the sample space is a set of real numbers, the events are usually taken to be the Borel-measurable or Lebesgue-measurable sets.

### Probability measure

The probability measure $P$ is a function from $\mathcal F$ to the real numbers that assigns to each event a probability between 0 and 1. It must satisfy the probability axioms.

Because $P$ is a function defined on $\mathcal F$ and not on $\Omega$, the set of events is not required to be the complete power set of the sample space; that is, not every set of outcomes is necessarily an event.
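On a finite or countable sample space, a probability measure can be specified by per-outcome weights, and the axioms reduce to non-negativity and normalization; additivity over disjoint events holds automatically when $P(A)$ is the sum of the weights in $A$. A minimal sketch using an illustrative dict representation:

```python
def is_probability_measure(weights, tol=1e-12):
    """Check the probability axioms for a measure given as per-outcome
    weights: non-negativity and P(Omega) = 1. Additivity is automatic
    when P(A) is defined as the sum of weights over A."""
    nonneg = all(p >= 0 for p in weights.values())
    normalized = abs(sum(weights.values()) - 1.0) <= tol
    return nonneg and normalized

print(is_probability_measure({"H": 0.5, "T": 0.5}))  # True: fair coin
print(is_probability_measure({"H": 0.7, "T": 0.5}))  # False: sums to 1.2
```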

When more than one measure is under discussion, probability measures are often written in blackboard bold to distinguish them. When there is only one probability measure under discussion, it is often denoted by Pr, meaning "probability of".

## Related concepts

### Probability distribution

A probability distribution is any probability measure; most often the term refers to the distribution of a random variable $X$, i.e. the pushforward measure $P \circ X^{-1}$ on the state space.

### Random variables

A random variable X is a measurable function from the sample space $\Omega$ to another measurable space, called the state space.

If X is a real-valued random variable, then the notation Pr(X ≥ 60) is shorthand for ${\scriptstyle\Pr(\{ \omega \in \Omega \mid X(\omega) \ge 60 \})}$, assuming that "X ≥ 60" is an event.
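On a finite space this shorthand can be unpacked literally: the event is the set of outcomes that $X$ maps to at least 60, and its probability is the measure of that set. A sketch with made-up numbers (a uniform measure on outcomes 0 through 100):

```python
omega = range(101)                   # outcomes 0..100 (illustrative)
P = {w: 1 / 101 for w in omega}      # uniform measure (an assumption)
X = lambda w: w                      # a real-valued random variable

event = {w for w in omega if X(w) >= 60}   # {omega : X(omega) >= 60}
prob = sum(P[w] for w in event)            # Pr(X >= 60) = 41/101
```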

### Defining the events in terms of the sample space

If $\Omega$ is countable, we almost always define $\mathcal F$ as the power set of $\Omega$, i.e. $\mathcal F=\mathbb P (\Omega)$, which is trivially a σ-algebra and the largest one that can be built on $\Omega$. We can therefore omit $\mathcal{F}$ and just write $(\Omega, P)$ to define the probability space.
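For a small finite $\Omega$, this power set can be enumerated explicitly. A sketch (`power_set` is an illustrative helper, not a standard library function):

```python
from itertools import combinations

def power_set(omega):
    """All 2**|Omega| subsets of a finite sample space -- the largest
    sigma-algebra that can be built on Omega."""
    items = list(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

F = power_set({"H", "T"})
print(len(F))  # 4, i.e. 2**2
```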

On the other hand, if $\Omega$ is uncountable and we use $\mathcal F=\mathbb P (\Omega)$ we get into trouble defining our probability measure $P$ because $\mathcal{F}$ is too 'huge', i.e. there will often be sets to which it will be impossible to assign a unique measure, giving rise to problems like the Banach–Tarski paradox. In this case, we have to use a smaller σ-algebra $\mathcal F$ (e.g. the Borel algebra of $\Omega$, which is the smallest σ-algebra that makes all open sets measurable).

### Conditional probability

Kolmogorov's definition of probability spaces gives rise to the natural concept of conditional probability. Every event $A$ with non-zero probability (that is, $P(A) > 0$) defines another probability measure

$P(B \vert A) = {P(B \cap A) \over P(A)}$

on the space. This is usually read as the "probability of B given A".
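The formula can be computed directly when events are sets of outcomes and $P$ is given by per-outcome weights (an illustrative representation). For a fair die, the probability of an even roll given that the roll exceeds 3:

```python
def conditional(P, B, A):
    """P(B | A) = P(B n A) / P(A), with events as sets of outcomes and
    P a dict of per-outcome probabilities (illustrative representation)."""
    measure = lambda E: sum(P[w] for w in E)
    if measure(A) == 0:
        raise ValueError("P(A) must be positive")
    return measure(B & A) / measure(A)

P = {w: 1 / 6 for w in range(1, 7)}   # fair six-sided die
even, gt3 = {2, 4, 6}, {4, 5, 6}
print(conditional(P, even, gt3))      # 2/3: the rolls {4, 6} out of {4, 5, 6}
```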

### Independence

Two events, $A$ and $B$, are said to be independent if $P(A \cap B) = P(A)P(B)$.

Two random variables, X and Y, are said to be independent if any event defined in terms of X is independent of any event defined in terms of Y. Formally, they generate independent σ-algebras, where two σ-algebras $G$ and $H$ that are sub-σ-algebras of $\mathcal F$ are said to be independent if any element of $G$ is independent of any element of $H$.
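For events, the definition is a one-line check. A sketch on two fair coin flips, where the product representation of the space is an assumption made for illustration:

```python
# Two fair coin flips: outcomes are ordered pairs, each with weight 1/4.
P = {(a, b): 0.25 for a in "HT" for b in "HT"}
measure = lambda E: sum(P[w] for w in E)

A = {w for w in P if w[0] == "H"}     # first flip is heads
B = {w for w in P if w[1] == "H"}     # second flip is heads

independent = abs(measure(A & B) - measure(A) * measure(B)) < 1e-12
print(independent)  # True: P(A n B) = 0.25 = 0.5 * 0.5
```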

The concept of independence is where probability theory departs from measure theory.

### Mutual exclusivity

Two events, $A$ and $B$, are said to be mutually exclusive or disjoint if $P(A \cap B) = 0$. (This is weaker than $A \cap B = \varnothing$, which is the definition of disjoint for sets.)

If $A$ and $B$ are disjoint events, then $P(A \cup B) = P(A) + P(B)$. This extends to a (finite or countably infinite) sequence of events. However, the probability of the union of an uncountable set of events is not the sum of their probabilities. For example, if Z is a normally distributed random variable, then P(Z=x) is 0 for any x, but P(Z is real)=1.
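Countable additivity can be illustrated with the disjoint events "the first head occurs on toss $n$" for a fair coin, which have probabilities $2^{-n}$ summing to 1; finite partial sums approach that limit from below. A quick numerical check:

```python
# P(first head on toss n) = 2**-n; the events are pairwise disjoint and
# their union ("a head eventually occurs") has probability sum 2**-n = 1.
partial = sum(2.0 ** -n for n in range(1, 60))   # exact value: 1 - 2**-59
print(abs(partial - 1.0) < 1e-12)  # True: the partial sums converge to 1
```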

The event $A \cap B$ is referred to as A AND B, and the event $A \cup B$ as A OR B.

## Examples

### First example

If the space concerns one flip of a fair coin, then the outcomes are heads and tails:

$\Omega = \{H,T\}$

The events are

• {H}: heads,
• {T}: tails,
• {}: neither heads nor tails, and
• {H,T}: either heads or tails.

So, $\mathcal F=\{\{H\},\{T\},\{\},\{H,T\}\}.$

There is a fifty percent chance of tossing either heads or tails: P({H}) = P({T}) = 0.5. The chance of tossing neither is zero: P({})=0, and the chance of tossing one or the other is one: P({H,T})=1.
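This space is small enough to write out in full. A sketch, using an illustrative dict/frozenset representation:

```python
# The fair-coin probability space (Omega, F, P), written out explicitly.
Omega = frozenset({"H", "T"})
F = [frozenset({"H"}), frozenset({"T"}), frozenset(), Omega]
weights = {"H": 0.5, "T": 0.5}
P = {E: sum(weights[w] for w in E) for E in F}

print(P[frozenset()])   # 0.0 -- neither heads nor tails
print(P[Omega])         # 1.0 -- one or the other
```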

### Second example

If 100 voters are to be drawn randomly from among all voters in California and asked whom they will vote for governor, then the set of all sequences of 100 Californian votes would be the sample space Ω.

The set of all sequences of 100 Californian voters in which at least 60 will vote for Schwarzenegger is identified with the "event" that at least 60 of the 100 chosen voters will so vote.

Then, $\mathcal F$ contains: (1) the set of all sequences of 100 where at least 60 vote for Schwarzenegger; (2) the set of all sequences of 100 where fewer than 60 vote for Schwarzenegger (the complement of (1)); (3) the sample space Ω as above; and (4) the empty set.

An example of a random variable is the number of voters who will vote for Schwarzenegger in the sample of 100.
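Pr(X ≥ 60) for this random variable can be estimated by simulation. In the sketch below, the support level of 0.55 is a made-up assumption and the names are illustrative:

```python
import random

random.seed(0)  # reproducible sketch

def X():
    """Number of Schwarzenegger voters in a random sample of 100,
    assuming (hypothetically) each voter votes for him with prob. 0.55."""
    return sum(random.random() < 0.55 for _ in range(100))

trials = 10_000
estimate = sum(X() >= 60 for _ in range(trials)) / trials  # ~ Pr(X >= 60)
```

With these assumptions the estimate is roughly the normal-tail probability of a Binomial(100, 0.55) variable exceeding 59.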

## Bibliography

• Pierre Simon de Laplace (1812) Analytical Theory of Probability
The first major treatise blending calculus with probability theory, originally in French: Théorie Analytique des Probabilités.
• Andrei Nikolajevich Kolmogorov (1950) Foundations of the Theory of Probability
The modern measure-theoretic foundation of probability theory; the original German version (Grundbegriffe der Wahrscheinlichkeitsrechnung) appeared in 1933.
• Harold Jeffreys (1939) The Theory of Probability
An empiricist, Bayesian approach to the foundations of probability theory.
• Edward Nelson (1987) Radically Elementary Probability Theory
Discrete foundations of probability theory, based on nonstandard analysis and internal set theory. Downloadable from http://www.math.princeton.edu/~nelson/books.html
• Patrick Billingsley (1979) Probability and Measure
John Wiley and Sons, New York, Toronto, London.
• Henk Tijms (2004) Understanding Probability
A lively introduction to probability theory for the beginner, Cambridge Univ. Press.