3.2 Probability

We will assign a real number \(\mathbb{P}(A)\) to every event \(A\), called the probability of \(A\). We also call \(\mathbb{P}\) a probability distribution or a probability measure. To qualify as a probability \(\mathbb{P}\) must satisfy three axioms.

Definition: A function \(\mathbb{P}\) that assigns a real number \(\mathbb{P}(A)\) to each event \(A\) is a probability distribution if it satisfies the following three axioms:

  • \(\mathbb{P}(A)\geq 0\) for every event \(A\)
  • \(\mathbb{P}(\Omega)=1\)
  • If \(A_1,A_2,\dots\) are disjoint then \(\mathbb{P}(\cup_{i=1}^\infty A_i)=\sum_{i=1}^\infty\mathbb{P}(A_i)\)

One can derive many properties from the axioms, such as:

  • \(\mathbb{P}(\emptyset)=0\)
  • \(A\subset B \Rightarrow \mathbb{P}(A) \leq \mathbb{P}(B)\)
  • \(0\leq \mathbb{P}(A) \leq 1\)
  • \(\mathbb{P}(A^c)= 1-\mathbb{P}(A)\)
  • \(A\cap B = \emptyset \Rightarrow \mathbb{P}(A\cup B) = \mathbb{P}(A)+\mathbb{P}(B)\)
  • In general \(\mathbb{P}(A\cup B) = \mathbb{P}(A)+\mathbb{P}(B)- \mathbb{P}(A\cap B)\)
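For example, the complement rule in the fourth bullet follows directly from the axioms: \(A\) and \(A^c\) are disjoint and \(A\cup A^c=\Omega\), so

\[ 1 = \mathbb{P}(\Omega) = \mathbb{P}(A\cup A^c) = \mathbb{P}(A)+\mathbb{P}(A^c), \]

which gives \(\mathbb{P}(A^c)=1-\mathbb{P}(A)\).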

Consider the experiment of throwing two dice. The sample space is

\[ \Omega = \{(\omega_1,\omega_2): \omega_1=1,\dots,6, \omega_2=1,\dots,6\} \]

where \(\omega_1\) is the number shown on the first die and \(\omega_2\) is the number shown on the second die. There are 36 elements in the sample space and, if the dice are fair, each has probability 1/36. Consider the event that the sum of the dice is seven. The event of interest is

\[ A=\{(\omega_1,\omega_2):\omega_1+\omega_2=7\}=\{(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)\} \]

and thus \(\mathbb{P}(A)=6/36=1/6\).
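A quick way to check this kind of calculation is to enumerate the sample space directly. The following Python sketch (the variable names are ours, chosen for illustration) counts the outcomes in \(A\):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (omega_1, omega_2) of throwing two fair dice.
omega = list(product(range(1, 7), repeat=2))

# Event A: the sum of the two dice is seven.
A = [w for w in omega if w[0] + w[1] == 7]

print(Fraction(len(A), len(omega)))  # 1/6
```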

Consider now the event of observing at least one six. The event of interest is now

\[ A=\{(\omega_1,\omega_2): \omega_1=6 \text{ or } \omega_2=6\} \]

We can write \(A=A_1\cup A_2\) where \(A_1=\{(\omega_1,\omega_2): \omega_1=6 \}\) and \(A_2=\{(\omega_1,\omega_2): \omega_2=6 \}\). Notice that \(A_1\cap A_2 =\{(6,6)\}\) and that \(\mathbb{P}(A_1)=\mathbb{P}(A_2)=6/36\) whilst \(\mathbb{P}(A_1\cap A_2)=1/36\). Thus

\[ \mathbb{P}(A)=\mathbb{P}(A_1\cup A_2)=\mathbb{P}(A_1)+\mathbb{P}(A_2)-\mathbb{P}(A_1\cap A_2)=6/36+ 6/36 -1/36 =11/36. \]
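The same enumeration verifies the inclusion-exclusion computation; again, this is just an illustrative sketch:

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))
A1 = {w for w in omega if w[0] == 6}        # first die shows a six
A2 = {w for w in omega if w[1] == 6}        # second die shows a six

def p(event):
    return Fraction(len(event), len(omega))

print(p(A1 | A2))                           # 11/36, counted directly
print(p(A1) + p(A2) - p(A1 & A2))           # 11/36, via inclusion-exclusion
```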

3.2.1 Independence

If we flip a fair coin twice, then the probability of two heads is \(\frac{1}{2}\times \frac{1}{2}=\frac{1}{4}\). We multiply the probabilities because we regard the two tosses as independent.

Definition: Two events \(A\) and \(B\) are independent if

\[ \mathbb{P}(A,B)=\mathbb{P}(A)\mathbb{P}(B) \]

Here \(\mathbb{P}(A,B)\) is shorthand for \(\mathbb{P}(A\cap B)\). Independence can arise in two distinct ways. Sometimes we explicitly assume that two events are independent (for example, in the toss of two coins). In other instances we derive independence by verifying that \(\mathbb{P}(A,B)=\mathbb{P}(A)\mathbb{P}(B)\) holds.

Suppose that \(A\) and \(B\) are disjoint events, each with positive probability. Can they be independent? No. This follows since \(\mathbb{P}(A)\mathbb{P}(B)>0\) yet \(\mathbb{P}(A,B)=\mathbb{P}(\emptyset)=0.\)

Consider the toss of a fair die. Let \(A=\{2,4,6\}\) and \(B=\{1,2,3,4\}\). Then \(A\cap B = \{2,4\}\), \(\mathbb{P}(A,B)=2/6=1/3\) and \(\mathbb{P}(A)\mathbb{P}(B)=(1/2)\times (2/3)=1/3\), so \(A\) and \(B\) are independent.
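This check is easy to reproduce by counting; a minimal sketch:

```python
from fractions import Fraction

omega = set(range(1, 7))   # one toss of a fair die
A = {2, 4, 6}
B = {1, 2, 3, 4}

def p(event):
    return Fraction(len(event), len(omega))

print(p(A & B))            # 1/3
print(p(A) * p(B))         # 1/3, equal, so A and B are independent
```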

Toss a fair coin 10 times. Let \(A\) be the event that at least one head occurs. Let \(T_j\) be the event that tails occurs on the \(j\)th toss. Then

\[\begin{align} \mathbb{P}(A) & = 1 - \mathbb{P}(A^c)\\ & = 1- \mathbb{P}(\text{all tails})\\ & = 1- \mathbb{P}(T_1,T_2,\dots,T_{10})\\ & = 1- \mathbb{P}(T_1)\times\cdots\times\mathbb{P}(T_{10})\\ & = 1- (1/2)^{10}\approx 0.999 \end{align}\]
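The arithmetic, plus an optional Monte Carlo check (the simulation is our addition, not part of the argument):

```python
import random

# Exact probability from the derivation above.
print(1 - (1 / 2) ** 10)          # 0.9990234375

# Monte Carlo check: simulate 10 fair coin tosses many times and count
# how often at least one head appears.
trials = 100_000
hits = sum(any(random.random() < 0.5 for _ in range(10)) for _ in range(trials))
print(hits / trials)              # close to 0.999
```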

3.2.2 Conditional Probabilities

Definition: The conditional probability of \(A\) given that \(B\) has occurred, assuming \(\mathbb{P}(B)>0\), is

\[ \mathbb{P}(A|B)=\frac{\mathbb{P}(A,B)}{\mathbb{P}(B)} \]

Example A medical test for a disease \(D\) has outcomes + and -. The probabilities are

\[\begin{align} \mathbb{P}(D,+)&=0.009\\ \mathbb{P}(D,-)&=0.001 \\ \mathbb{P}(D^c,+)&=0.099\\ \mathbb{P}(D^c,-)&=0.891 \end{align}\]

From the definition of conditional probability

\[ \mathbb{P}(+|D)=\frac{\mathbb{P}(+,D)}{\mathbb{P}(D)}=\frac{0.009}{0.009+0.001}=0.9 \]

and \[ \mathbb{P}(-|D^c)=\frac{\mathbb{P}(-,D^c)}{\mathbb{P}(D^c)}=\frac{0.891}{0.891+0.099}\approx 0.9 \]

The test is fairly accurate: sick people test positive 90 percent of the time and healthy people test negative about 90 percent of the time. Suppose you go for a test and get a positive result. What is the probability you have the disease?

\[ \mathbb{P}(D|+)=\frac{\mathbb{P}(+,D)}{\mathbb{P}(+)}=\frac{0.009}{0.009+0.099}\approx 0.08 \]
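The whole example can be reproduced from the joint probabilities; a small sketch (the dictionary labels are just names we chose for the four outcomes):

```python
# Joint probabilities from the table above, keyed by (disease status, test result).
joint = {("D", "+"): 0.009, ("D", "-"): 0.001,
         ("Dc", "+"): 0.099, ("Dc", "-"): 0.891}

p_D = joint[("D", "+")] + joint[("D", "-")]       # P(D) = 0.010
p_pos = joint[("D", "+")] + joint[("Dc", "+")]    # P(+) = 0.108

print(joint[("D", "+")] / p_D)                    # P(+|D) ≈ 0.9
print(joint[("D", "+")] / p_pos)                  # P(D|+) ≈ 0.083
```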

3.2.3 Conditional Probabilities and Independence

Theorem If \(A\) and \(B\) are independent events then \(\mathbb{P}(A|B)=\mathbb{P}(A)\). Also for any pair of events \(A\) and \(B\) \[ \mathbb{P}(A,B)=\mathbb{P}(A|B)\mathbb{P}(B)=\mathbb{P}(B|A)\mathbb{P}(A) \]

Example Draw two cards from a deck without replacement. Let \(A\) be the event that the first draw is the ace of clubs and let \(B\) be the event that the second draw is the queen of diamonds. Then \(\mathbb{P}(A,B)=\mathbb{P}(B|A)\mathbb{P}(A)=(1/51)\times (1/52)\).
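Enumerating the ordered pairs of draws gives the same answer; a sketch with arbitrary card labels (0 standing in for the ace of clubs, 1 for the queen of diamonds):

```python
from fractions import Fraction
from itertools import permutations

deck = range(52)                        # 0 = ace of clubs, 1 = queen of diamonds (arbitrary labels)
draws = list(permutations(deck, 2))     # 52 * 51 ordered draws without replacement

favourable = [d for d in draws if d == (0, 1)]
print(Fraction(len(favourable), len(draws)))   # 1/2652 = (1/51) * (1/52)
```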