#  Probability: Definitions and Rules 

This section is not a comprehensive introduction to probability; it is a reminder of the basic definitions and rules. Every statistics textbook has a chapter on probability that is more complete than this section. We encourage readers who have not encountered the concept of probability to find a good introductory chapter, and we offer a reference at the end of this section.

In the following, we start with some basic definitions illustrated with three examples.

**Random phenomenon**: a phenomenon whose individual outcomes are uncertain; for example:
1. Roll a die and record the outcome. We do not know before rolling the die what the outcome will be.
2. Record the number of boys among 100 births in a Chicago hospital. The outcome is uncertain because we do not know whether the number of boys will be 50, 40, or something else.
3. The set of birthdays in a group of 30 people. The outcome is uncertain because we do not know beforehand what the birthdays will be (whether some people share a birthday or all are different).

**Sample space (denoted by $S$)**: the set of all possible outcomes of a phenomenon; in the above examples:
1. $S$ is the set of integers from 1 to 6: $S = \{1,2,3,4,5,6\}$.
2. $S$ is the set of integers from 0 to 100, since the possible outcomes for the number of boys are $0, 1, 2, \ldots, 100$.
3. $S$ is the set of all possible assignments of 30 birthdates (one for each person).

**Event (denoted by $A$ or $B$ here)**: A set of outcomes of a random phenomenon; for example:
1. Rolling an even number: $A = \{2,4,6\}$.
2. $A$ is the event that fewer than half of the babies are boys: the set of integers from 0 to 49.
3. $A$ is the event that at least two people in the group share a birthday.

**Mutually exclusive events**: Events $A$ and $B$ are mutually exclusive (or disjoint) if they have no outcomes in common. Examples:
1. $A$ is as above (rolling an even number) and $B$ is rolling a 3.
2. $A$ is as above (fewer than half of the babies are boys) and $B$ is the event that the number of boys is between 60 and 70.
3. $A$ is as above (at least two people share a birthday) and $B$ is the event that there is a birthday to celebrate on every day in the month of June. Since June has 30 days and the group has 30 people, $B$ requires all 30 birthdays to be different, so it cannot occur together with $A$.


**Complement of an event**: The complement of an event $A$ is the event that $A$ does not occur, denoted by $A^C$. 

<img align="center" src="./Img_Prob_Def/Complement.png" width="200"/>

For the events $A$ defined above:
1. If $A$ is rolling an even number, then $A^C$ is rolling an odd number: $A^C = \{1,3,5\}$.
2. If $A$ is the event that fewer than half of the babies are boys, then $A^C$ is the event that at least half of the babies are boys: the set of integers from 50 to 100.
3. If $A$ is the event that at least two people share a birthday, then $A^C$ is the event that there are no shared birthdays.
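
As a minimal sketch, the die example can be written with Python's built-in sets. The variable names (`S`, `A`, `B`, `A_complement`) are chosen here for illustration and simply mirror the notation in the text.

```python
# Die example: sample space, an event, its complement, and a disjoint event.
S = {1, 2, 3, 4, 5, 6}   # sample space
A = {2, 4, 6}            # event: rolling an even number
B = {3}                  # event: rolling a 3

A_complement = S - A     # outcomes in S that are not in A

print(A_complement)      # the odd outcomes 1, 3, 5
print(A.isdisjoint(B))   # True: A and B are mutually exclusive
```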



**Compound events**: Events built from combinations of other events; for example, union and intersection.

**Union:** ($A$ or $B$) = ($A\cup B$): set of all outcomes in $A$, or in $B$, or in both.

<img align="center" src="./Img_Prob_Def/Union.png" width="200"/>



**Intersection:**  ($A$ and $B$) = ($A\cap B$): set of all outcomes that are in $A$ and in $B$.

<img align="center" src="./Img_Prob_Def/Intersection.png" width="200"/>
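
Union and intersection map directly onto Python's set operators. The sketch below reuses the die example; the second event `B` (rolling a number greater than 3) is an assumption introduced only for this illustration.

```python
# Union and intersection of two events for one roll of a die.
A = {2, 4, 6}   # rolling an even number
B = {4, 5, 6}   # hypothetical event: rolling a number greater than 3

union = A | B          # outcomes in A, or in B, or in both
intersection = A & B   # outcomes in both A and B

print(union == {2, 4, 5, 6})    # True
print(intersection == {4, 6})   # True
```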


## Definition of Probability

Probabilities describe how likely events are and so probability models consist of:
- A list of possible outcomes (sample space)
- An assignment of probabilities $\text{P}$ to each possible outcome

The **frequentist interpretation of the probability** of an event $A$, $\text{P}(A)$, is the long-run relative frequency of the event $A$. Suppose you are interested in the probability of "Heads" when tossing a coin. In the frequentist interpretation, this probability is the limit of the relative frequency of "Heads" as the coin is tossed repeatedly. Note that while you can imagine repeating the coin toss a large number of times (and some people have done it!), there are other events for which the intuition behind frequentist probabilities is not as evident. For example, what is the probability that it will rain next Sunday? This is where the **Bayesian interpretation** of probability, based on a subjective degree of belief, is more natural. In the Bayesian world, two people could have different viewpoints and assign different probabilities.
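
The long-run relative frequency can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency_of_heads` is hypothetical, the coin is assumed to be fair, and a pseudo-random generator stands in for real tosses.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return the fraction of heads."""
    rng = random.Random(seed)  # seeded so that the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The relative frequency should settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For small `n` the relative frequency can be far from 0.5; it is only in the long run that it stabilizes.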

Note that the rules below hold under both interpretations.

## Basic Probability Rules

Given a sample space $S$ and events $A, B \subseteq S$, we have:

- $0 \le \text{P}(A) \le 1$

- $\text{P}(S) = 1$

-  $\text{P}(A^C) = 1 - \text{P}(A)$

- $\text{P}(A \cup B) = \text{P}(A) + \text{P}(B) - \text{P}(A \cap B)$

- **Equally likely outcomes**:
  
$$\text{P}(A)=\frac{\text{ Number of outcomes in } A}{\text{ Total number of outcomes}}$$

The last rule refers to situations where all outcomes of an experiment are equally likely (for example, roll a fair die).
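
For a fair die, the equally likely outcomes formula makes the other rules easy to check numerically. The sketch below uses exact fractions; the helper `prob` and the second event `B` are assumptions made for this illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
A = {2, 4, 6}            # rolling an even number
B = {4, 5, 6}            # hypothetical event: rolling a number greater than 3

def prob(event, sample_space=S):
    """P(event) when all outcomes in sample_space are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

print(prob(A))                                          # 1/2
print(prob(S) == 1)                                     # True
print(prob(S - A) == 1 - prob(A))                       # True: complement rule
print(prob(A | B) == prob(A) + prob(B) - prob(A & B))   # True: addition rule
```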


## Conditional Probability
If $\text{P}(B) \ne 0$, the conditional probability of event $A$ given that $B$ has occurred, denoted by $\text{P}(A|B)$, is defined by

$$ \text{P}(A|B) = \frac{\text{P}(A \text{ and } B)}{\text{P}(B)}$$

<img align="center" src="./Img_Prob_Def/Conditional_Prob.png" width="400"/>

Example:
- Select one subject at random in the US;
- $A$ is the event that the subject read a book last week;
- $B$ is the event that the subject is a college student;
- Consider $\text{P}(A|B)$ versus $\text{P}(A)$: the fraction of college students who read a book last week is likely different from the fraction of the US population who did so.

**Multiplication rule**: $\text{P}(A \text{ and } B) = \text{P}(A|B) \text{P}(B)$. Note that this follows directly from the definition of conditional probability.
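
The definition and the multiplication rule can be checked on a fair die. This is a sketch under the equally likely outcomes assumption; the helpers `prob` and `cond_prob` and the event `B` are hypothetical names introduced here.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}            # rolling an even number
B = {4, 5, 6}            # hypothetical event: rolling a number greater than 3

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) != 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / prob(b)

print(cond_prob(A, B))                             # 2/3
print(prob(A & B) == cond_prob(A, B) * prob(B))    # True: multiplication rule
```

Knowing that the roll is greater than 3 raises the probability of an even number from 1/2 to 2/3, because two of the three remaining outcomes are even.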

## Independence

Events are independent when the occurrence of one does not affect the probability of the others. Events $A$ and $B$ with nonzero probabilities are called independent if $\text{P}(A|B) = \text{P}(A)$ (or equivalently, $\text{P}(B|A) = \text{P}(B)$).

Equivalent condition for **independence**: 

$$\text{P}(A \text{ and } B) = \text{P}(A) \text{P}(B)$$
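
The product condition gives a direct numerical test for independence. In the sketch below, the events `B` and `C` and the helper names are assumptions chosen to show one independent pair and one dependent pair on a fair die.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}            # rolling an even number
B = {1, 2, 3, 4}         # hypothetical event: rolling at most 4
C = {4, 5, 6}            # hypothetical event: rolling more than 3

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))

def independent(a, b):
    """Check the product condition P(a and b) == P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

print(independent(A, B))   # True:  1/3 == 1/2 * 2/3
print(independent(A, C))   # False: 1/3 != 1/2 * 1/2
```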

## Bayes' Theorem

The following property follows directly from the definition of conditional probability and the multiplication rule. If $\text{P}(B) \neq 0$,

$$\text{P}(A|B)  = \frac{\text{P}(B|A) \text{P}(A)}{\text{P}(B)}$$

This is one of the most important rules in statistics and data science because it describes statistical learning: it provides a way to update a belief (probability) given additional evidence (data).
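
As a quick sanity check, Bayes' theorem can be verified on a fair die by computing $\text{P}(A|B)$ in two ways. The function `bayes` and the event `B` are hypothetical names used only in this sketch.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}            # rolling an even number
B = {4, 5, 6}            # hypothetical event: rolling a number greater than 3

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A), and P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_b_given_a * p_a / p_b

p_b_given_a = prob(A & B) / prob(A)                  # P(B|A) from the definition
p_a_given_b = bayes(p_b_given_a, prob(A), prob(B))   # P(A|B) via Bayes' theorem

print(p_a_given_b)                                   # 2/3
print(p_a_given_b == prob(A & B) / prob(B))          # True: matches the definition
```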

### The solution to the birthday problem

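We will use the **equally likely outcomes** formula from the Basic Probability Rules above, assuming that each of the 365 days of the year is an equally likely birthday (leap years are ignored). Note that, for $n$ random subjects, the total number of outcomes (the number of possible combinations of birthdays) is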

$$365^n.$$

The number of outcomes that lead to a set of distinct birthdays is

$$365\times364\times \cdots\times (365-n+1).$$

The intuition comes from counting the assignments in which all birthdays are distinct, as follows:
- suppose you look at people sequentially (one by one);
- the first person can have any of the 365 birthdays without leading to matched birthdays;
- the second person can have any birthday except that of the first person: 364 possibilities;
- following this pattern, the $n$-th person can have any birthday except the $(n-1)$ distinct birthdays of the people already considered: $(365-n+1)$ possibilities.

So the probability of having $n$ distinct birthdays is:

$$\frac{365\times364\times \cdots\times (365-n+1)}{365^n}$$

The complement of this event is the event of interest (at least two people share a birthday) and so the probability of interest is:

$$\text{P}_n ~=~ 1-\frac{365\times364\times \cdots\times (365-n+1)}{365^n}$$
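
The formula for $\text{P}_n$ is straightforward to evaluate numerically. The sketch below multiplies the ratios one person at a time, mirroring the sequential counting argument; the function name `prob_shared_birthday` and the `days` parameter are assumptions introduced here.

```python
def prob_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), assuming `days`
    equally likely birthdays and independent people."""
    p_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_distinct *= (days - k) / days
    return 1 - p_distinct

print(round(prob_shared_birthday(23), 3))   # 0.507
print(round(prob_shared_birthday(30), 3))   # 0.706
```

For the group of 30 people in the running example, a shared birthday is therefore more likely than not. Multiplying ratios step by step also avoids forming the very large integer $365^n$.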

**Reference.**

1. OpenIntro Statistics (Chapter 3 on Probability). Available for download at https://www.openintro.org/book/os/.