Boole's inequality

URL: https://en.wikipedia.org/wiki/Bonferroni_bound

Inequality applying to probability spaces
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. It thus provides an upper bound on the probability of a union of countably many events in terms of their individual probabilities. Boole's inequality is named for its discoverer, George Boole.[1]

Formally, for a countable set of events $A_1, A_2, A_3, \dots$, we have

$$\mathbb{P}\left(\bigcup_{i=1}^{\infty} A_i\right) \leq \sum_{i=1}^{\infty} \mathbb{P}(A_i).$$

In measure-theoretic terms, Boole's inequality follows from the fact that a measure (and certainly any probability measure) is σ-subadditive. Thus Boole's inequality holds not only for probability measures $\mathbb{P}$, but more generally when $\mathbb{P}$ is replaced by any measure.
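As a quick numerical sanity check (an illustrative sketch, not part of the article; the events below are arbitrary choices), the union bound can be observed on overlapping intervals sampled uniformly from [0, 1):

```python
import random

random.seed(0)

# Three overlapping events on a uniform draw from [0, 1):
# A1 = [0, 0.3), A2 = [0.2, 0.5), A3 = [0.4, 0.6)
events = [(0.0, 0.3), (0.2, 0.5), (0.4, 0.6)]

trials = 100_000
union_hits = 0                          # trials where at least one event occurs
individual_hits = [0] * len(events)     # per-event occurrence counts

for _ in range(trials):
    x = random.random()
    in_any = False
    for k, (lo, hi) in enumerate(events):
        if lo <= x < hi:
            individual_hits[k] += 1
            in_any = True
    if in_any:
        union_hits += 1

p_union = union_hits / trials                      # estimates P(A1 ∪ A2 ∪ A3) = 0.6
sum_p = sum(h / trials for h in individual_hits)   # estimates sum of P(A_i) = 0.8

# Boole's inequality: the union probability never exceeds the sum.
assert p_union <= sum_p
```

Here the true union probability is 0.6 while the sum of individual probabilities is 0.8, so the bound holds with slack because the events overlap; for disjoint events it would be tight.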

Proof

Proof using induction

Boole's inequality may be proved for finite collections of $n$ events using induction.

For the base case $n = 1$, trivially

$$\mathbb{P}(A_1) \leq \mathbb{P}(A_1).$$

For the inductive step, assume that the inequality holds for some $n$:

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \leq \sum_{i=1}^{n} \mathbb{P}(A_i).$$

Since

$$\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B),$$

and because the union operation is associative, we have

$$\mathbb{P}\left(\bigcup_{i=1}^{n+1} A_i\right) = \mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) + \mathbb{P}(A_{n+1}) - \mathbb{P}\left(\left(\bigcup_{i=1}^{n} A_i\right) \cap A_{n+1}\right).$$

Since

$$\mathbb{P}\left(\left(\bigcup_{i=1}^{n} A_i\right) \cap A_{n+1}\right) \geq 0$$

by the first axiom of probability, we have

$$\mathbb{P}\left(\bigcup_{i=1}^{n+1} A_i\right) \leq \mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) + \mathbb{P}(A_{n+1}),$$

and therefore, by the inductive hypothesis,

$$\mathbb{P}\left(\bigcup_{i=1}^{n+1} A_i\right) \leq \sum_{i=1}^{n} \mathbb{P}(A_i) + \mathbb{P}(A_{n+1}) = \sum_{i=1}^{n+1} \mathbb{P}(A_i).$$

Proof without using induction

Let events $A_1, A_2, A_3, \dots$ in our probability space be given. The countable additivity of the measure $\mathbb{P}$ states that if $B_1, B_2, B_3, \dots$ are pairwise disjoint events, then

$$\mathbb{P}\left(\bigcup_i B_i\right) = \sum_i \mathbb{P}(B_i).$$

Set

$$B_i := A_i - \bigcup_{j=1}^{i-1} A_j.$$

Then $B_1, B_2, B_3, \dots$ are pairwise disjoint. We claim that

$$\bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} B_i.$$

One inclusion is clear: since $B_i \subset A_i$ for all $i$, we have $\bigcup_{i=1}^{\infty} B_i \subset \bigcup_{i=1}^{\infty} A_i$.

For the other inclusion, let $x \in \bigcup_{i=1}^{\infty} A_i$ be given, and write $k$ for the smallest positive integer such that $x \in A_k$. Then $x \in A_k - \bigcup_{j=1}^{k-1} A_j = B_k$, so $x \in \bigcup_{i=1}^{\infty} B_i$, and therefore $\bigcup_{i=1}^{\infty} A_i \subset \bigcup_{i=1}^{\infty} B_i$.

Consequently,

$$\mathbb{P}\left(\bigcup_i A_i\right) = \mathbb{P}\left(\bigcup_i B_i\right) = \sum_i \mathbb{P}(B_i) \leq \sum_i \mathbb{P}(A_i),$$

where the last inequality holds because $B_i \subset A_i$ implies $\mathbb{P}(B_i) \leq \mathbb{P}(A_i)$ for all $i$.
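The disjointification $B_i := A_i - \bigcup_{j<i} A_j$ at the heart of this proof is easy to carry out for finite events represented as sets (an illustrative sketch; the example events are arbitrary choices, not from the article):

```python
from itertools import combinations

# Events as finite sets of sample points.
A = [{1, 2, 3}, {3, 4}, {2, 4, 5}]

# B_i := A_i \ (A_1 ∪ ... ∪ A_{i-1}), tracked incrementally via `seen`.
B = []
seen = set()
for Ai in A:
    B.append(Ai - seen)
    seen |= Ai

# The B_i are pairwise disjoint ...
assert all(Bi & Bj == set() for Bi, Bj in combinations(B, 2))
# ... and have the same union as the A_i.
assert set().union(*B) == set().union(*A)
```

For the events above this yields `B = [{1, 2, 3}, {4}, {5}]`: each $B_i$ keeps only the points of $A_i$ not already covered by an earlier event.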

Bonferroni inequalities

Boole's inequality for a finite number of events may be generalized to certain upper and lower bounds on the probability of finite unions of events.[2] These bounds are known as Bonferroni inequalities, after Carlo Emilio Bonferroni; see Bonferroni (1936).

Let

$$S_1 := \sum_{i=1}^{n} \mathbb{P}(A_i), \quad S_2 := \sum_{1 \leq i_1 < i_2 \leq n} \mathbb{P}(A_{i_1} \cap A_{i_2}), \quad \ldots, \quad S_k := \sum_{1 \leq i_1 < \cdots < i_k \leq n} \mathbb{P}(A_{i_1} \cap \cdots \cap A_{i_k})$$

for all integers $k$ in $\{1, \ldots, n\}$.

Then, when $K \leq n$ is odd,

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \geq \mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j-1} S_j$$

holds, and when $K \leq n$ is even,

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \leq \mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j-1} S_j$$

holds.

The inequalities follow from the inclusion–exclusion principle, and Boole's inequality is the special case $K = 1$. Since the proof of the inclusion–exclusion principle requires only the finite additivity (and nonnegativity) of $\mathbb{P}$, the Bonferroni inequalities hold more generally when $\mathbb{P}$ is replaced by any finite content, in the sense of measure theory.
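The alternating pattern of upper and lower bounds can be verified exactly on a small finite sample space (an illustrative sketch; the three events below are arbitrary choices, not from the article):

```python
from functools import reduce
from itertools import combinations

# Finite uniform sample space {0, ..., 9}; each point has probability 1/10.
omega = set(range(10))
A = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}]

def prob(event):
    return len(event) / len(omega)

# S[k-1] = S_k: sum of P(A_{i1} ∩ ... ∩ A_{ik}) over all k-subsets.
n = len(A)
S = [sum(prob(reduce(set.intersection, combo))
         for combo in combinations(A, k))
     for k in range(1, n + 1)]

p_union = prob(set().union(*A))   # exact P(A1 ∪ A2 ∪ A3)

for K in range(1, n + 1):
    partial = sum((-1) ** (j - 1) * S[j - 1] for j in range(1, K + 1))
    if K % 2 == 1:
        assert partial >= p_union   # odd K: truncated sum is an upper bound
    else:
        assert partial <= p_union   # even K: truncated sum is a lower bound
```

With $K = n$ the partial sum is the full inclusion–exclusion formula and both bounds meet the exact union probability.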

Proof for odd K

Let $E = \bigcap_{i=1}^{n} B_i$, where $B_i \in \{A_i, A_i^c\}$ for each $i = 1, \dots, n$. The events $E$ of this form partition the sample space, and for each $E$ and every $i$, $E$ is either contained in $A_i$ or disjoint from it.

If $E = \bigcap_{i=1}^{n} A_i^c$, then $E$ contributes 0 to both sides of the inequality.

Otherwise, assume $E$ is contained in exactly $L$ of the $A_i$. Then $E$ contributes exactly $\mathbb{P}(E)$ to the right side of the inequality, while it contributes

$$\sum_{j=1}^{K} (-1)^{j-1} \binom{L}{j} \mathbb{P}(E)$$

to the left side of the inequality. By Pascal's rule, this is equal to

$$\sum_{j=1}^{K} (-1)^{j-1} \left( \binom{L-1}{j-1} + \binom{L-1}{j} \right) \mathbb{P}(E),$$

which telescopes to

$$\left( 1 + \binom{L-1}{K} \right) \mathbb{P}(E) \geq \mathbb{P}(E),$$

using that $K$ is odd.

Thus the inequality holds for all events $E$ of this form, and so, summing over $E$, we obtain the desired inequality:

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \geq \mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right).$$

The proof for even $K$ is nearly identical.[3]

Example

Suppose that you are estimating five parameters based on a random sample, and that you can control the confidence level of each estimate separately. If you want all five estimates to be good simultaneously with probability at least 95%, how good should each individual estimate be?

Making each estimate good with probability 95% is not enough, because "all five are good" is a subset of each event "estimate i is good". Boole's inequality offers a solution: passing to the complement of "all five are good" turns the requirement into

P(at least one estimate is bad) ≤ P(A1 is bad) + P(A2 is bad) + P(A3 is bad) + P(A4 is bad) + P(A5 is bad) ≤ 0.05.

One way to achieve this is to make each term on the right equal to 0.05/5 = 0.01, that is, 1%. In other words, guaranteeing each estimate is good with probability 99% (for example, by constructing a 99% confidence interval) ensures that all five estimates are simultaneously good with probability at least 95%. This is called the Bonferroni method of simultaneous inference.
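The arithmetic of this correction is simple enough to state as a sketch (illustrative only; the variable names are ours, not from the article):

```python
# Bonferroni method: to have all m estimates good simultaneously with
# overall confidence 1 - alpha, make each individual estimate good with
# confidence 1 - alpha/m.
m = 5          # number of parameters being estimated
alpha = 0.05   # allowed overall probability that anything goes wrong

per_estimate_alpha = alpha / m                     # 0.01, i.e. 1% per estimate
per_estimate_confidence = 1 - per_estimate_alpha   # 0.99, i.e. a 99% interval

# Union bound: P(at least one bad) <= m * (alpha / m) = alpha.
assert abs(m * per_estimate_alpha - alpha) < 1e-12
```

The method is conservative: the union bound ignores any overlap between the "bad" events, so the true simultaneous confidence is at least, and often above, 95%.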


References
  1. ^ Boole, George (1847). The Mathematical Analysis of Logic. Philosophical Library. ISBN 9780802201546.
  2. ^ Casella, George; Berger, Roger L. (2002). Statistical Inference. Duxbury. pp. 11–13. ISBN 0-534-24312-6.
  3. ^ Venkatesh, Santosh (2012). The Theory of Probability. Cambridge University Press. pp. 94–99, 113–115. ISBN 978-0-534-24312-8.

Other related articles

This article incorporates material from Bonferroni inequalities on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.