In this post I’m going to prove that the following four statements are equivalent given the ZF axioms:
(1) Hausdorff Maximal Principle (HMP): Every poset has a maximal chain.
(2) Zorn’s lemma: If $X$ is a poset and every chain has an upper bound in $X$, then $X$ has a maximal element.
(3) Well-ordering theorem (WO): Every nonempty set $X$ can be well ordered.
(4) Axiom of choice (AC): if $\{X_\alpha : \alpha \in A\}$ is a nonempty collection of nonempty sets, then $\prod_{\alpha \in A} X_\alpha$ is nonempty.
I’ll prove this by first showing that $1\iff 2$, and then $2\implies 3\implies 4 \implies 2$. My proofs will pretty closely follow those given by Folland in Chapter 0 of Real Analysis and I’m also using the definitions given there.
For all of this I’ll take $(X,\leq)$ to be a nonempty poset. I’m not very interested in special cases for empty posets so I won’t be worrying about those.
1: HMP $\iff$ Zorn
Pf: Suppose the HMP is true, and assume every chain in $X$ has an upper bound. By the HMP there is a maximal chain in $X$, say $C$, and by assumption $C$ has an upper bound, say $u$. Suppose there is a $v \in X$ such that $u \leq v \wedge u \neq v$. Then this would mean that $C^* := C \cup \{v\}$ is a chain, and $C \subset C^*$. But $C$ was a maximal chain so that’s not possible. Therefore there is no such $v$, and $u$ is a maximal element in $X$.
Now suppose Zorn’s lemma is true. Let $\mathcal C$ be the set of all chains in $X$ (this is a subset of $\mathcal P(X)$ and so is a valid set); since $X$ is nonempty and singletons are totally ordered, $\mathcal C$ is guaranteed to be nonempty. $\mathcal C$ is a collection of subsets of $X$ and is therefore partially ordered by $\subseteq$. Let $C$ be a chain in $\mathcal C$, so $C$ is a collection of subsets of $X$, where each subset of $X$ is totally ordered by $\leq$, and $C$ itself is totally ordered by $\subseteq$. Consider
$$
U := \bigcup_{c \in C} c
$$
so $U \subseteq X$. First, note that for every $c \in C$ $c \subseteq U$ so $U$ is an upper bound for $C$. Furthermore, let $x, y \in U$. Then there are elements $c_x, c_y \in C$ such that $x \in c_x$ and $y \in c_y$. Since $c_x$ and $c_y$ are in $C$, which is totally ordered, it must be that either $c_x\subseteq c_y$ or $c_y \subseteq c_x$; WLOG take $c_x\subseteq c_y$. Then $x,y \in c_y\in \mathcal C$ so $c_y$ is a totally ordered subset of $X$, and therefore $x \leq y$ or $y \leq x$. Thus every arbitrary pair of elements in $U$ are comparable so $U$ is itself a totally ordered subset of $X$.
I am assuming Zorn’s lemma, and I just showed that every chain in the poset $(\mathcal C, \subseteq)$ has an upper bound. This means $\mathcal C$ has a maximal element, which is precisely the maximal chain in $X$ that the HMP asserts the existence of.
$\square$
2: Zorn $\implies$ Well Ordering
For this I’m just going to consider $X$ as a set without any particular partial order on it yet, since WO doesn’t mean that any given partial order can be extended to a well order. For instance, given Zorn, this does mean that $\mathbb R$ can be well ordered. But it certainly isn’t with respect to the usual order on it.
Pf of WO theorem: let $\mathcal W$ be the collection of all well orderings of subsets of $X$. This means elements of $\mathcal W$ are things like $w = (E, \leq)$ for $E\subseteq X$ and $E$ is well ordered by $\leq$. $\leq$ is a binary relation and corresponds to a subset of $E\times E$, and $E\times E \subseteq X\times X$. This means
$$
\mathcal W \subseteq \mathcal P(X\times X)
$$
and since $X$ is a valid set this means $\mathcal W$ is too. Now define a binary relation $R$ on $\mathcal W$ where for $w_1,w_2\in \mathcal W$ with $w_i = (E_i,leq_i)$, $w_1 R w_2$ iff
- $w_2$ extends $w_1$: $E_1\subseteq E_2$ and $\leq_1$ and $\leq_2$ agree on $E_1$
- all elements of $E_2\backslash E_1$ are upper bounds w.r.t. $\leq_2$ for all elements of $E_1$, i.e. for any $x \in E_2\backslash E_1$ and $y \in E_1$, $y \leq_2 x$
Now I’ll show $\mathcal W$ is partially ordered by $R$ so let $w_1,w_2,w_3 \in \mathcal W$.
Reflexivity: $E_1\subseteq E_1$ and $\leq_1 = \leq_1$ on $E_1$; also $E_1\backslash E_1 = \emptyset$ so the other property holds vacuously. Therefore $w_1 R w_1$.
Antisymmetry: suppose $w_1 R w_2$ and $w_2 R w_1$. Then $E_1\subseteq E_2\subseteq E_1$ so $E_1=E_2$ and $\leq_1$ and $\leq_2$ agree everywhere they’re defined. This means $w_1=w_2$.
Transitivity. Suppose $w_1 R w_2$ and $w_2 R w_3$. Then $E_1\subseteq E_2\subseteq E_3$ so $E_1\subseteq E_3$, and since $\leq_3 = \leq_2$ on $E_2$, and $\leq_2 = \leq_1$ on $E_1\subseteq E_2$, this means $\leq_3 = \leq_1$ on $E_1$. Next, note $E_3\backslash E_1 = (E_3\backslash E_2) \cup (E_2\backslash E_1)$. Let $x \in E_3\backslash E_1$. If $x \in E_3\backslash E_2$ then $x$ is an upper bound for all of $E_2$, and therefore for $E_1$. If instead $x \in E_2\backslash E_1$ then by $w_1 R w_2$ $x$ is a UB for $E_1$. But either way $x$ is a UB for $E_1$, so $w_1 R w_3$.
That establishes that $R$ is a partial order on $\mathcal W$.
Next I want to apply Zorn, so let $C$ be a chain in $\mathcal W$, and define
$$
S = \{E \subseteq X : E \text{ is well ordered by an element of }\mathcal W\}
$$
and note that by the first part of the definition of $R$, $S$ is totally ordered by $\subseteq$.
I need to produce a $u =(U,\leq_U) \in \mathcal W$ such that $w R u$ for all $w \in C$. To that end, take
$$
U = \bigcup_{E \in S} E
$$
so $U\subseteq X$ is an upper bound to every set in $S$ w.r.t. $\subseteq$.
Now let $x, y \in U$ so there are sets $E_x,E_y \in S$ such that $x\in E_x$ and $y\in E_y$. Furthermore, since $S$ is totally ordered, one must include the other, so WLOG $x,y \in E$ for some $E \in S$. Now define a binary relation $\leq_U$ on $U$ by $x\leq_U y$ iff $x\leq_E y$. This is well defined because the particular choice of $E$ doesn’t actually matter due to extension part of the definition of $R$.
I now need to show that $u \in \mathcal W$, i.e. $\leq_U$ well orders $U$, and that $u$ is an upper bound to every $w \in C$ w.r.t. $R$.
First, $(U, \leq_U)$ is a well order: for $x,y,z \in U$, they’re all in some $E \in S$, so $\leq_U=\leq_E$ on that set and therefore $\leq_U$ satisfies reflexivity, antisymmetry, and transitivity for these three arbitrary points in $U$. They’re also all comparable since $\leq_E$ is a total order in $E$, so $\leq_U$ totally orders $U$.
Next, let $A\subseteq U$ be an arbitrary nonempty set and let $F \in \{E \in S : A\cap E \neq \emptyset\}$ (I’m choosing $F$ from a nonempty set so this is existential instantiation rather than something needing a choice function). Then $A\cap F \subseteq F$ and $F$ is well ordered so $a^* := \min A\cap F$ exists w.r.t. $\leq_F$. $\leq_F$ equals $\leq_U$ on $F$ so $a^*$ is also the least element of $A\cap F$ w.r.t. $\leq_U$. Furthermore, let $a \in A\backslash F$. Then since $a \in U$ it must be that $a \in G$ for some $G \in S$; but since $a \notin F$ I must have $F \subset G$ since $S$ is totally ordered. This means $a$ is an upper bound for $F$ w.r.t. $\leq_G$, and therefore w.r.t. $\leq_U$ too. This makes $a^*$ is the minimal element of all of $A$ w.r.t. $\leq_U$ so $\leq_U$ well orders $U$.
Now let $w =(E,\leq) \in C$ and I want to show $w R u$. By construction $E\subseteq U$ and $\leq$ and $\leq_U$ agree on $E$. Furthermore, for any $x \in U\backslash E$ I have $x \in F$ for some $F \supset E$, so $x$ is an upper bound for $E$. Thus $w R u$.
At last I can apply Zorn’s lemma to produce a maximal element $m = (M,\leq_M) \in \mathcal W$. Suppose there is some $x \in X\backslash M$. Then I can create a new well order $m_x := (M\cup\{x\}, \leq)$ by setting $x$ to be greater than all $y \in M$, and this makes $m R m_x$. But that contradicts the maximality of $m$, so it must be that there actually is no such $x \in X\backslash M$, and this means that $m$ is actually a well order on all of $X$.
$\square$
3: WO $\implies$ AC
Pf: Assume WO, and suppose $\{X_\alpha : \alpha \in A\}$ is a nonempty collection of nonempty sets. Then by WO every $X_\alpha$ can be well ordered, and therefore each one has a unique minimal element $x_\alpha$. Define a function
$$
f : A \to \bigcup_{\alpha \in A} X_\alpha
$$
by
$$
f(\alpha) = x_\alpha.
$$
Then for each $\alpha \in A$ $f(\alpha) \in X_\alpha$ so $f \in \prod_{\alpha \in A} X_\alpha$.
$\square$
In the next post I’ll prove that AC $\implies$ Zorn.