Wednesday, December 19, 2012

A splitting lemma for groups

We recall the definition of the semidirect product of two groups. Suppose $N$ and $H$ are two groups and $H$ acts on $N$ by $\vphi$, i.e. we have a homomorphism $\func{\vphi}{H}{\mathrm{Aut}(N)}$, $h \mapsto \vphi_h$. The semidirect product $N \rtimes_\vphi H$ is defined by endowing the set $N \times H$ with the multiplication \[ (n_1, h_1) \cdot (n_2, h_2) = (n_1 \vphi_{h_1}(n_2), h_1 h_2). \] Identifying $N$ with $N \times \{1\}$ and $H$ with $\{1\} \times H$, $N$ and $H$ are subgroups of $N \rtimes_\vphi H$ with $N$ normal.
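For a concrete instance of this multiplication rule, here is a small Python sketch (the helper names `phi` and `mul` are ours) realizing the dihedral group of order $2n$ as $\mathbb{Z}_n \rtimes \mathbb{Z}_2$, with $\mathbb{Z}_2$ acting on $\mathbb{Z}_n$ by inversion and both factors written additively:

```python
# A minimal sketch: Z_n ⋊ Z_2 with Z_2 acting by inversion, phi_h(k) = (-1)^h * k mod n.
# Elements are pairs (k, h) with k in Z_n and h in Z_2.
n = 6

def phi(h, k):
    """Action of h in Z_2 on k in Z_n: identity for h = 0, inversion for h = 1."""
    return k % n if h == 0 else (-k) % n

def mul(a, b):
    """(k1, h1) * (k2, h2) = (k1 + phi_{h1}(k2), h1 + h2), written additively in each factor."""
    (k1, h1), (k2, h2) = a, b
    return ((k1 + phi(h1, k2)) % n, (h1 + h2) % 2)

elements = [(k, h) for k in range(n) for h in range(2)]

# Check associativity and that (0, 0) is the identity, i.e. the group axioms hold.
assert all(mul(mul(a, b), c) == mul(a, mul(b, c))
           for a in elements for b in elements for c in elements)
assert all(mul((0, 0), a) == a == mul(a, (0, 0)) for a in elements)
print(f"Z_{n} ⋊ Z_2 has {len(elements)} elements (the dihedral group of order {2 * n})")
```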

Theorem (Splitting Lemma). Let $G$ be a group, $H$ be a subgroup of $G$ and $N$ be a normal subgroup of $G$. Then $G$ is isomorphic to a semidirect product $N \rtimes_\vphi H$ for some homomorphism $\func{\vphi}{H}{\mathrm{Aut}(N)}$ if and only if
(i) there exist a short exact sequence $1 \rightarrow N \overset{\iota}\rightarrow G \overset{\pi}{\rightarrow} H \rightarrow 1$ and
(ii) a homomorphism $\func{\alpha}{H}{G}$ such that $\pi \circ \alpha = \mathrm{id}_H$.
Proof: $(\Rightarrow)$ Suppose $G = N \rtimes_\vphi H$. To prove (i), let $\iota$ be the inclusion map of $N$ into $G$, which is an injective homomorphism. Define $\func{\pi}{G}{H}$ by $\pi((n, h)) = h$. Clearly $\pi$ is a surjective homomorphism and its kernel is $N \times \{1\} \cong N$. For (ii), define $\func{\alpha}{H}{G}$ by $\alpha(h) = (1, h)$. Then $\alpha$ is a homomorphism with $\pi(\alpha(h)) = \pi((1, h)) = h$ for all $h \in H$.
$(\Leftarrow)$ Suppose (i) and (ii) hold. Define a homomorphism $\func{\vphi}{H}{\mathrm{Aut}(N)}$ by $\vphi_h(n) = \iota^{-1}(\alpha(h)\iota(n)\alpha(h)^{-1})$; this is well-defined because $\iota(N) = \ker(\pi)$ is normal in $G$, so $\alpha(h)\iota(n)\alpha(h)^{-1} \in \iota(N)$. We claim that $G \cong N \rtimes_\vphi H$. Define $\func{\Psi}{N \rtimes_\vphi H}{G}$ by $\Psi((n, h)) = \iota(n)\alpha(h)$. It is a homomorphism since \[ \begin{align*}\Psi((n_1, h_1)\cdot(n_2, h_2)) &= \iota(n_1 \vphi_{h_1}(n_2))\alpha(h_1 h_2) \\
&= \iota(n_1)\alpha(h_1)\iota(n_2)\alpha(h_1)^{-1}\alpha(h_1 h_2) \\
&=  \iota(n_1)\alpha(h_1)\iota(n_2)\alpha(h_2) \\
&= \Psi((n_1, h_1))\Psi((n_2, h_2)). \end{align*} \] Now define $\func{\Phi}{G}{N \rtimes_\vphi H}$ by $\Phi(g) = (\iota^{-1}(g (\alpha \circ \pi)(g^{-1})), \pi(g))$. It is well-defined since \[ \pi(g(\alpha \circ \pi)(g^{-1})) = \pi(g)(\pi \circ \alpha)(\pi(g)^{-1}) = \pi(g)\pi(g)^{-1} = 1 \] and thus $g(\alpha \circ \pi)(g^{-1}) \in \ker(\pi) = \mathrm{im}(\iota)$. It is a homomorphism as
\[ \begin{align*} \Phi(g_1)\Phi(g_2) &= (\iota^{-1}(g_1 (\alpha \circ \pi)(g_1^{-1})), \pi(g_1)) \cdot  (\iota^{-1}(g_2 (\alpha \circ \pi)(g_2^{-1})), \pi(g_2)) \\
&= (\iota^{-1}(g_1 (\alpha \circ \pi)(g_1^{-1})) \vphi_{\pi(g_1)}(\iota^{-1}(g_2 (\alpha \circ \pi)(g_2^{-1}))), \pi(g_1)\pi(g_2)) \\
&= (\iota^{-1}(g_1 (\alpha \circ \pi)(g_1^{-1})) \iota^{-1}(\alpha(\pi(g_1))g_2(\alpha \circ \pi)(g_2^{-1})\alpha(\pi(g_1))^{-1}), \pi(g_1 g_2)) \\
&= (\iota^{-1}(g_1 g_2 (\alpha \circ \pi)(g_2^{-1} g_1^{-1})), \pi(g_1 g_2)) \\
&= \Phi(g_1 g_2).
\end{align*}\] Now \[ \begin{align*}\Phi \circ \Psi(n, h) &= \Phi(\iota(n)\alpha(h)) = \Phi(\iota(n))\Phi(\alpha(h)) \\
&= (\iota^{-1}(\iota(n)(\alpha \circ \pi \circ \iota)(n^{-1})), (\pi \circ \iota)(n)) \cdot (\iota^{-1}(\alpha(h)(\alpha \circ \pi \circ \alpha)(h^{-1})), (\pi \circ \alpha)(h)) \\
&= (n, 1) \cdot (1, h) = (n, h)
\end{align*} \] and \[ \begin{align*} \Psi \circ \Phi(g) &= \Psi((\iota^{-1}(g (\alpha \circ \pi)(g^{-1})), \pi(g))) \\
&= \iota(\iota^{-1}(g (\alpha \circ \pi)(g^{-1})))\alpha(\pi(g)) = g.
\end{align*} \] Hence $\Phi$ and $\Psi$ are inverse to each other and thus are isomorphisms.

Tuesday, December 18, 2012

A proof of Cayley-Hamilton Theorem using Zariski topology

The notation $\mathbb{A}^n$ is used to denote the space $\mathbb{C}^n$ with the Zariski topology.

We note the following basic properties of the Zariski topology.

Proposition 1. Nonempty open subsets of $\mathbb{A}^n$ are dense.

Lemma 2. Polynomial maps from $\mathbb{A}^n$ to $\mathbb{A}^m$ are continuous.

We proceed to prove the Cayley-Hamilton Theorem. Identify $M_n(\mathbb{C})$ with $\mathbb{A}^{n^2}$. For any matrix $A$, let $p_A(\lambda)$ denote its characteristic polynomial. The Cayley-Hamilton Theorem is the statement that the map $\Psi: \mathbb{A}^{n^2} \rightarrow \mathbb{A}^{n^2}$, $\Psi(A) = p_A(A)$ vanishes identically. Let $D$ be the subset of $\mathbb{A}^{n^2}$ of matrices with distinct eigenvalues.

Lemma 3. $D$ is nonempty and open.
Proof: Clearly $D$ is not empty (e.g. $\mathrm{diag}(1, 2, \ldots, n) \in D$). The complement of $D$ is the collection of $n \times n$ matrices with repeated eigenvalues. But a matrix has repeated eigenvalues if and only if the discriminant of its characteristic polynomial vanishes, and this discriminant is a polynomial in the entries of the matrix. Hence the complement of $D$ is closed, i.e. $D$ is open.

It follows from Proposition 1 and Lemma 3 that $D$ is dense in $\mathbb{A}^{n^2}$. Note that $\Psi$ is a polynomial map, so it is continuous by Lemma 2. Hence it suffices to show that $\Psi$ vanishes on $D$. Let $A \in D$. As $A$ is diagonalizable, there exists an invertible matrix $P$ such that $P^{-1}AP$ is diagonal. Since similar matrices have the same characteristic polynomial, $p_{P^{-1}AP}(P^{-1}AP) = p_A(P^{-1}AP) = P^{-1}p_A(A)P$, so $p_A(A) = 0$ if and only if $p_{P^{-1}AP}(P^{-1}AP) = 0$. Hence we may assume $A$ itself is diagonal. If $A = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then $p_A(\lambda) = \prod_{i = 1}^n (\lambda - \lambda_i)$, thus it is clear that $p_A(A) = 0$.
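The theorem is also easy to sanity-check numerically (this is of course not part of the proof): the following NumPy sketch evaluates $p_A(A)$ for a random complex matrix by Horner's rule and confirms the result is zero up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Coefficients of the characteristic polynomial p_A(λ) = det(λI - A),
# in the order λ^n, λ^{n-1}, ..., constant term.
coeffs = np.poly(A)

# Evaluate p_A(A) by Horner's rule on matrices.
P = np.zeros((n, n), dtype=complex)
for c in coeffs:
    P = P @ A + c * np.eye(n)

print(np.linalg.norm(P))   # tiny (~1e-12), i.e. p_A(A) = 0 up to rounding error
```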

Saturday, November 3, 2012

Deutsch-Jozsa Algorithm

Denote by $\Z_2$ the field with two elements. A function $f: \Z_2^n \rightarrow \Z_2$ is said to be balanced if $|f^{-1}(0)| = |f^{-1}(1)|$.

Problem: Suppose $f: \Z_2^n \rightarrow \Z_2$ is either constant or balanced. Determine whether $f$ is constant or balanced.
Given: A quantum oracle for $f$, represented by $U_f: \ket{\mathbf{x}}\ket{y} \mapsto \ket{\mathbf{x}}\ket{y \oplus f(\mathbf{x})}$.

Deutsch-Jozsa algorithm:
1. Prepare an (n + 1)-qubit initial state \[ \ket{\psi_1} = \ket{\mathbf{0}}\left(\frac{\ket{0} - \ket{1}}{\sqrt{2}}\right). \]
2. Apply $\otimes_{k = 1}^n H$ to the first $n$ qubits. Get \[ \ket{\psi_2} = \frac{1}{\sqrt{2^n}} \sum_{k = 0}^{2^n - 1} \ket{\mathbf{k}} \left(\frac{\ket{0} - \ket{1}}{\sqrt{2}}\right). \]
3. Apply $U_f$ to obtain \[ \ket{\psi_3} = \frac{1}{\sqrt{2^n}} \sum_{k = 0}^{2^n - 1} (-1)^{f(\mathbf{k})}\ket{\mathbf{k}} \left(\frac{\ket{0} - \ket{1}}{\sqrt{2}}\right). \]
4. Apply $\otimes_{k = 1}^n H$ again to the first $n$ qubits. Get \[ \ket{\psi_4} = \frac{1}{2^n}\sum_{k = 0}^{2^n - 1} (-1)^{f(\mathbf{k})}\sum_{s = 0}^{2^n - 1} (-1)^{\mathbf{k} \cdot \mathbf{s}}\ket{\mathbf{s}}\left( \frac{\ket{0} - \ket{1}}{\sqrt{2}} \right). \]
5. If $f$ is constant, then $\ket{\psi_4} = \pm\ket{\mathbf{0}}\frac{\ket{0} - \ket{1}}{\sqrt{2}}$: interchanging the two sums, the coefficient of $\ket{\mathbf{s}}$ is $\frac{(-1)^{f(\mathbf{0})}}{2^n}\sum_{k = 0}^{2^n - 1} (-1)^{\mathbf{k} \cdot \mathbf{s}}$, and $\sum_{k = 0}^{2^n - 1} (-1)^{\mathbf{k} \cdot \mathbf{s}} = 0$ if $\mathbf{s} \neq \mathbf{0}$ and $2^n$ if $\mathbf{s} = \mathbf{0}$. So on measuring the first $n$ qubits we get $\ket{\mathbf{0}}$ with probability one. If $f$ is balanced, then write $\ket{\psi_4} = \ket{\psi}\frac{\ket{0} - \ket{1}}{\sqrt{2}}$; we have \[ \braket{\mathbf{0}}{\psi} =  \frac{1}{2^n}\sum_{k = 0}^{2^n - 1} (-1)^{f(\mathbf{k})}\sum_{s = 0}^{2^n - 1} (-1)^{\mathbf{k} \cdot \mathbf{s}} \braket{\mathbf{0}}{\mathbf{s}} = \frac{1}{2^n}\sum_{k = 0}^{2^n - 1} (-1)^{f(\mathbf{k})} = 0 \] and the probability of getting $\ket{\mathbf{0}}$ on measuring the first $n$ qubits is zero. Therefore, we measure the first $n$ qubits. If the result is $\ket{\mathbf{0}}$, then $f$ is constant; otherwise $f$ is balanced.
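A small NumPy simulation of the algorithm (a sketch: it tracks only the first $n$ qubits and the phases $(-1)^{f(\mathbf{k})}$ produced in step 3, since the last qubit stays in $(\ket{0} - \ket{1})/\sqrt{2}$ throughout; the function names are ours):

```python
import numpy as np

def deutsch_jozsa(f, n):
    """Return the probability of measuring |0...0> on the first n qubits."""
    N = 2 ** n
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)                          # H^{⊗n}
    psi = np.zeros(N)
    psi[0] = 1.0                                     # |0...0>
    psi = Hn @ psi                                   # step 2
    psi = np.array([(-1) ** f(k) for k in range(N)]) * psi   # step 3 (phase kickback)
    psi = Hn @ psi                                   # step 4
    return abs(psi[0]) ** 2                          # step 5: probability of |0...0>

n = 3
constant = lambda k: 1
balanced = lambda k: bin(k).count("1") % 2           # the parity function is balanced
print(deutsch_jozsa(constant, n))                    # ≈ 1.0
print(deutsch_jozsa(balanced, n))                    # ≈ 0.0
```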

Thursday, November 1, 2012

Some common quantum gates

Identity: $\ket{x} \mapsto \ket{x}$

$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$

NOT (Pauli X): $\ket{x} \mapsto \ket{1 \oplus x}$

$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$

Pauli Y:

$Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$

Pauli Z:

$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

Hadamard:

$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$

CNOT: $\ket{x} \ket{y} \mapsto \ket{x} \ket{x \oplus y}$

$C = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{pmatrix}$

Phase shift:

$R_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix}$
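All of these matrices are straightforward to check numerically; a short NumPy sketch verifying unitarity and the CNOT action on the computational basis:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])                          # NOT / Pauli X
Y = np.array([[0, -1j], [1j, 0]])                       # Pauli Y
Z = np.array([[1, 0], [0, -1]])                         # Pauli Z
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)            # Hadamard
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]])
R = lambda theta: np.array([[1, 0], [0, np.exp(1j * theta)]])   # phase shift

# Every gate is unitary: U* U = I.
for U in (I, X, Y, Z, H, CNOT, R(0.7)):
    assert np.allclose(U.conj().T @ U, np.eye(U.shape[0]))

# CNOT flips the second qubit iff the first is |1>: |x>|y> -> |x>|x ⊕ y>.
basis = {(x, y): np.kron(np.eye(2)[x], np.eye(2)[y]) for x in (0, 1) for y in (0, 1)}
for (x, y), v in basis.items():
    assert np.allclose(CNOT @ v, basis[(x, x ^ y)])
print("all gates unitary; CNOT acts as expected on the computational basis")
```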

(To be continued)

Thursday, October 11, 2012

There is no translation-invariant measure on an infinite-dimensional space

One difficulty in carrying out analysis on an infinite-dimensional normed vector space, e.g. $L^p(\mathbb{R})$, is that there is no analogue of Lebesgue measure on such a space. This comes from the observation that an open ball in an infinite-dimensional normed vector space contains infinitely many disjoint open balls of the same radius. We first give a formal statement of our goal.

Theorem. Let $X$ be an infinite-dimensional normed vector space. Then there is no translation-invariant Borel measure $\mu$ on $X$ such that $\mu(U) > 0$ for every nonempty open set $U$ and $\mu(U_0) < \infty$ for some nonempty open set $U_0$.

To prove the theorem, we recall Riesz's Lemma.

Lemma (Riesz). Let $X$ be a normed vector space and $Y$ be a proper closed subspace. For every $\epsilon > 0$, there exists $x \in X$ such that $\|x\| = 1$ and $\mathrm{dist}(x, Y) = \inf\{\|x - y\|: y \in Y\} \geq 1 - \epsilon$.

Proof of Theorem: Suppose $\mu$ is a translation-invariant Borel measure on $X$ with $\mu(U) > 0$ for every nonempty open $U \subseteq X$; we show that every nonempty open set then has infinite measure. Fix a nonempty open set $U$ in $X$. Then there exists $r > 0$ and $x_0 \in X$ such that $W = r(U + x_0)$ contains $B = \{x \in X: \|x\| < 2 \}$. Using Riesz's Lemma and induction (each $\mathrm{span}\{x_1, \ldots, x_n\}$ is a proper closed subspace since $X$ is infinite-dimensional), we can find a sequence $\{x_n\}$ of unit vectors in $X$ such that $\mathrm{dist}(x_{n + 1}, \mathrm{span}\{x_1, \ldots, x_n\}) \geq 1/2$ for all $n \geq 1$. Hence $\{x_n\} \subseteq B$ and $\|x_n - x_m\| \geq 1/2$ whenever $n \neq m$. Take $B_n = \{x \in X: \|x - x_n\| < 1/4 \}$; then $\{B_n\}$ is a disjoint collection of open balls of radius $1/4$ contained in $B$. Now let $U_n = \frac{1}{r}B_n - x_0$; then $\{U_n\}$ is a disjoint collection of open balls contained in $U$, all of the same radius, each a translate of $U_1$. It follows that \[ \mu(U) \geq \mu(\bigcup_{n = 1}^\infty U_n) = \sum_{n = 1}^\infty \mu(U_n) = \sum_{n = 1}^\infty \mu(U_1) \] by translation-invariance of $\mu$. But $\mu(U_1) > 0$, forcing $\mu(U) = \infty$. Since $U$ was an arbitrary nonempty open set, this contradicts the assumption that $\mu(U_0) < \infty$ for some nonempty open $U_0$.

Reference:
Fabian, Habala, Hajek, Montesinos, Zizler - Banach Space Theory: The Basis for Linear and Nonlinear Analysis

Sunday, October 7, 2012

A criterion for writing a functional as a linear combination

Let $X$ be a vector space over $\mathbb{F}$ with dual space $X'$ and fix $\phi_1, \ldots, \phi_n \in X'$. We have a useful criterion for when a linear functional $f \in X'$ lies in the span of $\phi_1, \ldots, \phi_n$.

Theorem 1. Let $f \in X'$. Then $f \in \mathrm{span}\{\phi_1, \ldots, \phi_n\}$ if and only if $\cap_{i = 1}^n \ker(\phi_i) \subseteq \ker(f)$.

To prove this, we first establish a general result on when a linear map can factor through another linear map.

Theorem 2. Suppose we are given linear maps $T: X \rightarrow Y$ and $S: X \rightarrow Z$. Then there exists a linear map $R: Z \rightarrow Y$ such that $T = R \circ S$ if and only if $\ker(S) \subseteq \ker(T)$.
Proof: (=>) Clear. (<=) Suppose $\ker(S) \subseteq \ker(T)$. Define $R': \mathrm{im}(S) \rightarrow Y$ by $R'(S(x)) = T(x)$. If $S(x) = S(y)$, then $x - y \in \ker(S)$, so $x - y \in \ker(T)$, i.e. $T(x) = T(y)$. Hence $R'$ is well-defined. It is easy to check that $R'$ is linear. Now let $P$ be a projection of $Z$ onto $\mathrm{im}(S)$ (one exists: extend a basis of $\mathrm{im}(S)$ to a basis of $Z$ and send the added basis vectors to 0) and take $R = R' \circ P$. Then for all $x \in X$, $R \circ S(x) = R'(P(S(x))) = R'(S(x)) = T(x)$.

Proof of Theorem 1: (=>) Clear. (<=) Take $T: X \rightarrow \mathbb{F}$ and $S: X \rightarrow \mathbb{F}^n$ defined by $T(x) = f(x)$ and $S(x) = (\phi_1(x), \ldots, \phi_n(x))$. Then $\ker(S) = \cap_{i = 1}^n \ker(\phi_i) \subseteq \ker(f) = \ker(T)$. By Theorem 2 there exists $R: \mathbb{F}^n \rightarrow \mathbb{F}$ such that $T = R \circ S$ or $f(x) = R(\phi_1(x), \ldots, \phi_n(x))$ for all $x \in X$. Writing $R(y_1, \cdots, y_n) = c_1 y_1 + \cdots + c_n y_n$, we see that $f = c_1 \phi_1 + \cdots + c_n \phi_n$.
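For $X = \mathbb{F}^m$ the criterion is easy to test numerically: functionals are row vectors, and both span membership and the kernel condition reduce to rank computations. A small NumPy sketch (the helper `in_span` and the test vectors are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
phis = rng.standard_normal((n, m))        # phi_1, ..., phi_n as row vectors on R^6

def in_span(f, phis):
    """f ∈ span{phi_1, ..., phi_n}  <=>  appending f does not increase the rank."""
    return np.linalg.matrix_rank(np.vstack([phis, f])) == np.linalg.matrix_rank(phis)

f_inside = 2 * phis[0] - 5 * phis[2]      # a genuine linear combination
f_outside = rng.standard_normal(m)        # a generic vector, almost surely not in the span
print(in_span(f_inside, phis), in_span(f_outside, phis))    # True False

# Theorem 1's criterion: ∩ ker(phi_i) ⊆ ker(f).  Compute a basis K of ∩ ker(phi_i)
# (right singular vectors of phis with zero singular value) and evaluate f on it.
_, s, Vt = np.linalg.svd(phis)
K = Vt[np.sum(s > 1e-10):].T              # columns span ∩ ker(phi_i)
print(np.abs(f_inside @ K).max(), np.abs(f_outside @ K).max())   # ≈ 0  vs  clearly nonzero
```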

Reference:
Holmes - Geometric Functional Analysis and its Applications

Saturday, September 29, 2012

Weak* separation theorem and Goldstine's theorem

Recall a version of separation theorem for a Banach space is the following.

Theorem 1. Let $X$ be a Banach space. Let $C$ be a nonempty open convex subset of $X$ and $x_0 \in X \backslash C$. Then there exists $\ell_0 \in X^*$ such that $\sup_{x \in C} \Re \ell_0(x) < \Re \ell_0(x_0)$.

There is a separation theorem in the weak* topology setting.

Theorem 2. Let $X$ be a Banach space. Let $C$ be a nonempty weak*-closed convex subset of $X^*$ and $\ell_0 \in X^* \backslash C$. Then there exists $x_0 \in X$ such that $\sup_{\ell \in C} \Re \ell(x_0) < \Re \ell_0(x_0)$.
Proof: Since $X^* \backslash C$ is weak*-open, we can find $x_1, \ldots, x_n \in X$ and $\epsilon > 0$ such that $U = \{\ell \in X^* : |\ell(x_i)| < \epsilon, i = 1, \ldots, n\}$ satisfies $\ell_0 + U \subset X^* \backslash C$. This implies that $\ell_0$ does not lie in the convex open set $C + U$. By Theorem 1 there exists $\psi \in X^{**}$ such that $\Re \psi(\ell_0) > \sup_{\ell \in C + U} \Re \psi(\ell) \geq \sup_{\ell \in C} \Re \psi(\ell)$. We want to show that $\cap_{i = 1}^n \ker(x_i) \subseteq \ker(\psi)$, since then $\psi$ is a linear combination of $x_1, \ldots, x_n$, thus $x_0 = \psi \in X$ and we are done. Let $\ell \in X^*$ with $\ell(x_i) = 0$ for all $1 \leq i \leq n$. Note that $t\ell \in U$ for every scalar $t$. Fix any $c \in C$. Then $M := \sup_{f \in U} \Re \psi(f) \leq \Re \psi(\ell_0) - \Re \psi(c)$ is finite. Now for all real $t$, $t \Re \psi(\ell) = \Re \psi(t\ell) \leq M$, forcing $\Re \psi(\ell) = 0$; applying the same argument to $it\ell \in U$ gives $\Im \psi(\ell) = 0$. Therefore $\psi(\ell) = 0$.

The following is an application of the weak* separation theorem. (Notation: $B_Y = \{y \in Y: \|y\| \leq 1\}$.)

Theorem (Goldstine). The weak* closure of $B_X$ in $X^{**}$ is $B_{X^{**}}$.
Proof: First, $B_X \subseteq B_{X^{**}}$ and $B_{X^{**}}$ is weak*-compact by Alaoglu's Theorem. Hence the weak* closure $\tilde{B}$ of $B_X$ is a subset of $B_{X^{**}}$; note also that $\tilde{B}$ is convex and weak*-closed. Suppose the inclusion is proper and choose $w \in B_{X^{**}} \backslash \tilde{B}$. Apply Theorem 2 (with $X^*$ in place of $X$) to find $\ell_0 \in X^*$ with $\sup_{\psi \in \tilde{B}} \Re \psi(\ell_0) < \Re w(\ell_0)$, and then $\epsilon > 0$ such that $\sup_{\psi \in \tilde{B}} \Re \psi(\ell_0) \leq \Re w(\ell_0) - \epsilon$. In particular, $\Re \ell_0(x) \leq \Re w(\ell_0) - \epsilon$ for all $x \in B_X$. Since $B_X$ is balanced, this implies that $|\ell_0(x)| \leq \Re w(\ell_0) - \epsilon$ for all $x \in B_X$. As $\Re w(\ell_0) \leq |w(\ell_0)|$, we have $|\ell_0(x)| \leq |w(\ell_0)| - \epsilon$ for all $x \in B_X$. But then $\|\ell_0\| \leq |w(\ell_0)| - \epsilon \leq \|w\|\|\ell_0\| - \epsilon \leq \|\ell_0\| - \epsilon$. This is a contradiction.

Reference:
Fabian, Habala, Hajek,  Santalucia, Pelant, Zizler - Functional Analysis and Infinite Dimensional Geometry

Thursday, September 27, 2012

Positive matrices I: Positive linear functionals

Let $\mathbb{M}_n$ denote the algebra of $n \times n$ complex matrices. A matrix $A \in \mathbb{M}_n$ is said to be positive semi-definite or just positive if any one of the following equivalent conditions holds:
  1. $x^*Ax \geq 0$ for any $x \in \mathbb{C}^n$;
  2. $A = X^*X$ for some $X \in \mathbb{M}_n$;
  3. $A$ is Hermitian (i.e. $A = A^*$) and all its eigenvalues are nonnegative (i.e. $\sigma(A) \subseteq [0, \infty)$).
We write $A \geq O$ if $A$ is a positive matrix.

A linear subspace $\mathcal{S}$ of $\mathbb{M}_n$ is an operator system if it contains $I$ and is closed under the adjoint operation.

Let $\mathcal{S} \subseteq \mathbb{M}_n$ be an operator system. A linear map $\Phi: \mathcal{S} \rightarrow \mathbb{M}_k$ is said to be positive if $\Phi(A) \geq O$ whenever $A \geq O$.

Note that we can write any $X \in \mathcal{S}$ as $X = A + iB$ where $A, B \in \mathcal{S}$ are Hermitian matrices given by $A = \frac{1}{2}(X + X^*)$ and $B = \frac{1}{2i}(X - X^*)$. Further, if $Y \in \mathcal{S}$ is Hermitian, then $Y = P - N$ for some positive matrices $P, N \in \mathcal{S}$; here $P = \frac{1}{2}(\|Y\|I + Y)$ and $N = \frac{1}{2}(\|Y\|I - Y)$, which lie in $\mathcal{S}$ because $I \in \mathcal{S}$. The above decomposition gives the following result.

Proposition. If $\Phi: \mathcal{S} \rightarrow \mathbb{M}_k$ is a positive linear map, then $\Phi(X^*) = \Phi(X)^*$ for any $X \in \mathcal{S}$.

We now consider positive linear functionals on an operator system $\mathcal{S}$. It turns out that in this case positivity is equivalent to the functional attaining its norm at the identity.

Theorem. Let $\varphi$ be a linear functional on $\mathcal{S}$. Then $\varphi$ is positive if and only if $\|\varphi\| = \varphi(I)$.
Proof: (=>) Since $\|I\| = 1$, we have $\varphi(I) \leq \|\varphi\|$, so it is enough to show that $|\varphi(X)| \leq \|X\|\varphi(I)$ for every $X \in \mathcal{S}$. By replacing $X$ with $e^{-i\theta}X$ for some suitable $\theta \in \mathbb{R}$, we may assume that $\varphi(X) \geq 0$. Write $X = A + iB$ where the Hermitian matrices $A, B \in \mathcal{S}$ are given by $A = \frac{1}{2}(X + X^*)$ and $B = \frac{1}{2i}(X - X^*)$. Since $\varphi(A), \varphi(B) \in \mathbb{R}$, we have $\varphi(X) = \varphi(A)$. Now write $A = P - N$ where $P, N \in \mathcal{S}$ are positive and given by $P = \frac{1}{2}(\|A\|I + A)$ and $N = \frac{1}{2}(\|A\|I - A)$. Note that $O \leq P \leq \|P\|I$, so $\varphi(P) \leq \|P\|\varphi(I)$. Now we have $\varphi(X) = \varphi(A) = \varphi(P) - \varphi(N) \leq \varphi(P) \leq \|P\|\varphi(I) \leq \|A\|\varphi(I) \leq \|X\|\varphi(I)$.
(<=) WLOG, assume $\|\varphi\| = \varphi(I) = 1$. Let $X \in \mathcal{S}$ be positive. Let $a = \min \sigma(X), b = \max \sigma(X)$ and $J = [a, b] \supseteq \sigma(X)$. We are going to show that $\varphi(X) \in J$, hence $\varphi(X) \geq 0$. Suppose not. Take a closed disk $D$ centered at $z_0$ with radius $r$ such that $J \subseteq D$ and $\varphi(X) \notin D$. It follows that $\sigma(X - z_0I) \subseteq \{z: |z| \leq r\}$, and since $X - z_0I$ is normal, $\|X - z_0I\|$ equals its spectral radius, so $\|X - z_0I\| \leq r$. Now $|\varphi(X) - z_0| = |\varphi(X) - z_0\varphi(I)| = |\varphi(X - z_0I)| \leq \|\varphi\|\|X - z_0I\| \leq r$. This contradicts the fact that $\varphi(X) \notin D$.

The above characterization immediately gives the following extension theorem for positive linear functionals.

Theorem (Krein Extension Theorem). Every positive linear functional on $\mathcal{S}$ extends to a positive  linear functional on $\mathbb{M}_n$.
Proof: Suppose $\varphi$ is a positive linear functional on $\mathcal{S}$. Then $\|\varphi\| = \varphi(I)$. By Hahn-Banach Theorem, $\varphi$ extends to a linear functional $\tilde{\varphi}$ on $\mathbb{M}_n$ with $\|\tilde{\varphi}\| = \|\varphi\| = \varphi(I) = \tilde{\varphi}(I)$, so $\tilde{\varphi}$ is positive.

Positive linear functionals on $\mathbb{M}_n$ have the following characterization.

Proposition. Let $\varphi$ be a positive linear functional on $\mathbb{M}_n$. Then there exists a positive matrix $X \in \mathbb{M}_n$ such that $\varphi(A) = \mathrm{tr} AX$ for all $A \in \mathbb{M}_n$.
Proof: For $1 \leq i, j \leq n$, let $E_{ij} \in \mathbb{M}_n$ be the matrix with entry 1 at the $i$-th row and $j$-th column and all other entries zero, and let $p_{ij} = \varphi(E_{ij})$. Take any $A = [a_{ij}] \in \mathbb{M}_n$. Then $\varphi(A) = \sum_{i, j} a_{ij}p_{ij} = \mathrm{tr}(AX)$ where $X = [x_{ij}]$ with $x_{ij} = p_{ji}$, i.e. $X = P^{T}$ for $P = [p_{ij}]$. For any $x \in \mathbb{C}^n$, $x^*Xx = \sum_{i, j} \bar{x_i} x_{ij} x_j = \varphi(\sum_{i, j} \bar{x_i} x_j E_{ji}) = \varphi([\bar{x_j}x_i]) \geq 0$ since $\varphi$ and $[\bar{x_j}x_i] = xx^*$ are positive. Hence $X$ is positive.
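Both characterizations are easy to check numerically on $\mathbb{M}_n$: for $\varphi(A) = \mathrm{tr}(AX)$ with $X$ positive, the norm of $\varphi$ (with respect to the operator norm on $\mathbb{M}_n$) is the trace norm of $X$, which for positive $X$ equals $\mathrm{tr}\,X = \varphi(I)$. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = B.conj().T @ B                        # a positive matrix X = B*B

phi = lambda A: np.trace(A @ X)           # the functional phi(A) = tr(AX)

# Positivity: phi(A) >= 0 for positive A = C*C (real and nonnegative up to rounding).
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = C.conj().T @ C
print(phi(A).real >= -1e-10, abs(phi(A).imag) < 1e-10)

# ||phi|| equals the trace (nuclear) norm of X; for positive X this is tr X = phi(I).
trace_norm = np.linalg.svd(X, compute_uv=False).sum()
print(np.isclose(trace_norm, phi(np.eye(n)).real))   # True: phi attains its norm at I
```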

Reference:
Bhatia - Positive Definite Matrices

Tuesday, August 14, 2012

Arithmetic functions I: Dirichlet convolution and Euler's totient function

An arithmetic function is a real or complex-valued function defined on the set of positive integers. Often, arithmetic functions reflect arithmetical information concerning integers and are intensively studied in number theory.

Definition. An arithmetic function $f$ is said to be multiplicative if $f(1) = 1$ and $f(mn) = f(m)f(n)$ whenever $m$ and $n$ are relatively prime.

By unique factorization of integers, a multiplicative arithmetic function is determined completely by its values on all of the prime powers.

Examples. (i) Identity function $id$: $id(n) = n$.
(ii) Delta function $\delta$: $\delta(1) = 1, \delta(n) = 0$ for $n > 1$.
(iii) Constant one function $\mathbf{1}$: $\mathbf{1}(n) = 1$.
(iv) Moebius function $\mu$: $\mu(1) = 1$ and if $n$ is the product of $m$ distinct primes, then $\mu(n) = (-1)^m$, otherwise $\mu(n) = 0$.

The set of all arithmetic functions can be made into a commutative ring with identity, with pointwise addition and the Dirichlet convolution defined below as multiplication.

Definition. Let $f$ and $g$ be arithmetic functions. The Dirichlet convolution of $f$ and $g$ is the arithmetic function $f * g$ defined by $$f * g(n) = \sum_{d \mid n} f(d)g(\frac{n}{d}).$$

Proposition. Let $f, g, h$ be multiplicative functions.
(i) $f * g$ is multiplicative.
(ii) $(f * g) * h = f * (g * h)$.
(iii) $f * g = g * f$.
(iv) $f * \delta = f$.

Hence the multiplicative identity for Dirichlet convolution is the delta function. Below we see that the multiplicative inverse of the constant one function is the Moebius function.

Proposition. $\mu * \mathbf{1} = \delta$.
Proof: Firstly, $\mu * \mathbf{1}(1) = \sum_{d \mid 1} \mu(d) = \mu(1) = 1 = \delta(1)$. Now let $n > 1$ with prime factorization $n = p_1^{k_1} \cdots p_m^{k_m}$. Then $$\begin{align*}
\mu * \mathbf{1}(n) &= \sum_{d \mid n} \mu(d) \\
&= \mu(1) + \mu(p_1) + \cdots + \mu(p_m) + \mu(p_1 p_2) + \cdots + \mu(p_{m - 1} p_m) + \cdots + \mu(p_1 \cdots p_m) \\
&= 1 + \binom{m}{1} (-1) + \binom{m}{2} (-1)^2 + \cdots + \binom{m}{m} (-1)^m \\
&= (1 - 1)^m = 0 = \delta(n).
\end{align*}$$ Q. E. D.

We now introduce an important arithmetic function in number theory, Euler's totient function $\phi$. For a positive integer $n$, $\phi(n)$ is the number of integers from 1 to $n$ which are relatively prime to $n$.

Theorem. $\phi = \mu * id$.
Proof: $$\begin{align*}
\phi(n) &= \sum_{1 \leq k \leq n, (k, n) = 1} 1 = \sum_{k = 1}^n \delta((k, n)) \\
&= \sum_{k = 1}^n \mu * \mathbf{1}((k, n)) \\
&= \sum_{k = 1}^n \sum_{d \mid (k, n)} \mu(d) \\
&= \sum_{k = 1}^n \sum_{d \mid k, d \mid n} \mu(d) \\
&= \sum_{d \mid n} \sum_{1 \leq k \leq n, d \mid k} \mu(d) \\
&= \sum_{d \mid n} \sum_{1 \leq k \leq n, k = dm} \mu(d) \\
&= \sum_{d \mid n} \sum_{1 \leq m \leq n/d} \mu(d) \\
&= \sum_{d \mid n} \mu(d) \frac{n}{d} \\
&= \mu * id(n).
\end{align*}$$ Q. E. D.

Corollary. $\phi$ is multiplicative.

Another corollary is the classical formula by Euler.

Corollary. $\sum_{d \mid n} \phi(d) = n$, i.e., $\phi * \mathbf{1} = id$.
Proof: $\phi * \mathbf{1} = (\mu * id) * \mathbf{1} = (\mu * \mathbf{1}) * id =  \delta * id = id$. Q. E. D.

Now we derive a formula for Euler's totient function in terms of prime factorizations.

Lemma. For a prime $p$ and positive integer $k$, $\phi(p^k) = p^k - p^{k - 1}$.
Proof: It suffices to note that the numbers between 1 and $p^k$ not relatively prime to $p^k$ are $p, 2p, 3p, \ldots, p^{k - 1} \cdot p$. Q. E. D.

Theorem. Let $n > 1$ with prime factorization $n = p_1^{k_1} \cdots p_m^{k_m}$. Then $$\phi(n) = n(1 - \frac{1}{p_1}) \cdots (1 - \frac{1}{p_m}).$$
Proof: $\phi(n) = \prod_{j = 1}^m (p_j^{k_j} - p_j^{k_j - 1}) = \prod_{j = 1}^m p_j^{k_j} (1 - \frac{1}{p_j}) = n \prod_{j = 1}^m (1 - \frac{1}{p_j})$. Q. E. D.
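A brute-force Python sketch of the identities in this post ($\mu * \mathbf{1} = \delta$, $\phi = \mu * id$ and $\phi * \mathbf{1} = id$), with naive implementations of $\mu$, $\phi$ and the Dirichlet convolution:

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mu(n):
    """Moebius function by trial division."""
    primes = 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0            # a squared prime factor
            primes += 1
        d += 1
    if n > 1:
        primes += 1
    return (-1) ** primes

def phi(n):
    """Euler's totient by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def conv(f, g, n):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return sum(f(d) * g(n // d) for d in divisors(n))

delta = lambda n: 1 if n == 1 else 0
one   = lambda n: 1
ident = lambda n: n

for n in range(1, 200):
    assert conv(mu, one, n) == delta(n)      # mu * 1 = delta
    assert conv(mu, ident, n) == phi(n)      # phi = mu * id
    assert conv(phi, one, n) == n            # sum_{d|n} phi(d) = n
print("identities verified for n = 1, ..., 199")
```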

Reference:
Apostol - Introduction to Analytic Number Theory

Friday, May 4, 2012

Additive subgroups of the reals and irrational rotation

We present a useful characterization of the additive subgroups of $\mathbb{R}$.

Theorem. Let $G$ be a subgroup of $(\mathbb{R}, +)$. Then precisely one of the following holds:
(i) $G = d\mathbb{Z}$ for some $d \in \mathbb{R}$;
(ii) $G$ is dense in $\mathbb{R}$.

Proof: If $G = \{0\}$, then $G = 0\mathbb{Z}$. Assume $G$ is nontrivial. Let \[ d = \inf\{g: 0 < g \in G\}.\] Note that the set on the right is nonempty (so $d$ is a well-defined nonnegative real number), since for every nonzero $y \in G$ we have $ky \in G$ and $ky > 0$ for some $k \in \mathbb{Z}$.
Case 1: $d > 0$. We claim that $d \in G$. If not, then for every $\epsilon > 0$ there exists $g \in G$ with $g \in (d, d + \epsilon)$. It follows that for every $\epsilon > 0$ there are $g, h \in G$ with $d < h < g < d + \epsilon$, thus there exists $x = g - h \in G$ with $0 < x < \epsilon$. Taking $\epsilon < d$ contradicts the definition of $d$. So $d \in G$, and we also have $d = \min\{|g|: 0 \neq g \in G\}$. Now fix any $0 < g \in G$ and consider $S = \{g - nd: n = 0, 1, 2, \ldots\}$. Since $d > 0$, we can find $n \geq 0$ such that $g - nd \geq 0$ and $g - (n + 1)d < 0$. Since $g - (n + 1)d$ is a nonzero element of $G$, we get $g - (n + 1)d \leq -d$, so $g - nd \leq 0$ and hence $g = nd$. Thus every positive element of $G$ is a multiple of $d$; since $G = -G$ and $d \in G$, we conclude $G = d\mathbb{Z}$.
Case 2: $d = 0$. Arguing as in the beginning of Case 1, we find that 0 is a cluster point of $G \cap \mathbb{R}_+$. Let $(a, b)$ be any nonempty bounded open interval of $\mathbb{R}$. Take $\epsilon = b - a$ and pick some $g \in G \cap (0, \epsilon)$. Let $n$ be the smallest integer with $ng > a$ (such $n$ exists since $g > 0$). Then $(n - 1)g \leq a$, so $ng \leq a + g < a + \epsilon = b$, hence $ng \in G \cap (a, b)$. Therefore $G \cap (a, b) \neq \emptyset$. It follows that $G$ is dense in $\mathbb{R}$.

The above result can be used to prove the density of the orbit of an irrational rotation in the unit circle.

Corollary. (i) For an irrational $t$, $\mathbb{Z} + t\mathbb{Z}$ is dense in $\mathbb{R}$.
(ii) For an irrational $t$, $\{e^{2\pi int}: n \in \mathbb{Z}\}$ is dense in the unit circle.

Proof: (i) Note that $\mathbb{Z} + t\mathbb{Z}$ is a subgroup of $(\mathbb{R}, +)$. If $\mathbb{Z} + t\mathbb{Z} = a\mathbb{Z}$ for some $a \in \mathbb{R}$, then $1 = an$ for some $n \in \mathbb{Z}$, so $a \in \mathbb{Q}$. But $t = am$ for some $m \in \mathbb{Z}$, so $t \in \mathbb{Q}$, contradiction. By Theorem, $\mathbb{Z} + t\mathbb{Z}$ must be dense in $\mathbb{R}$.
(ii)  Let $e^{2\pi i x}$ be a point on the unit circle. By (i), there exist sequences $m_k, n_k$ of integers such that $|m_k + n_k t - x| \rightarrow 0$. Then $|e^{2 \pi i n_k t} - e^{2 \pi i x}| = |e^{2\pi i x}||e^{2 \pi i (n_k t - x)} - 1| = |e^{2\pi i (m_k + n_k t - x)} - 1| = O(|m_k + n_k t - x|) \rightarrow 0$. Hence $\{e^{2\pi int}\}_{n \in \mathbb{Z}}$ is dense in the unit circle.
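A quick numerical illustration of the corollary (not a proof): the fractional parts $\{nt\}$ for irrational $t$ fill out $[0, 1)$, and the largest gap they leave shrinks to 0.

```python
import numpy as np

t = np.sqrt(2)                      # an irrational rotation number
for N in (10, 100, 1000, 10000):
    pts = np.sort((np.arange(1, N + 1) * t) % 1.0)   # fractional parts {t}, {2t}, ..., {Nt}
    # largest gap between consecutive points on the circle R/Z (including wrap-around)
    gaps = np.diff(np.concatenate([pts, [pts[0] + 1.0]]))
    print(N, gaps.max())            # the maximal gap shrinks toward 0 as N grows
```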

Wednesday, May 2, 2012

Hadamard gap theorem

Definition. A sequence $\{n_k\}$ of nonnegative integers is said to be lacunary if there is some $\delta > 0$ such that $n_{k + 1}/n_k \geq 1 + \delta$ for all $k \geq 1$.

A typical example of a lacunary sequence is $n_k = 2^k$.

Theorem (Hadamard Gap Theorem). Suppose $\{n_k\}$ is a lacunary sequence and $f(z) = \sum_{n = 0}^\infty a_n z^n$ is a power series with radius of convergence 1 such that $a_n$ is nonzero precisely when $n = n_k$ for some $k$. Then $f(z)$ cannot be analytically extended to any larger region containing a point on the unit circle.

Proof: Suppose $f(z)$ extends to an analytic function on a region containing the open unit disk $B$ and a point $w_0$ on the unit circle. WLOG, we may assume $w_0 = 1$. Then $f(z)$ is analytic on a region $\Omega$ containing $B \cup \{1\}$. Fix a positive integer $m$ and define $g(w) = f(w^m(1 + w)/2)$. Since $1^m(1 + 1)/2 = 1$, and since $|w^m(1+w)/2| = |w|^m|1+w|/2 < |w|^m \leq 1$ when $|w| \leq 1$ and $w \neq 1$ (because $|1 + w| < 2$), the map $w \mapsto w^m(1 + w)/2$ sends the closed unit disk into $B \cup \{1\} \subseteq \Omega$; hence $g(w)$ is well-defined and analytic when $|w| \leq 1$, and thus when $|w| < 1 + \epsilon$ for some $\epsilon > 0$. Write $g(w) = \sum_{j = 0}^\infty b_j w^j$ for $|w| < 1 + \epsilon$. Note that the powers of $w$ in the terms of $[w^m(1 + w)/2]^{n_k}$ range from $n_k m$ to $n_k (m + 1)$. Since $\{n_k\}$ is lacunary, we can choose $m$ so large that $n_{k + 1} m > n_k (m + 1)$ for all $k$; then the powers of $w$ coming from different $[w^m(1 + w)/2]^{n_k}$ are distinct. Now
\[ \sum_{n = 0}^{n_k} a_n \left[\frac{w^m(1 + w)}{2}\right]^n = \sum_{j = 0}^{n_k (m + 1)} b_j w^j. \]
When $|w| < 1 + \epsilon$, the RHS converges as $k \rightarrow \infty$. But the LHS runs over the partial sums of $\sum_{k = 1}^\infty a_{n_k} z^{n_k}$ evaluated at $z = w^m(1 + w)/2$, so this power series converges for all $z = w^m(1 + w)/2$ with $|w| < 1 + \epsilon$, which includes the case $z = 1$. The image of $B(0, 1 + \epsilon)$ under the map $w \mapsto w^m(1 + w)/2$ is a region $W$ containing 1, so the power series of $f(z)$ at 0 converges on $W$, contradicting its radius of convergence being 1. QED.

Reference:
Gamelin - Complex Analysis
Rudin - Real and Complex Analysis (3rd Ed.)

Tuesday, May 1, 2012

Some results in measure theory

Proposition. Let $\mu$ be a positive Borel measure on $\mathbb{R}$ and $\epsilon > 0$. Then for $\mu$-almost every $x$, we have \[ \int_\mathbb{R} \frac{d\mu(t)}{|x - t|^{1 + \epsilon}} = +\infty. \]
Proof: For each positive integer $N$, define $E_N = \{x \in \mathbb{R}: \int_\mathbb{R} d\mu(t) / |x - t|^{1 + \epsilon} \leq N\}$. Let $I$ be any bounded closed interval with $E_N \cap I$ nonempty. Pick $x \in E_N \cap I$ and write $I = [x - \delta_1, x + \delta_2]$. We have \[ N \geq \int_\mathbb{R} \frac{d\mu(t)}{|x - t|^{1 + \epsilon}} \geq \int_I \frac{d\mu(t)}{|x - t|^{1 + \epsilon}} \geq \frac{\mu(I)}{\max\{\delta_1, \delta_2\}^{1 + \epsilon}} \geq \frac{\mu(E_N \cap I)}{(\delta_1 + \delta_2)^{1 + \epsilon}}, \] i.e. $\mu(E_N \cap I) \leq Nm(I)^{1 + \epsilon}$ where $m(I)$ is the length of the interval $I$. Now let $[a, b] \subseteq \mathbb{R}$ and $n \in \mathbb{N}$. Decompose $[a, b]$ into a union of $n$ closed subintervals of length $(b - a)/n$, then \[ \mu(E_N \cap [a, b]) \leq nN\left(\frac{b - a}{n}\right)^{1 + \epsilon} \rightarrow 0 \] as $n \rightarrow \infty$. Hence $\mu(E_N \cap [a, b]) = 0$. It follows that $\mu(E_N) = 0$ for all $N \in \mathbb{N}$ and our result follows.

(To be continued)

Saturday, April 28, 2012

Some results in real analysis

Proposition. Let $f: [a, b] \rightarrow \mathbb{R}$ be a function such that left-hand limit $f(x-)$ and right-hand limit $f(x+)$ of $f$ exist at every $x \in [a, b]$. Then the number of discontinuities of $f$ is at most countable.

Proof: Define a real-valued function $g$ on $[a, b]$ by $g(x) = \max\{|f(x) - f(x+)|, |f(x) - f(x-)|\}$. Then $f$ is continuous at $x$ iff $g(x) = 0$. Let $D_n = \{x \in [a, b]: g(x) \geq 1/n\}$. The set of points of discontinuity of $f$ is exactly $\bigcup_{n = 1}^\infty D_n$. We prove that each $D_n$ is finite.
Suppose on the contrary that some $D_n$ is infinite. Then there is a sequence $\{x_k\}$ of distinct points in $D_n$ which (after passing to a subsequence) converges to some $x \in [a, b]$. Since $f(x+), f(x-)$ exist, we can choose some $\delta > 0$ such that $|f(y) - f(x+)| < 1/4n$ whenever $0 < y - x < \delta$ and $|f(y) - f(x-)| < 1/4n$ whenever $0 < x - y < \delta$. Therefore $|f(y) - f(z)| < 1/2n$ whenever $y, z \in (x - \delta, x)$ or $y, z \in (x, x + \delta)$. We can find some $k$ such that $0 < |x_k - x| < \delta$. If $x_k > x$, then $|f(y) - f(x_k)| < 1/2n$ whenever $y \in (x, x + \delta)$; letting $y \rightarrow x_k$ from both the left and the right we get $|f(x_k+) - f(x_k)| \leq 1/2n$ and $|f(x_k-) - f(x_k)| \leq 1/2n$, so $g(x_k) \leq 1/2n < 1/n \leq g(x_k)$, a contradiction. If $x_k < x$ then similarly we obtain a contradiction. Hence $D_n$ must be finite. QED.

Proposition. Let $f_n: U \rightarrow \mathbb{R}$ be a sequence of continuous functions on $U \subseteq \mathbb{R}^d$ such that $f_1 \leq f_2 \leq \cdots$ and $\{f_n\}$ converges pointwise to a continuous function $f$ on $U$. Then $f_n \rightarrow f$ uniformly on compact subsets.

Proof: Note that $f_1 \leq \cdots \leq f_n \leq \cdots \leq f$. Let $K \subseteq U$ be compact and $\epsilon > 0$. For any $x \in K$, since $f_n(x) \rightarrow f(x)$, there is $N_x \in \mathbb{N}$ such that $f(x) - f_n(x) < \epsilon/3$ whenever $n \geq N_x$. For any $n \in \mathbb{N}$, by uniform continuity of $f$ and $f_n$ on $K$, there is $\delta_n > 0$ such that for any $x, y \in K$, $|f(x) - f(y)| < \epsilon/3$ and $|f_n(x) - f_n(y)| < \epsilon/3$ whenever $|x - y| < \delta_n$. For every $x \in K$, if $y \in B(x, \delta_{N_x})$, then
\[ \begin{align*}
&f(y) - f_n(y) \leq f(y) - f_{N_x}(y) \\ &= [f(y) - f(x)] + [f(x) - f_{N_x}(x)] + [f_{N_x}(x) - f_{N_x}(y)] \\ &< \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon
\end{align*} \]
whenever $n \geq N_x$. Since $\{B(x, \delta_{N_x}): x \in K\}$ is an open cover of $K$, then there are $x_1, \ldots, x_k \in K$ such that $K = \bigcup_{j = 1}^k B(x_j, \delta_{N_{x_j}})$. Take $N = \max\{N_{x_1}, \ldots, N_{x_k}\}$. For $n \geq N$, if $x \in K$, then $x \in B(x_j, \delta_{N_{x_j}})$ for some $1 \leq j \leq k$, so $f(x) - f_n(x) < \epsilon$. Hence $f_n \rightarrow f$ uniformly on $K$. QED.
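A small numerical illustration of the second proposition (Dini's theorem): on the compact set $[0, 1]$ the partial sums $f_n(x) = \sum_{k = 1}^n x^k/k!$ increase pointwise to the continuous limit $e^x - 1$, and the sup-norm error goes to 0.

```python
import numpy as np
from math import factorial

x = np.linspace(0.0, 1.0, 1001)          # the compact set K = [0, 1]
f = np.exp(x) - 1.0                      # the continuous pointwise limit

fn = np.zeros_like(x)
for n in range(1, 11):
    fn += x ** n / factorial(n)          # f_n = sum_{k <= n} x^k / k!, increasing in n
    print(n, np.max(f - fn))             # sup-norm error on K decreases to 0
```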


(To be continued)

Friday, April 27, 2012

Conformal equivalence of annuli

For $0 < r < R$, let $A(r, R) = \{z \in \mathbb{C}: r < |z| < R\}$, the annulus centered at the origin with inner radius $r$ and outer radius $R$. We are going to show that the conformal equivalence classes of such annuli in the complex plane are parametrized by the ratio $R/r$.

Theorem. $A(r_1, R_1)$ is conformally equivalent to $A(r_2, R_2)$ if and only if $R_1/r_1 = R_2/r_2$.
Proof: Suppose $R_1/r_1 = R_2/r_2$. Then there exists $k > 0$ such that $R_2 = k R_1$ and $r_2 = k r_1$. The map $f(z) = kz$ gives a conformal equivalence from $A(r_1, R_1)$ onto $A(r_2, R_2)$.
Conversely, let $f$ be a biholomorphic map from $A(r_1, R_1)$ onto $A(r_2, R_2)$. By scaling if necessary,  we may assume $r_1 = r_2 = 1$. Let $A_1 = A(1, R_1)$ and $A_2 = A(1, R_2)$. Fix some $1 < r < R_2$.
Let $C = \{z \in \mathbb{C}: |z| = r\}$.
Since $f^{-1}$ is continuous, $f^{-1}(C)$ is compact, so we can find some $\delta > 0$ such that $A(1, 1 + \delta) \cap f^{-1}(C) = \emptyset$. Let $V = f(A(1, 1 + \delta))$. Since $f$ is continuous, $V$ is connected, so either $V \subseteq A(1, r)$ or $V \subseteq A(r, R_2)$. By replacing $f$ with $R_2/f$, we may assume the first case holds.
Claim: $|f(z_n)| \rightarrow 1$ whenever $|z_n| \rightarrow 1$.
Proof of claim: Since $|z_n| \rightarrow 1$, we may assume $\{z_n\} \subseteq A(1, 1 + \delta)$. Note that $\{f(z_n)\}$ does not have a limit point in $A_2$, since otherwise $\{z_n\}$ would have a limit point in $A_1$ by continuity of $f^{-1}$, contradicting $|z_n| \rightarrow 1$. Hence every limit point of $\{|f(z_n)|\}$ is either 1 or $R_2$. The value $R_2$ is ruled out because $f(z_n) \in V \subseteq A(1, r)$, so $|f(z_n)| \leq r < R_2$; therefore $|f(z_n)| \rightarrow 1$.
Similarly, we also have:
Claim:  $|f(z_n)| \rightarrow R_2$ whenever $|z_n| \rightarrow  R_1$.
Now set $\alpha = \log R_2/ \log R_1$. Define $g: A_1 \rightarrow \mathbb{R}$ by \[ g(z)  = \log |f(z)|^2 - \alpha \log |z|^2 = 2(\log |f(z)| - \alpha \log |z|). \] We know that $\log |h|$ is harmonic whenever $h$ is holomorphic and nonzero, so $g$ is a harmonic function. By the two claims, $g$ extends to a continuous function on $\overline{A_1}$ vanishing on $\partial A_1$. By the maximum principle for harmonic functions, $g$ vanishes identically on $A_1$. In particular, \[ 0 = \frac{\partial g}{\partial z} = \frac{f'(z)}{f(z)} - \frac{\alpha}{z}. \] Take some $1 < c < R_1$ and let $\gamma(t) = ce^{it}, t \in [0, 2\pi]$. We have \[ \alpha = \frac{1}{2 \pi i}\int_\gamma \frac{\alpha}{z} dz = \frac{1}{2 \pi i}\int_\gamma \frac{f'(z)}{f(z)} dz = \mathrm{Ind}_{f \circ \gamma}(0), \] so $\alpha > 0$ is an integer. Observe that \[ \frac{d}{dz}(z^{-\alpha}f(z)) = z^{-\alpha - 1}(-\alpha f(z) + zf'(z)) = 0, \] thus $f(z) = Kz^{\alpha}$ for some nonzero constant $K$. Since $f$ is injective, $\alpha = 1$. Therefore $R_2 = R_1$.

Reference:
Rudin - Real and Complex Analysis (3rd Ed.)

Saturday, April 21, 2012

Koebe's 1/4-Theorem

Notation: $B(a, r) = \{z \in \mathbb{C}: |z - a| < r \}, B = B(0, 1), \mathbb{S}$ is the Riemann sphere.

Let $\mathcal{S}$ be the collection of all injective analytic functions $f$ on the unit disk with $f(0) = 0$ and $f'(0) = 1$. We know from the Open Mapping Theorem that $f(B)$ must contain a disk $B(0, r_f)$. For the class $\mathcal{S}$, there is a universal $r > 0$ such that $f(B)$ contains $B(0, r)$ for all $f \in \mathcal{S}$.

Theorem (Koebe's 1/4-Theorem). If $f \in \mathcal{S}$ then $B(0, 1/4) \subseteq f(B)$.

We need the following result, which is of independent interest.

Theorem (The Area Theorem). Let $g:  B \backslash \{0\} \rightarrow  \mathbb{C}$ be an injective analytic function with Laurent series expansion at 0 \[ g(z) = \frac{1}{z} + c_0 + c_1 z + c_2 z^2 + \cdots. \] Then \[ \sum_{n = 1}^\infty n|c_n|^2 \leq 1. \]
Proof: Extend $g$ to $B$ by setting $g(0) = \infty$; then $g$ is a conformal equivalence between $B$ and $g(B) \subseteq \mathbb{S}$, and $\infty \in g(B)$. Fix any $0 < r < 1$. Consider $U_r = \mathbb{S} \backslash g(\overline{B(0, r)})$. It is an open set in $\mathbb{C}$ with boundary the curve $\gamma$ parametrized by $\theta \mapsto g(re^{i\theta})$, where $\theta$ runs from $2\pi$ to 0. The area of $U_r$ is given by \[ \begin{align*}
\mathrm{Area}(U_r) &= \frac{1}{2i}\int_\gamma \bar{w} dw = -\frac{1}{2i} \int_0^{2\pi} \overline{g(re^{i\theta})}g'(re^{i\theta})ire^{i\theta}d\theta \\
&= - \frac{1}{2} \int_0^{2\pi} \left[ -\frac{1}{r^2} + \sum_{n = 1}^\infty n|c_n|^2 r^{2n} + (\textrm{terms with nonzero integral powers of } e^{i\theta}) \right] d\theta \\
&= \pi \left(\frac{1}{r^2} - \sum_{n = 1}^\infty n|c_n|^2 r^{2n} \right).
\end{align*} \]
Since area is always non-negative, let $r \rightarrow 1$ and we are done.

Proof of Koebe's Theorem: Let $f \in \mathcal{S}$. Then $f$ omits some $w_0 \in \mathbb{C}$ (or otherwise $f^{-1}$ is a bounded entire function, which is then constant by Liouville's Theorem, contradiction), so $h = 1/f$ omits 0 and $1/w_0$. Observe that the Taylor series of $f$ at 0 takes the form $f(z) = z + a_2 z^2 + a_3 z^3 + \cdots$, so \[ h(z) = \frac{1}{z + a_2 z^2 + \cdots} = \frac{1}{z} + c_0 + c_1 z + c_2 z^2 + \cdots. \]  Let $z_0 = 0 \textrm{ or } 1/w_0$. We have \[ g(z) = \frac{1}{h(z) - z_0} =  \frac{z}{1 + (c_0 - z_0)z + c_1 z^2 + \cdots} = z + (z_0 - c_0)z^2 + \cdots. \] In particular, $g(0) = 0$ and $g'(0) = 1$. Also note that $g$ is injective. Thus $g \in \mathcal{S}$.
We claim that there exists some $u \in \mathcal{S}$ such that $u(z)^2 = g(z^2)$. Since $g$ is injective and $g(0) = 0$, the analytic function $g(z)/z$ omits 0 on $B$, so we can find some analytic function $\varphi$ on $B$ such that $\varphi(z)^2 = g(z)/z$ and $\varphi(0) = 1$. Let $u(z) = z\varphi(z^2)$. Then $g(z^2) = z^2 \cdot g(z^2)/z^2 = z^2\varphi(z^2)^2 = u(z)^2$. Since $\varphi(0) = 1$ and $\varphi'(0) = \frac{1}{2}(g(z)/z)^{-1/2}\frac{d}{dz}(g(z)/z)|_{z = 0} = \frac{1}{2} (z_0 - c_0)$, \[ u(z) = z\varphi(z^2) = z + \frac{1}{2}(z_0 - c_0)z^3 + \cdots. \] Hence $u(0) = 0$ and $u'(0) = 1$. To show that $u \in \mathcal{S}$ it remains to prove that $u$ is injective. Suppose $u(z) = u(w)$. Then $g(z^2) = g(w^2)$. By injectivity of $g$, $z^2 = w^2$, i.e. $z = \pm w$. If $z = -w$, then $u(w) = u(z) = u(-w) = -u(w)$, so $g(w^2) = u(w)^2 = 0$ and $w^2 = 0$ by injectivity of $g$ again, i.e. $w = z = 0$. In any case we have $z = w$.
Now consider the Laurent expansion of $1/u$ at 0, we have \[ \frac{1}{u(z)} = \frac{1}{z + \frac{1}{2}(z_0 - c_0)z^3 + \cdots} = \frac{1}{z}(1 - \frac{1}{2}(z_0 - c_0)z^2 + \cdots) = \frac{1}{z} - \frac{1}{2}(z_0 - c_0)z + \cdots. \] By the Area Theorem, $|-\frac{1}{2}(z_0 - c_0)| \leq 1$, or $|z_0 - c_0| \leq 2$. Therefore $|1/w_0| \leq |1/w_0 - c_0| + |0 - c_0| \leq 2 + 2 = 4$, or $|w_0| \geq 1/4$. It follows that $\mathbb{C} \backslash f(B) \subseteq \{w \in \mathbb{C}: |w| \geq 1/4\}$, so $B(0, 1/4) \subseteq f(B)$. QED.


Reference:
Andersson - Topics in Complex Analysis

Friday, April 20, 2012

Weak*-closed subspaces as dual spaces

Let $X$ be a Banach space and $M$ be a subspace of $X^*$ that is closed in the weak* topology. We show that $M$ can be identified with the dual space of some Banach space.

Let $M_\perp = \{x \in X: f(x) = 0 \,\, \forall f \in M\} = \bigcap_{f \in M} \ker(f)$. This is a (weakly) closed subspace of $X$. Then we can form the quotient Banach space $X/M_\perp$ with the norm $\|x + M_\perp\| = \inf_{m \in M_\perp} \|x - m\|$.

Theorem. $M$ is isometrically isomorphic to $(X/M_\perp)^*$.
Proof: Define a map $\Phi: M \rightarrow (X/M_\perp)^*$ by $[\Phi(f)](x + M_\perp) = f(x)$ for all $f \in M$. Firstly, $\Phi(f)$ is a well-defined linear functional on $X/M_\perp$ because $f$ vanishes on $M_\perp$ whenever $f \in M$. Secondly, $\Phi(f)$ is bounded as \[ |\Phi(f)(x + M_\perp)| = |f(x)| = |f(x - m)| \leq \|f\|\|x - m\| \,\, \forall m \in M_\perp, \] so $|\Phi(f)(x + M_\perp)| \leq \|f\|\|x + M_\perp\|$. Hence $\Phi$ is well-defined.
Clearly $\Phi$ is linear. From the above we have $\|\Phi(f)\| \leq \|f\|$. Let $\epsilon > 0$. Choose some $x \in X$ with $\|x\| \leq 1$ and $|f(x)| > \|f\| - \epsilon$. Then $\|x + M_\perp\| \leq 1$ and $|[\Phi(f)](x + M_\perp)| = |f(x)| > \|f\| - \epsilon$. Since $\epsilon > 0$ was arbitrary, $\|\Phi(f)\| \geq \|f\|$, and thus $\|\Phi(f)\| = \|f\|$. It follows that $\Phi$ is an isometry.
It remains to show that $\Phi$ is surjective. Let $\ell \in (X/M_\perp)^*$ be given. Define $f \in X^*$ by $f(x) = \ell(x + M_\perp)$, i.e. $f = \ell \circ \pi$ where $\pi$ is the quotient map. Then $f$ vanishes on $M_\perp$ by definition, hence $f \in M$ by the following lemma and it is clear that $\Phi(f) = \ell$.

Lemma. If $f \in X^*$ vanishes on $M_\perp$, then $f \in M$.
Proof: Suppose $f \in X^* \backslash M$. By Hahn-Banach Theorem and weak*-closedness we can find some $x \in X$ such that $f(x) = 1$ and $g(x) = 0$ whenever $g \in M$. But then $x \in M_\perp$ and so $f(x) = 0$, contradiction.

This result implies the following important observation in the theory of operator algebras.

Theorem. Every von Neumann algebra is the dual space of some Banach space.
Proof: Let $M$ be a von Neumann algebra on a Hilbert space $\mathcal{H}$. Then $M$ is a $\sigma$-weakly closed subalgebra of $B(\mathcal{H})$, i.e. it is a weak*-closed subalgebra of $B(\mathcal{H})$ where the weak* topology of $B(\mathcal{H})$ comes from the fact that the dual space of the trace-class operators $L^1(\mathcal{H})$ on $\mathcal{H}$ can be identified with $B(\mathcal{H})$. By our main theorem, $M \cong (L^1(\mathcal{H})/M_\perp)^*$.

A predual of a von Neumann algebra $M$ is a Banach space $X$ such that $M = X^*$. The above result says that every von Neumann algebra has a predual. In fact, the predual is unique by a result of Sakai. Furthermore, the property of having a predual characterizes von Neumann algebras among the class of C*-algebras.

Theorem (Sakai).  Suppose $A$ is a W*-algebra, i.e. $A$ is a C*-algebra and the dual space of some Banach space. Then $A$ is *-isomorphic to a von Neumann algebra over some Hilbert space.

Open mapping theorem in terms of norm estimates

Here we present a simple but useful corollary of the Open Mapping Theorem in functional analysis.

Notation: $B_X(a, r) = \{x \in X: \|x - a\| < r\}$.

Proposition. Suppose $X$ and $Y$ are Banach spaces. Let $f: X \rightarrow Y$ be a bijective bounded linear map. Then there exists some $\delta > 0$ such that $\|f(x)\| \geq \delta\|x\|$ for all $x \in X$.

Proof: By the Open Mapping Theorem, $f$ is an open map. Hence we can find some $\delta > 0$ such that $B_Y(0, \delta) \subseteq f(B_X(0, 1))$. Now if $\|x\| \geq 1$ and $\|f(x)\| < \delta$, then $f(x) = f(x')$ for some $x' \in B_X(0, 1)$, so $x = x'$ by injectivity, contradicting $\|x\| \geq 1$. Hence $\|f(x)\| \geq \delta$ whenever $\|x\| \geq 1$. Therefore, for $x \neq 0$, $\|f(x)\| = \|x\|\|f(x/\|x\|)\| \geq \delta\|x\|$, and the inequality is trivial for $x = 0$.

Here is an application of the above result in the theory of Fourier series.

Theorem. The Fourier transform $\mathcal{F}: L^1(\mathbb{T}) \rightarrow c_0(\mathbb{Z})$ is not surjective.
Proof: Suppose not. Note that $\mathcal{F}$ is injective by the uniqueness theorem for Fourier coefficients, so $\mathcal{F}$ would be a bijection, and the Proposition gives $\delta > 0$ such that $\|\hat{f}\|_\infty \geq \delta\|f\|_1$ for all $f \in L^1(\mathbb{T})$. But if $\{D_N\}_{N = 0}^\infty$ is the sequence of Dirichlet kernels, i.e.
\[ D_N(x) = \sum_{n = -N}^N e^{inx} \]
then we have $\|D_N\|_1 \rightarrow \infty$ and $\|\widehat{D_N}\|_\infty \geq \delta\|D_N\|_1 \rightarrow \infty$, which is absurd since $\widehat{D_N}$ takes only the values 0 and 1, so $\|\widehat{D_N}\|_\infty = 1$.
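A numerical look at the two norms appearing in the proof (using the normalization $\|f\|_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|f|$): $\|D_N\|_1$ grows like $\log N$, while $\|\widehat{D_N}\|_\infty = 1$ since the Fourier coefficients of $D_N$ are 0 or 1.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200001)
for N in (1, 4, 16, 64, 256):
    # D_N(x) = sum_{|n| <= N} e^{inx} = 1 + 2 * sum_{k=1}^N cos(kx)
    DN = 1.0 + 2.0 * sum(np.cos(k * x) for k in range(1, N + 1))
    L1 = np.abs(DN).mean()          # ≈ (1/2π) ∫ |D_N|, grows like a constant times log N
    print(N, round(L1, 3))          # while sup_n |D_N^(n)| = 1 for every N
```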

Tuesday, April 17, 2012

Boundedness of weakly convergent sequences

Let X be a Banach space. A net $\{x_\alpha\}$ in X is said to weakly converge to $x$ if $\ell(x_\alpha) \rightarrow \ell(x)$ for all $\ell \in X^*$. Clearly, every norm convergent net is weakly convergent (with the same limit). We know that a norm convergent sequence is necessarily bounded. It is natural to ask whether boundedness still holds for weakly convergent sequences.

Theorem. Every weakly convergent sequence in X is bounded.
Proof: Let $\{x_n\}$ be a weakly convergent sequence in X. Let $T_n \in X^{**}$ be defined by $T_n(\ell) = \ell(x_n)$ for all $\ell \in X^*$. Fix an $\ell \in X^*$. Since the sequence $\{\ell(x_n)\}$ is convergent, $\{T_n(\ell)\}_{n \in \mathbb{N}}$ is a bounded set. By the Uniform Boundedness Principle,
\[ \sup_{n \in \mathbb{N}} \|x_n\| = \sup_{n \in \mathbb{N}} \|T_n\| < \infty \] (the first equality holds because the canonical embedding of X into $X^{**}$ is isometric, by the Hahn-Banach Theorem), i.e. $\{x_n\}$ is bounded.

However, the above result is no longer true when sequences are replaced by nets. To give a counterexample, let $X = L^1(\mathbb{R})$. Denote the characteristic function of a subset A by $\chi_A$. Let $\mathcal{O}$ be the collection of all open neighborhoods of 0 in $\mathbb{R}$ of finite Lebesgue measure with the ordering given by $U \prec V$ iff $U \supseteq V$. Consider the net $\{\chi_U: U \in \mathcal{O}\}$ in X. It is not bounded since $\|\chi_{(-n, n)}\|_1 = 2n \rightarrow \infty$ as $n \rightarrow \infty$. On the other hand, it converges weakly to 0. Fix any $f \in X^* = L^\infty(\mathbb{R})$. Choose M > 0 such that $|f| \leq M$ almost everywhere. For every $\epsilon > 0$, we can choose some $V \in \mathcal{O}$ with $m(V) < \epsilon/M$, then $|\int f\chi_U dm| \leq M m(U) \leq M m(V) < \epsilon$ whenever $U \succ V$.

Tuesday, March 27, 2012

No-cloning Theorem

In a classical computer there is no problem copying an arbitrary piece of data. However, in a quantum computer this is not the case. This is a consequence of a straightforward observation in linear algebra, called the "No-cloning Theorem". In the formalism of quantum computation, we use a unit vector in a Hilbert space to represent the state of a quantum bit. Operations on quantum bits are represented by unitary operators on the Hilbert space. To represent a system with more than one quantum bit, we use tensor products. Here is the No-cloning Theorem in this formulation.


No-cloning Theorem. Let $\mathcal{H}$ be a vector space (over $\mathbb{R}$ or $\mathbb{C}$) of dimension larger than 1. There do not exist a linear operator $U$ on $\mathcal{H} \otimes \mathcal{H}$ and a nonzero vector $v_0 \in \mathcal{H}$ such that $U(x \otimes v_0) = x \otimes x$ for all $x \in \mathcal{H}$.
Proof: Suppose such a $U$ and $v_0$ exist. Choose a nonzero vector $v \in \mathcal{H}$ such that $v, v_0$ are linearly independent. Then $U(v_0 \otimes v_0) = v_0 \otimes v_0, U(v \otimes v_0) = v \otimes v$ and $U((v + v_0) \otimes v_0) = (v + v_0) \otimes (v + v_0) = v \otimes v + v_0 \otimes v + v \otimes v_0 + v_0 \otimes v_0$. But linearity of $U$ demands $U((v + v_0) \otimes v_0) = U(v \otimes v_0) + U(v_0 \otimes v_0) = v \otimes v + v_0 \otimes v_0$. This is a contradiction.
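The contradiction in the proof is easy to see numerically. The CNOT gate from an earlier post clones the basis states, $\ket{x}\ket{0} \mapsto \ket{x}\ket{x}$, but linearity already prevents it from cloning a superposition:

```python
import numpy as np

# CNOT clones the basis states: |x>|0> -> |x>|x> for x = 0, 1.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

e0, e1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
v = (e0 + e1) / np.sqrt(2)                 # a superposition

cloned   = np.kron(v, v)                   # what a cloner would have to produce
produced = CNOT @ np.kron(v, e0)           # what linearity actually gives: (|00> + |11>)/sqrt(2)

print(np.allclose(produced, cloned))       # False: no linear map can clone all states
```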

Additive functions on the real line

A function $f: \mathbb{R} \rightarrow \mathbb{R}$ is said to be additive if $f(x + y) = f(x) + f(y)$ for every $x, y \in \mathbb{R}$. Typical examples of additive functions are the linear ones, i.e. $f(x) = ax$ where $a$ is a real constant. One question is whether they are the only possibilities. If we further require $f$ to be continuous, then the answer is positive. But in general, the answer is no, and one can use basic linear algebra to construct counter-examples.

The trick is that we can regard $\mathbb{R}$ as a vector space over $\mathbb{Q}$ (and the dimension is infinite, exercise). Let $\beta$ be a $\mathbb{Q}$-basis of $\mathbb{R}$ which contains 1. This can be done by a standard application of Zorn's Lemma. Define $f: \mathbb{R} \rightarrow \mathbb{R}$ on $\beta$ as below and extend $\mathbb{Q}$-linearly. Take $f(1) = 1$ and $f(v) = 0$ whenever $1 \neq v \in \beta$. By definition, $f$ is $\mathbb{Q}$-linear, so it is additive. To see that $f$ is not $\mathbb{R}$-linear, let $x$ be any element of $\beta$ other than 1 (such an element exists since $\dim_{\mathbb{Q}} \mathbb{R} > 1$). Then $f(x) = 0$ by construction, but $xf(1) = x \neq 0 = f(x)$.

The law of a random variable

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X$ be a random variable on $\Omega$. The law (or distribution) of $X$ is the (Borel) probability measure on the real line defined by
\[ \mathbb{P}_X(A) = \mathbb{P}(X \in A) \]
for every Borel $A \subseteq \mathbb{R}$.

Conversely, suppose we are given a probability measure $\mu$ on $\mathbb{R}$. We can build a random variable whose law is precisely $\mu$. Take $\Omega = [0, 1]$, let $\mathcal{F}$ be the Borel $\sigma$-algebra and let $\mathbb{P}$ be the Lebesgue measure. For each $\omega \in \Omega$, let
\[ X(\omega) = \inf \{t \in \mathbb{R}: \mu((-\infty, t]) \geq \omega\}. \]
This defines a random variable $X$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ (note that $X$ is nondecreasing in $\omega$, hence Borel measurable).

Claim: The law of $X$ is $\mu$.
Proof: Fix any real number $a$. We need to show that $\mathbb{P}(X \leq a) = \mu((-\infty, a])$. Now for every $\omega \in \Omega$,
\[ \begin{aligned}
&X(\omega) \leq a \\
&\Leftrightarrow \inf \{t \in \mathbb{R}: \mu((-\infty, t]) \geq \omega\} \leq a \\
&\Leftrightarrow \forall n \in \mathbb{N} \inf \{t \in \mathbb{R}: \mu((-\infty, t]) \geq \omega\} < a + \frac{1}{n} \\
&\Leftrightarrow \forall n \in \mathbb{N} \mu((-\infty, a + \frac{1}{n}]) \geq \omega \\
&\Leftrightarrow \mu((-\infty, a]) \geq \omega. \\
\end{aligned} \]
Hence we have that $\mathbb{P}(X \leq a) = \mathbb{P}(\{\omega \in \Omega: \mu((-\infty, a]) \geq \omega\}) = \mathbb{P}([0,  \mu((-\infty, a])]) =  \mu((-\infty, a])$. Q.E.D.
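This construction is exactly inverse-transform sampling. A short NumPy sketch (taking $\mu$ to be the exponential distribution, for which the generalized inverse has the closed form $X(\omega) = -\log(1 - \omega)$), comparing the empirical law of $X$ with $\mu$:

```python
import numpy as np

rng = np.random.default_rng(3)
omega = rng.uniform(size=200_000)          # Omega = [0, 1] with Lebesgue measure P
X = -np.log(1.0 - omega)                   # X(omega) = inf{t : F(t) >= omega}, F(t) = 1 - e^{-t}

# Compare P(X <= a) with mu((-inf, a]) = 1 - e^{-a} for a few values of a.
for a in (0.5, 1.0, 2.0):
    print(a, (X <= a).mean(), 1 - np.exp(-a))   # empirical vs exact, agree to ~1e-3
```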

Reference:
Bass - Probabilistic Techniques in Analysis