Closure property of expectation under convexity

Suppose we have a random vector ${Z}$ and a convex set ${C\subset {\mathbb R}^n}$ such that

$\displaystyle \mathop{\mathbb P}(Z\in C) =1.$

If you are doing things with convexity, then you may wonder whether

$\displaystyle \mathop{\mathbb E}(Z) \in C.$

This is certainly true if ${Z}$ takes only finitely many values in ${C}$, or if ${C}$ is closed. In the first case, you just verify the definition of convexity; in the second case, you may use the strong law of large numbers. But if you draw a picture and think for a while, you might wonder whether these conditions are needed: it looks like no matter what values ${Z}$ takes, it cannot leave ${C}$, so the average should still belong to ${C}$ as long as ${C}$ is convex. In this post, we show that this is indeed the case, which gives the following theorem.

Theorem 1 For any convex set ${C\subset {\mathbb R}^n}$ , and for any random vector ${Z}$ such that

$\displaystyle \mathop{\mathbb P}(Z\in C)=1,$

its expectation is still in ${C}$, i.e.,

$\displaystyle \mathop{\mathbb E}(Z) \in C$

as long as the mean exists.
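To see the theorem at work in a case covered by neither of the two easy conditions, here is a small sketch with a hypothetical example that is not from the discussion above: take ${C=(0,1)}$, which is convex but not closed, and let ${Z}$ take the value ${1-2^{-k}}$ with probability ${2^{-k}}$ for ${k=1,2,\dots}$, so ${Z}$ takes infinitely many values accumulating at the missing endpoint ${1}$. Summing the two geometric series gives ${\mathop{\mathbb E}(Z)=1-\tfrac{1}{3}=\tfrac{2}{3}}$, which is in ${C}$. The function name `partial_mean` is an assumption made for illustration.

```python
from fractions import Fraction


def partial_mean(num_terms):
    """Partial sum of E(Z) = sum_k P(Z = z_k) * z_k for the example above.

    Here z_k = 1 - 2^{-k} and P(Z = z_k) = 2^{-k}, k = 1, 2, ...
    Exact rational arithmetic avoids any rounding in the partial sums.
    """
    total = Fraction(0)
    for k in range(1, num_terms + 1):
        prob = Fraction(1, 2**k)   # P(Z = z_k)
        value = 1 - prob           # z_k, a point of the open interval (0, 1)
        total += prob * value
    return total


approx = partial_mean(60)
print(float(approx))        # close to 2/3
print(0 < approx < 1)       # the partial sums stay inside C = (0, 1)
```

The partial sums only approximate the expectation, so this is an illustration of the statement and not a proof of it.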

Skip the following remark if you are not familiar with measure theory.

Remark 1 If you are a measure-theoretic person, you might wonder whether ${C}$ should be Borel measurable. The answer is no: the set ${C}$ need not be Borel measurable. To make the point clear, suppose there is an underlying probability space ${(\Omega, \mathcal{F},\mathop{\mathbb P})}$ and ${Z}$ is a random variable from this probability space to ${(\mathbb{R}^n, \mathcal{B})}$, where ${\mathcal{B}}$ is the Borel sigma-algebra. Then we can either add the condition that the event ${F = \{\omega \in \Omega \mid Z(\omega)\in C \}}$ belongs to ${\mathcal{F}}$, or understand ${\mathop{\mathbb P}(F)=1}$ to mean that ${F}$ is a measurable event with respect to the completed probability space ${(\Omega, \bar{\mathcal{F}},\bar{\mathop{\mathbb P}})}$, that is, the completion of ${(\Omega, \mathcal{F},\mathop{\mathbb P})}$ with respect to ${\mathop{\mathbb P}}$, and overload the notation ${\mathop{\mathbb P}}$ to mean ${\bar{\mathop{\mathbb P}}}$.

To prepare for the proof, recall the separating hyperplane theorem for convex sets.

Theorem 2 (Separating Hyperplane theorem) Suppose ${C}$ and ${D}$ are convex sets in ${{\mathbb R}^n}$ and ${C\cap D = \emptyset}$ , then there exists a nonzero ${a \in {\mathbb R}^{n}}$ such that

$\displaystyle a^Tx \geq a^Ty$

for all ${x\in C,y\in D}$ .
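As a concrete instance of Theorem 2, the following sketch takes the open unit disk in ${{\mathbb R}^2}$ as one convex set and a single point ${p}$ outside it as the other, with the point playing the role of the set on the larger side of the inequality. For this particular pair, the direction ${a=p/\|p\|}$ works by the Cauchy–Schwarz inequality. The helper names `separating_direction` and `sample_open_disk` are hypothetical, and the random check only samples the disk; it does not verify the inequality for every point.

```python
import math
import random


def separating_direction(point):
    """For the open unit disk and a point p with |p| >= 1, return a = p / |p|."""
    norm = math.hypot(point[0], point[1])
    return (point[0] / norm, point[1] / norm)


def sample_open_disk(rng):
    """Rejection-sample a point (x, y) with x^2 + y^2 < 1."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y < 1:
            return (x, y)


rng = random.Random(0)
p = (1.0, 0.0)                      # boundary point, so not in the open disk
a = separating_direction(p)
a_dot_p = a[0] * p[0] + a[1] * p[1]

samples = [sample_open_disk(rng) for _ in range(1000)]
# Check a^T p >= a^T x on the sampled points x of the disk.
separated = all(a_dot_p >= a[0] * x + a[1] * y for x, y in samples)
print(separated)
```

Note that ${p}$ here lies on the boundary of the disk, which is the situation that matters for a set that is not closed.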

Also recall the following little facts about convexity.

Any convex set in ${{\mathbb R}}$ is always an interval.
Any affine space of ${n-m}$ dimension in ${{\mathbb R}^n}$ is of the form ${\{x\in {\mathbb R}^{n}:Ax=b\}}$ for some ${A\in {\mathbb R}^{m\times n}}$ and ${b \in {\mathbb R}^m}$ .

We are now ready to prove Theorem 1.

Proof of Theorem 1: We may suppose that ${C}$ has nonempty interior: if it does not, we can replace ${{\mathbb R}^n}$ by the affine space of smallest dimension containing ${C}$. Suppose ${\mathop{\mathbb E}(Z)}$ is not in ${C}$. Then, by the separating hyperplane theorem applied to the convex sets ${\{\mathop{\mathbb E}(Z)\}}$ and ${C}$, there exists a nonzero ${a}$ such that

$\displaystyle L := a^T\mathop{\mathbb E}(Z) \geq a^Tx, \forall x \in C.$

Since ${Z\in C}$ almost surely, we should have

$\displaystyle a^TZ \leq L$

almost surely. Since ${\mathop{\mathbb E}( a^T Z)= a^T\mathop{\mathbb E}( Z) = L}$, we see that ${a^T Z= L}$ with probability ${1}$. Since the intersection of the hyperplane ${\{x : a^Tx = L\}}$ and ${C}$ is still convex, we see that ${Z}$ almost surely takes values in a convex set inside an ${(n-1)}$-dimensional affine space.
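The step from ${a^TZ\leq L}$ to ${a^TZ=L}$ can be spelled out as follows. This is one standard way to fill it in, using Markov's inequality; the auxiliary variable ${Y}$ is introduced here only for illustration. Let ${Y = L - a^TZ}$, so that ${Y\geq 0}$ almost surely and

$\displaystyle \mathop{\mathbb E}(Y) = L - a^T\mathop{\mathbb E}(Z) = 0.$

For every ${\varepsilon>0}$, Markov's inequality gives

$\displaystyle \mathop{\mathbb P}(Y\geq \varepsilon) \leq \frac{\mathop{\mathbb E}(Y)}{\varepsilon} = 0,$

and therefore

$\displaystyle \mathop{\mathbb P}(Y>0) = \lim_{k\rightarrow \infty}\mathop{\mathbb P}\left(Y\geq \frac{1}{k}\right) = 0,$

which says exactly that ${a^TZ = L}$ with probability ${1}$.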

Repeating the above argument, we can decrease the dimension until ${n=1}$. After a suitable translation and rotation, we can say that ${Z}$ takes its values in an interval in ${\mathbb{R}}$, and we want to argue that the mean of ${Z}$ is always in the interval.

This is almost trivial. Suppose the interval is bounded. If the interval is closed, then since taking expectation preserves order, i.e., ${X\geq Y \implies \mathop{\mathbb E} X\geq \mathop{\mathbb E} Y}$, the mean is in the interval. If the interval is half-open and the mean is not in the interval, then ${\mathop{\mathbb E} Z}$ must be the open endpoint of the interval, since expectation preserves order. But this means that ${Z}$ equals the open endpoint with probability one, which contradicts the assumption that ${Z}$ is in the interval with probability one. The case where both ends are open is handled in the same way. If the interval is unbounded in one direction, the previous argument still works, and if it is all of ${\mathbb{R}}$, then certainly ${\mathop{\mathbb E} Z \in\mathbb{R}}$. This completes the proof. $\Box$