## Polynomial Optimization 1: Motivation

(Nov 28, 2013)

A (constrained) polynomial minimization problem is in the form

($\mathcal{PO}$)         $\displaystyle p^{\text{min}} \triangleq \inf_{x\in \mathbb{R}^n} p(x)$ subject to $g_1(x)\geq 0, \cdots, g_m(x)\geq 0$

where $p,g_1,\cdots,g_m$ are polynomials in $x = (x_1,\cdots,x_n)$ with real coefficients.

If $m = 1$ and $g_1(x)\equiv 0$, the constraint $g_1(x)\geq 0$ holds trivially, and the above problem becomes $\displaystyle \inf_{x\in \mathbb{R}^n} p(x)$, an unconstrained polynomial minimization problem.

The purpose of this series is to introduce the basic principles behind this kind of problem. Before we begin, the prerequisites are:

4. Knowledge of linear programming and semidefinite programming is a plus

We collect all points in $\mathbb{R}^n$ that meet the constraints of ($\mathcal{PO}$) by

$K \triangleq \{x\in \mathbb{R}^n:g_1(x)\geq 0,\cdots, g_m(x)\geq 0\}$.

This set $K$ is called a basic closed semialgebraic set. The geometry of such sets is studied with tools from abstract algebra; the field that does so is called real algebraic geometry.

There is a duality between nonnegative polynomials and a special class of Borel measures on Euclidean space. This suggests a relation between solving ($\mathcal{PO}$) and measure theory.

Now let’s look at some examples of ($\mathcal{PO}$).

Example 1 (Linear Programming).          When the objective function and all constraints are linear polynomials, ($\mathcal{PO}$) is a linear programming (LP) problem, which can be expressed as

$\min \ c^T x\text{ subject to }Ax\leq b$

where $A\in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$ and $c\in \mathbb{R}^n$.

If we add the constraints $x_i^2 = x_i$ for all $i$ to the LP, we get an (NP-hard) 0/1 LP problem:

$\min \ c^T x\text{ subject to }Ax\leq b\text{ and }x_i^2 = x_i\text{ for all }i$
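Since the added constraints force each $x_i \in \{0,1\}$, a 0/1 LP can in principle be solved by enumerating all $2^n$ binary vectors, which is exponential work, consistent with NP-hardness. A minimal brute-force sketch in Python (the function and variable names are ours, purely for illustration):

```python
from itertools import product

def solve_01_lp(c, A, b):
    """Brute-force the 0/1 LP: min c^T x subject to Ax <= b, x_i in {0, 1}.

    Enumerates all 2^n binary vectors -- exponential, but it illustrates
    how the constraints x_i^2 = x_i turn an LP into a combinatorial problem.
    """
    n = len(c)
    best_val, best_x = None, None
    for x in product((0, 1), repeat=n):
        # Keep x only if it satisfies every linear constraint (Ax)_i <= b_i.
        if all(sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(len(A))):
            val = sum(c[j] * x[j] for j in range(n))
            if best_val is None or val < best_val:
                best_val, best_x = val, x
    return best_val, best_x

# min -x1 - x2 subject to x1 + x2 <= 1 over {0,1}^2: the optimum is -1.
val, x = solve_01_lp([-1, -1], [[1, 1]], [1])
```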

Example 2 (Combinatorial Optimization).          Some well-known problems in combinatorial optimization are also polynomial optimization problems, including the maximum stable set problem and the maximum cut problem.

Example 3 (Partition Problem).         This is a popular NP-complete problem: given a sequence $a_1,\cdots,a_n$ of positive integers, is there $(x_1,\cdots,x_n)\in\{\pm 1\}^n$ such that $x_1 a_1 + \cdots +x_n a_n = 0$?

With $\displaystyle p(x) = \left( \sum_{i=1}^n a_i x_i \right)^2 + \sum_{i=1}^n (x_i^2-1)^2$, we have $p^{\text{min}}=0$ in the unconstrained polynomial minimization problem if and only if the sequence $a_1,\cdots,a_n$ can be partitioned into two blocks whose sums are equal: the infimum is attained since $p(x)\to\infty$ as $\|x\|\to\infty$, and $p(x) = 0$ forces every $x_i \in \{\pm 1\}$ and $\sum_i a_i x_i = 0$.
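The reduction can be checked by brute force on small instances. A sketch in Python (helper names are ours): since any zero of $p$ must have every $x_i = \pm 1$, it suffices to scan the sign vectors.

```python
from itertools import product

def p(a, x):
    """The penalty polynomial (sum_i a_i x_i)^2 + sum_i (x_i^2 - 1)^2."""
    return sum(ai * xi for ai, xi in zip(a, x)) ** 2 + sum((xi * xi - 1) ** 2 for xi in x)

def min_over_signs(a):
    """Minimum of p over x in {-1, +1}^n; it is 0 iff a balanced partition exists."""
    return min(p(a, x) for x in product((-1, 1), repeat=len(a)))

m1 = min_over_signs([1, 2, 3])   # {1, 2} vs {3}: a balanced partition exists
m2 = min_over_signs([1, 2, 4])   # total sum 7 is odd: no balanced partition
```

For instance, $(1,2,3)$ splits as $\{1,2\}$ versus $\{3\}$, so the minimum is $0$, while for $(1,2,4)$ the best signed sum is $\pm 1$.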

Example 4 (Quadratically constrained quadratic programming).         If the objective and constraints are quadratic polynomials, then ($\mathcal{PO}$) is a quadratically constrained quadratic program (QCQP). This type of problem also interests us; we hope to get to it later.

Polynomial optimization is NP-hard in general. One approach to tackling it is to consider relaxations that are easy to solve. Solving a relaxation yields a lower bound on the original optimal value, and sometimes these lower bounds are close to, or even equal to, that value.

To begin with, notice that since the infimum is the greatest lower bound, ($\mathcal{PO}$) is equivalent to solving

($\mathcal{N}$)          $\sup \ \rho\text{ subject to } p - \rho \geq 0 \text{ on }K$.

In other words, we look for the largest $\rho$ such that $p-\rho$ is nonnegative on $K$. The next thing to ask is whether this problem is feasible, i.e. is there a $\rho$ such that $p -\rho$ is nonnegative on $K$? Put another way: what are the nonnegative polynomials on $K$?

If $K = \mathbb{R}^n$, an answer was given by David Hilbert in the late 19th century. To describe his result we need some notation and definitions.

(1) $\mathbb{N} =$ the set of nonnegative integers.

(2) $\displaystyle \mathbb{N}^n_t \triangleq \left\{ \alpha\in \mathbb{N}^n: |\alpha| \triangleq \sum_{i=1}^n \alpha_i\leq t\right\}$

(3) ${\bf x} = (x_1,\cdots,x_n)\in \mathbb{R}^n$

(4) Set $\mathbb{R}[{\bf x}] =$ ring of multivariate polynomials in $n$ variables with real coefficients. A monomial in $\mathbb{R}[{\bf x}]$ is of the form ${\bf x}^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, whose degree is $|\alpha|$. Any $p\in \mathbb{R}[{\bf x}]$ is of the form $\displaystyle p = \sum_{\alpha\in \mathbb{N}^n} p_\alpha {\bf x}^\alpha$ where only finitely many of the $p_\alpha$ are nonzero. When $p_\alpha \neq 0$, $p_\alpha {\bf x}^\alpha$ is called a term of $p$. The degree of $p$ is given by

$\deg(p) \triangleq \max\{|\alpha|: \alpha\in \mathbb{N}^n,\ p_\alpha \neq 0\}$

(5) Set $\mathbb{R}[{\bf x}]_t =$ set of polynomials of degree $\leq t$.

Definition 1.          A polynomial $p\in \mathbb{R}[{\bf x}]$ is said to be a sum of squares of polynomials (abbreviation: $p$ is SOS) if

$\displaystyle p = \sum_{j=1}^m u_j^2$ for some $m\in\mathbb{N}$ and $u_1,\cdots,u_m\in \mathbb{R}[{\bf x}]$.

(6) Set $\mathcal{P} = \mathcal{P}_n = \{p\in\mathbb{R}[{\bf x}]: p(x)\geq 0\text{ for all }x\in \mathbb{R}^n\}$ and $\mathcal{P}_{n,d} = \mathcal{P}_n \cap \mathbb{R}[{\bf x}]_d$

(7) Set $\Sigma = \Sigma_n = \{p\in \mathbb{R}[{\bf x}]:p\text{ is SOS}\}$ and $\Sigma_{n,d} = \Sigma_n \cap \mathbb{R}[{\bf x}]_d$

It is straightforward that $\Sigma_n \subset \mathcal{P}_n$ and $\Sigma_{n,d} \subset \mathcal{P}_{n,d}$. How about the reverse inclusion? Are all nonnegative polynomials on $\mathbb{R}^n$ sums of squares?

First, let us look at what we can prove.

Proposition 1.           Any nonnegative univariate (i.e. $n = 1$) polynomial on $\mathbb{R}$ is a sum of two squares.

Proof. Let $p(x)$ be a nonnegative polynomial on $\mathbb{R}$; we may assume $p$ is not the zero polynomial. Its leading coefficient must be positive (otherwise $p(x)\to -\infty$ as $x\to\infty$ or $x\to-\infty$). The real roots of $p$ have even multiplicity (a root of odd multiplicity would be a sign change), and the nonreal roots appear in complex conjugate pairs. So

$\displaystyle p = c\prod_{i=1}^r (x-a_i)^{2r_i} \prod_{j=1}^s ((x-b_j)^2 + c_j^2)^{s_j}$

for some $a_i,b_j,c_j\in \mathbb{R}$, $c>0$, and $r,s,r_i,s_j\in \mathbb{N}$. Applying the identity $(x_1^2+y_1^2)(x_2^2+y_2^2) = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - x_2 y_1)^2$ (which follows from expanding both sides, or from the multiplicativity of $|z|^2$ for complex numbers) repeatedly, we see that $\displaystyle \prod_{j=1}^s ((x-b_j)^2 +c_j^2)^{s_j}$ is a sum of two squares. Since $\displaystyle c\prod_{i=1}^r(x-a_i)^{2r_i}$ is just the square of a polynomial (recall $c>0$), multiplying by it keeps a sum of two squares a sum of two squares, so $p$ is a sum of two squares.  $\square$
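The two-squares identity used above can be verified mechanically. A self-contained Python sketch with naive coefficient-list arithmetic (all helper names are ours), applied to the nonnegative polynomial $p = ((x-1)^2+1)((x+2)^2+4)$:

```python
def polymul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def polyadd(p, q):
    """Sum of two polynomials given as ascending coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def polysub(p, q):
    return polyadd(p, [-c for c in q])

# p = ((x-1)^2 + 1) * ((x+2)^2 + 4), with x1 = x - 1, y1 = 1, x2 = x + 2, y2 = 2.
x1, y1, x2, y2 = [-1, 1], [1], [2, 1], [2]
p = polymul(polyadd(polymul(x1, x1), polymul(y1, y1)),
            polyadd(polymul(x2, x2), polymul(y2, y2)))

# Two-squares identity: u = x1*x2 + y1*y2 and v = x1*y2 - x2*y1 give p = u^2 + v^2.
u = polyadd(polymul(x1, x2), polymul(y1, y2))
v = polysub(polymul(x1, y2), polymul(x2, y1))
two_squares = polyadd(polymul(u, u), polymul(v, v))
```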

(Dec 2, 2013)

To prove the next proposition we need to know homogenization.

Definition 2.          A polynomial $p \in \mathbb{R}[{\bf x}]$ is said to be homogeneous (or a form) if all its terms have the same degree. If $p \in \mathbb{R}[{\bf x}]$ is of degree $d$, write $\displaystyle p = \sum_{|\alpha|\leq d} p_\alpha {\bf x}^\alpha$, then its homogenization is the polynomial $\widetilde{p}\in \mathbb{R}[{\bf x},x_{n+1}]$ defined by

$\displaystyle\widetilde{p}\triangleq \sum_{|\alpha|\leq d}p_\alpha {\bf x}^\alpha x_{n+1}^{d-|\alpha|}$.

Equivalently, $\widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d\, p({\bf x}/x_{n+1})$ for $x_{n+1}\neq 0$; that is, $\widetilde{p}$ is the numerator of $\displaystyle p\left(\frac{{\bf x}}{x_{n+1}} \right)$ after clearing denominators.
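Homogenization is straightforward to implement on a sparse representation. A Python sketch (the dict encoding, mapping exponent tuples $\alpha$ to coefficients $p_\alpha$, is our own choice):

```python
def homogenize(p):
    """Homogenize a polynomial given as {exponent tuple alpha: coefficient}.

    Each monomial x^alpha of the degree-d polynomial gains a factor
    x_{n+1}^(d - |alpha|), so every term becomes homogeneous of degree d.
    """
    d = max(sum(alpha) for alpha in p)
    return {alpha + (d - sum(alpha),): coef for alpha, coef in p.items()}

# p(x) = x^2 + 2x + 3  homogenizes to  p~(x, y) = x^2 + 2xy + 3y^2.
p = {(2,): 1, (1,): 2, (0,): 3}
pt = homogenize(p)
```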

Lemma 2.          If a nonzero $p\in \mathbb{R}[{\bf x}]$ is SOS, then $\deg(p)$ is even. Moreover, any decomposition $p = \sum_{j=1}^m u_j^2$ with $u_j \in \mathbb{R}[{\bf x}]$ satisfies $\deg(u_j)\leq \deg(p)/2$ for all $j$.

Proof. Assume $p$ is SOS. Then $p\geq 0$ on $\mathbb{R}^n$, so $\deg(p)$ is even, say $\deg(p) = 2d$. Write $p = \sum_{j=1}^m u_j^2$ where $u_j = \sum_\alpha u_{j,\alpha}{\bf x}^\alpha$, and set $k \triangleq \max_j \deg(u_j)$. If $k\leq d$ we are done, so suppose $k\geq d+1$. Decompose each $u_j = a_j + b_j$, where $b_j \triangleq \sum_{|\alpha| = k} u_{j,\alpha}{\bf x}^\alpha$ is the degree-$k$ part and $a_j \triangleq u_j - b_j$. Then $u_j^2 = a_j^2 + 2a_jb_j + b_j^2$, so $p - \sum_j (a_j^2 + 2a_j b_j) = \sum_j b_j^2$. The right-hand side, $\sum_j b_j^2$, is a nonzero homogeneous polynomial of degree $2k$: it is nonzero because $b_j\neq 0$ for any $j$ attaining the maximum $k$, and a sum of squares of real polynomials cannot vanish identically unless every summand does. The left-hand side, however, has degree $\leq 2k-1$, because $\deg(p) = 2d \leq 2k-2$, $\deg(a_j) \leq k-1$ and $\deg(b_j)=k$ for all $j$. Contradiction. $\square$

Lemma 3.          Given a polynomial $p\in \mathbb{R}[{\bf x}]$ and let $\widetilde{p}\in \mathbb{R}[{\bf x},x_{n+1}]$ be its homogenization. Then

(1) $p\geq 0$ on $\mathbb{R}^n$ if and only if $\widetilde{p}\geq 0$ on $\mathbb{R}^{n+1}$;

(2) $p$ is SOS if and only if $\widetilde{p}$ is SOS.

Proof. For (1) and (2), the “if” part follows from the fact $p(\cdot) = \widetilde{p}(\cdot,1)$. Conversely, for (1), if $p\geq 0$ on $\mathbb{R}^n$ then $d \triangleq \deg(p)$ is even and so $\widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d p({\bf x}/x_{n+1}) \geq 0$ whenever $x_{n+1}\neq 0$. Letting $x_{n+1} \rightarrow 0$ and using the continuity of $\widetilde{p}$, $\widetilde{p}({\bf x},0) \geq 0$. So $\widetilde{p}\geq 0$. For (2), if $p = \sum_j u_j^2$ where $u_j \in \mathbb{R}[{\bf x}]$, then $p$ is nonnegative thus $d \triangleq \deg(p)$ is even, and hence

$\displaystyle\widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d \sum_j u_j^2({\bf x}/x_{n+1}) = \sum_j \left[ x_{n+1}^{d/2} u_j ({\bf x}/x_{n+1})\right]^2$.

By the previous lemma, $x_{n+1}^{d/2} u_j ({\bf x}/x_{n+1})\in \mathbb{R}[{\bf x},x_{n+1}]$ for all $j$ so $\widetilde{p}$ is SOS. $\square$

Proposition 4.          Any nonnegative quadratic polynomial (i.e. $d = 2$ and $n$ is arbitrary) is a sum of squares.

Proof. Let $p\in \mathbb{R}[{\bf x}]_2$ be of the form $p = {\bf x}^T Q{\bf x} + 2c^T {\bf x} + b$, where $Q$ is a symmetric $n\times n$ matrix, $c\in \mathbb{R}^n$ and $b\in \mathbb{R}$. Its homogenization is $\widetilde{p} = {\bf x}^T Q{\bf x} + 2x_{n+1} c^T {\bf x} + bx_{n+1}^2 = \widetilde{x}^T \widetilde{Q} \widetilde{x}$, where $\widetilde{x} = \begin{pmatrix} {\bf x}\\ x_{n+1} \end{pmatrix}$ and $\widetilde{Q} = \begin{pmatrix} Q & c \\ c^T & b \end{pmatrix}$. Using Lemma 3 we know $\widetilde{p}\geq 0$ on $\mathbb{R}^{n+1}$ so $\widetilde{Q}$ is positive semidefinite. As $\widetilde{Q}$ is symmetric positive semidefinite, it follows from eigenvalue decomposition with nonnegative eigenvalues that one can write $\widetilde{Q} = \sum_j u^{(j)} (u^{(j)})^T$ for some $u^{(j)} \in \mathbb{R}^{n+1}$. Thus $\widetilde{p} = \sum_j (\sum_{i=1}^{n+1} u_i^{(j)}x_i)^2$ is SOS and by Lemma 3 again, $p$ is SOS too. $\square$

Keeping track of the proof, we see that the above $p$ is a sum of $r$ squares, where $r$ is the rank of $\widetilde{Q}$.
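The eigendecomposition argument in the proof is directly computable. A numerical sketch with NumPy for the (hypothetical, our own) example $p = (x_1+1)^2 + (x_2-1)^2$:

```python
import numpy as np

# p = (x1 + 1)^2 + (x2 - 1)^2 = x1^2 + x2^2 + 2 x1 - 2 x2 + 2,
# written as p = xt^T Qt xt with xt = (x1, x2, 1).
Qt = np.array([[1.0, 0.0, 1.0],
               [0.0, 1.0, -1.0],
               [1.0, -1.0, 2.0]])

# Qt is symmetric, so it has a real spectral decomposition Qt = V diag(w) V^T.
w, V = np.linalg.eigh(Qt)

# Columns u_j = sqrt(w_j) v_j satisfy Qt = sum_j u_j u_j^T (w_j >= 0 since Qt
# is PSD; clip guards against tiny negative rounding in the zero eigenvalue).
U = V * np.sqrt(np.clip(w, 0.0, None))
reconstruction_error = float(np.abs(U @ U.T - Qt).max())

def p(x1, x2):
    xt = np.array([x1, x2, 1.0])
    return float(xt @ Qt @ xt)

def p_as_sos(x1, x2):
    """Evaluate p through its SOS certificate: sum_j (u_j . xt)^2."""
    xt = np.array([x1, x2, 1.0])
    return float(np.sum((U.T @ xt) ** 2))
```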

David Hilbert (1888) also proved that any nonnegative polynomial in $n = 2$ variables of degree 4 is a sum of three squares, together with the following complete classification (we omit the proofs):

Theorem 5 (Hilbert’s Theorem). One has

$\mathcal{P}_{n,d} = \Sigma_{n,d}$ if and only if $n =1$, or $d = 2$, or $(n,d) = (2,4)$.

There is a well-known example of a nonnegative polynomial which is NOT SOS.

Example 5.          T. Motzkin (1967) published the first explicit example of a non-SOS nonnegative polynomial, the following bivariate sextic ($n = 2$, $d = 6$) polynomial:

$p(x_1,x_2) \triangleq x_1^2 x_2^2 (x_1^2 + x_2^2 - 3) + 1 = x_1^4 x_2^2 + x_1^2 x_2^4 + 1 - 3 x_1^2 x_2^2$

We see that $p\geq 0$ on the region $x_1^2 + x_2^2 \geq 3$. In fact it is nonnegative everywhere, because of the following AM-GM inequality, whose right-hand side equals $x_1^2 x_2^2$:

$\displaystyle \frac{x_1^4 x_2^2 + x_1^2 x_2^4 + 1}{3} \geq \sqrt[3]{x_1^6 x_2^6}$
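Both claims, nonnegativity everywhere and the fact that $p(\pm 1,\pm 1) = 0$ (so the minimum $0$ is attained there), can be probed numerically; a plain-Python sketch:

```python
import random

def motzkin(x1, x2):
    """Motzkin's polynomial x1^2 x2^2 (x1^2 + x2^2 - 3) + 1."""
    return x1**2 * x2**2 * (x1**2 + x2**2 - 3) + 1

# The polynomial vanishes at the four points with x1^2 = x2^2 = 1 ...
corner_values = [motzkin(s1, s2) for s1 in (-1, 1) for s2 in (-1, 1)]

# ... and random sampling is consistent with nonnegativity everywhere.
random.seed(0)
sample_min = min(motzkin(random.uniform(-5, 5), random.uniform(-5, 5))
                 for _ in range(10_000))
```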

To show that $p$ is not SOS, suppose for contradiction that $p = \sum_k u_k^2$ where $u_k \in \mathbb{R}[x_1,x_2]$. Note that $p(x_1,0) = p(0,x_2) = 1$, so $1 = \sum_k u_k^2 (x_1,0) = \sum_k u_k^2 (0,x_2)$. This forces $u_k(x_1,0)$ and $u_k(0,x_2)$ to be constant polynomials, i.e. each $u_k$ has no nonconstant term involving only $x_1$ or only $x_2$. By Lemma 2, each $u_k$ has degree at most 3. Thus we may write

$u_k(x_1,x_2) = a_k + b_k x_1 x_2 + c_k x_1^2 x_2 + d_k x_1 x_2^2$

Comparing the coefficients of $x_1^2 x_2^2$ in $p = \sum_k u_k^2$, we get

$\displaystyle -3 = \sum_k b_k^2$

which is clearly impossible. So $p$ is not SOS.

Then, by Lemma 3, the homogeneous Motzkin form

$M(x_1,x_2,x_3) = x_1^2 x_2^2 (x_1^2 + x_2^2 - 3x_3^2) + x_3^6$

is also a non-SOS nonnegative polynomial.

Various people have constructed further examples of non-SOS nonnegative polynomials, for example Raphael M. Robinson (slightly after Motzkin), Man-Duen Choi and Tsit-Yuen Lam (1976-77), Anneli and Peter Lax (1978) (cf. the 1971 International Mathematical Olympiad), Konrad Schmüdgen (1979), etc. (Reference: Reznick, Some Concrete Aspects of Hilbert's 17th Problem.)

Example 6.          An SOS decomposition is in general not unique. Here is an example:

$\displaystyle x_1^2 - x_1x_2^2 + x_2^4 + 1 = \frac{3}{4}(x_1-x_2^2)^2 + \frac{1}{4}(x_1+x_2^2)^2 + 1=\frac{1}{9}(3-x_2^2)^2+\frac{2}{3}x_2^2+\frac{1}{288}(9x_1-16x_2^2)^2+\frac{23}{32}x_1^2$
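That both right-hand sides really expand to the same polynomial can be verified with exact rational arithmetic: agreement on a large enough grid of points forces polynomials of these degrees to coincide. A Python sketch (helper names are ours):

```python
from fractions import Fraction as F

def p(x1, x2):
    return x1**2 - x1 * x2**2 + x2**4 + 1

def sos_a(x1, x2):
    return F(3, 4) * (x1 - x2**2) ** 2 + F(1, 4) * (x1 + x2**2) ** 2 + 1

def sos_b(x1, x2):
    return (F(1, 9) * (3 - x2**2) ** 2 + F(2, 3) * x2**2
            + F(1, 288) * (9 * x1 - 16 * x2**2) ** 2 + F(23, 32) * x1**2)

# Exact agreement on a 7 x 7 grid; since the degrees in x1 and x2 are at most 4,
# this pins the three polynomials down completely.
grid = [(F(a), F(b)) for a in range(-3, 4) for b in range(-3, 4)]
all_equal = all(sos_a(x1, x2) == p(x1, x2) == sos_b(x1, x2) for x1, x2 in grid)
```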

In this post we discussed a representation result about nonnegative polynomials on $\mathbb{R}^n$. Natural follow-up questions: how can an SOS decomposition be computed effectively? And what about nonnegative polynomials on a general $K$: is there a representation result there as well? We will take this up next time, and the discussion will become more algebraic.

(End)