Polynomial Optimization 1: Motivation

(Nov 28, 2013)

A (constrained) polynomial minimization problem is of the form

 (\mathcal{PO})         \displaystyle p^{\text{min}} \triangleq \inf_{x\in \mathbb{R}^n} p(x) subject to g_1(x)\geq 0, \cdots, g_m(x)\geq 0

where p,g_1,\cdots,g_m are polynomials in x = (x_1,\cdots,x_n) with real coefficients.

If m = 1 and g_1(x)\equiv 0, then the constraint reads 0\geq 0, which always holds, and the above problem becomes \displaystyle \inf_{x\in \mathbb{R}^n} p(x), an unconstrained polynomial minimization problem.

The purpose of this series is to introduce the basic principles behind this kind of problem. Before we begin, the prerequisites are:

1. Complexity Notions

2. Undergraduate level abstract algebra

3. Undergraduate level real analysis

4. Knowledge about linear programming and semidefinite programming is a plus

We collect the points in \mathbb{R}^n that satisfy the constraints of (\mathcal{PO}) into the set

K \triangleq \{x\in \mathbb{R}^n:g_1(x)\geq 0,\cdots, g_m(x)\geq 0\}.

This set K is called a basic closed semialgebraic set. Abstract algebra is employed to study the geometry of K; the field that does so is called real algebraic geometry.

There is a duality between nonnegative polynomials and a special type of Borel measures on some Euclidean space. This suggests some relation between solving (\mathcal{PO}) and measure theory.

Now let’s look at some examples of (\mathcal{PO}).

Example 1 (Linear Programming).          When the objective function and all constraints are linear polynomials, (\mathcal{PO}) is a linear programming (LP) problem, which can be expressed as

\min \ c^T x\text{ subject to }Ax\leq b

where A\in \mathbb{R}^{m\times n}, b \in \mathbb{R}^m and c\in \mathbb{R}^n.

If we add the constraints x_i^2 = x_i for all i to the LP, we get an NP-hard 0/1 LP problem:

\min \ c^T x\text{ subject to }Ax\leq b\text{ and }x_i^2 = x_i\text{ for all }i

Example 2 (Combinatorial Optimization).          Some well-known problems in combinatorial optimization are also polynomial optimization problems. They include the maximum stable set problem and the maximum cut problem.

Example 3 (Partition Problem).         This is a popular NP-complete problem: given a sequence a_1,\cdots,a_n of positive integers, is there (x_1,\cdots,x_n)\in\{\pm 1\}^n such that x_1 a_1 + \cdots +x_n a_n = 0?

If p^{\text{min}}=0 in the unconstrained polynomial minimization problem where \displaystyle p(x) = \left( \sum_{i=1}^n a_i x_i \right)^2 + \sum_{i=1}^n (x_i^2-1)^2, then the sequence a_1,\cdots,a_n can be partitioned into two blocks whose sums are equal. (Since p is coercive, the infimum is attained at some x^*, which must then satisfy x_i^* = \pm 1 for all i and \sum_i a_i x_i^* = 0.)
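The reduction can be checked by brute force on a small instance. The following sketch (with a made-up instance a) evaluates p over all sign vectors; a minimum of 0 certifies a balanced partition:

```python
from itertools import product

def p(x, a):
    # p(x) = (sum_i a_i x_i)^2 + sum_i (x_i^2 - 1)^2
    return sum(ai * xi for ai, xi in zip(a, x)) ** 2 \
         + sum((xi * xi - 1) ** 2 for xi in x)

a = [3, 1, 1, 2, 2, 1]   # hypothetical instance: 3 + 1 + 1 = 2 + 2 + 1
# On sign vectors x in {-1, +1}^n the second sum vanishes, so p(x) = 0
# exactly when the signs split a into two blocks with equal sums.
best = min(p(x, a) for x in product((-1, 1), repeat=len(a)))
print(best)  # 0, so a balanced partition exists
```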

Example 4 (Quadratically constrained quadratic programming).         If the objective and constraints are quadratic polynomials, then (\mathcal{PO}) is a quadratically constrained quadratic program (QCQP). This type of problem also interests us; we hope to get to it later.

Polynomial optimization is NP-hard in general. One approach to tackling it is to consider relaxations that are easy to solve. Solving a relaxation yields a lower bound on the original problem, and sometimes these lower bounds are close to, or even equal to, the optimal value.

To begin with, we notice that since the infimum is the greatest lower bound, (\mathcal{PO}) is equivalent to solving

(\mathcal{N})          \sup \ \rho\text{ subject to } p - \rho \geq 0 \text{ on }K.

In other words, we look for the largest \rho such that p-\rho is nonnegative on K. The next question to ask is whether this problem is feasible: is there a \rho such that p -\rho is nonnegative on K? Or, put more broadly: what are the nonnegative polynomials on K?

If K = \mathbb{R}^n, an answer was given by David Hilbert in the late 19th century. To describe his result we need some notation and definitions.

(1) \mathbb{N} = the set of nonnegative integers.

(2) \displaystyle \mathbb{N}^n_t \triangleq \left\{ \alpha\in \mathbb{N}^n: |\alpha| \triangleq \sum_{i=1}^n \alpha_i\leq t\right\}

(3) {\bf x} = (x_1,\cdots,x_n)\in \mathbb{R}^n

(4) Set \mathbb{R}[{\bf x}] = the ring of multivariate polynomials in n variables with real coefficients. A monomial in \mathbb{R}[{\bf x}] is of the form {\bf x}^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, whose degree is |\alpha|. Any p\in \mathbb{R}[{\bf x}] is of the form \displaystyle p = \sum_{\alpha\in \mathbb{N}^n} p_\alpha {\bf x}^\alpha where only finitely many p_\alpha's are nonzero. When p_\alpha \neq 0, p_\alpha {\bf x}^\alpha is called a term of p. The degree of p is given by

\text{deg}(p) \triangleq \max\{|\alpha|: p_\alpha \neq 0\}

(5) Set \mathbb{R}[{\bf x}]_t = set of polynomials of degree \leq t.
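As a side note (a standard counting fact, not stated above), the number of monomials {\bf x}^\alpha with |\alpha|\leq t, i.e. the dimension of \mathbb{R}[{\bf x}]_t as a vector space, is \binom{n+t}{n}. A one-liner to compute it:

```python
from math import comb

def dim_poly_space(n, t):
    # dim R[x]_t = number of alpha in N^n with |alpha| <= t = C(n + t, n)
    return comb(n + t, n)

print(dim_poly_space(1, 3))  # 4: the monomials 1, x, x^2, x^3
print(dim_poly_space(2, 2))  # 6: 1, x1, x2, x1^2, x1*x2, x2^2
```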

Definition 1.          A polynomial p\in \mathbb{R}[{\bf x}] is said to be a sum of squares of polynomials (abbreviation: p is SOS) if

\displaystyle p = \sum_{j=1}^m u_j^2 for some u_1,\cdots,u_m\in \mathbb{R}[{\bf x}]

(6) Set \mathcal{P} = \mathcal{P}_n = \{p\in\mathbb{R}[{\bf x}]: p(x)\geq 0\text{ for all }x\in \mathbb{R}^n\} and \mathcal{P}_{n,d} = \mathcal{P}_n \cap \mathbb{R}[{\bf x}]_d

(7) Set \Sigma = \Sigma_n = \{p\in \mathbb{R}[{\bf x}]:p\text{ is SOS}\} and \Sigma_{n,d} = \Sigma_n \cap \mathbb{R}[{\bf x}]_d

It is straightforward that \Sigma_n \subset \mathcal{P}_n and \Sigma_{n,d} \subset \mathcal{P}_{n,d}. How about the reverse inclusion? Are all nonnegative polynomials on \mathbb{R}^n sums of squares?

First, let us look at what we can prove.

Proposition 1.           Any nonnegative univariate (i.e. n = 1) polynomial on \mathbb{R} is a sum of two squares.

Proof. Let p(x) be a nonnegative polynomial on \mathbb{R}. Then its leading coefficient must be positive. The roots of p are either real with even multiplicity or appear in complex conjugate pairs. So

\displaystyle p = c\prod_{i=1}^r (x-a_i)^{2r_i} \prod_{j=1}^s ((x-b_j)^2 + c_j^2)^{s_j}

for some a_i,b_j,c_j,c\in \mathbb{R}, c>0, and r,s,r_i,s_j\in \mathbb{N}. It follows that p is SOS. Using the identity (x_1^2+y_1^2)(x_2^2+y_2^2) = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - x_2 y_1)^2 (which follows from expanding both sides, or from the multiplicativity of the modulus of complex numbers) and induction, we see that \displaystyle \prod_{j=1}^s ((x-b_j)^2 +c_j^2)^{s_j} is a sum of two squares. Since \displaystyle c\prod_{i=1}^r(x-a_i)^{2r_i} is just the square of a polynomial, p is seen to be a sum of two squares.  \square
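Proposition 1's proof is constructive: pick one root from each complex-conjugate pair, form the corresponding monic polynomial q, and read off p = (\mathrm{Re}\,q)^2 + (\mathrm{Im}\,q)^2 on the real line. A numerical sketch using numpy, for the easy case of a monic polynomial with no real roots (real roots of even multiplicity would need extra care):

```python
import numpy as np

# p(x) = x^4 + 5x^2 + 4 = (x^2 + 1)(x^2 + 4): monic, nonnegative, no real roots.
p = np.array([1.0, 0.0, 5.0, 0.0, 4.0])

roots = np.roots(p)
q = np.poly(roots[roots.imag > 0])   # keep one root from each conjugate pair

xs = np.linspace(-3.0, 3.0, 101)
qx = np.polyval(q, xs)
# On the real line p(x) = |q(x)|^2 = (Re q)^2 + (Im q)^2: a sum of two squares.
assert np.allclose(np.polyval(p, xs), qx.real ** 2 + qx.imag ** 2)
print("two-squares decomposition verified")
```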

(Dec 2, 2013)

To prove the next proposition we need the notion of homogenization.

Definition 2.          A polynomial p \in \mathbb{R}[{\bf x}] is said to be homogeneous (or a form) if all its terms have the same degree. If p \in \mathbb{R}[{\bf x}] is of degree d, write \displaystyle p = \sum_{|\alpha|\leq d} p_\alpha {\bf x}^\alpha, then its homogenization is the polynomial \widetilde{p}\in \mathbb{R}[{\bf x},x_{n+1}] defined by

\displaystyle\widetilde{p}\triangleq \sum_{|\alpha|\leq d}p_\alpha {\bf x}^\alpha x_{n+1}^{d-|\alpha|}.

Indeed, \widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d\, p({\bf x}/x_{n+1}) for x_{n+1}\neq 0; that is, \widetilde{p} is the numerator of p({\bf x}/x_{n+1}) after clearing denominators.
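A quick sanity check of the identity \widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d\, p({\bf x}/x_{n+1}) (for x_{n+1}\neq 0) on a made-up degree-3 example:

```python
# A hypothetical degree-3 polynomial and its hand-computed homogenization.
p  = lambda x1, x2: x1 ** 2 * x2 + 3 * x1 - 5
pt = lambda x1, x2, x3: x1 ** 2 * x2 + 3 * x1 * x3 ** 2 - 5 * x3 ** 3

for (x1, x2, x3) in [(1.0, 2.0, 0.5), (-2.0, 3.0, 4.0), (0.3, -1.1, 2.0)]:
    assert pt(x1, x2, 1.0) == p(x1, x2)               # setting x3 = 1 recovers p
    assert abs(pt(x1, x2, x3) - x3 ** 3 * p(x1 / x3, x2 / x3)) < 1e-9
print("homogenization identity verified")
```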

Lemma 2.          If p\in \mathbb{R}[{\bf x}] is SOS, then \deg(p) is even. Also, any decomposition p = \sum_{j=1}^m u_j^2 where u_j \in \mathbb{R}[{\bf x}] satisfies \deg(u_j)\leq \deg(p)/2 for all j.

Proof. Assume p is SOS. Then p\geq 0 on \mathbb{R}^n, so \deg(p) is even, say \deg(p) = 2d. Write p = \sum_{j=1}^m u_j^2 where u_j = \sum_\alpha u_{j,\alpha}{\bf x}^\alpha, and set k \triangleq \max_j \deg(u_j). If k\leq d we are done, so suppose k\geq d+1. Decompose each u_j = a_j + b_j where b_j \triangleq \sum_{|\alpha| = k} u_{j,\alpha}{\bf x}^\alpha and a_j \triangleq u_j - b_j. Then p - \sum_j (a_j^2 + 2a_j b_j) = \sum_j b_j^2. The right-hand side \sum_j b_j^2 is a nonzero homogeneous polynomial of degree 2k (nonzero because b_j \neq 0 for some j, and a sum of squares of real polynomials vanishes only if every summand does), while the left-hand side is a polynomial of degree \leq 2k-1, because \deg(p) = 2d \leq 2k-2, \deg(a_j^2) \leq 2k-2 and \deg(a_j b_j) \leq 2k-1 for all j. Contradiction. \square

Lemma 3.          Given a polynomial p\in \mathbb{R}[{\bf x}] and let \widetilde{p}\in \mathbb{R}[{\bf x},x_{n+1}] be its homogenization. Then

(1) p\geq 0 on \mathbb{R}^n if and only if \widetilde{p}\geq 0 on \mathbb{R}^{n+1};

(2) p is SOS if and only if \widetilde{p} is SOS.

Proof. For (1) and (2), the “if” part follows from the fact p(\cdot) = \widetilde{p}(\cdot,1). Conversely, for (1), if p\geq 0 on \mathbb{R}^n then d \triangleq \deg(p) is even and so \widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d p({\bf x}/x_{n+1}) \geq 0 whenever x_{n+1}\neq 0. Letting x_{n+1} \rightarrow 0 and using the continuity of \widetilde{p}, \widetilde{p}({\bf x},0) \geq 0. So \widetilde{p}\geq 0. For (2), if p = \sum_j u_j^2 where u_j \in \mathbb{R}[{\bf x}], then p is nonnegative thus d \triangleq \deg(p) is even, and hence

\displaystyle\widetilde{p}({\bf x},x_{n+1}) = x_{n+1}^d \sum_j u_j^2({\bf x}/x_{n+1}) = \sum_j \left[ x_{n+1}^{d/2} u_j ({\bf x}/x_{n+1})\right]^2.

By Lemma 2, \deg(u_j)\leq d/2 for all j, so x_{n+1}^{d/2} u_j ({\bf x}/x_{n+1})\in \mathbb{R}[{\bf x},x_{n+1}], and hence \widetilde{p} is SOS. \square

Proposition 4.          Any nonnegative quadratic polynomial (i.e. d = 2 and n is arbitrary) is a sum of squares.

Proof. Let p\in \mathbb{R}[{\bf x}]_2 be of the form p = {\bf x}^T Q{\bf x} + 2c^T {\bf x} + b, where Q is a symmetric n\times n matrix, c\in \mathbb{R}^n and b\in \mathbb{R}. Its homogenization is \widetilde{p} = {\bf x}^T Q{\bf x} + 2x_{n+1} c^T {\bf x} + bx_{n+1}^2 = \widetilde{x}^T \widetilde{Q} \widetilde{x}, where \widetilde{x} = \begin{pmatrix} {\bf x}\\ x_{n+1} \end{pmatrix} and \widetilde{Q} = \begin{pmatrix} Q & c \\ c^T & b \end{pmatrix}. Using Lemma 3 we know \widetilde{p}\geq 0 on \mathbb{R}^{n+1} so \widetilde{Q} is positive semidefinite. As \widetilde{Q} is symmetric positive semidefinite, it follows from eigenvalue decomposition with nonnegative eigenvalues that one can write \widetilde{Q} = \sum_j u^{(j)} (u^{(j)})^T for some u^{(j)} \in \mathbb{R}^{n+1}. Thus \widetilde{p} = \sum_j (\sum_{i=1}^{n+1} u_i^{(j)}x_i)^2 is SOS and by Lemma 3 again, p is SOS too. \square

Tracing through the proof, we see that such a p is a sum of r squares, where r is the rank of \widetilde{Q}.
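The construction in Proposition 4's proof is easy to carry out numerically: eigendecompose the positive semidefinite matrix \widetilde{Q} and read off one square per positive eigenvalue. A sketch with a randomly generated PSD matrix (an assumed example, built as A^T A) standing in for \widetilde{Q}:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n + 1, n + 1))
Qt = A.T @ A                        # symmetric PSD, playing the role of Q-tilde

# Qt = sum_j lam_j v_j v_j^T with lam_j >= 0, so with xt = (x, 1):
#   p(x) = xt^T Qt xt = sum_j ( sqrt(lam_j) * <v_j, xt> )^2, an SOS.
lam, V = np.linalg.eigh(Qt)
lam = np.clip(lam, 0.0, None)       # guard against tiny negative round-off

for _ in range(5):
    x = rng.standard_normal(n)
    xt = np.append(x, 1.0)
    quad = xt @ Qt @ xt
    sos = sum(l * (v @ xt) ** 2 for l, v in zip(lam, V.T))
    assert np.isclose(quad, sos)
print("quadratic SOS decomposition verified")
```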

David Hilbert (1888) also proved that any nonnegative polynomial in n = 2 variables of degree 4 is a sum of three squares, and he established the following complete classification (we omit the proofs):

Theorem 5 (Hilbert’s Theorem). One has

\mathcal{P}_{n,d} = \Sigma_{n,d} if and only if n =1, or d = 2, or (n,d) = (2,4).

There is a well-known example of a nonnegative polynomial that is NOT SOS.

Example 5.          T. Motzkin (1967) published the first explicit example of a non-SOS nonnegative polynomial, the following bivariate sextic (n = 2, d = 6) polynomial:

p(x_1,x_2) \triangleq x_1^2 x_2^2 (x_1^2 + x_2^2 - 3) + 1 = x_1^4 x_2^2 + x_1^2 x_2^4 + 1 - 3 x_1^2 x_2^2

We see at once that p\geq 0 wherever x_1^2 + x_2^2 \geq 3. In fact p is nonnegative everywhere, by the AM-GM inequality:

\displaystyle \frac{x_1^4 x_2^2 + x_1^2 x_2^4 + 1}{3} \geq \sqrt[3]{x_1^4 x_2^2\cdot x_1^2 x_2^4\cdot 1} = x_1^2 x_2^2

To show that p is not SOS, suppose for contradiction that p = \sum_k u_k^2 where u_k \in \mathbb{R}[x_1,x_2]. Note that p(x_1,0) = p(0,x_2) = 1, thus 1 = \sum_k u_k^2 (x_1,0) = \sum_k u_k^2 (0,x_2). This means u_k(x_1,0) and u_k(0,x_2) are constant polynomials, so each u_k has no nonconstant term involving only x_1 or only x_2. By Lemma 2, each u_k has degree at most 3. Thus we may write

u_k(x_1,x_2) = a_k + b_k x_1 x_2 + c_k x_1^2 x_2 + d_k x_1 x_2^2

Comparing the coefficients of x_1^2 x_2^2 in p = \sum_k u_k^2, we get

\displaystyle -3 = \sum_k b_k^2

which is clearly impossible. So p is not SOS.

Then, by Lemma 3, the homogeneous Motzkin form

M(x_1,x_2,x_3) = x_1^2 x_2^2 (x_1^2 + x_2^2 - 3x_3^2) + x_3^6

is also a non-SOS nonnegative polynomial.
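A quick numerical sanity check (not a proof) that the Motzkin polynomial is nonnegative, with zeros at the four points |x_1| = |x_2| = 1:

```python
import random
from itertools import product

def motzkin(x1, x2):
    # p(x1, x2) = x1^4 x2^2 + x1^2 x2^4 + 1 - 3 x1^2 x2^2
    return x1 ** 4 * x2 ** 2 + x1 ** 2 * x2 ** 4 + 1 - 3 * x1 ** 2 * x2 ** 2

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(10000)]
assert all(motzkin(x1, x2) >= 0 for x1, x2 in samples)
assert all(motzkin(s1, s2) == 0 for s1, s2 in product((-1.0, 1.0), repeat=2))
print("Motzkin polynomial checks passed")
```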

Several other authors constructed examples of non-SOS nonnegative polynomials, for example Raphael M. Robinson (slightly after Motzkin), Man-Duen Choi and Tsit-Yuen Lam (1976-77), Anneli and Peter Lax (1978) (cf. the 1971 International Mathematical Olympiad), Konrad Schmüdgen (1979), etc. (Reference: Reznick, Some Concrete Aspects of Hilbert's 17th Problem)

Example 6.          An SOS decomposition is in general not unique. Here is an example:

\displaystyle x_1^2 - x_1x_2^2 + x_2^4 + 1 = \frac{3}{4}(x_1-x_2^2)^2 + \frac{1}{4}(x_1+x_2^2)^2 + 1=\frac{1}{9}(3-x_2^2)^2+\frac{2}{3}x_2^2+\frac{1}{288}(9x_1-16x_2^2)^2+\frac{23}{32}x_1^2
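The two decompositions in Example 6 can be checked against each other by random sampling (a sanity check, not a symbolic verification):

```python
import random

p    = lambda x1, x2: x1 ** 2 - x1 * x2 ** 2 + x2 ** 4 + 1
sos1 = lambda x1, x2: 0.75 * (x1 - x2 ** 2) ** 2 + 0.25 * (x1 + x2 ** 2) ** 2 + 1
sos2 = lambda x1, x2: ((3 - x2 ** 2) ** 2 / 9 + (2 / 3) * x2 ** 2
                       + (9 * x1 - 16 * x2 ** 2) ** 2 / 288 + (23 / 32) * x1 ** 2)

random.seed(1)
for _ in range(1000):
    x1, x2 = random.uniform(-3, 3), random.uniform(-3, 3)
    assert abs(p(x1, x2) - sos1(x1, x2)) < 1e-8
    assert abs(p(x1, x2) - sos2(x1, x2)) < 1e-8
print("both SOS decompositions agree with p")
```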

In this post we discussed a representation result for nonnegative polynomials on \mathbb{R}^n. How can one compute an SOS decomposition effectively? And what about nonnegative polynomials on a general K? Is there a representation result in that setting? We will take this up next time, and the discussion will become more algebraic.


This entry was posted in Algebra, Applied mathematics, Optimization.
