This is intended to be supplementary notes for the course MATH3210 Linear Programming. My purpose is to aid your (and my) understanding by outlining the conceptual structure in just a few pages. I follow the lecture notes closely (you may download them from Professor Wei’s website). For detailed proofs of the theorems, refer to the lecture notes.
Linear programming is an subfield of (convex) optimization. We first formulate the linear programming problem (LPP). Using some linear algebra and convex analysis, we can interpret the problem geometrically. We then develop the simplex method, which relies heavily on basic feasible solution (BFS). We study carefully its implementation. A beautiful concept in linear programming is duality. We recall its key theorems and look at some applications. Finally we apply the theory to study transportation problems.
2. Formulation of LPP
In a LPP, we maximize or minimize a linear functional subject to linear constraints. In feasible canonical form (FCF), the problem is
subject to the constraints
, , .
It is both convenient and advantageous to use matrix notation. Thus we have the problem
Another “canonical representation” is the standard form (SF), which is more convenient mathematically since we now have equations.
We can use either form with no loss of generality. For example, we can transform an LPP from FCF to SF by adding appropriate slack variables.
Until further notice we consider problem in SF. Technically, we assume that
The feasible region is . It is a closed convex set (which may be unbounded). Its elements are called feasible solutions.
If and for all , we call an optimal solution (not necessarily unique), and call the optimal value (which is unique).
The first objective of this course is to develop the simplex method which decides whether the LPP has an optimal solution, and, if exists, find one. The theory uses concepts in linear algebra, our next topic.
3. Linear algebra
In secondary school, you should have learned, while sliding your ruler, that we only need to search for the ‘corners’. In convex analysis, these are called extreme points.
Definition 1.6. Let be convex. An element is an extreme point of if
Theorem 2.5. If the LPP has a finite optimal value, then the optimal value can be attained at an extreme point of the feasible region .
Algebraically, an extreme point is a BFS of the linear system.
Definition 1.1. Consider the linear system of . Pick linearly independent columns of and form a square matrix . The solution of the system is called a basic solution (BS) with basic matrix . If also , then is called a basic feasible solution (BFS). We identify it with a feasible solution of by letting the non-basic variables be .
Theorem 2.3, 2.4. The BFS of the linear system correspond precisely to the extreme points of the feasible region .
Armed with these theorems, to solve an LPP, it suffices to search through all BFS by brute force and pick one with the largest objective value. However, observe that the number of BFS is roughly of the order
For large and , it is simply infeasible to compute all BFS. For example, if and , there are about possible basic matrices.
Clearly, we need a systematic way to search through the BFS. In his ground-breaking work in 1947, George Dantzig invented the now famous simplex method that not only solves the problem but also provides a sound theoretical basis to analyze LPPs.
4. Simplex method
Again we work with a given LPP in SF. Suppose we have a BFS () and , where is the basic matrix. How do we know whether it is optimal, and, if not, how to find a better one?
The answer is based on the following calculation. Let be the current objective value. Then
We may transform it to get
(In practice, the transformation is done by row operations. The inverse , though useful in theory, need not be computed.) Since , the current BFS and can be read off directly. Now, suppose that the row vector contains a positive entry corresponding to . (They are called reduced cost coefficients.) Then by changing from to positive, we may increase the objective value (see below) . On the other hand, if , it is easy to see that the current BFS is optimal, i.e. for any feasible solution , .
Consider again the case that is not optimal. If we increase , some basic variables may decrease and we need to check that the new solution is feasible. Let us change notation and write as
where . On the other hand,
for some uniquely determined coefficients . To get a new BFS, we consider (1) minus a suitable multiple of (2), so that the coefficent of becomes and becomes positive. This leads to the feasibility creteria that should be chosen with
If such an exists, we obtain a new BFS. We may then apply the optimality criteria to this new BFS and iterate, until we find an optimal solution. This is the basic idea of simplex method.
Theorem 2.6, 2.7. Let be a BFS of with basic matrix and objective value . If (i) there exists a column not in B such that and (ii) at least one , then it is possible to obtain a new BFS by replacing one column in by , and the new objective value is at least . If (i) does not hold, then is optimal.
The above procedure can be rephrased in terms of the tableau (Section 3.1 and 3.2). In theory, we should also show that the simplex method solves the LPP within a ‘reasonable’ number of steps. This interesting and non-trivial question lies beyond the scope of this course. Suffice it to say that simplex method works amazingly well in actual problems.
5. Implementation of simplex method
This course emphasizes the implementation issues of simplex method.
5.1 Initialization. Suppose we standardize a LPP in FCF to get , where is the vector of slack variables. We may simply pick and . However, given an arbitrary LPP in SF, a BFS may not been seen directly. Mathematicians have developed two methods for initialization. Both methods involve introducing artificial variables , so that the constraint becomes , , and we may pick as a BFS of the new system.
In the -method (Section 3.3), we maximize instead the functional , where is very large. The idea is that since penalizes the objective value, the artificial variables will be eliminated during simplex iteration. If all artificial variables are eliminated, we have found a BFS of , and we may continue with the usual simplex method.
In the two-phase method (Section 3.4), we maximize instead the functional . Observe that the optimal value of the new problem is if and only if (P) is feasible, and any optimal BFS is a BFS of (P).
5.2 Dealing with ‘bad’ situations. In Theorem 2.6, 2.7 we saw that we need a in order to get a new BFS. If for all , we may go infinitely far in the direction of , hence the feasible region is unbounded. If also , then the optimal value is infinite (Theorem 4.1).
In some situations, at an optimal BFS, the plane corresponding to may intersect on a non-degenerate face. This implies that there are infinitely many optimal solutions, and can be detected from the row of the optimal value (Section 4.3).
A more serious problem is degeneracy, which occurs when some constraints are redundant. This may lead to cycling (Section 4.4).
6. Duality theory
Duality is a beautiful idea in linear programming. To motivate the dual problem, suppose we are given a primal problem in FCF (here the formulas become more ‘symmetric’ if we use FCF):
Suppose we form a non-negative linear combination of the rows of , so that the resulting row vector dominates . Taking transpose, we get
Then we obtain an upper bound of the optimal value (write this out yourself):
The minimization of this upper bound is the dual problem.
Definition 5.1. Let be the primal problem in FCF. The dual problem of is the LPP
It is easy to see that the dual of the dual of is , hence the name duality.
Similar to the dual case, each feasible solution of the primal problem gives a lower bound of the optimal value, so the optimal value is somewhere in between. Naturally, we expect that the maximum lower bound equals the minimal upper bound, and the common value is the optimal value.
Theorem. With the above notations, we have the following:
(Weak duality, Theorem 5.2) If is feasible to and is feasible to , then . In particular, if equality holds, then both solutions are optimal.
(Strong duality, Theorem 5.4) If has an optimal solution, then so does .
We note that theory of simplex method are used in the proof of the strong duality theorem.
Another nice relation between and is complementary slackness:
Theorem 5.7. If is optimal to and is optimal to , then . Conversely, if a pair of feasible solutions satisfy (5), then both solutions are optimal.
Motivated by the duality theorems, you might think that we may use both the primal and dual problems to device an algorithm. This is possible, and we have the dual simplex method. It is just the usual simplex method applied to the dual problem, but here we work with the primal tableau. Simplex iteration generates both a dual and a primal solution with the same objective value and reduced cost coefficients. The dual solution is always a BFS but the primal solution is not necessarily feasible. When we get a feasible primal solution, we are done by weak duality.
The dual formalism allows us to view an optimal solution in terms of the primal and the dual (Theorem 5.5). This observation, together with the dual simplex method, is useful in post-optimality/sensitivity analysis.
7. Transportation problems.
In the transportation problem , we distribute goods from the sources through channels with cost coefficients to the destinations . We assume that supply equals demand, thus . The problem is to minimize the transportation cost. Thus, if denotes the amount of goods transported from source to destination , the problem is
It is easy to show that always possesses an optimal solution. Of course we may apply the usual simplex method, but here the point is that we may take advantage of the special structure of the problem and reduce greatly the computational cost. To see this, arrange the decision variables in the vector
and define , and accordingly. Then is an by matrix, and its column corresponding to is simply
where is the standard basis of . It is not difficult to see that (Theorem 6.1). The structure of is exploited in the following
Theorem 6.4 (Stepping stone theorem). Let consists of independent columns of . For any column of , the unqiue representation takes the form
In particular, each is either or .
The theorem can be interpreted on the transportation tableau (Section 6.3). Starting at , you jump around the tableau along the unique closed path . Each is then a stepping stone.
The stepping stone theorem allows us to calculate rather quickly by diagram chasing. Moreover, at the BFS , the reduced cost coefficients are nothing but
In other words, we may just add and minus along the path. This motivates the table in Section 6.4. We may then develop the corresponding simplex iteration. Note that the transportation tableau has dimensions instead of in the usual simplex tableau, a great reduction.