Applications of Duality Theory - Quadratic Programming

Duality theory can simplify and speed up optimization problems by recasting them into a form that may be easier to solve. In this short post, I will show how a general inequality constrained quadratic program with a positive definite Hessian can be recast into a simple quadratic program constrained only to the nonnegative orthant (vectors with nonnegative components).

Let us look at our optimization problem of interest, the inequality constrained strictly convex quadratic program.

\[\min_x \frac{1}{2}x^T Q x - b^Tx\] \[\text{s.t.: } Ax \leq c\]

Since the objective is strictly convex and the constraints are linear, strong duality holds whenever the problem is feasible, so this problem is equivalent to the following max-min (Lagrangian dual) formulation,

\[\max_{\lambda\geq 0}\min_x \left\{\frac{1}{2}x^T Q x - b^Tx +\lambda^T(Ax - c)\right\}\]

The inner minimization is now unconstrained, so we can solve for $x$ in terms of $\lambda$ with ordinary calculus.
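Setting the gradient of the inner objective with respect to $x$ to zero gives the stationarity condition

\[\nabla_x\left(\frac{1}{2}x^T Q x - b^Tx +\lambda^T(Ax - c)\right) = Qx - b + A^T\lambda = 0\]

and solving for $x$ (using the fact that $Q$ is invertible because it is positive definite):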

\[x = Q^{-1}(b - A^T\lambda)\]

We can substitute this back into the Lagrangian, and after collecting terms we arrive at the following optimization problem in terms of $\lambda$.
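Filling in the algebra: evaluating the Lagrangian at $x = Q^{-1}(b - A^T\lambda)$ gives the dual function

\[g(\lambda) = -\frac{1}{2}\lambda^T A Q^{-1} A^T \lambda + \lambda^T\left(AQ^{-1}b - c\right) - \frac{1}{2}b^TQ^{-1}b\]

Maximizing $g(\lambda)$ over $\lambda \geq 0$ is the same as minimizing its negative, and the constant $-\frac{1}{2}b^TQ^{-1}b$ does not affect the minimizer, which leaves: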

\[\min_\lambda \frac{1}{2}\lambda^T P \lambda + d^T\lambda\] \[\text{s.t.: } \lambda_i \geq 0\] \[P = AQ^{-1}A^T\] \[d = c - AQ^{-1}b\]

There are many instances, particularly problems with few optimization variables and many constraints, where the dual problem is preferable. The dual has the same number of constraints, but they are simple nonnegativity bounds that are straightforward and computationally cheap to handle. Additionally, finding a feasible point is trivial: any nonnegative vector will do. We can solve the dual with gradient descent plus a correction step, i.e., whenever gradient descent pushes a component $\lambda_i$ below 0, we set it back to 0. This makes inequality constrained QPs rather simple to solve, all under the assumption that the original problem is feasible!
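In update form, this is projected gradient descent on the dual: take a plain gradient step, then clamp each component back onto the nonnegative orthant,

\[\lambda^{(k+1)} = \max\left(\lambda^{(k)} - \alpha_k\left(P\lambda^{(k)} + d\right),\ 0\right)\]

where $P\lambda + d$ is the gradient of the dual objective and the max is taken element-wise. The code below uses the exact line-search step size for a quadratic, $\alpha_k = \frac{\nabla f^T\nabla f}{\nabla f^T P \nabla f}$.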

As always, here is the source code for this algorithm.


import numpy


def solve_dual_qp(Q: numpy.ndarray, b: numpy.ndarray, A: numpy.ndarray, c: numpy.ndarray, max_iter: int = 10**3) -> numpy.ndarray:
    
    #cache the inverse of Q and form the dual problem data
    Q_inv = numpy.linalg.inv(Q)
    P = A@Q_inv@A.T
    d = c - A@Q_inv@b
    
    
    #initialize the lambda vector
    l = numpy.ones((P.shape[0], 1))
    #create a list to store our lagrange multipliers sequence 
    l_list = list()
    l_list.append(l)
    
    for i in range(max_iter):
        
        #calculate the gradient of the dual objective
        del_f = P @ l + d
        
        #exact line search step size for a quadratic; stop early if the curvature term vanishes
        curvature = (del_f.T @ P @ del_f).item()
        if curvature < 1e-16:
            break
        alpha = (del_f.T @ del_f).item() / curvature
        
        #take the step
        l = l - alpha*del_f
        
        #correct for violations of the bounds
        l = numpy.maximum(l, numpy.zeros_like(l))
        
        #add to the list
        l_list.append(l)
        
        #check termination criteria: stop if the multipliers change by less than 1e-10 between iterations
        if numpy.sum(numpy.abs(l_list[-1] - l_list[-2])) < 10**-10:
            break
    
    #recover the primal solution x = Q^{-1}(b - A^T lambda)
    return Q_inv @ (b - A.T @ l)

Here is a simple problem constrained to the box $-1 \leq x_i \leq 1$, with an unconstrained optimum at (10, 10). We can see the solver arrives at the correct answer of (1, 1), the constrained optimal point.

Q = numpy.eye(2)
A = numpy.block([[-numpy.eye(2)], [numpy.eye(2)]])
b = numpy.zeros((2,1)) + 10
c = numpy.ones((4, 1))

x_opt = solve_dual_qp(Q, b, A, c, 100)
print(x_opt)
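Continuing from the snippet above, a quick sanity check confirms the returned point is feasible and sits at the expected corner of the box (the tolerance values here are arbitrary):

#the returned point should satisfy every row of Ax <= c up to numerical tolerance
print(numpy.all(A @ x_opt <= c + 1e-8))
#and it should match the constrained optimum (1, 1)
print(numpy.allclose(x_opt, numpy.ones((2, 1)), atol=1e-6))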

Without relying on any licensed solvers or packages, we can modify one of the simplest optimization methods to solve a challenging class of optimization problems!