
Comments (10)

bodono avatar bodono commented on June 15, 2024

This has confused me many times, and I had to try and un-confuse myself all over again when I saw this issue. The README is definitely lacking on this, so I have to update that too. I think the basic idea is the following:

The question here is: how do we define the semidefinite cone of order n in the vector space of dimension n*(n+1)/2? Whichever way we do it has to satisfy the properties of the semidefinite cone as we know them. In particular we want the inner product, <x,y> := sum_i (xi * yi), to agree with the matrix inner product, because the theory surrounding SCS relies on the inner product, the Moreau decomposition, etc. Now, if we take just the lower triangular components of a matrix X and stack them to make x, then we'll get different inner products than if we had the full matrix, and SCS won't work at all (I tried it). The way around that is to scale the elements of the vector corresponding to the off-diagonal terms by sqrt(2); that transformation preserves the inner product. In other words, getting from the vector representation back to the matrix solution requires undoing the scaling as you rebuild the matrix. The projection onto the cone at each step of SCS is done in a similar manner.
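A quick numerical check of that claim, as a numpy sketch (the helper name `svec` is mine, not part of SCS):

```python
import numpy as np

def svec(M):
    """Stack the lower-triangular entries of symmetric M into a vector,
    scaling the off-diagonal entries by sqrt(2) so that the standard
    dot product of two such vectors equals trace(M1*M2)."""
    rows, cols = np.tril_indices(M.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * M[rows, cols]

# Two random symmetric matrices.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3)); X = X + X.T
Y = rng.standard_normal((3, 3)); Y = Y + Y.T

# With the sqrt(2) scaling, the vector and matrix inner products agree;
# with plain lower-triangular stacking they would not.
assert np.isclose(np.trace(X @ Y), svec(X) @ svec(Y))
```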

This all means that if we want the objective trace(C*X), we represent it as vec(C)'*vec(X), where the vec operator is defined as above with the rescaling. Similarly, a linear equality constraint trace(A*X) == b becomes vec(A)'*vec(X) == b.

In other words you first have to perform this rescaling on the input data, pass it to SCS, then SCS will return vec(X) as the solution, and then the matrix is rebuilt by undoing the scaling. Michael Grant does it for CVX in the shim for scs if you want to see an example here: https://github.com/cvxr/CVX/blob/rework/shims/cvx_scs.m#L94.
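The round trip might look like this in numpy (a sketch; `svec`/`smat` are hypothetical helper names, analogous to what the shim linked above does):

```python
import numpy as np

def svec(M):
    # lower-triangular stacking with sqrt(2) on the off-diagonals
    rows, cols = np.tril_indices(M.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * M[rows, cols]

def smat(v, n):
    """Undo svec: divide the off-diagonals by 1/sqrt(2) and symmetrize."""
    rows, cols = np.tril_indices(n)
    scale = np.where(rows == cols, 1.0, 1.0 / np.sqrt(2.0))
    M = np.zeros((n, n))
    M[rows, cols] = scale * v
    return M + M.T - np.diag(np.diag(M))

X = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(smat(svec(X), 2), X)  # round trip recovers X
```

The input data (each row of A, and b, on the SDP block) gets `svec` applied before the solve, and `smat` rebuilds the matrix solution afterwards.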

More info from MOSEK and CVXOPT:
Slides 6-7 here: http://docs.mosek.com/slides/ismp2012/sdo.pdf
Top of page 3 here: http://www.seas.ucla.edu/~vandenbe/publications/mlbook.pdf

I'll rework the example you gave too, see how it turns out with the scaling.

from scs.

bodono avatar bodono commented on June 15, 2024

I should say that I'm open to changing how SCS does it if it can be done better. The thing that doesn't make sense is to pass in vec(C) and vec(A) but return X without the vec operation, because then the inner products and residuals etc. won't be what SCS is claiming them to be at termination.

bodono avatar bodono commented on June 15, 2024

I clobbered my Python install, so here is your example in MATLAB, with all the rescaling:

% basic problem data

clear all

A = [ 0.,  1.,  0.,  0.,  0., -1.;
     -1.,  0.,  0.,  1.,  0.,  0.;
      0.,  0.,  0.,  0.,  1., -1.;
      1.,  0.,  0.,  0.,  0.,  0.;
      0.,  0., -1.,  1.,  0.,  0.;
      0., -1.,  0.,  0.,  0.,  0.;
      0.,  0., -1.,  0.,  0.,  0.;
      0.,  0.,  0.,  0., -1.,  0.];

c = [ 0.,  0.,  0.,  0.,  0.,  1.]';
b = [-0., -0., -0.,  2., -0., -0., -0., -0.]';

dims = struct('q', [], 's', [2], 'f', 5, 'l', 0, 'ep', 0)

m = size(A,1);
n = size(A,2);

%% rescaling

A_s = A; b_s = b;
for i=1:n
    A_s(6:8,i) = rescaleSDP(A(6:8,i), 2, sqrt(2));
end
c_s = c;
b_s(6:8) = rescaleSDP(b(6:8), 2, sqrt(2));

data = struct('A', sparse(A_s), 'c', c_s, 'b', b_s)
%% solve
A_s = sparse(A_s);

[x,y,s,info] = scs(data,dims,[]);

Y_mat = refillMatrix(rescaleSDP(y(6:8), 2, 1/sqrt(2)), 2);
S_mat = refillMatrix(rescaleSDP(s(6:8), 2, 1/sqrt(2)), 2);

%% test with cvx (something fishy about dual variable returned)

cvx_begin
%cvx_solver 'scs'
variables xc(n) sc(m) L(2,2)
dual variable yc
minimize(c'*xc)
yc:A*xc + sc == b
sc(1:5) == 0;
L(tril(true(2,2))) == sc(6:8);
L == semidefinite(2)
cvx_end

The scaling function:

function z = rescaleSDP(x, n, val)
% scale off-diagonals by val

M = zeros(n);
idx = tril(true(size(M)));
M(idx) = x;
dd = diag(M);
M = (M - diag(dd))*val + diag(dd);
z = M(idx);

and the function for stuffing the matrix back together:

function M = refillMatrix(x, n)
% rebuild the full symmetric matrix from its lower-triangular vector
M = zeros(n);
idx = tril(true(size(M)));
M(idx) = x;
dd = diag(M);
M = (M - diag(dd));
M = (M+M') + diag(dd);

bodono avatar bodono commented on June 15, 2024

This is what the above script ends up with in Y_mat and S_mat:

>> S_mat
S_mat =
    2.0013    2.0013
    2.0013    2.0013
>> Y_mat
Y_mat =
    0.4999   -0.4999
   -0.4999    0.4999

SteveDiamond avatar SteveDiamond commented on June 15, 2024

I see. That is kind of confusing. For the moment I'll change cvxpy so it passes in scaled data. I just tried this and everything works fine. It would be helpful too if you could clarify all of this in the documentation, as you mentioned.

But longer term this seems more like an internal SCS issue than something the modeling framework should know about. Could you change the operations involving PSD variables in SCS to use the scaling, so that cvxpy doesn't need to do the scaling? Or could you rescale the data in the Python interface?

Another solution would be to have a higher level interface like mathprogbase that handles all the details of going from cvxpy's idea of a cone program to what SCS (or any other cone solver) expects.

mlubin avatar mlubin commented on June 15, 2024

Agreed that from the modeling perspective, I'd consider the rescaling as an internal solver detail.

madeleineudell avatar madeleineudell commented on June 15, 2024

Thanks @bodono. I'm inclined to think that scaling is an internal solver detail, and not something I should handle at the Convex.jl layer. Probably it should live in SCS.jl.

But I'd encourage you to think of the lower triangular encoding as being just an encoding: the matrices really are in S^n. When you say you're taking inner products, I assume (as a modeler) you're taking the standard inner product in S^n, which does not require me to scale the off diagonal entries by sqrt(2). If you buy that the input format is just an input format, and not the "real" representation of the problem, I'd say the scaling should live in SCS proper.

bodono avatar bodono commented on June 15, 2024

I would argue that it is actually a modeling layer issue, since all data preparation is a modeling layer issue, and this is just another step in data preparation.

The contract that SCS is agreeing to is that it will find a point that satisfies the KKT conditions:

Ax + s == b
A'*y + c == 0
s'*y == 0
s \in K, y \in K^*

where the inner products are the standard ones in R^n, no special cases. The cone S^n cannot be represented by simply stuffing the lower triangular entries into the vector unscaled, because then the standard inner product and the projection method will break. So, we define the cone to be the set of vectors that, when scaled and stuffed into the upper triangle, form a matrix that is PSD. That means that in many cases the user will want to scale the input data and the solution, but that's not something SCS cares about, since it has found a point that satisfies the KKT conditions, as it agreed to.
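Concretely, that definition of the cone amounts to a membership test like the following numpy sketch (the function name is illustrative):

```python
import numpy as np

def in_scaled_sd_cone(v, n, tol=1e-9):
    """True if v (length n*(n+1)/2) lies in the vectorized SD cone as
    SCS defines it: unscale the off-diagonals by 1/sqrt(2), stuff the
    entries into a symmetric matrix, and check that it is PSD."""
    rows, cols = np.tril_indices(n)
    scale = np.where(rows == cols, 1.0, 1.0 / np.sqrt(2.0))
    M = np.zeros((n, n))
    M[rows, cols] = scale * v
    M = M + M.T - np.diag(np.diag(M))
    return np.linalg.eigvalsh(M).min() >= -tol

# the vectorization of the 2x2 identity is in the cone;
# flipping a diagonal sign takes it out of the cone
assert in_scaled_sd_cone(np.array([1.0, 0.0, 1.0]), 2)
assert not in_scaled_sd_cone(np.array([1.0, 0.0, -1.0]), 2)
```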

The example at the top of the page was passed in and SCS solved it; it's entirely a modeling issue whether or not that represents the SDP that the user wants solved. Scaling the data so that it corresponds to a real-world problem is no different than converting some atom into a convex-cone-representable problem. For example, we could say that the cone {t, x | ||x||_1 <= t} is the true representation of the problem, and is a convex cone, so a cone solver should handle it. In reality none of the major cone solvers have as a contract that they will solve a problem with that cone; the user has to perform more operations to convert that cone problem into another cone problem that the solver has agreed to solve. If the transformation is done wrongly, then that is a separate issue. Some solvers will agree to do the scaling internally, but not all do (SDPT3 demands that the user scales beforehand, like SCS), and SCS won't because it would break some of the separations it currently has. SCS is designed to use matrix multiplies and cone projections as its interfaces, so doing the scaling internally would ruin the separation we have there now. Same with things like pre-solving, which is also just another step of data preparation: SCS won't do that because it breaks the separation.

What you're really saying is that it would be nice if SCS handled the scaling. But it would also be nice if SCS automatically converted the cone {t, x | ||x||_1 <= t} into other cones it can handle, after all that's just a solver detail since the true representation of the problem is the l1 cone. But both of those would be just taking some of the modeling responsibility away from the modeling layer and into the solver.

SteveDiamond avatar SteveDiamond commented on June 15, 2024

Ok, if it makes the most sense for the SCS math/implementation for the PSD cone to be defined as the set of vectors that, when scaled and stuffed into the upper triangle, form a matrix that is PSD, then I completely agree. That had just seemed like an odd definition of the cone.

bodono avatar bodono commented on June 15, 2024

Yeah, it is an odd definition, but unfortunately it's the only one that works in the space R^(n*(n+1)/2). Originally SCS defined the cone in R^(n*n), which is more natural, but that ended up with problems where the solution was 'wrong' relative to what the user was expecting (#31). To fix that we went with the other definition of the cone, but SCS won't be able to do the rescaling internally like other solvers do, because it was designed from the start as a research tool that allows the user to plug in any cone projection and any matrix multiply routine (including ones that don't have access to the matrix directly) and just work; rescaling the data internally would break that. It could be done in the wrappers around the C code (those are part of the modeling layer in my mind), I just haven't done that yet. The issue with that is that the C API and the other language APIs would then differ, which would be a bit strange.
