
Comments (10)

bodono avatar bodono commented on June 15, 2024

This has confused me many times, and I had to try and un-confuse myself all over again when I saw this issue. The README is definitely lacking on this, so I have to update that too. I think the basic idea is the following:

The question here is: how do we define the semidefinite cone of order n in the vector space of dimension n*(n+1)/2? Whichever way we do it has to satisfy the properties of the semidefinite cone as we know them. In particular we want the inner product, <x,y> := sum_i (xi * yi), to agree with the matrix inner product, because the theory surrounding SCS relies on the inner product, the Moreau decomposition, etc. Now, if we take just the lower triangular components of a matrix X and stack them to make x, then we'll get different inner products than if we had the full matrix, and SCS won't work at all (I tried it). The way around that is to scale the elements of the vector corresponding to the off-diagonal terms by sqrt(2); that transformation preserves the inner product. In other words, getting from the vector representation back to the matrix solution requires undoing the scaling as you rebuild the matrix. The projection onto the cone at each step of SCS is done in a similar manner.
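A quick numerical check of that claim, as a numpy sketch (the helper name `svec` is mine, not part of SCS):

```python
import numpy as np

def svec(M):
    """Stack the lower-triangular entries of symmetric M into a vector,
    scaling the off-diagonal entries by sqrt(2) so that the standard
    dot product of two such vectors equals trace(M1*M2)."""
    rows, cols = np.tril_indices(M.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * M[rows, cols]

# Two random symmetric matrices.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3)); X = X + X.T
Y = rng.standard_normal((3, 3)); Y = Y + Y.T

# With the sqrt(2) scaling, the vector and matrix inner products agree;
# with plain lower-triangular stacking they would not.
assert np.isclose(np.trace(X @ Y), svec(X) @ svec(Y))
```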

This all means that if we want the objective trace(C*X), we represent it as vec(C)'*vec(X), where the vec operator is defined as above with the rescaling. Similarly, a linear equality constraint trace(A*X) == b becomes vec(A)'*vec(X) == b.

In other words you first have to perform this rescaling on the input data, pass it to SCS, then SCS will return vec(X) as the solution, and then the matrix is rebuilt by undoing the scaling. Michael Grant does it for CVX in the shim for scs if you want to see an example here: https://github.com/cvxr/CVX/blob/rework/shims/cvx_scs.m#L94.
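The round trip might look like this in numpy (a sketch; `svec`/`smat` are hypothetical helper names, analogous to what the shim linked above does):

```python
import numpy as np

def svec(M):
    # lower-triangular stacking with sqrt(2) on the off-diagonals
    rows, cols = np.tril_indices(M.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * M[rows, cols]

def smat(v, n):
    """Undo svec: divide the off-diagonals by 1/sqrt(2) and symmetrize."""
    rows, cols = np.tril_indices(n)
    scale = np.where(rows == cols, 1.0, 1.0 / np.sqrt(2.0))
    M = np.zeros((n, n))
    M[rows, cols] = scale * v
    return M + M.T - np.diag(np.diag(M))

X = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(smat(svec(X), 2), X)  # round trip recovers X
```

The input data (each row of A, and b, on the SDP block) gets `svec` applied before the solve, and `smat` rebuilds the matrix solution afterwards.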

More info from MOSEK and CVXOPT:
Slides 6-7 here: http://docs.mosek.com/slides/ismp2012/sdo.pdf
Top of page 3 here: http://www.seas.ucla.edu/~vandenbe/publications/mlbook.pdf

I'll rework the example you gave too, see how it turns out with the scaling.

from scs.

bodono avatar bodono commented on June 15, 2024

I should say that I'm open to changing how SCS does it if it can be done better. The thing that doesn't make sense is to pass in vec(C) and vec(A) but return X without the vec operation, because then the inner products and residuals etc. won't be what SCS is claiming them to be at termination.

bodono avatar bodono commented on June 15, 2024

I clobbered my Python install, so here is your example in MATLAB, with all the rescaling:

% basic problem data

clear all

A = [ 0.,  1.,  0.,  0.,  0., -1.;
     -1.,  0.,  0.,  1.,  0.,  0.;
      0.,  0.,  0.,  0.,  1., -1.;
      1.,  0.,  0.,  0.,  0.,  0.;
      0.,  0., -1.,  1.,  0.,  0.;
      0., -1.,  0.,  0.,  0.,  0.;
      0.,  0., -1.,  0.,  0.,  0.;
      0.,  0.,  0.,  0., -1.,  0.];

c = [ 0.,  0.,  0.,  0.,  0.,  1.]';
b = [-0., -0., -0.,  2., -0., -0., -0., -0.]';

dims = struct('q', [], 's', [2], 'f', 5, 'l', 0, 'ep', 0)

m = size(A,1);
n = size(A,2);

%% rescaling

A_s = A; b_s = b;
for i=1:n
    A_s(6:8,i) = rescaleSDP(A(6:8,i), 2, sqrt(2));
end
c_s = c;
b_s(6:8) = rescaleSDP(b(6:8), 2, sqrt(2));

data = struct('A', sparse(A_s), 'c', c_s, 'b', b_s)
%% solve
A_s = sparse(A_s);

[x,y,s,info] = scs(data,dims,[]);

Y_mat = refillMatrix(rescaleSDP(y(6:8), 2, 1/sqrt(2)), 2);
S_mat = refillMatrix(rescaleSDP(s(6:8), 2, 1/sqrt(2)), 2);

%% test with cvx (something fishy about dual variable returned)

cvx_begin
%cvx_solver 'scs'
variables xc(n) sc(m) L(2,2)
dual variable yc
minimize(c'*xc)
yc:A*xc + sc == b
sc(1:5) == 0;
L(tril(true(2,2))) == sc(6:8);
L == semidefinite(2)
cvx_end

The scaling function:

function z = rescaleSDP(x, n, val)
% scale off-diagonals by val

M = zeros(n);
idx = tril(true(size(M)));
M(idx) = x;
dd = diag(M);
M = (M - diag(dd))*val + diag(dd);
z = M(idx);

and the function for stuffing the matrix back together:

function M = refillMatrix(x, n)
% rebuild the full symmetric matrix from its lower-triangular vector
M = zeros(n);
idx = tril(true(size(M)));
M(idx) = x;
dd = diag(M);
M = (M - diag(dd));
M = (M+M') + diag(dd);

bodono avatar bodono commented on June 15, 2024

This is what the above script ends up with in Y_mat and S_mat:

>> S_mat
S_mat =
    2.0013    2.0013
    2.0013    2.0013
>> Y_mat
Y_mat =
    0.4999   -0.4999
   -0.4999    0.4999

SteveDiamond avatar SteveDiamond commented on June 15, 2024

I see. That is kind of confusing. For the moment I'll change cvxpy so it passes in scaled data. I just tried this and everything works fine. It would be helpful too if you could clarify all of this in the documentation, as you mentioned.

But longer term this seems more like an internal SCS issue than something the modeling framework should know about. Could you change the operations involving PSD variables in SCS to use the scaling, so that cvxpy doesn't need to do the scaling? Or could you rescale the data in the Python interface?

Another solution would be to have a higher level interface like mathprogbase that handles all the details of going from cvxpy's idea of a cone program to what SCS (or any other cone solver) expects.

mlubin avatar mlubin commented on June 15, 2024

Agreed that from the modeling perspective, I'd consider the rescaling as an internal solver detail.

madeleineudell avatar madeleineudell commented on June 15, 2024

Thanks @bodono. I'm inclined to think that scaling is an internal solver detail, and not something I should handle at the Convex.jl layer. Probably it should live in SCS.jl.

But I'd encourage you to think of the lower triangular encoding as being just an encoding: the matrices really are in S^n. When you say you're taking inner products, I assume (as a modeler) you're taking the standard inner product in S^n, which does not require me to scale the off diagonal entries by sqrt(2). If you buy that the input format is just an input format, and not the "real" representation of the problem, I'd say the scaling should live in SCS proper.

bodono avatar bodono commented on June 15, 2024

I would argue that it is actually a modeling layer issue, since all data preparation is a modeling layer issue, and this is just another step in data preparation.

The contract that SCS is agreeing to is that it will find a point that satisfies the KKT conditions:

Ax + s == b
A'*y + c == 0
s'*y == 0
s \in K, y \in K^*

where the inner products are the standard ones in R^n, no special cases. The cone S^n cannot be represented by simply stuffing the lower triangular entries into the vector unscaled, because then the standard inner product and the projection method will break. So, we define the cone to be the set of vectors that, when scaled and stuffed into the upper triangle, form a matrix that is PSD. That means that in many cases the user will want to scale the input data and the solution, but that's not something SCS cares about, since it has found a point that satisfies the KKT conditions, as it agreed to.
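Concretely, that definition of the cone amounts to a membership test like the following numpy sketch (the function name is illustrative):

```python
import numpy as np

def in_scaled_sd_cone(v, n, tol=1e-9):
    """True if v (length n*(n+1)/2) lies in the vectorized SD cone as
    SCS defines it: unscale the off-diagonals by 1/sqrt(2), stuff the
    entries into a symmetric matrix, and check that it is PSD."""
    rows, cols = np.tril_indices(n)
    scale = np.where(rows == cols, 1.0, 1.0 / np.sqrt(2.0))
    M = np.zeros((n, n))
    M[rows, cols] = scale * v
    M = M + M.T - np.diag(np.diag(M))
    return np.linalg.eigvalsh(M).min() >= -tol

# the vectorization of the 2x2 identity is in the cone;
# flipping a diagonal sign takes it out of the cone
assert in_scaled_sd_cone(np.array([1.0, 0.0, 1.0]), 2)
assert not in_scaled_sd_cone(np.array([1.0, 0.0, -1.0]), 2)
```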

The example at the top of the page was passed in and SCS solved it; it's entirely a modeling issue whether or not that represents the SDP that the user wants solved. Scaling the data so that it corresponds to a real-world problem is no different than converting some atom into a convex-cone-representable problem. For example, we could say that the cone {t, x | ||x||_1 <= t} is the true representation of the problem, and is a convex cone, so a cone solver should handle it. In reality none of the major cone solvers have as a contract that they will solve a problem with that cone; the user has to perform more operations to convert that cone problem into another cone problem that the solver has agreed to solve. If the transformation is done wrongly, then that is a separate issue. Some solvers will agree to do the scaling internally, but not all do (SDPT3 demands that the user scales beforehand, like SCS), and SCS won't because it would break some of the separations it currently has. SCS is designed to use matrix multiplies and cone projections as its interfaces, so doing the scaling internally would ruin the separation we have there now. Same with things like pre-solving, which is also just another step of data preparation: SCS won't do that because it breaks the separation.

What you're really saying is that it would be nice if SCS handled the scaling. But it would also be nice if SCS automatically converted the cone {t, x | ||x||_1 <= t} into other cones it can handle, after all that's just a solver detail since the true representation of the problem is the l1 cone. But both of those would be just taking some of the modeling responsibility away from the modeling layer and into the solver.

SteveDiamond avatar SteveDiamond commented on June 15, 2024

Ok, if it makes the most sense for the SCS math/implementation for the PSD cone to be defined as the set of vectors that, when scaled and stuffed into the upper triangle, form a matrix that is PSD, then I completely agree. That had just seemed like an odd definition of the cone.

bodono avatar bodono commented on June 15, 2024

Yeah, it is an odd definition, but unfortunately it's the only one that works in the space R^(n*(n+1)/2). Originally SCS defined the cone in R^(n*n), which is more natural, but that ended up with problems where the solution was 'wrong' relative to what the user was expecting (#31). To fix that we went with the other definition of the cone, but SCS won't be able to do the rescaling internally like other solvers do, because it was designed from the start as a research tool that allows the user to plug in any cone projection and any matrix multiply routine (including ones that don't have access to the matrix directly) and just work; rescaling the data internally would break that. It could be done in the wrappers around the C code (those are part of the modeling layer in my mind), I just haven't done that yet. The issue with that is that the C API and the other language APIs would then differ, which would be a bit strange.
