Comments (8)

emanuele avatar emanuele commented on September 15, 2024

Conversely, if the rectangular cost matrix has more rows than columns, e.g. n_rows=20, n_cols=10, then lapmod() does not segfault but the result is puzzling: the optimized final cost is "infinite" and the assignment is wrong and does not match that of lapjv(), which works correctly in this case too.


gatagat avatar gatagat commented on September 15, 2024

Yes, you are completely right: lapjv can deal with non-square matrices, but lapmod cannot.

This has already been brought up once, see #4 (comment).


gatagat avatar gatagat commented on September 15, 2024

What would be ideal is to make lapmod(..., extend_cost=True) analogous to lapjv(..., extend_cost=True), as that is what everybody expects.

I'd be happy to include a PR that adds this.
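For reference, here is a minimal sketch of the standard pad-to-square trick that such an extend_cost option automates; it illustrates the generic idea only, is not necessarily how lap implements it internally, and the helper name solve_rectangular is made up for this sketch:

import numpy as np
from lap import lapjv

def solve_rectangular(cost):
    # Sketch only: pad the rectangular cost matrix with dummy rows/columns
    # of constant (here zero) cost, solve the square problem, then treat
    # rows matched to dummy columns as unassigned.
    n_rows, n_cols = cost.shape
    n = max(n_rows, n_cols)
    padded = np.zeros((n, n), dtype=np.float64)
    padded[:n_rows, :n_cols] = cost
    total, x, y = lapjv(padded)
    x = x[:n_rows].copy()
    x[x >= n_cols] = -1  # matched to a dummy column -> unassigned
    # total equals the rectangular optimum, since every dummy entry costs zero
    return total, x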


emanuele avatar emanuele commented on September 15, 2024

First of all, thank you for the prompt answer and the code. I tried to use the code you referenced, but it does not seem to work because the results of lapjv and lapmod differ:

import numpy as np
from lap import lapjv, lapmod
from lapmod import prepare_sparse_cost
from scipy.sparse import coo_matrix


def lapmod_rectangular(cost_coo, cost_limit=1e3):
    # Unpack the COO triplets and extend the rectangular problem to a
    # square one via prepare_sparse_cost.
    cc = cost_coo.data
    ii = cost_coo.row
    jj = cost_coo.col
    shape = cost_coo.shape
    cc, ii, kk = prepare_sparse_cost(shape, cc, ii, jj, cost_limit)
    ind1, ind0 = lapmod(len(ii) - 1, cc, ii, kk, return_cost=False)
    # Map assignments that fall into the extended part back to "unassigned"
    # (-1) and crop the result to the original shape.
    ind1[ind1 >= shape[1]] = -1
    ind0[ind0 >= shape[0]] = -1
    ind1 = ind1[:shape[0]]
    ind0 = ind0[:shape[1]]
    return ind1, ind0


if __name__ == '__main__':
    np.random.seed(0)
    n_rows = 5
    n_cols = 10
    min_value = 0
    max_value = 10
    cost = np.random.randint(low=min_value, high=max_value, size=(n_rows, n_cols))
    print("cost matrix:")
    print(cost)
    cost_opt, x, y = lapjv(cost, extend_cost=True, return_cost=True)
    print("lapjv cost_opt: %s" % cost_opt)
    print("lapjv x: %s" % (x,))
    print("lapjv y: %s" % (y,))
    cost_coo = coo_matrix(cost)
    cost_limit = 1e3
    ind1, ind0 = lapmod_rectangular(cost_coo, cost_limit)
    print("lapmod ind1: %s" % (ind1,))
    print("lapmod ind0: %s" % (ind0,))
    print("lapmod cost_opt: %s" % cost[range(n_rows), ind1].sum())

### execution ###
cost matrix:
[[5 0 3 3 7 9 3 5 2 4]
 [7 6 8 8 1 6 7 7 8 1]
 [5 9 8 9 4 3 0 3 5 0]
 [2 3 8 1 3 3 3 7 0 1]
 [9 9 0 4 7 3 2 7 2 0]]
lapjv cost_opt: 1.0
lapjv x: [1 4 9 8 2]
lapjv y: [-1  0  4 -1  1 -1 -1 -1  3  2]
lapmod ind1: [8 9 7 3 6]
lapmod ind0: [-1 -1 -1  3 -1 -1  4  2  0  1]
lapmod cost_opt: 9

Am I doing something wrong?

I'm targeting (very) large (very) sparse (slightly) rectangular cost matrices, around 1Mx1M. I'm also having a look at the auction algorithm to understand whether it could address my problems or not. Would you suggest it in such a setting? Unfortunately, I cannot find a nice Python implementation of it, like your lapmod().


gatagat avatar gatagat commented on September 15, 2024

Thanks for the test case, that helps a lot.

From a quick look it really seems there is some issue with the way the matrix gets extended in prepare_sparse_cost. I'll try to find out what's happening and let you know.

Concerning the auction algorithm, I am not aware of a nice Python implementation.


gatagat avatar gatagat commented on September 15, 2024

OK, the way the cost is extended in lapjv and the proposed extension for lapmod above differ from each other, which could in principle cause some issues, and I am looking into it, but the issue here is elsewhere.

Your usage of scipy.sparse.coo_matrix assumes that the zero values in your cost matrix are the ones that can be left out, which is wrong. All entries "missing" from the sparse cost matrix are assumed to be infinite (not zero). So in the test case this effectively excludes all originally zero entries from the lapmod solution.

If you modify the original cost as follows, you will get the cost matrix that lapmod is actually using. For this modified cost matrix, lapjv also outputs an optimal solution with an optimum of 9.

cost[cost == 0] = cost_limit
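As an aside, one way to avoid the pitfall is to build the triplets explicitly instead of converting the dense matrix with coo_matrix. A small sketch (the isfinite mask is just an assumed placeholder for whatever entries you consider assignable):

import numpy as np

valid = np.isfinite(cost)              # your own mask of allowed assignments
ii, jj = np.nonzero(valid)             # row/column indices of every allowed entry
cc = cost[ii, jj].astype(np.float64)   # their costs, zeros included
# entries *not* listed in (cc, ii, jj) are the ones lapmod treats as infinite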

Does this make sense? Any ideas on how to clarify this in the docs?


emanuele avatar emanuele commented on September 15, 2024

Yes! Sorry for the rookie mistake! Indeed, avoiding zeros in the cost matrix solves the problem, even just by setting min_value = 1:

cost matrix:
[[6 1 4 4 8 4 6 3 5 8]
 [7 9 9 2 7 8 8 9 2 6]
 [9 5 4 1 4 6 1 3 4 9]
 [2 4 4 4 8 1 2 1 5 8]
 [4 3 8 3 1 1 5 6 6 7]]
lapjv cost_opt: 6.0
lapjv x: [1 8 6 7 4]
lapjv y: [-1  0 -1 -1  4 -1  2  3  1 -1]
lapmod ind1: [1 8 6 5 4]
lapmod ind0: [-1  0 -1 -1  4  3  2 -1  1 -1]
lapmod cost_opt: 6

(the two solutions of lapjv and lapmod are slightly different but equivalent in terms of cost).

About how to describe this issue in the docs: clearly, coo_matrix(cost) is not a straightforward way to obtain cc, ii, jj, because of its handling of zeros as special values, i.e. as missing data, which lapmod() will instead treat as infinite, as you pointed out. Maybe add WARNING: scipy.sparse.coo_matrix(cost) will not return the correct (cc, ii, jj) if cost has some zero values. to the docstring of prepare_sparse_cost()? Alternatively, scipy.sparse.coo_matrix() should trigger a warning every time it is instantiated from a dense matrix that contains zero values. After all, those values silently disappear, and this behavior is not even mentioned in the documentation(!).
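A tiny demonstration of the dropping behavior (values chosen only for illustration):

import numpy as np
from scipy.sparse import coo_matrix

cost = np.array([[5, 0],
                 [0, 1]])
m = coo_matrix(cost)
print(m.nnz)   # 2, not 4: the explicit zero-cost entries are silently dropped
print(m.data)  # [5 1]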


gatagat avatar gatagat commented on September 15, 2024

Thanks for the input, I've updated the docs of prepare_sparse_cost.

