Comments (12)
a / N + b / N = (a + b) / N
The two forms are mathematically equivalent. If N is very large, it is indeed better to divide once outside the loop, which also means fewer divisions.
Okay, I misunderstood.
from proxsuite.
Could you share your setup and a minimal example?
How do you set up the vectorization? Did you use -march=native when compiling?
Did you try the Python interface with conda?
I only used the C++ interface. I didn't try the Python interface.
I first installed libsimde-dev_0.7.2-4_all.deb and then installed ProxSuite following the documentation at https://simple-robotics.github.io/proxsuite/md_doc_5_installation.html:
1. mkdir build && cd build
2. cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF
3. make
4. make install
I did not build the Python interface.
All my QP problems are quite large, and the data can't be exported directly to a file. Can you provide a C++ example of a QP problem that shows the difference before and after enabling vectorization? I can then try it on my computer.
We have limited time to handle this issue. It would simplify our work if you could generate a random QP in C++ that matches your specifications. Could you please do that?
Please try the following code:
#include <chrono>
#include <iostream>
#include <proxsuite/proxqp/dense/dense.hpp>
#include <proxsuite/proxqp/utils/random_qp_problems.hpp> // used for generating a random convex qp
using namespace proxsuite::proxqp;
using T = double;
int main() {
  // Wall-clock time of the whole program, including problem generation.
  auto time_begin = std::chrono::system_clock::now();
  for (T sparsity_factor = 0.1; sparsity_factor < 0.5; sparsity_factor += 0.1) {
    for (dense::isize dim = 100; dim < 200; dim += 20) {
      dense::isize n_eq(dim * 3);
      dense::isize n_in(dim * 3);
      T strong_convexity_factor(1.e-2);
      dense::Model<T> qp_random = utils::dense_strongly_convex_qp(
        dim, n_eq, n_in, sparsity_factor, strong_convexity_factor);
      dense::QP<T> qp(dim, n_eq, n_in);
      qp.settings.max_iter = 10000;
      qp.settings.max_iter_in = 1000;
      qp.settings.eps_abs = 1e-5;
      qp.settings.eps_rel = 0;
      qp.init(qp_random.H, qp_random.g, qp_random.A, qp_random.b, qp_random.C,
              qp_random.u, qp_random.l);
      qp.solve();
    }
    std::cout << "sparsity_factor: " << sparsity_factor << std::endl;
  }
  auto time_end = std::chrono::system_clock::now();
  std::cout << "Time consumption(s): "
            << std::chrono::duration_cast<std::chrono::milliseconds>(time_end -
                                                                     time_begin)
                   .count() /
                 1000.0
            << std::endl;
  return 0;
}
Whether or not the option "-DBUILD_WITH_VECTORIZATION_SUPPORT=OFF" is passed when compiling ProxSuite, this program takes about 8.5 s.
My CPU is an Intel i7-12700K, and the OS is Ubuntu 20.04.
@RobustControl The way you time the loops is not correct.
You should only time the qp.solve() call.
Also, how do you compile this small example? Could you share your command lines or CMake file?
Try the following code:
#include <iostream>
#include <proxsuite/proxqp/dense/dense.hpp>
#include <proxsuite/proxqp/sparse/sparse.hpp> // get the sparse API of ProxQP
#include <proxsuite/proxqp/utils/random_qp_problems.hpp> // used for generating a random convex qp
using namespace proxsuite::proxqp;
using T = double;
int main() {
  // Accumulated times reported by the solver (in microseconds).
  double solve_time = 0.0;
  double setup_time = 0.0;
  // Dense backend
  for (T sparsity_factor = 0.1; sparsity_factor < 0.5; sparsity_factor += 0.1) {
    for (dense::isize dim = 100; dim < 200; dim += 20) {
      dense::isize n_eq(dim * 3);
      dense::isize n_in(dim * 3);
      T strong_convexity_factor(1.e-2);
      dense::Model<T> qp_random = utils::dense_strongly_convex_qp(
        dim, n_eq, n_in, sparsity_factor, strong_convexity_factor);
      dense::QP<T> qp(dim, n_eq, n_in);
      qp.settings.max_iter = 10000;
      qp.settings.max_iter_in = 1000;
      qp.settings.eps_abs = 1e-5;
      qp.settings.eps_rel = 0;
      qp.init(qp_random.H, qp_random.g, qp_random.A, qp_random.b, qp_random.C,
              qp_random.u, qp_random.l);
      qp.solve();
      solve_time += qp.results.info.solve_time;
      setup_time += qp.results.info.setup_time;
    }
    std::cout << "sparsity_factor: " << sparsity_factor << std::endl;
  }
  std::cout << "Setup Time consumption(dense): " << setup_time / 1e6 << "s"
            << std::endl
            << "Solve Time consumption(dense): " << solve_time / 1e6 << "s"
            << std::endl;
  // Sparse backend
  solve_time = 0.0;
  setup_time = 0.0;
  for (T sparsity_factor = 0.1; sparsity_factor < 0.5; sparsity_factor += 0.1) {
    for (dense::isize dim = 100; dim < 200; dim += 20) {
      isize n_eq(dim * 3);
      isize n_in(dim * 3);
      T sparsity_factor = 0.15; // level of sparsity
      T conditioning = 10.0;    // conditioning level for H
      auto H = ::proxsuite::proxqp::utils::rand::sparse_positive_definite_rand(
        dim, conditioning, sparsity_factor);
      auto A = ::proxsuite::proxqp::utils::rand::sparse_matrix_rand<T>(
        n_eq, dim, sparsity_factor);
      auto C = ::proxsuite::proxqp::utils::rand::sparse_matrix_rand<T>(
        n_in, dim, sparsity_factor);
      auto g = ::proxsuite::proxqp::utils::rand::vector_rand<T>(dim);
      auto x_sol = ::proxsuite::proxqp::utils::rand::vector_rand<T>(dim);
      auto b = A * x_sol;
      auto l = C * x_sol;
      auto u = (l.array() + 10).matrix().eval();
      proxsuite::proxqp::sparse::QP<T, isize> qp(H.cast<bool>(), A.cast<bool>(),
                                                 C.cast<bool>());
      qp.init(H, g, A, b, C, u, l);
      qp.solve();
      solve_time += qp.results.info.solve_time;
      setup_time += qp.results.info.setup_time;
    }
    std::cout << "sparsity_factor: " << sparsity_factor << std::endl;
  }
  std::cout << "Setup Time consumption(sparse): " << setup_time / 1e6 << "s"
            << std::endl
            << "Solve Time consumption(sparse): " << solve_time / 1e6 << "s"
            << std::endl;
  return 0;
}
I modified the proxsuite/examples/cpp/overview-simple.cpp file and compiled it with the command from the official documentation: "g++ -std=c++17 examples/cpp/overview-simple.cpp -o overview-simple $(pkg-config --cflags proxsuite)".
Here is a typical result (whether or not vectorization is enabled when installing ProxSuite)
You are missing the -O3 -march=native flags, which enable vectorization for your CPU configuration.
@fabinsch will provide more details soon.
Hey @RobustControl, thanks for providing this benchmark file. I suggest adding another loop that runs init and solve N times, so that the measurement is not dominated by latency. I also modified the file to be compatible with the newest version of proxsuite (v0.2.0) and to take the sparsity into account in the sparse case (you were overwriting sparsity_factor on each iteration).
The timings are the following on my Intel i7-11850H with Ubuntu 20.04:
sparsity_factor: 0.1
Setup Time consumption(dense): 0.00246659s
Solve Time consumption(dense): 0.00977696s
sparsity_factor: 0.2
Setup Time consumption(dense): 0.00414675s
Solve Time consumption(dense): 0.0160001s
sparsity_factor: 0.3
Setup Time consumption(dense): 0.00586993s
Solve Time consumption(dense): 0.0223937s
sparsity_factor: 0.4
Setup Time consumption(dense): 0.00758707s
Solve Time consumption(dense): 0.0289477s
sparsity_factor: 0.1
Setup Time consumption(sparse): 0.000428303s
Solve Time consumption(sparse): 0.169426s
sparsity_factor: 0.2
Setup Time consumption(sparse): 0.00111007s
Solve Time consumption(sparse): 0.33179s
sparsity_factor: 0.3
Setup Time consumption(sparse): 0.00204471s
Solve Time consumption(sparse): 0.484549s
sparsity_factor: 0.4
Setup Time consumption(sparse): 0.00323929s
Solve Time consumption(sparse): 0.651238s
To compile the file, I used
g++ -O3 -march=native -DNDEBUG -DPROXSUITE_VECTORIZE -std=gnu++17 timings.cpp -o timings $(pkg-config --cflags proxsuite)
I will document this in our README file; thanks for pointing out that clear instructions were missing. Using only the -DPROXSUITE_VECTORIZE option is not enough: you also need to tell the compiler to use the corresponding instruction set for your CPU (see also here).
The file timings.cpp has the following content:
#include <iostream>
#include <proxsuite/proxqp/dense/dense.hpp>
#include <proxsuite/proxqp/sparse/sparse.hpp> // get the sparse API of ProxQP
#include <proxsuite/proxqp/utils/random_qp_problems.hpp> // used for generating a random convex qp
using namespace proxsuite::proxqp;
using T = double;
int main() {
  double N = 100;       // number of init + solve repetitions per problem
  double counter = 0.0; // number of problems in the outer loop
  double solve_time = 0.0;
  double setup_time = 0.0;
  // Dense backend
  for (T sparsity_factor = 0.1; sparsity_factor < 0.5; sparsity_factor += 0.1) {
    for (dense::isize dim = 100; dim < 200; dim += 20) {
      dense::isize n_eq(dim * 3);
      dense::isize n_in(dim * 3);
      T strong_convexity_factor(1.e-2);
      dense::Model<T> qp_random = utils::dense_strongly_convex_qp(
        dim, n_eq, n_in, sparsity_factor, strong_convexity_factor);
      for (int i = 0; i < N; i++) {
        dense::QP<T> qp(dim, n_eq, n_in);
        qp.settings.max_iter = 10000;
        qp.settings.max_iter_in = 1000;
        qp.settings.eps_abs = 1e-5;
        qp.settings.eps_rel = 0;
        qp.init(qp_random.H, qp_random.g, qp_random.A, qp_random.b, qp_random.C,
                qp_random.l, qp_random.u);
        qp.solve();
        solve_time += qp.results.info.solve_time / N;
        setup_time += qp.results.info.setup_time / N;
      }
      counter += 1.0;
    }
    std::cout << "sparsity_factor: " << sparsity_factor << std::endl;
    std::cout << "Setup Time consumption(dense): " << setup_time / (1e6 * counter) << "s"
              << std::endl
              << "Solve Time consumption(dense): " << solve_time / (1e6 * counter) << "s"
              << std::endl;
    counter = 0.0;
  }
  // Sparse backend
  solve_time = 0.0;
  setup_time = 0.0;
  for (T sparsity_factor = 0.1; sparsity_factor < 0.5; sparsity_factor += 0.1) {
    for (dense::isize dim = 100; dim < 200; dim += 20) {
      isize n_eq(dim * 3);
      isize n_in(dim * 3);
      T conditioning = 10.0; // conditioning level for H
      auto H = ::proxsuite::proxqp::utils::rand::sparse_positive_definite_rand(
        dim, conditioning, sparsity_factor);
      auto A = ::proxsuite::proxqp::utils::rand::sparse_matrix_rand<T>(
        n_eq, dim, sparsity_factor);
      auto C = ::proxsuite::proxqp::utils::rand::sparse_matrix_rand<T>(
        n_in, dim, sparsity_factor);
      auto g = ::proxsuite::proxqp::utils::rand::vector_rand<T>(dim);
      auto x_sol = ::proxsuite::proxqp::utils::rand::vector_rand<T>(dim);
      auto b = A * x_sol;
      auto l = C * x_sol;
      auto u = (l.array() + 10).matrix().eval();
      for (int i = 0; i < N; i++) {
        proxsuite::proxqp::sparse::QP<T, isize> qp(H.cast<bool>(), A.cast<bool>(),
                                                   C.cast<bool>());
        qp.settings.max_iter = 10000;
        qp.settings.max_iter_in = 1000;
        qp.settings.eps_abs = 1e-5;
        qp.settings.eps_rel = 0;
        qp.init(H, g, A, b, C, l, u);
        qp.solve();
        solve_time += qp.results.info.solve_time / N;
        setup_time += qp.results.info.setup_time / N;
      }
      counter += 1.0;
    }
    std::cout << "sparsity_factor: " << sparsity_factor << std::endl;
    std::cout << "Setup Time consumption(sparse): " << setup_time / (1e6 * counter) << "s"
              << std::endl
              << "Solve Time consumption(sparse): " << solve_time / (1e6 * counter) << "s"
              << std::endl;
    counter = 0.0;
  }
  return 0;
}
Note after discussing with @Bambade: your specific setup of having n_constraint = 3 * n_vars is not favorable for the current implementation of the sparse backend. If you have fewer constraints, like n_constraint = 0.1 * n_vars, using the sparse backend gets more interesting.
I will close this issue as it seems to be solved.
Hello @fabinsch, thank you for your detailed reply! I ran your example and got the same result.
But I have a question about your test code: why is the division by N placed inside the for loop? I think you meant to write the code as follows.
for (int i = 0; i < N; i++) {
  proxsuite::proxqp::sparse::QP<T, isize> qp(H.cast<bool>(), A.cast<bool>(),
                                             C.cast<bool>());
  qp.settings.max_iter = 10000;
  qp.settings.max_iter_in = 1000;
  qp.settings.eps_abs = 1e-5;
  qp.settings.eps_rel = 0;
  qp.init(H, g, A, b, C, l, u);
  qp.solve();
  solve_time += qp.results.info.solve_time;
  setup_time += qp.results.info.setup_time;
}
// Divide once after the loop to get the averages.
solve_time /= N;
setup_time /= N;
I will try to enable vectorization in our project later.
a / N + b / N = (a + b) / N
The two forms are mathematically equivalent. If N is very large, it is indeed better to divide once outside the loop, which also means fewer divisions.