Comments (5)

mcneale avatar mcneale commented on June 23, 2024

Thank you very much for the suggestion! There are several places where parallel code is already used in OpenMx. Of particular note, raw-data analyses essentially divide the rows into as many chunks as there are processors, calculate the likelihoods separately for each chunk, and then gather them at the end. This level of parallelism isn't as embarrassingly parallel as the CIs, but it does speed up both the initial fitting and the subsequent model fits used to find the CIs. It is also placed where the calculations are likely to be most demanding - much more so than, e.g., fitting to covariance matrices and means. I think parallel CIs would have to lose this feature so as not to parallelize things that are already parallelized deeper in the calculations. That might, however, still be faster depending on the number of CIs and the number of processors, and perhaps we could arrange for the lower-level parallelism to switch off when the higher-level parallelism is in use. Thanks again - we are interested in improving both the flexibility of the code and its performance.
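
To make the row-chunking idea concrete, here is a minimal, purely illustrative R sketch (plain R, not OpenMx's actual C++ backend): because the raw-data -2lnL is a sum of independent per-row contributions, the rows can be split into chunks, evaluated separately, and the partial sums gathered at the end. The data, the chunk count of 4, and the use of mvtnorm::dmvnorm() and parallel::mclapply() are all assumptions for illustration:

```r
library(parallel)
library(mvtnorm)   # for dmvnorm()

set.seed(1)
dat   <- matrix(rnorm(1000 * 3), ncol = 3)   # hypothetical raw data, 1000 rows x 3 variables
mu    <- colMeans(dat)                       # candidate mean vector
sigma <- cov(dat)                            # candidate covariance matrix

# Split the row indices into 4 chunks and evaluate each chunk's contribution
# to -2lnL independently (mc.cores > 1 requires a Unix-alike OS).
rows    <- seq_len(nrow(dat))
chunks  <- split(rows, cut(rows, 4, labels = FALSE))
partial <- mclapply(chunks, function(idx) {
  -2 * sum(dmvnorm(dat[idx, , drop = FALSE], mean = mu, sigma = sigma, log = TRUE))
}, mc.cores = 4)

minus2lnL <- Reduce(`+`, partial)            # gather the per-chunk results
```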

mronkko avatar mronkko commented on June 23, 2024

That makes a lot of sense. In my use case, we are analyzing covariance matrices and there was no indication that more than one core was used in the computation. Parallelization of CI calculation would be really useful because we could move the calculation to a server with 128 cores.

Depending on how parallelization is implemented in OpenMx, you might not need to change the existing code that much, because the parallel computing framework might take care of the potential problems that arise when parallelized code is called from code that is already running in parallel.

RMKirkpatrick avatar RMKirkpatrick commented on June 23, 2024

The relevant code is here

That's frontend R code. The code actually relevant to parallel computing is going to be backend C++ code.

You are correct, though, that parallelizing confidence limits (or confidence intervals, when using the Wu-Neale adjustment) would be a better use of multithreading in most cases, due to the coarser level of granularity.

In my use case, we are analyzing covariance matrices and there was no indication that more than one core was used in the computation.

Two questions... First, were you running OpenMx under Windows, or a CRAN build of OpenMx under macOS? Both of those cases lack multithreading support. Second, which optimizer were you using? SLSQP is supposed to know how to divide its computation of the gradient elements among multiple threads.
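
Both of those can be checked from R; a quick sketch (I believe these are the relevant calls in current OpenMx releases, with mxVersion() reporting whether the build is OpenMP-enabled):

```r
library(OpenMx)

mxVersion()                           # prints version/build info, including whether
                                      # the build was compiled with OpenMP support
mxOption(NULL, "Default optimizer")   # which optimizer is currently the default (e.g. "SLSQP")
omxDetectCores()                      # how many cores OpenMx detects
```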

mronkko avatar mronkko commented on June 23, 2024

A bit of background: I have not really used OpenMx myself for years, but this was a question from a student. He is running OpenMx indirectly through the metaSEM package. When we had a meeting today, he asked why the CI calculation was so slow, and we took a look at the code to figure it out.

Now answering the questions:

First, were you running OpenMx under Windows, or a CRAN build of OpenMx under macOS? Both of those cases lack multithreading support.

The student used Windows, and I use the CRAN version on macOS. The 128-core server runs Linux.

Second, which optimizer were you using? SLSQP is supposed to know how to divide its computation of the gradient elements among multiple threads.

We are using whatever is the default. I believe this to be SLSQP.

RMKirkpatrick avatar RMKirkpatrick commented on June 23, 2024

A bit of background: I have not really used OpenMx myself for years, but this was a question from a student. He is running OpenMx indirectly through the metaSEM package. When we had a meeting today, he asked why the CI calculation was so slow, and we took a look at the code to figure it out.

Note that OpenMx is carrying out (at least) two numerical optimizations for every confidence interval requested. So, suppose you request a confidence interval for one parameter. OpenMx would then, at minimum, do three numerical searches at runtime: one to find the maximum-likelihood estimate, one to find the lower limit of the confidence interval, and one to find the upper limit of the confidence interval.
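
As a rough illustration of how this scales (a minimal sketch; the parameter count of 10 is just an example, and the formula is a lower bound per the "at minimum" above):

```r
# Lower bound on the number of numerical searches when a profile-likelihood CI
# is requested on each of k free parameters:
#   1 search for the ML estimates + 2 searches (lower and upper limit) per CI.
k <- 10                  # e.g., CIs requested on 10 parameters
searches <- 1 + 2 * k    # at least 21 numerical optimizations in total
```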

The student used Windows, and I use the CRAN version on macOS. The 128-core server runs Linux.

OK, then neither you nor the student had a build of OpenMx compiled for parallel computing. However, the Linux-powered server probably is running a multithreaded OpenMx build.

We are using whatever is the default. I believe this to be SLSQP.

SLSQP is the on-load default, and is able to parallelize its computation of the objective function's gradient elements.
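
On a build that does support multithreading, the thread count used by this gradient-level parallelism can be adjusted from R; a minimal sketch (the value 8 is only an example, and "Number of Threads" is the mxOption key I believe current OpenMx uses):

```r
library(OpenMx)

# Set the number of threads OpenMx may use for subsequent mxRun() calls
mxOption(NULL, "Number of Threads", 8)

# Or tie it to the detected core count, leaving one core free
mxOption(NULL, "Number of Threads", omxDetectCores() - 1)
```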

Again, I agree that parallelizing over confidence intervals or confidence limits, rather than over gradient elements (or over subsets of the dataset, in the raw-data case), is a better use of parallel computing. We would like to implement it sometime in the future, but it is not a high priority at present.
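
Until that lands in the backend, one possible user-level workaround is to parallelize over CIs from R itself, requesting one interval per worker process and refitting. A minimal sketch under several assumptions: a Unix-alike OS (parallel::mclapply() with mc.cores > 1 does not fork on Windows), an already-fitted MxModel named fit, and that summary()$CI holds the interval table:

```r
library(OpenMx)
library(parallel)

# Request and compute one confidence interval in its own forked R process.
ci_one <- function(par) {
  mxOption(NULL, "Number of Threads", 1)           # avoid nested parallelism in the worker
  m   <- mxModel(fit, mxCI(par))                   # add a single CI request
  run <- mxRun(m, intervals = TRUE, silent = TRUE)
  summary(run)$CI                                  # assumed to contain the CI table
}

pars    <- names(omxGetParameters(fit))            # free parameters to profile
ci_list <- mclapply(pars, ci_one, mc.cores = 4)
ci_all  <- do.call(rbind, ci_list)
```

Each worker re-optimizes from the fitted starting values, so this trades some redundant work for coarse-grained parallelism over the intervals themselves.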
