Comments (11)
I just reverted the commit that I think is likely causing the issue. Did that fix it?
from msieve_nfsathome.
That failed. The memory usage started going up again after multiplying 2425308 relations
from msieve_nfsathome.
➜ /home/nyancat/Tools/msieve_nfsathome git:(75ff92a) ✗ git bisect bad
75ff92a5c72f572c03ae3b7e201e2b92ec121837 is the first bad commit
commit 75ff92a5c72f572c03ae3b7e201e2b92ec121837
Author: Greg Childers <[email protected]>
Date: Sun Aug 14 12:52:27 2022 -0500
Add OpenMP support for sqrt
Makefile | 4 ++++
common/driver.c | 10 ++++++++++
gnfs/relation.c | 4 +++-
gnfs/sqrt/sqrt.c | 1 +
gnfs/sqrt/sqrt_a.c | 24 +++++++++++++++++++-----
include/msieve.h | 4 ++++
6 files changed, 41 insertions(+), 6 deletions(-)
Seems like the leak goes all the way back to the day sqrt became multithreaded
I'll mess with the code a bit to see what might be causing the leak
(Everything below is tested on commit 75ff92a)
OMP=0: Good
OMP=1 with every #pragma in sqrt_a.c removed: Good
else {
/* recursively compute the product of the first
half and the last half of the relations */
uint32 mid = (index1 + index2) / 2;
#pragma omp parallel
#pragma omp single
{
#pragma omp task
multiply_relations(prodinfo, index1, mid, &prod1);
#pragma omp task
multiply_relations(prodinfo, mid + 1, index2, &prod2);
#pragma omp taskwait
}
}
Simply having these #pragma causes the leak
from msieve_nfsathome.
GMP calls do reallocations as needed internally. Looks like there's still an issue with malloc and OpenMP as described in this thread from over a decade ago.
https://gmplib.org/list-archives/gmp-discuss/2009-July/003852.html
If you are so inclined, you could try using Hoard to see if it resolves the issue.
On the other hand, the real issue is that it took 11 dependencies to find the factors. That should (almost) never happen and is surely a bug. I'll look at this more tomorrow, but you can try redoing both LA and sqrt with the latest commit to see if it still requires so many dependencies.
from msieve_nfsathome.
Thanks for the info. I'll rerun everything starting from filtering on latest and see how that goes
from msieve_nfsathome.
Ran Valgrind on latest. Seems like there are two main kinds of leaks:
1.
==21437== 84,948,160 bytes in 482,660 blocks are possibly lost in loss record 264 of 264
==21437== at 0x4838751: malloc (vg_replace_malloc.c:307)
==21437== by 0x63557C7: ___kmp_allocate (in /usr/lib64/libomp.so)
==21437== by 0x63DD825: __ompt_lw_taskteam_link(ompt_lw_taskteam_s*, kmp_info*, int, bool) (in /usr/lib64/libomp.so)
==21437== by 0x636DF07: __kmp_fork_call (in /usr/lib64/libomp.so)
==21437== by 0x63C8DD3: __kmp_GOMP_fork_call(ident*, int, unsigned int, unsigned int, void (*)(void*), void (*)(int*, int*, ...), int, ...) (in /usr/lib64/libomp.so)
==21437== by 0x63CE089: GOMP_parallel@@VERSION (in /usr/lib64/libomp.so)
==21437== by 0x430C84: mpz_poly_mul.constprop.0 (sqrt_a.c:170)
==21437== by 0x4314C0: multiply_relations (sqrt_a.c:356)
==21437== by 0x638F946: __kmp_invoke_task(int, kmp_task*, kmp_taskdata*) (in /usr/lib64/libomp.so)
==21437== by 0x638FB95: __kmpc_omp_task (in /usr/lib64/libomp.so)
==21437== by 0x63CDA94: GOMP_task@@VERSION (in /usr/lib64/libomp.so)
==21437== by 0x4303C9: multiply_relations._omp_fn.0 (sqrt_a.c:347)
==21437== 30,783,104 bytes in 174,904 blocks are possibly lost in loss record 257 of 264
==21437== at 0x4838751: malloc (vg_replace_malloc.c:307)
==21437== by 0x63557C7: ___kmp_allocate (in /usr/lib64/libomp.so)
==21437== by 0x63DD825: __ompt_lw_taskteam_link(ompt_lw_taskteam_s*, kmp_info*, int, bool) (in /usr/lib64/libomp.so)
==21437== by 0x636DF07: __kmp_fork_call (in /usr/lib64/libomp.so)
==21437== by 0x63C8DD3: __kmp_GOMP_fork_call(ident*, int, unsigned int, unsigned int, void (*)(void*), void (*)(int*, int*, ...), int, ...) (in /usr/lib64/libomp.so)
==21437== by 0x63CE089: GOMP_parallel@@VERSION (in /usr/lib64/libomp.so)
==21437== by 0x4314A7: multiply_relations (sqrt_a.c:344)
==21437== by 0x638F946: __kmp_invoke_task(int, kmp_task*, kmp_taskdata*) (in /usr/lib64/libomp.so)
==21437== by 0x638FB95: __kmpc_omp_task (in /usr/lib64/libomp.so)
==21437== by 0x63CDA94: GOMP_task@@VERSION (in /usr/lib64/libomp.so)
==21437== by 0x4303C9: multiply_relations._omp_fn.0 (sqrt_a.c:347)
==21437== by 0x63CE0AD: GOMP_parallel@@VERSION (in /usr/lib64/libomp.so)
I'll be honest, that did not provide a clue on where the leak came from...
from msieve_nfsathome.
LD_PRELOAD=/home/nyancat/bin/libhoard.so msieve -nc3 -i N.txt -v -t 6
didn't help. In fact, the memory usage became 1.5x, which is odd
I'll try linking it in during compilation after LA finishes
That didn't work either. Looks like Hoard can't save me this time.
I've had a way better luck on the new run and had it finished on dependency 1.
This led to an...interesting discovery. It took 65 seconds on try_openmp and 80 seconds on master. Multithread only won by 20%, which makes me wonder whether we should just remove OMP from Square Root.
Okay...what? Removing the #pragma actually speed it up by another 10%? That's new.
from msieve_nfsathome.
I reran your postprocessing twice and another larger number once and got the factors on the first or second dependency each time, so reverting that commit fixed the corruption that was causing the need for many dependencies. Now I just have to figure out what is wrong with that commit...
from msieve_nfsathome.
Even with the cycle issue fixed and assume that each dependency has a 2/3 chance of yielding a non-trivial factor, there is still a 1/3^5 =1/243 chance that it would exhaust 16 GB of memory.
I guess a compromise could be to add an build/runtime option to enable/disable OpenMP specifically during Square Root, otherwise I risk triggering a OoM every time I run square root.
from msieve_nfsathome.
Another option would be to run msieve separately for each dependency until the number is fully factored.
./msieve -nc3 1,1 -t 4
if number isn't fully factored,
./msieve -nc3 2,2 -t 4
and so on until the number is fully factored.
from msieve_nfsathome.
- The memory usage, just per dependency, is 16x compared to master. It is possible that the leak on larger numbers would exhaust the RAM before even one dependency can finish
- That'll be tough to integrate with YAFU, as it uses the msieve library (nfs_find_factors)
from msieve_nfsathome.
Related Issues (3)
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from msieve_nfsathome.