Comments (5)
Hi @raf-antwerpen,
Thanks for posting this bug report. Could you share your existing PySRRegressor settings, along with how you installed PySR, how you are running it (IPython, Jupyter, or a script), and any info about CPU/memory usage during the fit? Also, is it completely frozen, or is the hall of fame csv file still being updated?
Cheers,
Miles
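One way to answer the "is the hall of fame csv still being updated?" question is to look at the file's modification time. The following is a minimal sketch using only the standard library; the helper names and the 120-second threshold are assumptions chosen for illustration, and the path to the file depends on your PySR settings and version.

```python
import os
import time


def seconds_since_update(path: str) -> float:
    """Return how many seconds ago the file at `path` was last modified."""
    return time.time() - os.path.getmtime(path)


def looks_stalled(path: str, threshold_s: float = 120.0) -> bool:
    """Heuristic: treat the search as stalled if the file is older than the threshold."""
    return seconds_since_update(path) > threshold_s
```

Calling `looks_stalled(...)` on the hall of fame file from a separate terminal while the fit is running gives a rough indication of whether the search is still making progress.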
from pysr.
If you are running it in a Jupyter notebook, could you also try running it from the command line (i.e., save it in a .py file and execute it)? Sometimes Jupyter messes with parallelism, so it would be good to have this info.
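When reporting which environment a run came from, a small check like the one below can help. It is a best-effort heuristic, not an official API: it relies on IPython injecting `get_ipython` into the namespace and on the shell class name that Jupyter kernels report. The function name `running_in_jupyter` is made up for this sketch.

```python
def running_in_jupyter() -> bool:
    """Best-effort check for a Jupyter kernel (heuristic, not an official API)."""
    try:
        # `get_ipython` is injected by IPython; it is undefined in a plain
        # `python script.py` run, which raises NameError here.
        shell = get_ipython().__class__.__name__  # type: ignore[name-defined]
    except NameError:
        return False
    # Jupyter kernels use ZMQInteractiveShell; a terminal IPython session
    # reports TerminalInteractiveShell instead.
    return shell == "ZMQInteractiveShell"


if __name__ == "__main__":
    print("Jupyter kernel" if running_in_jupyter() else "script or terminal session")
```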
Hi,
Thank you for your quick response.
Executing the script as a .py file solved the issue!
But for documentation and reporting purposes, here is some info on the bug in the Jupyter notebook.
I installed PySR using mamba and run it in a Jupyter notebook with Python 3.9.16. I tried many configurations of PySRRegressor settings, but the bug occurs even with the most minimal settings:
model = PySRRegressor(
    populations=80,
    niterations=80,
)
I am using datasets of size X=(7000,2) and y=(7000,).
During the first couple of minutes of running PySRRegressor, CPU usage ranges erratically between 100% and 400%. After a couple of minutes, this goes down to 1-1.5%. This is also when the hall of fame csv stops updating. When running the script as a .py file, the CPU usage is much more consistent, between 500% and 600%, instead of the erratic pattern I see in the notebook.
Cheers,
Raf
Thanks for this info. Yeah, Jupyter has some weird interactions with multiprocessing, even in standard Python libraries like PyTorch, but it could be something else. It's good you found a workaround, but I'll leave this open as the original issue is still there.
I wonder if it could also be due to how text streams work differently on Jupyter. Maybe PyJulia is trying to write to stdout or read stdin, but Jupyter isn't letting it or something, and so it's just stuck waiting to write. Perhaps the following line:
stdin_reader = watch_stream(stdin)
could be commented out, so that the search (and therefore PyJulia) stops watching stdin? (Along with the other lines interacting with stdin_reader)
That would certainly explain this issue if that solves it, because it only starts checking stdin a few steps into the search process, so it would make sense how the CPU goes up to 400% then down to 1%. The 1% just being from Julia waiting to read stdin...
If you have some time to try this fix out, you can follow the instructions here: https://astroautomata.com/PySR/backend/ for modifying the PySR backend (which is SymbolicRegression.jl), and implement those changes. It would be interesting to hear if that stdin change solves it.
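To illustrate why a process stuck waiting on stdin would sit at roughly 1% CPU, here is a small Python analogy. It is not the PyJulia or SymbolicRegression.jl code, only a sketch of the general behavior: a child process that reads stdin stays blocked, nearly idle, until its input is closed.

```python
import subprocess
import sys
import time

# Child process that does nothing except wait for stdin to reach end-of-file.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.read()"],
    stdin=subprocess.PIPE,
)

time.sleep(0.5)
# The child is still alive: it is blocked inside read(), using almost no CPU.
still_waiting = child.poll() is None

# Closing our end of the pipe delivers end-of-file, so the blocked read returns.
child.stdin.close()
exit_code = child.wait(timeout=30)

print(f"waiting on stdin: {still_waiting}, exit code after EOF: {exit_code}")
```

If the hypothesis above is right, the search would be in the same state as the child here before its pipe is closed: alive, but idle.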
One other thing to try is multithreading=False, which will switch to the multiprocessing mode.
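Combined with the minimal settings from the report, that suggestion might look like the sketch below. The `settings` dict and `build_model` helper are made up for illustration. The `multithreading` keyword is taken from this thread; newer PySR releases may spell this option differently, so check the API reference for your installed version.

```python
# Settings from the report above, plus the suggested switch.
settings = {
    "populations": 80,
    "niterations": 80,
    "multithreading": False,  # use the multiprocessing mode instead of threads
}


def build_model(**overrides):
    """Create a PySRRegressor from `settings`; requires PySR to be installed."""
    # Imported lazily so the settings dict can be inspected without PySR.
    from pysr import PySRRegressor

    return PySRRegressor(**{**settings, **overrides})
```

With PySR installed, `model = build_model()` followed by `model.fit(X, y)` would run the same search as in the report with only the parallelism mode changed, which makes it easier to tell whether threading is the cause.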