Comments (5)
Thank you for the log entry. I think we'll need to fix both or at least make sure the problem is not in (1) as well.
The exception from Java has a combination of path separators (/ and \), but the log entry contains just \, so I think there is some wrong conversion of path separators in the Java backend. It's possible that issue (1) is only in the error-handling part, which could explain why the previous version worked, but we should look into it to make sure.
java.lang.RuntimeException: java.io.FileNotFoundException: C:\Users\ROEL~1.VER\AppData\Local\Temp\tmpmg5yuobe.h2oframe2Convert.csv
from h2o-3.
Thank you for taking the time to pinpoint the issue. Unfortunately, I don't have a Windows machine, so I have just two untested hypotheses:
(1) unexpected character in the path,
(2) two processes trying to open the same file (which is supported on unix-like systems but not on Windows).
If it's just (1), could you provide us with part of the h2o log? I'm interested in the log entry ExportFiles processing (SOME_PATH), e.g.:
01-31 10:35:21.528 127.0.0.1:54321 3076 5715826-38 INFO water.default: ExportFiles processing (/tmp/iris.csv)
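If the log is long, a small script can pull out just those entries. This is a minimal sketch, assuming the log text is already available as a string and that the entries look like the example line above; the function name find_export_paths is made up for illustration.

```python
import re

# Matches the path inside "ExportFiles processing (...)" at the end of a log line.
EXPORT_RE = re.compile(r"ExportFiles processing \((.*)\)\s*$")


def find_export_paths(log_text):
    """Return the paths named in 'ExportFiles processing (...)' log entries."""
    paths = []
    for line in log_text.splitlines():
        match = EXPORT_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


sample = (
    "01-31 10:35:21.528 127.0.0.1:54321 3076 5715826-38 INFO water.default: "
    "ExportFiles processing (/tmp/iris.csv)"
)
print(find_export_paths(sample))
```

The returned path is exactly what the backend logged, so comparing it with the path in the Python traceback should show whether the separators were changed along the way.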
If that is the only problem, you could work around it by adding something like the following to the top of your script or Jupyter notebook (just make sure the path exists).
import tempfile
tempfile.tempdir = "C:\\tmp\\"
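To make sure the path exists, the directory can be created before it is assigned. A minimal sketch; the helper name use_temp_dir is hypothetical and the C:\tmp path is only an illustration.

```python
import os
import tempfile


def use_temp_dir(path):
    """Create `path` if needed and make it the default for the tempfile module."""
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    tempfile.tempdir = path           # later tempfile calls without dir= use this
    return tempfile.gettempdir()


# Example call on Windows (illustrative path only):
# use_temp_dir("C:\\tmp")
```

This has to run before the first conversion, because every later NamedTemporaryFile call that does not pass dir= picks up the new default.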
If it's (2), we will need to fix how the temporary file is created. It should be a simple thing to fix. I think something like the following would do. cc @wendycwong
--- a/h2o-py/h2o/frame.py
+++ b/h2o-py/h2o/frame.py
@@ -1970,12 +1970,16 @@ class H2OFrame(Keyed, H2ODisplay):
         if can_use_pandas() and use_pandas:
             import pandas
             if (can_use_datatable()) or (can_use_polars() and can_use_pyarrow()): # can use multi-thread
-                with tempfile.NamedTemporaryFile(suffix=".h2oframe2Convert.csv") as exportFile:
+                exportFile = tempfile.NamedTemporaryFile(suffix=".h2oframe2Convert.csv", delete=False)
+                try:
+                    exportFile.close()
                     h2o.export_file(self, exportFile.name, force=True)
                     if can_use_datatable(): # use datatable for multi-thread by default
                         return self.convert_with_datatable(exportFile.name)
                     elif can_use_polars() and can_use_pyarrow(): # polar/pyarrow if datatable is not available
                         return self.convert_with_polars(exportFile.name)
+                finally:
+                    os.unlink(exportFile.name)
             warnings.warn("converting H2O frame to pandas dataframe using single-thread. For faster conversion using"
                           " multi-thread, install datatable (for Python 3.9 or lower), or polars and pyarrow "
                           "(for Python 3.10 or above).", H2ODependencyWarning)
You can patch your h2o library using that code, but it might get a little more involved. If it's just (2), I think we could manage to release the fix in the upcoming major release (likely within the next month). If the problem is in (1) as well, we would probably need your help in providing us with the line from the log.
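The idea behind the patch can be tried outside h2o. The following is a minimal sketch of the same pattern, not the library code: with_closed_temp_file is a hypothetical helper, and its write and read arguments stand in for h2o.export_file and the convert_with_* calls.

```python
import os
import tempfile


def with_closed_temp_file(write, read, suffix=".csv"):
    """Create a temp file, close our handle, let `write` fill it, then `read` it.

    `write` and `read` are hypothetical callables that take a file path.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.close()          # release our handle so another process can open the path
        write(tmp.name)
        return read(tmp.name)
    finally:
        os.unlink(tmp.name)  # with delete=False, cleanup is our responsibility


def write_csv(path):
    with open(path, "w") as f:
        f.write("a,b\n1,2\n")


def read_text(path):
    with open(path) as f:
        return f.read()


print(with_closed_temp_file(write_csv, read_text))
```

Closing the handle before the writer runs is the part that matters on Windows, where a file that is still open through NamedTemporaryFile generally cannot be opened a second time by name.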
from h2o-3.
Hi @tomasfryda
Thanks for your response.
Here is that part of the logs:
01-31 10:18:43.312 127.0.0.1:54321 31396 8557915-20 INFO water.default: ExportFiles processing (C:\Users\ROEL~1.VER\AppData\Local\Temp\tmpmg5yuobe.h2oframe2Convert.csv)
01-31 10:18:43.314 127.0.0.1:54321 31396 8557915-20 WARN water.default: File C:\Users\ROEL~1.VER\AppData\Local\Temp\tmpmg5yuobe.h2oframe2Convert.csv exists, but will be overwritten!
01-31 10:18:43.325 127.0.0.1:54321 31396 FJ-1-7 ERROR water.default:
java.lang.RuntimeException: java.io.FileNotFoundException: C:\Users\ROEL~1.VER\AppData\Local\Temp\tmpmg5yuobe.h2oframe2Convert.csv (The process cannot access the file because it is being used by another process)
Sounds like it might be (2) rather?
from h2o-3.
@tomasfryda Do you have a workaround for this issue?
from h2o-3.
@kalaiselvan263 Not yet. I think the modification I suggested (#16045 (comment)) would work, but I don't have a Windows machine to test it on.
You would need to find where the h2o package is installed and navigate to the file frame.py. On macOS, this gives me the path to the file for the python3 that has h2o installed: import sysconfig; print(sysconfig.get_paths()["purelib"]+"/h2o/frame.py") and I think it would work on Windows as well (you'd just need to change / to \).
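A variant that avoids swapping the separators by hand is to let pathlib build the path. A minimal sketch, assuming h2o was installed into the "purelib" directory of the interpreter that runs it; the helper name h2o_frame_path is hypothetical.

```python
import sysconfig
from pathlib import Path


def h2o_frame_path():
    """Path where frame.py would live if h2o is installed in this interpreter's purelib."""
    return Path(sysconfig.get_paths()["purelib"]) / "h2o" / "frame.py"


path = h2o_frame_path()
print(path, "(exists)" if path.exists() else "(not found in purelib)")
```

If it reports that the file was not found, h2o is probably installed for a different interpreter or environment than the one running the snippet.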
If that doesn't work, you can change exportFile to some predefined path that does not contain any special characters, e.g.:
import random
# Use an integer bound: recent Python versions reject the float 1e8 in randint.
exportFile = "C:\\tmp\\h2o_tempfile_{}.csv".format(random.randint(0, 10**8))
It's not perfect: with this change there could be issues with multiple users trying to do the same thing at the same time, but the probability of that is pretty low (1e-8), and on Windows you'd be more likely to end up with the same error (The process cannot access the file because it is being used by another process), so you'd just have to retry.
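If the chance of a name collision is a concern, tempfile.mkstemp can pick a unique name inside a fixed directory instead of a random number. This is a sketch of an alternative, not what the suggestion above does; unique_export_path is a hypothetical helper and C:\tmp is only an example directory.

```python
import os
import tempfile


def unique_export_path(directory, suffix=".h2oframe2Convert.csv"):
    """Reserve a uniquely named, closed file in `directory` and return its path."""
    os.makedirs(directory, exist_ok=True)
    fd, path = tempfile.mkstemp(prefix="h2o_tempfile_", suffix=suffix, dir=directory)
    os.close(fd)  # close our handle so another process can open the file
    return path


# Example call on Windows (illustrative path only):
# export_path = unique_export_path("C:\\tmp")
```

The file is created empty and is not removed automatically, so the caller still has to delete it once the conversion is done.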
from h2o-3.