Comments (7)
@akankshadash This isn't really enough information to troubleshoot your problem. I assume, based on the code, that you are attempting to run the predict-next-purchase notebook. This error appears to be coming from dask/pandas. What versions of the following libraries are you using?
- pandas
- dask
I do not get this error when I run the notebook.
from open_source_demos.
@thehomebrewnerd Yes, I am working on predict-next-purchase.
pandas - 1.1.5
dask - 3.0
I'm still not sure what your exact issue might be, but I don't think 3.0 is a valid dask version. When I run pip install -r requirements.txt in my environment, I get pandas version 1.1.5 and dask version 2022.1.1, and things still load fine for me. Can you double-check your dask version and try installing 2022.1.1?
Note that some other package version conflicts appear when I install the requirements, and those could potentially cause other issues in this notebook.
We are currently in the process of updating all our demo notebooks to work with newer versions of libraries, namely pandas, Featuretools and EvalML. I just updated this particular notebook last week, but those changes have not yet been merged into main.
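One way to double-check the installed versions is sketched below. It assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_versions is made up for this example and is not part of the notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["pandas", "dask"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in the same environment (or kernel) as the notebook should show whether the dask version really is what was reported.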
@thehomebrewnerd I have 8 GB of RAM. Could that also be contributing to the issue?
@akankshadash I don't believe memory issues would cause this particular error. Can you load other data into a dask dataframe with the same type of dd.read_csv command, or is the error specific to this dataset?
If you cannot load other data, that likely points to a problem with your environment.
Oh, wait. I just spotted the source of the problem in your code. You are specifying a wildcard in your filenames, which makes them point to files that do not exist. You do not need the -* in your filenames. None of the files have a - at the end, and the wildcard is not needed in this case, since you are just reading single files.
If you change your read commands to this, I think you should be fine:
import os
import dask.dataframe as dd

# data_dir and blocksize are assumed to be defined earlier in the notebook
order_products = dd.concat([dd.read_csv(os.path.join(data_dir, "order_products__prior.csv"), blocksize=blocksize),
                            dd.read_csv(os.path.join(data_dir, "order_products__train.csv"), blocksize=blocksize)])
orders = dd.read_csv(os.path.join(data_dir, "orders.csv"), blocksize=blocksize)
departments = dd.read_csv(os.path.join(data_dir, "departments.csv"), blocksize=blocksize)
products = dd.read_csv(os.path.join(data_dir, "products.csv"), blocksize=blocksize)
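To see why a stray wildcard fails, it can help to list what a pattern actually matches before handing it to dd.read_csv. The sketch below uses only the standard library glob module, whose matching may differ in details from what dask uses internally. The pattern "orders-*.csv" is a guess at what such a filename might look like, and matching_files is a hypothetical helper.

```python
import glob
import os
import tempfile


def matching_files(data_dir, pattern):
    """Return the sorted paths in data_dir that match a glob pattern."""
    return sorted(glob.glob(os.path.join(data_dir, pattern)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        # Create an empty stand-in for one of the dataset files.
        open(os.path.join(data_dir, "orders.csv"), "w").close()

        # Hypothetical wildcard pattern: no file name contains the literal "-".
        print(matching_files(data_dir, "orders-*.csv"))
        # The plain file name matches the single file.
        print(matching_files(data_dir, "orders.csv"))
```

An empty list for a pattern means no file on disk matches it, which is consistent with the "file does not exist" type of error described above.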
I only added "-*" to my file names to get past a previous error. I tried it after going through various suggestions, and although it resolved my old error, I landed on this new one.