Comments (10)

daehwanahn commented on August 23, 2024
# Run these cells in a Google Colab notebook
!pip install pycox
from google.colab import files
files.upload()  # upload your kaggle.json
!pip install -q kaggle
# Put the Kaggle API token where the kaggle client expects it
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!ls ~/.kaggle
!chmod 600 /root/.kaggle/kaggle.json
from pycox import datasets
datasets.kkbox.download_kkbox()  # download and extract the KKBox data
import numpy as np
# Convert the DataFrame to a NumPy array (column names are dropped)
kkbox_survival = np.array(datasets.kkbox.read_df())
np.save('kkbox_survival.npy', kkbox_survival)
files.download('kkbox_survival.npy')  # download the file to your local machine

from pycox.

havakv commented on August 23, 2024

It's great that you found a way to get your data @daehwanahn, and thank you for testing py7zr on Windows for me. I'll rewrite the code to use py7zr on Windows then.

havakv commented on August 23, 2024

Thank you for posting the issue. I've not tested obtaining this dataset on Windows, so it's not that surprising that there might be some bugs.

It looks like the code is failing here, so if a file is not found, then that path might not be correct. The alternative is that the 7z command doesn't work as expected.
To verify that the path is correct, can you try this:

from pycox.datasets import kkbox
self = kkbox
train_path = self._path_dir /  "train.csv.7z"
print(train_path.exists())  # This should print "True" if the file is found

And if this prints "True", can you then try:

import subprocess
print(subprocess.check_output(['7z', '--help']).decode('utf-8'))

which should print out the help pages for 7z to ensure that 7z works on your machine.

Finally, if both of these work, can you try this and post the error message that you get from it?

import subprocess
subprocess.check_output(['7z',  'x', str(train_path), f"-o{self._path_dir}", '-y'])

daehwanahn commented on August 23, 2024

Thanks for your reply!
I tested your suggestions and I got the following results.

from pycox.datasets import kkbox
self = kkbox
train_path = self._path_dir /  "train.csv.7z"
print(train_path.exists()) 

=> True

import subprocess
print(subprocess.check_output(['7z', '--help']).decode('utf-8'))

=> [WinError 2] The system cannot find the file specified

import subprocess
subprocess.check_output(['7z',  'x', str(train_path), f"-o{self._path_dir}", '-y'])

=> [WinError 2] The system cannot find the file specified

havakv commented on August 23, 2024

So then the issue seems to be that 7z doesn't work. Do you know how to check if it is installed? And if it is not installed, could you try to install it?

In the meantime I'll check if there is a way I can unzip with a Python package, so that we don't have to call a non-Python program for unzipping as we do now.
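
One possible way to check from Python whether 7z is installed is to look it up on PATH. This is only a minimal sketch: it assumes the executable is named `7z`, and `find_7z` is a hypothetical helper name, not part of pycox. On Windows, 7-Zip may be installed without its folder being on PATH, in which case this still returns None.

```python
import shutil


def find_7z(name="7z"):
    """Return the full path of the 7z executable, or None if it is not on PATH."""
    return shutil.which(name)


exe = find_7z()
if exe is None:
    print("7z was not found on PATH (not installed, or its folder is not on PATH).")
else:
    print(f"7z found at {exe}")
```

If this reports that 7z is missing, the "[WinError 2] The system cannot find the file specified" error from subprocess is expected, since Python cannot launch a program it cannot locate.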

havakv commented on August 23, 2024

So, can you try installing py7zr with pip install py7zr and running the following?

import py7zr
archive = py7zr.SevenZipFile(str(train_path), mode='r')
archive.extractall(path=str(self._path_dir))
print((self._path_dir / 'train.csv').exists())

If this doesn't error out, and prints "True", we can use this package for uncompressing instead of the os command.
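
To illustrate how the two approaches could coexist, here is a rough sketch that tries the 7z command first and falls back to py7zr when the executable cannot be launched. It is not the code pycox actually uses; `extract_7z` and its `cli` parameter are hypothetical names, and it assumes py7zr has been installed with pip.

```python
import subprocess
from pathlib import Path


def extract_7z(archive_path, out_dir, cli="7z"):
    """Extract a .7z archive with the 7z CLI, falling back to py7zr if it is missing."""
    archive_path, out_dir = Path(archive_path), Path(out_dir)
    try:
        # Same command as in the earlier comment: extract into out_dir, answer yes to prompts
        subprocess.check_output([cli, "x", str(archive_path), f"-o{out_dir}", "-y"])
    except FileNotFoundError:
        # Raised when the executable itself cannot be found (e.g. [WinError 2] on Windows)
        import py7zr  # third-party package: pip install py7zr

        with py7zr.SevenZipFile(str(archive_path), mode="r") as archive:
            archive.extractall(path=str(out_dir))
```

With the names from the snippets above, it would be called as `extract_7z(train_path, self._path_dir)`.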

daehwanahn commented on August 23, 2024

Hi, havakv

  1. I found that I didn't have py7zr. So, I installed it.

import py7zr
archive = py7zr.SevenZipFile(str(train_path), mode='r')
archive.extractall(path=str(self._path_dir))
print((self._path_dir / 'train.csv').exists())

This command works; it prints 'True'.

  2. But I got the same error with 'subprocess' and 'datasets.kkbox.download_kkbox()'.
    You're right. It seems we need to use py7zr instead of subprocess on Windows.

daehwanahn commented on August 23, 2024

Hi, havakv

I extracted the data using Google Colab.
So, this is not an urgent problem.

Many thanks~!

havakv commented on August 23, 2024

Let's just keep it open until this works smoothly on Windows too.

havakv commented on August 23, 2024

#69
