Comments (9)

michaelharms commented on August 19, 2024

Ok, I think I know what's going on now.

Try renaming your Python script to something other than "comcrawl.py". I think Python is confusing your script with the comcrawl package: the script's own directory is searched first, so the import resolves to your script file and Python tries to import IndexClient from it.
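The shadowing described above can be reproduced in isolation. The following is a minimal sketch, not taken from the thread: it uses a hypothetical package called `fakepkg` as a stand-in for comcrawl and a hypothetical script name `my_crawler.py`, builds both in a temporary directory, and runs the same script once under the clashing name and once after renaming it.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Same first line as the failing script, with a hypothetical package name.
SCRIPT = "from fakepkg import IndexClient\nprint('import ok')\n"


def run(script, env):
    """Run a script in a fresh interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, str(script)],
        env=env,
        capture_output=True,
        text=True,
    )


def shadowing_demo():
    """Return (result_with_clashing_name, result_after_rename)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for site-packages: a real package that defines IndexClient.
        site = Path(tmp) / "site"
        pkg = site / "fakepkg"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("class IndexClient:\n    pass\n")

        # Project directory holding a script named like the package.
        project = Path(tmp) / "project"
        project.mkdir()
        clashing = project / "fakepkg.py"
        clashing.write_text(SCRIPT)

        env = dict(os.environ, PYTHONPATH=str(site))

        # The script's directory comes first on sys.path, so "fakepkg"
        # resolves to the script itself and the import fails.
        shadowed = run(clashing, env)

        # Renaming the file removes the clash; the real package is found.
        renamed_path = project / "my_crawler.py"
        clashing.rename(renamed_path)
        renamed = run(renamed_path, env)

        return shadowed, renamed


if __name__ == "__main__":
    shadowed, renamed = shadowing_demo()
    print("clashing name exit code:", shadowed.returncode)
    print("renamed script exit code:", renamed.returncode)
```

Note that the rename has to remove the clashing file from the directory. A second script placed next to a leftover `fakepkg.py` would still pick up that file instead of the installed package.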

Please let me know if this resolves the problem :)

from comcrawl.

masonreznov commented on August 19, 2024

You are right, the filename was the culprit. Once again, thanks a lot for your time, @michaelharms.

michaelharms commented on August 19, 2024

Hey, could you provide more information?

Your python version and a snippet of your code?

masonreznov commented on August 19, 2024

Python version: Python 3.6.10
I am using the code snippet from your README.md:

from comcrawl import IndexClient
import pandas as pd

client = IndexClient()
client.search("reddit.com/r/MachineLearning/*")

client.results = (pd.DataFrame(client.results)
                  .sort_values(by="timestamp")
                  .drop_duplicates("urlkey", keep="last")
                  .to_dict("records"))

client.download()

pd.DataFrame(client.results).to_csv("results.csv")

michaelharms commented on August 19, 2024

Mhm, this should work.

Can you run pip show comcrawl from your terminal to see if it is correctly installed?
And maybe post the output here?

Also, are you using a virtual environment?

masonreznov commented on August 19, 2024

pip show comcrawl output:

Name: comcrawl
Version: 1.0.1
Summary: A python utility for downloading Common Crawl data.
Home-page: https://github.com/michaelharms/comcrawl
Author: Michael Harms
Author-email: [email protected]
License: MIT
Location: /home/pokpok/py3610venv/lib/python3.6/site-packages
Requires: requests
Required-by: 

Yes, I have installed comcrawl under a virtual environment.

michaelharms commented on August 19, 2024

Wow, this is really strange.

Which OS are you using?

I am using macOS. When I try the following steps with Python 3.6.1, everything works as expected:

~:$ mkdir test-project
~:$ cd test-project/
test-project:$ python --version
Python 3.6.1
test-project:$ python -m venv venv
test-project:$ source venv/bin/activate
(venv) test-project:$ pip install comcrawl
Collecting comcrawl
  Using cached https://files.pythonhosted.org/packages/9e/50/54e114158b84a4f438e222ec176a894ded5ee76b891cff7f43c0398161ce/comcrawl-1.0.1-py3-none-any.whl
Collecting requests<3.0.0,>=2.22.0 (from comcrawl)
  Using cached https://files.pythonhosted.org/packages/1a/70/1935c770cb3be6e3a8b78ced23d7e0f3b187f5cbfab4749523ed65d7c9b1/requests-2.23.0-py2.py3-none-any.whl
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests<3.0.0,>=2.22.0->comcrawl)
  Using cached https://files.pythonhosted.org/packages/e1/e5/df302e8017440f111c11cc41a6b432838672f5a70aa29227bf58149dc72f/urllib3-1.25.9-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests<3.0.0,>=2.22.0->comcrawl)
  Using cached https://files.pythonhosted.org/packages/57/2b/26e37a4b034800c960a00c4e1b3d9ca5d7014e983e6e729e33ea2f36426c/certifi-2020.4.5.1-py2.py3-none-any.whl
Collecting idna<3,>=2.5 (from requests<3.0.0,>=2.22.0->comcrawl)
  Using cached https://files.pythonhosted.org/packages/89/e3/afebe61c546d18fb1709a61bee788254b40e736cff7271c7de5de2dc4128/idna-2.9-py2.py3-none-any.whl
Collecting chardet<4,>=3.0.2 (from requests<3.0.0,>=2.22.0->comcrawl)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Installing collected packages: urllib3, certifi, idna, chardet, requests, comcrawl
Successfully installed certifi-2020.4.5.1 chardet-3.0.4 comcrawl-1.0.1 idna-2.9 requests-2.23.0 urllib3-1.25.9
You are using pip version 9.0.1, however version 20.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(venv) test-project:$ python
Python 3.6.1 (default, May 11 2020, 09:42:15) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.10.44.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from comcrawl import IndexClient
>>> client = IndexClient()
>>> 

Could you try these exact steps in a new test project with a fresh installation and post the whole console input and output, like I did above?

If you are using a different operating system, adjust the step that activates the virtual environment accordingly.

masonreznov commented on August 19, 2024

Hey @michaelharms, thanks for your reply.

I made a new virtual env and ran this in the terminal:

(comcr) pokpok@pokpok:~/Desktop/pytorch_tut$ python
Python 3.6.10 (default, Dec 19 2019, 23:04:32) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from comcrawl import IndexClient
>>> client = IndexClient()

Similarly, to cross-check, I ran the same commands in the previous virtual env:

(py3610venv) pokpok@pokpok:~/Desktop/pytorch_tut$ python
Python 3.6.10 (default, Dec 19 2019, 23:04:32) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from comcrawl import IndexClient
>>> client = IndexClient()
>>> 

Both setups work in the interactive terminal, but I get an error when I run the Python file:

(py3610venv) pokpok@pokpok:~/Desktop/scrapper/scrapper$ python comcrawl.py 
Traceback (most recent call last):
  File "comcrawl.py", line 1, in <module>
    from comcrawl import IndexClient
  File "/home/pokpok/Desktop/scrapper/scrapper/comcrawl.py", line 1, in <module>
    from comcrawl import IndexClient
ImportError: cannot import name 'IndexClient'
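The traceback above lists the script's own path as the source of the `comcrawl` module, which is the clue that the wrong file is being imported. One way to check this directly is to ask Python where it would load a given module name from. The helper below is a minimal sketch that uses only the standard library; the name `module_origin` is made up for illustration.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Run this from the project directory. A path inside the project,
    # rather than inside site-packages, points to a local file that
    # shadows the installed package.
    print(module_origin("comcrawl"))
```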

michaelharms commented on August 19, 2024

No problem :)
