bookmark-manager's Introduction

Bookmark-Manager

An NLP-based approach to automatically categorizing your bookmarks!

In-Depth

To understand this project in depth, refer to my technical paper: Bookmark Classification using Multinomial Naive Bayes Model
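
For readers who have not met the model named in the paper's title, here is a minimal pure-Python sketch of multinomial Naive Bayes with add-one smoothing. It only illustrates the general technique; it is not the code in categorize.py, and the function names `train` and `classify`, the whitespace tokenizer, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Fit word and document counts from a list of (category, text) pairs."""
    word_counts = defaultdict(Counter)  # category -> word frequencies
    doc_counts = Counter()              # category -> number of training texts
    for category, text in docs:
        doc_counts[category] += 1
        word_counts[category].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the category with the highest log-posterior for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in doc_counts:
        total_words = sum(word_counts[category].values())
        # Log prior of the category.
        score = math.log(doc_counts[category] / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Log likelihood with add-one (Laplace) smoothing.
            count = word_counts[category][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best


if __name__ == "__main__":
    toy_corpus = [
        ("programming", "python code function compiler"),
        ("cooking", "recipe oven bake flour"),
    ]
    model = train(toy_corpus)
    print(classify(model, "bake a recipe"))
```

In the project itself, the training text would come from the ./corpus/ directory and the text to classify from the scraped pages; the paper describes the actual preprocessing.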

How does this work?

  • Enter your bookmarks in the ./links.json file
  • Run categorize.py to categorize them
  • scrape_filter_link.py contains the classes used to scrape information from each URL

What bookmarks does it categorize?

It can categorize a variety of bookmarks. Currently it supports every category that has a directory under ./corpus/.

Will I have to enter my bookmark links manually?

To a certain extent! For example, Firefox lets users back up their bookmarks in JSON format. You can extract each uri from that JSON file and feed it into ./links.json.
To back up your bookmarks in Firefox, press Ctrl+Shift+O, go to Import and Backup, and then choose Backup.
Chrome users can check this post on superuser.
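
A minimal sketch of the extraction step, assuming the usual Firefox backup layout in which folders hold a "children" list and each bookmark carries a "uri" field. The function names and the backup filename are hypothetical, and the exact layout that ./links.json expects is not described here, so check the sample file in the repository before writing the result into it.

```python
import json


def collect_uris(node):
    """Recursively gather every bookmark "uri" from a Firefox bookmark tree."""
    uris = []
    uri = node.get("uri")
    # Skip Firefox-internal "place:" entries (saved queries), which are not web pages.
    if uri and not uri.startswith("place:"):
        uris.append(uri)
    for child in node.get("children", []):
        uris.extend(collect_uris(child))
    return uris


def load_firefox_uris(backup_path):
    """Read a Firefox bookmark backup and return its URLs in tree order."""
    with open(backup_path, encoding="utf-8") as f:
        return collect_uris(json.load(f))


if __name__ == "__main__":
    # "bookmarks-backup.json" is a placeholder name for your own backup file.
    for uri in load_firefox_uris("bookmarks-backup.json"):
        print(uri)
```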

Will the code create a directory structure with my bookmarks?

No, the mapping of each URL to its category is stored as a dict in a JSON file, result.json.
The keys are your bookmarks with values being their categories.
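
If you would rather browse the output by category, the URL-to-category dict can be inverted in a few lines. This is a sketch that assumes result.json is a flat JSON object of exactly the shape described above; the helper names are made up for this example.

```python
import json
from collections import defaultdict


def group_by_category(result):
    """Invert a {url: category} mapping into {category: [urls]}."""
    grouped = defaultdict(list)
    for url, category in result.items():
        grouped[category].append(url)
    return dict(grouped)


def load_grouped(path="result.json"):
    """Load result.json and group its bookmarks by category."""
    with open(path, encoding="utf-8") as f:
        return group_by_category(json.load(f))


if __name__ == "__main__":
    for category, urls in load_grouped().items():
        print(category)
        for url in urls:
            print("  " + url)
```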

Can I see a demo?

Sure, here's one (the highlighted part is what gets stored in result.json): [image: Bookmark-Manager output]

Can I improve the corpus by adding more categories to the ./corpus/ directory?

Yes, you can! The code is fairly scalable.
To add your own corpora:

  • Create a directory with a unique category name in ./corpus/
  • Inside ./corpus/your-category-dir, add your corpus text in a JSON file with the format: {"text": "_your_corpus_text_here_"}

(NOTE: You can add multiple JSON files to a category directory)
The next time you run categorize.py, it will take the new or modified corpora into account.
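
The two steps above can also be scripted. The sketch below creates the category directory if needed and writes one corpus file in the {"text": ...} format; the function name, the default filename custom.json, and the example category are hypothetical choices, not names the project requires.

```python
import json
from pathlib import Path


def add_corpus_text(category, text, corpus_root="./corpus", filename="custom.json"):
    """Write text to corpus_root/category/filename as {"text": ...}."""
    category_dir = Path(corpus_root) / category
    category_dir.mkdir(parents=True, exist_ok=True)  # create the category if it is new
    target = category_dir / filename
    with open(target, "w", encoding="utf-8") as f:
        json.dump({"text": text}, f, ensure_ascii=False)
    return target


if __name__ == "__main__":
    # Example category and text, chosen only for illustration.
    path = add_corpus_text("gardening", "soil compost seedlings pruning mulch")
    print(path)
```

Because a category directory may hold several JSON files, calling the function again with a different filename adds more text to the same category instead of overwriting it.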

License

The code is released under the MIT License.

bookmark-manager's People

Contributors

dependabot[bot], iotarepeat, pncnmnp

bookmark-manager's Issues

Please add a README

I have no idea what a "NLP based approach to automatically categorize your bookmarks!" looks like (functionally or visually). What bookmarks is it categorizing? Why does this help? Is this something to do with the bookmarks in my browser, or does it have its own store? Etc. etc. etc.

Please use the NLTK Downloader to obtain the resource:

Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\nltk\corpus\util.py", line 83, in __load
    root = nltk.data.find("{}/{}".format(self.subdir, zip_name))
  File "C:\Python39\lib\site-packages\nltk\data.py", line 585, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource stopwords not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('stopwords')

How to solve that problem?
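
The error message names the fix itself: the stopwords corpus has to be downloaded once with the NLTK Downloader. A minimal sketch of doing that from Python, downloading only when the resource is missing, follows; the helper name `ensure_stopwords` is hypothetical and is not part of this project.

```python
def ensure_stopwords():
    """Download the NLTK stopwords corpus if it is not installed yet.

    Returns True when a download was triggered, False otherwise.
    """
    import nltk  # imported here so the check runs only when this helper is called

    try:
        nltk.data.find("corpora/stopwords")
        return False
    except LookupError:
        nltk.download("stopwords")
        return True


if __name__ == "__main__":
    ensure_stopwords()
```

Run this once (it needs network access), then run categorize.py again.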
