LDA_Abstract_README

This repository contains scripts, datasets, and tools for training and applying LDA models on two kinds of text: abstracts and README files.

Requirements

nltk~=3.7
requests~=2.31.0
bs4~=0.0.1
beautifulsoup4~=4.10.0

Features

  • LDA Training and Prediction: Train Latent Dirichlet Allocation (LDA) models and use them to predict topics. Two training scripts are available: abstract_LDA.py trains LDA on abstract data, and readme_LDA.py trains LDA on README data. Two prediction scripts are available: abstract_predict.py predicts topic distributions and topics for abstract data, and readme_predict.py does the same for README data.
  • LDA Visualization: Visualize LDA topics and their associated keywords interactively with the pyLDAvis library.
  • Text Preprocessing: Perform text preprocessing tasks such as tokenization, stop word removal, and stemming using the nltk library.
  • Data Extraction: Extract text data from GitHub README files.
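
As a rough illustration of the data extraction feature, the sketch below downloads a repository's README and reduces it to plain text. It is not the repository's crawler: the function names, the `README.md` file name, and the use of `HEAD` as the branch reference are all assumptions made for this example, and it relies on the standard library rather than requests and beautifulsoup4 so that it runs without extra installs.

```python
import re
import urllib.request


def raw_readme_url(owner: str, repo: str, ref: str = "HEAD") -> str:
    """Build a raw-content URL for README.md (file name and ref are assumptions)."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/README.md"


def strip_markdown(text: str) -> str:
    """Very rough Markdown cleanup so that mostly plain words remain."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)   # drop fenced code
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", " ", text)         # drop images
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)      # keep link labels
    text = re.sub(r"[#*`>|]", " ", text)                      # drop markup symbols
    return re.sub(r"\s+", " ", text).strip()


def fetch_readme_text(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Download one README and return its cleaned text (needs network access)."""
    with urllib.request.urlopen(raw_readme_url(owner, repo), timeout=timeout) as resp:
        return strip_markdown(resp.read().decode("utf-8", errors="replace"))


if __name__ == "__main__":
    print(fetch_readme_text("PythonSimilarity", "LDA_Abstract_README")[:200])
```

A real crawler would also need rate limiting and handling for repositories whose README has a different name or format.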

Getting Started

To get started with this repository, follow these steps:

  1. Clone the Repository: Clone this repository to your local machine using the following command:
     git clone https://github.com/PythonSimilarity/LDA_Abstract_README.git
  2. Install Dependencies: Install the required dependencies by running the following command:
     pip install -r requirements.txt

Directory Structure

./abstract_model and ./readme_model: These two directories store the trained LDA models.

./utils: Preprocessing methods and web scraping scripts.

./crawler: The crawler that collects README files from GitHub.

./lda_predict: Prediction and visualization for the two LDA models.

./lda_train: Training scripts for the two LDA models.

./data: All data used to train the LDA models.
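
To give a feel for what the training and prediction steps compute, here is a small, dependency-free sketch of LDA fitted by collapsed Gibbs sampling. It is only an illustration under stated assumptions: the scripts in ./lda_train and ./lda_predict may use a library and a different inference method, and every name and hyperparameter below (`train_lda`, `alpha`, `beta`, the toy documents) is hypothetical.

```python
import random
from collections import Counter


def train_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assign = []                                          # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        assign.append(zs)
        for w, k in zip(doc, zs):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def topic_distribution(counts, alpha=0.1):
    """Smoothed per-document topic proportions from topic counts."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def top_words(topic_word, n=3):
    """The n most frequent words assigned to each topic."""
    return [[w for w, _ in counter.most_common(n)] for counter in topic_word]


# Toy corpus (made up for this example).
docs = [
    ["topic", "model", "lda", "topic"],
    ["crawler", "github", "readme", "crawler"],
    ["lda", "model", "topic"],
    ["readme", "github", "crawler"],
]
doc_topic, topic_word = train_lda(docs, num_topics=2)
for counts in doc_topic:
    print([round(p, 2) for p in topic_distribution(counts)])
print(top_words(topic_word))
```

The per-document proportions correspond to the "distributions" mentioned for the prediction scripts, and the top words per topic are the kind of keywords a visualization would display.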

Examples and Documentation

The repository includes examples and documentation for each script and tool. Please refer to the individual script files and the accompanying documentation for detailed instructions on usage, customization, and examples.
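
As one illustration of the text preprocessing feature (tokenization, stop word removal, stemming), the sketch below turns raw text into a list of normalized tokens. It is an assumption-laden example rather than the code in ./utils: the `preprocess` function and the tiny stop-word list are hypothetical, tokenization uses a plain regular expression, and stemming uses nltk's `PorterStemmer` only when nltk is installed.

```python
import re

try:
    from nltk.stem import PorterStemmer  # works without downloading nltk corpora
    _stem = PorterStemmer().stem
except ImportError:
    # Fallback so the sketch still runs without nltk: leave words unstemmed.
    def _stem(word):
        return word

# Tiny illustrative stop-word list (an assumption); nltk.corpus.stopwords
# offers a fuller list once its data has been downloaded.
STOP_WORDS = frozenset(
    {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}
)


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess("The model is trained on README files.")
print(tokens)
```

Token lists of this shape are what a topic model typically consumes, one list per document.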

Contributors

leyusf
