
dumb-gpt-llm's Introduction

DUMB GPT LLM

About Dumb GPT:

This is a learning project created by me.

It is a Large Language Model that has only been pre-trained, meaning it is not smart: no fine-tuning is involved in the training process, and no pre-trained transformers were used in creating it.

It was created to find text that looks similar to its prompt (the input that we give it).

The current model was trained for a whopping 100k iterations, which took 18 hours on my RTX 3050 graphics card.
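To put those numbers in perspective, here is a small back-of-the-envelope calculation. It only uses the figures quoted in this README (100k iterations, 18 hours, and the batch size of 48 mentioned in the run steps); the variable names are just for illustration.

```python
# Figures taken from this README.
iterations = 100_000
hours = 18
batch_size = 48  # batch size the current model was trained with

total_seconds = hours * 3600
seconds_per_iteration = total_seconds / iterations
iterations_per_second = iterations / total_seconds
# Number of sequences fed to the model, counting any repeats.
sequences_processed = iterations * batch_size

print(f"{seconds_per_iteration:.3f} s per iteration")        # 0.648 s per iteration
print(f"{iterations_per_second:.2f} iterations per second")  # 1.54 iterations per second
print(f"{sequences_processed:,} sequences processed")        # 4,800,000 sequences processed
```

So each iteration took roughly two thirds of a second on this card.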

You can train yours by getting the data from Open Web Text.
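The real extraction logic lives in Data-Extract.ipynb in this repo. Purely as a rough idea of what such a step can look like, here is a minimal sketch that assumes the downloaded data is a folder of .xz-compressed text files. That layout, the function name `extract_xz_folder`, and the paths in the usage note are all assumptions, not something taken from the notebook.

```python
import lzma
from pathlib import Path


def extract_xz_folder(src_dir, out_path):
    """Decompress every .xz file in src_dir and append its text to out_path.

    Returns the number of archives that were processed.
    Assumes each archive holds UTF-8 text (an assumption, see above).
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for xz_file in sorted(Path(src_dir).glob("*.xz")):
            # lzma.open in text mode decompresses and decodes in one step.
            with lzma.open(xz_file, "rt", encoding="utf-8") as f:
                out.write(f.read())
            count += 1
    return count
```

A call could look like `extract_xz_folder("openwebtext", "train.txt")`, where both names are placeholders for your own paths.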

Steps to run this LLM (in a Linux environment):

  1. Create your venv with the command python3 -m venv env_name

  2. Make sure you have PyTorch installed in your venv (PyTorch installation link).
    I suggest going with the latest version of CUDA.

  3. Clone this repo.

  4. Download the data from Open Web Text.

  5. Extract the data using the Data-Extract.ipynb file.

  6. After the extraction is done, go to chatbot.py and make sure the paths are correct.

  7. After that, open your terminal and type the magic words (obviously, these are for my pre-trained model that goes with the repo): python3 chatbot.py -batch_size 48
    Don't forget to download the GPT model from here.
    The batch size is 48 because the current model was trained with that batch size; you can adjust it to suit your graphics card. 😬
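For readers wondering how a flag like -batch_size can be read inside a script, here is a minimal sketch using Python's standard argparse module. It is not copied from chatbot.py: the function name `parse_args` and the default of 48 are assumptions for illustration, and the actual script may handle the flag differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the -batch_size flag (hypothetical helper, not from chatbot.py)."""
    parser = argparse.ArgumentParser(description="Chat with the trained model")
    # argparse accepts a single-dash option with a long name such as -batch_size;
    # the value ends up in the attribute `batch_size`.
    parser.add_argument(
        "-batch_size",
        type=int,
        default=48,  # assumed default, matching the batch size quoted above
        help="batch size to run the model with",
    )
    return parser.parse_args(argv)


# Equivalent to running: python3 chatbot.py -batch_size 48
args = parse_args(["-batch_size", "48"])
print(args.batch_size)
```

Passing a list to `parse_args` is only for demonstration; calling it with no argument reads the real command line.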

Also make sure to change the batch size whenever you train your own model using the file training.py.
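Since a sensible batch size depends on your graphics card, it can help to confirm that PyTorch actually sees the GPU before starting a long training run. The sketch below is only a suggestion; the helper name `pick_device` is made up here, and it degrades gracefully if PyTorch is not installed in the current venv.

```python
import importlib.util


def pick_device():
    """Return "cuda", "cpu", or "torch-missing" (hypothetical helper)."""
    # Check for the package first so the script does not crash without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    import torch

    # torch.cuda.is_available() reports whether a usable CUDA GPU was found.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with an NVIDIA card, the installed PyTorch build or the CUDA setup is worth checking before adjusting the batch size.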

And don't forget to change the name of the training file when you run your custom model.

I would like to thank Elliot Arledge, who was my mentor and guide throughout this project (check out the links below).

Below are links to the papers that were useful in creating this GPT:

  1. A Survey of Large Language Models

  2. Attention Is All You Need (the basic transformer)

    The paper below is not used in this project, as it is about fine-tuning.
    QLoRA: Efficient Finetuning of Quantized LLMs

Connect with Me: LinkedIn

Links
=> Elliot Arledge's YouTube Channel
=> freeCodeCamp Course for LLMs

Thank you for viewing

dumb-gpt-llm's People

Contributors

birenmer
