chrome-gpt-reader's Introduction

Gpt Screen Reader

This extension was written as a small experiment in using LLMs to help with accessibility.

LLMs are terrible at facts but great at interacting with people. Traditional screen readers are extremely simplistic and slow, and they can't provide context to the user.

I was inspired by OpenAI's work on GPT-4o and its inclusion in Be My Eyes: https://openai.com/index/be-my-eyes/

What if you could have a conversation with a website?

Only one way to find out!

Oh, and of course, please do not rely fully on the answers provided, and expect some rough edges.

Features

  • Describe a web page using image analysis of the web page
  • Ask a specific question about the site in text. It doesn't have to be something included in the general description
  • Record a short audio clip of you asking a question about the page
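
As a rough illustration of the first two features, the sketch below shows one way a screenshot and an optional question could be sent to the OpenAI Chat Completions endpoint. It is not taken from the extension's source: the function names `buildPageRequest` and `askAboutPage`, the default prompt, the `gpt-4o` model choice and the `max_tokens` value are all assumptions made for the example.

```javascript
// Hypothetical helpers, not the extension's actual code.
// Assumed default prompt used when the user asks no specific question.
const DEFAULT_PROMPT = "Describe this web page for a screen reader user.";

// Builds the JSON body for a Chat Completions request that pairs a text
// prompt with a screenshot supplied as an image data URL.
function buildPageRequest(screenshotDataUrl, question) {
  if (
    typeof screenshotDataUrl !== "string" ||
    !screenshotDataUrl.startsWith("data:image/")
  ) {
    throw new TypeError("screenshotDataUrl must be an image data URL");
  }
  const prompt =
    typeof question === "string" && question.trim()
      ? question.trim()
      : DEFAULT_PROMPT;
  return {
    model: "gpt-4o", // assumption: any vision-capable model could be used
    max_tokens: 500, // assumption: arbitrary cap to keep costs predictable
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: screenshotDataUrl } },
        ],
      },
    ],
  };
}

// Sends the request and returns the model's answer as plain text.
async function askAboutPage(apiKey, screenshotDataUrl, question) {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildPageRequest(screenshotDataUrl, question)),
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

In a Chrome extension, the screenshot could come from `chrome.tabs.captureVisibleTab`, which returns the visible area of the tab as a data URL. Whether this extension captures the page that way is not stated here.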

Install

To power the extension you need an OpenAI API Key.

OpenAI Platform

Create an account, add some limited funding and generate a key. All initial development and testing of the extension cost $0.50.

I would, however, recommend turning off auto-topup and enforcing spending limits just to make sure.

As this is still in development, clone this repo and see the quickstart guide.

Quickstart Guide

Contribution

Suggestions and pull requests are welcome!

I'm not a JavaScript developer; I developed this purely as a proof of concept.

Here's a list of potential enhancements and features to consider:

  • Addition of more keyboard shortcuts.
  • Capability to record while a keyboard shortcut is being held down.
  • Use of content scripts and functions to allow the large language model (LLM) to interact with the site.
  • Provision of a structured approach around a user goal and chat history.

This project was bootstrapped with Chrome Extension CLI.


Significant portions of this code were written with the help of GitHub Copilot.

Contributors

m-adams
