horenbergerb / osrs_optical_recognition
Uses various tools to parse a video feed of Oldschool Runescape into actionable intelligence
The current method, naive color thresholding, is not very good.
This new method would let you take a color sample from a tooltip and use it as the mean of a Gaussian distribution over colors. Each pixel would then be run through a binary hypothesis test against this distribution, and you could have a similar test for each text color. I think this model would give substantially better results than the naive thresholding I am currently using, and it would be easy to configure.
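A minimal sketch of the per-pixel test, assuming an isotropic Gaussian (the same standard deviation on every channel). The function name `gaussian_color_mask` and the `sigma` and `threshold` defaults are illustrative assumptions, not values from this repo.

```python
import numpy as np

def gaussian_color_mask(image, mean_color, sigma=12.0, threshold=3.0):
    """Return a boolean mask of pixels consistent with the sampled color.

    image:      H x W x 3 array of pixel colors
    mean_color: length-3 color sampled from the tooltip (the Gaussian mean)
    sigma:      assumed per-channel standard deviation (hypothetical default)
    threshold:  accept pixels within this many standard deviations
    """
    diff = image.astype(np.float64) - np.asarray(mean_color, dtype=np.float64)
    # Squared distance from the mean, measured in units of sigma.
    dist_sq = np.sum((diff / sigma) ** 2, axis=-1)
    # Binary decision: accept the "pixel came from this color" hypothesis
    # only when the pixel lies close enough to the mean.
    return dist_sq <= threshold ** 2
```

Each text color would get its own `mean_color` (and possibly its own `sigma`), so the configuration reduces to a short list of sampled colors.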
I want to write a script that lets you easily generate a config file for screen ROIs.
One way to do it would be a script that prompts you
"Click the top left corner of the window"
"Click the bottom right corner of the window"
And then you can add more named ROIs on top of that.
Another way would be some kind of search over the screen to find elements of the OSRS client.
Yet another way would be to use window names?
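The prompt-based approach could be sketched roughly as follows. This only covers turning clicked corners into a config; capturing the clicks themselves is left out. The function names, the JSON layout, and the choice to store named ROIs relative to the window are all assumptions made for illustration.

```python
import json

def roi_from_corners(top_left, bottom_right):
    """Convert two clicked (x, y) corners into a left/top/width/height box."""
    x1, y1 = top_left
    x2, y2 = bottom_right
    # min/abs make the result robust to the corners being clicked out of order.
    return {
        "left": min(x1, x2),
        "top": min(y1, y2),
        "width": abs(x2 - x1),
        "height": abs(y2 - y1),
    }

def build_config(window_corners, named_corners):
    """Build a config dict from the window corners plus any named ROIs.

    Named ROIs are stored relative to the window (an assumption), so the
    config would stay valid if the client window is moved.
    """
    window = roi_from_corners(*window_corners)
    rois = {}
    for name, corners in named_corners.items():
        roi = roi_from_corners(*corners)
        roi["left"] -= window["left"]
        roi["top"] -= window["top"]
        rois[name] = roi
    return {"window": window, "rois": rois}

# Example with made-up click coordinates.
config = build_config(
    window_corners=((100, 50), (900, 650)),
    named_corners={"minimap": ((700, 60), (880, 220))},
)
config_text = json.dumps(config, indent=2)
```

Writing `config_text` to a file would give the config that the rest of the code reads its ROIs from.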
I made a PowerPoint and a paper on this project, so I should really have a better README.
Not sure how to organize everything. Maybe an overview in the README and then more detailed breakdown in the docs folder?
It would be super convenient to have a testing system in place so I could automatically verify all the features work as intended after each push. I don't think it would take too long to set up.
I should start implementing unit tests and look into how to streamline the setup process for this repo.
Currently, future frame prediction uses a 3D convolutional network. It's super janky: it has lots of problems and doesn't easily let you vary the number of input frames.
I found a survey of video prediction methodologies.
It seems like I might want to try one of the methods used on M-MNIST, since that dataset has some similarities to our segmentation masks.
Folded recurrent neural networks seem to perform well, but they're also somewhat complicated and don't seem to be discussed often.
CrevNet is interesting and puts an emphasis on efficiency, but it also seems pretty complex.
This paper was one of the most well-cited with pretty decent results on M-MNIST. This one looks pretty interesting and fairly achievable. Surprised it doesn't have convolution. Might be good to try implementing this one next.
I'd like to integrate a Kalman filter into the object detection algorithm.
I think a simple "model" for objects like chickens would simply assume that object positions are stable and tend to displace locally according to a Gaussian distribution. You could do something similar for trees, but you'd expect much smaller variance for the Gaussian.
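A minimal sketch of that stable-position model as a Kalman filter, where the state is just the object's (x, y) position and each frame adds Gaussian displacement noise. The class name `RandomWalkKalman` and all variance defaults are assumptions for illustration; nothing here is tied to the existing detection code.

```python
import numpy as np

class RandomWalkKalman:
    """Kalman filter for one object whose position drifts as a Gaussian random walk."""

    def __init__(self, initial_position, initial_var=100.0,
                 process_var=25.0, measurement_var=9.0):
        self.x = np.asarray(initial_position, dtype=float)  # position estimate
        self.P = np.eye(2) * initial_var      # uncertainty of the estimate
        self.Q = np.eye(2) * process_var      # per-frame displacement variance
        self.R = np.eye(2) * measurement_var  # detector noise variance

    def predict(self):
        # Positions are assumed stable, so the estimate stays put and only
        # the uncertainty grows by the displacement variance.
        self.P = self.P + self.Q
        return self.x

    def update(self, measurement):
        z = np.asarray(measurement, dtype=float)
        S = self.P + self.R                # innovation covariance
        K = self.P @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ (z - self.x)
        self.P = (np.eye(2) - K) @ self.P
        return self.x

# A chicken wanders, so it gets a large displacement variance; a tree gets a tiny one.
chicken = RandomWalkKalman((0.0, 0.0), initial_var=1.0, process_var=25.0)
tree = RandomWalkKalman((0.0, 0.0), initial_var=1.0, process_var=0.01)
```

With a small `process_var`, the filter trusts its previous estimate and mostly ignores a detection that jumps, which is the behavior you would want for trees. A large `process_var` lets the estimate follow the detections more closely.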
If you wanted to get really fancy, you could use a different model when the camera is being moved. In this case, you'd need to model the camera movement as object displacements. I will try to post some resources later for a simple way to do this.
Right now data collection is a (sloppy) pipeline for generating 32x32 images paired with text labels.
Ideally data would be collected passively in the most general possible form, such as screenshots of the play screen paired with mouse coordinates. Then, these could later be processed into collections of more useful data.
Data that would be interesting to extract includes:
Currently the demos are just a very messy script of commented-out lines.
Ideally the demos would be organized in src/Demos. The demos would be run via a demos.py script in the root directory which takes command line arguments that specify which demo to run.
Additionally, some of the machine learning scripts aren't really "demos" at all, such as collecting training data, training the neural networks, and running the actual segmentation. Maybe these belong somewhere else.
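The `demos.py` runner described above could look roughly like this sketch using `argparse`. The demo names and the placeholder functions are hypothetical; in the real layout they would presumably be imported from src/Demos.

```python
import argparse

# Placeholder demos (hypothetical names standing in for modules in src/Demos).
def run_tooltip_demo():
    print("running tooltip demo")
    return "tooltip"

def run_segmentation_demo():
    print("running segmentation demo")
    return "segmentation"

# Registry mapping the command-line name to the function that runs the demo.
DEMOS = {
    "tooltip": run_tooltip_demo,
    "segmentation": run_segmentation_demo,
}

def main(argv=None):
    parser = argparse.ArgumentParser(description="Run one of the demos.")
    parser.add_argument("demo", choices=sorted(DEMOS),
                        help="name of the demo to run")
    args = parser.parse_args(argv)
    return DEMOS[args.demo]()
```

Calling `main()` from an `if __name__ == "__main__":` guard in `demos.py` would let you run `python demos.py tooltip`. Scripts that aren't really demos, such as data collection and training, could get their own entry points or a separate registry.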