explorations in multisensory and multiaction global neuronal workspace
pliang279 / workspace
License: MIT License
We want to detect whether speech is present in a video's audio track. We use the Whisper API to implement this.
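A minimal sketch of the speech-presence check, assuming the transcription step yields segments shaped like those in the open-source `whisper` package's `model.transcribe(path)["segments"]` (dicts with `start`, `end`, `text`, and `no_speech_prob`). The function names and the 0.5 threshold are illustrative assumptions, not part of the project.

```python
from typing import Dict, List, Tuple


def speech_intervals(
    segments: List[Dict], max_no_speech_prob: float = 0.5
) -> List[Tuple[float, float]]:
    """Return (start, end) times of segments that likely contain speech.

    Assumption: each segment is a dict with "start", "end", "text" and
    "no_speech_prob" keys, as in Whisper-style transcription output.
    """
    intervals = []
    for seg in segments:
        has_text = bool(seg.get("text", "").strip())
        likely_speech = seg.get("no_speech_prob", 0.0) < max_no_speech_prob
        if has_text and likely_speech:
            intervals.append((seg["start"], seg["end"]))
    return intervals


def contains_speech(segments: List[Dict], max_no_speech_prob: float = 0.5) -> bool:
    """True if at least one segment is judged to contain speech."""
    return len(speech_intervals(segments, max_no_speech_prob)) > 0
```

The intervals, rather than a single boolean, may be the more useful output, since later processors could align them with video frames.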
We need to add the Video-LLaMA model and prediction scripts for video understanding.
We want to give our model the ability to do mathematical calculations, using the Wolfram Alpha API.
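One way this could look, assuming the Wolfram Alpha Short Answers endpoint (`/v1/result`, with `appid` and `i` query parameters) is the API in question. The function names are hypothetical, and the app ID must come from your own Wolfram Alpha developer account.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Short Answers API, which returns a plain-text answer.
WOLFRAM_SHORT_ANSWERS_URL = "https://api.wolframalpha.com/v1/result"


def build_wolfram_url(query: str, app_id: str) -> str:
    """Build the request URL; urlencode escapes characters such as '+'."""
    params = {"appid": app_id, "i": query}
    return f"{WOLFRAM_SHORT_ANSWERS_URL}?{urlencode(params)}"


def ask_wolfram(query: str, app_id: str, timeout: float = 10.0) -> str:
    """Send the query and return the plain-text answer (needs network access)."""
    with urlopen(build_wolfram_url(query, app_id), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the escaping logic easy to check offline.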
We want to update the README with the necessary information.
We want to support the Google Search Python API so that our model can connect to the internet.
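A rough sketch of what the search call might involve, assuming Google's Custom Search JSON API (parameters `key`, `cx`, `q`, `num`; results under `items`) rather than any particular Python client. The helper names are hypothetical, and the API key and engine ID are placeholders you would supply.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

# Assumption: the Custom Search JSON API is the search backend.
CUSTOM_SEARCH_URL = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query: str, api_key: str, engine_id: str, num: int = 5) -> str:
    """Build the request URL for a search query."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return f"{CUSTOM_SEARCH_URL}?{urlencode(params)}"


def extract_results(response_json: str) -> List[Dict[str, str]]:
    """Reduce the JSON response to title, link, and snippet per result."""
    data = json.loads(response_json)
    return [
        {
            "title": item.get("title", ""),
            "link": item.get("link", ""),
            "snippet": item.get("snippet", ""),
        }
        for item in data.get("items", [])
    ]
```

Trimming each result to three short fields keeps the text passed to the model compact.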
Add a .gitignore so that checkpoints and other unimportant files are not uploaded.
We want to add code that supports slicing videos into clips, for Ego4D and other video data.
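The slicing step could be sketched as follows, assuming fixed-length clips and the `ffmpeg` command-line tool for the actual cutting. The function names and the fixed-length strategy are assumptions for illustration; Ego4D annotations may call for different boundaries.

```python
from typing import List, Tuple


def clip_boundaries(duration: float, clip_len: float) -> List[Tuple[float, float]]:
    """Split [0, duration) into consecutive clips of at most clip_len seconds."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    bounds = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while i * clip_len < duration:
        start = i * clip_len
        bounds.append((start, min(start + clip_len, duration)))
        i += 1
    return bounds


def ffmpeg_clip_command(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg command that copies [start, end) of src into dst.

    "-c copy" avoids re-encoding, so cuts snap to nearby keyframes.
    """
    return [
        "ffmpeg", "-ss", str(start), "-i", src,
        "-t", str(end - start), "-c", "copy", dst,
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. If frame-accurate cuts matter, dropping `-c copy` and re-encoding is the usual trade-off.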
Wrap the API calls for Google Search and Mathematica.
We want to create a demo based on the Ego4D dataset. We plan to cherry-pick several video segments as examples for the demo.
We want to detect the faces that appear in the frames of our video data, using either an API or a locally running model. Based on the detected faces, we want to build emotion and gender classification.
Synergistic information could possibly arise from modality disagreement or from Visual QA-type tasks.
We want to classify the speaker's emotion from their audio. We use a HuBERT model as the encoder.
We need to expose each processor model as an API for later inference.
Ideally, we should write a generic Flask script that can serve each model; each model should wrap its prediction function so that the Flask script can call it.
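The wrapping pattern described above could be sketched without any web framework, so the dispatch logic stays testable on its own. All names here (`PREDICTORS`, `register_model`, `handle_predict`, the `"input"` field) are hypothetical, and the `echo` predictor is only a stand-in for a real model.

```python
from typing import Any, Callable, Dict, Tuple

# Registry mapping a model name to its wrapped prediction function.
PREDICTORS: Dict[str, Callable[[Any], Any]] = {}


def register_model(name: str):
    """Decorator that each model uses to expose its prediction function."""
    def decorator(fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
        PREDICTORS[name] = fn
        return fn
    return decorator


def handle_predict(model_name: str, payload: Dict[str, Any]) -> Tuple[Dict[str, Any], int]:
    """Generic handler: look up the model, run it, return (body, status code)."""
    predictor = PREDICTORS.get(model_name)
    if predictor is None:
        return {"error": f"unknown model: {model_name}"}, 404
    if "input" not in payload:
        return {"error": "missing 'input' field"}, 400
    return {"model": model_name, "output": predictor(payload["input"])}, 200


@register_model("echo")
def echo_predict(x: Any) -> Any:
    """Placeholder predictor that returns its input unchanged."""
    return x
```

In a Flask script, a single route such as `/predict/<model_name>` could call `handle_predict(model_name, request.get_json())` and pass the returned body through `jsonify`, so that adding a model only requires one decorated function.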
Support taking an image as input and returning OCR results.