
workspace's Introduction

workspace

explorations in multisensory and multiaction global neuronal workspace

workspace's People

Contributors

lwaekfjlk, pliang279

Forkers

applexi

workspace's Issues

[FEAT]: Support ASR API

Description

We want to detect whether speech is present in the audio track of a video. We plan to use the Whisper API to implement this.
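As a rough sketch of the detection step: assuming the transcription result is a list of segment dictionaries with "start", "end", "text" and "no_speech_prob" keys (the shape returned by the open-source openai-whisper package; the hosted API may differ), speech intervals could be derived as follows. The function names and the 0.5 threshold are hypothetical choices for illustration, not part of this repository.

```python
from typing import Dict, List, Tuple


def speech_intervals(
    segments: List[Dict], max_no_speech_prob: float = 0.5
) -> List[Tuple[float, float]]:
    """Return merged (start, end) intervals, in seconds, that likely contain speech.

    Assumption: each segment has "start", "end", "text" and optionally
    "no_speech_prob". The threshold is an illustrative default.
    """
    intervals: List[Tuple[float, float]] = []
    for seg in segments:
        # Skip segments the recognizer itself considers likely silence.
        if seg.get("no_speech_prob", 0.0) > max_no_speech_prob:
            continue
        # Skip segments with no transcribed text.
        if not seg.get("text", "").strip():
            continue
        start, end = float(seg["start"]), float(seg["end"])
        if intervals and start <= intervals[-1][1]:
            # Touching or overlapping the previous interval: merge them.
            prev_start, prev_end = intervals[-1]
            intervals[-1] = (prev_start, max(prev_end, end))
        else:
            intervals.append((start, end))
    return intervals


def has_speech(segments: List[Dict], max_no_speech_prob: float = 0.5) -> bool:
    """True if at least one segment passes the speech filter."""
    return bool(speech_intervals(segments, max_no_speech_prob))
```

With the openai-whisper package installed, the segments could come from whisper.load_model("base").transcribe(path)["segments"]; that call is left out of the sketch so it runs without downloading a model.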

Additional Information

No response

[FEAT]: Add video-llama model

Description

We need to add the Video-LLaMA model and prediction scripts for video understanding.

Additional Information

No response

[FEAT]: Support wolfram alpha API

Description

We want to give our model the ability to do mathematical calculations. We plan to use the Wolfram Alpha API for this.
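One possible way to call Wolfram Alpha is sketched below. It assumes the Short Answers endpoint (/v1/result), which returns a plain-text answer; the issue does not say which Wolfram Alpha API is intended. The helper names are hypothetical, and a real app ID is required for the network call.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Short Answers API is the endpoint of choice.
SHORT_ANSWERS_ENDPOINT = "https://api.wolframalpha.com/v1/result"


def build_query_url(question: str, app_id: str) -> str:
    """Build the request URL; urlencode escapes characters such as '+'."""
    return f"{SHORT_ANSWERS_ENDPOINT}?{urlencode({'appid': app_id, 'i': question})}"


def ask_wolfram(question: str, app_id: str, timeout: float = 10.0) -> str:
    """Send the question and return the plain-text answer.

    Needs network access and a valid app ID; errors such as an
    unanswerable query surface as urllib.error.HTTPError.
    """
    with urlopen(build_query_url(question, app_id), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the escaping logic easy to check without an app ID.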

Additional Information

No response

[FEAT]: Support google search API

Description

We want to support the Google Search Python API so that our model can connect to the internet.

Additional Information

No response

[FEAT]: Add .gitignore

Description

Add a .gitignore so that checkpoints and other unimportant files are not uploaded.

Additional Information

No response

[FEAT]: Add video slicing code

Description

We want to add code that supports video slicing, for Ego4D or other video clipping tasks.
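A minimal sketch of what such slicing code might look like, assuming fixed-length clips cut with the ffmpeg command-line tool (the issue names no tool or clip length). The function names are hypothetical.

```python
import subprocess
from typing import List, Optional, Tuple


def clip_windows(
    duration: float, clip_len: float, stride: Optional[float] = None
) -> List[Tuple[float, float]]:
    """Split [0, duration] into (start, end) windows in seconds.

    stride defaults to clip_len (non-overlapping clips); the last
    window is truncated at the end of the video.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    stride = clip_len if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be positive")
    duration = float(duration)
    windows: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration:
        windows.append((start, min(start + clip_len, duration)))
        start += stride
    return windows


def ffmpeg_clip_command(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg command that copies [start, end] from src into dst.

    "-c copy" avoids re-encoding, so cuts snap to nearby keyframes;
    drop it if frame-accurate cuts matter more than speed.
    """
    return [
        "ffmpeg", "-y",
        "-ss", str(start),
        "-i", src,
        "-t", str(end - start),
        "-c", "copy",
        dst,
    ]


def slice_video(src: str, dst_pattern: str, duration: float, clip_len: float) -> List[str]:
    """Write one clip per window; dst_pattern needs an {index} placeholder."""
    outputs = []
    for index, (start, end) in enumerate(clip_windows(duration, clip_len)):
        dst = dst_pattern.format(index=index)
        subprocess.run(ffmpeg_clip_command(src, dst, start, end), check=True)
        outputs.append(dst)
    return outputs
```

For example, slice_video("video.mp4", "clip_{index:03d}.mp4", duration=25, clip_len=10) would write three clips, the last one five seconds long, provided ffmpeg is on the PATH.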

Additional Information

No response

[FEAT]: Have access to data examples

Description

We want to create a demo based on the Ego4D dataset. We plan to cherry-pick several video segments as examples for the demo.

Additional Information

No response

[FEAT]: Support Face Detection

Description

We want to detect the faces that appear in the frames of our video data, using either an API or a locally running model. On top of the detected faces, we want to build emotion classification and gender classification.

Additional Information

No response

[FEAT]: Support Audio Emotion Detection

Description

We want to classify a speaker's emotion based on their audio, using a HuBERT model as the encoder.

Additional Information

No response

[FEAT]: Wrap Video-LLaMA as API with Flask

Description

Each processor model needs to be exposed as an API for later inference.

Additional Information

Ideally, we should write a generic Flask script that can serve every model; each model should wrap its prediction function so that the Flask script can call it.
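One way the generic script could be organized is sketched below, assuming each model registers a prediction function that takes and returns a JSON-compatible dictionary. The registry, the /predict/<name> route and the "echo" placeholder model are all hypothetical and stand in for the real processors.

```python
from typing import Any, Callable, Dict

Predictor = Callable[[Dict[str, Any]], Dict[str, Any]]

# Hypothetical registry mapping a model name to its prediction function.
PREDICTORS: Dict[str, Predictor] = {}


def register(name: str) -> Callable[[Predictor], Predictor]:
    """Decorator that each model uses to expose its prediction function."""
    def decorator(fn: Predictor) -> Predictor:
        PREDICTORS[name] = fn
        return fn
    return decorator


def run_prediction(name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch a payload to the registered model; raises KeyError if unknown."""
    return PREDICTORS[name](payload)


def create_app():
    """Build the generic Flask app that serves every registered model."""
    # Imported lazily so the registry can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict/<name>", methods=["POST"])
    def predict(name: str):
        if name not in PREDICTORS:
            return jsonify({"error": f"unknown model: {name}"}), 404
        payload = request.get_json(silent=True) or {}
        return jsonify(run_prediction(name, payload))

    return app


@register("echo")
def echo_predict(payload: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder standing in for a real model such as Video-LLaMA.
    return {"output": payload.get("input")}
```

Calling create_app().run(port=5000) would then serve every registered model from one process; a new model only needs its own @register-decorated function, and the Flask code stays unchanged.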
