working on video understanding
tomchen-ctj
Name: Tongjia
Type: User
Bio: [email protected]
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
[ICLR'23] AIM: Adapting Image Models for Efficient Video Understanding
Examples of how to create colorful, annotated equations in LaTeX using TikZ.
ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX
A collection of resources and papers on Diffusion Models and Score-based Models, a dark horse in the field of Generative Models
Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
Reading list for research topics in multimodal machine learning
A comprehensive collection of awesome research and other items about video domain adaptation
A curated list of awesome vision and language resources (still under construction... stay tuned!)
A collection of papers on transformers in vision. Awesome Transformer with Computer Vision (CV)
A curated list of prompt-based papers in computer vision and vision-language learning.
Contrastive Language-Image Pretraining
[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning
Prompt Learning for Vision-Language Models
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
Code for the 2021 HNU EEIT Computer Vision course project
[CVPRW'23] First Place Solution to the CVPR'2023 AQTC Challenge
Code for Motion-aware Contrastive Video Representation Learning via Foreground-background Merging (CVPR 2022)
A guide to cooking at home for programmers.
Learning to Prompt (L2P) for Continual Learning @ CVPR22 and DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning @ ECCV22
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Code release for "Learning Video Representations from Large Language Models"
Inference code for LLaMA models
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Recent LLM-based CV and related works. Welcome to comment/contribute!
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
🔥 Chat with over 10k frames of video!
MulimgViewer is a multi-image viewer that can open multiple images in one interface, which is convenient for image comparison and image stitching.