sy-zhang
Name: Songyang Zhang
Type: User
Company: Amazon
Bio: An applied scientist @ Amazon AGI, building video generative models.
Twitter: zhangsongyang
Location: Santa Clara, CA
Blog: sy-zhang.github.io
AAAI'20 - Learning 2D Temporal Localization Networks for Moment Localization with Natural Language
The source code of the paper: "To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression"
Living in Rochester with joy (especially for University of Rochester)
Codes of our paper: "BSN: Boundary Sensitive Network for Temporal Action Proposal Generation"
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Evaluation code for Dense-Captioning Events in Videos
Official Tensorflow Implementation of the paper "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning" in CVPR 2018, with code, model and prediction results.
Pre-trained ELMo Representations for Many Languages
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Codes for our WACV2017 paper: "On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks"
I3D feature extractor
Release for Improved Denoising Diffusion Probabilistic Models
Reimplementation of the paper "LIME: A Method for Low-light IMage Enhancement" in ACM MM 2016.
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
Github for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"
MAC: Mining Activity Concepts for Language-based Temporal Localization
Video-aided Unsupervised Grammar Induction, NAACL'21 [best long paper]
A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"
Codes for our EMNLP2022 paper: "Learning a Grammar Inducer by Watching Millions of Instructional YouTube Videos"
Fast, general, and tested differentiable structured prediction in PyTorch
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
Deeper and Wider Siamese Networks for Real-Time Visual Tracking
A latent text-to-image diffusion model
Config files for my GitHub profile.