Topic: llava Goto Github
Something interesting about llava
llava,RestAI is an AIaaS (AI as a Service) open-source platform built on top of LlamaIndex, Ollama, and HF Pipelines. Supports any public LLM supported by LlamaIndex and any local LLM supported by Ollama, with precise embeddings usage and tuning.
User: apocas
Home Page: https://apocas.github.io/restai/
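Several entries in this list (RestAI above, the llama.cpp servers below) serve LLaVA through Ollama or similar local runtimes. As a minimal sketch, assuming a local Ollama install with the `llava` model pulled, a client would build a JSON request for Ollama's `/api/generate` endpoint with base64-encoded images; only the payload construction is shown here, without actually sending it.

```python
import base64
import json

def build_llava_request(prompt: str, image_bytes: bytes, model: str = "llava") -> str:
    """Return the JSON body Ollama's /api/generate expects:
    a prompt plus base64-encoded images for multimodal models."""
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # request a single complete response
    }
    return json.dumps(payload)

# Any raw image bytes are encoded the same way; real usage would POST this
# body to http://localhost:11434/api/generate on a running Ollama server.
body = build_llava_request("Describe this image.", b"\x89PNG fake bytes")
print(json.loads(body)["model"])
```

The same payload shape works for any Ollama-hosted vision model; only the `model` field changes.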
llava,Docker image for LLaVA: Large Language and Vision Assistant
User: ashleykleynhans
llava,LLaVA: Large Language and Vision Assistant | RunPod Serverless Worker
User: ashleykleynhans
llava,Docker image for SUPIR (Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild)
User: ashleykleynhans
llava,Your all-in-one platform to build and use AI apps effortlessly on your own computer.
Organization: blib-la
llava,ChatGPT's explosive popularity marks a key step toward AGI. This project compiles open-source alternatives to ChatGPT, including large text models and multimodal models, for everyone's convenience.
User: chenking2020
llava,A Python tool to evaluate the performance of VLMs in the medical domain.
User: corentin-ryr
llava,A Framework of Small-scale Large Multimodal Models
Organization: dlcv-buaa
Home Page: https://arxiv.org/abs/2402.14289
llava,FreeGenius AI, an advanced AI assistant that can talk and take multi-step actions. Supports numerous open-source LLMs via Llama.cpp, Ollama, or the Groq Cloud API, with optional integration with AutoGen agents, the OpenAI API, Google Gemini Pro, and unlimited plugins.
User: eliranwong
Home Page: https://letmedoit.ai
llava,SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
User: fanghua-yu
Home Page: http://supir.xpixel.group/
llava,Chat with large language models about the contents of an image via this native desktop client for Windows, macOS, and Linux.
User: fmxexpress
Home Page: https://www.fmxexpress.com/
llava,[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
User: fuxiaoliu
Home Page: https://fuxiaoliu.github.io/LRV/
llava,Famous Vision Language Models and Their Architectures
User: gokayfem
llava,Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
User: gokayfem
llava,Chain of Images for Intuitively Reasoning
Organization: graphpku
Home Page: https://huggingface.co/spaces/fxmeng/Chain-of-Image
llava,[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
User: haotian-liu
Home Page: https://llava.hliu.cc
llava,Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.
User: herrera-luis
llava,Code for "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"
User: jameszhou-gl
Home Page: https://arxiv.org/pdf/2312.07424.pdf
llava,Tag manager and captioner for image datasets
User: jhc13
llava,LLaVA inference with multiple images at once for cross-image analysis.
User: mapluisch
llava,"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Organization: mbzuai-oryx
Home Page: https://mbzuai-oryx.github.io/Video-ChatGPT
llava,A Multimodal Discord bot with machine learning functions, including LLM chat, Image generation, and Speech Generation capabilities
User: meatfucker
llava,A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Organization: modelscope
llava,ms-swift: Use PEFT or full-parameter training to fine-tune 200+ LLMs or 15+ MLLMs
Organization: modelscope
Home Page: https://github.com/modelscope/swift/blob/main/docs/source/LLM/index.md
llava,An extension of the Planner-Actor-Reporter framework applied to autonomous vehicles in Highway-Env and CARLA.
User: oliverc1623
llava,Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks
Organization: open-compass
Home Page: https://rank.opencompass.org.cn/leaderboard-multimodal
llava,Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretraining models and a diffusion model toolbox. Offers high performance and flexibility.
Organization: paddlepaddle
llava,Image Classification Testing with LLMs
User: robert-mcdermott
llava,Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Organization: roboflow
Home Page: https://maestro.roboflow.com
llava,Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
Organization: salt-nlp
Home Page: https://llavar.github.io/
llava,A C#/.NET library to run LLMs (🦙 LLaMA/LLaVA) on your local device efficiently.
Organization: scisharp
Home Page: https://scisharp.github.io/LLamaSharp
llava,👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
User: skalskip
llava,Embed arbitrary modalities (images, audio, documents, etc) into large language models.
User: sshh12
llava,🧘🏻♂️ KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
User: thomas-yanxin
llava,[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Organization: tianyi-lab
llava,LLaVA server (llama.cpp).
User: trzy
llava,This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
Organization: ucsc-vlaa
llava,Unified Multi-modal IAA Baseline and Benchmark
User: uniaa-mllm
llava,Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Organization: unum-cloud
Home Page: https://unum-cloud.github.io/uform/
llava,Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
User: victorwz
Home Page: https://mlm-filter.github.io/
llava,[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Organization: wisconsinaivision
Home Page: https://vip-llava.github.io/
llava,Kani extension for supporting vision-language models (VLMs). Comes with model-agnostic support for GPT-Vision and LLaVA.
User: zhudotexe
Home Page: https://kani-vision.readthedocs.io/en/latest/