Laion - the big plan

Overarching goal: Enable the open source community to openly build datasets, papers, models and tools in order to let AGI benefit humankind even faster.

Intro

Laion was initiated with the Laion5B project, which successfully produced a dataset of 5 billion (image, text) pairs by processing Common Crawl and filtering with CLIP. That method showed that it is cheap to collect large-scale datasets from the web using models like CLIP, which score the similarity between items from two modalities.
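The filtering step described above can be sketched as follows. This is a minimal illustration, not the actual Laion pipeline: the function names, the toy three-dimensional embeddings, and the threshold value are all assumptions made for the example. In practice the embeddings would come from the image and text encoders of a model like CLIP.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs, threshold):
    """Keep the (url, caption) pairs whose image and text embeddings are similar enough.

    Each item in `pairs` is (url, caption, image_embedding, text_embedding).
    """
    kept = []
    for url, caption, image_emb, text_emb in pairs:
        if cosine_similarity(image_emb, text_emb) >= threshold:
            kept.append((url, caption))
    return kept


# Hand-written stand-ins for encoder outputs (hypothetical values).
candidates = [
    ("https://example.com/cat.jpg", "a photo of a cat",
     [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]),
    ("https://example.com/ad.jpg", "click here to subscribe",
     [0.9, 0.1, 0.0], [0.0, 0.1, 0.9]),
]

# The threshold is an illustrative value, not one taken from this document.
kept = filter_pairs(candidates, threshold=0.3)
print(kept)
```

Only the first candidate survives, because its caption embedding points in nearly the same direction as its image embedding. The same pattern applies to any pair of modalities for which a joint embedding model exists.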

Many models have been trained on laion400m, proving the value of this method. In particular, openclip reproduced the results of the original OpenAI CLIP.

Let’s extend that method to more modalities!

Overall rationale

These projects and directions are ones that we would like to promote and help. We do not claim ownership of these projects as an organization. The people who build the projects own them.

Directions

Methods

  • Open source: releasing everything openly
    • Code: on github with an open license
    • Model: freely distributed models
    • Dataset: freely distributed datasets
  • Open development: development is done in public on github and discord, and everyone is encouraged to participate, whatever their nationality, age, or diploma

Axes of work:

  • Open tools
    • Dataset collection
    • Dataset preparation
    • Distributed inference
    • Distributed training
    • Evaluation
  • Datasets
    • Open distribution
    • Papers
  • Models
    • Open training
    • Open distribution

Scientific domains

  • All modalities dataset building
    • Text image
    • Text audio
    • Text video
    • Text 3d
  • Contrastive and generative
    • Contrastive
      • Text image
      • Text audio
      • Text video
    • Generative
      • Text to image
      • Image to text

Projects

These projects are collaborations between many people. If you want to know who, check the links and ask in discord. We are open to new collaborators!

Dataset

| Name | Modality | Status | Notes |
|---|---|---|---|
| Laion400m | image/text | Done | > 10 papers using it |
| Laion5B | image/text | Done | Largest open text/image dataset |
| Laion5B high-resolution | image/text | Done | Largest open high-resolution text/image dataset |
| Laion5B balanced | image/text | Just started | Balanced LAION-5B dataset for more efficient training |
| laion3d | 3d/image/text | Just started | Trying to expand the laion idea to 3d |
| Audio dataset | text/audio | Started | Started to be used to train an audio clip |

Model

| Name | Modality | Kind | Status | Notes |
|---|---|---|---|---|
| Openclip B/16 | image/text | contrastive | Released | Reproduced openai clip |
| Dalle2 prior/decoder | image/text | generative | Just started | Trying to reproduce dalle2 |
| Clipcap | image/text | generative | Works | Generate text from embedding |
| Audio clip | audio/text | contrastive | Training ongoing | |
| Video clip | video/text | contrastive | Just started | |
| Mclip vit-l/14 | image/text | contrastive | Just started | Aligning a text encoder to be in clip space. Collaboration with mclip author |
| Super-resolution | image->image | generative | Just started | Using a high-resolution subset of LAION-5B for the training |
| Medical CLIP | image/text | contrastive | Just started | Using CLIP to improve MRI -> image synthesis (see project outline) |
| NSFW detection | image/text | contrastive | Done | Using CLIP to detect NSFW in images |
| Watermark detection | image/text | contrastive | Done | Using CLIP to detect watermarks in images |
| electric sheep | image/text/audio/video | contrastive/generative | Just started | Train contrastive and generative models on all modalities |

Tools

| Name | Modality | Status | Notes |
|---|---|---|---|
| img2dataset | image/text | working | Used to download laion5B in a week, twice |
| Clip retrieval | image/text | working | Used to compute 5B Vit-L/14 embeddings |
| Crawlingathome-gpu-hcloud | image/text | done | Filtering common crawl using clip |
| clip benchmark | image/text | wip | Evaluating clip performance easily |

Papers

| Name | Modality | Status | Notes |
|---|---|---|---|
| Laion400m | image/text | On arXiv | Cited many times |
| laion5B | image/text | Started | |

Contributors

robvanvolt, rom1504, christophschuhmann
