humansignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

Home Page: https://labelstud.io

License: Apache License 2.0


label-studio's Introduction


Website · Docs · Twitter · Join the Slack Community

What is Label Studio?

Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.

Gif of Label Studio annotating different types of data

Have a custom dataset? You can customize Label Studio to fit your needs. Read an introductory blog post to learn more.

Try out Label Studio

Install Label Studio locally, or deploy it in a cloud instance. Or, sign up for a free trial of our Enterprise edition.

Install locally with Docker

The official Label Studio Docker image is published on Docker Hub and can be downloaded with docker pull. Run Label Studio in a Docker container and access it at http://localhost:8080.

docker pull heartexlabs/label-studio:latest
docker run -it -p 8080:8080 -v $(pwd)/mydata:/label-studio/data heartexlabs/label-studio:latest

You can find all the generated assets, including the SQLite3 database (label_studio.sqlite3) and uploaded files, in the ./mydata directory.

Override default Docker install

You can override the default launch command by appending the new arguments:

docker run -it -p 8080:8080 -v $(pwd)/mydata:/label-studio/data heartexlabs/label-studio:latest label-studio --log-level DEBUG

Build a local image with Docker

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Run with Docker Compose

The Docker Compose script provides a production-ready stack consisting of the following components:

  • Label Studio
  • Nginx - proxy web server used to load various static data, including uploaded audio, images, etc.
  • PostgreSQL - production-ready database that replaces less performant SQLite3.

To start using the app from http://localhost run this command:

docker-compose up
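For illustration, a compose stack along these lines can be sketched as follows. This is an assumption-laden sketch, not the repository's actual docker-compose.yml: the service names, image tags, and environment variables shown here are illustrative and must be checked against the real file.

```yaml
# Illustrative sketch only; see the repository's docker-compose.yml for the real file.
version: "3.3"
services:
  db:
    image: postgres:11.5            # assumed Postgres version
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    volumes:
      - ./postgres-data:/var/lib/postgresql/data
  app:
    image: heartexlabs/label-studio:latest
    depends_on:
      - db
    environment:
      DJANGO_DB: default            # assumed switch from SQLite to the PostgreSQL backend
      POSTGRE_HOST: db              # assumed variable name pointing at the db service
    volumes:
      - ./mydata:/label-studio/data
    ports:
      - "8080:8080"
```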

Run with Docker Compose + MinIO

You can also run it with an additional MinIO server for local S3 storage. This is particularly useful when you want to test the behavior with S3 storage on your local system. To start Label Studio in this way, you need to run the following command:

# Add sudo on Linux if you are not a member of the docker group
docker compose -f docker-compose.yml -f docker-compose.minio.yml up -d

If you do not have a static IP address, you must create an entry in your hosts file so that both Label Studio and your browser can resolve the MinIO server. For more detailed instructions, refer to our guide on storing data.
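For example, assuming the MinIO service is reachable under the hostname minio inside the compose network (an assumed service name; check your compose file), the hosts entry could look like:

```
# /etc/hosts (or C:\Windows\System32\drivers\etc\hosts on Windows)
127.0.0.1   minio
```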

Install locally with pip

# Requires Python >=3.8
pip install label-studio

# Start the server at http://localhost:8080
label-studio

Install locally with poetry

### install poetry
pip install poetry

### set poetry environment
poetry new my-label-studio
cd my-label-studio
poetry add label-studio

### activate poetry environment
poetry shell

### Start the server at http://localhost:8080
label-studio

Install locally with Anaconda

conda create --name label-studio
conda activate label-studio
conda install psycopg2
pip install label-studio

Install for local development

You can run the latest Label Studio version locally without installing the package from PyPI.

# Install all package dependencies
pip install poetry
poetry install
# Run database migrations
python label_studio/manage.py migrate
python label_studio/manage.py collectstatic
# Start the server in development mode at http://localhost:8080
python label_studio/manage.py runserver

Deploy in a cloud instance

You can deploy Label Studio with one click to Heroku, Microsoft Azure, or Google Cloud Platform.

Apply frontend changes

For information about updating the frontend, see label-studio/web/README.md.

Install dependencies on Windows

To run Label Studio on Windows, download and install wheel packages from the Gohlke builds that match your Python version:

# Upgrade pip 
pip install -U pip

# If you're running Win64 with Python 3.8, install the packages downloaded from Gohlke:
pip install lxml-4.5.0-cp38-cp38-win_amd64.whl

# Install label studio
pip install label-studio

Run test suite

To add the test dependencies to your local install:

poetry install --with test

Alternatively, it is possible to run the unit tests from a Docker container in which the test dependencies are installed:

make build-testing-image
make docker-testing-shell

In either case, to run the unit tests:

cd label_studio

# sqlite3
DJANGO_DB=sqlite DJANGO_SETTINGS_MODULE=core.settings.label_studio pytest -vv

# postgres (assumes default postgres user,db,pass. Will not work in Docker
# testing container without additional configuration)
DJANGO_DB=default DJANGO_SETTINGS_MODULE=core.settings.label_studio pytest -vv

What you get from Label Studio

Screenshot of Label Studio data manager grid view with images

  • Multi-user labeling with sign up and login; every annotation you create is tied to your account.
  • Multiple projects to work on all your datasets in one instance.
  • Streamlined design helps you focus on your task, not how to use the software.
  • Configurable label formats let you customize the visual interface to meet your specific labeling needs.
  • Support for multiple data types including images, audio, text, HTML, time-series, and video.
  • Import from files or from cloud storage such as Amazon S3 or Google Cloud Storage, including JSON, CSV, TSV, RAR, and ZIP archives.
  • Integration with machine learning models so that you can visualize and compare predictions from different models and perform pre-labeling.
  • Embed it in your data pipeline: the REST API makes it easy to make Label Studio part of your pipeline.

Included templates for labeling data in Label Studio

Label Studio includes a variety of templates to help you label your data, or you can create your own using a purpose-built configuration language. The most common templates and labeling use cases include the following:

Set up machine learning models with Label Studio

Connect your favorite machine learning model using the Label Studio Machine Learning SDK. Follow these steps:

  1. Start your own machine learning backend server. See more detailed instructions.
  2. Connect Label Studio to the server on the model page found in project settings.

This lets you:

  • Pre-label your data using model predictions.
  • Do online learning and retrain your model while new annotations are being created.
  • Do active learning by labeling only the most complex examples in your data.
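To make the pre-labeling step above concrete, the standalone sketch below builds predictions in Label Studio's pre-annotation result format. It is not the actual ML SDK: the class name DummyImageClassifier and the tag names 'choice'/'image' are hypothetical and must match the names in your labeling config.

```python
class DummyImageClassifier:
    """Stand-in for a model server; always predicts the first label."""

    def __init__(self, from_name='choice', to_name='image', labels=('Cat', 'Dog')):
        self.from_name = from_name  # name of the Choices tag in the labeling config (assumed)
        self.to_name = to_name      # name of the Image tag in the labeling config (assumed)
        self.labels = labels

    def predict(self, tasks):
        # One prediction per task, in Label Studio's pre-annotation result format.
        predictions = []
        for _task in tasks:
            predictions.append({
                'result': [{
                    'from_name': self.from_name,
                    'to_name': self.to_name,
                    'type': 'choices',
                    'value': {'choices': [self.labels[0]]},
                }],
                'score': 0.5,  # confidence score, usable for active-learning ordering
            })
        return predictions

preds = DummyImageClassifier().predict([{'data': {'image': 'http://example.com/1.jpg'}}])
```

A real backend would replace the hard-coded choice with actual model inference; the returned structure is what the UI consumes for pre-labeling.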

Integrate Label Studio with your existing tools

You can use Label Studio as an independent part of your machine learning workflow or integrate the frontend or backend into your existing tools.

Ecosystem

Project Description
label-studio Server, distributed as a pip package
Frontend library The Label Studio frontend library. This uses React to build the UI and mobx-state-tree for state management.
Data Manager library A library for the Data Manager, our data exploration tool.
label-studio-converter Encodes labels in the format of your favorite machine learning library
label-studio-transformers Transformers library connected and configured for use with Label Studio

Roadmap

Want to use The Coolest Feature X but Label Studio doesn't support it? Check out our public roadmap!

Citation

@misc{LabelStudio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020-2022},
}

License

This software is licensed under the Apache 2.0 License © 2020-2022 Heartex.


label-studio's Issues

typo: selelectWithoutLabel

Describe the bug
The text tag has a parameter selelectWithoutLabel. It should be selectWithoutLabel. I am hesitant to offer a pull request because a change could break users' existing solutions. A deprecation strategy is suggested, and I'm not sure how to do that with your code.

To Reproduce
Steps to reproduce the behavior:

  1. Visit https://labelstud.io/tags/text.html
  2. See spelling

Expected behavior
The label should be selectWithoutLabel.

CRA update

  • Examples outside the src folder
  • Global styles in App for CSS Modules
  • Different webpack loaders without eject
  • Require module

Reload UI for each task

Hi. Your project is awesome.
Is there any way to reload the labels for each task? I tried editing label_config_line in reload_config in server.py, but new labels required a page refresh. Maybe you can help me?

Use database in backend

Description

Currently the backend uses JSON to store tasks. It is flexible but less efficient as the task size increases, because the backend has to keep all tasks in memory.

Solution

Use an external database, e.g. MongoDB, to store tasks.

Benefits

  • Database operations are much more efficient than direct r/w to json.
  • It will be easier to implement an authentication backend described in #38.

Drawbacks

  • It will be harder to write tasks and configure the server.
  • If an SQL database is used, the database schema has to be generated on the fly.

Notes

#29 should be resolved first to avoid extra setup procedures on the database. (use docker-compose)

Error I am getting on the backend

Hello,
I am new to this and am trying to install it on my device, but I am getting the error below and I don't understand what exactly it is. Please let me know how to resolve it.

Configurable behaviour of dragging (moving) a polygon (like pressing CTRL key while dragging moves bounding shapes)

Is your feature request related to a problem? Please describe.
Many times I end up dragging the whole polygon when I just want to move a single point. In some cases it is very difficult to fit it back onto the image.

Describe the solution you'd like
Allow the user to set a flag in config.json / config.xml so that a polygon can only be dragged while the CTRL key is pressed. This way I will only move things when I am sure I want to move them.

I am using only polygons for now so my problem may seem to be related to those only. But you can evaluate it for other shapes too and decide.

Describe alternatives you've considered
None TBH..

Additional context
None

Working with various annotators

Hello,
Annotation tasks are usually performed by many annotators; afterwards, these annotations are evaluated.

I wonder if the tool supports multiple simultaneous annotators?

Does the tool have any way of comparing the annotations of several annotators?

Is it possible to export the annotations to a text file? (I am thinking of text segmentation, where it would be easy to compare the segmentations of several annotators.)

Thank you in advance for your attention

Can't build on Win 10

Describe the bug
Can't build on Win 10

C:\Programs\label-studio>npm run start

> [email protected] start C:\Programs\label-studio
> npm run copy-examples && react-scripts start


> [email protected] copy-examples C:\Programs\label-studio
> bash scripts/copy.sh
Starting the development server...
Failed to compile.

./src/env/development.js
Module not found: Can't resolve '../examples/audio_classification' in 'C:\Programs\label-studio\src\env'




Can I create completion JSON files using my own code (to migrate annotations from other sources)?

Hi,

I have some datasets which have annotations in JSON or CSV. I like this tool already (this is my first day of use!) and I am thinking of converting my other annotations into the format supported by label-studio.

Is the relation of a completion json file back to the image based on the "image" field in the "data" object of the JSON file?

I hope my question makes sense. Please ask if it needs to be rephrased.
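The question makes sense, and migrating annotations is a common workflow. As a hedged sketch of the idea (the field layout follows the older import format, and the from_name/to_name values 'label'/'image' are assumptions that must match your labeling config), a record from another source could be packaged like this, with the task linked back to the image through the "image" field in the "data" object:

```python
import json

def record_to_task(image_url, x, y, w, h, label):
    """Wrap one external bounding-box annotation as a Label Studio import task."""
    return {
        'data': {'image': image_url},          # the source the completion links back to
        'completions': [{
            'result': [{
                'from_name': 'label',          # name of the labels tag (assumed)
                'to_name': 'image',            # name of the image tag (assumed)
                'type': 'rectanglelabels',
                'value': {                     # coordinates as percentages of image size
                    'x': x, 'y': y, 'width': w, 'height': h,
                    'rectanglelabels': [label],
                },
            }],
        }],
    }

task = record_to_task('http://example.com/1.jpg', 10.0, 20.0, 30.0, 40.0, 'Car')
print(json.dumps(task, indent=2))
```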

NER Labelling: UI Performance issues with large tasks.json

Describe the bug
When tasks.json is a larger file (>300KB), with approximately 6*60KB input texts, the UI responds very slowly: it takes about 1-2 seconds just to highlight one text, and the front-end takes ~5 seconds to load.

To Reproduce
Steps to reproduce the behavior:

  1. Create a large tasks.json file as input
  2. Update config to deploy frontend NER labelling using tasks.json
  3. Run the backend start.sh script to use pre-built static frontend
  4. Go to localhost UI

Expected behavior

  • The frontend components should have similar speeds to small 10KB json files.

Screenshots
N/A

Environment (please complete the following information):

  • OS: macOS Mohave
  • Chrome (Version 78.0.3904.97 (Official Build) (64-bit))
  • specs: 2.2 GHz Intel Core i7 (8 core), 16 GB 2400 MHz DDR4

Go to next tasks

When I submit a task on the front-end, it shows "all done", although there are tasks left. Is it possible to automatically load the next task after submitting one? This would improve the speed of labeling.

Render custom visualizations from Bokeh or iframe/url

We're labeling large, complex, multi-variant time series with > 2000 features. We have an existing workflow that integrates weak labeling, causal graph output and temporal clustering. This workflow runs as a Bokeh/Flask app.

We have a rudimentary internal labeling tool that I'd like to replace with label-studio but reuse the visualization/exploration logic.

Is it possible to embed an iframe as the media visualization but reuse all of the tagging/labeling functionality? I'd just need to pass the media parameters as URL query parameters or path elements to the Bokeh/Flask app.

Question about semantic graph feature

Hi,

I would like to know if you plan to implement a feature for semantic graph annotation. The idea is to build a semantic graph for a specific text.

Thanks!

More data for every example

Each example should contain 2 tasks:

  1. With completion and relation.
  2. Without completion and relation for labeling.

Multiple completions per task

The completion json contains an array with the results. However, I cannot manage to add multiple results. In the frontend I can add a new completion, but that completion overwrites the previous one.

TypeError when completion json file doesn't exist

When I tried to use Image object detection without initializing the completion files, I noticed the following error:

Traceback (most recent call last):
  File "/home/username/label-studio/backend/utils.py", line 70, in exception_f
    return f(*args, **kwargs)
  File "/home/username/label-studio/backend/server.py", line 164, in api_completions
    completion_id = db.save_completion(task_id, completion)
  File "/home/username/label-studio/backend/db.py", line 199, in save_completion
    if 'completions' not in task:
TypeError: argument of type 'NoneType' is not iterable

The above error occurs when I hit "submit" because "$task_id".json is missing from the output folder.

I believe we can avoid this unpleasant case by changing the backend.db.get_completions function so that an empty dictionary is returned when the completion file is missing.
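A minimal sketch of that proposed fix (the function signature and file layout here are illustrative stand-ins, not the actual backend code):

```python
import json
import os

def get_completions(task_id, completions_dir):
    """Load completions for a task, tolerating a missing completion file."""
    path = os.path.join(completions_dir, '{}.json'.format(task_id))
    if not os.path.exists(path):
        return {}  # empty dict instead of None, so "'completions' not in task" works
    with open(path) as f:
        return json.load(f)
```

Returning an empty dict keeps downstream membership checks like `'completions' not in task` valid, which is exactly where the reported TypeError occurred.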

Containerization

It would be interesting to have a container image of the app. This way I would be able to launch a container on a remote desktop and ping the host address.

Something like Docker.

Load images from local directory

When pointing the input to a directory of images I get this error:

Traceback (most recent call last):
  File "server.py", line 247, in <module>
    reload_config()
  File "server.py", line 33, in reload_config
    c = load_config()
  File "/home/shomerj/label-studio/backend/utils/misc.py", line 162, in load_config
    for new_config in generator():
  File "/home/shomerj/label-studio/backend/utils/misc.py", line 158, in generator
    re_init(c)
  File "/home/shomerj/label-studio/backend/utils/db.py", line 95, in re_init
    init(config)
  File "/home/shomerj/label-studio/backend/utils/db.py", line 83, in init
    raise IOError(f'Unsupported file format: {os.path.splitext(f)[1]}')
OSError: Unsupported file format: .jpg
How do I annotate images from a local directory without creating a tasks.json file?
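Until directory input is supported, a common workaround is to generate the tasks.json yourself. A hedged sketch (the '/data/' URL prefix is an assumption about how your instance serves the image files; adjust it to your setup):

```python
import json
import os

def build_tasks(image_dir, url_prefix='/data/'):
    """Turn a directory of images into the task list format the backend expects."""
    exts = ('.jpg', '.jpeg', '.png')
    return [
        {'data': {'image': url_prefix + name}}
        for name in sorted(os.listdir(image_dir))
        if name.lower().endswith(exts)
    ]

# usage sketch:
#   with open('tasks.json', 'w') as f:
#       json.dump(build_tasks('images'), f, indent=2)
```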

Pass an argument to api_generate_next_task

Hi there,

Great tool.

I'm currently writing a plugin for a daily human-in-the-loop computer vision task.
I added Image_classification to the examples and write the results to a database.

However, I am struggling to find out how to pass an argument to api_generate_next_task, because I can't find where it is called. I know for sure it is called from the index page when task_id is None. I also need to pass this argument on to the next image when clicking submit.

P.S. The reason I need this argument is that my images are split by dates and I need to label them separately.

Failed to compile

Describe the bug

This error occurred during the build time and cannot be dismissed.

./src/examples/image_bbox/index.js
Module not found: Can't resolve './completions/1.json' in '/Users/pirate/Developer/label-studio/src/examples/image_bbox'

To Reproduce
Steps to reproduce the behavior:

git clone https://github.com/heartexlabs/label-studio
npm install .
npm run start

Expected behavior
Loading the app.

How to fix it
It looks like the 1.json file was removed in the latest commit. Adding it back fixes the issue.
I didn't open a PR because I was not sure whether this was intentional refactoring or just an oversight.

Project configuration - multiple save buttons can lose data

Describe the bug
When configuring a project, each section has its own save button. However, it's common to configure multiple options across multiple sections in one go.

Hitting save after doing so may cause you to lose the configuration in the other sections.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'Project, configure'
  2. Click on 'Collaborators'
  3. Configure options in Instruction, Interface and Completions.
  4. Click one of the save buttons.
  5. Only one of the options configured is saved.

Expected behavior
All options are saved.

Label UI editing without destruction of completed tasks

Is your feature request related to a problem? Please describe.
After building a labeling interface and using it for some time, insights often emerge that inspire user interface changes. However, editing the labeling interface requires completions to be destroyed.

Describe the solution you'd like
It would be nice if one could edit the layout without changing the labels, ideally with a smart rule like 'only destroy task completions if the label interface output is inconsistent with completed tasks'.

Describe alternatives you've considered
Alternatively, an opt-out of destroying the task data that warns 'your labels may be inconsistent'.

Or somehow lock the editing of the label tags or other tags that affect labels, allowing edits only to, say, View or style components?

Additional context
Add any other context or screenshots about the feature request here.

Typo in Callbacks Example Snippet

onSubmitCompletion: function(result) {
  console.log(result)
}

Should be:

submitCompletion: function(result) {
  console.log(result)
}

New annotation type guide

Hi,

When can we expect a guide on how to extend the annotation types in your app?

I need to annotate parts of pictures; in general:
an image with many polygons, where each polygon should have a label like text/logo.

I also need an annotation like image-to-text (for OCR fine-tuning).

I am able to try writing the code myself, although I need your guidance. Could you tell me which files I should modify to add more annotation types?

Ability to insert Views within <Choices> tags for more substantive label UI layouts.

Is your feature request related to a problem? Please describe.
A particular labeling task I have has close to 130 multiple-choice options for a multi-class image classification task. These tags are broken up into logical groups, and it would be nice to design the UI in a way that lets me group these choices within multiple flexed views.

It appears, however, either due to a bug or to specific design decisions, that one cannot have multiple Choices tags (as they would require the same name), nor can one nest View or Header tags inside of a Choices tag group.

ie:

<View>
  <Choices name="someUniqueName">
    <View>
      <Choice value="choice1"/>
      <Choice value="choice3"/>
    </View>
    <View>
      <Choice value="choice2"/>
      <Choice value="choice4"/>
    </View>
  </Choices>
</View>

While the above is a silly example, being able to control the layout (and perhaps even visibility) of some choices might be very helpful.

Ability to split NER labelling tasks by users

Is your feature request related to a problem? Please describe.

  • We have a small group of people who would like to divide an NER labelling job. We would like to assign the same jobs to multiple users, so we would need a basic notion of 'multi tenancy.'

Describe the solution you'd like

  1. I would like the ability to distribute NER tasks to multiple users. It would be great to be able to have a set of 'completed solutions' by each user, as opposed to a global solution set per running instance of label-studio.

  2. I saw a button called "AWS Groundtruth" in the UI, but I could not find any docs in the repository on how to integrate this with AWS Groundtruth (I have not used Groundtruth before, so I am unsure how to set this up).

Describe alternatives you've considered

  • We are considering deploying multiple instances of label-studio, one per person. This is a bit cumbersome, but a workaround for now.
  • We are considering looking into AWS Groundtruth integration, but admittedly did not dig too deep into this approach.

Multi-person Annotation

Hi, this is an awesome tool.
Right now we can only start with the first unlabeled item and then annotate the data one by one.
Could we each choose a specific task id and continue from there?

Multiple tasks in the same page

First of all - I found this project to be the best tool out there when it comes to open source, thank you for that!

I am trying to customize it to my needs, but I think this idea is worth sharing. The task I have at hand involves simultaneous labeling of text (named entity types), text sentiment analysis (the annotator gives a quality metric for the provided text), and image polygon annotation (given the information in the text, provide polygons on the image). These 3 elements should be displayed on the same screen for the annotator's convenience.

I think I am not the only one dealing with multi-labeling this way. I am not asking for this to be a core feature of the project, but it might be worth keeping in mind that this design can be useful.

Annotations should allow features

Rather than just a label from a selection of labels, an annotation (a span of text) should be configurable to carry additional features. This can serve two purposes: first, some ML tasks involve predicting more than one piece of information per instance; second, annotators can add meta-information about the annotation (e.g. confidence, a remark), which could be very useful for adjudication.

How to use local file on the tasks.json?

Hi, I'm trying to do audio classification, but the files are too big to upload anywhere, plus it's sensitive data. Is there a way to make the url field in tasks.json point to local files?

Thanks in advance
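One hedged workaround: serve the audio directory yourself on localhost (e.g. run `python3 -m http.server 8000` inside that directory) and generate tasks whose url fields point at it, so the data never leaves your machine. The "url" key below is an assumption about the value binding in your audio config:

```python
import os

def local_audio_tasks(audio_dir, base_url='http://localhost:8000/'):
    """Build task entries pointing at a locally served audio directory."""
    return [
        {'data': {'url': base_url + name}}
        for name in sorted(os.listdir(audio_dir))
        if name.lower().endswith(('.wav', '.mp3', '.flac'))
    ]
```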

Check for python3

Inside start.sh you need to add a check for python3's existence, because I get:

The executable python3 (from --python=python3) does not exist
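A sketch of such a guard for the top of start.sh:

```shell
# Abort early with a clear message if python3 is missing from PATH.
if ! command -v python3 >/dev/null 2>&1; then
    echo "Error: python3 is required but was not found in PATH" >&2
    exit 1
fi
echo "python3 check passed"
```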

Reset button

Describe the bug
The reset button does not work properly. The image is not visible after pushing this button.

Use prelabeling (precomputed suggestions) in NER task

Hello, thanks for the great work!

I'm stuck on a simple problem. Can I add a field to tasks.json (compatible with the completion's result field) containing precomputed suggestions? I.e. my goal is to verify or modify existing markup, not to create it from scratch.

And one more question: can I add a bunch of classification fields to an NER task?

Multiple labels for same image polygon

Hi,

Apologies if this feature is already present (I have not been able to find it).

I am working on a (pet) project for classification and instance segmentation of vehicles, so I am going to use the image polygon feature of label-studio. For that:

  • I need broader classification like car, truck, scooter, bus, etc.
  • More specific classification (e.g. in cars, their make) like Ford, Toyota, BMW etc.
  • More specific classification (e.g. in cars, their models)
  • More specific classification (e.g. in cars, their colours)

For the above, if a polygon can be given multiple labels, I will be able to write data generators for the different classifications.

Thanks
Vijay

Tasks per user

Is your feature request related to a problem? Please describe.
We want multiple experts to label one item/task.

Describe the solution you'd like
In the frontend we want to select a user (for this moment authentication is not required), so the user can walk through his own task list. The results of all users will be stored in the same completion json as an array of results.

Describe alternatives you've considered
We can run an instance per user, but that is not scalable.

Additional context
n/a

Overlapping annotations incorrect and not displayed

When adding an (entity) annotation to text that overlaps an existing one, e.g. starts before and ends after an existing one, an annotation is added that has wrong offsets (start=null, end=0) but no annotation is shown visually.

Correctly dealing with overlapping annotations is important for some tasks.
However, if a task explicitly disallows it, no annotations should get created and an error should be shown.

command line interface not working

Describe the bug
label-studio cli fails

To Reproduce
Steps to reproduce the behavior:

  1. pip install label-studio
  2. label-studio init labeling_project

Expected behavior
labeling_project to be created

Screenshots
See error TypeError: __init__() got an unexpected keyword argument 'required'

Environment (please complete the following information):

  • OS: Win 10
  • Browser: Chrome (not relevant)
  • Version: 79.0.3945.88

Additional context
See https://github.com/heartexlabs/label-studio/blob/6d2aed6a69d0cac1952d8935198f2790ace16598/label_studio/utils/misc.py#L250

required is not a valid argument for add_subparsers; it was probably meant to be:

    subparsers = parser.add_subparsers(dest='command', help='Available commands')
    subparsers.required=True
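A runnable, self-contained version of that fix (the prog name and 'init' subcommand are illustrative):

```python
import argparse

# On the affected Python versions, add_subparsers() rejects a `required`
# keyword argument, so set the attribute after creating the object instead.
parser = argparse.ArgumentParser(prog='label-studio')
subparsers = parser.add_subparsers(dest='command', help='Available commands')
subparsers.required = True
subparsers.add_parser('init', help='Initialize a labeling project')

args = parser.parse_args(['init'])
print(args.command)
```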

Allow to connect Key Points

Is your feature request related to a problem? Please describe.
There are cases where Key Points might be related (e.g. the same person's Nose and Hand). It would be easier to relate different Key Points to the same "entity" (e.g. the same person) if they are connected.

Describe the solution you'd like
Add a tag to enable Key Point connections (e.g. KP Nose to KP Ear) and a Key Point connection schema.

Possible implementation?

<View>
  <KeyPointLabels name="tag" toName="img" strokewidth="5" connected="true">
    <Label value="Nose" background="green"></Label>
    <Label value="Ear" background="blue">
      <Joint connectTo="Nose" />
    </Label>
    <Label value="Lip" background="red">
       <Joint connectTo="Nose" />
    </Label>
  </KeyPointLabels>
  <Image name="img" value="$image" zoom="true"></Image>
</View>

(Screenshot from 2019-12-04 11-16-07 attached.)

Label multiple tasks at once

Is your feature request related to a problem? Please describe.
We want to use this tool for labeling texts. We have a lot of texts which are almost identical. It would be convenient to be able to select a few tasks based on a simple query and label them all at once with the selected label.

Describe the solution you'd like

  • enter a text to query all tasks for
  • show all queried tasks
  • select the tasks which are look-a-likes (with (de)select all option)
  • select the label
  • apply the label on all selected tasks

Describe alternatives you've considered
Labeling all tasks individually can be time-consuming.

Additional context
n/a

Recommendations for OCR labeling task

Hi label-studio team & community,

I'm actually labeling some data for an OCR pipeline made of 3 tasks:

  1. Text Detection
  2. Text Recognition
  3. NER/Text Classification

Ideally, it would be great to do everything in a single shot, but I'm facing some setbacks. Let me give you more context.

I'm using the object detection template and adding a label to classify/tag the text inside the bbox; this way I'm able to accomplish steps 1 & 3. To solve step 2, I tried this workaround: after creating a new Entity, I added the Normalization field with the text inside the bbox. Unfortunately, this is suboptimal because it's not possible to edit this field from the UI, and if I update the task later, all the previously completed Normalization fields are deleted.

I can certainly split the tasks and adopt the solution mentioned here for task 2, but it will make the workflow a bit more cumbersome.

I was wondering if you have any other recommendations for this type of OCR workflow.
