
cvat-ai / cvat


Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Home Page: https://cvat.ai

License: MIT License

Languages: Python 38.69% HTML 0.48% JavaScript 13.99% Shell 0.14% Dockerfile 0.12% TypeScript 41.53% SCSS 1.64% Smarty 0.05% Open Policy Agent 0.87% Mustache 2.49% Jinja 0.01%

Topics: video-annotation computer-vision computer-vision-annotation deep-learning image-annotation annotation-tool annotation labeling labeling-tool image-labeling

cvat's People

Contributors

activechoon, alexeyalexeevxperienceai, annapetrovicheva, arvfilippov, azhavoro, benhoff, bsekachev, cvat-bot[bot], dependabot-preview[bot], dependabot[bot], dmitriyoparin, dmitriysidnev, dvkruchinin, k1won, klakhov, manasars, marishka17, mdacoca, nmanovic, novda, pktiuk, pmazarovich, sizov-kirill, snyk-bot, speclad, tosmanov, vnishukov, yasakova-anastasia, zankevich, zhiltsov-max


cvat's Issues

Remove all annotations inside a range of frames

It would be a very useful option to remove all annotations from one frame to another.
I want to re-annotate part of a video, and it is not practical to search for the keyframes and turn them off one by one, or to delete lines in the XML file and re-upload the annotation.

CVAT - AWS-Deployment guide

It would be nice to have documentation describing how to deploy CVAT on an AWS CUDA deep-learning machine.
I could add this to the CVAT docs, or it would be great if someone could build a CVAT AMI on AWS. We use CVAT on AWS most of the time, and I believe it would be helpful to other teams as well.

Release notes?

Are release notes available anywhere?
If not:

  • Should they be added?
  • What's the best way to figure out what has changed? Looking through the git log?

Video/Image loading status, as on YouTube

Another question and likely feature suggestion.

When I start a job, if I wait long enough, will all the frames be loaded into the browser?
Or are they loaded on demand as I seek through the video?
Are they cached locally in memory?

I'm working with 4K video, and the interface isn't that usable, at least for my current usage model, until all frames have been loaded.

Based on the answer above, it would be great to have feedback as to whether the frames have all been loaded or, better, which frames have been loaded. What I've seen that works well is using a different color on the seek bar for frames that have been loaded.

If they are loaded on demand, it would be nice to have a way to force loading them all (as long as there's enough memory available).

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible.
    at translateSVGPos (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:9422:54)
    at ShapeCreatorView.<anonymous> (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:7852:30)
    at HTMLDivElement.dispatch (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24801:27)
    at HTMLDivElement.elemData.handle (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24609:28)

Enable video stream access

Hi,

My workflow involves thousands of frames per annotation task, which amounts to extensive disk space usage (e.g. a <15 MB video (~40 s VGA @ 30 fps) results in ~2 GB of JPEGs).

Adding the ability for CVAT to work directly on a video stream would be a significant improvement, as it would let the user specify only a URL/path, with an optional download-and-local-storage capability.

One way this could be done is with OpenCV.js (https://docs.opencv.org/3.4/d5/d10/tutorial_js_root.html).
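
For illustration, a minimal server-side sketch of the same idea (the proposal above is OpenCV.js in the browser; this uses OpenCV's Python API, assuming opencv-python is installed): decode frames on demand instead of pre-extracting every JPEG.

import cv2

def read_frame(video_path, frame_number):
    """Decode a single frame on demand instead of pre-extracting all JPEGs."""
    cap = cv2.VideoCapture(video_path)
    try:
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)  # seek to the requested frame
        ok, frame = cap.read()
        if not ok:
            raise IOError("could not decode frame %d" % frame_number)
        return frame
    finally:
        cap.release()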

I could invest time in this. Let me know of your thoughts.

Thanks

Navigation by frames may work incorrectly

Navigation by frames may work incorrectly in the following scenario:

  1. Open any task in CVAT
  2. Resize the browser window to a size smaller than the CVAT workspace
  3. Scroll the browser's horizontal scrollbar to the right
  4. Try to navigate with the player progress bar

The player will not react to progress bar navigation if the cursor is near the start of the progress bar. This unresponsive area grows the further you scroll the browser to the right.

Sort labels in alphabetical order

Currently they are sorted by primary key. If all labels were provided at task creation, this results in a semi-random order, which makes it significantly harder to find the required label when working with a large number of them. One workaround is to add labels one by one, but that's not a pleasant process...
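
For illustration, the display-side fix is a simple sort by name instead of primary key (a sketch with made-up data, not CVAT's actual code):

# Labels arrive ordered by primary key; re-sort them by name for display.
labels = [(3, "truck"), (1, "person"), (2, "ball")]  # (pk, name) pairs; hypothetical data
labels_sorted = sorted(labels, key=lambda item: item[1].lower())
print([name for _, name in labels_sorted])  # ['ball', 'person', 'truck']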

Improve documentation for overlap parameter

My understanding is that overlap just specifies how many frames overlap when splitting a video into segments. If that's correct, what's the purpose? Does it actually do anything for the user? (E.g. if I annotate the tracks on the first segment, are they copied across to the next segment in the overlapping region? This doesn't appear to be the case.) Put another way: should I just set overlap=0 in my video-tagging tasks, to avoid having to manually resolve the different taggings from each segment in the overlap?
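
If that reading is right, segment boundaries would be derived roughly like this (a sketch with hypothetical parameter names, not CVAT's actual code):

def split_into_segments(frame_count, segment_size, overlap):
    """Yield (start, stop) frame ranges; consecutive segments share `overlap` frames."""
    step = segment_size - overlap
    start = 0
    while start < frame_count:
        stop = min(start + segment_size, frame_count)
        yield (start, stop - 1)
        if stop == frame_count:
            break
        start += step

# Example: 100 frames, segments of 40, overlap of 5
print(list(split_into_segments(100, 40, 5)))  # [(0, 39), (35, 74), (70, 99)]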

Sorry if I've misunderstood something obvious.

PS - great tool!

Incorrect number of frames in video

I loaded a video with resolution 4096x2178 that has 1079 frames.
In the job statistics I see 959 frames, and the XML file shows the same number.
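
For what it's worth, a quick way to cross-check how many frames a decoder actually produces, since decoders can disagree with the container metadata (a sketch assuming opencv-python; the path is a placeholder):

import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder path
reported = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # count from container metadata

decoded = 0
while cap.read()[0]:  # decode until the stream ends
    decoded += 1
cap.release()

print("metadata says %d, decoder produced %d" % (reported, decoded))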

Could not create the task. ffmpy.FFRuntimeError

I built the Docker image from the latest sources and created a superuser. I get an error on task creation:

Could not create the task. ffmpy.FFRuntimeError: ffmpeg -i /home/django/data/2/.upload/20170209T193000.000000Z.mp4 -start_number 0 -b:v 10000k -vsync 0 -an -y -q:v 16 /tmp/cvat-p9csbe_h.data/%d.jpg exited with status 1 STDOUT: STDERR:


I connected to the running cvat container with docker exec -it <container id> /bin/bash and pasted the command from the error message into the terminal. It fails because the cvat-* folder in /tmp doesn't exist.
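
For anyone reproducing this, a minimal sketch of the same extraction via ffmpy with the output directory created up front (paths and options mirror the error message; this is an illustration, not CVAT's actual code):

import tempfile
import ffmpy

# Create the output directory first; the error message suggests it was missing.
out_dir = tempfile.mkdtemp(prefix="cvat-", suffix=".data")

ff = ffmpy.FFmpeg(
    inputs={"20170209T193000.000000Z.mp4": None},
    outputs={out_dir + "/%d.jpg": "-start_number 0 -b:v 10000k -vsync 0 -an -y -q:v 16"},
)
ff.run()  # raises ffmpy.FFRuntimeError if ffmpeg exits with a non-zero status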

Video file name / url in output file

Hi,
First of all, thanks for the tool. It works great!

When I annotate video files, and for that purpose I create an annotation task per video, I cannot seem to find any reference to the original video name / path / url inside the task itself. Moreover, inside the output XML file generated after annotating, there are no references to that information at all. The only thing I can find is the url of the corresponding task, but I don't think I can extract from that url the information I'm looking for (i.e. the name of the video).

The only workaround I can think of is naming the annotation task after the video itself, and doing the same for the XML file. However, I don't really like that solution. The ideal solution for me would be to have the video file name inside the XML file.

Am I missing something? Please point me in the right direction.

Thank you very much

Keypoint Annotation

I wanted to ask about the keypoint annotation feature you are working on. Would it have a standard configuration/format, like keypoints for annotating human pose? Would it have the same interpolation feature as the current bounding boxes? Finally, when would the feature be released? Do you have a specific date in mind? Thank you

How to run it without Docker?

It's tedious to install Docker and configure the settings; is there any way to run CVAT directly?
After installing a lot of libraries that Django was missing, I ran into a problem:

ERRORS:
engine.Task: (auth.E005) The permission codenamed 'view_task' clashes with a builtin permission for model 'engine.Task'.

The command I ran is:
sudo python3 manage.py createsuperuser
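
For context, Django 2.1 added a built-in "view" permission for every model, so a model that also declares its own view_* permission codename triggers auth.E005. A minimal sketch of the clash (the model below is an illustration, not CVAT's actual code; removing or renaming the custom codename resolves the check error):

from django.db import models

class Task(models.Model):
    name = models.CharField(max_length=256)

    class Meta:
        # Under Django >= 2.1 this clashes with the built-in 'view_task'
        # permission and raises the auth.E005 system-check error.
        permissions = (("view_task", "Can see available tasks"),)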

How to configure the environment?

Could you provide a tutorial for configuring the environment? After installing docker and docker-compose, I could not make any further progress and don't know what to do next.

XML file metadata labels are incomplete

The labeling schema doesn't make it into the output XML file.

As an example, I created a job with a 'labels' spec of:

person @select=type:white,blue,ref ball

and the dumped XML file is:

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <version>1.0</version>
  <meta>
    <task>
      <id>16</id>
      <name>test</name>
      <size>902</size>
      <mode>interpolation</mode>
      <overlap>5</overlap>
      <bugtracker></bugtracker>
      <created>2018-07-26 02:58:56.014598+03:00</created>
      <updated>2018-07-26 02:58:56.014613+03:00</updated>
      <labels>
        <label>
          <name>ball</name>
          <attributes>
          </attributes>
        </label>
      </labels>
      <segments>
        <segment>
          <id>24</id>
          <start>0</start>
          <stop>901</stop>
          <url>http://13.66.164.80/?id=24</url>
        </segment>
      </segments>
      <owner>
        <username>cvat</username>
        <email>[email protected]</email>
      </owner>
    </task>
    <dumped>2018-07-26 02:59:11.669206+03:00</dumped>
  </meta>
</annotations>

Note that most of the 'labels' information is missing. The only way I was able to be sure of my 'labels' spec for an existing job is that it was stored in the browser history.
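
For illustration, the spec above appears to follow the pattern <label> [@~]<widget>=<name>:<values> ..., where @/~ tokens attach attributes to the preceding label. A rough parser sketch under that assumption (not CVAT's actual grammar):

import re

def parse_labels_spec(spec):
    """Assumed grammar: '@'/'~' tokens are attributes of the preceding label."""
    labels = {}
    current = None
    for token in spec.split():
        match = re.match(r"^[@~](\w+)=(\w+):(.*)$", token)
        if match:
            widget, name, values = match.groups()
            labels[current].append((widget, name, values.split(",")))
        else:
            current = token
            labels[current] = []
    return labels

print(parse_labels_spec("person @select=type:white,blue,ref ball"))
# {'person': [('select', 'type', ['white', 'blue', 'ref'])], 'ball': []}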

Thanks.

The login page at localhost:8080 can't be reached

I followed the installation instructions, but after running the docker-compose up -d command, I get a "connection was reset" error in Chrome and don't see the login page.

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And docker ps outputs:

CONTAINER ID   IMAGE   COMMAND                  CREATED          STATUS          PORTS                              NAMES
ec26a5065b07   cvat    "/usr/bin/supervisord"   25 minutes ago   Up 25 minutes   0.0.0.0:8080->8080/tcp, 8443/tcp   cvat

Extend contributing.md

Hi,

Could you please describe or suggest development and testing steps?
In particular, how would one perform the edit-update-run (debug) cycle for both the server and client parts?

Where does the shared server directory point to?

To create huge tasks, the documentation suggests choosing the Share option in the dialog box.
While trying to select the files, I see a modal pop up with the following path as its title:
//icv-cifs/icv_projects/cvat/data

However, I cannot navigate from there (nor can I find where this path is). The documentation also does not elaborate much on tasks with a large number of frames. Any advice?


Undo functionality

Have you considered undo functionality?
Seems like that would be a very useful feature.
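
For illustration, a minimal command-pattern undo stack of the kind annotation tools typically use (a sketch, not a proposal for CVAT's actual architecture):

class UndoStack:
    """Each action is stored as a pair of callables: (do, undo)."""
    def __init__(self):
        self._done = []

    def push(self, do, undo):
        do()
        self._done.append((do, undo))

    def undo(self):
        if self._done:
            _, undo = self._done.pop()
            undo()

# Usage: creating a box records its inverse (deletion) for undo.
boxes = []
stack = UndoStack()
stack.push(lambda: boxes.append("bbox-1"), lambda: boxes.remove("bbox-1"))
stack.undo()
print(boxes)  # []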
Thanks.

The login page at localhost:8080 returns Bad Request

I am using Ubuntu 16.04.

I followed the tutorial and installed CVAT successfully, but on the second day, when I started Docker again, the page wouldn't open and showed a 400 status code (Bad Request). How can I fix this?

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And the output of docker logs cvat:
logs.txt

Support Pascal VOC Format

Hi, it would be nice to be able to export the annotations in Pascal VOC format. I couldn't find info about supported formats in the documentation; is this format supported?

Mark ignore regions and keyframes for an object

Thanks for the great annotation server.
It would be great to have an "uncertain" flag for annotations (like the existing occlusion flag). It would mean that, as a human, I can see the object and annotate it, but for a detector it is merely nice to have: it is OK if the detector misses it (don't penalize the algorithm for that).

I encountered difficulties on the task configuration page.

I created a new task. After filling out the name and labels and selecting files, I submitted the page, but I have now been waiting for two or three hours on a page that only says "Successful Request! Creating...".
So I want to know how to configure the task. Can you share your configuration?
My configuration is as follows:
Name: task 1
Labels: vehicle @select=type:undefined,car,truck,bus,train ~radio=quality:good,bad ~checkbox=parked:false
Select Files: 2.mp4

Host Container on Docker Hub

Could you please connect this repository to Docker Hub? That way it would be possible to simply download an already-built image, since the build process is rather lengthy.

Using the same attribute for a label twice -> stuck

There is no warning when you use the same attribute multiple times. This can easily happen when copy-pasting.

Errors I've experienced when doing this:

  1. The job doesn't start up; instead you can only see the loading screen.
  2. Can't exit out of drawing the bounding box.

EDIT: Number two has more to do with large files (3.5 GB), I think. I will investigate further.

Also, the single input line makes it extremely uncomfortable to type or paste labels.
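
For illustration, a small validation step that would catch this at task creation (a sketch; the spec format is assumed from the examples in this thread):

def check_duplicate_attributes(spec):
    """Raise if any label in the spec declares the same attribute name twice."""
    seen = set()
    for token in spec.split():
        if token[0] in "@~":  # attribute tokens like @select=type:a,b
            name = token.split("=", 1)[1].split(":", 1)[0]
            if name in seen:
                raise ValueError("attribute '%s' is declared twice" % name)
            seen.add(name)
        else:  # a new label resets the scope
            seen = set()

try:
    check_duplicate_attributes("person @select=type:a,b @select=type:c,d")
except ValueError as e:
    print(e)  # attribute 'type' is declared twice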

Greetings

UI becomes slow after 300-400 annotations

I'm labeling large satellite images with hundreds to a few thousand objects of interest.

I noticed that after about 300-400 annotations, the UI slows down. It might take the program ~1 sec to become responsive again after creating a new bbox. After about 800-1000 annotations, it's nearly unusable: adding an annotation might require ~5 seconds before it registers. For now, I'm just cropping my large images into smaller pieces as a workaround, but it'd be a lot nicer to add all annotations to a single large image (as raw satellite imagery often comes in fairly long strips). I'm using a 2017 MacBook Pro to do the labeling.

I don't know enough about the backend to suggest a fix, but happy to answer questions if it's helpful.

How to keep track IDs?

After annotation is done and the annotation file is uploaded back, the IDs of the targets get messed up.

How can I include the ID of each target in the exported annotation file, and then re-import that file without the IDs getting mixed up?

Running on AWS EC2

I was trying to run CVAT on AWS EC2 and had trouble accessing it from outside AWS: it kept returning 400 Bad Request. I found a solution: add the EC2 instance's public IP to ALLOWED_HOSTS in docker-compose.override.yml, as described in the documentation. But this is not the nicest solution; every time the IP changes I have to update that value. It would be great if someone with more AWS experience could suggest a more elegant approach. Thanks
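
For illustration, one less brittle option, assuming CVAT's settings follow standard Django conventions (variable names here are illustrative): derive ALLOWED_HOSTS from the environment and, on EC2, resolve the public hostname at startup from the instance metadata service.

# settings.py (sketch): avoid hardcoding the instance IP.
import os
import urllib.request

def ec2_public_hostname(timeout=0.5):
    """Query the EC2 instance metadata service; returns None when off EC2."""
    try:
        with urllib.request.urlopen(
            "http://169.254.169.254/latest/meta-data/public-hostname",
            timeout=timeout,
        ) as resp:
            return resp.read().decode()
    except OSError:
        return None

ALLOWED_HOSTS = os.environ.get("ALLOWED_HOSTS", "localhost").split(",")
public_hostname = ec2_public_hostname()
if public_hostname:
    ALLOWED_HOSTS.append(public_hostname)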

Register new users

Hi
Thanks for the great project. It is exactly what I was looking for. I was even able to run it on an AWS EC2 instance.
When a new user tries to register, they get:

Forbidden
Your account doesn't have access to this page. To proceed, please login with an account that has access or contact your admin.

I guess it is still not implemented. Am I right?

Feature request: add tracking

It'd be great to add a tracking mode, e.g. see the video here.

Specifically, if I enable tracking (per track) and there are no annotations in the future of the track, then attempt to track the last annotated box through all future frames. If the user goes to the next frame while one of these tracked boxes is displayed, that is treated as marking the box as 'good', and it gets added as a keyframe. Otherwise the user can edit it manually.

There are probably more UI considerations, but this would provide a lot of value. My use case is tracking people's heads, and the standard interpolation is less useful (but still much better than nothing!) due to heads 'bobbing' while walking etc. Feature tracking would likely handle this in many situations (aside from when a head is occluded etc.).

Aside from the UI considerations, this is pretty easy to implement. It can even be done in the browser with opencv.js (with suitable performance, depending on the device).
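
For illustration, the core of such a tracker, shown with OpenCV's Python API since the same tracker implementations are what opencv.js would expose (a sketch assuming opencv-contrib-python; tracker constructors have moved between cv2 and cv2.legacy across versions):

import cv2

cap = cv2.VideoCapture("video.mp4")     # placeholder input
ok, frame = cap.read()

tracker = cv2.TrackerKCF_create()       # KCF: a fast correlation-filter tracker
tracker.init(frame, (100, 50, 64, 64))  # the last annotated box (x, y, w, h)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, box = tracker.update(frame)  # propose a box for this frame
    if not found:
        break                           # hand control back to the annotator
    print(box)                          # candidate keyframe for user review
cap.release()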

Mechanical Turk Integration

Integration of CVAT with MTurk, for deploying work as HITs, would be very useful for such projects. This would require integrating the Turkic framework from VATIC with CVAT.
I would also like to contribute to your project. Please help me set up the development environment for this.

Re-id app to merge bboxes into tracks after TF annotation

Hi, great tool. For ground-truth annotation there are often many objects in every frame, and it would be tremendously tedious to annotate a track for every single one of them. Is there a pre-trained model, or a way to run a custom model, that can detect possibly identical objects, so that all I have to do is review and merge their tracks/IDs?
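
For illustration, the simplest version of that linking step: merge per-frame detections into tracks by IoU between consecutive frames (a sketch; a real re-id model would add appearance features on top):

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def link_tracks(frames, threshold=0.5):
    """frames: list of per-frame box lists. Returns (track_id, box) pairs per frame."""
    linked, next_id, prev = [], 0, []
    for boxes in frames:
        cur = []
        for box in boxes:
            best = max(prev, key=lambda p: iou(p[1], box), default=None)
            if best is not None and iou(best[1], box) >= threshold:
                cur.append((best[0], box))  # continue an existing track
            else:
                cur.append((next_id, box))  # start a new track
                next_id += 1
        linked.append(cur)
        prev = cur
    return linked

# Two frames, one object moving slightly: both detections share track id 0.
print(link_tracks([[(0, 0, 10, 10)], [(1, 1, 11, 11)]]))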

thanks
