
cvat-ai / cvat


Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Home Page: https://cvat.ai

License: MIT License

Languages: Python 38.69% HTML 0.48% JavaScript 13.99% Shell 0.14% Dockerfile 0.12% TypeScript 41.53% SCSS 1.64% Smarty 0.05% Open Policy Agent 0.87% Mustache 2.49% Jinja 0.01%

Topics: video-annotation computer-vision computer-vision-annotation deep-learning image-annotation annotation-tool annotation labeling labeling-tool image-labeling

cvat's People

Contributors

activechoon, alexeyalexeevxperienceai, annapetrovicheva, arvfilippov, azhavoro, benhoff, bsekachev, cvat-bot[bot], dependabot-preview[bot], dependabot[bot], dmitriyoparin, dmitriysidnev, dvkruchinin, k1won, klakhov, manasars, marishka17, mdacoca, nmanovic, novda, pktiuk, pmazarovich, sizov-kirill, snyk-bot, speclad, tosmanov, vnishukov, yasakova-anastasia, zankevich, zhiltsov-max


cvat's Issues

Remove all annotations inside a range of frames

It would be a very useful option to remove all annotations from one frame to another.
I want to re-annotate part of a video, and it is not practical to search for the keyframes and turn them off one by one, or to delete lines in the XML file and re-upload the annotation.

CVAT - AWS-Deployment guide

It would be nice to have documentation describing how to deploy CVAT on an AWS CUDA deep-learning machine.
I could add this to the CVAT docs, or it would be great if someone could build a CVAT AMI on AWS. We use CVAT on AWS most of the time, and I believe it would be helpful to other teams as well.

Release notes?

Are release notes available anywhere?
If not:

  • Should they be added?
  • What's the best way to figure out what has changed? Looking through the git log?

Video/Image loading status, as on YouTube

Another question and likely feature suggestion.

When I start a job, if I wait long enough, will all the frames be loaded into the browser?
Or are they loaded on demand as I seek through the video?
Are they cached locally in memory?

I'm working with 4K video, and the interface isn't that usable, at least for my current usage model, until all frames have been loaded.

Based on the answer above, it would be great to have feedback as to whether the frames have all been loaded or, better, which frames have been loaded. What I've seen that works well is using a different color on the seek bar for frames that have been loaded.

If they are loaded on demand, it would be nice to have a way to force loading them all (as long as there's enough memory available).

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible.
    at translateSVGPos (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:9422:54)
    at ShapeCreatorView.<anonymous> (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:7852:30)
    at HTMLDivElement.dispatch (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24801:27)
    at HTMLDivElement.elemData.handle (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24609:28)

Enable video stream access

Hi,

My workflow involves thousands of frames per annotation task, which amounts to extensive disk space usage (e.g. a <15 MB video (~40 s VGA @ 30 fps) results in ~2 GB of JPEGs).

Adding the ability for CVAT to work directly on a video stream would be a significant improvement, as it would let the user specify only a URL/path, with an optional download-and-local-storage capability.

One way this could be done is with OpenCV.js (https://docs.opencv.org/3.4/d5/d10/tutorial_js_root.html).
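
For illustration, a minimal server-side sketch of the same idea (the proposal above is OpenCV.js in the browser; this uses OpenCV's Python API, assuming opencv-python is installed): decode frames on demand instead of pre-extracting every JPEG.

import cv2

def read_frame(video_path, frame_number):
    """Decode a single frame on demand instead of pre-extracting all JPEGs."""
    cap = cv2.VideoCapture(video_path)
    try:
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)  # seek to the requested frame
        ok, frame = cap.read()
        if not ok:
            raise IOError("could not decode frame %d" % frame_number)
        return frame
    finally:
        cap.release()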

I could invest time in this. Let me know of your thoughts.

Thanks

Navigation by frames may work incorrectly

Navigation by frames may work incorrectly in the following scenario:

  1. Open any task in CVAT
  2. Resize the browser window to a size smaller than the CVAT workspace
  3. Scroll the browser's horizontal scrollbar to the right
  4. Try to navigate with the player progress bar

The player will not react to progress bar navigation if the cursor is near the start of the progress bar. This unresponsive area grows the further you scroll the browser to the right.

Sort labels in alphabetical order

Currently they are sorted by primary key. If all labels were provided at task creation, this results in a semi-random order, which makes it significantly harder to find the required label when working with a large number of them. One workaround is to add labels one by one, but that's not a pleasant process...
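
For illustration, the display-side fix is a simple sort by name instead of primary key (a sketch with made-up data, not CVAT's actual code):

# Labels arrive ordered by primary key; re-sort them by name for display.
labels = [(3, "truck"), (1, "person"), (2, "ball")]  # (pk, name) pairs; hypothetical data
labels_sorted = sorted(labels, key=lambda item: item[1].lower())
print([name for _, name in labels_sorted])  # ['ball', 'person', 'truck']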

Improve documentation for overlap parameter

My understanding is that overlap just specifies how many frames overlap when splitting a video into segments. If that's correct, what's the purpose? Does it actually do anything for the user? (E.g. if I annotate the tracks on the first segment, are they copied across to the next segment in the overlapping region? This doesn't appear to be the case.) Put another way: should I just set overlap=0 in my video-tagging tasks, to avoid having to manually resolve the different taggings from each segment in the overlap?
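
If that reading is right, segment boundaries would be derived roughly like this (a sketch with hypothetical parameter names, not CVAT's actual code):

def split_into_segments(frame_count, segment_size, overlap):
    """Yield (start, stop) frame ranges; consecutive segments share `overlap` frames."""
    step = segment_size - overlap
    start = 0
    while start < frame_count:
        stop = min(start + segment_size, frame_count)
        yield (start, stop - 1)
        if stop == frame_count:
            break
        start += step

# Example: 100 frames, segments of 40, overlap of 5
print(list(split_into_segments(100, 40, 5)))  # [(0, 39), (35, 74), (70, 99)]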

Sorry if I've misunderstood something obvious.

PS - great tool!

Incorrect number of frames in video

I loaded a video with resolution 4096x2178 that has 1079 frames.
In the job statistics I see 959 frames, and the XML file shows the same number.
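
For what it's worth, a quick way to cross-check how many frames a decoder actually produces, since decoders can disagree with the container metadata (a sketch assuming opencv-python; the path is a placeholder):

import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder path
reported = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # count from container metadata

decoded = 0
while cap.read()[0]:  # decode until the stream ends
    decoded += 1
cap.release()

print("metadata says %d, decoder produced %d" % (reported, decoded))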

Could not create the task. ffmpy.FFRuntimeError

I built the Docker image from the latest sources and created a superuser. I get an error on task creation:

Could not create the task. ffmpy.FFRuntimeError: ffmpeg -i /home/django/data/2/.upload/20170209T193000.000000Z.mp4 -start_number 0 -b:v 10000k -vsync 0 -an -y -q:v 16 /tmp/cvat-p9csbe_h.data/%d.jpg exited with status 1 STDOUT: STDERR:


I connected to the running cvat container with docker exec -it <container id> /bin/bash and pasted the command from the error message into the terminal. It fails because the cvat-* folder in /tmp doesn't exist.
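
For anyone reproducing this, a minimal sketch of the same extraction via ffmpy with the output directory created up front (paths and options mirror the error message; this is an illustration, not CVAT's actual code):

import tempfile
import ffmpy

# Create the output directory first; the error message suggests it was missing.
out_dir = tempfile.mkdtemp(prefix="cvat-", suffix=".data")

ff = ffmpy.FFmpeg(
    inputs={"20170209T193000.000000Z.mp4": None},
    outputs={out_dir + "/%d.jpg": "-start_number 0 -b:v 10000k -vsync 0 -an -y -q:v 16"},
)
ff.run()  # raises ffmpy.FFRuntimeError if ffmpeg exits with a non-zero status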

Video file name / url in output file

Hi,
First of all, thanks for the tool. It works great!

When I annotate video files, and for that purpose I create an annotation task per video, I cannot seem to find any reference to the original video name / path / url inside the task itself. Moreover, inside the output XML file generated after annotating, there are no references to that information at all. The only thing I can find is the url of the corresponding task, but I don't think I can extract from that url the information I'm looking for (i.e. the name of the video).

The only workaround I can think of is naming the annotation task after the video itself, and doing the same for the XML file. However, I don't really like that solution. The ideal solution for me would be to have the video file name inside the XML file.

Am I missing something? Please point me in the right direction.

Thank you very much

Keypoint Annotation

I wanted to ask about the keypoint annotation feature you are working on. Would it have a standard configuration/format, like keypoints for annotating human pose? Would it have the same interpolation feature as the current bounding boxes? Finally, when would the feature be released? Do you have a specific date in mind? Thank you

How to run it without Docker?

It's tedious to install Docker and configure the settings; is there any way to run CVAT directly?
After installing a lot of libraries that Django was missing, I ran into a problem:

ERRORS:
engine.Task: (auth.E005) The permission codenamed 'view_task' clashes with a builtin permission for model 'engine.Task'.

The command I ran is:
sudo python3 manage.py createsuperuser
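
For context, Django 2.1 added a built-in "view" permission for every model, so a model that also declares its own view_* permission codename triggers auth.E005. A minimal sketch of the clash (the model below is an illustration, not CVAT's actual code; removing or renaming the custom codename resolves the check error):

from django.db import models

class Task(models.Model):
    name = models.CharField(max_length=256)

    class Meta:
        # Under Django >= 2.1 this clashes with the built-in 'view_task'
        # permission and raises the auth.E005 system-check error.
        permissions = (("view_task", "Can see available tasks"),)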

How to configure the environment?

Could you provide a tutorial for configuring the environment? After installing docker and docker-compose, I could not make any further progress and don't know what to do next.

XML file metadata labels are incomplete

The labeling schema doesn't make it into the output XML file.

As an example, I created a job with a 'labels' spec of:

person @select=type:white,blue,ref ball

and the dumped XML file is:

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <version>1.0</version>
  <meta>
    <task>
      <id>16</id>
      <name>test</name>
      <size>902</size>
      <mode>interpolation</mode>
      <overlap>5</overlap>
      <bugtracker></bugtracker>
      <created>2018-07-26 02:58:56.014598+03:00</created>
      <updated>2018-07-26 02:58:56.014613+03:00</updated>
      <labels>
        <label>
          <name>ball</name>
          <attributes>
          </attributes>
        </label>
      </labels>
      <segments>
        <segment>
          <id>24</id>
          <start>0</start>
          <stop>901</stop>
          <url>http://13.66.164.80/?id=24</url>
        </segment>
      </segments>
      <owner>
        <username>cvat</username>
        <email>[email protected]</email>
      </owner>
    </task>
    <dumped>2018-07-26 02:59:11.669206+03:00</dumped>
  </meta>
</annotations>

Note that most of the 'labels' information is missing. The only way I was able to be sure of my 'labels' spec for an existing job is that it was stored in the browser history.
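
For illustration, the spec above appears to follow the pattern <label> [@~]<widget>=<name>:<values> ..., where @/~ tokens attach attributes to the preceding label. A rough parser sketch under that assumption (not CVAT's actual grammar):

import re

def parse_labels_spec(spec):
    """Assumed grammar: '@'/'~' tokens are attributes of the preceding label."""
    labels = {}
    current = None
    for token in spec.split():
        match = re.match(r"^[@~](\w+)=(\w+):(.*)$", token)
        if match:
            widget, name, values = match.groups()
            labels[current].append((widget, name, values.split(",")))
        else:
            current = token
            labels[current] = []
    return labels

print(parse_labels_spec("person @select=type:white,blue,ref ball"))
# {'person': [('select', 'type', ['white', 'blue', 'ref'])], 'ball': []}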

Thanks.

The login page at localhost:8080 can't be reached

I followed the installation instructions, but after running the docker-compose up -d command, I get a "connection was reset" error in Chrome and don't see the login page.

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And docker ps outputs:

CONTAINER ID   IMAGE   COMMAND                  CREATED          STATUS          PORTS                              NAMES
ec26a5065b07   cvat    "/usr/bin/supervisord"   25 minutes ago   Up 25 minutes   0.0.0.0:8080->8080/tcp, 8443/tcp   cvat

Extend contributing.md

Hi,

Could you please describe or suggest development and testing steps?
In particular, how would one perform the edit-update-run (debug) cycle for both the server and client parts?

Where does the shared server directory point to?

To create huge tasks, the documentation suggests choosing the Share option in the dialog box.
While trying to select the files, I see a modal pop up with the following path as its title:
//icv-cifs/icv_projects/cvat/data

However, I cannot navigate from there (nor can I find where this path is). The documentation also does not elaborate much on tasks with a large number of frames. Any advice?


Undo functionality

Have you considered undo functionality?
Seems like that would be a very useful feature.
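
For illustration, a minimal command-pattern undo stack of the kind annotation tools typically use (a sketch, not a proposal for CVAT's actual architecture):

class UndoStack:
    """Each action is stored as a pair of callables: (do, undo)."""
    def __init__(self):
        self._done = []

    def push(self, do, undo):
        do()
        self._done.append((do, undo))

    def undo(self):
        if self._done:
            _, undo = self._done.pop()
            undo()

# Usage: creating a box records its inverse (deletion) for undo.
boxes = []
stack = UndoStack()
stack.push(lambda: boxes.append("bbox-1"), lambda: boxes.remove("bbox-1"))
stack.undo()
print(boxes)  # []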
Thanks.

The login page at localhost:8080 returns Bad Request

I am using Ubuntu 16.04.

I followed the tutorial and installed CVAT successfully, but on the second day, when I started Docker again, the page wouldn't open and showed a 400 status code (Bad Request). How can I fix this?

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And the output of docker logs cvat:
logs.txt

Support Pascal VOC Format

Hi, it would be nice to be able to export the annotations in Pascal VOC format. I couldn't find info about supported formats in the documentation; is this format supported?

Mark ignore regions and keyframes for an object

Thanks for the great annotation server.
It would be great to have an "uncertain" flag for annotations (like the existing occlusion flag). It would mean that, as a human, I can see the object and annotate it, but for a detector it is merely nice to have: it is OK if the detector misses it (don't penalize the algorithm for that).

I encountered difficulties on the task configuration page.

I created a new task. After filling out the name and labels and selecting files, I submitted the page, but I have now been waiting for two or three hours on a page that only says "Successful Request! Creating...".
So I want to know how to configure the task. Can you share your configuration?
My configuration is as follows:
Name: task 1
Labels: vehicle @select=type:undefined,car,truck,bus,train ~radio=quality:good,bad ~checkbox=parked:false
Select Files: 2.mp4

Host Container on Docker Hub

Could you please connect this repository to Docker Hub? That way it would be possible to simply download an already-built image, since the build process is rather lengthy.

Using the same attribute for a label twice -> stuck

There is no warning when you use the same attribute multiple times. This can easily happen when copy-pasting.

Errors I've experienced when doing this:

  1. The job doesn't start up; instead you can only see the loading screen.
  2. Can't exit out of drawing the bounding box.

EDIT: Number two has more to do with large files (3.5 GB), I think. I will investigate further.

Also, the single input line makes it extremely uncomfortable to type or paste labels.
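
For illustration, a small validation step that would catch this at task creation (a sketch; the spec format is assumed from the examples in this thread):

def check_duplicate_attributes(spec):
    """Raise if any label in the spec declares the same attribute name twice."""
    seen = set()
    for token in spec.split():
        if token[0] in "@~":  # attribute tokens like @select=type:a,b
            name = token.split("=", 1)[1].split(":", 1)[0]
            if name in seen:
                raise ValueError("attribute '%s' is declared twice" % name)
            seen.add(name)
        else:  # a new label resets the scope
            seen = set()

try:
    check_duplicate_attributes("person @select=type:a,b @select=type:c,d")
except ValueError as e:
    print(e)  # attribute 'type' is declared twice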

Greetings

UI becomes slow after 300-400 annotations

I'm labeling large satellite images with hundreds to a few thousand objects of interest.

I noticed that after about 300-400 annotations, the UI slows down. It might take the program ~1 sec to become responsive again after creating a new bbox. After about 800-1000 annotations, it's nearly unusable: adding an annotation might require ~5 seconds before it registers. For now, I'm just cropping my large images into smaller pieces as a workaround, but it'd be a lot nicer to add all annotations to a single large image (as raw satellite imagery often comes in fairly long strips). I'm using a 2017 MacBook Pro to do the labeling.

I don't know enough about the backend to suggest a fix, but happy to answer questions if it's helpful.

How to keep track IDs?

After annotation is done and the annotation file is uploaded back, the IDs of the targets get messed up.

How can I include the ID of each target in the exported annotation file, and then re-import that file without the IDs getting mixed up?

Running on AWS EC2

I was trying to run CVAT on AWS EC2 and had trouble accessing it from outside AWS: it kept returning 400 Bad Request. I found a solution: add the EC2 instance's public IP to ALLOWED_HOSTS in docker-compose.override.yml, as described in the documentation. But this is not the nicest solution; every time the IP changes I have to update that value. It would be great if someone with more AWS experience could suggest a more elegant approach. Thanks
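
For illustration, one less brittle option, assuming CVAT's settings follow standard Django conventions (variable names here are illustrative): derive ALLOWED_HOSTS from the environment and, on EC2, resolve the public hostname at startup from the instance metadata service.

# settings.py (sketch): avoid hardcoding the instance IP.
import os
import urllib.request

def ec2_public_hostname(timeout=0.5):
    """Query the EC2 instance metadata service; returns None when off EC2."""
    try:
        with urllib.request.urlopen(
            "http://169.254.169.254/latest/meta-data/public-hostname",
            timeout=timeout,
        ) as resp:
            return resp.read().decode()
    except OSError:
        return None

ALLOWED_HOSTS = os.environ.get("ALLOWED_HOSTS", "localhost").split(",")
public_hostname = ec2_public_hostname()
if public_hostname:
    ALLOWED_HOSTS.append(public_hostname)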

Register new users

Hi
Thanks for the great project. It is exactly what I was looking for. I was even able to run it on an AWS EC2 instance.
When a new user tries to register, they get:

Forbidden
Your account doesn't have access to this page. To proceed, please login with an account that has access or contact your admin.

I guess it is still not implemented. Am I right?

Feature request: add tracking

It'd be great to add a tracking mode, e.g. see the video here.

Specifically, if I enable tracking (per track) and there are no annotations in the future of the track, then attempt to track the last annotated box through all future frames. If the user goes to the next frame while one of these tracked boxes is displayed, that is treated as marking the box as 'good', and it gets added as a keyframe. Otherwise the user can edit it manually.

There are probably more UI considerations, but this would provide a lot of value. My use case is tracking people's heads, and the standard interpolation is less useful (but still much better than nothing!) due to heads 'bobbing' while walking etc. Feature tracking would likely handle this in many situations (aside from when a head is occluded etc.).

Aside from the UI considerations, this is pretty easy to implement. It can even be done in the browser with opencv.js (with suitable performance, depending on the device).
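
For illustration, the core of such a tracker, shown with OpenCV's Python API since the same tracker implementations are what opencv.js would expose (a sketch assuming opencv-contrib-python; tracker constructors have moved between cv2 and cv2.legacy across versions):

import cv2

cap = cv2.VideoCapture("video.mp4")     # placeholder input
ok, frame = cap.read()

tracker = cv2.TrackerKCF_create()       # KCF: a fast correlation-filter tracker
tracker.init(frame, (100, 50, 64, 64))  # the last annotated box (x, y, w, h)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, box = tracker.update(frame)  # propose a box for this frame
    if not found:
        break                           # hand control back to the annotator
    print(box)                          # candidate keyframe for user review
cap.release()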

Mechanical Turk Integration

Integration of CVAT with MTurk, for deploying work as HITs, would be very useful for such projects. This would require integrating the Turkic framework from VATIC with CVAT.
I would also like to contribute to your project. Please help me set up the development environment for this.

Re-id app to merge bboxes into tracks after TF annotation

Hi, great tool. For ground-truth annotation there are often many objects in every frame, and it would be tremendously tedious to annotate a track for every single one of them. Is there a pre-trained model, or a way to run a custom model, that can detect possibly identical objects, so that all I have to do is review and merge their tracks/IDs?
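
For illustration, the simplest version of that linking step: merge per-frame detections into tracks by IoU between consecutive frames (a sketch; a real re-id model would add appearance features on top):

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def link_tracks(frames, threshold=0.5):
    """frames: list of per-frame box lists. Returns (track_id, box) pairs per frame."""
    linked, next_id, prev = [], 0, []
    for boxes in frames:
        cur = []
        for box in boxes:
            best = max(prev, key=lambda p: iou(p[1], box), default=None)
            if best is not None and iou(best[1], box) >= threshold:
                cur.append((best[0], box))  # continue an existing track
            else:
                cur.append((next_id, box))  # start a new track
                next_id += 1
        linked.append(cur)
        prev = cur
    return linked

# Two frames, one object moving slightly: both detections share track id 0.
print(link_tracks([[(0, 0, 10, 10)], [(1, 1, 11, 11)]]))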

thanks
