Comments (8)

stephenpiet commented on June 9, 2024 4

You should try using record_confidence=True, it worked for me!

from autodistill.

nguyenthekhoig7 commented on June 9, 2024 2

I just encountered the same issue and found the reason, but not yet a solution.

The code gets stuck when splitting the image and label files into the train/valid folders: it cannot find the label file generated in the previous step. I think this happens because Google Colab sometimes cannot read a file immediately after writing it, perhaps because of a network speed issue.

I worked around it by manually moving the images and labels from the images and annotations folders into train and valid, which works for small datasets. For regular and large datasets, I am still looking for a solution.
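The manual move described above could be scripted roughly as follows. This is a minimal sketch, not Autodistill's own split_data(): the function name manual_split, the 80/20 ratio, the fixed seed, and the train/images and valid/images output folders are assumptions made for illustration. It treats every label file, including the confidence- file, as optional, so a missing file is skipped rather than raising FileNotFoundError.

```python
import random
import shutil
from pathlib import Path


def manual_split(base_dir, split_ratio=0.8, image_ext=".jpg", seed=0):
    """Move images and labels into train/ and valid/, skipping absent files.

    Hypothetical helper: assumes <base_dir>/images and <base_dir>/annotations
    exist, as described in the comment above.
    """
    base = Path(base_dir)
    images_dir = base / "images"
    annotations_dir = base / "annotations"

    # Shuffle the image stems deterministically, then cut into two groups.
    stems = sorted(p.stem for p in images_dir.glob(f"*{image_ext}"))
    random.Random(seed).shuffle(stems)
    cut = int(len(stems) * split_ratio)
    splits = {"train": stems[:cut], "valid": stems[cut:]}

    moved = {"train": 0, "valid": 0}
    for split, names in splits.items():
        img_out = base / split / "images"
        lbl_out = base / split / "labels"
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        for stem in names:
            candidates = [
                (images_dir / f"{stem}{image_ext}", img_out),
                (annotations_dir / f"{stem}.txt", lbl_out),
                (annotations_dir / f"confidence-{stem}.txt", lbl_out),
            ]
            for src, dst_dir in candidates:
                # Only move files that actually exist on disk.
                if src.exists():
                    shutil.move(str(src), str(dst_dir / src.name))
            moved[split] += 1
    return moved
```

Calling manual_split(DATASET_DIR_PATH) after a failed run would move whatever is still left in the images and annotations folders and return the number of images placed in each split.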

capjamesg commented on June 9, 2024 2

This issue was introduced in the latest version of Autodistill. We are working on a fix for this issue. I will message here when a fix is live. In the interim, can you try to call the .label() method with the record_confidence=False flag?

nguyenthekhoig7 commented on June 9, 2024 1

Calling .label() with the record_confidence=False flag does not fix the issue. I am running on a custom dataset.

My code:

base_model = GroundedSAM(ontology=ontology)
dataset = base_model.label(
    input_folder=IMAGE_DIR_PATH,
    extension=".jpg",
    output_folder=DATASET_DIR_PATH,
    record_confidence=False)

And, the error:

trying to load grounding dino directly
final text_encoder_type: bert-base-uncased
Labeling /content/images/PartB_00049.jpg:   0%|          | 0/6 [00:00<?, ?it/s]The `device` argument is deprecated and will be removed in v5 of Transformers.
torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
None of the inputs have requires_grad=True. Gradients will be None
The `device` argument is deprecated and will be removed in v5 of Transformers.
torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
None of the inputs have requires_grad=True. Gradients will be None
Labeling /content/images/PartB_00045.jpg: 100%|██████████| 6/6 [00:32<00:00,  5.35s/it]
Did not find /content/dataset3/annotations/confidence-PartB_00043.txt, not moving anything to /content/dataset3/train/labels
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
[/usr/lib/python3.10/shutil.py](https://localhost:8080/#) in move(src, dst, copy_function)
    815     try:
--> 816         os.rename(src, real_dst)
    817     except OSError:

FileNotFoundError: [Errno 2] No such file or directory: '/content/dataset3/annotations/confidence-PartB_00043.txt' -> '/content/dataset3/train/labels/confidence-PartB_00043.txt'

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
6 frames
[<ipython-input-10-8bbc08cfb905>](https://localhost:8080/#) in <cell line: 4>()
      2 
      3 base_model = GroundedSAM(ontology=ontology)
----> 4 dataset = base_model.label(
      5     input_folder=IMAGE_DIR_PATH,
      6     extension=".jpg",

[/usr/local/lib/python3.10/dist-packages/autodistill/detection/detection_base_model.py](https://localhost:8080/#) in label(self, input_folder, extension, output_folder, human_in_the_loop, roboflow_project, roboflow_tags, sahi, record_confidence)
    108                 output_folder + "/annotations", images_map, detections_map
    109             )
--> 110         split_data(output_folder)
    111 
    112         if human_in_the_loop:

[/usr/local/lib/python3.10/dist-packages/autodistill/helpers.py](https://localhost:8080/#) in split_data(base_dir, split_ratio)
    144         _check_move_file(images_dir, file + ".jpg", train_images_dir)
    145         _check_move_file(annotations_dir, file + ".txt", train_labels_dir)
--> 146         _check_move_file(
    147             annotations_dir, "confidence-" + file + ".txt", train_labels_dir
    148         )

[/usr/local/lib/python3.10/dist-packages/autodistill/helpers.py](https://localhost:8080/#) in _check_move_file(source_dir, source_file, dest_dir)
    139                 f"Found {os.path.join(dest_dir, source_file)} as already present, not moving anything to {dest_dir}"
    140             )
--> 141         shutil.move(os.path.join(source_dir, source_file), dest_dir)
    142 
    143     for file in train_files:

[/usr/lib/python3.10/shutil.py](https://localhost:8080/#) in move(src, dst, copy_function)
    834             rmtree(src)
    835         else:
--> 836             copy_function(src, real_dst)
    837             os.unlink(src)
    838     return real_dst

[/usr/lib/python3.10/shutil.py](https://localhost:8080/#) in copy2(src, dst, follow_symlinks)
    432     if os.path.isdir(dst):
    433         dst = os.path.join(dst, os.path.basename(src))
--> 434     copyfile(src, dst, follow_symlinks=follow_symlinks)
    435     copystat(src, dst, follow_symlinks=follow_symlinks)
    436     return dst

[/usr/lib/python3.10/shutil.py](https://localhost:8080/#) in copyfile(src, dst, follow_symlinks)
    252         os.symlink(os.readlink(src), dst)
    253     else:
--> 254         with open(src, 'rb') as fsrc:
    255             try:
    256                 with open(dst, 'wb') as fdst:

FileNotFoundError: [Errno 2] No such file or directory: '/content/dataset3/annotations/confidence-PartB_00043.txt'

I hope there is a workaround for this. Alternatively, can we call split_data() separately, outside of the .label() function?
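The traceback above suggests that split_data() asks for a confidence- file for every image even though record_confidence=False was passed. After a failed run, a check along these lines can show which expected annotation files are absent before attempting any move. It is a diagnostic sketch only: find_missing_labels and its parameters are hypothetical names, and the images/annotations layout is assumed from the paths in the error message.

```python
from pathlib import Path


def find_missing_labels(base_dir, image_ext=".jpg", expect_confidence=False):
    """Return the annotation file names that are expected but not on disk.

    Hypothetical helper: looks at images still in <base_dir>/images and
    checks <base_dir>/annotations for the matching .txt files.
    """
    base = Path(base_dir)
    annotations_dir = base / "annotations"
    missing = []
    for image in sorted((base / "images").glob(f"*{image_ext}")):
        expected = [f"{image.stem}.txt"]
        if expect_confidence:
            # Mirrors the "confidence-" + file + ".txt" name seen in the traceback.
            expected.append(f"confidence-{image.stem}.txt")
        missing.extend(
            name for name in expected if not (annotations_dir / name).exists()
        )
    return missing
```

With expect_confidence=True, a dataset labeled with record_confidence=False would be expected to report every confidence- file as missing, which matches the file named in the FileNotFoundError.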

nguyenthekhoig7 commented on June 9, 2024 1

@stephenpiet Surprisingly it worked, thank you!

duydatnguyen11 commented on June 9, 2024 1

@stephenpiet Thank you so much

capjamesg commented on June 9, 2024

Thank you, @stephenpiet! We are releasing a fix for this bug this week. Apologies for the inconvenience.

capjamesg commented on June 9, 2024

Our new release, which should fix this issue, is now live. Please run pip install -U autodistill.
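To confirm that the upgrade took effect in the current environment, the installed version can be read with the standard library. This is a small sketch; the helper name installed_version is hypothetical, and the thread does not say which version number contains the fix, so the result still has to be compared against the release notes.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("autodistill:", installed_version("autodistill"))
```

In a notebook such as Colab, the runtime may need to be restarted after upgrading before the new version is imported.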
