
Comments (18)

forbiddenname commented on August 11, 2024

> 3. **keeping custom test image file names as the names of VG-DRNET test image file names**.

HI! "keeping custom test image file names as the names of VG-DRNET test image file names", it means that setting the own image file name as the image name in test.json, like 2407892.jpg, and putting own image in the corresponding VG image folder?

Hastyrush commented on August 11, 2024

@abhijeetnijampurkar I had the same problem as you initially, but it went away after I tuned --triplet_nms and --nms. For example, when I asked the visualization tool to output the top 10 most confident triplet predictions, none of them were duplicates. --nms filters the objects during the Faster R-CNN prediction stage, while --triplet_nms filters duplicate triplets at prediction time. When I looked at the visualization code, it is mostly comparing against the ground truth to determine whether the predicted results are actually in the ground truth, and skipping the triplet if not. Hence, without ground truth you might get inaccurate label and relationship predictions, but the duplicates should be handled when you evaluate with train_FN.py. A rough sketch of the triplet-level filtering idea is below.
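Not the repo's actual code, but roughly what a --triplet_nms style filter does, as a minimal sketch (the triplet dict layout here is an assumption, not F-Net's output format):

```python
# Minimal sketch of triplet-level NMS (not F-Net's implementation):
# suppress a triplet when a higher-scoring kept triplet has the same
# labels and both of its boxes overlap heavily.
def iou(a, b):
    # Intersection-over-union of two boxes in (x1, y1, x2, y2) format.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union + 1e-8)

def triplet_nms(triplets, thresh=0.5):
    # triplets: dicts with 'labels' = (subject, predicate, object),
    # 'sub_box', 'obj_box', and a confidence 'score' (assumed layout).
    kept = []
    for t in sorted(triplets, key=lambda t: t["score"], reverse=True):
        duplicate = any(
            k["labels"] == t["labels"]
            and iou(k["sub_box"], t["sub_box"]) > thresh
            and iou(k["obj_box"], t["obj_box"]) > thresh
            for k in kept
        )
        if not duplicate:
            kept.append(t)
    return kept
```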

abhijeetnijampurkar commented on August 11, 2024

As a result, I am getting multiple detections for a single object.

@sometimething please let me know if you get the solution to this query. Thanks in advance. :)

[screenshot: multiple detections of the same object]

sometimething commented on August 11, 2024

> As a result, I am getting multiple detections for a single object.

I did not encounter such a problem. However, I think it may be because the same object has multiple predicted bounding boxes, and the IoU of each of these boxes with the ground truth is greater than the threshold you set, so one object in the image is detected multiple times. You could raise the IoU threshold during visualization, or filter the detected relationships so that the same relationship is not visualized more than once (a sketch of that filter is below).
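A minimal sketch of that relationship filter (the 'labels'/'score' fields are assumptions, not F-Net's actual output format): keep only the highest-scoring instance of each (subject, predicate, object) label triple.

```python
# Minimal sketch: visualize each label triple at most once by keeping
# only its highest-scoring instance. Field names are assumptions.
def unique_relationships(triplets):
    seen = set()
    out = []
    for t in sorted(triplets, key=lambda t: t["score"], reverse=True):
        if t["labels"] not in seen:  # e.g. ('man', 'riding', 'horse')
            seen.add(t["labels"])
            out.append(t)
    return out
```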

abhijeetnijampurkar commented on August 11, 2024

@sometimething the above results do not have corresponding ground truth, so increasing the IoU threshold against ground truth does not seem applicable here. Please correct me if I'm wrong.

fpsluozi commented on August 11, 2024

> As a result, I am getting multiple detections for a single object.

Hello there, I'm working on a similar task (generating scene graphs with custom images) and could not get it to work. Would you kindly share your code? Thanks!

abhijeetnijampurkar commented on August 11, 2024

@fpsluozi I am just replacing the VG test images with my own test images, and it gets through (though only partially, since I don't have ground truths for them). Nothing great!

fpsluozi commented on August 11, 2024

> @fpsluozi I am just replacing the VG test images with my own test images, and it gets through (though only partially, since I don't have ground truths for them). Nothing great!

I see, I will go try this out.

muraliadithya commented on August 11, 2024

@abhijeetnijampurkar I think I'm having problems with loading images, but my goal is to use the pretrained model on new images as well. I have some questions:
(i) Which dataset did you use (VRD or Visual Genome)?
(ii) Which directory did you put it in (relative to the repository)?
(iii) Where did you put your own images in order for the pretrained model to predict on them (again, relative to the repository)?
(iv) Where do the predictions get stored?
Thanks a lot!

abhijeetnijampurkar commented on August 11, 2024

@murali24adithya

  1. I'm using VG-DRNET as of now, but I have tried all three data subsets; all of them work fine for me.
  2. You can place the data anywhere, but update the directory path accordingly (as mentioned in Readme / Project settings / point 6).
  3. I replaced the original test images from the given data subset (say VG-DRNET) with custom test images, keeping the custom test image file names the same as the VG-DRNET test image file names.
  4. You have to run the following:
    CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate \
        --path_opt options/models/VRD.yaml \
        --pretrained_model output/trained_models/Model-VRD.h5
    This will predict and store the results as a pkl file in the output/ directory (a quick way to peek at that file is sketched after this list). You can then run python visualize_graph.py to visualize the results.
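For reference, a minimal sketch of inspecting that pickle file before visualizing (the file name output/testing_results.pkl and the structure of the loaded object are assumptions; check what your run actually writes):

```python
# Minimal sketch: peek at the pickled evaluation output.
# Path and structure are assumptions, not F-Net's documented format.
import pickle

with open("output/testing_results.pkl", "rb") as f:
    results = pickle.load(f)

print(type(results))
if isinstance(results, dict):
    print(list(results.keys())[:10])  # first few keys, to see the layout
elif isinstance(results, (list, tuple)):
    print(len(results), type(results[0]))
```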

Follow the instructions mentioned in README. Hope this helps!

muraliadithya commented on August 11, 2024

@abhijeetnijampurkar I think I'm missing something fairly obvious here; please help me.

> 1. I'm using VG-DRNET as of now, but I have tried all three data subsets; all of them work fine for me.

I am unable to understand where to find these data subsets. I thought they were from point 5 in the Readme / Project settings, but there are two, not three, datasets there (you mentioned three above). If you mean point 4, none of those .tgz files have an image folder inside them.

The instruction in point 6 asks us to create a symbolic link to our actual image directory under the name "F-Net/data/${Dataset}/images", if I'm understanding this correctly. Let's say I'm using the VRD data subset downloaded from point 5. It's a folder called sg_dataset, with two json files and two folders of images, sg_train_images and sg_test_images. Which of these folders should I create the symlink for?

abhijeetnijampurkar commented on August 11, 2024

@murali24adithya The three data splits I mentioned were from point 4. Download and extract them; they contain json files which store the image file names for the train and test splits.
For the dataset images, follow point 5 and download the Visual Genome images. Put all of these in the directories as mentioned; otherwise you have to configure the paths in the code.
Create the symbolic link for the outermost extracted folder, and pass an absolute path when creating the link (a sketch is below).
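If it helps, a minimal sketch of creating that link with absolute paths (the source directory is an example, not a required location; the destination follows the F-Net/data/${Dataset}/images convention from the README):

```python
# Minimal sketch: create the images symlink with absolute paths.
# The source directory below is an example; point it at wherever you
# extracted the Visual Genome images.
import os

src = os.path.abspath("/datasets/visual_genome/images")  # your image folder
dst = os.path.abspath("F-Net/data/VG/images")            # path F-Net expects

if not os.path.lexists(dst):
    os.symlink(src, dst)
```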

muraliadithya commented on August 11, 2024

@abhijeetnijampurkar Thanks a lot! I'm finally able to execute past that point now.

I am getting a CUDA out-of-memory error, though, and on issue #20 (comment) someone said they added torch.no_grad before the inference to solve this issue. Did you face the same issue, or do you happen to know where to add that code to solve it?
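For what it's worth, the usual torch.no_grad() pattern looks like this (a minimal sketch; the model and loader names are placeholders, not F-Net's actual objects):

```python
# Minimal sketch: wrap the evaluation loop in torch.no_grad() so autograd
# does not keep activations around, which commonly fixes OOM at inference.
# `model` and `loader` are placeholders, not F-Net's actual objects.
import torch

def evaluate(model, loader):
    model.eval()
    outputs = []
    with torch.no_grad():
        for batch in loader:
            outputs.append(model(batch))
    return outputs
```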

Hastyrush commented on August 11, 2024

The code is designed to calculate the recall metric across evaluation images with ground-truth labels. To run it on custom images without ground truth, I tried the following:

  1. Change the forward_evaluate() function to output the raw predictions (bounding boxes, object, subject, relationship, etc.) to the testing_results.pkl file instead of the recall metric.

  2. The visualization tool (visualize.py) compares the triplets in the testing_results.pkl file with the ground-truth annotated images. Since we don't have the ground truth, comment out the comparison and just output the top x most confident triplets ([:10], for example; see the sketch after this list).

  3. To ensure that the code can work without ground-truth annotations, comment out all lines that load the ground-truth labels in the dataloader script, as well as in the graph visualization tool.

  4. @abhijeetnijampurkar If you are seeing multiple duplicated objects and triplets being detected and visualized, I recommend tuning the --triplet_nms and --nms parameters when doing evaluation to threshold out the duplicates as appropriate.
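A minimal sketch of step 2 (the array names and layout are assumptions, not what forward_evaluate() actually returns):

```python
# Minimal sketch of step 2: drop the ground-truth comparison and just
# keep the k most confident triplets. Array names/layout are assumed.
import numpy as np

def top_k_triplets(boxes, labels, scores, k=10):
    order = np.argsort(scores)[::-1][:k]  # indices of the top-k scores
    return boxes[order], labels[order], scores[order]
```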

Best of luck!

abhijeetnijampurkar commented on August 11, 2024

@Hastyrush I tried playing with the --nms parameter, but the results didn't improve. I noticed that in visualize.py the raw predictions are read from the pkl file and filtered by IoU against the ground-truth boxes, pruning out nearly identical regions. Given the ground truth, this filtering yields a minimal, relevant set of regions for visualization.

But since evaluation on custom images won't have GT, I was planning some way of identifying similar boxes among the predictions themselves (one rough possibility is sketched below). I would be glad to hear your ideas on this.

PS: I am working on a smaller subset of object and relation classes.
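One possibility, sketched roughly (the box/label/score layout is an assumption, not F-Net's actual output): greedy class-aware suppression over the predictions themselves, so near-identical boxes of the same class collapse to the single highest-scoring one.

```python
# Rough sketch of identifying similar boxes without GT: greedy
# class-aware suppression over the model's own predictions.
# Box/label/score layout is assumed, not F-Net's actual output.
def dedup_boxes(boxes, labels, scores, thresh=0.7):
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / float(union + 1e-8)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(labels[i] != labels[j] or iou(boxes[i], boxes[j]) <= thresh
               for j in kept):
            kept.append(i)
    return kept  # indices of the boxes to keep
```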

zzrcc commented on August 11, 2024

Does anyone have the scene graphs for the COCO dataset? If so, could you publish them? Thanks a lot.

janick187 commented on August 11, 2024

Hi @Hastyrush, regarding your steps above for running on custom images without ground truth: would you mind sharing your custom implementation / the adjusted F-Net? I am looking for exactly this, generating a scene graph on a custom image. Many thanks.

fomalhaut-b commented on August 11, 2024

@Hastyrush I second @janick187's comment above. Sharing that would be immensely useful!
