Comments (6)

geyuying commented on July 17, 2024
  1. You are correct. Just re-use the RoI Pooled features from the mask head: after the second stage, the features from RoI Align already contain mask information. We tried using features from other layers, but got worse performance.

from deepfashion2.

geyuying commented on July 17, 2024

-- Conv1: 3x3 conv - 256 channels -> ReLU
-- Conv2: 3x3 conv - 256 channels -> ReLU
-- Conv3: 3x3 conv - 256 channels -> ReLU
-- Conv4: 3x3 conv - 1024 channels -> ReLU
-- Pooling: GlobalAvgPool
-- ReLU
-- FC: 1024 to 256 channels (no ReLU) + BN
Besides, the similarity learning net has:
-- Subtraction (output 256 channels)
-- Element-wise square (output 256 channels)
-- FC: 256 to 2 channels (no ReLU). The first channel means similarity, the second channel means difference. Positive pair label: (1, 0); negative pair label: (0, 1).
-- Softmax function.
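
The similarity learning net listed above can be sketched in a few lines of NumPy. This is only an illustration of the four steps (subtraction, element-wise square, FC 256 to 2, softmax), not the authors' code: the function name `similarity_head` and the randomly initialised `W` and `b` are assumptions made for the example.

```python
import numpy as np

def similarity_head(feat_user, feat_shop, weight, bias):
    """Map pairs of 256-d features to (similarity, difference) probabilities."""
    diff = feat_user - feat_shop                 # subtraction, shape (N, 256)
    sq = diff ** 2                               # element-wise square, shape (N, 256)
    logits = sq @ weight + bias                  # FC 256 -> 2, no ReLU
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilise the softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax over the 2 channels

rng = np.random.default_rng(0)
user = rng.standard_normal((4, 256))             # 4 user-side features
shop = rng.standard_normal((4, 256))             # 4 shop-side features
W = rng.standard_normal((256, 2)) * 0.01         # hypothetical FC weights
b = np.zeros(2)                                  # hypothetical FC bias

probs = similarity_head(user, shop, W, b)
print(probs.shape)                               # one (similar, different) row per pair
```

Channel 0 of `probs` is the similarity score, so a positive pair with label (1, 0) should push that channel towards 1 during training.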

xwjabc commented on July 17, 2024

Thank you for your great help! Besides, I have two more questions:

  1. In the first version of your answer about the matching network, I noticed several tile operations:
INFO net.py: 263: self1 : (64, 256) => self_user : (8, 8, 256) ------- (op: Reshape)
INFO net.py: 263: self_user : (8, 8, 256) => self_user_ : (8, 8, 256) ------- (op: Transpose)
INFO net.py: 263: self_user_ : (8, 8, 256) => self_user_after : (64, 256) ------- (op: Reshape)
INFO net.py: 263: self_user_after : (64, 256) => self_user_after_ : (512, 256) ------- (op: Tile)
INFO net.py: 263: self2 : (64, 256) => self_shop_before : (64, 2048) ------- (op: Tile)
INFO net.py: 263: self_shop_before : (64, 2048) => self_shop : (512, 256) ------- (op: Reshape)

Could you explain how the tile function is used here?
Besides, I see that the final output has shape (512, 2). However, according to the discussion, we should have 4096 pairs (512 positive pairs and 3584 negative pairs), which would lead to a shape of (4096, 2). What is the reason for this gap?

  2. In the retrieval evaluation, does Match R-CNN compare the user instance with all shop instances, or only with the shop instances that have the same predicted class as the user instance?
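
For reference, the shapes in the log from question 1 can be reproduced with NumPy. This is one possible reading of the log, not the original net.py: the grouping into 8 images x 8 instances and the transpose axis order `(1, 0, 2)` are assumptions, since the log only records shapes.

```python
import numpy as np

n_img, n_inst, dim = 8, 8, 256     # assumed grouping: 8 images x 8 instances

# Stand-in features; the real values come from the feature extractor.
self1 = np.arange(64 * dim, dtype=np.float32).reshape(64, dim)   # user side
self2 = self1 + 0.5                                              # shop side

self_user = self1.reshape(n_img, n_inst, dim)          # (8, 8, 256)  Reshape
self_user_ = self_user.transpose(1, 0, 2)              # (8, 8, 256)  Transpose (axes assumed)
self_user_after = self_user_.reshape(64, dim)          # (64, 256)    Reshape
self_user_after_ = np.tile(self_user_after, (8, 1))    # (512, 256)   Tile along rows

self_shop_before = np.tile(self2, (1, 8))              # (64, 2048)   Tile along columns
self_shop = self_shop_before.reshape(512, dim)         # (512, 256)   Reshape

# Row k pairs user row k % 64 with shop row k // 8.
pairs = {(k % 64, k // 8) for k in range(512)}
print(len(pairs), 64 * 64)                             # pairs built vs. all possible pairs
```

Under these assumptions each shop instance is paired with 8 user instances, which gives 512 rows rather than the 4096 rows of a full 64 x 64 pairing.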

geyuying commented on July 17, 2024
  1. 4019 is proper. In our experiment, in order to reduce the number of pairs, we do not use all pairs.
  2. We compare the user instance with all shop instances.
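
Comparing a user instance with all shop instances means scoring every (user, shop) pair with the similarity learning net. A minimal NumPy sketch of such an all-pairs scorer is shown here, processed in chunks so that memory stays bounded. It is an illustration only, not the evaluation code used for the paper: `score_all_pairs`, the `chunk` parameter and the random weights are hypothetical.

```python
import numpy as np

def score_all_pairs(user, shop, weight, bias, chunk=128):
    """Return the similarity probability for every (user, shop) pair."""
    out = np.empty((user.shape[0], shop.shape[0]), dtype=np.float64)
    for start in range(0, user.shape[0], chunk):
        u = user[start:start + chunk]                      # (c, 256)
        sq = (u[:, None, :] - shop[None, :, :]) ** 2       # (c, n_shop, 256)
        logits = sq @ weight + bias                        # (c, n_shop, 2)
        logits = logits - logits.max(axis=-1, keepdims=True)
        exp = np.exp(logits)
        out[start:start + chunk] = exp[..., 0] / exp.sum(axis=-1)  # channel 0 = similarity
    return out

rng = np.random.default_rng(0)
user = rng.standard_normal((5, 256))          # 5 user instances (toy size)
shop = rng.standard_normal((7, 256))          # 7 shop instances (toy size)
W = rng.standard_normal((256, 2)) * 0.01      # hypothetical FC weights
b = np.zeros(2)                               # hypothetical FC bias

scores = score_all_pairs(user, shop, W, b, chunk=2)
ranking = np.argsort(-scores, axis=1)         # shop indices, best match first, per user
print(scores.shape, ranking.shape)
```

The cost grows with the product of the number of user and shop instances, so the 256-d features are best extracted once and cached, leaving only the small head to run per pair.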

xwjabc commented on July 17, 2024

Thank you for your great help! In my current implementation, I use the mask features after RoIAlign in the mask branch. However, the number of instances in the mask features is limited (1~2 instances per gt garment (unique pair_id + style) in total at the beginning of the training). Thus, I wonder how you can generate 8 instances per image for the retrieval task? Thx!

joppichristian commented on July 17, 2024
  1. 4019 is proper. In our experiment, in order to reduce the number of pairs, we do not use all pairs.
  2. compare the user instance with all shop instances

How did you compare every user instance with all shop instances? That means an enormous number of comparisons. I have 4x Titan RTX and tqdm estimates 6000 hours to complete the evaluation. Have I missed something?