pmorerio / admd
TensorFlow code for the paper 'Learning with privileged information via adversarial discriminative modality distillation', TPAMI 2019
License: MIT License
Hi,
I would like to reproduce the results of your paper on the NTU dataset, but I don't know how the dataset should be organized.
Could you please share the NTU file layout? :)
Thanks.
Hi,
Thanks for releasing the code! I have a question regarding the training datasets. Why do you use a different dataset in each training step, and even different datasets within step 1? For example, in step 1 the RGB stream is trained on NTU and the depth stream on NWUCLA, and step 2 uses NWUCLA to train the hallucination network. Why not use a single dataset for all the steps?
I always get this:
--Do not forget to rename the variables in the orginal resnet checkpoint if you are training for the first time
---Run rename_ckpt.sh
Loading pretrained rgb/resnet50...
[]
Traceback (most recent call last):
File "/media/seb/SSD2/master/comparaison/admd/NYUD/main.py", line 19, in
solver.train_single_stream(modality='rgb')
File "/media/seb/SSD2/master/comparaison/admd/NYUD/solver.py", line 143, in train_single_stream
restorer=tf.compat.v1.train.Saver(variables_to_restore)
File "/home/seb/anaconda3/envs/radar/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 825, in init
self.build()
File "/home/seb/anaconda3/envs/radar/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 837, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/seb/anaconda3/envs/radar/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 862, in _build
raise ValueError("No variables to save")
ValueError: No variables to save
Yes, I have run rename_ckpt.sh successfully. So variables_to_restore contains no variables to save right from the beginning.
Hi,
This is about the NYU classification task using only RGB images.
I would like to know whether you modified the ResNet-50, which normalization you used (I'm pretty sure it's the ImageNet one) and what kind of data augmentation you used during training. In PyTorch I'm not able to get close to your results (with RGB, for example, I get an accuracy of 45%). I'm using a ResNet-50 pretrained on ImageNet; during training I apply the ImageNet normalization and data augmentation (random horizontal flip and random crop).
I tried to find this information in your code, but not being used to TensorFlow I'm really lost.
Thank you for your help!
I'm trying to reimplement your code according to your paper, but I have trouble understanding the difference between train_hallucination and train_hallucination_p2. In your paper it is written: "The discriminator also features an additional classification task, i.e. not only it is trained to discriminate between hallucinated and depth features, but also to assign samples to the correct class." But in your code, the discriminator loss in train_hallucination only uses tf.reduce_mean(tf.square(self.logits_fake - tf.zeros_like(self.logits_fake))), which discriminates between hallucinated and depth features, while train_hallucination_p2 uses tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=self.logits_real, labels=tf.one_hot(self.labels, self.no_classes + 1))), which assigns samples to the correct class. I don't understand why there isn't a single loss merging those two parts.
Would it be possible for you to explain in more detail how the losses in train_hallucination are calculated for both the generator and the discriminator? Thank you.
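For readers comparing the two expressions quoted above, here is a minimal pure-Python sketch of what each one computes on its own. It is not the author's combined loss; the logit values and the class count are made up for illustration, and the helper names are hypothetical.

```python
import math

def least_squares_loss(logits, target):
    """Mean squared difference between each logit and a constant target."""
    return sum((x - target) ** 2 for x in logits) / len(logits)

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample's logits against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

no_classes = 3  # hypothetical number of real classes

# Real/fake term: mirrors mean(square(logits_fake - zeros_like(logits_fake)))
adv_loss = least_squares_loss([0.5, -0.5], target=0.0)

# Classification term over no_classes + 1 outputs, as in one_hot(labels, no_classes + 1)
cls_loss = softmax_cross_entropy([0.0] * (no_classes + 1), label=1)

print(adv_loss, cls_loss)
```

The first term only looks at how far the logits are from a fixed target, while the second depends on the class label, which is why the two can be written (and optimized) separately.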
I just want to check that I use the right features to train the hallucination network. I use the feature map coming out of the first FC layer of the sequential fully connected module (the output of (0) in the third image). Is this equivalent to your TensorFlow code? According to your article it is the same thing.
See the following images:
Thank you for your reply. I have downloaded the dataset and noticed that there are three kinds of depth data in a folder. Could you tell me which one you use? They are named "_depth.png", "_depth_vis.jpg" and "_maprgbd.png".
Originally posted by @punknownq in #16 (comment)
Hi, can you provide the download link for the Northwestern-UCLA dataset?
So loading rgb/resnet50 succeeds, but when loading the pretrained depth/resnet50 I get the following error: The passed save_path is not a valid checkpoint: model/depth.
I'm running exactly the same code as you, with no modification except the paths for the data and models.
But in the folder model/resnet50 I have:
resnet_v1_50.ckpt,
resnet_v1_50_rgb.ckpt.data-00000-of-00001,
resnet_v1_50_rgb.ckpt.index,
resnet_v1_50_rgb.ckpt.meta
resnet_v1_50_depth.ckpt.data-00000-of-00001,
resnet_v1_50_depth.ckpt.index,
resnet_v1_50_depth.ckpt.meta
Interestingly the single stream with the depth is working perfectly fine.
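As a diagnostic aid, the sketch below lists which paths in a folder look like restorable checkpoint prefixes. It assumes checkpoints saved in the index/data format, where the path handed to the saver is the common prefix without the .index or .data-... suffix. The helper checkpoint_prefixes is hypothetical and not part of the repository, and the folder is recreated with empty placeholder files named as in the listing above.

```python
import os
import tempfile

def checkpoint_prefixes(folder):
    """Return the paths in `folder` that have a matching `.index` file."""
    return sorted(
        os.path.join(folder, name[: -len(".index")])
        for name in os.listdir(folder)
        if name.endswith(".index")
    )

# Recreate empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    for name in [
        "resnet_v1_50.ckpt",
        "resnet_v1_50_rgb.ckpt.index",
        "resnet_v1_50_rgb.ckpt.meta",
        "resnet_v1_50_depth.ckpt.index",
        "resnet_v1_50_depth.ckpt.meta",
    ]:
        open(os.path.join(folder, name), "w").close()
    prefixes = [os.path.basename(p) for p in checkpoint_prefixes(folder)]

print(prefixes)
```

Under that assumption, a path such as model/depth has no model/depth.index next to it, so comparing the path in the error message with the prefixes printed here may help locate the mismatch.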
Following up on issue #5 (now the problem is with loading the pretrained weights):
I'm able to create a tensor and load the values contained in the checkpoint, but it seems I still have some problems.
Code 1 (saving the pretrained weights with NumPy; the line variables_to_restore = [vv for vv in variables_to_restore if 'conv1' not in vv.name] is not active):
for var, val in zip(vars, vars_vals):  # save the pretrained weights with NumPy
    if 'rgb/resnet_v1_50/conv1/weights' in var.name:
        np.save('weights_conv1', val)
Code 2 (assigning the pretrained weights):
vars = tf.trainable_variables()
vars_vals = sess.run(vars)
for var, val in zip(vars, vars_vals):
    if 'rgb/resnet_v1_50/conv1/weights' in var.name:
        numpy_tensor = np.load('/media/seb/SSD2/master/comparaison/admd/NYUD/weights_conv1.npy')
        t1 = tf.convert_to_tensor(numpy_tensor, dtype=tf.dtypes.float32)
        # the 4th channel
        t2 = tf.random.uniform([7, 7, 1, 64], minval=0, maxval=None, dtype=tf.dtypes.float32)
        # concatenate on the channel axis to obtain a [7, 7, 4, 64] tensor
        res = tf.concat(axis=2, values=[t1, t2])
        var.assign(res)
So I ran these checks:
numpy_tensor = np.load('/media/seb/SSD2/master/comparaison/admd/NYUD/weights_conv1.npy')
t1 = tf.convert_to_tensor(numpy_tensor, dtype=tf.dtypes.float32)
with tf.Session() as sess:
    print(sess.run(t1))
This prints exactly the expected values, i.e. the pretrained weights. So that's OK.
But after executing the block Code 2 to assign the pretrained weights to the randomly initialized convolution, I do:
vars = tf.trainable_variables()
for var, val in zip(vars, vars_vals):
    if 'rgb/resnet_v1_50/conv1/weights' in var.name:
        print(val)
I get exactly the randomly initialized weights from before that block of code and not the pretrained ones... the assign doesn't have any effect on the trainable variables.
Since the assign seems to have no effect on the trainable variables, the problem must be with:
vars = tf.trainable_variables()
vars_vals = sess.run(vars)
According to someone on Stack Overflow, adding ass=sess.run(var) and sess.run(ass) at the end of block Code 2 resolves that; the transfer is OK for the weights of conv1.
Or do you think that variables_to_restore = [vv for vv in variables_to_restore if 'conv1' not in vv.name] also affects variables like rgb/resnet_v1_50/block3/unit_3/bottleneck_v1/conv1/weights:0 (float32_ref 1x1x1024x256) [262144, bytes: 1048576] because of the conv1 in their name? I changed it to variables_to_restore = [vv for vv in variables_to_restore if 'rgb/resnet_v1_50/conv1/weights' not in vv.name], so the other weights are supposed to be exactly the same, and it's working. But I still have weird things happening...
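The difference between the two filters can be checked on plain strings, without TensorFlow. The sketch below uses the two variable names quoted above plus one illustrative name (the conv2 entry) added for contrast.

```python
# Variable names: the first two are quoted in the question, the third is illustrative.
names = [
    "rgb/resnet_v1_50/conv1/weights:0",
    "rgb/resnet_v1_50/block3/unit_3/bottleneck_v1/conv1/weights:0",
    "rgb/resnet_v1_50/block3/unit_3/bottleneck_v1/conv2/weights:0",
]

# Broad filter: drops every name containing the substring 'conv1'.
broad = [n for n in names if "conv1" not in n]

# Narrow filter: drops only the first convolution of the network.
narrow = [n for n in names if "rgb/resnet_v1_50/conv1/weights" not in n]

print(len(broad), len(narrow))
```

With the broad filter, the bottleneck conv1 variable is excluded from the restore list as well, so it would keep its random initialization instead of the pretrained values.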
Hi,
I have a question about the discriminator design of the ADMD method.
In the ADMD discriminator you use label 0 for real and label 1 for generated/fake samples. Usually, I have seen people use label 1 for real and label 0 for fake samples. Why is it the other way around in ADMD?
Also, for the extended label vector, zeros(C) is used for RGB inputs. Why zeros(C) for RGB and the true y label for depth inputs? What does the zeros(C) signify?
Thanks,
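One way to picture the extended label vector is sketched below. This is only a reading of the question, not something confirmed by the author: it assumes C + 1 outputs where the last slot marks a generated sample, so a depth input gets the one-hot vector of its true class and an RGB (hallucinated) input gets zeros(C) followed by a 1. The class count and label are made up.

```python
def one_hot(index, depth):
    """Plain-Python one-hot vector of length `depth`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]

no_classes = 4   # hypothetical number of real classes C
true_label = 2   # hypothetical class of a depth sample

# Depth (real) sample: true class in the first C slots, 0 in the extra slot.
real_target = one_hot(true_label, no_classes + 1)

# Hallucinated (fake) sample: zeros(C) in the class slots, 1 in the extra slot.
fake_target = one_hot(no_classes, no_classes + 1)

print(real_target)
print(fake_target)
```

Under this reading, the "0 for real, 1 for fake" convention is simply the value of the extra slot, and zeros(C) says that a generated sample belongs to none of the real classes.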
Hello Pietro,
the checkpoint linked in your repo is ResNet-50 v2: http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz
In your paper and in your code you mention using ResNet-50 v1.
I am able to run your code using the v1 model (resnet_v1_50_2016_08_28.tar.gz), which I downloaded directly from the TensorFlow webpage.
Unfortunately I did not manage to run your code with the resnet_v2_50_2017_04_14.tar.gz model.
Could you maybe clarify which one is the correct model?
Hi, I need to change the first convolution of the model from rgb/resnet_v1_50/conv1/weights:0 (float32_ref 7x7x3x64) to rgb/resnet_v1_50/conv1/weights:0 (float32_ref 7x7x4x64), so basically increasing the number of input channels from 3 to 4 to accept 4-channel images while keeping the pretrained weights elsewhere (only the additional channel is initialized randomly).
Do you have an idea of how to do that in TensorFlow (I'm more of a PyTorch guy...)?
In PyTorch I do:
net = model.resnet50(num_classes=dataset_train.num_classes(), pretrained=True)
new_conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
conv1 = net.conv1
with torch.no_grad():
    new_conv1.weight[:, :3, :, :] = conv1.weight
    new_conv1.bias = conv1.bias
net.conv1 = new_conv1
Thanks a lot for your help!
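The weight surgery itself can be sketched framework-independently with NumPy. Going by the 7x7x3x64 and 7x7x4x64 shapes quoted above, the input-channel axis is axis 2 (PyTorch keeps it on axis 1). The random array below is only a stand-in for the pretrained kernel, and the 0.01 scale for the new channel is an arbitrary choice; loading the result into the actual variable is not covered here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained conv1 kernel, laid out as [height, width, in, out].
pretrained = rng.standard_normal((7, 7, 3, 64)).astype(np.float32)

# Randomly initialized weights for the additional (4th) input channel.
extra = rng.normal(0.0, 0.01, size=(7, 7, 1, 64)).astype(np.float32)

# Concatenate along the input-channel axis to get a [7, 7, 4, 64] kernel.
extended = np.concatenate([pretrained, extra], axis=2)

print(extended.shape)
```

The first three input channels of extended are exactly the pretrained values, so only the new channel starts from scratch.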
I don't understand this line: fake_labels = self.labels + self.no_classes - self.labels. This simplifies to fake_labels = self.no_classes, so why write fake_labels = self.labels + self.no_classes - self.labels?
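One possible reading, which is an assumption and not confirmed by the author: if self.labels is a batch of labels, the expression yields a vector with the same shape as the batch, filled with no_classes, rather than a single scalar. The NumPy sketch below shows the same broadcasting behaviour with made-up labels.

```python
import numpy as np

labels = np.array([3, 0, 7, 1])  # hypothetical batch of class labels
no_classes = 10                  # hypothetical number of real classes

# Same arithmetic as the quoted line: the labels cancel out element-wise.
fake_labels = labels + no_classes - labels

print(fake_labels)
```

Under that reading the line is equivalent to np.full_like(labels, no_classes), i.e. one "fake" label per sample in the batch.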