zhongdao / gcn_clustering
Code for CVPR'19 paper Linkage-based Face Clustering via GCN
License: MIT License
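Hello, the Baidu Netdisk link you provided for downloading the features always shows "download requesting" when I try it, but the download never starts. Why is that?
Could you please tell me where to find the images corresponding to features.zip?
I wonder whether the author has tested on Ms-Celeb-1M separately. If so, what were the results, and how long did it take?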
Thank you very much for your inspiring work!
As suggested in the paper, "In the testing phase, it is not necessary to keep the same configuration with the training phase.", setting k_at_hop=[20,5] in test.py is reasonable for fast testing. But the pred and loss seem to become nan if k_at_hop=[200,10]. May I ask whether this phenomenon can be reproduced on your side, and why nan occurs?
Hi! Could you share the list for the 'IJB-B' dataset?
I wonder why the first iteration does not use the threshold in connected_components_constraint(vertex, max_sz), called from graph_propagation() in graph.py.
Is the experiment in the paper also based on this setting?
I find that the first iteration's 'remain' result may be empty, in which case the code does not run the next iteration and the final clustering result has nothing to do with the model's predictions.
For the first iteration:
'vertex' contains all the node pairs (links/edges) generated by KNN, which are also the model's input data,
and the code directly uses all of these links (neighbors = n.links, line 69) to create groups/clusters, much like a BFS, rather than filtering them by the scores the model predicts. Is that right?
I would be very grateful if you could provide suggestions.
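To make the distinction in the question above concrete, here is a minimal sketch of BFS grouping with and without score filtering. It is not the repository's graph.py; the function name `connected_components`, its arguments, and the toy edges and scores are all hypothetical.

```python
from collections import deque

def connected_components(num_nodes, edges, scores=None, threshold=None):
    """Group nodes by BFS. If a threshold is given, ignore edges scoring below it."""
    adjacency = {i: set() for i in range(num_nodes)}
    for idx, (a, b) in enumerate(edges):
        if threshold is not None and scores[idx] < threshold:
            continue  # drop low-confidence links before grouping
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in range(num_nodes):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components

# Toy chain graph: the link (2, 3) has a low (hypothetical) predicted score.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
scores = [0.9, 0.8, 0.1, 0.95]

unfiltered = connected_components(5, edges)
filtered = connected_components(5, edges, scores, threshold=0.5)
print(unfiltered)  # [[0, 1, 2, 3, 4]]
print(filtered)    # [[0, 1, 2], [3, 4]]
```

Without a threshold, every KNN link joins its endpoints, so the predicted scores play no role in the grouping; with a threshold, the weak link splits the chain into two clusters.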
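Hello, what specifically distinguishes the approach in this paper, or the GCN it uses, from GraphSAGE? Could you explain the specific points of novelty? Thanks.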
Hello, when I try to train the model, I get the following errors.
How can I solve this?
The details:
Current lr 0.01
Traceback (most recent call last):
  File "train.py", line 165, in <module>
    main(args)
  File "train.py", line 64, in main
    train(trainloader, net, criterion, opt, epoch)
  File "train.py", line 81, in train
    for i, ((feat, adj, cid, h1id), gtmat) in enumerate(loader):
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 280, in __next__
    idx, batch = self._get_batch()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 259, in _get_batch
    return self.data_queue.get()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/Queue.py", line 168, in get
    self.not_empty.wait()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/threading.py", line 340, in wait
    waiter.acquire()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 178, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 31687) is killed by signal: Killed.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 71, in _worker_manager_loop
    r = in_queue.get()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/multiprocessing/queues.py", line 378, in get
    return recv()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 22, in recv
    return pickle.loads(buf)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/pickle.py", line 1388, in loads
    return Unpickler(file).load()
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 68, in rebuild_storage_fd
    fd = multiprocessing.reduction.rebuild_handle(df)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/multiprocessing/reduction.py", line 155, in rebuild_handle
    conn = Client(address, authkey=current_process().authkey)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/home/xiaozhenzhen/anaconda2/envs/gcn_pytorch/lib/python2.7/multiprocessing/connection.py", line 432, in answer_challenge
    message = connection.recv_bytes(256) # reject large message
IOError: [Errno 104] Connection reset by peer
Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f77d1c11650>> ignored
Is this classification or clustering? Why does feeder.py contain "self.labels = np.load(label_path)"?
Where does this label come from?
Where are the labels?
Hello, when I was trying to train the model with your example, I found that the predicted edges all became zeros (no edge is predicted to be true). Have you ever encountered this situation?
This usually occurs after 100 batches of training, with the following args.
k_at_hop = [200, 10]
active_connection = 10
batch_size = 16
momentum = 0.9
weight_decay = 1e-5
lr = 1e-5
Hi, I have a question regarding the feature extraction, as I cannot reproduce the results with my own preprocessed files. Given IJB-B-512, your checkpoint for CASIA, and a PyTorch implementation of ArcFace, I came up with the following code:
import numpy as np
from tqdm import tqdm
import torch
import torch.nn as nn
import torchvision.transforms as transforms
from torchvision import datasets
from model import Backbone

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.Resize(size=(112, 112)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

data_path = "../data/IJB-B-512/"
batch_size = 16
num_workers = 16
data = datasets.ImageFolder(data_path, transform=transform)
loader = torch.utils.data.DataLoader(data,
                                     batch_size=batch_size,
                                     num_workers=num_workers,
                                     shuffle=True,
                                     pin_memory=True)

model = Backbone(50, 0.6, 'ir_se')
ckpt = torch.load("../pretrained/model_ir_se50.pth")
model.load_state_dict(ckpt)
model.cuda()
model.eval()

features = []

def hook(module, input, output):
    N, C, H, W = output.shape
    output = output.reshape(N, C, -1)
    features.append(output.mean(dim=2).cpu().detach().numpy())

handle = model._modules['body'][23].res_layer[5].fc2.register_forward_hook(hook)
for i_batch, inputs in tqdm(enumerate(loader), total=len(loader)):
    _ = model(inputs[0].cuda())
features = np.concatenate(features)
handle.remove()
Could you please let me know whether my approach makes sense and how it differs from yours? Alternatively, could you kindly share your pre-processing module?
Hello, I want to know why you use a random subset instead of the whole of CASIA.
Additionally, could you provide the list for the random subset?
Thank you very much.
Hi, you've done excellent work! Could you please share the IJB-B dataset (including protocols)? Thanks a lot!
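I set batch_size=64. Normally training completes in 1 s, but sometimes it takes 1 minute, which does not seem normal.
Your work seems to use the insight from "Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition", right? This is especially obvious in the kNN part, and it also uses their label propagation.
Does "identities" refer to the number of classes? The first test set's 512.labels contains more than 512 integers. What do the values in the labels file mean?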
Hello, I saw that you proposed a novel feature aggregation method, "Attention Aggregation" (from 3.3), and said that the elements of G are generated by a 2-layer MLP model. How is this matrix trained? Could you provide more detailed information? Thank you. (I did not find the related source code. Did I miss something?)
Thanks for your work.
I would like to know whether the features in features.zip were extracted with ArcFace or with the resnet-101 you trained yourself.
The dataset that I found was not in npy format, which confuses me.
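Hello, following the steps in the paper, I sampled 5,000 classes from the CASIA dataset (220,000 samples in total) as the training set and used the 512, 1024, and 1845 datasets as test sets. The model I trained with the default parameters of train.py does not reach the performance of the best.ckpt model at test time. Compared with best.ckpt, my precision is below 0.8. Do you have any suggestions?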
Hello, when constructing the knn graph at test time, I use a statement similar to the following:
result, dists = flann.nn(dataset, testset, 201, algorithm="kmeans", branching=32, iterations=7, checks=16)
However, the dataset and testset I use differ in size; dataset contains testset.
The following problem occurs during testing:
IndexError: index 6209 is out of bounds for axis 0 with size 3368.
The line that raises the error is: hops[-1].update(set(self.knn_graph[h][1:self.k_at_hop[d]+1]))
Can this problem be solved? Thanks!
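One plausible reading of the error above is that the neighbor indices refer to rows of the larger dataset while the arrays being indexed only have as many rows as testset. The sketch below illustrates that mismatch with a brute-force NumPy search standing in for flann; `knn_indices` and the random data are hypothetical, and this is not a confirmed diagnosis of the issue.

```python
import numpy as np

def knn_indices(reference, queries, k):
    """For each query row, return the indices of its k nearest reference rows."""
    # Squared Euclidean distance between every query and every reference row.
    d2 = ((queries[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
dataset = rng.normal(size=(50, 8))   # larger set
testset = dataset[:20]               # subset that is actually clustered

# Indices here point into dataset, so they may exceed len(testset) - 1.
mixed = knn_indices(dataset, testset, k=5)
# Indices here point into testset, so they are always valid rows of testset.
own = knn_indices(testset, testset, k=5)

print(mixed.max(), len(testset))
print(own.max(), len(testset))
```

If the graph is later used to index an array with len(testset) rows, only the second form is guaranteed to stay in bounds.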
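At test time, if the test set size and the batch size are such that the last batch contains only one image, an IndexError "invalid index to scalar variable" is raised.
This is caused by line 154 of test.py:
node_list = node_list.long().squeeze().numpy()
bs = feat.size(0)
In this case the resulting node_list degenerates into a one-dimensional array, which causes the indexing error.
An extra check can be added:
node_list = node_list.long().squeeze().numpy()
bs = feat.size(0)
if bs == 1:
    node_list = np.array([node_list])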
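The degenerate case described above can be reproduced with NumPy alone. This is a small sketch that assumes node_list has shape (batch, 1, n); the actual shape in test.py may differ, and the variable names are illustrative.

```python
import numpy as np

# Assumed layout: (batch, 1, n). squeeze() removes every size-1 axis,
# so a batch of one also loses its batch axis.
batch_of_two = np.arange(12).reshape(2, 1, 6)
batch_of_one = np.arange(6).reshape(1, 1, 6)

print(batch_of_two.squeeze().shape)  # (2, 6)
print(batch_of_one.squeeze().shape)  # (6,)

squeezed = batch_of_one.squeeze()
# Indexing squeezed[0] now yields a scalar, and indexing that scalar fails.
# Restoring the batch axis keeps the per-sample indexing valid:
fixed = np.atleast_2d(squeezed)
print(fixed.shape)  # (1, 6)
```

np.atleast_2d leaves arrays that are already two-dimensional unchanged, so it can be applied to every batch without an explicit bs == 1 check.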
Hello, I used another network to extract features from CASIA and trained your GCN on them.
I kept the same parameter settings and found that the accuracy is lower.
Is there anything I need to pay attention to when using new features?
Thank you.
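Hello, I have a question I would like to ask.
In the paper you mention that the GCN is trained on the CASIA dataset. Looking at the source code, I found that your data size is 454590, but the CASIA dataset has 494414 images. Do you have the filtered list?
The code uses labels to compute the loss, so the method appears to be supervised. Why does the title still use keywords such as "cluster"? And why do the comparison experiments include typical unsupervised methods such as K-means?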
Hi! I am trying to use this code in Google Colab, and the latest problem I am facing is that the code is compiled with an old CUDA version. The error I am getting, verbatim, is as follows:
"Found GPU0 Tesla T4 which requires CUDA_VERSION >= 9000 for
optimal performance and fast startup time, but your PyTorch was compiled
with CUDA_VERSION 8000. Please install the correct PyTorch binary
using instructions from http://pytorch.org"
Is there a structured way to compile the code? Can I change the code so that it works with newer PyTorch versions? Will the current pretrained model still work? Any advice or suggestion would be helpful! Or could you at least tell me how you were able to run the code? Thanks in advance!
Could you provide the face images corresponding to the features?
Hello, I have some questions about the feature extraction process in the paper. I want to know the specific details of feature extraction. Could the author open-source the preprocessing script?
The features in your CASIA.feas.npy are 512-dimensional. You mention using resnet101 to extract features, but aren't the features extracted by resnet101 2048-dimensional? Thanks for clarifying.
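I am also a bit confused by this, and I have checked it repeatedly: in the bcubed function in the code, reference should be the GT and system should be the prediction. Could the author please confirm this again?
For example, with GT [0, 0, 0, 1, 1, 1] and prediction [0, 0, 0, 0, 0, 0], precision should be 0.5 and recall should be 1, but your code gives recall 0.5 and precision 1.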
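The example above can be checked with a short, dependency-free sketch of BCubed precision and recall. This is an independent illustration, not the repository's bcubed function; the function and argument names are chosen here for clarity.

```python
from collections import Counter

def bcubed(gt, pred):
    """Return (precision, recall) under the BCubed definition.

    gt:   ground-truth class per item
    pred: predicted cluster per item
    """
    n = len(gt)
    pair_counts = Counter(zip(pred, gt))  # items sharing a cluster AND a class
    cluster_sizes = Counter(pred)         # items per predicted cluster
    class_sizes = Counter(gt)             # items per ground-truth class

    # Each of the c items in a (cluster, class) cell sees c correct companions.
    precision = sum(c * c / cluster_sizes[p]
                    for (p, g), c in pair_counts.items()) / n
    recall = sum(c * c / class_sizes[g]
                 for (p, g), c in pair_counts.items()) / n
    return precision, recall

precision, recall = bcubed(gt=[0, 0, 0, 1, 1, 1], pred=[0, 0, 0, 0, 0, 0])
print(precision, recall)  # 0.5 1.0
```

Precision divides by the size of the predicted cluster and recall by the size of the ground-truth class, so swapping the two arguments swaps the two numbers.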
Suppose I have 10000 images of 400 individuals. What would be the best way to choose the number of neighbours?
Hi, I'm pleased to find that you've used the code from my repo (https://github.com/XiaohangZhan/cdp/blob/master/source/graph.py). I would appreciate it if you could acknowledge us in the README :)
Hello, regarding gradient back-propagation: if all nodes are considered instead of only the 1-hop neighbors, will the accuracy drop? It seems that considering more nodes might give higher accuracy.
I am not familiar with this field, so could you please provide the data preprocessing code? I mean the code for "features+labels+knn_graphs".
In the code, axis=0 of the confusion matrix returned by coo_matrix corresponds to the predicted classes.
Incorrect code:
precision = np.sum(cm_norm * (cm / cm.sum(axis=0)))
recall = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
Correct code:
recall = np.sum(cm_norm * (cm / cm.sum(axis=0)))
precision = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
f1, however, happens to come out correct anyway.
Fortunately, the paper does not report these numbers.
Here is the analysis, using the example from the BCubed paper:
import numpy as np
from scipy.sparse import coo_matrix
pred = np.array([0,0,0,0,1,1,1,2,2,2,2,2,2,2])
label = np.array([0,0,0,0,0,1,1,2,1,3,4,1,1,1])
# Rows index predicted clusters, columns index ground-truth classes.
cm = coo_matrix(
    (np.ones(14), (pred, label)),
    shape=(3, 5),
    dtype=int).toarray()
"""
[[4, 0, 0, 0, 0],
 [1, 2, 0, 0, 0],
 [0, 4, 1, 1, 1]]
"""
np.expand_dims(cm.sum(axis=1), 1)
"""
[[4],
 [3],
 [7]]
"""
cm / np.expand_dims(cm.sum(axis=1), 1)
"""
[[1.        , 0.        , 0.        , 0.        , 0.        ],
 [0.33333333, 0.66666667, 0.        , 0.        , 0.        ],
 [0.        , 0.57142857, 0.14285714, 0.14285714, 0.14285714]]
"""
cm * cm / np.expand_dims(cm.sum(axis=1), 1)
"""
[[4.        , 0.        , 0.        , 0.        , 0.        ],
 [0.33333333, 1.33333333, 0.        , 0.        , 0.        ],
 [0.        , 2.28571429, 0.14285714, 0.14285714, 0.14285714]]
"""
np.sum(cm * cm / np.expand_dims(cm.sum(axis=1), 1)) / cm.sum()
This is the same computation as in your code, but the variable name is wrong:
cm_norm = cm / cm.sum()
recall = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
This should be precision:
precision: (4*4/4 + 1/3 + 2*2/3 + 3*1/7 + 4*4/7)/14 = 0.5986394557823128
Clearly, the division should be by the total size of each predicted cluster:
np.expand_dims(cm.sum(axis=1), 1)
"""
[[4],
 [3],
 [7]]
"""
axis=1 is the axis to sum over when computing precision.
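Hello, I have two questions about the adjacency matrix transformation in feeder:
1. GCNs usually add self-loops for renormalization, but your code does not seem to add self-loops. What is the reason for this?
2. In the code, A is transformed by A=A.div(D), but this is not equivalent to D^(-1/2)AD^(-1/2). Is there a particular reason for using A=A.div(D) here? If I wanted to use the D^(-1/2)AD^(-1/2) transformation, is the following correct?
D = A.sum(1, keepdim=True)
D_ = torch.diagflat(torch.pow(D,-0.5))
A = torch.mm(D_,torch.mm(A,D_))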
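The difference between the two normalizations in the question can be seen on a tiny graph. The sketch below uses NumPy rather than torch and assumes D holds the row sums of A with shape (N, 1), as in the snippet above; the 3-node adjacency matrix is made up for illustration.

```python
import numpy as np

# Star graph: node 0 is linked to nodes 1 and 2 (no self-loops).
A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])

# Node degrees as a column vector, shape (N, 1).
# Note: a node with degree 0 would cause a division by zero below.
deg = A.sum(axis=1, keepdims=True)

# Row normalization, D^-1 A: each row is divided by its degree.
row_norm = A / deg

# Symmetric normalization, D^-1/2 A D^-1/2.
d_inv_sqrt = np.diagflat(deg ** -0.5)
sym_norm = d_inv_sqrt @ A @ d_inv_sqrt

print(row_norm)
print(sym_norm)
```

Row normalization makes every row sum to 1, which amounts to averaging over neighbors, but the result is generally not symmetric. The symmetric form keeps the matrix symmetric, while its rows no longer sum to 1.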