Deep Embedding Learning for Image Retrieval

Deep Embedding Introduction

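DeepEmbedding uses deep learning to map multiple kinds of media into a shared vector space, so that search can be carried out in one unified space.
This project evaluates general-purpose multimedia retrieval through visual-level search, fine-grained category search (instance retrieval), and image-text cross-modal search.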

General approach to image retrieval

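DeepEmbedding aims to learn a relation-preserving mapping function with deep metric learning (DeepMetric) or deep hashing (DeepHash). The function maps the visual space into a low-dimensional embedding space, which is then searched with a vector search engine. The first problem is feature extraction, which is what this project studies; the second is feature search. For the second problem, see ANNS (approximate nearest neighbor search) and NNSearchService.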
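
The two stages described above (embed, then search) can be illustrated with a minimal sketch. It assumes the embeddings already exist; the tiny 3-dim vectors and the function names `l2_normalize` and `search` are hypothetical and not part of this project. A real deployment would replace the brute-force loop with an ANNS engine.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def search(query, gallery, top_k=2):
    """Brute-force search: rank gallery embeddings by cosine similarity to the query."""
    q = l2_normalize(query)
    scored = []
    for idx, item in enumerate(gallery):
        g = l2_normalize(item)
        scored.append((sum(a * b for a, b in zip(q, g)), idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:top_k]]

# Hypothetical 3-dim embeddings; a trained model would typically output more dimensions.
gallery = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
result = search([1.0, 0.05, 0.0], gallery)
print(result)
```

Normalizing both sides first means the ranking depends only on direction, which is the usual convention when embeddings are trained with a metric loss.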

Note

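This project implements retrieval with metric learning based on the multi-class N-pair loss, and retrieval based on the sampling-based margin loss.
The implementations follow these papers:
Triplet loss: FaceNet
N-pair loss
Margin loss: Sampling Matters in Deep Embedding Learning
BatchHard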
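
As a rough illustration of two of the losses listed above, the sketch below writes out the margin-based pair loss and the multi-class N-pair loss for a single anchor, following the formulas in the respective papers. It is not this project's training code: the function names are hypothetical, the `alpha` and `beta` values are illustrative, and `beta` is kept fixed here although the margin-loss paper treats it as learnable.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def margin_loss(dist, is_positive, alpha=0.2, beta=1.2):
    """Margin-based pair loss: max(0, alpha + y * (dist - beta)).

    y is +1 for a same-class pair and -1 for a different-class pair.
    """
    y = 1.0 if is_positive else -1.0
    return max(0.0, alpha + y * (dist - beta))

def n_pair_loss(anchor, positive, negatives):
    """Multi-class N-pair loss for one anchor: log(1 + sum_i exp(a.n_i - a.p))."""
    pos_score = dot(anchor, positive)
    return math.log1p(sum(math.exp(dot(anchor, n) - pos_score) for n in negatives))

# A close positive pair costs nothing; a negative pair inside the margin is penalized.
print(margin_loss(0.5, True), margin_loss(1.3, False))
print(n_pair_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
```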

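Experimental results

(Click the Baidu Netdisk links to download and view the images.)

Trained on StanfordOnlineProduct, the model reaches an NMI clustering score of nmi=0.866. The validation-set embeddings are visualized after t-SNE dimensionality reduction; the resulting images are about 43M in size and are available from the Baidu Netdisk links below:
margin_based loss: DeepFashion https://pan.baidu.com/s/1zLZX24qBb_Op1vsry4LX6w
Metric: nmi=0.866
Mc-n-pair loss: StanfordOnlineProduct https://pan.baidu.com/s/12eNTVsRFu--SYMW8P8HPfQ
Metric: nmi=0.830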
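
For readers unfamiliar with the NMI metric reported above, here is a minimal sketch of how it can be computed from ground-truth labels and cluster assignments. It assumes the arithmetic-mean normalization, NMI = 2 I(U;V) / (H(U) + H(V)); the project's own evaluation code may use a different normalization, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(true_labels, cluster_labels):
    """Normalized mutual information with arithmetic-mean normalization."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    count_t = Counter(true_labels)
    count_c = Counter(cluster_labels)
    mutual_info = 0.0
    for (t, c), n_tc in joint.items():
        mutual_info += (n_tc / n) * math.log(n_tc * n / (count_t[t] * count_c[c]))
    denom = entropy(true_labels) + entropy(cluster_labels)
    # Both partitions trivial (a single group each): treat them as identical.
    return 2.0 * mutual_info / denom if denom > 0 else 1.0

# Cluster ids are arbitrary, so a relabeled but identical partition scores 1.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
```

In practice the cluster labels would come from running a clustering algorithm such as k-means on the validation embeddings.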

How to use this project

1. Download the corresponding dataset.
2. Train the model with the chosen loss type.

Train the CUB200 model:

nohup python train_mx_ebay_margin.py --gpus=1 --batch-k=5 --use_viz --epochs=30 --use_pretrained --steps=12,16,20,24 --name=CUB_200_2011 --save-model-prefix=cub200 > mycub200.out 2>&1 &

Train on stanford_online_product:

nohup python train_mx_ebay_margin.py --batch-k=2 --batch-size=80 --use_pretrained --use_viz --gpus=0 --name=Inclass_ebay --data=EbayInClass --save-model-prefix=ebayinclass > mytraininclass_ebay.log 2>&1

3. Future work:

Evaluate networks such as R-MAC and NetVLAD for visual retrieval
Test GAN-based methods for improving retrieval quality


  __Deep Adversarial Metric Learning__  
  Deep metric learning cannot make full use of easy negative examples, so [Deep Adversarial Metric Learning](http://openaccess.thecvf.com/content_cvpr_2018/papers/Duan_Deep_Adversarial_Metric_CVPR_2018_paper.pdf) proposes a new framework called DAML.  
  __DeepMetric and Deep Hashing Applications__  
   Apply the methods to the fashion, vehicle, and person Re-ID domains.  
  __Construct a dataset__ Crawl application-domain data.

Dataset

CUB200_2011: a small fine-grained dataset of 200 bird species
LFW: face dataset
StanfordOnlineProducts: many product categories (furniture, bicycles, cups)
Street2Shop: product dataset from eBay
DeepFashion: clothing images only

Applications of image retrieval

Face identification: deep metric learning for face clustering, from FaceNet to SphereFace
Person Re-Identification: deep metric learning for pedestrian Re-ID, from MARS to NPSM & SVDNet
Vehicle search: deep metric learning for detecting fake-licensed cars or for vehicle retrieval
Street2Products: search for fashion clothing from street photos or in-shop photos, i.e. visual search, from DeepRanking to DAML

Deep Metric Learning milestone papers:

1. DrLIM: Dimensionality Reduction by Learning an Invariant Mapping
2. DeepRanking: Learning Fine-grained Image Similarity with Deep Ranking
3. DeepID2: Deep Learning Face Representation by Joint Identification-Verification
4. FaceNet: FaceNet: A Unified Embedding for Face Recognition and Clustering
5. Defense: In Defense of the Triplet Loss for Person Re-Identification
6. N-pair: Improved Deep Metric Learning with Multi-class N-pair Loss Objective
7. Sampling: Sampling Matters in Deep Embedding Learning
8. DAML: Deep Adversarial Metric Learning
9. SphereFace: Deep Hypersphere Embedding for Face Recognition

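Some DeepHash work

DeepHash hashes images directly into Hamming codes. Search is accelerated with the faiss IVF binary index family, and storage requirements are greatly reduced.

Reimplementation of HashNet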
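
The idea of searching in Hamming space can be sketched as follows. This is only an illustration under simple assumptions: codes are obtained by thresholding real-valued outputs at zero, each code is packed into a Python integer, and the gallery is scanned linearly. The function names are hypothetical, and the sketch does not use faiss.

```python
def to_hash_code(embedding):
    """Binarize by sign and pack the bits into one integer (first value = highest bit)."""
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def hamming_search(query_code, gallery_codes, top_k=1):
    """Linear scan that returns gallery indices ordered by Hamming distance."""
    order = sorted(range(len(gallery_codes)),
                   key=lambda i: hamming(query_code, gallery_codes[i]))
    return order[:top_k]

query = to_hash_code([0.3, -1.2, 0.8, -0.1])   # bits 1010
gallery_codes = [0b0101, 0b1000, 0b1011]
print(hamming_search(query, gallery_codes, top_k=2))
```

A 64-bit code takes 8 bytes per image, compared with 512 bytes for a 128-dim float32 embedding, which is where the storage saving comes from.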

python train_hash.py --params

Deep Hash Learning milestone papers:

1. CNNH: Supervised Hashing for Image Retrieval via Image Representation Learning
2. DNNH: Simultaneous Feature Learning and Hash Coding with Deep Neural Networks
3. DLBHC: Deep Learning of Binary Hash Codes for Fast Image Retrieval
4. DSH: Deep Supervised Hashing for Fast Image Retrieval
5. SUBIC: SuBiC: A Supervised, Structured Binary Code for Image Search
6. HashNet: HashNet: Deep Learning to Hash by Continuation
7. DCH: Deep Cauchy Hashing for Hamming Space Retrieval

Joint visual and text embedding: visual-semantic alignment embedding (cross-modal retrieval)

1. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives
2. Dual-Path Convolutional Image-Text Embedding with Instance Loss

python train_vse.py --params
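
The hard-negative idea from VSE++ can be sketched on a small similarity matrix. The sketch assumes `sim[i][j]` is the similarity between image i and caption j within a batch, with matching pairs on the diagonal, and it sums a hinge term for the hardest negative caption and the hardest negative image of each pair. The function name and the margin value are illustrative; this is not the code in `train_vse.py`.

```python
def vse_pp_loss(sim, margin=0.2):
    """Max-of-hinges loss: only the hardest negative per direction contributes.

    sim must be a square matrix with at least two rows.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest wrong caption for image i, and hardest wrong image for caption i.
        hardest_caption = max(sim[i][j] for j in range(n) if j != i)
        hardest_image = max(sim[j][i] for j in range(n) if j != i)
        total += max(0.0, margin + hardest_caption - pos)
        total += max(0.0, margin + hardest_image - pos)
    return total

sim = [[0.9, 0.8],
       [0.1, 0.7]]
print(vse_pp_loss(sim))
```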

Other kinds of search

1. Sketch-based: Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval
2. Text cross-modal: Deep Cross-Modal Hashing

Approximate nearest neighbor search acceleration

ANNS (Approximate Nearest Neighbor Search) finds the nearest neighbors of a query vector in a gallery database.

Test datasets:

SIFT1M: typical 128-dim SIFT vectors
DEEP1B: deep descriptors, proposed by Yandex
GIST1M: typical 960-dim GIST vectors

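Papers

-- PQ based
1. Product Quantization for Nearest Neighbor Search: replaces traditional scalar quantization with segment-wise product quantization.
2. Optimized Product Quantization: similar to Cartesian quantization; it rotates the whole vector space so that the per-segment clustering axes align with the data, which lowers the reconstruction error between centroids and data and therefore the compression loss.
3. Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors
4. The Inverted Multi-Index: uses a double inverted index for coarse quantization, which lowers the clustering dimensionality and increases the number of centroids; the Multi-Sequence algorithm speeds up coarse candidate lookup.
5. Polysemous Codes: builds a mapping between Hamming distance and the distance between quantization centroids, which acts as a filter on entry points.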
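
To make the product quantization idea in the list above concrete, the following sketch encodes a vector with per-segment codebooks and computes an asymmetric distance from lookup tables. The codebooks are hand-written for illustration (in practice they are learned with k-means on each segment), the vector length is assumed to be divisible by the number of segments, and all names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def split(vec, m):
    """Cut a vector into m equal-length segments."""
    size = len(vec) // m
    return [vec[i * size:(i + 1) * size] for i in range(m)]

def pq_encode(vec, codebooks):
    """Replace each segment by the index of its nearest centroid."""
    codes = []
    for sub, book in zip(split(vec, len(codebooks)), codebooks):
        codes.append(min(range(len(book)), key=lambda k: sq_dist(sub, book[k])))
    return codes

def adc_distance(query, codes, codebooks):
    """Asymmetric distance: exact query segments vs. quantized database segments."""
    subs = split(query, len(codebooks))
    tables = [[sq_dist(sub, c) for c in book] for sub, book in zip(subs, codebooks)]
    return sum(tables[seg][code] for seg, code in enumerate(codes))

# Two segments with two centroids each (illustrative values).
codebooks = [[[0.0, 0.0], [1.0, 1.0]],
             [[0.0, 1.0], [1.0, 0.0]]]
codes = pq_encode([0.9, 1.1, 0.1, 0.8], codebooks)
print(codes, adc_distance([1.0, 1.0, 0.0, 1.0], codes, codebooks))
```

The lookup tables are built once per query, so each database vector costs only one table read per segment.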

-- Graph based
1. Approximate nearest neighbor algorithm based on navigable small world graphs: the NSW (navigable small world) paper; the index is constructed in combination with a skip-list structure.
2. Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
3. EFANNA: An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph
4. A Revisit on Deep Hashings for Large-scale Content Based Image Retrieval
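
The greedy routing idea behind graph-based indexes can be sketched as below. This is a deliberately simplified illustration: it walks a hand-built neighbor graph from a fixed entry point and stops at a local minimum, whereas the published methods also describe how to build the graph, keep a candidate queue, and (for HNSW) search through several layers. The graph, vectors, and function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def greedy_search(query, vectors, neighbors, entry=0):
    """Move to the neighbor closest to the query until no neighbor improves."""
    current = entry
    while True:
        best = min(neighbors[current], key=lambda nb: sq_dist(query, vectors[nb]))
        if sq_dist(query, vectors[best]) >= sq_dist(query, vectors[current]):
            return current
        current = best

# Five 1-dim points connected as a chain (illustrative graph).
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(greedy_search([3.2], vectors, neighbors))
```
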
-- Hamming code
1. Fast Exact Search in Hamming Space with Multi-Index Hashing
2. Fast Nearest Neighbor Search in the Hamming Space
3. Web-Scale Responsive Visual Search at Bing
4. Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors
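
The pigeonhole argument used by multi-index hashing can be sketched as follows: if two codes differ in at most r bits and each code is cut into m chunks with r < m, at least one chunk must match exactly, so exact-match hash tables on the chunks yield a complete candidate set. The sketch covers only this r < m case and assumes the code length is divisible by m; the full algorithm also probes nearby chunk values for larger radii. All names are hypothetical.

```python
from collections import defaultdict

def substrings(code, bits, m):
    """Split a `bits`-bit integer code into m equal chunks (lowest bits first)."""
    size = bits // m
    mask = (1 << size) - 1
    return [(code >> (i * size)) & mask for i in range(m)]

def build_index(codes, bits, m):
    """One hash table per chunk position, mapping chunk value -> code indices."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for t, chunk in enumerate(substrings(code, bits, m)):
            tables[t][chunk].append(idx)
    return tables

def search_within(query, codes, tables, bits, m, radius):
    """Exact radius search for radius < m using exact chunk matches as candidates."""
    if radius >= m:
        raise ValueError("this sketch only handles radius < m")
    candidates = set()
    for t, chunk in enumerate(substrings(query, bits, m)):
        candidates.update(tables[t].get(chunk, []))
    # Verify candidates with the full Hamming distance.
    return sorted(i for i in candidates if bin(query ^ codes[i]).count("1") <= radius)

codes = [0b10110010, 0b10110011, 0b01001101]
tables = build_index(codes, bits=8, m=4)
print(search_within(0b10110110, codes, tables, bits=8, m=4, radius=1))
```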

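Library

1. [faiss] faiss currently includes many index types, such as IVF, IMI, PQ, OPQ, PCA, two-level residual quantization (ReRank-PQ), HNSW, and Link and Code.
2. Choosing an index format: for high capacity with lower precision, use IMI+OPQ+reRank; for high precision, use HNSW. The NSG index from Zhejiang University does not currently support incremental insertion, so it was not adopted.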
