Deep Embedding Learning for Image Retrieval


Deep Embedding Introduction

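DeepEmbedding uses deep learning to map multiple media types into a shared vector space, so that search can be performed in that unified space.
This project evaluates general-purpose multimedia retrieval through visual-level search, fine-grained category search (instance retrieval), and image-text cross-search.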

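General approach to the image retrieval problem

DeepEmbedding aims to learn a relation-preserving mapping function, using deep metric learning (DeepMetric) or deep hashing (DeepHash), that maps the visual space into a low-dimensional embedding space, which is then searched with a vector search engine. The first problem is feature extraction, which is what this experiment studies; the second is feature search. For a solution to the second problem, see ANNS (approximate nearest neighbor search) and NNSearchService.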
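The two stages described above can be illustrated with a minimal sketch. It is not this project's training code: it swaps in a plain triplet hinge loss for the metric-learning objective and an exhaustive scan for the vector search engine. The function names, the toy 2-D embeddings, and the `margin` value are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so distances are comparable."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: zero once the negative is farther than the positive by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def search(query, gallery, top_k=1):
    """Exhaustive search: rank gallery ids by distance to the query."""
    ranked = sorted(gallery, key=lambda item_id: euclidean(query, gallery[item_id]))
    return ranked[:top_k]


# Hypothetical embeddings; in practice these come from the trained network.
gallery = {
    "bird_a": l2_normalize([1.0, 0.1]),
    "bird_b": l2_normalize([0.9, 0.3]),
    "chair": l2_normalize([0.0, 1.0]),
}
query = l2_normalize([1.0, 0.2])

nearest = search(query, gallery, top_k=2)
# Same-class pair is already closer than the other class by the margin: no loss.
loss_satisfied = triplet_loss(gallery["bird_a"], gallery["bird_b"], gallery["chair"])
# Swapping positive and negative violates the constraint: positive loss.
loss_violated = triplet_loss(gallery["bird_a"], gallery["chair"], gallery["bird_b"])
```

A real deployment would replace `search` with an approximate index, since an exhaustive scan grows linearly with gallery size.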

Note

Experimental results

(Click the Baidu Netdisk link to download and view the images)

Using this project

1. Download the corresponding datasets.
2. Train the model with different loss types.

Train the CUB200 model:

nohup python train_mx_ebay_margin.py --gpus=1 --batch-k=5 --use_viz --epochs=30 --use_pretrained --steps=12,16,20,24 --name=CUB_200_2011 --save-model-prefix=cub200 > mycub200.out 2>&1 &

Train on stanford_online_product:

nohup python train_mx_ebay_margin.py --batch-k=2 --batch-size=80 --use_pretrained --use_viz --gpus=0 --name=Inclass_ebay --data=EbayInClass --save-model-prefix=ebayinclass > mytraininclass_ebay.log 2>&1

3. Future work:

  • Test and evaluate networks such as R-MAC and NetVLAD for visual retrieval
  • Test GAN-based methods for improving retrieval

  __Deep Adversarial Metric Learning__  
  Deep metric learning does not make full use of easy negative examples, so [Deep Adversarial Metric Learning](http://openaccess.thecvf.com/content_cvpr_2018/papers/Duan_Deep_Adversarial_Metric_CVPR_2018_paper.pdf) proposes a new framework called DAML.  
  __DeepMetric and Deep Hashing applications__  
  Apply the methods to the fashion, vehicle, and person Re-ID domains.  
  __Construct a dataset__  
  Crawl application-domain data.  

Datasets

CUB200_2011: a fine-grained bird dataset with 200 species; some of its images overlap with ImageNet
LFW: face dataset
StanfordOnlineProducts: many product types (furniture, bicycles, cups)
Street2Shop: product dataset from eBay
DeepFashion: clothing images only

Applications of image retrieval

  • Face identification: deep metric learning for face clustering, from FaceNet to SphereFace
  • Person re-identification: deep metric learning for pedestrian Re-ID, from MARS to NPSM & SVDNet
  • Vehicle search: deep metric learning for fake-licensed car or vehicle retrieval
  • Street2Products: search for fashion clothes from street photos or in-shop photos, i.e. visual search; from DeepRanking to DAML

Deep Metric Learning milestone papers:

1. DrLIM: Dimensionality Reduction by Learning an Invariant Mapping
2. DeepRanking: Learning Fine-grained Image Similarity with Deep Ranking
3. DeepID2: Deep Learning Face Representation by Joint Identification-Verification
4. FaceNet: A Unified Embedding for Face Recognition and Clustering
5. Defense: In Defense of the Triplet Loss for Person Re-Identification
6. N-pair: Improved Deep Metric Learning with Multi-class N-pair Loss Objective
7. Sampling: Sampling Matters in Deep Embedding Learning
8. DAML: Deep Adversarial Metric Learning
9. SphereFace: Deep Hypersphere Embedding for Face Recognition

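Selected DeepHash work

DeepHash hashes images directly into Hamming codes. Search is accelerated with the faiss IVF Binary family of indexes, and storage is greatly reduced.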
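As a rough illustration of why Hamming codes are compact and fast to compare, the sketch below binarizes real-valued network outputs by sign, packs them into an integer, and ranks items by XOR plus bit count. It does not use faiss or reproduce HashNet; the helper names and the 8-dimensional toy outputs are assumptions.

```python
def binarize(embedding):
    """Pack the sign pattern of a real-valued vector into one integer code."""
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def hamming_search(query_code, codes, top_k=1):
    """Exhaustive Hamming ranking; a binary IVF index would prune this scan."""
    ranked = sorted(codes, key=lambda item_id: hamming(query_code, codes[item_id]))
    return ranked[:top_k]


# Hypothetical pre-binarization outputs of a hashing network (8 bits each).
outputs = {
    "img_0": [0.9, -0.8, 0.7, -0.6, 0.5, 0.4, -0.3, 0.2],
    "img_1": [0.8, -0.7, 0.6, 0.5, 0.4, 0.3, -0.2, 0.1],
    "img_2": [-0.9, 0.8, -0.7, 0.6, -0.5, -0.4, 0.3, -0.2],
}
codes = {item_id: binarize(vec) for item_id, vec in outputs.items()}

query_code = binarize([0.7, -0.9, 0.8, -0.1, 0.6, 0.2, -0.4, 0.3])
best = hamming_search(query_code, codes, top_k=1)
```

Each item here occupies 8 bits, whereas the same vector stored as eight 32-bit floats would take 32 bytes.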

ReImplementation of HashNet

python train_hash.py --params

Deep Hash Learning milestone papers:

1. CNNH: Supervised Hashing for Image Retrieval via Image Representation Learning
2. DNNH: Simultaneous Feature Learning and Hash Coding with Deep Neural Networks
3. DLBHC: Deep Learning of Binary Hash Codes for Fast Image Retrieval
4. DSH: Deep Supervised Hashing for Fast Image Retrieval
5. SuBiC: A Supervised, Structured Binary Code for Image Search
6. HashNet: Deep Learning to Hash by Continuation
7. DCH: Deep Cauchy Hashing for Hamming Space Retrieval

Joint visual-text embedding: visual-semantic alignment (cross-modal retrieval)

1. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives
2. Dual-Path Convolutional Image-Text Embedding with Instance Loss

python train_vse.py --params
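The hard-negative idea in the VSE++ paper listed above can be sketched as a max-of-hinges loss over a batch similarity matrix. This is an illustrative reading of that loss, not the contents of `train_vse.py`; the function name, the `margin` value, and the toy similarity matrix are assumptions.

```python
def max_hinge_loss(sim, margin=0.2):
    """Max-of-hinges loss over a square similarity matrix.

    sim[i][j] is the similarity between image i and caption j, so the
    diagonal holds the matching pairs. For each pair only the hardest
    negative caption and the hardest negative image contribute.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest wrong caption for image i (row i, off-diagonal).
        hardest_caption = max(
            (max(0.0, margin + sim[i][j] - pos) for j in range(n) if j != i),
            default=0.0,
        )
        # Hardest wrong image for caption i (column i, off-diagonal).
        hardest_image = max(
            (max(0.0, margin + sim[j][i] - pos) for j in range(n) if j != i),
            default=0.0,
        )
        total += hardest_caption + hardest_image
    return total


# Toy batch of three image-caption pairs; image 1 is confusable with caption 2.
sim = [
    [0.9, 0.3, 0.1],
    [0.2, 0.8, 0.75],
    [0.0, 0.4, 0.7],
]
loss = max_hinge_loss(sim)
separated = max_hinge_loss([[1.0, 0.0], [0.0, 1.0]])
```

Only the confusable entry `sim[1][2]` produces a penalty here, once as the hardest caption for image 1 and once as the hardest image for caption 2.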

Other types of search

1. Sketch-based: Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval
2. Text cross-modal: Deep Cross-Modal Hashing

Approximate nearest neighbor search acceleration

ANNS (Approximate Nearest Neighbor Search) finds the neighbors of a query vector in a gallery database. Test datasets:

  • SIFT1M: typical 128-dim SIFT vectors
  • DEEP1B: deep descriptors, proposed by Yandex
  • GIST1M: typical 960-dim GIST vectors

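Papers

-- PQ based
1. Product Quantization for Nearest Neighbor Search: replaces traditional scalar quantization with segment-wise product quantization.
2. Optimized Product Quantization: similar to Cartesian quantization; rotates the vectors as a whole so that the per-segment clustering axes align with the data, which keeps the reconstruction error between centroids and data small and therefore the compression loss small.
3. Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors
4. The Inverted Multi-Index: coarse quantization uses a double inverted index, which lowers the clustering dimension and increases the number of centroids; the Multi-Sequence algorithm speeds up coarse lookup.
5. Polysemous Codes: establishes a mapping between Hamming distance and the distance between quantization centroids, which acts as a filter on entry points.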
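The product quantization idea behind the PQ-based papers can be sketched in plain Python: split each vector into `m` sub-vectors, learn a small k-means codebook per sub-space, store only the centroid ids, and compare a raw query against stored codes (asymmetric distance). This is a toy sketch under assumed parameters (`m=4`, `k=16`, random Gaussian data), not the faiss implementation and without the OPQ rotation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if its cluster is empty
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def split(vector, m):
    """Cut a vector into m contiguous sub-vectors of equal length."""
    step = len(vector) // m
    return [vector[i * step:(i + 1) * step] for i in range(m)]


def train_pq(vectors, m, k):
    """Learn one codebook of k centroids per sub-space."""
    return [kmeans([split(v, m)[s] for v in vectors], k, seed=s) for s in range(m)]


def encode(vector, codebooks):
    """Replace each sub-vector by the id of its nearest centroid."""
    subs = split(vector, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(subs, codebooks)]


def decode(codes, codebooks):
    """Reconstruct an approximate vector by concatenating centroids."""
    return [x for code, cb in zip(codes, codebooks) for x in cb[code]]


def adc_distance(query, codes, codebooks):
    """Asymmetric distance: raw query sub-vectors against stored centroid ids."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(sub, cb[code]) for sub, cb, code in zip(subs, codebooks, codes))


rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
codebooks = train_pq(data, m=4, k=16)
codes = encode(data[0], codebooks)   # 4 small integers instead of 8 floats
approx = decode(codes, codebooks)
```

With 16 centroids per sub-space each code fits in 4 bits, so an 8-float vector compresses to 2 bytes; the asymmetric distance equals the squared distance to the reconstructed vector.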

-- Graph based
1. Approximate nearest neighbor algorithm based on navigable small world graphs: see the NSW (Navigable Small World) paper; the index is built in combination with a skip-list structure.
2. Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
3. EFANNA: An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph
4. A Revisit on Deep Hashings for Large-scale Content Based Image Retrieval
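The core routine shared by the graph-based methods is a greedy walk over a proximity graph: start at an entry point and keep moving to whichever neighbor is closer to the query. The sketch below is a deliberate simplification: it builds a brute-force kNN graph rather than an NSW graph, has no hierarchical layers, and can stop at a local minimum on harder data. All names and the toy points are assumptions.

```python
import math


def build_knn_graph(points, k):
    """Brute-force kNN graph: node i links to its k closest other nodes."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph


def greedy_search(query, points, graph, entry):
    """Walk the graph, always stepping to the neighbor closest to the query.

    Stops when no neighbor improves on the current node, so the distance
    strictly decreases at every step and the walk terminates.
    """
    current = entry
    while True:
        best = min(graph[current], key=lambda j: math.dist(query, points[j]))
        if math.dist(query, points[best]) >= math.dist(query, points[current]):
            return current
        current = best


# Toy gallery: 20 points on a line, each linked to its 2 nearest neighbors.
points = [(float(i), 0.0) for i in range(20)]
graph = build_knn_graph(points, k=2)

query = (13.2, 0.0)
found = greedy_search(query, points, graph, entry=0)
exact = min(range(len(points)), key=lambda j: math.dist(query, points[j]))
```

On this toy line the walk reaches the exact nearest neighbor; the hierarchical layers in HNSW exist to shorten such walks by providing long-range links.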
-- Hamming code
1. Fast Exact Search in Hamming Space with Multi-Index Hashing
2. Fast Nearest Neighbor Search in the Hamming Space
3. Web-Scale Responsive Visual Search at Bing
4. Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors

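Library

1. [faiss] faiss currently includes many index types: IVF, IMI, PQ, OPQ, PCA, two-level residual quantization (ReRank-PQ), HNSW, Link and Code, and others.
2. Choosing an index format: for high capacity at lower precision, use IMI+OPQ+reRank; for high precision, use HNSW. The NSG index from Zhejiang University does not currently support incremental insertion, so it was not adopted.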

Contributors

hudengjunai
