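- This is personal study. I work hard at it, but these are not perfect reviews.
- Even after a review is finished, I keep updating it whenever I have further questions, thoughts, corrections, or good references.
- link_review marks a link to a good review written by someone else.
- light_review marks a paper I only skimmed quickly, at roughly the concept (abstract) level.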
- Unsupervised Representation Learning by Predicting Image Rotations : [paper][]
- Unsupervised Visual Representation Learning by Context Prediction : [paper][]
- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles : [paper][]
- Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks : [paper][]
- Rethinking Pre-training and Self-training : [paper][]
- Selfie: Self-supervised Pretraining for Image Embedding : [paper] [light_review]
- Self-training with Noisy Student improves ImageNet classification : [paper] [review]
- Stand-Alone Self-Attention in Vision Models : [paper][review]
- Selfie: Self-supervised Pretraining for Image Embedding : [paper] [light_review]
- Visual Transformers: Token-based Image Representation and Processing for Computer Vision : [paper][]
- 2D Attentional Irregular Scene Text Recognizer : [paper][]
- NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition : [paper][]
- On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention : [paper][]
- End-to-End Object Detection with Transformers : [paper][]
Image Retrieval & Deep Feature
- Large-Scale Image Retrieval with Attentive Deep Local Features : [paper][review]
- NetVLAD: CNN architecture for weakly supervised place recognition : [paper][review]
- Learning visual similarity for product design with convolutional neural networks : [paper][review]
- Bags of Local Convolutional Features for Scalable Instance Search : [paper][review]
- Neural Codes for Image Retrieval : [paper][review]
- Conditional Similarity Networks : [paper][review]
- End-to-end Learning of Deep Visual Representations for Image Retrieval : [paper][review]
- CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples : [paper][review]
- Image similarity using Deep CNN and Curriculum Learning : [paper][review]
- Faster R-CNN Features for Instance Search : [paper][review]
- Regional Attention Based Deep Feature for Image Retrieval : [paper][review]
- Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination : [paper][review]
- Object retrieval with deep convolutional features : [paper][review]
- Cross-dimensional Weighting for Aggregated Deep Convolutional Features : [paper][review]
- Learning Embeddings for Product Visual Search with Triplet Loss and Online Sampling : [paper][review]
- Saliency Weighted Convolutional Features for Instance Search : [paper][review]
- 2018 Google Landmark Retrieval Challenge review : [review]
- 2019 Google Landmark Retrieval Challenge review : [review]
- REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval : [paper][review]
- Large-scale Landmark Retrieval/Recognition under a Noisy and Diverse Dataset : [paper][review]
- Fine-tuning CNN Image Retrieval with No Human Annotation : [paper][review]
- Large Scale Landmark Recognition via Deep Metric Learning : [paper][review]
- Deep Aggregation of Regional Convolutional Activations for Content Based Image Retrieval : [paper][review]
- Challenging deep image descriptors for retrieval in heterogeneous iconographic collections : [paper][review]
- A Benchmark on Tricks for Large-scale Image Retrieval : [paper][review]
- Attention-Aware Generalized Mean Pooling for Image Retrieval : [paper][review]
- Class-Weighted Convolutional Features for Image Retrieval : [paper][review] # 100th
- Deep image retrieval loss (continuously updated) : [paper][review]
- Matchable Image Retrieval by Learning from Surface Reconstruction:[paper][review]
- Regional Maximum Activations of Convolutions with Attention for Cross-domain Beauty and Personal Care Product Retrieval:[paper][review]
- Combination of Multiple Global Descriptors for Image Retrieval:[paper][review]
- Unifying Deep Local and Global Features for Efficient Image Search:[paper][review]
- ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval:[paper][review]
- Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval:[paper][review]
- Detect-to-Retrieve: Efficient Regional Aggregation for Image Search:[paper][review]
- Local Features and Visual Words Emerge in Activations:[paper][review]
- Image Retrieval using Multi-scale CNN Features Pooling: [paper][review]
- MultiGrain: a unified image embedding for classes and instances: [paper][link_review]
- Divide and Conquer the Embedding Space for Metric Learning: [paper][link_review]
- An Effective Pipeline for a Real-world Clothes Retrieval System: [paper][light_review]
- Deep metric learning using Triplet network : [paper][review]
- FaceNet: A Unified Embedding for Face Recognition and Clustering : [paper][review]
- Sampling Matters in Deep Embedding Learning : [paper][review]
- Learning Embeddings for Product Visual Search with Triplet Loss and Online Sampling : [paper][review]
- Conditional Similarity Networks : [paper][review]
- FashionNet: Personalized Outfit Recommendation with Deep Neural Network: [paper][review]
- Context-Aware Visual Compatibility Prediction: [paper][review]
- Learning Type-Aware Embeddings for Fashion Compatibility : [paper][review]
- Be Your Own Prada: Fashion Synthesis with Structural Coherence : [paper][review]
- Fashion-Gen: The Generative Fashion Dataset and Challenge : [paper][review]
- DwNet: Dense warp-based network for pose-guided human video generation: [paper][review]
- Deep Learning of Binary Hash Codes for Fast Image Retrieval : [paper][review]
- Feature Learning based Deep Supervised Hashing with Pairwise Labels : [paper][review]
- Deep Supervised Hashing with Triplet Labels : [paper][review]
- NetVLAD: CNN architecture for weakly supervised place recognition : [paper][review]
- Learnable pooling with Context Gating for video classification : [paper][review]
- Less is More: Learning Highlight Detection from Video Duration : [paper][review]
- Efficient Video Classification Using Fewer Frames : [paper][review]
OCR - Recognition
- Synthetically Supervised Feature Learning for Scene Text Recognition : [paper][review]
- FOTS: Fast Oriented Text Spotting with a Unified Network : [paper][review]
- Robust Scene Text Recognition with Automatic Rectification : [paper][review]
OCR - Detection
- PixelLink: Detecting Scene Text via Instance Segmentation : [paper][review]
- EAST: An Efficient and Accurate Scene Text Detector : [paper][review]
- Scene Text Detection with Supervised Pyramid Context Network : [paper][review]
- FOTS: Fast Oriented Text Spotting with a Unified Network : [paper][review]
- Character Region Awareness for Text Detection : [paper][review]
- Squeeze-and-Excitation Networks : [paper][review]
- Spatial Transformer Networks : [paper][review]
- Tell Me Where to Look: Guided Attention Inference Network : [paper][review]
- CBAM: Convolutional Block Attention Module : [paper][review]
- BAM: Bottleneck Attention Module : [paper][review]
- Neural Machine Translation by Jointly Learning to Align and Translate : [paper][review]
- Residual Attention Network for Image Classification : [paper][review]
- Attention is all you need : [paper][review][link_review]
- Residual Attention Network for Image Classification : [paper][review]
- Stand-Alone Self-Attention in Vision Models : [paper][review]
- DeViSE: A Deep Visual-Semantic Embedding Model : [paper][review]
- Dual Attention Networks for Multimodal Reasoning and Matching : [paper][review]
- Learning Deep Structure-Preserving Image-Text Embeddings : [paper][review]
- Learning Two-Branch Neural Networks for Image-Text Matching Tasks : [paper][link_review]
- FashionNet: Personalized Outfit Recommendation with Deep Neural Network: [paper][review]
- Context-Aware Visual Compatibility Prediction: [paper][review]
- Imagenet classification with deep convolutional neural networks : [paper][review]
- Going Deeper with Convolutions : [paper][review]
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices : [paper][review]
- Deep Residual Learning for Image Recognition : [paper][review]
- Aggregated Residual Transformations for Deep Neural Networks : [paper][review]
- Very Deep Convolutional Networks for Large-Scale Image Recognition : [paper][review]
- Squeeze-and-Excitation Networks : [paper][review]
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications : [paper][review]
- Pelee: A Real-Time Object Detection System on Mobile Devices : [paper][review]
- Residual Attention Network for Image Classification : [paper][review]
- Wide Residual Networks : [paper][review]
- Stand-Alone Self-Attention in Vision Models : [paper][review]
- Selective Kernel Networks : [paper][review]
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks : [paper] [link_review]
- Self-training with Noisy Student improves ImageNet classification : [paper] [review]
- Selfie: Self-supervised Pretraining for Image Embedding : [paper] [light_review]
- Taskonomy: Disentangling Task Transfer Learning : [paper][link_review]
- What makes ImageNet good for transfer learning? : [paper][review]
- Generative Adversarial Nets : [paper][review]
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks : [paper][review]
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks : [paper][review]
- Progressive Growing of GANs for Improved Quality, Stability, and Variation : [paper][review]
- Beholder-GAN: Generation and Beautification of Facial Images with Conditioning on Their Beauty Level : [paper][review]
- Synthetically Supervised Feature Learning for Scene Text Recognition : [paper][review]
- A Style-Based Generator Architecture for Generative Adversarial Networks : [paper][review]
- High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs : [paper][review]
- Everybody Dance Now : [paper][review]
- Be Your Own Prada: Fashion Synthesis with Structural Coherence : [paper][review]
- Fashion-Gen: The Generative Fashion Dataset and Challenge : [paper][review]
- StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks : [paper][review]
- DwNet: Dense warp-based network for pose-guided human video generation: [paper][review]
- FaceNet: A Unified Embedding for Face Recognition and Clustering : [paper][review]
- The Devil of Face Recognition is in the Noise : [paper][link_review]
- Revisiting a single-stage method for face detection: [paper][review]
- Efficient Estimation of Word Representations in Vector Space : [paper][review]
- node2vec: Scalable Feature Learning for Networks : [paper][review]
- Understanding the basics of the Transformer (self-attention) : PPT summary
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding : [paper][review] (in progress)
- DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval : [paper][review]
- SNRM: From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing : [paper][review]
- TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank : [paper][review]
- ConvRankNet: Deep Neural Network for Learning to Rank Query-Text Pairs : [paper][review]
- KNRM: End-to-End Neural Ad-hoc Ranking with Kernel Pooling : [paper][review]
- Conv-KNRM: Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search : [paper][review]
- PACRR: A position-aware neural IR model for relevance matching : [paper][link_review]
- CEDR: Contextualized Embeddings for Document Ranking #262 : [paper][link]
- Deeper Text Understanding for IR with Contextual Neural Language Modeling : [paper][link]
- Simple Applications of BERT for Ad Hoc Document Retrieval : [paper][link]
- Document Expansion by Query Prediction : [paper][link]
- Passage Re-ranking with BERT : [paper][link]
- U-Net: Convolutional Networks for Biomedical Image Segmentation : [paper][review]
- Mask R-CNN : [paper][review]
- Fully Convolutional Networks for Semantic Segmentation : [paper][review]
- Cascade Decoder: A Universal Decoding Method for Biomedical Image Segmentation : [paper][review]
- FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference : [link_review]
- Revisiting Small Batch Training for Deep Neural Networks : [paper][review]
- Weight Standardization : [paper][link_review]
- YOLO: Real-Time Object Detection : [paper][review]
- YOLO9000: Better, Faster, Stronger : [paper][review]
- Faster R-CNN : [paper][review]
- Understanding the Faster R-CNN anchor generator, not only as a concept but also at the source-code level : [review]
- SSD: Single Shot MultiBox Detector : [paper][link_review]
- Why normalization performed only for conv4_3? : [review]
- Pelee: A Real-Time Object Detection System on Mobile Devices : [paper][review]
- R-FCN: Object Detection via Region-based Fully Convolutional Networks: [paper][review]
- Revisiting a single-stage method for face detection: [paper][review]
- DSSD : Deconvolutional Single Shot Detector: [paper][review]
- Feature-fused SSD: fast detection for small objects : [paper][link_review]
- EfficientDet : Scalable and Efficient Object Detection : [paper] [link_review] [review]
- FCOS: Fully Convolutional One-Stage Object Detection : [paper] [light_review]
- Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection : [paper] [light_review]
- Learning Transferable Architectures for Scalable Image Recognition : [paper][link_review]
- Learning to Compose with Professional Photographs on the Web : [paper][review]
- Photo Aesthetics Ranking Network with Attributes and Content Adaptation : [paper][review]
- Composition-preserving Deep Photo Aesthetics Assessment : [paper][review]
- Deep Image Aesthetics Classification using Inception Modules and Fine-tuning Connected Layer : [paper][review]
- NIMA: Neural Image Assessment : [paper][review]
- Neural Arithmetic Logic Units : [paper][link_review]