Comments (3)
There are several ways to deal with batchnorm at test time. Setting training=False uses the rolling (moving-average) moment statistics, but that is far from the only option. Also, using batchnorm at test time across a smaller set of samples is likely helpful for distinguishing between those samples.
For example, you can feed your network a batch containing the whole mini training set plus a single sample from the mini test set. Alternatively, you can feed the entire mini dataset at once, which lets some information leak between test samples through batchnorm. If you read the paper closely, you'll see that we use batchnorm in both of these ways. When using transduction, batchnorm is allowed to share info across all the test samples. This is technically not exactly the few-shot objective people tend to talk about, but it's what was used in the MAML paper. In general, transduction tends to give a slight performance boost (which makes sense, since it is basically cheating).
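To make the three options above concrete, here is a minimal sketch in plain Python. It is not the code from supervised-reptile: it normalizes a single scalar feature, skips the learned scale and offset, and the activation values and the stored rolling mean/variance are made up for illustration. The `normalize` helper is hypothetical as well.

```python
import math


def normalize(batch, mean=None, var=None, eps=1e-5):
    """Batchnorm-style normalization of a list of scalar activations.

    If mean/var are passed in, they play the role of stored rolling moments
    (the training=False case). Otherwise the moments come from the batch.
    """
    if mean is None:
        mean = sum(batch) / len(batch)
    if var is None:
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


# Hypothetical activations of one feature within one few-shot task.
train_acts = [0.0, 2.0, 4.0]  # mini training set
test_acts = [1.0, 9.0]        # mini test set

# Option 1: rolling moments, as with training=False.
# The stored mean and variance here are invented values.
rolling = normalize(test_acts, mean=3.0, var=4.0)

# Option 2: whole mini training set plus ONE test sample per batch.
# Each test sample is normalized without seeing the other test samples.
non_transductive = [normalize(train_acts + [x])[-1] for x in test_acts]

# Option 3 (transductive): the entire mini dataset in one batch.
# Every test sample now influences the moments used for the others.
transductive = normalize(train_acts + test_acts)[len(train_acts):]

print("rolling:         ", rolling)
print("non-transductive:", non_transductive)
print("transductive:    ", transductive)
```

In option 2 the result for a test sample depends only on the training set and that sample, so changing the other test samples leaves it untouched. In option 3 it changes, which is the information leak described above.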
from supervised-reptile.
Thanks for your reply.
Can you provide some reference materials about using batchnorm at test time? Or is it purely empirical?
BatchNorm at test time is usually more of a footnote than a focus. It's subtle and easy to get wrong, but it usually doesn't make enough of an impact to draw attention.