Comments (8)
Hi @jaegerstar, thanks for the discussion.
In the paper "Deep Meta-Learning: Learning to Learn in the Concept Space", the authors used a MAML-style update with ResNet as the CNN architecture, and they reported improvements over the original MAML paper.
The difference is that they only meta-update the last few layers of the ResNet.
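To make the "only meta-update the last few layers" idea concrete, here is a minimal sketch. It is not the paper's implementation: the thread does not give those details, so this uses a Reptile-style interpolation as a stand-in for the MAML-style update, and the layer names (`conv1`, `fc`) and the function `partial_meta_update` are hypothetical.

```python
def partial_meta_update(meta_params, adapted_params, trainable, step):
    """Move only the parameters named in `trainable` toward their adapted values.

    Assumption: parameters are plain lists of floats keyed by layer name.
    Layers not listed in `trainable` are copied through unchanged (frozen).
    """
    updated = {}
    for name, weights in meta_params.items():
        if name in trainable:
            # Interpolate between the meta weights and the task-adapted weights.
            updated[name] = [
                w + step * (a - w) for w, a in zip(weights, adapted_params[name])
            ]
        else:
            # Frozen layer: keep the meta weights as they are.
            updated[name] = list(weights)
    return updated


# Hypothetical two-layer model: only the final layer "fc" is meta-updated.
meta = {"conv1": [1.0, 1.0], "fc": [0.0, 2.0]}
adapted = {"conv1": [5.0, 5.0], "fc": [1.0, 0.0]}
result = partial_meta_update(meta, adapted, trainable={"fc"}, step=0.5)
```

Freezing the early layers this way keeps the number of meta-learned weights small, which is one way a large backbone could be used without meta-training all of it.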
from supervised-reptile.
Ah, thanks for reminding me to fix this. At one point, I updated the HPs (and results) in the paper to be simpler, because I found that they didn't make much of a difference. I never changed them in this repo.
Thanks for the reply.
Very insightful paper for saving GPU memory and computation cost.
Hope to see more results on large network structures such as ResNet!
@csyanbin I doubt it would work better with a larger network, because the extra weights could increase the risk of overfitting. That's why the earlier MAML work reduced the number of convolutional filters.
So -- just to make sure I'm translating the parameters from the paper to code correctly, the parameters for the Omniglot experiments would look like this?
# 1-shot
python -u run_omniglot.py \
--train-shots 10 \
--inner-batch 10 \
--inner-iters 5 \
--shots 1 \
--eval-batch 5 \
--eval-iters 50 \
--meta-batch 5 \
--meta-iters 100000 \
--learning-rate 0.001 \
--meta-step 1 \
--meta-step-final 0
# 5-shot
python -u run_omniglot.py \
--train-shots 10 \
--inner-batch 20 \
--inner-iters 10 \
--shots 5 \
--eval-batch 10 \
--eval-iters 50 \
--meta-batch 5 \
--meta-iters 200000 \
--learning-rate 0.0005 \
--meta-step 1 \
--meta-step-final 0
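For readers wondering what `--meta-step 1` together with `--meta-step-final 0` implies, here is a small sketch. It assumes the meta step size is annealed linearly from `--meta-step` to `--meta-step-final` over `--meta-iters` iterations, which is my reading of the flag names and should be checked against the repository's code. The function names `annealed_meta_step` and `reptile_step` are hypothetical.

```python
def annealed_meta_step(iteration, meta_iters, meta_step, meta_step_final):
    """Linearly interpolate the meta step size (assumed schedule)."""
    frac_done = iteration / meta_iters
    return frac_done * meta_step_final + (1 - frac_done) * meta_step


def reptile_step(meta_weights, adapted_weights, step):
    """Reptile outer update: move the meta weights toward the adapted weights."""
    return [w + step * (a - w) for w, a in zip(meta_weights, adapted_weights)]


# With the 1-shot settings above (--meta-iters 100000, --meta-step 1,
# --meta-step-final 0), the step size halfway through training would be:
halfway = annealed_meta_step(50000, 100000, 1.0, 0.0)
new_weights = reptile_step([0.0, 2.0], [1.0, 0.0], halfway)
```

Under this assumption, early outer updates jump almost fully to the task-adapted weights, while late updates barely move the meta weights.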
@csyanbin thanks, I will take a look.
@bkj I see one problem: your 5-shot arguments should have --shots 5. Also, you may want to pass --transduction depending on which experiment you want to reproduce.
Ah yeah -- good catch. Updated above.