
Learning to Learn in TensorFlow

Dependencies
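
Judging from the versions referenced in the issues below, the code requires at least:

  • TensorFlow
  • Sonnet (dm-sonnet)
  • dill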

Training

python train.py --problem=mnist --save_path=./mnist

Command-line flags:

  • save_path: If present, the optimizer will be saved to the specified path every time the evaluation performance is improved.
  • num_epochs: Number of training epochs.
  • log_period: Number of epochs between reports of mean performance and timing.
  • evaluation_period: Number of epochs between evaluations of the optimizer.
  • evaluation_epochs: Number of evaluation epochs.
  • problem: Problem to train on. See Problems section below.
  • num_steps: Number of optimization steps per epoch.
  • unroll_length: Number of unroll steps for the optimizer.
  • learning_rate: Learning rate.
  • second_derivatives: If true, the optimizer will try to compute second derivatives through the loss function specified by the problem.
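
For example, a longer training run on the quadratic problem (the flag values here are only illustrative, not recommendations):

python train.py --problem=quadratic --save_path=./quad --num_epochs=10000 --unroll_length=20 --learning_rate=0.001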

Evaluation

python evaluate.py --problem=mnist --optimizer=L2L --path=./mnist

Command-line flags:

  • optimizer: Adam or L2L.
  • path: Path to saved optimizer, only relevant if using the L2L optimizer.
  • learning_rate: Learning rate, only relevant if using Adam optimizer.
  • num_epochs: Number of evaluation epochs.
  • seed: Seed for random number generation.
  • problem: Problem to evaluate on. See Problems section below.
  • num_steps: Number of optimization steps.
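
For example, to evaluate a saved L2L optimizer with a fixed random seed (again, the values are only illustrative):

python evaluate.py --problem=quadratic --optimizer=L2L --path=./quad --num_epochs=100 --seed=0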

Problems

The training and evaluation scripts support the following problems (see util.py for more details):

  • simple: One-variable quadratic function.
  • simple-multi: Two-variable quadratic function, where one of the variables is optimized using a learned optimizer and the other one using Adam.
  • quadratic: Batched ten-variable quadratic function.
  • mnist: MNIST classification using a two-layer fully connected network.
  • cifar: CIFAR-10 classification using a convolutional neural network.
  • cifar-multi: CIFAR-10 classification using a convolutional neural network, where two independent learned optimizers are used: one for the parameters of the convolutional layers and one for the parameters of the fully connected layers.

New problems can be implemented very easily. You can see in train.py that the meta_minimize method from the MetaOptimizer class is given a function that returns the TensorFlow operation that generates the loss function we want to minimize (see problems.py for an example).

It's important that all operations with Python side effects (e.g. queue creation) are done outside the function passed to meta_minimize. The cifar10 function in problems.py is a good example of a loss function that uses TensorFlow queues. A sketch of a new problem is given below.
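
As a hedged sketch (this is not code from the repo; all names, shapes, and hyperparameters are illustrative), a new batched-quadratic problem in the style of problems.py might look like this:

import tensorflow as tf

def my_quadratic(batch_size=128, num_dims=10):
  """Returns a callable that builds the loss for a random batched quadratic."""

  def build():
    # Variables created inside build() become the optimizee parameters
    # that the learned optimizer will update.
    x = tf.get_variable(
        "x",
        shape=[batch_size, num_dims],
        dtype=tf.float32,
        initializer=tf.random_normal_initializer(stddev=0.01))

    # Problem data. Note that only graph-building happens here; anything
    # with Python side effects (e.g. queue creation) must live outside.
    w = tf.random_uniform([batch_size, num_dims, num_dims])
    y = tf.random_uniform([batch_size, num_dims])

    # Loss: mean squared residual of the batched linear system w x = y.
    product = tf.squeeze(tf.matmul(w, tf.expand_dims(x, -1)))
    return tf.reduce_mean(tf.reduce_sum((product - y) ** 2, 1))

  return build

The returned build function would then be wired into util.get_config and handed to meta_minimize in train.py, in the same way as the built-in problems.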

Disclaimer: This is not an official Google product.


Issues

Followup to Issue 22

Issue 22: #22

I seem to have found a workaround using this solution: #22 (comment)

It involves changing the meta.py file. I have searched through the conda directories and can't find any file that seems to fit the solution. Since no one else seems to have the same issue, it's probably a dumb question, but can someone tell me how to access the meta.py file being referenced here?

Tensorflow and sonnet versions?

I'm trying to run with the following versions on Python 3.6.5:

>>> sonnet.__version__
'1.27'
>>> tensorflow.__version__
'1.12.0'

and getting the following error:

$ python train.py --problem=mnist --save_path=./mnist
WARNING:tensorflow:From /root/.anaconda/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Optimizee variables
['mlp/linear_0/w:0', 'mlp/linear_0/b:0', 'mlp/linear_1/w:0', 'mlp/linear_1/b:0']
Problem variables
[]
Traceback (most recent call last):
  File "/root/.anaconda/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 179, in assert_same_structure
    _pywrap_tensorflow.AssertSameStructure(nest1, nest2, check_types)
TypeError: The two structures don't have the same nested structure.

Presumably this is a version issue -- does anyone know tensorflow/sonnet versions where this repo will run?

Thanks!
~ Ben

Distributed version of this?

Has anybody already done some work to adapt this into a distributed model? (I wouldn't want to duplicate work.)
Any ideas or hints that could help? (I have built distributed models before but am struggling a bit to see how to set this up; would you have a meta-optimizer for every worker?)

Outputting the optimized parameters after or during evaluation

What is the best way to determine the optimised/optimizee variables/parameters after a run (or during it)?

print_stats only provides loss and epoch time. It would be good to know the current state of the optimizee in this case.

E.g., for the "simple" problem, the current value of x while x^2 is being minimized.
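
One possible approach (a sketch, assuming you add this to evaluate.py yourself; the name filters are illustrative and based on the "Optimizee variables" names that train.py prints):

# Inside the evaluation session, after the graph has been built:
optimizee_vars = [v for v in tf.global_variables()
                  if v.name.startswith("mlp/") or v.name.startswith("x")]
values = sess.run(optimizee_vars)
for var, value in zip(optimizee_vars, values):
    print(var.name, value)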

Learning-to-learn code doesn't run even with dependencies met (AFAIK), failing with an error message

I tried the following command, as directed by the README.md:

python train.py --problem=mnist --save_path=./mnist

The output from the run:

iMac:learning-to-learn shyamalsuhanachandra$ python train.py --problem=mnist --save_path=./mnist
Extracting MNIST-data/train-images-idx3-ubyte.gz
Extracting MNIST-data/train-labels-idx1-ubyte.gz
Extracting MNIST-data/t10k-images-idx3-ubyte.gz
Extracting MNIST-data/t10k-labels-idx1-ubyte.gz
Optimizee variables
[u'mlp/linear_0/w:0', u'mlp/linear_0/b:0', u'mlp/linear_1/w:0', u'mlp/linear_1/b:0']
Problem variables
[]
Traceback (most recent call last):
  File "train.py", line 115, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 68, in main
    second_derivatives=FLAGS.second_derivatives)
  File "/Users/shyamalsuhanachandra/learning-to-learn/meta.py", line 398, in meta_minimize
    info = self.meta_loss(make_loss, len_unroll, **kwargs)
  File "/Users/shyamalsuhanachandra/learning-to-learn/meta.py", line 370, in meta_loss
    reset = [tf.variables_initializer(variables), fx_array.close()]
AttributeError: 'module' object has no attribute 'variables_initializer'

What should I do to solve this problem? I am running the code for the first time.

Windows run error

Hi~

Great work!

When I run pyinstaller on a Windows 7 64-bit system, I get the following error:

$ pyinstaller
Traceback (most recent call last):
  File "c:\python27\lib\runpy.py", line 174, in _run_module_as_main
	"__main__", fname, loader, pkg_name)
  File "c:\python27\lib\runpy.py", line 72, in _run_code
	exec code in run_globals
  File "C:\Python27\Scripts\pyinstaller.exe\__main__.py", line 5, in <module>
  File "c:\python27\lib\site-packages\PyInstaller\__init__.py", line 72, in <module>
	DEFAULT_SPECPATH = compat.getcwd()
  File "c:\python27\lib\site-packages\PyInstaller\compat.py", line 613, in getcwd
	cwd = win32api.GetShortPathName(cwd)
AttributeError: 'module' object has no attribute 'GetShortPathName'

Here is the environment:

   Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:19:30) [MSC v.1500 32 bit (Intel)] on win32

   Windows 7 x64

I have both pywin32 and pypiwin32 installed.

I need your help, thank you.

AttributeError: 'module' object has no attribute 'AbstractModule'

Matplotlib required for plot()
Traceback (most recent call last):
  File "evaluate.py", line 26, in <module>
    import meta
  File "/Users/tomato/Sites/learning-to-learn/meta.py", line 32, in <module>
    import networks
  File "/Users/tomato/Sites/learning-to-learn/networks.py", line 31, in <module>
    import preprocess
  File "/Users/tomato/Sites/learning-to-learn/preprocess.py", line 26, in <module>
    class Clamp(snt.AbstractModule):
AttributeError: 'module' object has no attribute 'AbstractModule'
$ python -V
Python 2.7.13
$ pip -V
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)

Dependency: dill

Hi,

This is, of course, a minor issue, but I only managed to make it run (on Ubuntu 16.04) after installing "dill".

Best,

Pedro

AttributeError: 'Template' object has no attribute 'variable_scope'

Hello, I wonder if you can help with this error message?

(rllab3) ajay@ALPHA:~/PythonProjects/learning-to-learn-master_p3$ python3 train.py --problem=mnist
Extracting MNIST-data/train-images-idx3-ubyte.gz
Extracting MNIST-data/train-labels-idx1-ubyte.gz
Extracting MNIST-data/t10k-images-idx3-ubyte.gz
Extracting MNIST-data/t10k-labels-idx1-ubyte.gz
Traceback (most recent call last):
  File "train.py", line 115, in <module>
    tf.app.run()
  File "/home/ajay/anaconda3/envs/rllab3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "train.py", line 60, in main
    problem, net_config, net_assignments = util.get_config(FLAGS.problem)
  File "/home/ajay/PythonProjects/learning-to-learn-master_p3/util.py", line 96, in get_config
    problem = problems.mnist(layers=(20,), mode=mode)
  File "/home/ajay/PythonProjects/learning-to-learn-master_p3/problems.py", line 165, in mnist
    initializers=_nn_initializers)
  File "/home/ajay/PythonProjects/learning-to-learn-master_p3/nn/mlp.py", line 84, in __init__
    self._instantiate_layers()
  File "/home/ajay/PythonProjects/learning-to-learn-master_p3/nn/mlp.py", line 100, in _instantiate_layers
    with tf.variable_scope(self._template.variable_scope):
AttributeError: 'Template' object has no attribute 'variable_scope'

My environment is: tf 0.12.1, Python 3.5.

I've tried searching the web but can't find anything to fix it.

Thanks a lot :)

structures don't have the same sequence type

I ran python train.py --problem=mnist --save_path=./mnist
and got the following error. I have no idea what is wrong with it; can anybody help?

Traceback (most recent call last):
  File "/Users/ylfzr/Documents/Projects/learning-to-learn-master/train.py", line 117, in <module>
    tf.app.run()
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/Users/ylfzr/Documents/Projects/learning-to-learn-master/train.py", line 70, in main
    second_derivatives=FLAGS.second_derivatives)
  File "/Users/ylfzr/Documents/Projects/learning-to-learn-master/meta.py", line 401, in meta_minimize
    info = self.meta_loss(make_loss, len_unroll, **kwargs)
  File "/Users/ylfzr/Documents/Projects/learning-to-learn-master/meta.py", line 360, in meta_loss
    name="unroll")
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2775, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2604, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2561, in _BuildLoop
    nest.assert_same_structure(list(packed_vars_for_body), list(body_result))
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 200, in assert_same_structure
    _recursive_assert_same_structure(nest1, nest2, check_types)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 173, in _recursive_assert_same_structure
    _recursive_assert_same_structure(n1, n2, check_types)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 173, in _recursive_assert_same_structure
    _recursive_assert_same_structure(n1, n2, check_types)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 173, in _recursive_assert_same_structure
    _recursive_assert_same_structure(n1, n2, check_types)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 173, in _recursive_assert_same_structure
    _recursive_assert_same_structure(n1, n2, check_types)
  File "/Users/ylfzr/anaconda/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 159, in _recursive_assert_same_structure
    % (type_nest1, type_nest2))
TypeError: The two structures don't have the same sequence type. First structure has type <type 'tuple'>, while second structure has type <class 'sonnet.python.modules.gated_rnn.LSTMState'>.

Macos 10.12.6
tensorflow version used: 1.3.0

mnist result is not good

I found the performance of L2L to be better than that of Adam on the quadratic problem.

  1. python train.py --problem=quadratic --save_path=./quad
Run on GPU again
Epoch 10000
Log Mean Final Error: -1.08
Mean epoch time: 0.16 s
EVALUATION
Log Mean Final Error: -1.08
Mean epoch time: 0.05 s
Removing previously saved meta-optimizer
Saving meta-optimizer to ./quad
  2. Evaluate with L2L: python evaluate.py --problem=quadratic --optimizer=L2L --path=./quad
Epoch 100
Log Mean Final Error: -0.70 # works better than Adam optimizer below
Mean epoch time: 0.17 s
  3. Evaluate with Adam: python evaluate.py --problem=quadratic --optimizer=Adam --path=./quad
Epoch 100
Log Mean Final Error: -0.06
Mean epoch time: 0.09 s

However, the mnist problem does not seem to reproduce the results from the paper.
4. python train.py --problem=mnist --save_path=./mnist

Epoch 10000
Log Mean Final Error: -0.47
Mean epoch time: 0.76 s
EVALUATION
Log Mean Final Error: -0.42
Mean epoch time: 0.31 s

  5. python evaluate.py --problem=mnist --optimizer=L2L --path=./mnist
Epoch 100
Log Mean Final Error: 0.18
Mean epoch time: 0.60 s
  6. python evaluate.py --problem=mnist --optimizer=Adam --learning_rate=[0.1 | 0.3 | 0.01 | 0.003 | 0.001]

learning_rate    Log Mean Final Error (Epoch 100)    Mean epoch time
0.1              -0.31                               0.23 s
0.3               0.24                               0.23 s
0.01             -0.31                               0.23 s
0.003             0.08                               0.24 s
0.001             0.29                               0.23 s

What did I do wrong? Could you help me? I resolved "TypeError: The two structures don't have the same nested structure." with the solution mentioned in the following issue:
#22

Thank you in advance.

cifar queues problem

This project does not work with the cifar problem, even though the README states:

"New problems can be implemented very easily. You can see in train.py that the meta_minimize method from the MetaOptimizer class is given a function that returns the TensorFlow operation that generates the loss function we want to minimize (see problems.py for an example).

It's important that all operations with Python side effects (e.g. queue creation) must be done outside of the function passed to meta_minimize. The cifar10 function in problems.py is a good example of a loss function that uses TensorFlow queues."

AttributeError: module 'types' has no attribute 'StringTypes'

Hi, all,

I got the following error when running:

(root) root@milton-OptiPlex-9010:/data/code/learning-to-learn# python train.py --problem=mnist --save_path=./mnist
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST-data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST-data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST-data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST-data/t10k-labels-idx1-ubyte.gz
Traceback (most recent call last):
  File "train.py", line 115, in <module>
    tf.app.run()
  File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "train.py", line 60, in main
    problem, net_config, net_assignments = util.get_config(FLAGS.problem)
  File "/data/code/learning-to-learn/util.py", line 96, in get_config
    problem = problems.mnist(layers=(20,), mode=mode)
  File "/data/code/learning-to-learn/problems.py", line 164, in mnist
    initializers=_nn_initializers)
  File "/data/code/learning-to-learn/nn/mlp.py", line 64, in __init__
    super(MLP, self).__init__(name=name)
  File "/data/code/learning-to-learn/nn/base.py", line 119, in __init__
    if not isinstance(name, types.StringTypes):
AttributeError: module 'types' has no attribute 'StringTypes'

My environment is: tf 0.12.1, Python 3.5.2.
Any suggestion to fix it?

THANKS!
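
(A note on this error, as an assumption on my part rather than a confirmed fix: types.StringTypes exists only in Python 2, so under Python 3 the check in nn/base.py needs to test against str instead. A sketch of the kind of patch that would be needed, with a hypothetical error message:)

# nn/base.py (sketch, hypothetical): make the name check work on Python 2 and 3.
import six

if not isinstance(name, six.string_types):
    raise ValueError("Name must be a string.")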

TypeError when trying with latest Tensorflow

When I try to run the code, an error occurs: TypeError: Expected int32, got list containing Tensors of type '_Message' instead. Do you have any ideas about the error? Thanks a lot.

python train.py --problem=mnist --save_path=./mnist
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcurand.so.7.5 locally
Extracting MNIST-data/train-images-idx3-ubyte.gz
Extracting MNIST-data/train-labels-idx1-ubyte.gz
Extracting MNIST-data/t10k-images-idx3-ubyte.gz
Extracting MNIST-data/t10k-labels-idx1-ubyte.gz
Optimizee variables
[u'mlp/linear_0/w:0', u'mlp/linear_0/b:0', u'mlp/linear_1/w:0', u'mlp/linear_1/b:0']
Problem variables
[]
Traceback (most recent call last):
  File "train.py", line 115, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 68, in main
    second_derivatives=FLAGS.second_derivatives)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 398, in meta_minimize
    info = self.meta_loss(make_loss, len_unroll, **kwargs)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 357, in meta_loss
    name="unroll")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2622, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2455, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2405, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 337, in time_step
    deltas, s_i_next = update(nets[key], fx, x_i, s_i)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 320, in update
    deltas, state_next = zip(*[net(g, s) for g, s in zip(gradients, state)])
  File "/home/feigao/exp/learning-to-learn/nn/base.py", line 142, in __call__
    out = self._template(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/template.py", line 266, in __call__
    return self._call_func(args, kwargs, check_for_new_variables=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/template.py", line 215, in _call_func
    result = self._func(*args, **kwargs)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 249, in _build
    output, next_state = build_fn(reshaped_inputs, prev_state)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 209, in _build
    inputs = self._preprocess(tf.expand_dims(inputs, -1))
  File "/home/feigao/exp/learning-to-learn/nn/base.py", line 142, in __call__
    out = self._template(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/template.py", line 266, in __call__
    return self._call_func(args, kwargs, check_for_new_variables=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/template.py", line 215, in _call_func
    result = self._func(*args, **kwargs)
  File "/home/feigao/exp/learning-to-learn/preprocess.py", line 71, in _build
    return tf.concat_v2(ndims - 1, [clamped_log, sign])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1057, in concat_v2
    dtype=dtypes.int32).get_shape(
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

originally defined at:
  File "/home/feigao/exp/learning-to-learn/meta.py", line 291, in meta_loss
    nets, net_keys, subsets = _make_nets(x, self._config, net_assignments)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 186, in _make_nets
    net = networks.factory(**kwargs)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 44, in factory
    return net_class(**net_options)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 230, in __init__
    super(CoordinateWiseDeepLSTM, self).__init__(1, name=name, **kwargs)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 181, in __init__
    self._preprocess = preprocess_class(**preprocess_options)
  File "/home/feigao/exp/learning-to-learn/preprocess.py", line 50, in __init__
    super(LogAndSign, self).__init__(name=name)
  File "/home/feigao/exp/learning-to-learn/nn/base.py", line 123, in __init__
    create_scope_now_=True)


originally defined at:
  File "/home/feigao/exp/learning-to-learn/meta.py", line 291, in meta_loss
    nets, net_keys, subsets = _make_nets(x, self._config, net_assignments)
  File "/home/feigao/exp/learning-to-learn/meta.py", line 186, in _make_nets
    net = networks.factory(**kwargs)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 44, in factory
    return net_class(**net_options)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 230, in __init__
    super(CoordinateWiseDeepLSTM, self).__init__(1, name=name, **kwargs)
  File "/home/feigao/exp/learning-to-learn/networks.py", line 174, in __init__
    super(StandardDeepLSTM, self).__init__(name)
  File "/home/feigao/exp/learning-to-learn/nn/base.py", line 123, in __init__
    create_scope_now_=True)

Cannot run 'simple' problem: TypeError: ones_initializer() got multiple values for keyword argument 'dtype'

I simply cloned the code and ran it with the following flags:

flags.DEFINE_string("save_path", None, "Path for saved meta-optimizer.")
flags.DEFINE_integer("num_epochs", 10, "Number of training epochs.")
flags.DEFINE_integer("log_period", 1, "Log period.")
flags.DEFINE_integer("evaluation_period", 1, "Evaluation period.")
flags.DEFINE_integer("evaluation_epochs", 2, "Number of evaluation epochs.")

flags.DEFINE_string("problem", "simple", "Type of problem.")
flags.DEFINE_integer("num_steps", 10,
                     "Number of optimization steps per epoch.")
flags.DEFINE_integer("unroll_length", 5, "Meta-optimizer unroll length.")
flags.DEFINE_float("learning_rate", 0.001, "Learning rate.")
flags.DEFINE_boolean("second_derivatives", False, "Use second derivatives.")

And I get the following error:

Traceback (most recent call last):
  File "train.py", line 115, in <module>
    tf.app.run()
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "train.py", line 68, in main
    second_derivatives=FLAGS.second_derivatives)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 398, in meta_minimize
    info = self.meta_loss(make_loss, len_unroll, **kwargs)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 282, in meta_loss
    x, constants = _get_variables(make_loss)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 118, in _get_variables
    _wrap_variable_creation(func, custom_getter)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 91, in _wrap_variable_creation
    return func()
  File "/home/haohe/workspace/learning-to-learn/problems.py", line 49, in build
    initializer=tf.ones_initializer)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 87, in custom_get_variable
    return original_get_variable(*args, custom_getter=custom_getter, **kwargs)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
    custom_getter=custom_getter)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
    custom_getter=custom_getter)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 339, in get_variable
    validate_shape=validate_shape)
  File "/home/haohe/workspace/learning-to-learn/meta.py", line 110, in custom_getter
    variable = getter(name, **kwargs)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 677, in _get_single_variable
    expected_shape=shape)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 224, in __init__
    expected_shape=expected_shape)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 327, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "/home/haohe/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 665, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
TypeError: ones_initializer() got multiple values for keyword argument 'dtype'

I am new to TensorFlow; my TensorFlow version is r0.12.
Could anyone help me?
Any comments will be appreciated.

Problems for CIFAR experiments

When I try to run the cifar and cifar-multi experiments, I run into an error saying that the boolean variable is_training is not specified:

ValueError: Boolean is_training flag must be explicitly specified when using batch normalization.

originally defined at:
  File "train.py", line 61, in main
    problem, net_config, net_assignments = util.get_config(FLAGS.problem)
  File "/qydata/wwangbc/code/learning_to_optimize/l2l/util.py", line 113, in get_config
    mode=mode)
  File "/qydata/wwangbc/code/learning_to_optimize/l2l/problems.py", line 258, in cifar10
    use_batch_norm=batch_norm)
  File "/qydata/wwangbc/bin/anaconda/lib/python2.7/site-packages/sonnet/python/modules/nets/convnet.py", line 142, in __init__
    super(ConvNet2D, self).__init__(name=name)
  File "/qydata/wwangbc/bin/anaconda/lib/python2.7/site-packages/sonnet/python/modules/base.py", line 124, in __init__
    custom_getter_=self._custom_getter)

originally defined at:
  File "train.py", line 61, in main
    problem, net_config, net_assignments = util.get_config(FLAGS.problem)
  File "/qydata/wwangbc/code/learning_to_optimize/l2l/util.py", line 113, in get_config
    mode=mode)
  File "/qydata/wwangbc/code/learning_to_optimize/l2l/problems.py", line 268, in cifar10
    network = snt.Sequential([conv, snt.BatchFlatten(), mlp])
  File "/qydata/wwangbc/bin/anaconda/lib/python2.7/site-packages/sonnet/python/modules/sequential.py", line 65, in __init__
    super(Sequential, self).__init__(name=name)
  File "/qydata/wwangbc/bin/anaconda/lib/python2.7/site-packages/sonnet/python/modules/base.py", line 124, in __init__
    custom_getter_=self._custom_getter)

I think is_training should be passed to the batch normalization of both the conv net and the MLP; the snt.Sequential function seems to be misused here, since we need to pass extra build arguments. A sketch of what I mean follows.
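
Something like this is what I have in mind (only a sketch under my assumptions about problems.py, not tested; conv is the snt.nets.ConvNet2D built with use_batch_norm=True and mlp is the MLP from the cifar10 function):

def network(inputs, is_training=True):
  # snt.Sequential cannot forward extra build arguments, so call the
  # modules directly and thread is_training through to batch norm.
  h = conv(inputs, is_training=is_training)
  h = snt.BatchFlatten()(h)
  return mlp(h)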

What's more, the line network = snt.Sequential([conv, snt.BatchFlatten(), mlp]) suggests that there is only one convolutional module in the network, while there should be 3 convolutional layers as in the paper.

Could you please fix the bug and implement the complete 3-layer CNN?

Thanks a lot!

No restore logic in evaluate.py

I trained an MNIST network using train.py and saved it in the ./mnist folder.

In evaluate.py there is no logic that restores the saved network.

I printed the loss, and it doesn't seem to decrease. I also printed the deltas, which are exactly the same as doing plain gradient descent (i.e. as if the LSTM were initialized at 0).

Debugging the meta-optimizer

I've implemented a small binary text classification task in problems.py and util.py. I'm using a small MLP similar to the MNIST model. When I run the model with a regular optimizer, the loss on the training dataset goes down easily.
However, the meta-optimizer fails to minimize the loss; after 10k epochs, the loss is still as if the model was random. Do you have any insight or tips on how I could debug the meta-learner?
Thanks in advance. I really appreciate your help.

TypeError: The two structures don't have the same nested structure.

When I try the quadratic and mnist experiments, they produce the following error:
TypeError: The two structures don't have the same nested structure.
Entire first structure:
[., ., [.], [[((., .), (., .))]]]
Entire second structure:
[., ., [.], [[(LSTMState(hidden=., cell=.), LSTMState(hidden=., cell=.))]]]

Please help me figure out how to address this problem.
Thank you very much!

Just a ConvNet

If I run
$ python evaluate.py --optimizer=Adam --problem=cifar
am I training the ConvNet for 100 epochs in batches of 128? I'm trying to use this solely as an example of using Sonnet to create a ConvNet, without any meta-learning.

I'm confused because when I delete the get_default_net_config function in util.py, which only has information about the meta-learning net, and run the command above, there's an error. Why is the net config for the coordinatewise deep LSTM necessary just for training the ConvNet with Adam?

AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'RNNCell'

Hey there,

Thanks a lot for open-sourcing the code of this amazing paper! I'm trying to run your code with the latest version of tensorflow (0.12.1), but I'm getting AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'RNNCell'. Could you tell me how to fix this? Has RNNCell recently been moved somewhere else?

Thanks a lot for your help in advance!

MNIST not reproducible

Can anyone provide a requirements.txt to make this code work?
I have been using
tensorflow-gpu==1.5
dm-sonnet==1.10
numpy==1.14

and Python 3.6, but evaluating an MNIST L2L optimizer does not result in the same loss as during training, not even close.

I have applied the fix from #22.

Resetting each epoch?

Using the Adam optimizer (not L2L) for the CIFAR problem: if I print the cost after each epoch, it doesn't decrease over time when running with learning rate 0.001, num_steps 100, and num_epochs 100. However, printing the cost at each num_step, it does decrease within the epoch. Why does it seem like the weights are being reset each epoch?

I've also added code to check the training and validation accuracy after each epoch. These are also not improving from epoch to epoch.

Initialization Error for Problems

I've tested the basic mnist configuration along with others.

If you run according to the README, you get the following initialization error: a mismatch between the number of arguments passed and the number of arguments required.

Traceback (most recent call last):
  File "train.py", line 117, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 71, in main
    with ms.MonitoredSession() as sess:
TypeError: __init__() takes exactly 4 arguments (1 given)

Environment

  • Tensorflow 0.10.0
  • CUDA 8.0
  • CuDNN 5.1
  • Python 2.7
  • Ubuntu 16.04

Update
I've checked with another environment and it works, so this might be an environment-specific issue.

How to add accuracy prediction

I want to compute accuracy in the same way as the loss, but if I add accuracy in build(), will it, like the loss, contribute gradients to the weights? Could you help me? Thank you very much for your help.
