Matching Networks TensorFlow Implementation

Introduction

This is an implementation of Matching Networks as described in https://arxiv.org/abs/1606.04080. The implementation provides data loaders, model builders, model trainers and model savers for the Omniglot dataset. Furthermore, the data provider can be used on any dataset, of any size, as long as you can provide it in the folder structure outlined below.

Installation

To use the Matching Networks repository you must first install the project dependencies. This can be done by installing miniconda3 (with Python 3.6) and running:

pip install -r requirements.txt

Getting the data ready

The code in the training script uses a data provider that can build a dataset directly from a folder that contains the data. The folder structure required for the data provider to work is:

Dataset
├── class_0
│   └── (samples for class_0)
├── class_1
│   └── (samples for class_1)
...
└── class_N
    └── (samples for class_N)

Once a dataset in the above form is built, a data loader can be constructed with:

data = dataset.FolderDatasetLoader(num_of_gpus=num_gpus, batch_size=batch_size, image_height=28, image_width=28,
                                   image_channels=1,
                                   train_val_test_split=(1200/1622, 211/1622, 211/1622),
                                   samples_per_iter=1, num_workers=4,
                                   data_path="path/to/dataset", name="dataset_name",
                                   index_of_folder_indicating_class=-2, reset_stored_filepaths=False,
                                   num_samples_per_class=samples_per_class, num_classes_per_set=classes_per_set)

This builds a data loader for Matching Networks. The data provider can be used as demonstrated in the experiment script.

Sampling from the data loader is as simple as:

for sample_id, train_sample in enumerate(self.data.get_train_batches(total_batches=total_train_batches,
                                                                     augment_images=self.data_augmentation)):
    # each train_sample holds the support-set images/labels and target images/labels for one batch

The data provider loads and batches data in parallel while TensorFlow is executing a training step, so there is minimal waiting time between loading a batch and passing it to TensorFlow.
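For illustration, here is a minimal sketch of the general prefetching pattern described above (not the repository's actual implementation): a background thread fills a bounded queue with batches while the main thread consumes them, so loading overlaps with training. The load_batch_fn callable is hypothetical.

import queue
import threading

def prefetching_batches(load_batch_fn, num_batches, buffer_size=4):
    """Yield batches while a background thread keeps loading the next ones."""
    batch_queue = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def producer():
        for i in range(num_batches):
            batch_queue.put(load_batch_fn(i))  # blocks when the buffer is full
        batch_queue.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = batch_queue.get()
        if batch is sentinel:
            break
        yield batch  # the training step runs while the producer loads ahead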

Training a model

To train a model, pass arguments to the training script. For example, to run a 20-way, 1-shot experiment on Omniglot without full context embeddings:

python train_one_shot_learning_matching_network.py --batch_size 32 --experiment_title omniglot_20_1_matching_network --total_epochs 200 --full_context_unroll_k 5 --classes_per_set 20 --samples_per_class 1 --use_full_context_embeddings False --use_mean_per_class_embeddings False --dropout_rate_value 0.0

Features

The code supports automatic checkpointing as well as statistics saving. It uses 1200 classes for training, 211 classes for testing and 211 classes for validation. We save the latest 5 trained models and keep track of the models that perform best on the validation set. After training all epochs, we take the best validation model and produce test statistics. Furthermore, the number of classes and the number of samples per class can be modified, and the code will handle any combination that does not exceed the available memory. As an additional feature, we have added support for full context embeddings in our implementation.

Our implementation uses the Omniglot dataset, but one can easily add a new data provider and build a new experiment by passing the data provider to the ExperimentBuilder class. The system should work with it as long as it provides batches in the same way as our data provider; more details can be found in data.py.

We have also recently introduced a Full Context Embeddings version of Matching Networks, properly implemented as explained in the paper. Please don't hesitate to ask questions.

Acknowledgements

Special thanks to https://github.com/zergylord for his Matching Networks implementation, parts of which were used for this implementation. More details at https://github.com/zergylord/oneshot

Additional thanks to my colleagues https://github.com/gngdb, https://github.com/ZackHodari and https://github.com/artur-bekasov for reviewing my code and providing pointers.


Issues

Data loader issue

hey,

I'm using my own dataset, which follows the folder structure given in the README. I have 89 classes and want a 64/10/15 train/validation/test split. Each class is balanced and has 20 samples.
I made the following changes:
data = dataset.FolderDatasetLoader(num_of_gpus=1, batch_size=args.batch_size, image_height=200, image_width=300,
                                   image_channels=3,
                                   train_val_test_split=(64/89, 10/89, 15/89),
                                   samples_per_iter=1, num_workers=4,
                                   data_path="datasets/city_names", name="city_names",
                                   indexes_of_folders_indicating_class=-2, reset_stored_filepaths=False,
                                   num_samples_per_class=args.samples_per_class,
                                   num_classes_per_set=args.classes_per_set, label_as_int=False)
and I get the following error:
Mapped data paths can't be found, remapping paths..
Get images from /home/user/MatchingNetworks/datasets/city_names
Traceback (most recent call last):
  File "/home/cvprisi19/MatchingNetworks/data.py", line 98, in load_datapaths
    data_image_paths = self.load_dict(data_path_file)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 115, in load_dict
    with open(name, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/city_names.pkl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_one_shot_learning_matching_network.py", line 20, in <module>
    num_classes_per_set=args.classes_per_set, label_as_int=False)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 453, in __init__
    samples_per_iter, num_workers, reverse_channels, seed, labels_as_int=label_as_int)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 347, in __init__
    reset_stored_filepaths=False, data_path=data_path, labels_as_int=labels_as_int)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 466, in get_dataset
    reverse_channels=reverse_channels)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 439, in __init__
    indexes_of_folders_indicating_class=indexes_of_folders_indicating_class)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 41, in __init__
    self.x_train, self.x_val, self.x_test = self.load_dataset()
  File "/home/cvprisi19/MatchingNetworks/data.py", line 71, in load_dataset
    data_image_paths, index_to_label_name_dict_file, label_to_index = self.load_datapaths()
  File "/home/cvprisi19/MatchingNetworks/data.py", line 104, in load_datapaths
    data_image_paths, code_to_label_name, label_name_to_code = self.get_data_paths()
  File "/home/cvprisi19/MatchingNetworks/data.py", line 145, in get_data_paths
    label = self.get_label_from_path(filepath)
  File "/home/cvprisi19/MatchingNetworks/data.py", line 179, in get_label_from_path
    label = "_".join([label_bits[idx] for idx in self.indexes_of_folders_indicating_class])
TypeError: 'int' object is not iterable

Even though the path contains the dataset and the images, the pickle file itself is not being generated.
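Judging from the traceback, get_label_from_path iterates over indexes_of_folders_indicating_class, so the loader appears to expect a list of indices rather than a single int. A hedged guess at a fix would be to pass a list:

indexes_of_folders_indicating_class=[-2]  # a list, so the loader can iterate over it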

maybe a data feeder bug

file: data.py -> class: OmniglotNShotDataset -> function: sample_new_batch

for i in range(self.batch_size):
            classes_idx = np.arange(data_pack.shape[0])
            samples_idx = np.arange(data_pack.shape[1])
            choose_classes = np.random.choice(classes_idx, size=self.classes_per_set, replace=False)
            choose_label = np.random.choice(self.classes_per_set, size=1)
            choose_samples = np.random.choice(samples_idx, size=self.samples_per_class+1, replace=False)

            x_temp = data_pack[choose_classes]
            x_temp = x_temp[:, choose_samples]
            y_temp = np.arange(self.classes_per_set)
            support_set_x[i] = x_temp[:, :-1]
            support_set_y[i] = np.expand_dims(y_temp[:], axis=1)
            target_x[i] = x_temp[choose_label, -1]
            target_y[i] = y_temp[choose_label]

I checked this code carefully, and I found something unreasonable here.
You choose the classes first and then the samples; the problem is that you always choose the same sample indices in every class.
For example, after you choose 20 classes with 20 samples each, the right behaviour is to choose samples randomly within each class, but the current code picks the same i-th sample positions in every class.
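A minimal sketch of one possible fix, drawing a fresh set of sample indices independently for each chosen class (this reuses the names from the snippet above and is only a suggestion, not the repository's code):

x_temp_list = []
for class_idx in choose_classes:
    # a fresh, independent draw of sample indices for this class
    choose_samples = np.random.choice(samples_idx, size=self.samples_per_class + 1, replace=False)
    x_temp_list.append(data_pack[class_idx][choose_samples])
x_temp = np.stack(x_temp_list, axis=0)  # [classes_per_set, samples_per_class + 1, ...]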

Waiting for your reply.

loading data using dataprovider

Hi, I am trying to use the data provider which you kindly implemented for us :) I have the following folder structure:
data/class_name1 (it's a string)
data/class_name2 (..)

I have 107 classes; in each folder I have samples (JPGs in different resolutions, sampled from the web) for each class. The dataset is imbalanced, i.e. some classes have 9 examples and some have more than 300.
Now, when I try to use train_one_shot_learning_matching_network.py, I get the following output:


[THIS IS MY OWN PRINT STATEMENT] 
sample id, support set images, target set imgs, supp set labels, target set label                                        | 0/1000 [00:00<?, ?it/s]
0 (1, 1, 32, 20, 1) (1, 1, 32, 20, 1, 224, 224, 3) (1, 1, 32, 224, 224, 3) (1, 1, 32)


Traceback (most recent call last):
  File "train_one_shot_learning_matching_network.py", line 75, in <module>
    sess=sess)
  File "/home/mhnatiuk/Documents/code/MatchingNetworks/experiment_builder.py", line 73, in run_training_epoch
    self.learning_rate: self.current_learning_rate})
  File "/home/mhnatiuk/Documents/code/env/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/mhnatiuk/Documents/code/env/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1113, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 32, 20, 1) for Tensor 'support_set_images:0', which has shape '(1, 32, 20, 1, 224, 224, 3)'

which seems rather weird, since it suggests that I am providing a support image with a single value and an incorrect shape. Can you suggest where to look for the mistake? I could fork your repo and add some assertions.
Best regards

bug report: shouldn't use tf.nn.sparse_softmax_cross_entropy_with_logits to calculate loss

Bug:
Since the prediction is already the output of a softmax over similarities, we shouldn't apply softmax again when calculating the loss.

Show:
If tf.nn.sparse_softmax_cross_entropy_with_logits is used, the loss stays high even when the accuracy is about 0.99.

Solution:
Only one softmax layer should exist. Compute the cross entropy without applying softmax again; the loss then becomes extremely small when training finishes.
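For illustration, a minimal sketch of computing cross entropy directly from already-softmaxed probabilities (names like probs and labels are hypothetical, not the repository's variables):

import tensorflow as tf

def crossentropy_from_probs(probs, labels, eps=1e-10):
    # probs: [batch, num_classes], already softmax-normalized similarities
    # labels: [batch], integer class indices
    labels = tf.cast(labels, tf.int32)
    indices = tf.stack([tf.range(tf.shape(labels)[0]), labels], axis=1)
    target_probs = tf.gather_nd(probs, indices)  # probability assigned to the true class
    # clip to avoid log(0); crucially, no second softmax is applied
    return -tf.reduce_mean(tf.log(tf.clip_by_value(target_probs, eps, 1.0)))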

Question about DistanceNetwork

Thanks for your code! I have two questions about it:

First, when calculating the cosine distance, I think the dot product should also be divided by the L2 norm of input_image, but currently it seems to be divided only by the L2 norm of support_image.

Second, can you explain in detail the efficient implementation of cross entropy, or point me to the literature I should consult?

In f_embedding, h_c is never used

Line 82 of one_shot_learning_network.py,
x, h_c = fw_lstm_cells_encoder(inputs=target_set_embeddings, state=c_h)
h_c should be used as the c_h[0] for the next call of fw_lstm_cells_encoder
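A minimal sketch of what the suggested fix might look like, assuming c_h is an (h, c)-style state tuple (this is a guess at the intent, not verified against the repository):

x, h_c = fw_lstm_cells_encoder(inputs=target_set_embeddings, state=c_h)
c_h = (h_c, x)  # carry the returned cell state into the next call instead of discarding it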

Which parameters to change for loading own data?

Hi,

I have an image set of 6 classes, each containing 4 images. Exactly which parameters should I modify? So far I've only modified train_one_shot_learning_matching_network.py, where I changed batch_size to 1 (I also tried leaving it at 32), classes_per_set from 20 to 4, total_train_batches to 18, total_val_batches to 3, and total_test_batches to 3. Those look like the main variables I should need to change to work with my data, but I get various errors depending on what I set them to (example below). Is there anything else I should modify?

I can successfully run this model using the default Omniglot data, but when I try to use my own data I receive different errors depending on which parameters I have changed (I've tried many different combinations).

On my latest iteration, here's my code in train_one_shot_learning_matching_network.py, lines 11 to 36 (which are the only lines I've modified):

batch_size = 1
fce = False
classes_per_set = 4
samples_per_class = 1
continue_from_epoch = -1  # use -1 to start from scratch
epochs = 1
num_gpus = 1
logs_path = "one_shot_outputs/"
experiment_name = "one_shot_learning_embedding_{}_{}".format(samples_per_class, classes_per_set)

data = dataset.FolderDatasetLoader(num_of_gpus=num_gpus, batch_size=batch_size, image_height=100, image_width=100,
                                   image_channels=3,  # RGB images have 3 color channels
                                   train_val_test_split=(18/24, 3/24, 3/24),  # default was (1200/1622, 211/1622, 211/1622)
                                   samples_per_iter=1, num_workers=4,
                                   data_path="datasets/toys", name="toys",
                                   index_of_folder_indicating_class=-2, reset_stored_filepaths=False,
                                   num_samples_per_class=samples_per_class, num_classes_per_set=classes_per_set)

experiment = ExperimentBuilder(data)
one_shot_omniglot, losses, c_error_opt_op, init = experiment.build_experiment(batch_size,
                                                                                     classes_per_set,
                                                                                     samples_per_class, fce)
total_train_batches = 18
total_val_batches = 3
total_test_batches = 3

And running 'python train_one_shot_learning_matching_network.py' in Terminal (on a Mac) yields:

/Users/spencer/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
6
4 5 6
data {'train': 16, 'val': 4, 'test': 4}
WARNING:tensorflow:From /Users/spencer/Documents/MatchingNetworks/one_shot_learning_network.py:75: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Traceback (most recent call last):
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 686, in _call_cpp_shape_fn_impl
    input_tensors_as_shapes, status)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Index out of range using input dim 0; input has only 0 dims for 'losses/strided_slice_9' (op: 'StridedSlice') with input shapes: [], [2], [2], [2] and with computed input tensors: input[3] = <1 1>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_one_shot_learning_matching_network.py", line 33, in <module>
    samples_per_class, fce)
  File "/Users/spencer/Documents/MatchingNetworks/experiment_builder.py", line 48, in build_experiment
    summary, self.losses, self.c_error_opt_op = self.one_shot_omniglot.init_train()
  File "/Users/spencer/Documents/MatchingNetworks/one_shot_learning_network.py", line 278, in init_train
    losses = self.loss()
  File "/Users/spencer/Documents/MatchingNetworks/one_shot_learning_network.py", line 237, in loss
    crossentropy_loss = self.crossentropy_softmax(targets=targets, outputs=preds)
  File "/Users/spencer/Documents/MatchingNetworks/one_shot_learning_network.py", line 269, in crossentropy_softmax
    normOutputs = outputs - tf.reduce_max(outputs, axis=-1)[:, None]
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 588, in _slice_helper
    name=name)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 754, in strided_slice
    shrink_axis_mask=shrink_axis_mask)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5397, in strided_slice
    name=name)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3273, in create_op
    compute_device=compute_device)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3313, in _create_op_helper
    set_shapes_for_outputs(op)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2501, in set_shapes_for_outputs
    return _set_shapes_for_outputs(op)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2474, in _set_shapes_for_outputs
    shapes = shape_func(op)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2404, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/Users/spencer/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Index out of range using input dim 0; input has only 0 dims for 'losses/strided_slice_9' (op: 'StridedSlice') with input shapes: [], [2], [2], [2] and with computed input tensors: input[3] = <1 1>.

I believe my file structure is set up properly for the data loader: [screenshot of the dataset folder structure]

I probably just don't understand some of the variables and how they should change based on the number of classes/images. Any help is much appreciated!

How to pretrain baseline classifier?

Hi, Antreas,

Thank you for the code!

I have a question about the experiment setting in chapter 4.1:
'The baseline classifier was trained to classify an image into one of the original classes present in the training data set'

To my understanding, this means first pre-training the baseline classifier on all the training classes in the usual supervised manner, i.e. a 1200-class classification problem for Omniglot, and then using the features from this baseline classifier (taken before the softmax) to train in the one-shot manner.

What do you think of it?

Regards,

Xiaolu

Performance on Omniglot

Hi, I have run your code on the Omniglot dataset, and the performance is higher than the number in the paper: after 3 epochs, the 1-shot, 20-way accuracy reaches 0.9615. However, when setting shuffle_classes to False (https://github.com/AntreasAntoniou/MatchingNetworks/blob/master/data.py#L4), the performance drops to around 0.935. It looks like the train/val/test split is important. If shuffle_classes is True, the classes in the test set are more likely to come from different alphabets; if it is False, they are more likely to come from the same alphabet (assuming the classes in the original data.npy are sorted by alphabet). Within-alphabet classification is obviously more difficult than cross-alphabet classification.

Another issue: at https://github.com/AntreasAntoniou/MatchingNetworks/blob/master/data.py#L95, augment is set to False for test/val, and it seems the images won't be normalized.

Besides, have you ever tried 5-shot, 20-way and 5-shot, 5-way experiments? The paper reports 20-way performance nearly identical to 5-way, which is strange; my experiments can reach the 5-way, 5-shot performance but not the 20-way, 5-shot performance. Thanks.

question over the decoder in bid_lstm

I'm a bit confused about applying a "decoder" right after the encoder layer in the bidirectional LSTM implementation. Why do we need an extra layer? It's not really reflected in the original paper, correct?

why is the magnitude of input_image not divided?

To compute cosine similarity, the dot product of two vectors is divided by the magnitudes of both vectors, i.e. cos(A, B) = dot(A, B) / (||A|| * ||B||).
However, you did not use the magnitude of input_image to compute cosine similarity.
Why is that?

class DistanceNetwork():

    def __init__(self):
        self.reuse = False

    def __call__(self, support_set, input_image, name, training=False):
        with tf.name_scope('distance-module' + name), tf.variable_scope('distance-module', reuse=self.reuse):
            eps = 1e-10
            similarities = []
            for support_image in tf.unstack(support_set, axis=0):
                sum_support = tf.reduce_sum(tf.square(support_image), 1, keep_dims=True)
                support_magnitude = tf.rsqrt(tf.clip_by_value(sum_support, eps, float("inf")))
                dot_product = tf.matmul(tf.expand_dims(input_image, 1), tf.expand_dims(support_image, 2))
                dot_product = tf.squeeze(dot_product, [1, ])
                cosine_similarity = dot_product * support_magnitude
                similarities.append(cosine_similarity)

        similarities = tf.concat(axis=1, values=similarities)
        self.variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='distance-module')

        return similarities
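For reference, a full cosine similarity would also divide by the query's magnitude. A minimal sketch of that change, reusing the names from the snippet above:

sum_input = tf.reduce_sum(tf.square(input_image), 1, keep_dims=True)
input_magnitude = tf.rsqrt(tf.clip_by_value(sum_input, eps, float("inf")))
cosine_similarity = dot_product * support_magnitude * input_magnitude  # normalize by both norms

Note that the query magnitude is shared across all support images, so it rescales every similarity by the same factor: it does not change the argmax, but it does change the temperature of the subsequent softmax.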

incorrect implementation of FCE

  1. FCE should be a seq2seq-style architecture with attention (see the sketch below).
  2. The decoder of FCE should be a uni-directional LSTM instead of a bi-directional one.
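For context, a minimal sketch of one attention step of the paper's fully conditional embedding f(x, S) (equations 2-5 of the paper). Names and shapes are illustrative; the readout is concatenated into the LSTM input as a common simplification of the paper's [h, r] state, and the cell's num_units is assumed to equal the embedding size d:

import tensorflow as tf

def fce_query_step(query_embed, support_embeds, lstm_cell, state, r_prev):
    # query_embed:    [batch, d]     f'(x), the CNN embedding of the query
    # support_embeds: [batch, n, d]  g(x_i) for the n support examples
    # r_prev:         [batch, d]     previous attention readout
    output, state = lstm_cell(tf.concat([query_embed, r_prev], axis=1), state)
    h = output + query_embed  # skip connection: h_k = h_hat_k + f'(x)
    scores = tf.einsum('bd,bnd->bn', h, support_embeds)  # h_k . g(x_i)
    attention = tf.nn.softmax(scores)  # a(h_k, g(x_i))
    r = tf.einsum('bn,bnd->bd', attention, support_embeds)  # new readout r_k
    return h, state, r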

How to load own data

Hi,
The model is excellent, and I'd like to use some of my own images instead of the Omniglot data. From what I can tell the Omniglot data is being loaded from data.npy (although I can't figure out how to view the images in this file to confirm). I assume that I need to get my own images, along with their class labels, into a file of type *.npy to use them in the model. How exactly should I do this?

I've looked at some documentation for saving data as *.npy, but I'm not sure how to add in class labels.
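For what it's worth, here is a minimal sketch of one way to pack a folder-per-class image set into a single .npy array. It assumes (this is a guess, not confirmed from data.py) that the loader expects a [num_classes, samples_per_class, height, width, channels] array in which the class label is simply the first index, and that every class has the same number of samples:

import os
import numpy as np
from PIL import Image

def build_npy(root_dir, out_file, size=(28, 28)):
    # pack root_dir/<class_name>/<image> into [classes, samples, h, w, 1]
    data = []
    for class_name in sorted(os.listdir(root_dir)):  # sorted order defines the class index
        class_path = os.path.join(root_dir, class_name)
        samples = []
        for fname in sorted(os.listdir(class_path)):
            img = Image.open(os.path.join(class_path, fname)).convert('L').resize(size)
            samples.append(np.array(img)[..., None])  # add a channel axis
        data.append(np.stack(samples))
    np.save(out_file, np.stack(data))  # the class label is the first array index

To eyeball the images in an existing data.npy, something like plt.imshow(np.load('data.npy')[0, 0].squeeze(), cmap='gray') with matplotlib should display the first sample of the first class.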

Any help is much appreciated!

Dropout during testing epoch

Hi,

It looks like you are using some dropout during run_testing_epoch, and it is the same rate used for training and evaluating the model, self.args.dropout_rate_value (default value 0.3).

I would think that during testing you would want a dropout value of 0.

Is this the intended behavior?

Best Regards,
Pierre

ValueError when executing train_one_shot_learning_matching_network.py without any modification

Hi,
I simply ran train_one_shot_learning_matching_network.py following the guideline in the README file, and then I met the following error. Does anyone else have the same problem?

Traceback (most recent call last):                                    | 0/1000 [00:00<?, ?it/s]
  File "train_one_shot_learning_matching_network.py", line 69, in <module>
    sess=sess)
  File "/one_shot_learning/MatchingNetworks-master/experiment_builder.py", line 82, in run_training_epoch
    self.learning_rate: self.current_learning_rate})
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 32, 20, 1) for Tensor 'target_image:0', which has shape '(1, 32, 28, 28, 1)'

Baseline model in paper

Hi -- have you ever tried to reproduce the baseline results reported in the paper? I implemented Matching Networks from scratch, and my baseline model does substantially better than the one reported in the paper (a 66% reduction in error). I'm asking around to see whether other people have noticed a similar issue.

Thanks
Ben

Clarification about implementation

Sorry to use the bug tracker for this; it's actually more of a question.
How did you interpret the concatenation of the hidden state and the readout in equation 3 of the paper?
It seems to me the state has twice the required shape after the concatenation; how is one supposed to handle that?
