
vqa_demo's Introduction

VQA Demo

Updated to work with Keras 2.0, TensorFlow 1.2, and spaCy 2.0. This code is meant for education, so the focus is on simplicity rather than speed.

This is a simple demo of Visual Question Answering (VQA) which uses pretrained models (see models/CNN and models/VQA) to answer a given question about a given image.

Dependencies

  1. Keras version 2.0+

    • Modular deep learning library for Python
  2. TensorFlow 1.2+ (might also work with Theano; I have not tested Theano after the recent commits, so use commit 0f89007 for Theano)

  3. scikit-learn

    • Quintessential machine learning library for Python
  4. spaCy version 2.0+

    • Used to load GloVe word vectors
    • To download and install the GloVe vectors:
      • python -m spacy download en_vectors_web_lg
  5. OpenCV

    • OpenCV is used only to resize the image and reorder the color channels.
    • You may use other libraries as long as you pass a 224x224 BGR image (NOTE: BGR, not RGB); see the preprocessing sketch after this list.
  6. VGG 16 Pretrained Weights
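
For illustration, here is a minimal preprocessing sketch (an assumption of how the input can be prepared, not the exact demo.py code) that produces the 224x224 BGR, channels-first array expected by the VGG-16 model in models/CNN/VGG.py, which declares input_shape=(3, 224, 224):

import cv2
import numpy as np

def load_image(image_file_name):
    # OpenCV loads images in BGR order already, so no channel swap is needed
    im = cv2.imread(image_file_name)
    im = cv2.resize(im, (224, 224)).astype(np.float32)  # resize to 224x224
    im = im.transpose((2, 0, 1))        # HWC -> CHW, matching input_shape=(3, 224, 224)
    return np.expand_dims(im, axis=0)   # final shape: (1, 3, 224, 224)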

Usage

python demo.py -image_file_name path_to_file -question "Question to be asked"

e.g.,

python demo.py -image_file_name test.jpg -question "Is there a man in the picture?"

If you prefer to use the Theano backend and have a GPU, you may want to run it like this:

THEANO_FLAGS='floatX=float32,device=gpu0,lib.cnmem=1,mode=FAST_RUN' python demo.py -image_file_name test.jpg -question "What vehicle is in the picture?"

Expected output:

095.2 %  train
00.67 %  subway
00.54 %  mcdonald's
00.38 %  bus
00.33 %  train station

Runtime

  • GPU (Titan X) Theano optimizer=fast_run : 51.3 seconds
  • GPU (Titan X) Theano optimizer=fast_compile : 47.5 seconds
  • CPU (i7-5820K @ 3.30GHz) : 35.9 seconds (Is this strange or not?)

IPython Notebook

A Jupyter/IPython notebook with more examples and an interactive tutorial is provided: https://github.com/iamaaditya/VQA_Demo/blob/master/Visual_Question_Answering_Demo_in_python_notebook.ipynb

NOTE: See the comments in demo.py for more information on the model and methods.

VQA Training

vqa_demo's People

Contributors

akshitac8, bryant1410, iamaaditya, yadavankit


vqa_demo's Issues

from models.VQA.VQA import VQA_MODEL, error while installing models

File "C:/Users/ADMIN/PycharmProjects/VQA/VQA_Demo/demo.py", line 25, in get_image_model
image_model = VGG_16(CNN_weights_file_name)
File "C:\Users\ADMIN\PycharmProjects\VQA\VQA_Demo\models\CNN\VGG.py", line 56, in VGG_16
model.add(ZeroPadding2D((1,1),input_shape=(3,224,224)))
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\engine\sequential.py", line 166, in add
layer(x)
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
return func(*args, **kwargs)
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\engine\base_layer.py", line 446, in call
self.assert_input_compatibility(inputs)
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\engine\base_layer.py", line 310, in assert_input_compatibility
K.is_keras_tensor(x)
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\backend\tensorflow_backend.py", line 695, in is_keras_tensor
if not is_tensor(x):
File "C:\Users\ADMIN\PycharmProjects\Kmean\venv\lib\site-packages\keras\backend\tensorflow_backend.py", line 703, in is_tensor
return isinstance(x, tf_ops._TensorLike) or tf_ops.is_dense_tensor_like(x)
AttributeError: module 'tensorflow.python.framework.ops' has no attribute '_TensorLike'

Filter must not be larger than the input

When I run the demo.py file to test it, I see the following error.

$ python demo.py -image_file_name test.jpg -question "Is there a man in the picture?"

Using TensorFlow backend.

Loading image features ...
Traceback (most recent call last):
  File "demo.py", line 114, in <module>
    main()
  File "demo.py", line 92, in main
    image_features = get_image_features(args.image_file_name, CNN_weights_file_name)
  File "demo.py", line 55, in get_image_features
    image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
  File "demo.py", line 20, in get_image_model
    image_model = VGG_16(CNN_weights_file_name)
  File "/home/alibugra/Desktop/VQA_Demo/models/CNN/VGG.py", line 39, in VGG_16
    model.add(MaxPooling2D((2,2), strides=(2,2)))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 324, in add
    output_tensor = layer(self.outputs[0])
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 517, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 571, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 158, in call
    dim_ordering=self.dim_ordering)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/pooling.py", line 207, in _pooling_function
    border_mode, dim_ordering, pool_mode='max')
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1815, in pool2d
    x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 665, in max_pool
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 1123, in _max_pool
    data_format=data_format, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2319, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1711, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 505, in max_pool_shape
    padding)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 184, in get2d_conv_output_size
    (row_stride, col_stride), padding_type)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 149, in get_conv_output_size
    "Filter: %r Input: %r" % (filter_size, input_size))
ValueError: Filter must not be larger than the input: Filter: (2, 2) Input: (1, 112)

Predict function is showing error

Hi,

Thank you for this interactive demo of VQA. However, while running the prediction cell:

y_output = model_vqa.predict([question_features, image_features])

labelencoder = joblib.load(label_encoder_file_name)
for label in reversed(np.argsort(y_output)[0, -5:]):
    print str(round(y_output[0, label]*100, 2)).zfill(5), "% ", labelencoder.inverse_transform(label)

Then it shows an error like the one below:

ValueError Traceback (most recent call last)
in ()
----> 1 y_output = model_vqa.predict([question_features, image_features])
2
3 # This task here is represented as a classification into a 1000 top answers
4 # this means some of the answers were not part of training and thus would
5 # not show up in the result.

/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/models.pyc in predict(self, x, batch_size, verbose)
454 if self.model is None:
455 raise Exception('The model needs to be compiled before being used.')
--> 456 return self.model.predict(x, batch_size=batch_size, verbose=verbose)
457
458 def predict_on_batch(self, x):

/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/training.pyc in predict(self, x, batch_size, verbose)
1117 f = self.predict_function
1118 return self._predict_loop(f, ins,
-> 1119 batch_size=batch_size, verbose=verbose)
1120
1121 def train_on_batch(self, x, y,

/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/training.pyc in _predict_loop(self, f, ins, batch_size, verbose)
837 ins_batch = slice_X(ins, batch_ids)
838
--> 839 batch_outs = f(ins_batch)
840 if type(batch_outs) != list:
841 batch_outs = [batch_outs]

/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/backend/theano_backend.pyc in __call__(self, inputs)
505 def __call__(self, inputs):
506 assert type(inputs) in {list, tuple}
--> 507 return self.function(*inputs)
508
509

/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
869 node=self.fn.nodes[self.fn.position_of_error],
870 thunk=thunk,
--> 871 storage_map=getattr(self.fn, 'storage_map', None))
872 else:
873 # old-style linkers raise their own exceptions

/usr/local/lib/python2.7/dist-packages/theano/gof/link.pyc in raise_with_op(node, thunk, exc_info, storage_map)
312 # extra long error message in that case.
313 pass
--> 314 reraise(exc_type, exc_value, exc_trace)
315
316

/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
857 t0_fn = time.time()
858 try:
--> 859 outputs = self.fn()
860 except Exception:
861 if hasattr(self.fn, 'position_of_error'):

ValueError: total size of new array must be unchanged
Apply node that caused the error: Reshape{3}(Elemwise{Add}[(0, 0)].0, TensorConstant{[ -1 30 512]})
Toposort index: 49
Inputs types: [TensorType(float32, matrix), TensorType(int64, vector)]
Inputs shapes: [(3, 512), (3,)]
Inputs strides: [(2048, 4), (8,)]
Inputs values: ['not shown', array([ -1, 30, 512])]
Outputs clients: [[Join(TensorConstant{2}, Reshape{3}.0, Reshape{3}.0, Reshape{3}.0, Reshape{3}.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/topology.py", line 341, in create_input_layer
self(x)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/topology.py", line 485, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/topology.py", line 543, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/engine/topology.py", line 148, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/layers/recurrent.py", line 219, in call
preprocessed_input = self.preprocess_input(x)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/layers/recurrent.py", line 729, in preprocess_input
input_dim, self.output_dim, timesteps)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/layers/recurrent.py", line 38, in time_distributed_dense
x = K.reshape(x, (-1, timesteps, output_dim))
File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.2-py2.7.egg/keras/backend/theano_backend.py", line 283, in reshape
return T.reshape(x, shape)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

SyntaxError: invalid syntax

File "demo.py", line 114
print str(round(y_output[0,label]*100,2)).zfill(5), "% ", labelencoder.inverse_transform(label)
^
How do I resolve this error? Please help.
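
For context, that line uses the Python 2 print statement, which is invalid syntax under Python 3; the print-function form would be (a hypothetical fix sketch, not a change from the repo):

print(str(round(y_output[0, label]*100, 2)).zfill(5), "% ", labelencoder.inverse_transform(label))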

Error predicting


ValueError Traceback (most recent call last)
in
8 labelencoder = joblib.load(label_encoder_file_name)
9 for label in reversed(np.argsort(y_output)[0,-5:]):
---> 10 print(str(round(y_output[0,label]*100,2)).zfill(5), "% ",labelencoder.inverse_transform(label))

~\Anaconda3\envs\VQA1\lib\site-packages\sklearn\preprocessing\label.py in inverse_transform(self, y)
271 """
272 check_is_fitted(self, 'classes_')
--> 273 y = column_or_1d(y, warn=True)
274 # inverse transform of empty array is empty array
275 if _num_samples(y) == 0:

~\Anaconda3\envs\VQA1\lib\site-packages\sklearn\utils\validation.py in column_or_1d(y, warn)
795 return np.ravel(y)
796
--> 797 raise ValueError("bad input shape {0}".format(shape))
798
799

ValueError: bad input shape ()

KeyError: 'class_name'

When I run this part from the notebook:

from keras.utils.visualize_util import plot
model_vqa = get_VQA_model(VQA_model_file_name, VQA_weights_file_name)
plot(model_vqa, to_file='model_vqa.png')

I get this error:
KeyError Traceback (most recent call last)
in ()
1 from keras.utils.visualize_util import plot
----> 2 model_vqa = get_VQA_model(VQA_model_file_name, VQA_weights_file_name)
3 plot(model_vqa, to_file='model_vqa.png')

in get_VQA_model(VQA_model_file_name, VQA_weights_file_name)
5 # very easy to understand and work. Alternative would be to load model
6 # from binary like cPickle but then model would be obfuscated to users
----> 7 vqa_model = model_from_json(open(VQA_model_file_name).read())
8 vqa_model.load_weights(VQA_weights_file_name)
9 vqa_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

/home/ap/github/VQA_Demo/keras/models.pyc in model_from_json(json_string, custom_objects)
28 from keras.utils.layer_utils import layer_from_config
29 config = json.loads(json_string)
---> 30 return layer_from_config(config, custom_objects=custom_objects)
31
32

/home/ap/github/VQA_Demo/keras/utils/layer_utils.py in layer_from_config(config, custom_objects)
22 globals()[cls_key] = custom_objects[cls_key]
23
---> 24 if 'class_name' in config:
25 class_name = config['class_name']
26 layer_class = Sequential

KeyError: 'class_name'

related to Vgg16...(please reply soon)

I am getting the error while running this command -
model_vgg = get_image_model(CNN_weights_file_name)

Error - You are trying to load a weight file containing 0 layers into a model with 16 layers.

Value Error: Unknown Layer: Merge

(screenshot of the error attached as an image in the issue)

Hi,

I am working on the code given in the IPython notebook and have tried everything I can think of, but the issue persists.

I have also uploaded the image

I am very interested in the work done by you, please help me out.

Thank You

Inconsistent dimensions in VQA json model

Hi, thanks for the repo!

In the function get_question_features:
The IPython notebook uses an np array of shape (1, len(tokens), 300) for question_tensor, while demo.py uses (1, 30, 300). The JSON model also uses the latter.

The IPython notebook produces an error at this function.
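
In other words, the mismatch is in the shape of the question tensor fed to the model's LSTM input; a quick illustration of the two variants (a sketch using the shapes quoted above, not code from the repo):

question_tensor = np.zeros((1, len(tokens), 300))  # notebook: variable length, does not match the saved JSON model
question_tensor = np.zeros((1, 30, 300))            # demo.py: fixed 30 timesteps; unused rows stay zero-padded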

Error while loading image features

Loading image features ...
Traceback (most recent call last):
File "demo.py", line 121, in
main()
File "demo.py", line 99, in main
image_features = get_image_features(args.image_file_name, CNN_weights_file_name)
File "demo.py", line 61, in get_image_features
image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
ValueError: could not broadcast input array from shape (1000) into shape (4096)

This is the error I am getting while loading image features.
Can you please provide a solution for this? It's very urgent.

h5py/h5f.pyx in h5py.h5f.open() OSError: Unable to open file (file signature not found)

While running the code, we are receiving the error below. Please help me out.

OSError Traceback (most recent call last)
in ()
85
86 if __name__ == "__main__":
---> 87 main()

in main()
81
82 if verbose : print("\n\n\nLoading image features ...")
---> 83 image_features = get_image_features(args.image_file_name, CNN_weights_file_name)
84
85

in get_image_features(image_file_name, CNN_weights_file_name)
65 print('call image features middle')
66 print(CNN_weights_file_name)
---> 67 image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
68 return image_features
69

in get_image_model(CNN_weights_file_name)
27 print(CNN_weights_file_name)
28
---> 29 image_model = VGG_16(CNN_weights_file_name)
30
31 # this is standard VGG 16 without the last two layers

/content/drive/My Drive/VQA_Demo/models/CNN/VGG.py in VGG_16(weights_path)
99 model.add(Dense(1000, activation='softmax'))
100
--> 101 if weights_path:
102 # model.load_weights(weights_path)
103 load_model_legacy(model, weights_path)

/content/drive/My Drive/VQA_Demo/models/CNN/VGG.py in load_model_legacy(model, weight_path)
33 ''' this function is used because the weights in this model
34 were trained with legacy keras. New keras does not support loading these weights '''
---> 35
36 import h5py
37 f = h5py.File(weight_path, mode='r')

/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, **kwds)
310 with phil:
311 fapl = make_fapl(driver, libver, **kwds)
--> 312 fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
313
314 if swmr_support:

/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
140 if swmr and swmr_support:
141 flags |= h5f.ACC_SWMR_READ
--> 142 fid = h5f.open(name, flags, fapl=fapl)
143 elif mode == 'r+':
144 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

OSError: Unable to open file (file signature not found)

KeyError: "Can't open attribute (Can't locate attribute: 'layer_names')"

I'm facing this issue when I run python demo.py -image_file_name test.jpg -question "Is there a man in the picture?". I'm using Python 2.7, Keras 2.0.4, Theano 0.9.0, with backend=theano and image_dim_ordering=th in keras.json. Here's the stack trace:

Traceback (most recent call last):
  File "demo.py", line 116, in <module>
    main()
  File "demo.py", line 94, in main
    image_features = get_image_features(args.image_file_name, CNN_weights_file_name)
  File "demo.py", line 57, in get_image_features
    image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
  File "demo.py", line 22, in get_image_model
    image_model = VGG_16(CNN_weights_file_name)
  File "/media/bhavya/New_Volume/ICT/UGRP/Implementation/VQA_Demo/models/CNN/VGG.py", line 73, in VGG_16
    model.load_weights(weights_path)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 717, in load_weights
    topology.load_weights_from_hdf5_group(f, layers)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2931, in load_weights_from_hdf5_group
    layer_names = [n.decode('utf8') for n in f.attrs['layer_names']]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2798)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 58, in __getitem__
    attr = h5a.open(self._id, self._e(name))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2798)
  File "h5py/h5a.pyx", line 77, in h5py.h5a.open (/tmp/pip-nCYoKW-build/h5py/h5a.c:2337)
KeyError: "Can't open attribute (Can't locate attribute: 'layer_names')"

Unexpected result on prediction even after using en_vectors_web_lg

The results I am getting for the python demo.py -image_file_name test.jpg -question "Is there a man in the picture?" command are:
76.77 % yes
23.2 % no
0.01 % safety
0.0 % sun
0.0 % taking picture

instead of 'train' as you mentioned in the README.

Here is the code for the get_question_features method:
word_embeddings = spacy.load('en_vectors_web_lg')
tokens = word_embeddings(question)
question_tensor = np.zeros((1, 30, 300))
for j in range(len(tokens)):
    question_tensor[0, j, :] = tokens[j].vector
return question_tensor

Please reply. Thanks in advance

KeyError: 'Unable to open object (bad symbol table node signature)'

Loading image features ...
Traceback (most recent call last):
File "demo.py", line 117, in
main()
File "demo.py", line 95, in main
image_features = get_image_features(args.image_file_name, CNN_weights_file_name)
File "demo.py", line 58, in get_image_features
image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
File "demo.py", line 23, in get_image_model
image_model = VGG_16(CNN_weights_file_name)
File "/home/anurag/Desktop/VQA_Demo/models/CNN/VGG.py", line 101, in VGG_16
load_model_legacy(model, weights_path)
File "/home/anurag/Desktop/VQA_Demo/models/CNN/VGG.py", line 41, in load_model_legacy
g = f['layer_{}'.format(k)]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/anurag/.local/lib/python3.6/site-packages/h5py/_hl/group.py", line 167, in getitem
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open

What should I do to resolve this error? Please help

Unexpected results running demo.py

Hi,

Thank you for sharing the demo. I was trying to repeat the experiment but came across unexpected results.

My python libraries:

Keras (1.0.5)
spacy (0.101.0)
cv2(2.4.8)

When I run python demo.py, here is the result:

Using Theano backend.
Couldn't import dot_parser, loading of dot files will not be possible.
Using gpu device 0: GeForce GTX 980M (CNMeM is disabled, cuDNN 5005)



Loading image features ...
Loading question features ...
Loading VQA Model ...



Predicting result ...
80.22 %  yes
19.78 %  no
000.0 %  woman
000.0 %  train
000.0 %  man

The only change I made to the code is on line 63 in demo.py:

word_embeddings = spacy.load('en')#, vectors='en_glove_cc_300_1m_vectors')

I used the default vectors due to a bug in the recent spaCy version, which shouldn't change the result too much, right?

I also noticed that in pre-processing the images for VGG16, there is no mean subtraction like

img[:,:,0] -= 103.939
img[:,:,1] -= 116.779
img[:,:,2] -= 123.68

Would that cause a difference?

Appreciate your help on this! Thank you very much!

ValueError: could not broadcast input array from shape (300) into shape (100)

I am not sure what the problem is, as I am new to using Keras.

Code:
embedding_matrix = np.random.random((len(word_index) + 1, EMBEDDING_DIM))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # words not found in embedding index will be all-zeros.
        embedding_matrix[i] = embedding_vector

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH)

Stack Trace:
[141. 243.]
[ 76. 164.]
[ 88. 152.]
Total 27995 word vectors in Glove.
Traceback (most recent call last):

File "", line 1, in
runfile('C:/Users//Desktop/LSTM/SLSTM.py', wdir='C:/Users//Desktop/LSTM')

File "C:\Users\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
execfile(filename, namespace)

File "C:\Users\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users//Desktop/LSTM/SLSTM.py", line 111, in
embedding_matrix[i] = (embedding_vector)

ValueError: could not broadcast input array from shape (300) into shape (100)
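
A likely cause (an inference from the traceback above, not a confirmed fix): EMBEDDING_DIM is set to 100 while the GloVe vectors being loaded are 300-dimensional, so the row assignment into embedding_matrix cannot broadcast. Making the two agree, for example:

EMBEDDING_DIM = 300  # must match the dimensionality of the vectors in embeddings_index
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    # also skip any vector whose size does not match, so the assignment can never fail
    if embedding_vector is not None and len(embedding_vector) == EMBEDDING_DIM:
        embedding_matrix[i] = embedding_vector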

get image features / 'int' object has no attribute 'predict'

Hi,

I solved my last issue, but now I'm stuck at "get image features":

AttributeError Traceback (most recent call last)
in ()
1 # get the image features
----> 2 image_features = get_image_features(image_file_name, CNN_weights_file_name)

in get_image_features(image_file_name, CNN_weights_file_name)
16 im = np.expand_dims(im, axis=0)
17
---> 18 image_features[0,:] = get_image_model(CNN_weights_file_name).predict(im)[0]
19 return image_features

AttributeError: 'int' object has no attribute 'predict'

Any idea?

Thank you, Mike

ValueError: Error when checking : expected lstm_4_input to have shape (None, 30, 300)

Hi,

i get

ValueError: Error when checking : expected lstm_4_input to have shape (None, 30, 300) but got array with shape (1, 7, 300)

on y_output = model_vqa.predict([question_features, image_features])

Any idea why?

Thanks for your help!

fG Mike

Update: I found out that the problem is "len(tokens)"! The question was seven tokens long.

If I change "len(tokens" to "30" it works, but don't think that should be right?
