s2vt's Issues

How to import caffe?

I got this error

Traceback (most recent call last):
File "", line 9, in
from cnn_util import *
File "", line 4, in
import caffe
File "/content/drive/My Drive/S2VT/caffe/python/caffe/", line 1, in
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
File "/content/drive/My Drive/S2VT/caffe/python/caffe/", line 13, in
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ImportError: No module named _caffe
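
This error usually means Python found the `caffe` package directory but not the compiled `_caffe` extension, which is normally produced by building pycaffe (for example with `make pycaffe` in the Caffe source tree). A minimal sketch for checking this before importing is below. The helper names `find_compiled_caffe` and `add_caffe_to_path` are assumptions for illustration and are not part of S2VT or Caffe.

```python
import glob
import os
import sys


def find_compiled_caffe(caffe_python_dir):
    """Return paths of compiled `_caffe` extension files under <dir>/caffe.

    Hypothetical helper: an empty list suggests pycaffe has not been built.
    """
    pattern = os.path.join(caffe_python_dir, "caffe", "_caffe*")
    matches = glob.glob(pattern)
    # Keep only shared-library style files (.so on Linux/macOS, .pyd on Windows).
    return sorted(p for p in matches if p.endswith((".so", ".pyd")))


def add_caffe_to_path(caffe_python_dir):
    """Put the pycaffe directory on sys.path only if the extension exists."""
    if not find_compiled_caffe(caffe_python_dir):
        return False
    if caffe_python_dir not in sys.path:
        sys.path.insert(0, caffe_python_dir)
    return True
```

With the path from the traceback, `add_caffe_to_path("/content/drive/My Drive/S2VT/caffe/python")` would return `False` if the extension has not been built there.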

How about GPU?

I could run this code with CPU TensorFlow, but training takes a long time. So I installed GPU TensorFlow and tried to run it again, but ran into many problems. The biggest one is ResourceExhaustedError: OOM when allocating tensor with shape[3000,4000].
Is this code meant only for CPU? Can it not simply be run in a GPU environment?
Thank you for your reply! I am new to video description.
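
As a rough sanity check on the reported shape, the arithmetic below estimates how much memory a single dense tensor of shape [3000, 4000] needs. It assumes 32-bit floats (the issue does not state the dtype), and `tensor_bytes` is a hypothetical helper. If that assumption holds, this one tensor is small by itself, so the error more likely reflects GPU memory already being nearly full when the allocation was attempted.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor; 4 bytes per element assumes float32."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


size = tensor_bytes([3000, 4000])
print(size)                    # 48000000
print(round(size / 2**20, 1))  # 45.8 (MiB)
```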

Unable to reproduce the results mentioned

I am unable to reproduce the result. Can you tell me at which epoch you got the METEOR score of 28%? I have trained for about 1000 epochs, and most of the captions are "a man is playing a guitar".

[Deprecation warning] in rnn_cell.BasicLSTMCell

When I run it, I get this warning:

WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f16020b08d0>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
and then the entire process is killed.

I am using TensorFlow 0.12.0 with Python 2.7.

test error

I have a big problem with test(). When I ran test(), no sentences were generated. I found that video_feat.shape[1] was always 0 rather than 80 (n_frame_step). How can I fix this?

error when running model_RGB.train()

In [2]: model_RGB.train()

TypeError Traceback (most recent call last)
in ()
----> 1 model_RGB.train()

/home2/xzhe3946/S2VT/ in train()
248 def train():
--> 249 train_data = get_video_train_data(video_train_data_path, video_train_feat_path)
250 train_captions = train_data['Description'].values
251 test_data = get_video_test_data(video_test_data_path, video_test_feat_path)

/home2/xzhe3946/S2VT/ in get_video_train_data(video_data_path, video_feat_path)
184 def get_video_train_data(video_data_path, video_feat_path):
185 video_data = pd.read_csv(video_data_path, sep=',')
--> 186 video_data = video_data[video_data['Language'] == 'English']
187 video_data['video_path'] = video_data.apply(lambda row: row['VideoID']+'_'+str(int(row['Start']))+'_'+str(int(row['End']))+'.avi.npy', axis=1)
188 video_data['video_path'] = video_data['video_path'].map(lambda x: os.path.join(video_feat_path, x))

/usr/lib/python2.7/dist-packages/pandas/core/ops.pyc in wrapper(self, other)
574 # mask out the invalids
575 if mask.any():
--> 576 res[mask] = masker
578 return res

/usr/lib/python2.7/dist-packages/pandas/core/series.pyc in __setitem__(self, key, value)
633 key = _check_bool_indexer(self.index, key)
634 try:
--> 635 self.where(~key, value, inplace=True)
636 return
637 except (InvalidIndexError):

/usr/lib/python2.7/dist-packages/pandas/core/generic.pyc in where(self, cond, other, inplace, axis, level, try_cast, raise_on_error)
3025 if inplace:
-> 3026 cond = -(cond.fillna(True).astype(bool))
3027 else:
3028 cond = cond.fillna(False).astype(bool)

/usr/lib/python2.7/dist-packages/pandas/core/series.pyc in __neg__(self)
999 # inversion
1000 def __neg__(self):
-> 1001 arr = operator.neg(self.values)
1002 return self._constructor(arr, self.index).__finalize__(self)

TypeError: The numpy boolean negative, the - operator, is not supported, use the ~ operator or the logical_not function instead.
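
The message comes from NumPy, which rejects unary minus on boolean arrays. Here the offending `-` sits inside pandas itself (`generic.pyc`, line 3026 in the traceback), so the installed pandas appears to predate the installed NumPy. Aligning the two versions is therefore the likely remedy, rather than editing S2VT. The sketch below only illustrates the two replacements the error message names, on a made-up mask.

```python
import numpy as np

# Made-up boolean mask, standing in for a comparison such as
# video_data['Language'] == 'English'.
mask = np.array([True, False, True])

# Both forms invert a boolean array element-wise.
inverted_op = ~mask
inverted_fn = np.logical_not(mask)

print(inverted_op.tolist())  # [False, True, False]
print(inverted_fn.tolist())  # [False, True, False]
```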

How can I download MSVD videos, not the csv file.

@chenxinpeng Thank you for sharing your work; it is great. I would appreciate it if you could tell me how to download the MSVD videos. I only got the csv file after downloading from
Also, what is the format of the video file names: for example, "mv89psg6zh4.avi" or "mv89psg6zh4_33_46.avi"?
I look forward to your reply.
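
Judging from `get_video_train_data` as quoted in the traceback of another issue on this page, the code looks for feature files named after the `VideoID`, `Start` and `End` columns of the CSV, with `.avi.npy` appended. A small sketch of that naming rule is below. The `_` separators are an assumption, chosen to match the second file name in the question, and `clip_name` is a hypothetical helper.

```python
def clip_name(video_id, start, end, suffix=".avi"):
    """Build a clip file name from CSV columns (VideoID, Start, End).

    Assumes underscore separators; Start and End may be read as floats
    by a CSV parser, so they are cast to int first.
    """
    return "{}_{}_{}{}".format(video_id, int(start), int(end), suffix)


print(clip_name("mv89psg6zh4", 33.0, 46.0))          # mv89psg6zh4_33_46.avi
print(clip_name("mv89psg6zh4", 33, 46, ".avi.npy"))  # mv89psg6zh4_33_46.avi.npy
```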

TypeError: 'map' object is not subscriptable

File "", line 327, in train
current_feats[ind][:len(current_feats_vals[ind])] = feat
TypeError: 'map' object is not subscriptable

I can't fix it. Can anyone help?
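
In Python 3, `map` returns a lazy iterator rather than a list, so indexing its result fails as in the traceback. A minimal sketch of the usual fix, wrapping the call in `list(...)`, is below. The variable name echoes the traceback, but the data and the lambda are made up, since the line that builds `current_feats_vals` is not shown.

```python
# Made-up stand-in for per-video feature arrays.
raw = [[0.1, 0.2, 0.3], [0.4, 0.5]]

# Python 3: map() gives an iterator, which cannot be indexed.
lazy = map(lambda v: v, raw)
try:
    lazy[0]
except TypeError as err:
    print(err)  # 'map' object is not subscriptable

# Materialising the result as a list restores indexing and len().
current_feats_vals = list(map(lambda v: v, raw))
print(len(current_feats_vals[1]))  # 2
```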

Download MSVD

I have the .csv file of the MSVD dataset. How can I get the videos from the file and use them in

Time taken to run the code


Thank you for making this code available. Quick question -

How long does it take to run the scripts on a CPU:

time taken to run on the MSVD dataset?

and then to train and test the model?

Hope to hear back from someone soon.

error model_rgb.train()

ValueError Traceback (most recent call last)
in ()
----> 1 model_rgb.train()

/home/jyuan/software/S2VT-master/ in train()
288 with tf.variable_scope(tf.get_variable_scope(), reuse=False):
289 saver = tf.train.Saver(max_to_keep=100, write_version=1)
--> 290 train_op = tf.train.AdamOptimizer(learning_rate).minimize(tf_loss)
291 tf.global_variables_initializer().run()

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.pyc in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
324 return self.apply_gradients(grads_and_vars, global_step=global_step,
--> 325 name=name)
327 def compute_gradients(self, loss, var_list=None,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.pyc in apply_gradients(self, grads_and_vars, global_step, name)
444 ([str(v) for _, _, v in converted_grads_and_vars],))
445 with ops.control_dependencies(None):
--> 446 self._create_slots([_get_variable_for(v) for v in var_list])
447 update_ops = []
448 with ops.name_scope(name, self._name) as name:

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/adam.pyc in _create_slots(self, var_list)
126 # Create slots for the first and second moments.
127 for v in var_list:
--> 128 self._zeros_slot(v, "m", self._name)
129 self._zeros_slot(v, "v", self._name)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.pyc in _zeros_slot(self, var, slot_name, op_name)
764 named_slots = self._slot_dict(slot_name)
765 if _var_key(var) not in named_slots:
--> 766 named_slots[_var_key(var)] = slot_creator.create_zeros_slot(var, op_name)
767 return named_slots[_var_key(var)]

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.pyc in create_zeros_slot(primary, name, dtype, colocate_with_primary)
172 return create_slot_with_initializer(
173 primary, initializer, slot_shape, dtype, name,
--> 174 colocate_with_primary=colocate_with_primary)
175 else:
176 val = array_ops.zeros(slot_shape, dtype=dtype)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.pyc in create_slot_with_initializer(primary, initializer, shape, dtype, name, colocate_with_primary)
144 with ops.colocate_with(primary):
145 return _create_slot_var(primary, initializer, "", validate_shape, shape,
--> 146 dtype)
147 else:
148 return _create_slot_var(primary, initializer, "", validate_shape, shape,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.pyc in _create_slot_var(primary, val, scope, validate_shape, shape, dtype)
64 use_resource=_is_resource(primary),
65 shape=shape, dtype=dtype,
---> 66 validate_shape=validate_shape)
67 variable_scope.get_variable_scope().set_partitioner(current_partitioner)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
1063 collections=collections, caching_device=caching_device,
1064 partitioner=partitioner, validate_shape=validate_shape,
-> 1065 use_resource=use_resource, custom_getter=custom_getter)
1066 get_variable_or_local_docstring = (
1067 """%s

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
960 collections=collections, caching_device=caching_device,
961 partitioner=partitioner, validate_shape=validate_shape,
--> 962 use_resource=use_resource, custom_getter=custom_getter)
964 def _get_partitioned_variable(self,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
365 reuse=reuse, trainable=trainable, collections=collections,
366 caching_device=caching_device, partitioner=partitioner,
--> 367 validate_shape=validate_shape, use_resource=use_resource)
369 def _get_partitioned_variable(

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in _true_getter(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource)
350 trainable=trainable, collections=collections,
351 caching_device=caching_device, validate_shape=validate_shape,
--> 352 use_resource=use_resource)
354 if custom_getter is not None:

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource)
680 raise ValueError("Variable %s does not exist, or was not created with "
681 "tf.get_variable(). Did you mean to set reuse=None in "
--> 682 "VarScope?" % name)
683 if not shape.is_fully_defined() and not initializing_from_value:
684 raise ValueError("Shape of a new variable (%s) must be fully defined, "

ValueError: Variable Wemb/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

KeyError: 'video_path' in Model_RGB

I guess this issue has happened to a lot of people, but I was not able to fix it with the solutions provided. Please help.
I am currently using Google Colab to run the file.

KeyError: 'video_path'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
5 frames
in ()
453 if __name__ == "__main__":
--> 454 main()

in main()
449 def main():
--> 450 train()

in train()
247 train_data = get_video_train_data(video_train_data_path, video_train_feat_path)
248 train_captions = train_data['Description'].values
--> 249 test_data = get_video_test_data(video_test_data_path, video_test_feat_path)
250 test_captions = test_data['Description'].values

in get_video_test_data(video_data_path, video_feat_path)
199 video_data = video_data[video_data['Description'].map(lambda x: isinstance(x, str))]
--> 201 unique_filenames = sorted(video_data['video_path'].unique())
202 test_data = video_data[video_data['video_path'].map(lambda x: x in unique_filenames)]
203 return test_data

/usr/local/lib/python3.6/dist-packages/pandas/core/ in __getitem__(self, key)
2798 if self.columns.nlevels > 1:
2799 return self._getitem_multilevel(key)
-> 2800 indexer = self.columns.get_loc(key)
2801 if is_integer(indexer):
2802 indexer = [indexer]

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/ in get_loc(self, key, method, tolerance)
2646 return self._engine.get_loc(key)
2647 except KeyError:
-> 2648 return self._engine.get_loc(self._maybe_cast_indexer(key))
2649 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2650 if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'video_path'

how long is the train time?

One epoch took 3600 s for me. How long is your training time?
I look forward to anyone's answer. Thank you!

Rationale behind using stacked LSTM


I apologise in advance if this is not the right forum for this question, but since you were able to get good results, I thought you might be able to help me out.

I am confused about why the architecture involves stacked LSTMs. It is not very clearly explained in the paper (or I might have missed the finer details). Since the inputs are just padding, I do not see any reason for the stacked LSTM layer. Please point me in the right direction to clear this up.


A few questions about this project

Hi, I am a student who has just started with video description. I am sorry that I have many questions about s2vt.
1. How can I get the MSVD dataset? I downloaded it from the Internet, but it only contains an Excel document and I cannot find any videos.
2. How can I get the VGG model? I see that it is in your home/chenx../caffe/models/...
I would be very glad to receive your response.
Sorry, I am a beginner.

Error running model_RGB.train()

Thanks for your detailed work.
I got an error running the code. Does that mean that there is something wrong with my video dataset?

In [2]: model_RGB.train()

KeyError Traceback (most recent call last)
in ()
----> 1 model_RGB.train()

/home/binwang/Documents/S2VT/ in train()
248 def train():
--> 249 train_data = get_video_train_data(video_train_data_path, video_train_feat_path)
250 train_captions = train_data['Description'].values
251 test_data = get_video_test_data(video_test_data_path, video_test_feat_path)

/home/binwang/Documents/S2VT/ in get_video_train_data(video_data_path, video_feat_path)
190 video_data = video_data[video_data['Description'].map(lambda x: isinstance(x, str))]
--> 192 unique_filenames = sorted(video_data['video_path'].unique())
193 train_data = video_data[video_data['video_path'].map(lambda x: x in unique_filenames)]
194 return train_data

/usr/lib/python2.7/dist-packages/pandas/core/frame.pyc in __getitem__(self, key)
1967 return self._getitem_multilevel(key)
1968 else:
-> 1969 return self._getitem_column(key)
1971 def _getitem_column(self, key):

/usr/lib/python2.7/dist-packages/pandas/core/frame.pyc in _getitem_column(self, key)
1974 # get column
1975 if self.columns.is_unique:
-> 1976 return self._get_item_cache(key)
1978 # duplicate columns & possible reduce dimensionality

/usr/lib/python2.7/dist-packages/pandas/core/generic.pyc in _get_item_cache(self, item)
1089 res = cache.get(item)
1090 if res is None:
-> 1091 values = self._data.get(item)
1092 res = self._box_item_values(item, values)
1093 cache[item] = res

/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in get(self, item, fastpath)
3210 if not isnull(item):
-> 3211 loc = self.items.get_loc(item)
3212 else:
3213 indexer = np.arange(len(self.items))[isnull(self.items)]

/usr/lib/python2.7/dist-packages/pandas/core/index.pyc in get_loc(self, key, method, tolerance)
1757 'backfill or nearest lookups')
1758 key = _values_from_object(key)
-> 1759 return self._engine.get_loc(key)
1761 indexer = self.get_indexer([key], method=method,

/usr/lib/python2.7/dist-packages/pandas/ in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)()

/usr/lib/python2.7/dist-packages/pandas/ in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)()

/usr/lib/python2.7/dist-packages/pandas/ in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12265)()

/usr/lib/python2.7/dist-packages/pandas/ in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12216)()

KeyError: 'video_path'
