vcl's People

Contributors

zhihou7

vcl's Issues

How to change the batchsize for model

Thank you for your work, and sorry to bother you. I tried to run your training code on a single 1080 (12G), and training stopped with the following error.

2022-05-16 20:09:46.768782: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at save_restore_v2_ops.cc:185 : Out of range: Read less bytes than requested
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1349, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1441, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: Read less bytes than requested
         [[{{node save_1/RestoreV2}}]]
         [[save_1/RestoreV2/_235]]
  (1) Out of range: Read less bytes than requested
         [[{{node save_1/RestoreV2}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/Train_VCL_ResNet_VCOCO.py", line 115, in <module>
    sw.train_model(sess, args.max_iters)
  File "/root/autodl-tmp/VCL-master/tools/../lib/models/train_Solver_VCOCO_MultiGPU.py", line 153, in train_model
    self.from_snapshot(sess)
  File "/root/autodl-tmp/VCL-master/tools/../lib/models/train_Solver_VCOCO.py", line 170, in from_snapshot
    self.saver_restore.restore(sess, self.pretrained_model)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 1289, in restore
    sess.run(self.saver_def.restore_op_name,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 955, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1179, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1358, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: Read less bytes than requested
         [[node save_1/RestoreV2 (defined at /root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
         [[save_1/RestoreV2/_235]]
  (1) Out of range: Read less bytes than requested
         [[node save_1/RestoreV2 (defined at /root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save_1/RestoreV2':
  File "tools/Train_VCL_ResNet_VCOCO.py", line 115, in <module>
    sw.train_model(sess, args.max_iters)
  File "/root/autodl-tmp/VCL-master/tools/../lib/models/train_Solver_VCOCO_MultiGPU.py", line 153, in train_model
    self.from_snapshot(sess)
  File "/root/autodl-tmp/VCL-master/tools/../lib/models/train_Solver_VCOCO.py", line 169, in from_snapshot
    self.saver_restore = tf.train.Saver(saver_t)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 828, in __init__
    self.build()
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 868, in _build
    self.saver_def = self._builder._build_internal(  # pylint: disable=protected-access
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 507, in _build_internal
    restore_op = self._AddRestoreOps(filename_tensor, saveables,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 327, in _AddRestoreOps
    all_tensors = self.bulk_restore(filename_tensor, saveables, preferred_shard,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1693, in restore_v2
    _, _, _op = _op_def_lib._apply_op_helper(
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/op_def_library.py", line 792, in _apply_op_helper
    op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3356, in create_op
    return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3418, in _create_op_internal
    ret = Operation(
  File "/root/miniconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

I have never used TensorFlow before, and I wonder whether this is due to insufficient GPU memory. If so, how should I adjust the batch size? I could not find anywhere in the code to change it. I would appreciate your help and look forward to your reply.
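
A note on the trace above: the failure occurs inside `self.saver_restore.restore(sess, self.pretrained_model)` with `Out of range: Read less bytes than requested`, which TensorFlow raises while reading the checkpoint file. It may therefore point to a truncated or incompletely downloaded pretrained checkpoint rather than to GPU memory (an out-of-memory failure would normally surface as a `ResourceExhaustedError`). A minimal sketch for checking the files on disk, assuming the checkpoint is stored as ordinary files that share a common prefix; the helper names and the example path are hypothetical and not part of VCL:

```python
import glob
import os


def checkpoint_file_sizes(prefix):
    """Return {path: size_in_bytes} for every file whose name starts with prefix."""
    # glob.escape protects against glob metacharacters in the prefix itself.
    paths = sorted(glob.glob(glob.escape(prefix) + "*"))
    return {path: os.path.getsize(path) for path in paths}


def suspicious_files(sizes, min_bytes=1):
    """Return the paths smaller than min_bytes (an empty file is certainly incomplete)."""
    return [path for path, size in sizes.items() if size < min_bytes]


# Usage (hypothetical path; substitute the value of self.pretrained_model):
# sizes = checkpoint_file_sizes("path/to/pretrained_model.ckpt")
# for path, size in sizes.items():
#     print(path, size)
# print("suspicious:", suspicious_files(sizes))
```

If a file turns out to be empty, or much smaller than the size listed at the download source, re-downloading the pretrained weights is a reasonable first step.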

Inference on a test image

Hello! Thank you for this work! I would like to know how to run inference of your model on a single image.

tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation

Dear Sir,

Thank you for your work!

I tried to train VCL on V-COCO following the instructions:

Train a VCL on V-COCO
python tools/Train_VCL_ResNet_VCOCO.py --model VCL_union_multi_ml1_l05_t3_rew_aug5_3_new_VCOCO_test --num_iteration 400000

I assigned only one GPU for training and got the error messages below. Could you help me solve this?
I am training on V-COCO, so I also don't understand why the error mentions HICO.

Traceback (most recent call last):
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1339, in _run_fn
    self._extend_graph()
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation HICO_0/MatMul: {{node HICO_0/MatMul}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:CPU:1, /job:localhost/replica:0/task:0/device:CPU:10, /job:localhost/replica:0/task:0/device:CPU:11, /job:localhost/replica:0/task:0/device:CPU:12, /job:localhost/replica:0/task:0/device:CPU:13, /job:localhost/replica:0/task:0/device:CPU:14, /job:localhost/replica:0/task:0/device:CPU:15, /job:localhost/replica:0/task:0/device:CPU:2, /job:localhost/replica:0/task:0/device:CPU:3, /job:localhost/replica:0/task:0/device:CPU:4, /job:localhost/replica:0/task:0/device:CPU:5, /job:localhost/replica:0/task:0/device:CPU:6, /job:localhost/replica:0/task:0/device:CPU:7, /job:localhost/replica:0/task:0/device:CPU:8, /job:localhost/replica:0/task:0/device:CPU:9, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[HICO_0/MatMul]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/Train_VCL_ResNet_VCOCO.py", line 109, in <module>
    sw.train_model(sess, args.max_iters)
  File "/home/kogashi/VCL/tools/../lib/models/train_Solver_VCOCO_MultiGPU.py", line 153, in train_model
    self.from_snapshot(sess)
  File "/home/kogashi/VCL/tools/../lib/models/train_Solver_VCOCO.py", line 134, in from_snapshot
    sess.run(tf.global_variables_initializer())
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/home/kogashi/miniconda3/envs/VCL/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation HICO_0/MatMul: node HICO_0/MatMul (defined at /home/kogashi/VCL/tools/../lib/networks/ResNet50_VCOCO.py:150) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:CPU:1, /job:localhost/replica:0/task:0/device:CPU:10, /job:localhost/replica:0/task:0/device:CPU:11, /job:localhost/replica:0/task:0/device:CPU:12, /job:localhost/replica:0/task:0/device:CPU:13, /job:localhost/replica:0/task:0/device:CPU:14, /job:localhost/replica:0/task:0/device:CPU:15, /job:localhost/replica:0/task:0/device:CPU:2, /job:localhost/replica:0/task:0/device:CPU:3, /job:localhost/replica:0/task:0/device:CPU:4, /job:localhost/replica:0/task:0/device:CPU:5, /job:localhost/replica:0/task:0/device:CPU:6, /job:localhost/replica:0/task:0/device:CPU:7, /job:localhost/replica:0/task:0/device:CPU:8, /job:localhost/replica:0/task:0/device:CPU:9, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[HICO_0/MatMul]]

Errors may have originated from an input operation.
Input Source operations connected to node HICO_0/MatMul:
IteratorGetNext (defined at /home/kogashi/VCL/tools/../lib/ult/ult.py:884)
HICO_0/Const (defined at /home/kogashi/VCL/tools/../lib/networks/ResNet50_VCOCO.py:148)
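
Two details can be read directly from the message above. The op `HICO_0/MatMul` is defined in `lib/networks/ResNet50_VCOCO.py`, so `HICO_0` appears to be only a scope name used inside the V-COCO network, not a sign that the wrong dataset was loaded. The available-device list contains `CPU`, `XLA_CPU`, and `XLA_GPU` entries but no plain `GPU:0`, which usually means TensorFlow did not register a CUDA device (for example a CPU-only build or a CUDA/cuDNN version mismatch; this is an assumption about the environment, not something the log confirms). A small sketch that extracts this comparison from an error string; the function names are hypothetical:

```python
import re


def device_types(text):
    """Collect device types (CPU, GPU, XLA_GPU, ...) named as /device:TYPE:N in text."""
    return set(re.findall(r"/device:([A-Z_]+):\d+", text))


def missing_requested_devices(error_message):
    """Return device types that were requested but are absent from the available list."""
    # Everything before the marker names the requested device; everything after
    # it is the list of devices TensorFlow actually registered.
    requested_part, _, available_part = error_message.partition("available devices are")
    return device_types(requested_part) - device_types(available_part)


# Usage: paste the InvalidArgumentError text into a string and call
# missing_requested_devices(message); a result of {"GPU"} matches the case above.
```

If falling back to the CPU is acceptable, creating the session with `tf.ConfigProto(allow_soft_placement=True)` lets TensorFlow 1.x place the op on an available device instead of raising this error.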

About Preprocessed file

Hello, I want to migrate your code to my project, but I don't quite understand what Trainval_GT_VCOCO_obj.pkl and Trainval_Neg_VCOCO_obj.pkl in your code contain, or how to generate files in that format from the raw HICO-DET data.
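
While the exact layout of these files is for the author to confirm, one way to get a first impression is to load a file and print the nesting of its contents. A minimal sketch, assuming the files are ordinary Python pickles (possibly written under Python 2, hence the `latin1` fallback); `load_pickle` and `describe` are hypothetical helpers, not part of VCL. Only unpickle files from a source you trust, since unpickling can execute code.

```python
import pickle


def load_pickle(path):
    """Load a pickle file, retrying with latin1 decoding for Python 2 era files."""
    with open(path, "rb") as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            f.seek(0)
            return pickle.load(f, encoding="latin1")


def describe(obj, depth=0, max_depth=2):
    """Return summary lines for nested dicts/lists/tuples, sampling the first element."""
    indent = "  " * depth
    if isinstance(obj, dict):
        lines = ["%sdict, %d keys" % (indent, len(obj))]
        sample = next(iter(obj.values()), None)
    elif isinstance(obj, (list, tuple)):
        lines = ["%s%s, %d items" % (indent, type(obj).__name__, len(obj))]
        sample = obj[0] if obj else None
    else:
        return ["%s%s" % (indent, type(obj).__name__)]
    # Descend into the first element only, and only down to max_depth levels.
    if len(obj) > 0 and depth < max_depth:
        lines.extend(describe(sample, depth + 1, max_depth))
    return lines


# Usage (the path depends on where the data was unpacked):
# print("\n".join(describe(load_pickle("path/to/Trainval_GT_VCOCO_obj.pkl"))))
```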

zero-shot

How do you split HICO-DET into the rare-first and non-rare sets for the zero-shot setting? Could you also provide the list of unseen objects?

Tensor 'sp:0' shape mismatch

Hello,
Your work is so cool, and I am very glad to try this new project.
I trained a model on the COCO dataset with Res50, but when I run "Test_VCL_ResNet_VCOCO.py" I get the following error:

    "VCL_ResNet50_VCOCO doesn't support pool5_HO
    ......
    Traceback (most recent call last):
         File "tools/Test_VCL_ResNet_VCOCO.py", line 88, in <module>
             net.create_architecture(False)
           >>/lib/networks/ResNet50_VCOCO.py", line 418, in create_architecture
             self.build_network(is_training)
           >>/lib/networks/HOI.py", line 169, in build_network
             fc7_HO_raw = self.res5_ho(pool5_HO, is_training, 'res5')
           >>/lib/networks/HOI.py", line 66, in res5_ho
             scope=self.scope)
    ...
    AttributeError: 'NoneType' object has no attribute 'get_shape' "

I have no idea what is going wrong. Could you help me?

Number of parameters in the model?

Hi,
First of all, kudos to the great work!

I was wondering if you have an idea on the number of parameters (trainable and total) used by your model? I could probably dig into your code to find that but it would help if you have it already!

Looking forward to your reply! Thanks!
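
Until the author replies, the count can be derived from the variable shapes: each variable contributes the product of its dimensions. A minimal sketch in plain Python; `count_parameters` is a hypothetical helper, and the shapes in the usage comment are made-up examples, not VCL's actual layers. In TensorFlow 1.x the shapes could be taken from `tf.trainable_variables()` (trainable) or `tf.global_variables()` (total, which also includes optimizer slots and similar bookkeeping variables) via `v.get_shape().as_list()`.

```python
from functools import reduce
from operator import mul


def count_parameters(shapes):
    """Sum the element counts of an iterable of fully defined shapes."""
    # reduce(mul, shape, 1) is the product of the dimensions; a scalar () counts as 1.
    return sum(reduce(mul, shape, 1) for shape in shapes)


# Usage with made-up shapes (a 3x3 conv kernel, 64 -> 128 channels, plus its bias):
# count_parameters([(3, 3, 64, 128), (128,)])
#
# With a TensorFlow 1.x graph already built (assumption, not verified against VCL):
# count_parameters(v.get_shape().as_list() for v in tf.trainable_variables())
```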
