s-gupta / visual-concepts
Code for detecting visual concepts in images.
Home Page: http://www.cs.berkeley.edu/~sgupta/captions/index.html
License: BSD 2-Clause "Simplified" License
Hi
I followed your instructions to install Caffe from the mil branch:
git clone git@github.com:s-gupta/caffe.git code/caffe
cd code/caffe
git checkout mil
After entering the commands above, when I tried to build Caffe, I got this:
Makefile:6: *** Makefile.config not found. See Makefile.config.example.. Stop.
So I copied Makefile.config.example to Makefile.config and changed the hdf5 paths in it (my system is Ubuntu 16.04).
I also uncommented USE_CUDNN := 1 (I have already installed CUDA 8.0) and WITH_PYTHON_LAYER := 1.
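For reference, the hdf5-related edits on Ubuntu 16.04 typically end up looking something like the fragment below in Makefile.config. The exact paths depend on how libhdf5 is packaged on your system, so treat this as an illustrative sketch rather than the precise fix:

```makefile
# Illustrative Makefile.config fragment for Ubuntu 16.04 (paths may differ):
USE_CUDNN := 1
WITH_PYTHON_LAYER := 1
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
```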
Then I ran 'make -j 16' and got many errors like this:
Makefile:516: recipe for target '.build_release/src/caffe/layers/data_layer.o' failed make: *** [.build_release/src/caffe/layers/data_layer.o] Error 1
I have installed Caffe before and the same kind of Makefile.config changes worked then, so I have no idea why I cannot build the mil branch.
Thank you very much!
Hi @s-gupta, I was trying to run some experiments with your tool; it is really impressive.
But the problem is that while running the demo, I am unable to get a proper sentence. Here is the output:
a [1.00, 1.00]
the [0.94, 0.94]
on [0.91, 0.91]
bus [0.89, 0.89]
street [0.88, 0.88]
in [0.87, 0.87]
of [0.86, 0.86]
man [0.84, 0.84]
truck [0.81, 0.81]
parking [0.81, 0.81]
car [0.81, 0.81]
parked [0.80, 0.80]
and [0.79, 0.79]
with [0.76, 0.76]
woman [0.76, 0.76]
standing [0.71, 0.71]
is [0.70, 0.70]
blue [0.68, 0.68]
motorcycle [0.65, 0.65]
people [0.62, 0.62]
tennis [0.61, 0.61]
black [0.58, 0.58]
cars [0.58, 0.58]
city [0.57, 0.57]
to [0.57, 0.57]
walking [0.55, 0.55]
bed [0.53, 0.53]
luggage [0.53, 0.53]
two [0.52, 0.52]
down [0.52, 0.52]
There are lots of unnecessary words. How can I remove them?
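The demo prints every vocabulary word whose detection score clears the threshold, and the vocabulary deliberately includes function words such as "a", "the", and "on". If you only want content words in the listing, one simple post-processing step is to filter against a hand-made stopword list. A minimal sketch; the list, the function name, and the (word, score) format are illustrative, not part of the repository:

```python
# Hypothetical post-processing for the demo output: drop common function
# words and low-scoring detections before display.
STOPWORDS = {"a", "the", "on", "in", "of", "and", "with", "is", "to", "down", "an", "at"}

def filter_concepts(detections, min_score=0.5):
    """Keep (word, score) pairs that are content words above the threshold."""
    return [(w, s) for w, s in detections if w not in STOPWORDS and s >= min_score]

detections = [("a", 1.00), ("bus", 0.89), ("street", 0.88), ("man", 0.84), ("to", 0.57)]
print(filter_concepts(detections))  # [('bus', 0.89), ('street', 0.88), ('man', 0.84)]
```

Note this is cosmetic only: the function words are presumably in the vocabulary on purpose, so that the later language-generation stage can use their detection scores.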
Hi,
I tried to follow your work but ran into some difficulties generating language with the statistical model.
Could you release the code for the Language Generation and Sentence Re-Ranking steps from your paper, or point me to existing code that can be used?
Thank you very much!
I know that you split your dataset into a train set, valid1, and valid2, but you also use a calibration set in your training procedure. Can you tell me what the calibration set is used for?
Hi
I was following the instructions but ran into this error after running:
make -j 16
error:
Makefile:6: *** Makefile.config not found. See Makefile.config.example.. Stop.
would you please help me out?
@s-gupta Hi Saurabh, can you give me some hints on how to visualize the spatial response maps p^w_{i,j} for a word w? The fc8 layer corresponds to 12x12 = 144 regions (each of size 224x224), so why are the visualized maps, such as those in Figure 2, not rectangular? Did you combine these 12x12 regions? Thank you!
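On the visualization question, one plausible explanation for the non-rectangular blobs is that the coarse 12x12 grid of per-region probabilities is upsampled to image resolution (and usually smoothed) before being overlaid, so the region boundaries disappear. A sketch of the idea, assuming a (12, 12) array of probabilities p^w_{i,j}; this is my guess at the procedure, not the authors' confirmed code:

```python
import numpy as np

def upsample_response(p_w, scale=12):
    """Nearest-neighbour upsampling of a per-word response map.
    p_w: (12, 12) array of per-region word probabilities p^w_{i,j}."""
    return np.repeat(np.repeat(p_w, scale, axis=0), scale, axis=1)

p_w = np.random.rand(12, 12)    # stand-in for one word's fc8 responses
heat = upsample_response(p_w)   # (144, 144) heat map to overlay on the image
print(heat.shape)               # (144, 144)
```

Replacing the nearest-neighbour step with bilinear interpolation (e.g. scipy.ndimage.zoom) blends adjacent region scores together, which would give smooth blobs like those in Figure 2.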
Hey, I cannot figure out what https://github.com/s-gupta/visual-concepts/blob/master/cap_eval_utils.py#L68-L85 does. Can anyone help me?
Hi,
If I re-train the model with a word list I define myself, how can I calibrate the resulting model?
Thanks!
Mengmeng
In the training process, I encountered this error message: "caffe.LayerParameter" has no field named "mil_data_param". Does anyone have an idea how to deal with it?
Hi,
I tried to run the file demo.ipynb and when loading Caffe model, I had the following error:
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 161:12: Message type "caffe.LayerParameter" has no field named "mil_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0107 22:52:49.491786 7653 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: home/chau/Dropbox/img_vid_cap/Code/cnn_model/vgg_finetuned/mil_finetune.prototxt.deploy
*** Check failure stack trace: ***
Aborted (core dumped)
The error says it cannot find the "mil_param" field. Do I have to install an additional dependency or reuse code from other papers?
(My system is Ubuntu 14.04, CPU-only mode)
Thank you very much!
I cannot download the models or the pretrained data from ftp://ftp.cs.berkeley.edu/pub/projects/vision/im2cap-cvpr15b/*.
I'm trying to figure out what this stretch of code does.
It seems to compute precision, but in a lopsided way: it counts true positives multiple times (0.8 for each caption in which the word occurs), but false positives at most once (1 if the word occurs in no captions, 0.2 if it occurs in no captions). Is this the intended behaviour?
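For contrast, a plain, unweighted per-word precision over the reference captions would look like the sketch below. This is a baseline to compare against, not the repository's implementation in cap_eval_utils.py:

```python
def word_precision(predicted_words, reference_captions):
    """Fraction of predicted words appearing in at least one reference caption.
    An unweighted baseline; the repository's version applies the
    0.8 / 0.2 / 1 weighting described in the question above."""
    if not predicted_words:
        return 0.0
    hits = sum(any(w in cap.split() for cap in reference_captions)
               for w in predicted_words)
    return hits / len(predicted_words)

caps = ["a man riding a bus", "a bus on the street"]
print(word_precision(["bus", "man", "dog"], caps))  # 2 of 3 predictions hit
```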
Hi Saurabh,
Recently I have been trying to port this code from Caffe to PyTorch. While tuning the fully-convolutional network model, I found a very interesting trick in the code.
The bias of the classification layer has to be set to -6.58; otherwise the optimization is misled. For example, if this value is initialized to zero, the model does not even converge. I would like to know why you use this value as the initialization and how you found it.
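One plausible explanation, which is an assumption on my part rather than something confirmed by the authors: with a sigmoid classifier, the bias b sets the network's initial positive probability to sigmoid(b), and the MIL noisy-OR then pools that probability over all 12x12 = 144 regions. A bias of -6.58 keeps the pooled image-level probability well below 1.0 at the start of training:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Per-region probability implied by the bias alone (weights near zero at init).
p_region = sigmoid(-6.58)                 # roughly 0.0014
# Noisy-OR pooling over the 12x12 grid of regions, as in the MIL model.
p_image = 1.0 - (1.0 - p_region) ** 144   # roughly 0.18, far from saturation
print(p_region, p_image)

# For contrast, a zero bias gives p_region = 0.5, so the pooled probability
# 1 - 0.5**144 is essentially 1.0 for every word: the sigmoid saturates,
# gradients vanish, and training stalls.
```

This matches the common practice of initializing a final-layer bias to log(pi / (1 - pi)) for a rare-positive prior pi; sigmoid(-6.58) ≈ 0.0014 would correspond to assuming each word fires in roughly 0.14% of regions.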
Is github.com:pdollar/coco.git no longer available to download?
Hello,
I am using scripts_all and trying to output the visual attributes for valid2.
Currently, I am able to reproduce the results on the test split using the pre-trained model with the command:
python run_mil.py --task output_words --gpu 1 --model output/vgg/snapshot_iter_240000.caffemodel --test_set test --calibration_set valid1 --vocab_file vocabs/vocab_train.pkl
However, when I try to output words for valid2 with either
python run_mil.py --task output_words --gpu 1 --model output/vgg/snapshot_iter_240000.caffemodel --test_set valid2 --calibration_set valid2 --vocab_file vocabs/vocab_train.pkl
or
python run_mil.py --task output_words --gpu 1 --model output/vgg/snapshot_iter_240000.caffemodel --test_set valid2 --calibration_set valid1 --vocab_file vocabs/vocab_train.pkl
I get the following results (for example, using the "prec" metric) for most of the images:
112798: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
185838: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
519874: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
235319: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
420882: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
521071: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
322194: a (1.00), on (0.81), of (0.79), in (0.77), the (0.76), with (0.72), and (0.65), is (0.54), to (0.41), man (0.38), sitting (0.32), an (0.31), two (0.27), at (0.26), standing (0.25), next (0.25), are (0.24),
Does anyone have the same problem?
Thanks