Comments (15)

emedvedev commented on August 23, 2024

Hi,

It's perfectly normal, especially if you run on a GPU. Some of the optimizations TensorFlow performs are non-deterministic, so you'll get slightly different results every time. There's usually no need to make it perfectly deterministic, and doing so can come with a speed penalty, but if you're interested, take a look at this article.

from attention-ocr.

tumusudheer commented on August 23, 2024

Hi @emedvedev ,

I trained the net on a GPU, froze the model, and am running the frozen graph (model) on a CPU. The results are not slightly different; they are way off.
Run1: LUBE with probability 0.839753
Run2: LUBE with probability 0.690503
and
Run3: LUBE with probability 0.796141

I'll check out the article you referred to. Thank you.

ckirmse commented on August 23, 2024

Interesting article. @tumusudheer, maybe we can investigate where the non-deterministic functions are used and swap them out. I've run into the same issue, but with longer phrases I often get different predicted text altogether.

tumusudheer commented on August 23, 2024

Hi @ckirmse,

Sure, sounds good to me. I'll keep you posted if I find a fix. Please let me know if you find one.

Thanks.

reidjohnson commented on August 23, 2024

I believe the source of non-deterministic behavior is how the CNN is initialized. Namely, this line:

cnn_model = CNN(self.img_data, True)

should actually be:

cnn_model = CNN(self.img_data, not self.forward_only)

Otherwise, dropout (which will randomly remove connections from the output of the CNN) is performed even during testing.
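
To illustrate why leaving dropout on at test time makes predictions vary from run to run, here is a minimal, framework-free sketch. It is not the repository's CNN class: `dropout`, `score`, `features`, and `keep_prob=0.5` are hypothetical names and values chosen only for illustration, and the "network output" is just a sum of features.

```python
import random


def dropout(values, keep_prob, training, rng):
    """Inverted dropout: zero each value with probability 1 - keep_prob.

    Kept values are scaled by 1 / keep_prob. When training is False the
    input passes through unchanged.
    """
    if not training:
        return list(values)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def score(features, training, rng):
    """Hypothetical stand-in for a network output: sum of the features."""
    return sum(dropout(features, keep_prob=0.5, training=training, rng=rng))


features = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
rng = random.Random()  # unseeded, so each run draws a different mask

# Dropout left on at test time: repeated "predictions" disagree.
training_scores = [score(features, training=True, rng=rng) for _ in range(5)]
print("dropout on: ", training_scores)

# Dropout disabled at test time: every call returns the same value.
inference_scores = [score(features, training=False, rng=rng) for _ in range(5)]
print("dropout off:", inference_scores)
```

With the flag hard-coded to True, the first case is what prediction was effectively doing; passing `not self.forward_only` corresponds to the second case.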

emedvedev commented on August 23, 2024

@reidjohnson oh, good catch! Committed the fix.

@tumusudheer @ckirmse could you verify that this behavior is fixed (or at least significantly reduced) in the latest master?

ckirmse commented on August 23, 2024

Oh wow, yeah, that's quite poor! Good catch. I'll test it out tonight.

ckirmse commented on August 23, 2024

This would have affected the exported graphs, too, I think.

ckirmse commented on August 23, 2024

OK, confirmed that this fixed the non-determinism of prediction for me. That's really good!

As for the exported graphs--it should be building a test/prediction graph (no dropout), not using what's in the checkpoint, right @emedvedev ?

emedvedev commented on August 23, 2024

@ckirmse I think so.

mattfeury commented on August 23, 2024

@ckirmse @emedvedev I'm not sure that is the case, and that may actually be our issue here. As far as I can tell, our export code pulls from the checkpoint_state, which is only saved during training, meaning it's likely saving the model as prepared for training. This is probably why aocr predict works very well, but exporting and serving gives wildly different values (#25).

ckirmse commented on August 23, 2024

@mattfeury yeah I agree--that's what I was trying to say but I now realize my statement was vague. I meant to say "exporting should be building a test/prediction graph (no dropout), but as of right now it is using what's in the checkpoint which does have dropout, so that needs to be changed".

I'm hopeful that fixing that will fix #25.

mattfeury commented on August 23, 2024

OK, I'm going to try to get up to speed with that code and see what I can do.

tumusudheer commented on August 23, 2024

Hi All,

The fix
cnn_model = CNN(self.img_data, not self.forward_only)
is working great for me. Thank you.

Here is the code I used to write out the binary graph without weights.

import tensorflow as tf
from tensorflow.python.platform import gfile

# Assumption: Model is importable from the attention-ocr package at this path,
# and `parameters` holds the parsed command-line arguments (as for 'aocr test').
from aocr.model.model import Model

with tf.Graph().as_default() as graph:
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        # Build the model with the same parameters as 'aocr test'.
        model = Model(
            phase=parameters.phase,
            visualize=parameters.visualize,
            output_dir=parameters.output_dir,
            batch_size=parameters.batch_size,
            initial_learning_rate=parameters.initial_learning_rate,
            steps_per_checkpoint=parameters.steps_per_checkpoint,
            model_dir=parameters.model_dir,
            target_embedding_size=parameters.target_embedding_size,
            attn_num_hidden=parameters.attn_num_hidden,
            attn_num_layers=parameters.attn_num_layers,
            clip_gradients=parameters.clip_gradients,
            max_gradient_norm=parameters.max_gradient_norm,
            session=sess,
            load_model=parameters.load_model,
            gpu_id=parameters.gpu_id,
            use_gru=parameters.use_gru,
            use_distance=parameters.use_distance,
            max_image_width=parameters.max_width,
            max_image_height=parameters.max_height,
            max_prediction_length=parameters.max_prediction,
        )
        # Capture the graph structure only; freeze_graph adds the weights later.
        graph_def = graph.as_graph_def()

        sess.run(tf.global_variables_initializer())
        # Write the binary (protobuf) graph definition to disk.
        with gfile.GFile('test_binary_graph.pb', 'wb') as f:
            f.write(graph_def.SerializeToString())

Just use the same parameters as for 'aocr test'.

After this, I used the freeze_graph utility from TensorFlow as follows:

python freeze_graph.py --input_graph=test_binary_graph.pb --input_checkpoint=./logs/train/model.ckpt-514062 --input_binary=true --output_node_names=prediction,probability --output_graph=test_frozen_graph.pb

Both steps can be combined into a single step. The final test_frozen_graph.pb is working well for me.

Hope it helps.

tumusudheer commented on August 23, 2024

Hi @emedvedev ,

Closing this as the fix is working great.

Thanks.
