image-caption

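Image captioning is essentially a "describe the picture" task: the input is an image and the output is a description of that image. The dataset for this project comes from COCO2014, which contains more than 80,000 images. The model is built with tf.keras and is deliberately simple, which makes it suitable for beginners to study and practice with.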

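Reference: "Strongly promoted by TensorFlow, a hit project on GitHub: generating image captions automatically with an attention model"

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/image_captioning_with_attention.ipynb
The official code uses TensorFlow's eager mode; consult it for reference.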

Run

  1. python data_hepler.py
    Splits the data, using 20,000 images, and saves the extracted features for those 20,000 images.

  2. python image_caption_keras.py
    Starts training and saves the model.

Tutorial Overview

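1. Data splitting, text cleaning, and dataset construction. The labels have to be constructed by hand.
For example, the caption "two women stand on each side of the elephant" is restructured into the following samples: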

image    caption                                                 label
=====    ====================================================    ======
image    <start>                                                 two
image    <start> two                                             women
image    <start> two women                                       stand
image    ......                                                  ......
image    <start> two women stand on each side of the elephant    <end>

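At each step, the same image is fed into the model together with the words of its caption seen so far, and the model outputs the next word of the caption.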
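
The expansion in the table can be sketched in a few lines of Python. This is only an illustration: the function name `make_training_pairs`, the lowercase-and-split tokenization, and the use of a plain string as the image identifier are assumptions, not necessarily what data_hepler.py does.

```python
from typing import List, Tuple

START, END = "<start>", "<end>"


def make_training_pairs(image_id: str, caption: str) -> List[Tuple[str, List[str], str]]:
    """Expand one (image, caption) pair into (image, prefix, next_word) samples.

    Hypothetical helper: tokenization here is a plain lowercase whitespace split.
    """
    tokens = [START] + caption.lower().split() + [END]
    samples = []
    # Every proper prefix of the token list becomes an input;
    # the token that follows the prefix becomes its label.
    for i in range(1, len(tokens)):
        samples.append((image_id, tokens[:i], tokens[i]))
    return samples


pairs = make_training_pairs("image", "two women stand on each side of the elephant")
for image_id, prefix, label in pairs[:3]:
    print(image_id, "|", " ".join(prefix), "|", label)
```

A caption of n words yields n + 1 samples, the last of which has `<end>` as its label. In a real pipeline the prefixes would still need to be mapped to integer ids and padded to a fixed length before being passed to the model.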
2. Model

3. Testing
beam_search is used to generate the image captions.
Results: see the result.ipynb file.
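
For readers unfamiliar with beam search, the following is a minimal, framework-free sketch of the idea, not the project's own implementation. The callable `next_word_probs` stands in for the trained model (in practice it would run the network on the image features plus the current prefix and return the softmax output); the toy probability table, the function names, and the default parameters are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List

START, END = "<start>", "<end>"


def beam_search(next_word_probs: Callable[[List[str]], Dict[str, float]],
                beam_width: int = 3, max_len: int = 20) -> List[str]:
    """Return the highest-scoring token sequence found with a fixed-width beam."""
    beams = [([START], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == END:
                # Finished captions are carried over unchanged.
                candidates.append((tokens, score))
                continue
            for word, p in next_word_probs(tokens).items():
                if p > 0:
                    candidates.append((tokens + [word], score + math.log(p)))
        # Keep only the beam_width best partial captions.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens[-1] == END for tokens, _ in beams):
            break
    return beams[0][0]


# Toy stand-in for the model: the distribution depends only on the last word.
TOY_TABLE = {
    "<start>": {"two": 0.6, "a": 0.4},
    "two": {"women": 0.5, "dogs": 0.5},
    "a": {"dog": 0.9, "cat": 0.1},
    "women": {"<end>": 1.0},
    "dogs": {"<end>": 1.0},
    "dog": {"<end>": 1.0},
    "cat": {"<end>": 1.0},
}


def toy_next_word_probs(tokens: List[str]) -> Dict[str, float]:
    return TOY_TABLE[tokens[-1]]


print(beam_search(toy_next_word_probs, beam_width=1))  # greedy decoding
print(beam_search(toy_next_word_probs, beam_width=2))
```

With a beam width of 1 the search is greedy and commits to "two" because it is the most likely first word, while a width of 2 keeps "a" alive long enough to find the overall more probable caption "a dog". This sketch does not apply length normalization, which some implementations add to avoid favoring short captions.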

Contributors

wangru8080
