Comments (8)

wuyangdut commented on July 24, 2024

I have the same question. It doesn't seem to be used anywhere else.

from captcha_break.

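ypwhs commented on July 24, 2024

The data generator is much the same as the one for the CNN model, but it needs a few extra arrays: input_length, which is the sequence length; label_length, which is the captcha length; and an np.ones array that carries no meaning and exists only to satisfy the array input Keras requires for training.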

import random

import numpy as np
from captcha.image import ImageCaptcha
from tensorflow.keras.utils import Sequence

class CaptchaSequence(Sequence):
    def __init__(self, characters, batch_size, steps, n_len=4, width=128, height=64, 
                 input_length=16, label_length=4):
        self.characters = characters
        self.batch_size = batch_size
        self.steps = steps
        self.n_len = n_len
        self.width = width
        self.height = height
        self.input_length = input_length
        self.label_length = label_length
        self.n_class = len(characters)
        self.generator = ImageCaptcha(width=width, height=height)
    
    def __len__(self):
        return self.steps

    def __getitem__(self, idx):
        X = np.zeros((self.batch_size, self.height, self.width, 3), dtype=np.float32)
        y = np.zeros((self.batch_size, self.n_len), dtype=np.uint8)
        input_length = np.ones(self.batch_size)*self.input_length
        label_length = np.ones(self.batch_size)*self.label_length
        for i in range(self.batch_size):
            random_str = ''.join([random.choice(self.characters) for j in range(self.n_len)])
            X[i] = np.array(self.generator.generate_image(random_str)) / 255.0
            y[i] = [self.characters.find(x) for x in random_str]
        return [X, y, input_length, label_length], np.ones(self.batch_size)
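
To make the extra arrays concrete, here is a minimal NumPy-only sketch of the label encoding and the three auxiliary arrays, without any image generation. The helper name make_ctc_targets and the character set (10 digits plus 26 uppercase letters) are assumptions for illustration; the thread only states that the model outputs 37 classes including the blank.

```python
import string

import numpy as np

# Assumed character set: 10 digits + 26 uppercase letters = 36 classes.
# Adding one CTC blank gives the 37 outputs mentioned in the thread.
characters = string.digits + string.ascii_uppercase


def make_ctc_targets(texts, characters, input_length=16):
    """Hypothetical helper: encode captcha strings and build the CTC side arrays."""
    batch_size = len(texts)
    n_len = len(texts[0])
    # Each character becomes its position in the character set.
    y = np.array([[characters.find(ch) for ch in text] for text in texts],
                 dtype=np.uint8)
    # One length value per sample, as in CaptchaSequence.__getitem__.
    input_len = np.full(batch_size, input_length, dtype=np.float32)
    label_len = np.full(batch_size, n_len, dtype=np.float32)
    # Placeholder target; its values are never used.
    dummy = np.ones(batch_size, dtype=np.float32)
    return y, input_len, label_len, dummy


y, input_len, label_len, dummy = make_ctc_targets(["0A1Z", "9999"], characters)
print(y)          # integer class indices, shape (2, 4)
print(input_len)  # sequence length per sample
print(label_len)  # captcha length per sample
```

The dummy array is presumably needed because the loss is computed inside the model, so the training target only has to have the right batch dimension.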

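Both input_length and label_length are used where the loss is computed:

  • y_pred is the model output: the probabilities of 37 characters, emitted in sequence order. Because the model uses a recurrent network with CTC loss, the character set includes a blank character;
  • labels is the captcha, encoded as four numbers, each giving the position of a character in the character set;
  • input_length is the length of y_pred, which is 16 here;
  • label_length is the length of labels, which is 4 here.

import tensorflow.keras.backend as K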

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
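
To show what the two length arguments do inside the loss, here is a minimal NumPy sketch of the CTC negative log-likelihood for a single sample, using the standard forward algorithm. It is an illustration, not the Keras implementation; the function name is hypothetical, and the blank is assumed to be the last class index, which is the convention K.ctc_batch_cost follows.

```python
import numpy as np


def ctc_neg_log_likelihood(probs, label, input_length, label_length):
    """Illustrative single-sample CTC loss via the forward algorithm."""
    # Only the first input_length time steps and label_length labels are used.
    probs = np.asarray(probs, dtype=np.float64)[:input_length]
    label = list(label[:label_length])
    blank = probs.shape[1] - 1  # assumption: blank is the last class

    # Extended label: a blank before, between, and after every character.
    ext = [blank]
    for c in label:
        ext += [c, blank]
    T, S = probs.shape[0], len(ext)

    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]
            if s >= 1:
                total += alpha[t - 1, s - 1]
            # Skipping a blank is allowed only between two different characters.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]
            alpha[t, s] = total * probs[t, ext[s]]

    likelihood = alpha[T - 1, S - 1]
    if S > 1:
        likelihood += alpha[T - 1, S - 2]
    return np.inf if likelihood == 0 else -np.log(likelihood)


# 16 time steps, 37 classes, uniform probabilities, a 4-character label.
uniform = np.full((16, 37), 1.0 / 37)
print(ctc_neg_log_likelihood(uniform, [0, 10, 1, 35], 16, 4))
```

When input_length is too short to fit the label, no valid alignment exists and the sketch returns infinity, which is one reason the sequence length has to be at least as long as the captcha.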

alex337 commented on July 24, 2024

What is input_length here?

ypwhs commented on July 24, 2024

What is input_length here?

input_length is the sequence length, which is 16 here.

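lijiajun3029 commented on July 24, 2024

Why was 16 chosen for the sequence length input_length? How does it relate to the last feature map having width 16? Is 16 the maximum value here, and what are the pros and cons of a smaller value? Any guidance would be appreciated.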

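ypwhs commented on July 24, 2024

Why was 16 chosen for the sequence length input_length? How does it relate to the last feature map having width 16? Is 16 the maximum value here, and what are the pros and cons of a smaller value? Any guidance would be appreciated.

input_length is the length of y_pred, that is, the length of the sequence output by the RNN. It equals the width of the CNN output and cannot be changed arbitrarily.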
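
As a rough illustration of where a value like 16 can come from, the sketch below computes the feature-map width after repeated pooling. The pooling layout (three 2x poolings along the width axis) is an assumption chosen so that a 128-pixel image yields 16 columns; the thread does not show the model definition. The second helper, also hypothetical, gives the fewest time steps CTC needs for a label, since repeated adjacent characters must be separated by a blank.

```python
def cnn_output_width(width, pool_factors):
    """Width of the feature map after each pooling step divides it."""
    for factor in pool_factors:
        width //= factor
    return width


def min_ctc_steps(label):
    """Fewest time steps CTC needs: one per character plus one blank per adjacent repeat."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


# Assumed layout: three 2x poolings along the width, 128 -> 64 -> 32 -> 16.
input_length = cnn_output_width(128, [2, 2, 2])
print(input_length)

# The worst case for a 4-character captcha is four identical characters.
print(min_ctc_steps("AAAA") <= input_length)
```

Under these assumptions each of the 16 columns becomes one RNN time step, so changing input_length means changing the image width or the pooling, not just the number passed to the loss.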

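lijiajun3029 commented on July 24, 2024

Why was 16 chosen for the sequence length input_length? How does it relate to the last feature map having width 16? Is 16 the maximum value here, and what are the pros and cons of a smaller value? Any guidance would be appreciated.

input_length is the length of y_pred, that is, the length of the sequence output by the RNN. It equals the width of the CNN output and cannot be changed arbitrarily.

Thanks.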

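ctzhang2008 commented on July 24, 2024

Why was 16 chosen for the sequence length input_length? How does it relate to the last feature map having width 16? Is 16 the maximum value here, and what are the pros and cons of a smaller value? Any guidance would be appreciated.

input_length is the length of y_pred, that is, the length of the sequence output by the RNN. It equals the width of the CNN output and cannot be changed arbitrarily. Thanks.

"It equals the width of the CNN output and cannot be changed arbitrarily." I don't understand this: what does the width of the CNN output refer to?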
