Comments (28)

MMMMichaelzhang commented on August 12, 2024

meldataset.py:

# coding: utf-8

import os
import os.path as osp
import time
import random
import numpy as np
import soundfile as sf

import torch
from torch import nn
import torch.nn.functional as F
import torchaudio
from torch.utils.data import DataLoader

from g2pM import G2pM

import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
from text_utils import TextCleaner
np.random.seed(1)
random.seed(1)
DEFAULT_DICT_PATH = osp.join(osp.dirname(__file__), 'word_index_dict.txt') #'kata_dict.csv') #
SPECT_PARAMS = {
    "n_fft": 2048,
    "win_length": 1200,
    "hop_length": 300
}
MEL_PARAMS = {
    "n_mels": 80,
}

class MelDataset(torch.utils.data.Dataset):
    def __init__(self,
                 data_list,
                 dict_path=DEFAULT_DICT_PATH,
                 sr=24000
                 ):

        spect_params = SPECT_PARAMS
        mel_params = MEL_PARAMS

        _data_list = [l[:-1].split('|') for l in data_list]
        self.data_list = [data if len(data) == 3 else (*data, 0) for data in _data_list]
        self.text_cleaner = TextCleaner(dict_path)
        self.sr = sr

        self.to_melspec = torchaudio.transforms.MelSpectrogram(**MEL_PARAMS)
        self.mean, self.std = -4, 4

        self.g2p = G2pM()

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, idx):
        data = self.data_list[idx]
        wave, text_tensor, speaker_id = self._load_tensor(data)
        wave_tensor = torch.from_numpy(wave).float()
        mel_tensor = self.to_melspec(wave_tensor)

        if (text_tensor.size(0)+1) >= (mel_tensor.size(1) // 3):
            mel_tensor = F.interpolate(
                mel_tensor.unsqueeze(0), size=(text_tensor.size(0)+1)*3, align_corners=False,
                mode='linear').squeeze(0)

        acoustic_feature = (torch.log(1e-5 + mel_tensor) - self.mean)/self.std

        length_feature = acoustic_feature.size(1)
        acoustic_feature = acoustic_feature[:, :(length_feature - length_feature % 2)]

        return wave_tensor, acoustic_feature, text_tensor, data[0]

    def _load_tensor(self, data):
        wave_path, text, speaker_id = data
        speaker_id = int(speaker_id)

        wave, sr = sf.read(wave_path)

        # phonemize the text
        ps = self.g2p(text.replace('-', ' '))
        if "'" in ps:
            ps.remove("'")
        text = self.text_cleaner(ps)
        blank_index = self.text_cleaner.word_index_dictionary[" "]
        text.insert(0, blank_index) # add a blank at the beginning (silence)
        text.append(blank_index) # add a blank at the end (silence)

        text = torch.LongTensor(text)

        return wave, text, speaker_id

class Collater(object):
    """
    Args:
      return_wave (bool): if true, will return the wave data along with spectrogram.
    """

    def __init__(self, return_wave=False):
        self.text_pad_index = 0
        self.return_wave = return_wave

    def __call__(self, batch):
        batch_size = len(batch)

        # sort by mel length
        lengths = [b[1].shape[1] for b in batch]
        batch_indexes = np.argsort(lengths)[::-1]
        batch = [batch[bid] for bid in batch_indexes]

        nmels = batch[0][1].size(0)
        max_mel_length = max([b[1].shape[1] for b in batch])
        max_text_length = max([b[2].shape[0] for b in batch])

        mels = torch.zeros((batch_size, nmels, max_mel_length)).float()
        texts = torch.zeros((batch_size, max_text_length)).long()
        input_lengths = torch.zeros(batch_size).long()
        output_lengths = torch.zeros(batch_size).long()
        paths = ['' for _ in range(batch_size)]
        for bid, (_, mel, text, path) in enumerate(batch):
            mel_size = mel.size(1)
            text_size = text.size(0)
            mels[bid, :, :mel_size] = mel
            texts[bid, :text_size] = text
            input_lengths[bid] = text_size
            output_lengths[bid] = mel_size
            paths[bid] = path
            assert(text_size < (mel_size//2))

        if self.return_wave:
            waves = [b[0] for b in batch]
            return texts, input_lengths, mels, output_lengths, paths, waves

        return texts, input_lengths, mels, output_lengths

def build_dataloader(path_list,
                     validation=False,
                     batch_size=4,
                     num_workers=1,
                     device='cpu',
                     collate_config={},
                     dataset_config={}):

    dataset = MelDataset(path_list, **dataset_config)
    collate_fn = Collater(**collate_config)
    data_loader = DataLoader(dataset,
                             batch_size=batch_size,
                             shuffle=(not validation),
                             num_workers=num_workers,
                             drop_last=(not validation),
                             collate_fn=collate_fn,
                             pin_memory=(device != 'cpu'))

    return data_loader
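
The length handling in `__getitem__` and the `assert` in `Collater` can be checked without loading any audio. Below is a minimal sketch, assuming only the arithmetic visible in the code above: `padded_mel_length` and `passes_collater_assert` are hypothetical helper names, and `text_len` is the token count including the two blank tokens. If the arithmetic is read correctly, the stretch step makes the assert hold for every item, so the assert alone would not flag utterances that are too short for the text.

```python
def padded_mel_length(mel_len: int, text_len: int) -> int:
    """Frame count after the stretch and even-trim steps in __getitem__."""
    # Stretch: if the mel is too short for the text, it is interpolated
    # to (text_len + 1) * 3 frames.
    if (text_len + 1) >= (mel_len // 3):
        mel_len = (text_len + 1) * 3
    # Trim: drop one frame when the length is odd.
    return mel_len - mel_len % 2


def passes_collater_assert(mel_len: int, text_len: int) -> bool:
    """Mirror of `assert(text_size < (mel_size//2))` in Collater.__call__."""
    return text_len < padded_mel_length(mel_len, text_len) // 2


if __name__ == "__main__":
    print(padded_mel_length(100, 40))       # short audio, stretched
    print(padded_mel_length(300, 20))       # long enough, unchanged
    print(passes_collater_assert(100, 40))
```

Stretched items are worth counting in a real train list, since interpolated mels no longer match the audio frame for frame.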

from auxiliaryasr.

Kristopher-Chen commented on August 12, 2024

Hi, what does your dict table for Mandarin look like?


MMMMichaelzhang commented on August 12, 2024

word_index_dict.txt
@Kristopher-Chen


Kristopher-Chen commented on August 12, 2024

wo men can jia guo xu duo zhong da huo dong de biao yan

Thank you! BTW, if the input to G2pM is already pinyin, it seems the output is also pinyin. How is it then mapped to the phonemes in the dict?


MMMMichaelzhang commented on August 12, 2024

word_index_dict.txt
Maybe like this? @Kristopher-Chen


Kristopher-Chen commented on August 12, 2024

word_index_dict.txt maybe like this? @Kristopher-Chen
This may be one way to train, with both the input and the dict in pinyin format. I just wonder whether the encoder output would then be similar to the author's results, or to what you get when training on Chinese characters.
It seems that for Chinese ASR, Chinese characters, pinyin, and phonemes are all acceptable training targets. Maybe someone more experienced can explain the difference. @yl4579 Is it OK to train on pinyin, or is it better to use phonemes? If phonemes, is there a tool that converts pinyin directly into phonemes?
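
One common approximation, when no dedicated tool is at hand, is to split each toneless pinyin syllable into an initial and a final and use those as the units in the dict. A minimal sketch of that idea follows. It is not the author's method. `INITIALS`, `split_syllable` and `pinyin_to_units` are hypothetical names, and treating `y` and `w` as initials is an assumed convention. The split is purely orthographic, so it does not distinguish sounds that share a spelling (the `i` in `zhi` and in `ji`, for example).

```python
# Pinyin initials, two-letter ones first so "zh" is tried before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split one toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        # Require a non-empty final so a bare "n" or "m" stays whole.
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "an" or "er"


def pinyin_to_units(text):
    """Turn space-separated pinyin into a flat list of initials and finals."""
    units = []
    for syllable in text.split():
        initial, final = split_syllable(syllable)
        if initial:
            units.append(initial)
        units.append(final)
    return units


if __name__ == "__main__":
    print(pinyin_to_units("wo men can jia guo xu duo zhong da huo dong"))
```

The dict file would then need one entry per initial and final (plus the blank) instead of one entry per syllable.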


MMMMichaelzhang commented on August 12, 2024

How should I change meldataset.py? I still get an error. @Kristopher-Chen


Kristopher-Chen commented on August 12, 2024

speaker_id = int(speaker_id)

It seems there is something wrong with the speaker label in your train list...
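
A quick way to find the offending lines is to parse the train list the same way `MelDataset` does and report every line whose last field is not an integer. This is a minimal sketch, assuming the `wave_path|text|speaker_id` layout implied by the code above. `parse_train_line` and `find_bad_lines` are hypothetical helpers, the file names in the demo are made up, and the sketch is stricter than the dataset code in that it rejects lines with more than three fields.

```python
def parse_train_line(line):
    """Parse 'wave_path|text|speaker_id'; a missing speaker defaults to 0."""
    fields = line.rstrip("\n").split("|")
    if len(fields) == 2:
        fields.append("0")  # same default MelDataset applies
    if len(fields) != 3:
        raise ValueError(
            f"expected 2 or 3 '|'-separated fields, got {len(fields)}: {line!r}")
    wave_path, text, speaker = fields
    try:
        speaker_id = int(speaker)
    except ValueError:
        raise ValueError(
            f"speaker label {speaker!r} is not an integer: {line!r}") from None
    return wave_path, text, speaker_id


def find_bad_lines(lines):
    """Return (line_number, message) for every line that fails to parse."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            parse_train_line(line)
        except ValueError as err:
            bad.append((number, str(err)))
    return bad


if __name__ == "__main__":
    sample = ["a.wav|ni hao|spk1\n", "b.wav|zai jian|0\n"]
    for number, message in find_bad_lines(sample):
        print(number, message)
```

Running `find_bad_lines` over `open(train_list_path).readlines()` would show whether the labels are strings (speaker names) rather than integer IDs.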


MMMMichaelzhang commented on August 12, 2024

I changed it like this:

def _load_tensor(self, data):
    wave_path, text, speaker_id = data
    speaker_id = 0
    wave, sr = sf.read(wave_path)

    # phonemize the text
    ps = text.split(" ")
    if "'" in ps:
        ps.remove("'")
    text = self.text_cleaner(ps)
    blank_index = self.text_cleaner.word_index_dictionary[" "]
    text.insert(0, blank_index)  # add a blank at the beginning (silence)
    text.append(blank_index)  # add a blank at the end (silence)
    text = torch.LongTensor(text)

    return wave, text, speaker_id

I didn't use g2p. I only split the text into a list, like this: ['zhi', 'ye', 'lian', 'sai', ...]
Then I ran train.py and got lots of nan:

/home/mike/anaconda3/envs/asr/bin/python /home/mike/PycharmProjects/AuxiliaryASR/train.py
{'max_lr': 0.0005, 'pct_start': 0.0, 'epochs': 200, 'steps_per_epoch': 5}
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
/home/mike/PycharmProjects/AuxiliaryASR/trainer.py:158: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
mel_input_length = mel_input_length // (2 ** self.model.n_down)
[train]: 100%|██████████| 5/5 [00:02<00:00, 2.19it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
/home/mike/PycharmProjects/AuxiliaryASR/trainer.py:203: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
mel_input_length = mel_input_length // (2 ** self.model.n_down)
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.65it/s]
--- epoch 1 ---
train/loss : 75.0535
train/ctc : 69.0236
train/s2s : 6.0298
train/learning_rate: 0.0005
eval/ctc : 6.2143
eval/s2s : 5.7734
eval/loss : 11.9877
eval/wer : 0.9154
eval/acc : 0.0681
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 2 ---
train/loss : 11.0505
train/ctc : 5.3216
train/s2s : 5.7289
train/learning_rate: 0.0005
eval/ctc : 5.7357
eval/s2s : 5.4435
eval/loss : 11.1791
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.55it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 3 ---
train/loss : 10.1859
train/ctc : 4.7313
train/s2s : 5.4546
train/learning_rate: 0.0005
eval/ctc : 4.5162
eval/s2s : 5.2624
eval/loss : 9.7787
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.89it/s]
--- epoch 4 ---
train/loss : 9.6173
train/ctc : 4.3016
train/s2s : 5.3157
train/learning_rate: 0.0005
eval/ctc : 4.5265
eval/s2s : 5.1128
eval/loss : 9.6394
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.23it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 5 ---
train/loss : 9.1252
train/ctc : 3.9722
train/s2s : 5.1530
train/learning_rate: 0.0005
eval/ctc : 4.1449
eval/s2s : 5.0081
eval/loss : 9.1530
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.40it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 6 ---
train/loss : 8.8360
train/ctc : 3.7944
train/s2s : 5.0416
train/learning_rate: 0.0005
eval/ctc : 3.9801
eval/s2s : 4.9309
eval/loss : 8.9109
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 7 ---
train/loss : 8.5783
train/ctc : 3.6231
train/s2s : 4.9552
train/learning_rate: 0.0005
eval/ctc : 3.9631
eval/s2s : 4.8768
eval/loss : 8.8399
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 8 ---
train/loss : 8.3889
train/ctc : 3.5022
train/s2s : 4.8867
train/learning_rate: 0.0005
eval/ctc : 3.7899
eval/s2s : 4.8400
eval/loss : 8.6300
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.60it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 9 ---
train/loss : 8.2826
train/ctc : 3.4358
train/s2s : 4.8468
train/learning_rate: 0.0005
eval/ctc : 3.7411
eval/s2s : 4.8128
eval/loss : 8.5539
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.38it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 10 ---
train/loss : 8.2358
train/ctc : 3.4074
train/s2s : 4.8285
train/learning_rate: 0.0005
eval/ctc : 3.8239
eval/s2s : 4.7924
eval/loss : 8.6163
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.97it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 11 ---
train/loss : 8.1843
train/ctc : 3.3910
train/s2s : 4.7933
train/learning_rate: 0.0005
eval/ctc : 3.7246
eval/s2s : 4.7765
eval/loss : 8.5012
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.89it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.58it/s]
--- epoch 12 ---
train/loss : 8.1550
train/ctc : 3.3807
train/s2s : 4.7743
train/learning_rate: 0.0005
eval/ctc : 3.8298
eval/s2s : 4.7562
eval/loss : 8.5860
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 13 ---
train/loss : 8.1364
train/ctc : 3.3905
train/s2s : 4.7459
train/learning_rate: 0.0005
eval/ctc : 3.7493
eval/s2s : 4.7559
eval/loss : 8.5052
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.09it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 14 ---
train/loss : 8.0515
train/ctc : 3.3432
train/s2s : 4.7083
train/learning_rate: 0.0005
eval/ctc : 3.7520
eval/s2s : 4.7024
eval/loss : 8.4544
eval/wer : 0.9156
eval/acc : 0.1737
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s]
--- epoch 15 ---
train/loss : 8.0855
train/ctc : 3.3798
train/s2s : 4.7057
train/learning_rate: 0.0005
eval/ctc : 3.7498
eval/s2s : 4.6981
eval/loss : 8.4480
eval/wer : 0.9156
eval/acc : 0.1746
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.03it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 16 ---
train/loss : 8.1962
train/ctc : 3.4610
train/s2s : 4.7351
train/learning_rate: 0.0005
eval/ctc : 3.6114
eval/s2s : 4.7435
eval/loss : 8.3549
eval/wer : 0.9156
eval/acc : 0.1761
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 17 ---
train/loss : 8.1676
train/ctc : 3.4286
train/s2s : 4.7390
train/learning_rate: 0.0005
eval/ctc : 3.7043
eval/s2s : 4.7304
eval/loss : 8.4346
eval/wer : 0.9156
eval/acc : 0.1674
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 18 ---
train/loss : 8.0775
train/ctc : 3.3791
train/s2s : 4.6983
train/learning_rate: 0.0005
eval/ctc : 3.8197
eval/s2s : 4.6896
eval/loss : 8.5093
eval/wer : 0.9156
eval/acc : 0.1685
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 19 ---
train/loss : 7.9888
train/ctc : 3.3347
train/s2s : 4.6541
train/learning_rate: 0.0005
eval/ctc : 3.8478
eval/s2s : 4.6834
eval/loss : 8.5312
eval/wer : 0.9156
eval/acc : 0.1761
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 20 ---
train/loss : 7.9516
train/ctc : 3.3236
train/s2s : 4.6280
train/learning_rate: 0.0005
eval/ctc : 3.7614
eval/s2s : 4.6617
eval/loss : 8.4231
eval/wer : 0.9156
eval/acc : 0.1837
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 21 ---
train/loss : 7.9785
train/ctc : 3.3378
train/s2s : 4.6407
train/learning_rate: 0.0005
eval/ctc : 3.6597
eval/s2s : 4.7463
eval/loss : 8.4060
eval/wer : 0.9156
eval/acc : 0.1789
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s]
--- epoch 22 ---
train/loss : 7.9172
train/ctc : 3.3130
train/s2s : 4.6042
train/learning_rate: 0.0005
eval/ctc : 3.6189
eval/s2s : 4.6734
eval/loss : 8.2923
eval/wer : 0.9156
eval/acc : 0.1833
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 23 ---
train/loss : 7.8904
train/ctc : 3.2974
train/s2s : 4.5929
train/learning_rate: 0.0005
eval/ctc : 3.6237
eval/s2s : 4.6622
eval/loss : 8.2859
eval/wer : 0.9156
eval/acc : 0.1892
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 24 ---
train/loss : 7.8784
train/ctc : 3.2912
train/s2s : 4.5872
train/learning_rate: 0.0005
eval/ctc : 3.5953
eval/s2s : 4.6667
eval/loss : 8.2620
eval/wer : 0.9156
eval/acc : 0.1817
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.77it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s]
--- epoch 25 ---
train/loss : 7.8672
train/ctc : 3.2971
train/s2s : 4.5701
train/learning_rate: 0.0005
eval/ctc : 3.6394
eval/s2s : 4.6323
eval/loss : 8.2717
eval/wer : 0.9156
eval/acc : 0.1903
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.98it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 26 ---
train/loss : 7.8857
train/ctc : 3.3115
train/s2s : 4.5742
train/learning_rate: 0.0005
eval/ctc : 3.7269
eval/s2s : 4.6432
eval/loss : 8.3700
eval/wer : 0.9156
eval/acc : 0.1866
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 27 ---
train/loss : 7.8556
train/ctc : 3.3006
train/s2s : 4.5550
train/learning_rate: 0.0005
eval/ctc : 3.6610
eval/s2s : 4.6692
eval/loss : 8.3302
eval/wer : 0.9156
eval/acc : 0.1872
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s]
--- epoch 28 ---
train/loss : 7.8417
train/ctc : 3.3064
train/s2s : 4.5353
train/learning_rate: 0.0005
eval/ctc : 3.7343
eval/s2s : 4.8558
eval/loss : 8.5901
eval/wer : 0.9156
eval/acc : 0.1804
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 29 ---
train/loss : 7.8900
train/ctc : 3.3365
train/s2s : 4.5535
train/learning_rate: 0.0005
eval/ctc : 3.6595
eval/s2s : 4.7935
eval/loss : 8.4530
eval/wer : 0.9156
eval/acc : 0.1841
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 30 ---
train/loss : 7.8213
train/ctc : 3.3108
train/s2s : 4.5105
train/learning_rate: 0.0005
eval/ctc : 3.6763
eval/s2s : 4.7761
eval/loss : 8.4524
eval/wer : 0.9156
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.83it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 31 ---
train/loss : 7.7976
train/ctc : 3.2952
train/s2s : 4.5025
train/learning_rate: 0.0005
eval/ctc : 3.9837
eval/s2s : 4.7439
eval/loss : 8.7276
eval/wer : 0.9156
eval/acc : 0.1514
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 32 ---
train/loss : 7.9222
train/ctc : 3.3169
train/s2s : 4.6052
train/learning_rate: 0.0005
eval/ctc : 3.7498
eval/s2s : 4.9242
eval/loss : 8.6741
eval/wer : 0.9156
eval/acc : 0.1760
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s]
--- epoch 33 ---
train/loss : 7.9178
train/ctc : 3.3398
train/s2s : 4.5780
train/learning_rate: 0.0005
eval/ctc : 3.8901
eval/s2s : 4.6694
eval/loss : 8.5595
eval/wer : 0.9156
eval/acc : 0.1801
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 34 ---
train/loss : 7.8355
train/ctc : 3.3211
train/s2s : 4.5144
train/learning_rate: 0.0005
eval/ctc : 3.6889
eval/s2s : 4.6956
eval/loss : 8.3845
eval/wer : 0.9156
eval/acc : 0.1798
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.09it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 35 ---
train/loss : 7.8038
train/ctc : 3.3026
train/s2s : 4.5013
train/learning_rate: 0.0005
eval/ctc : 3.8723
eval/s2s : 4.8571
eval/loss : 8.7294
eval/wer : 0.9156
eval/acc : 0.1779
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 36 ---
train/loss : 7.8076
train/ctc : 3.3171
train/s2s : 4.4906
train/learning_rate: 0.0005
eval/ctc : 4.0461
eval/s2s : 4.8800
eval/loss : 8.9261
eval/wer : 0.9156
eval/acc : 0.1769
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 37 ---
train/loss : 7.8130
train/ctc : 3.3149
train/s2s : 4.4981
train/learning_rate: 0.0005
eval/ctc : 3.7968
eval/s2s : 4.7285
eval/loss : 8.5253
eval/wer : 0.9156
eval/acc : 0.1841
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s]
--- epoch 38 ---
train/loss : 7.7021
train/ctc : 3.2687
train/s2s : 4.4333
train/learning_rate: 0.0005
eval/ctc : 3.9418
eval/s2s : 4.7848
eval/loss : 8.7266
eval/wer : 0.9156
eval/acc : 0.1815
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 39 ---
train/loss : 7.6667
train/ctc : 3.2566
train/s2s : 4.4101
train/learning_rate: 0.0005
eval/ctc : 4.1770
eval/s2s : 4.9316
eval/loss : 9.1086
eval/wer : 0.9156
eval/acc : 0.1807
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 40 ---
train/loss : 7.6706
train/ctc : 3.2393
train/s2s : 4.4312
train/learning_rate: 0.0005
eval/ctc : 4.1531
eval/s2s : 4.7729
eval/loss : 8.9260
eval/wer : 0.9156
eval/acc : 0.1825
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 41 ---
train/loss : 7.7245
train/ctc : 3.3017
train/s2s : 4.4228
train/learning_rate: 0.0004
eval/ctc : 3.8345
eval/s2s : 4.7370
eval/loss : 8.5714
eval/wer : 0.9156
eval/acc : 0.1828
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.25it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 42 ---
train/loss : 7.8338
train/ctc : 3.3723
train/s2s : 4.4615
train/learning_rate: 0.0004
eval/ctc : 4.0506
eval/s2s : 4.8532
eval/loss : 8.9038
eval/wer : 0.9156
eval/acc : 0.1833
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.35it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 43 ---
train/loss : 7.6717
train/ctc : 3.2835
train/s2s : 4.3882
train/learning_rate: 0.0004
eval/ctc : 4.1455
eval/s2s : 4.7469
eval/loss : 8.8924
eval/wer : 0.9156
eval/acc : 0.1846
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s]
--- epoch 44 ---
train/loss : 7.6108
train/ctc : 3.2429
train/s2s : 4.3679
train/learning_rate: 0.0004
eval/ctc : 3.9913
eval/s2s : 4.9418
eval/loss : 8.9331
eval/wer : 0.9156
eval/acc : 0.1800
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 45 ---
train/loss : 7.5969
train/ctc : 3.2278
train/s2s : 4.3691
train/learning_rate: 0.0004
eval/ctc : 4.0990
eval/s2s : 4.8902
eval/loss : 8.9892
eval/wer : 0.9156
eval/acc : 0.1805
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 46 ---
train/loss : 7.6923
train/ctc : 3.2903
train/s2s : 4.4020
train/learning_rate: 0.0004
eval/ctc : 4.0601
eval/s2s : 4.9885
eval/loss : 9.0486
eval/wer : 0.9156
eval/acc : 0.1810
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 47 ---
train/loss : 7.6176
train/ctc : 3.2344
train/s2s : 4.3832
train/learning_rate: 0.0004
eval/ctc : 4.1990
eval/s2s : 4.9712
eval/loss : 9.1702
eval/wer : 0.9156
eval/acc : 0.1801
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.89it/s]
--- epoch 48 ---
train/loss : 7.7549
train/ctc : 3.3699
train/s2s : 4.3850
train/learning_rate: 0.0004
eval/ctc : 3.5383
eval/s2s : 4.6619
eval/loss : 8.2002
eval/wer : 0.9156
eval/acc : 0.1878
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 49 ---
train/loss : 7.7524
train/ctc : 3.3666
train/s2s : 4.3858
train/learning_rate: 0.0004
eval/ctc : 3.7864
eval/s2s : 4.6288
eval/loss : 8.4152
eval/wer : 0.9156
eval/acc : 0.1858
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 50 ---
train/loss : 7.6992
train/ctc : 3.3198
train/s2s : 4.3793
train/learning_rate: 0.0004
eval/ctc : 3.7948
eval/s2s : 4.7639
eval/loss : 8.5587
eval/wer : 0.9156
eval/acc : 0.1833
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s]
--- epoch 51 ---
train/loss : 7.6929
train/ctc : 3.3021
train/s2s : 4.3908
train/learning_rate: 0.0004
eval/ctc : 3.7772
eval/s2s : 4.6957
eval/loss : 8.4729
eval/wer : 0.9156
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.72it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 52 ---
train/loss : 7.6130
train/ctc : 3.2714
train/s2s : 4.3417
train/learning_rate: 0.0004
eval/ctc : 3.8054
eval/s2s : 4.6372
eval/loss : 8.4425
eval/wer : 0.9156
eval/acc : 0.1883
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.54it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 53 ---
train/loss : 7.6414
train/ctc : 3.2906
train/s2s : 4.3508
train/learning_rate: 0.0004
eval/ctc : 3.9416
eval/s2s : 4.7329
eval/loss : 8.6745
eval/wer : 0.9156
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.37it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 54 ---
train/loss : 7.5984
train/ctc : 3.2760
train/s2s : 4.3224
train/learning_rate: 0.0004
eval/ctc : 4.0414
eval/s2s : 4.7898
eval/loss : 8.8312
eval/wer : 0.9156
eval/acc : 0.1853
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.17it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.48it/s]
--- epoch 55 ---
train/loss : 7.6170
train/ctc : 3.2708
train/s2s : 4.3462
train/learning_rate: 0.0004
eval/ctc : 4.1231
eval/s2s : 4.8225
eval/loss : 8.9455
eval/wer : 0.9156
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.92it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.61it/s]
--- epoch 56 ---
train/loss : 7.5712
train/ctc : 3.2546
train/s2s : 4.3166
train/learning_rate: 0.0004
eval/ctc : 4.1421
eval/s2s : 4.8496
eval/loss : 8.9917
eval/wer : 0.9156
eval/acc : 0.1837
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 57 ---
train/loss : 7.5125
train/ctc : 3.2390
train/s2s : 4.2734
train/learning_rate: 0.0004
eval/ctc : 4.1720
eval/s2s : 4.8073
eval/loss : 8.9792
eval/wer : 0.9156
eval/acc : 0.1837
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 58 ---
train/loss : 7.5394
train/ctc : 3.2512
train/s2s : 4.2881
train/learning_rate: 0.0004
eval/ctc : 4.2439
eval/s2s : 4.8307
eval/loss : 9.0746
eval/wer : 0.9156
eval/acc : 0.1851
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.48it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 59 ---
train/loss : 7.4915
train/ctc : 3.2168
train/s2s : 4.2747
train/learning_rate: 0.0004
eval/ctc : 4.2920
eval/s2s : 4.8120
eval/loss : 9.1040
eval/wer : 0.9156
eval/acc : 0.1844
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.15it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s]
--- epoch 60 ---
train/loss : 7.5041
train/ctc : 3.2265
train/s2s : 4.2776
train/learning_rate: 0.0004
eval/ctc : 4.0227
eval/s2s : 4.7344
eval/loss : 8.7571
eval/wer : 0.9156
eval/acc : 0.1866
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.20it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.62it/s]
--- epoch 61 ---
train/loss : 7.4721
train/ctc : 3.1990
train/s2s : 4.2731
train/learning_rate: 0.0004
eval/ctc : 4.1542
eval/s2s : 4.7564
eval/loss : 8.9106
eval/wer : 0.9156
eval/acc : 0.1867
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.28it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 62 ---
train/loss : 7.4689
train/ctc : 3.2092
train/s2s : 4.2597
train/learning_rate: 0.0004
eval/ctc : 3.9684
eval/s2s : 4.8199
eval/loss : 8.7883
eval/wer : 0.9156
eval/acc : 0.1853
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.31it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 63 ---
train/loss : 7.4341
train/ctc : 3.1954
train/s2s : 4.2387
train/learning_rate: 0.0004
eval/ctc : 4.0277
eval/s2s : 4.7588
eval/loss : 8.7865
eval/wer : 0.9156
eval/acc : 0.1872
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.30it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s]
--- epoch 64 ---
train/loss : 7.5040
train/ctc : 3.2359
train/s2s : 4.2681
train/learning_rate: 0.0004
eval/ctc : 4.0127
eval/s2s : 4.7493
eval/loss : 8.7620
eval/wer : 0.9156
eval/acc : 0.1859
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 65 ---
train/loss : 7.4346
train/ctc : 3.1880
train/s2s : 4.2466
train/learning_rate: 0.0004
eval/ctc : 3.8882
eval/s2s : 4.7089
eval/loss : 8.5971
eval/wer : 0.9156
eval/acc : 0.1835
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 66 ---
train/loss : 7.4244
train/ctc : 3.1936
train/s2s : 4.2309
train/learning_rate: 0.0004
eval/ctc : 4.0908
eval/s2s : 4.8293
eval/loss : 8.9200
eval/wer : 0.9156
eval/acc : 0.1856
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 67 ---
train/loss : 7.4008
train/ctc : 3.1989
train/s2s : 4.2019
train/learning_rate: 0.0004
eval/ctc : 3.8532
eval/s2s : 4.8491
eval/loss : 8.7024
eval/wer : 0.9156
eval/acc : 0.1860
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 68 ---
train/loss : 7.3786
train/ctc : 3.1684
train/s2s : 4.2103
train/learning_rate: 0.0004
eval/ctc : 4.1781
eval/s2s : 5.0866
eval/loss : 9.2648
eval/wer : 0.9156
eval/acc : 0.1822
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 69 ---
train/loss : 7.3674
train/ctc : 3.1431
train/s2s : 4.2242
train/learning_rate: 0.0004
eval/ctc : 4.0825
eval/s2s : 4.7750
eval/loss : 8.8575
eval/wer : 0.9156
eval/acc : 0.1857
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.44it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 70 ---
train/loss : 7.3628
train/ctc : 3.1575
train/s2s : 4.2053
train/learning_rate: 0.0004
eval/ctc : 4.0177
eval/s2s : 4.7277
eval/loss : 8.7454
eval/wer : 0.9156
eval/acc : 0.1857
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.06it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 71 ---
train/loss : 7.3362
train/ctc : 3.1444
train/s2s : 4.1918
train/learning_rate: 0.0004
eval/ctc : 3.8727
eval/s2s : 4.6658
eval/loss : 8.5385
eval/wer : 0.9156
eval/acc : 0.1865
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 72 ---
train/loss : 7.2773
train/ctc : 3.1114
train/s2s : 4.1659
train/learning_rate: 0.0004
eval/ctc : 3.6477
eval/s2s : 4.7479
eval/loss : 8.3956
eval/wer : 0.9156
eval/acc : 0.1881
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 73 ---
train/loss : 7.3971
train/ctc : 3.1863
train/s2s : 4.2108
train/learning_rate: 0.0004
eval/ctc : 4.3377
eval/s2s : 4.8245
eval/loss : 9.1622
eval/wer : 0.9156
eval/acc : 0.1815
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.59it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 74 ---
train/loss : 7.2459
train/ctc : 3.1035
train/s2s : 4.1425
train/learning_rate: 0.0003
eval/ctc : 4.0785
eval/s2s : 4.7461
eval/loss : 8.8246
eval/wer : 0.9156
eval/acc : 0.1865
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.05it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 75 ---
train/loss : 7.2715
train/ctc : 3.1017
train/s2s : 4.1698
train/learning_rate: 0.0003
eval/ctc : 3.9213
eval/s2s : 4.7599
eval/loss : 8.6812
eval/wer : 0.9156
eval/acc : 0.1847
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 76 ---
train/loss : 7.3272
train/ctc : 3.1439
train/s2s : 4.1833
train/learning_rate: 0.0003
eval/ctc : 4.1709
eval/s2s : 4.8088
eval/loss : 8.9798
eval/wer : 0.9156
eval/acc : 0.1832
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.88it/s]
--- epoch 77 ---
train/loss : 7.2035
train/ctc : 3.0685
train/s2s : 4.1350
train/learning_rate: 0.0003
eval/ctc : 4.6577
eval/s2s : 5.0092
eval/loss : 9.6669
eval/wer : 0.9156
eval/acc : 0.1787
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 78 ---
train/loss : 7.2384
train/ctc : 3.0983
train/s2s : 4.1401
train/learning_rate: 0.0003
eval/ctc : 4.1970
eval/s2s : 4.9218
eval/loss : 9.1188
eval/wer : 0.9156
eval/acc : 0.1815
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.02it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s]
--- epoch 79 ---
train/loss : 7.1406
train/ctc : 3.0216
train/s2s : 4.1190
train/learning_rate: 0.0003
eval/ctc : 4.3503
eval/s2s : 4.9700
eval/loss : 9.3204
eval/wer : 0.9156
eval/acc : 0.1822
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.38it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 80 ---
train/loss : 7.2061
train/ctc : 3.0606
train/s2s : 4.1455
train/learning_rate: 0.0003
eval/ctc : 3.9002
eval/s2s : 4.7365
eval/loss : 8.6367
eval/wer : 0.9156
eval/acc : 0.1858
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.18it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 81 ---
train/loss : 7.1768
train/ctc : 3.0608
train/s2s : 4.1159
train/learning_rate: 0.0003
eval/ctc : 4.6743
eval/s2s : 4.8976
eval/loss : 9.5719
eval/wer : 0.9156
eval/acc : 0.1828
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.83it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s]
--- epoch 82 ---
train/loss : 7.2171
train/ctc : 3.0902
train/s2s : 4.1269
train/learning_rate: 0.0003
eval/ctc : 3.8073
eval/s2s : 4.8600
eval/loss : 8.6673
eval/wer : 0.9156
eval/acc : 0.1843
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 83 ---
train/loss : 7.2151
train/ctc : 3.0843
train/s2s : 4.1308
train/learning_rate: 0.0003
eval/ctc : 3.7982
eval/s2s : 4.7248
eval/loss : 8.5230
eval/wer : 0.9156
eval/acc : 0.1849
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.44it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 84 ---
train/loss : 7.1503
train/ctc : 3.0318
train/s2s : 4.1185
train/learning_rate: 0.0003
eval/ctc : 3.9267
eval/s2s : 4.7152
eval/loss : 8.6419
eval/wer : 0.9156
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 85 ---
train/loss : 7.1685
train/ctc : 3.0486
train/s2s : 4.1199
train/learning_rate: 0.0003
eval/ctc : 4.0690
eval/s2s : 4.8683
eval/loss : 8.9373
eval/wer : 0.9156
eval/acc : 0.1845
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 86 ---
train/loss : 7.1133
train/ctc : 3.0121
train/s2s : 4.1013
train/learning_rate: 0.0003
eval/ctc : 4.0398
eval/s2s : 4.7599
eval/loss : 8.7997
eval/wer : 0.9156
eval/acc : 0.1828
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 87 ---
train/loss : 7.0757
train/ctc : 3.0049
train/s2s : 4.0708
train/learning_rate: 0.0003
eval/ctc : 4.2974
eval/s2s : 4.8235
eval/loss : 9.1209
eval/wer : 0.9156
eval/acc : 0.1864
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 88 ---
train/loss : 7.1118
train/ctc : 3.0216
train/s2s : 4.0902
train/learning_rate: 0.0003
eval/ctc : 3.9977
eval/s2s : 4.7862
eval/loss : 8.7840
eval/wer : 0.9156
eval/acc : 0.1844
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 89 ---
train/loss : 7.0431
train/ctc : 2.9789
train/s2s : 4.0642
train/learning_rate: 0.0003
eval/ctc : 4.2950
eval/s2s : 4.9520
eval/loss : 9.2470
eval/wer : 0.9156
eval/acc : 0.1828
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 90 ---
train/loss : 7.0664
train/ctc : 2.9918
train/s2s : 4.0746
train/learning_rate: 0.0003
eval/ctc : 4.1555
eval/s2s : 4.8369
eval/loss : 8.9924
eval/wer : 0.9156
eval/acc : 0.1799
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s]
--- epoch 91 ---
train/loss : 7.0185
train/ctc : 2.9730
train/s2s : 4.0455
train/learning_rate: 0.0003
eval/ctc : 4.0850
eval/s2s : 4.7547
eval/loss : 8.8398
eval/wer : 0.9150
eval/acc : 0.1843
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 92 ---
train/loss : 7.0077
train/ctc : 2.9500
train/s2s : 4.0577
train/learning_rate: 0.0003
eval/ctc : 4.0783
eval/s2s : 4.7477
eval/loss : 8.8260
eval/wer : 0.9156
eval/acc : 0.1851
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 93 ---
train/loss : 6.9128
train/ctc : 2.9149
train/s2s : 3.9978
train/learning_rate: 0.0003
eval/ctc : 4.5163
eval/s2s : 4.9429
eval/loss : 9.4592
eval/wer : 0.9116
eval/acc : 0.1814
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.54it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 94 ---
train/loss : 7.1016
train/ctc : 3.0004
train/s2s : 4.1013
train/learning_rate: 0.0003
eval/ctc : 4.8001
eval/s2s : 4.9809
eval/loss : 9.7809
eval/wer : 0.9156
eval/acc : 0.1811
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 95 ---
train/loss : 6.9483
train/ctc : 2.9333
train/s2s : 4.0150
train/learning_rate: 0.0003
eval/ctc : 3.8254
eval/s2s : 4.6975
eval/loss : 8.5228
eval/wer : 0.9156
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 96 ---
train/loss : 7.0110
train/ctc : 2.9697
train/s2s : 4.0413
train/learning_rate: 0.0003
eval/ctc : 3.9616
eval/s2s : 4.8604
eval/loss : 8.8220
eval/wer : 0.9115
eval/acc : 0.1797
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.59it/s]
--- epoch 97 ---
train/loss : 7.0256
train/ctc : 2.9598
train/s2s : 4.0659
train/learning_rate: 0.0003
eval/ctc : 3.9609
eval/s2s : 4.9142
eval/loss : 8.8752
eval/wer : 0.9141
eval/acc : 0.1820
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.92it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.57it/s]
--- epoch 98 ---
train/loss : 6.9884
train/ctc : 2.9417
train/s2s : 4.0467
train/learning_rate: 0.0003
eval/ctc : 4.1383
eval/s2s : 4.7269
eval/loss : 8.8652
eval/wer : 0.9149
eval/acc : 0.1886
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 99 ---
train/loss : 6.9401
train/ctc : 2.9170
train/s2s : 4.0231
train/learning_rate: 0.0003
eval/ctc : 4.2011
eval/s2s : 4.7642
eval/loss : 8.9653
eval/wer : 0.9154
eval/acc : 0.1825
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.43it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.59it/s]
--- epoch 100 ---
train/loss : 6.9860
train/ctc : 2.9462
train/s2s : 4.0398
train/learning_rate: 0.0003
eval/ctc : 4.0909
eval/s2s : 4.8165
eval/loss : 8.9075
eval/wer : 0.9149
eval/acc : 0.1837
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.71it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 101 ---
train/loss : 6.8023
train/ctc : 2.8412
train/s2s : 3.9610
train/learning_rate: 0.0002
eval/ctc : 4.1164
eval/s2s : 4.7567
eval/loss : 8.8731
eval/wer : 0.9146
eval/acc : 0.1841
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.53it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 102 ---
train/loss : 6.8663
train/ctc : 2.8736
train/s2s : 3.9927
train/learning_rate: 0.0002
eval/ctc : 3.9890
eval/s2s : 4.8344
eval/loss : 8.8234
eval/wer : 0.9051
eval/acc : 0.1835
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 103 ---
train/loss : 6.8254
train/ctc : 2.8415
train/s2s : 3.9839
train/learning_rate: 0.0002
eval/ctc : 4.2138
eval/s2s : 4.8773
eval/loss : 9.0911
eval/wer : 0.9127
eval/acc : 0.1801
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 104 ---
train/loss : 6.8487
train/ctc : 2.8615
train/s2s : 3.9872
train/learning_rate: 0.0002
eval/ctc : 4.0311
eval/s2s : 4.7282
eval/loss : 8.7593
eval/wer : 0.9094
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 105 ---
train/loss : 6.7640
train/ctc : 2.8293
train/s2s : 3.9347
train/learning_rate: 0.0002
eval/ctc : 4.2728
eval/s2s : 4.9799
eval/loss : 9.2526
eval/wer : 0.9094
eval/acc : 0.1806
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 106 ---
train/loss : 6.7261
train/ctc : 2.7815
train/s2s : 3.9446
train/learning_rate: 0.0002
eval/ctc : 4.1878
eval/s2s : 4.7721
eval/loss : 8.9600
eval/wer : 0.9144
eval/acc : 0.1836
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 107 ---
train/loss : 6.7471
train/ctc : 2.8097
train/s2s : 3.9374
train/learning_rate: 0.0002
eval/ctc : 4.1852
eval/s2s : 4.8745
eval/loss : 9.0597
eval/wer : 0.9112
eval/acc : 0.1803
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 108 ---
train/loss : 6.7333
train/ctc : 2.7896
train/s2s : 3.9437
train/learning_rate: 0.0002
eval/ctc : 4.3646
eval/s2s : 4.9009
eval/loss : 9.2655
eval/wer : 0.9051
eval/acc : 0.1822
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 109 ---
train/loss : 6.7074
train/ctc : 2.7849
train/s2s : 3.9225
train/learning_rate: 0.0002
eval/ctc : 4.1075
eval/s2s : 4.8536
eval/loss : 8.9611
eval/wer : 0.9038
eval/acc : 0.1846
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 110 ---
train/loss : 6.7409
train/ctc : 2.8076
train/s2s : 3.9333
train/learning_rate: 0.0002
eval/ctc : 4.2799
eval/s2s : 4.8573
eval/loss : 9.1372
eval/wer : 0.9024
eval/acc : 0.1843
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 111 ---
train/loss : 6.8031
train/ctc : 2.8327
train/s2s : 3.9703
train/learning_rate: 0.0002
eval/ctc : 4.3046
eval/s2s : 4.8558
eval/loss : 9.1604
eval/wer : 0.9021
eval/acc : 0.1865
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 112 ---
train/loss : 6.7090
train/ctc : 2.7768
train/s2s : 3.9322
train/learning_rate: 0.0002
eval/ctc : 4.3296
eval/s2s : 4.9961
eval/loss : 9.3257
eval/wer : 0.9033
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s]
--- epoch 113 ---
train/loss : 6.6278
train/ctc : 2.7155
train/s2s : 3.9123
train/learning_rate: 0.0002
eval/ctc : 4.4864
eval/s2s : 4.7846
eval/loss : 9.2710
eval/wer : 0.9092
eval/acc : 0.1850
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s]
--- epoch 114 ---
train/loss : 6.6054
train/ctc : 2.7000
train/s2s : 3.9054
train/learning_rate: 0.0002
eval/ctc : 4.5483
eval/s2s : 4.8994
eval/loss : 9.4477
eval/wer : 0.8950
eval/acc : 0.1827
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 115 ---
train/loss : 6.5971
train/ctc : 2.7080
train/s2s : 3.8892
train/learning_rate: 0.0002
eval/ctc : 4.6213
eval/s2s : 4.9159
eval/loss : 9.5372
eval/wer : 0.9038
eval/acc : 0.1838
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 116 ---
train/loss : 6.6083
train/ctc : 2.7158
train/s2s : 3.8925
train/learning_rate: 0.0002
eval/ctc : 4.6672
eval/s2s : 4.8830
eval/loss : 9.5502
eval/wer : 0.8977
eval/acc : 0.1860
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.00it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 117 ---
train/loss : 6.6253
train/ctc : 2.7059
train/s2s : 3.9195
train/learning_rate: 0.0002
eval/ctc : 4.5678
eval/s2s : 4.9646
eval/loss : 9.5324
eval/wer : 0.8970
eval/acc : 0.1839
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 118 ---
train/loss : 6.6870
train/ctc : 2.7655
train/s2s : 3.9215
train/learning_rate: 0.0002
eval/ctc : 4.4199
eval/s2s : 4.9017
eval/loss : 9.3216
eval/wer : 0.8919
eval/acc : 0.1844
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.75it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 119 ---
train/loss : 6.6171
train/ctc : 2.7096
train/s2s : 3.9075
train/learning_rate: 0.0002
eval/ctc : 4.7816
eval/s2s : 4.9494
eval/loss : 9.7310
eval/wer : 0.8923
eval/acc : 0.1836
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.68it/s]
--- epoch 120 ---
train/loss : 6.6390
train/ctc : 2.7281
train/s2s : 3.9109
train/learning_rate: 0.0002
eval/ctc : 4.5604
eval/s2s : 4.9981
eval/loss : 9.5585
eval/wer : 0.8985
eval/acc : 0.1834
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 121 ---
train/loss : 6.5856
train/ctc : 2.6883
train/s2s : 3.8973
train/learning_rate: 0.0002
eval/ctc : 4.5273
eval/s2s : 4.8998
eval/loss : 9.4271
eval/wer : 0.8895
eval/acc : 0.1845
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 122 ---
train/loss : 6.4885
train/ctc : 2.6342
train/s2s : 3.8543
train/learning_rate: 0.0002
eval/ctc : 4.7328
eval/s2s : 4.9184
eval/loss : 9.6512
eval/wer : 0.9001
eval/acc : 0.1851
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 123 ---
train/loss : 6.5376
train/ctc : 2.6740
train/s2s : 3.8635
train/learning_rate: 0.0002
eval/ctc : 4.7663
eval/s2s : 4.9702
eval/loss : 9.7365
eval/wer : 0.8941
eval/acc : 0.1868
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 124 ---
train/loss : 6.4988
train/ctc : 2.6389
train/s2s : 3.8599
train/learning_rate: 0.0002
eval/ctc : 4.8439
eval/s2s : 4.9807
eval/loss : 9.8246
eval/wer : 0.8911
eval/acc : 0.1854
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 125 ---
train/loss : 6.4257
train/ctc : 2.5810
train/s2s : 3.8447
train/learning_rate: 0.0002
eval/ctc : 4.9985
eval/s2s : 5.0640
eval/loss : 10.0625
eval/wer : 0.8936
eval/acc : 0.1842
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 126 ---
train/loss : 6.4752
train/ctc : 2.6204
train/s2s : 3.8548
train/learning_rate: 0.0002
eval/ctc : 4.8503
eval/s2s : 4.9105
eval/loss : 9.7608
eval/wer : 0.8920
eval/acc : 0.1830
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s]
--- epoch 127 ---
train/loss : 6.4884
train/ctc : 2.6190
train/s2s : 3.8694
train/learning_rate: 0.0001
eval/ctc : 5.0363
eval/s2s : 5.1468
eval/loss : 10.1831
eval/wer : 0.8921
eval/acc : 0.1830
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s]
--- epoch 128 ---
train/loss : 6.4482
train/ctc : 2.5900
train/s2s : 3.8581
train/learning_rate: 0.0001
eval/ctc : 4.9648
eval/s2s : 4.8854
eval/loss : 9.8502
eval/wer : 0.8874
eval/acc : 0.1842
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 129 ---
train/loss : 6.4149
train/ctc : 2.5781
train/s2s : 3.8368
train/learning_rate: 0.0001
eval/ctc : 4.9873
eval/s2s : 5.0764
eval/loss : 10.0637
eval/wer : 0.8910
eval/acc : 0.1827
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.14it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 130 ---
train/loss : 6.4227
train/ctc : 2.5821
train/s2s : 3.8406
train/learning_rate: 0.0001
eval/ctc : 5.0115
eval/s2s : 4.9145
eval/loss : 9.9260
eval/wer : 0.8866
eval/acc : 0.1811
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.37it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 131 ---
train/loss : 6.3550
train/ctc : 2.5384
train/s2s : 3.8167
train/learning_rate: 0.0001
eval/ctc : 5.0433
eval/s2s : 5.0876
eval/loss : 10.1309
eval/wer : 0.8889
eval/acc : 0.1845
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 132 ---
train/loss : 6.3446
train/ctc : 2.5334
train/s2s : 3.8112
train/learning_rate: 0.0001
eval/ctc : 4.9058
eval/s2s : 4.8962
eval/loss : 9.8020
eval/wer : 0.8830
eval/acc : 0.1824
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 133 ---
train/loss : 6.3411
train/ctc : 2.5253
train/s2s : 3.8158
train/learning_rate: 0.0001
eval/ctc : 5.0717
eval/s2s : 5.0374
eval/loss : 10.1091
eval/wer : 0.8755
eval/acc : 0.1845
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 134 ---
train/loss : 6.4452
train/ctc : 2.5937
train/s2s : 3.8515
train/learning_rate: 0.0001
eval/ctc : 5.0382
eval/s2s : 4.9168
eval/loss : 9.9550
eval/wer : 0.8833
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.46it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 135 ---
train/loss : 6.3595
train/ctc : 2.5399
train/s2s : 3.8196
train/learning_rate: 0.0001
eval/ctc : 5.0823
eval/s2s : 5.0376
eval/loss : 10.1199
eval/wer : 0.8694
eval/acc : 0.1833
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 136 ---
train/loss : 6.2757
train/ctc : 2.4839
train/s2s : 3.7918
train/learning_rate: 0.0001
eval/ctc : 4.9648
eval/s2s : 4.9231
eval/loss : 9.8879
eval/wer : 0.8760
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 137 ---
train/loss : 6.4195
train/ctc : 2.5827
train/s2s : 3.8367
train/learning_rate: 0.0001
eval/ctc : 5.2293
eval/s2s : 5.0484
eval/loss : 10.2776
eval/wer : 0.8782
eval/acc : 0.1866
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s]
--- epoch 138 ---
train/loss : 6.4159
train/ctc : 2.5802
train/s2s : 3.8358
train/learning_rate: 0.0001
eval/ctc : 5.2113
eval/s2s : 4.9180
eval/loss : 10.1293
eval/wer : 0.8747
eval/acc : 0.1859
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 139 ---
train/loss : 6.3295
train/ctc : 2.5188
train/s2s : 3.8107
train/learning_rate: 0.0001
eval/ctc : 5.1996
eval/s2s : 5.0505
eval/loss : 10.2501
eval/wer : 0.8757
eval/acc : 0.1841
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.75it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s]
--- epoch 140 ---
train/loss : 6.2596
train/ctc : 2.4745
train/s2s : 3.7850
train/learning_rate: 0.0001
eval/ctc : 5.1043
eval/s2s : 5.1086
eval/loss : 10.2130
eval/wer : 0.8682
eval/acc : 0.1876
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s]
--- epoch 141 ---
train/loss : 6.4073
train/ctc : 2.5652
train/s2s : 3.8420
train/learning_rate: 0.0001
eval/ctc : 5.1560
eval/s2s : 4.9750
eval/loss : 10.1310
eval/wer : 0.8730
eval/acc : 0.1840
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 142 ---
train/loss : 6.3672
train/ctc : 2.5420
train/s2s : 3.8252
train/learning_rate: 0.0001
eval/ctc : 5.1685
eval/s2s : 5.0811
eval/loss : 10.2496
eval/wer : 0.8706
eval/acc : 0.1831
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 143 ---
train/loss : 6.2984
train/ctc : 2.4938
train/s2s : 3.8045
train/learning_rate: 0.0001
eval/ctc : 5.1751
eval/s2s : 4.8938
eval/loss : 10.0690
eval/wer : 0.8697
eval/acc : 0.1864
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 144 ---
train/loss : 6.3082
train/ctc : 2.5022
train/s2s : 3.8060
train/learning_rate: 0.0001
eval/ctc : 5.2745
eval/s2s : 5.1016
eval/loss : 10.3761
eval/wer : 0.8844
eval/acc : 0.1816
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 145 ---
train/loss : 6.3316
train/ctc : 2.5142
train/s2s : 3.8174
train/learning_rate: 0.0001
eval/ctc : 5.3757
eval/s2s : 4.9865
eval/loss : 10.3623
eval/wer : 0.8680
eval/acc : 0.1832
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s]
--- epoch 146 ---
train/loss : 6.3041
train/ctc : 2.4977
train/s2s : 3.8064
train/learning_rate: 0.0001
eval/ctc : 5.3912
eval/s2s : 4.9660
eval/loss : 10.3572
eval/wer : 0.8683
eval/acc : 0.1847
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 147 ---
train/loss : 6.1718
train/ctc : 2.4324
train/s2s : 3.7395
train/learning_rate: 0.0001
eval/ctc : 5.2637
eval/s2s : 4.9811
eval/loss : 10.2448
eval/wer : 0.8820
eval/acc : 0.1847
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s]
--- epoch 148 ---
train/loss : 6.2739
train/ctc : 2.4775
train/s2s : 3.7964
train/learning_rate: 0.0001
eval/ctc : 5.4498
eval/s2s : 5.0704
eval/loss : 10.5201
eval/wer : 0.8704
eval/acc : 0.1841
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 149 ---
train/loss : 6.2611
train/ctc : 2.4690
train/s2s : 3.7921
train/learning_rate: 0.0001
eval/ctc : 5.4955
eval/s2s : 5.0015
eval/loss : 10.4970
eval/wer : 0.8684
eval/acc : 0.1823
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s]
--- epoch 150 ---
train/loss : 6.2660
train/ctc : 2.4992
train/s2s : 3.7668
train/learning_rate: 0.0001
eval/ctc : 5.3163
eval/s2s : 5.0459
eval/loss : 10.3622
eval/wer : 0.8734
eval/acc : 0.1837
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s]
--- epoch 151 ---
train/loss : 6.2079
train/ctc : 2.4543
train/s2s : 3.7536
train/learning_rate: 0.0001
eval/ctc : 5.5699
eval/s2s : 5.1366
eval/loss : 10.7065
eval/wer : 0.8644
eval/acc : 0.1851
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s]
--- epoch 152 ---
train/loss : 6.2287
train/ctc : 2.4417
train/s2s : 3.7871
train/learning_rate: 0.0001
eval/ctc : 5.4731
eval/s2s : 4.9824
eval/loss : 10.4555
eval/wer : 0.8508
eval/acc : 0.1883
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s]
--- epoch 153 ---
train/loss : 6.2143
train/ctc : 2.4430
train/s2s : 3.7714
train/learning_rate: 0.0001
eval/ctc : 5.4926
eval/s2s : 5.0861
eval/loss : 10.5787
eval/wer : 0.8609
eval/acc : 0.1871
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 2.87it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s]
--- epoch 154 ---
train/loss : 6.2405
train/ctc : 2.4724
train/s2s : 3.7681
train/learning_rate: 0.0001
eval/ctc : 5.5326
eval/s2s : 5.1088
eval/loss : 10.6414
eval/wer : 0.8641
eval/acc : 0.1834
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 155 ---
train/loss : 6.2183
train/ctc : 2.4464
train/s2s : 3.7719
train/learning_rate: 0.0001
eval/ctc : 5.4821
eval/s2s : 5.0116
eval/loss : 10.4937
eval/wer : 0.8540
eval/acc : 0.1877
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 156 ---
train/loss : 6.2605
train/ctc : 2.4798
train/s2s : 3.7807
train/learning_rate: 0.0001
eval/ctc : 5.3754
eval/s2s : 4.9878
eval/loss : 10.3632
eval/wer : 0.8588
eval/acc : 0.1845
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 157 ---
train/loss : 6.2181
train/ctc : 2.4409
train/s2s : 3.7772
train/learning_rate: 0.0001
eval/ctc : 5.3325
eval/s2s : 5.0767
eval/loss : 10.4092
eval/wer : 0.8625
eval/acc : 0.1862
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s]
--- epoch 158 ---
train/loss : 6.2230
train/ctc : 2.4478
train/s2s : 3.7752
train/learning_rate: 0.0001
eval/ctc : 5.4948
eval/s2s : 5.0612
eval/loss : 10.5560
eval/wer : 0.8647
eval/acc : 0.1858
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.78it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 159 ---
train/loss : 6.2061
train/ctc : 2.4407
train/s2s : 3.7654
train/learning_rate: 0.0001
eval/ctc : 5.6033
eval/s2s : 5.0781
eval/loss : 10.6813
eval/wer : 0.8548
eval/acc : 0.1801
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 160 ---
train/loss : 6.1916
train/ctc : 2.4349
train/s2s : 3.7567
train/learning_rate: 0.0001
eval/ctc : 5.5541
eval/s2s : 5.1141
eval/loss : 10.6682
eval/wer : 0.8493
eval/acc : 0.1833
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s]
--- epoch 161 ---
train/loss : 6.0618
train/ctc : 2.3397
train/s2s : 3.7220
train/learning_rate: 0.0000
eval/ctc : 5.4812
eval/s2s : 5.1022
eval/loss : 10.5834
eval/wer : 0.8570
eval/acc : 0.1870
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 162 ---
train/loss : 6.1683
train/ctc : 2.4256
train/s2s : 3.7427
train/learning_rate: 0.0000
eval/ctc : 5.4861
eval/s2s : 4.9905
eval/loss : 10.4766
eval/wer : 0.8572
eval/acc : 0.1869
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s]
--- epoch 163 ---
train/loss : 6.2493
train/ctc : 2.4704
train/s2s : 3.7789
train/learning_rate: 0.0000
eval/ctc : 5.7146
eval/s2s : 5.0813
eval/loss : 10.7960
eval/wer : 0.8580
eval/acc : 0.1854
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s]
--- epoch 164 ---
train/loss : 6.2610
train/ctc : 2.4721
train/s2s : 3.7889
train/learning_rate: 0.0000
eval/ctc : 5.7172
eval/s2s : 5.1034
eval/loss : 10.8207
eval/wer : 0.8577
eval/acc : 0.1790
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 165 ---
train/loss : 6.0859
train/ctc : 2.3641
train/s2s : 3.7219
train/learning_rate: 0.0000
eval/ctc : 5.6258
eval/s2s : 5.0231
eval/loss : 10.6490
eval/wer : 0.8590
eval/acc : 0.1820
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.49it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s]
--- epoch 166 ---
train/loss : 6.2488
train/ctc : 2.4680
train/s2s : 3.7808
train/learning_rate: 0.0000
eval/ctc : 5.4833
eval/s2s : 5.0314
eval/loss : 10.5147
eval/wer : 0.8558
eval/acc : 0.1870
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s]
--- epoch 167 ---
train/loss : 6.1344
train/ctc : 2.3942
train/s2s : 3.7402
train/learning_rate: 0.0000
eval/ctc : 5.5201
eval/s2s : 5.1154
eval/loss : 10.6355
eval/wer : 0.8567
eval/acc : 0.1853
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.68it/s]
--- epoch 168 ---
train/loss : 6.1516
train/ctc : 2.4127
train/s2s : 3.7388
train/learning_rate: 0.0000
eval/ctc : 5.5528
eval/s2s : 5.0850
eval/loss : 10.6378
eval/wer : 0.8559
eval/acc : 0.1838
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.35it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s]
--- epoch 169 ---
train/loss : 6.1128
train/ctc : 2.3775
train/s2s : 3.7353
train/learning_rate: 0.0000
eval/ctc : 5.5691
eval/s2s : 5.0203
eval/loss : 10.5895
eval/wer : 0.8553
eval/acc : 0.1857
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.03it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s]
--- epoch 170 ---
train/loss : 6.1213
train/ctc : 2.3781
train/s2s : 3.7432
train/learning_rate: 0.0000
eval/ctc : 5.6908
eval/s2s : 5.0534
eval/loss : 10.7443
eval/wer : 0.8602
eval/acc : 0.1846
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.33it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 171 ---
train/loss : 6.1483
train/ctc : 2.4037
train/s2s : 3.7445
train/learning_rate: 0.0000
eval/ctc : 5.6797
eval/s2s : 5.0921
eval/loss : 10.7718
eval/wer : 0.8606
eval/acc : 0.1831

from auxiliaryasr.

yl4579 avatar yl4579 commented on August 12, 2024

I believe this error means that one of the lines in your train_list.txt does not have a speaker number. It may look like filename.wav|text| instead of filename.wav|text|0.
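A quick way to find such lines is to scan the list before training. The following is only a minimal sketch: the helper name `find_bad_lines` is hypothetical and not part of the repository, and it assumes the `path|text|speaker_id` layout described above. It is stricter than the loader in meldataset.py, which pads two-field lines with speaker 0; the case that breaks `int(speaker_id)` is the empty third field.

```python
def find_bad_lines(lines):
    """Return (line_number, reason) for entries that are not path|text|speaker_id."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("|")
        if len(fields) != 3:
            problems.append((number, f"expected 3 fields, found {len(fields)}"))
        elif not fields[2].strip().isdigit():
            # An empty string (trailing "|") also fails this check.
            problems.append((number, f"speaker id {fields[2]!r} is not an integer"))
    return problems


# Hypothetical sample entries, for illustration only.
sample = [
    "a.wav|w o3 m en1|0\n",
    "b.wav|t a1|\n",   # trailing separator, empty speaker id
    "c.wav|t a2\n",    # no speaker field at all
]
for number, reason in find_bad_lines(sample):
    print(f"line {number}: {reason}")
```

To check a real file, pass an open file handle instead of the sample list, for example `find_bad_lines(open("train_list.txt", encoding="utf-8"))`.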


MMMMichaelzhang avatar MMMMichaelzhang commented on August 12, 2024

train_list.txt
val_list.txt
I changed "speaker_id = int(speaker_id)" to "speaker_id = 0",
and when I train, I get:
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
nan
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.33it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan
nan
nan
nan
[eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s]
--- epoch 171 ---
train/loss : 6.1483
train/ctc : 2.4037
train/s2s : 3.7445
train/learning_rate: 0.0005
eval/ctc : 5.6797
eval/s2s : 5.0921
eval/loss : 10.7718
eval/wer : 0.8606
eval/acc : 0.1831
Here is my train.log:
train.log
How do I test the model? Thank you very much.
@yl4579


yl4579 avatar yl4579 commented on August 12, 2024

I believe something is wrong with your labels. The loss should not be NaN and the WER should not be this high after 170 epochs of training. Can you discuss it with @Charlottecuc because it looks like she could train on this dataset with no problem? It looks like you have created so many tokens (420 tokens) and they aren't actually phonemes but syllables.


Charlottecuc avatar Charlottecuc commented on August 12, 2024

@MMMMichaelzhang The WER should be lower than 0.2 after about 20 epochs. Could you print your final text tensors in meldataset.py? The text tensor and the corresponding indices should match the sentences in your training data file. You can check whether something is wrong in your preprocessing steps. Besides, it seems that your word.dict file is not correct. The dict file should cover all possible Mandarin phonemes (e.g.
……
……
ta1 t a1
ta2 t a2
ta3 t a3
ta4 t a4
ta5 t a5
tai1 t ai1
tai2 t ai2
tai3 t ai3
tai4 t ai4
tai5 t ai5
tan1 t an1
tan2 t an2
tan3 t an3
tan4 t an4
tan5 t an5
tang1 t ang1
tang2 t ang2
tang3 t ang3
……
……)
I suggest you add tones because the WER will be higher if you delete them.
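As an illustration of how a dict in the format above could be used, here is a minimal sketch that maps toned pinyin syllables to phonemes and reports any syllable the dict does not cover. The function names `load_lexicon` and `to_phonemes` are hypothetical, and the assumption that each line is "syllable followed by its phonemes, separated by spaces" is taken from the excerpt above rather than from the repository code.

```python
def load_lexicon(lines):
    """Parse lines such as 'tai1 t ai1' into {'tai1': ['t', 'ai1']}."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        lexicon[parts[0]] = parts[1:]
    return lexicon


def to_phonemes(pinyin_syllables, lexicon):
    """Map pinyin syllables to phonemes; also collect syllables missing from the lexicon."""
    phonemes, missing = [], []
    for syllable in pinyin_syllables:
        if syllable in lexicon:
            phonemes.extend(lexicon[syllable])
        else:
            missing.append(syllable)
    return phonemes, missing


# Three entries copied from the style of the excerpt; "tou2" is deliberately absent.
lexicon = load_lexicon(["ta1 t a1", "tai4 t ai4", "tang2 t ang2"])
print(to_phonemes(["ta1", "tang2", "tou2"], lexicon))
```

A non-empty `missing` list would suggest the dict does not cover every syllable in the training text, which is the kind of mismatch described above.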


Charlottecuc avatar Charlottecuc commented on August 12, 2024

@MMMMichaelzhang The WER should be lower than 0.2 after about 20 epochs. Could you print your final text tensors in meldataset.py? The text tensor and the corresponding indices should match the sentences in your training data file. You can check whether something is wrong in your preprocessing steps. Besides, it seems that your word.dict file is not correct. The dict file should cover all possible Mandarin pinyin syllables (e.g. …… …… ta1 t a1 ta2 t a2 ta3 t a3 ta4 t a4 ta5 t a5 tai1 t ai1 tai2 t ai2 tai3 t ai3 tai4 t ai4 tai5 t ai5 tan1 t an1 tan2 t an2 tan3 t an3 tan4 t an4 tan5 t an5 tang1 t ang1 tang2 t ang2 tang3 t ang3 …… ……) I suggest you add tones because the WER will be higher if you delete them.

However, if you use every possible Mandarin pinyin syllable as a token, there will be too many tokens to learn. So a good choice is to split pinyin syllables into phonemes.


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

@MMMMichaelzhang, is there a tool to convert pinyin to phonemes?


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

@MMMMichaelzhang, is there a tool to convert pinyin to phonemes?

Is this format suitable?
image


MMMMichaelzhang avatar MMMMichaelzhang commented on August 12, 2024

Thanks for your reply. It helps a lot. I am trying to set it up again. @Charlottecuc @yl4579


MMMMichaelzhang avatar MMMMichaelzhang commented on August 12, 2024

@Kristopher-Chen
I didn't find a tool to convert pinyin to phonemes. I just split each syllable into an array.
Maybe like this, with speaker_id set to 0:
/media/mike/yys/data_asr/SSB00800056.wav|w o3 m en1 c an1 j ia1 g uo2 x u3 d uo1 zh ong1 d a4 h uo3 d ong1|0
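The split shown in that line can be sketched with a simple rule: strip a known initial from the front of each toned syllable and keep the remainder (final plus tone) as the second token. This is only one possible approach, not the script actually used here. The names `INITIALS`, `split_syllable` and `split_sentence` are hypothetical, and treating "y" and "w" as initials is an assumption made to match the example line.

```python
# Two-letter initials come first so that "zh" is not matched as "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]
VOWELS = "aeiouvü"  # "v" is a common ASCII stand-in for "ü"


def split_syllable(syllable):
    """Split 'zhong1' into ['zh', 'ong1']; return [syllable] if no initial applies."""
    for initial in INITIALS:
        rest = syllable[len(initial):]
        # Only split when what remains starts with a vowel, i.e. looks like a final.
        if syllable.startswith(initial) and rest and rest[0] in VOWELS:
            return [initial, rest]
    return [syllable]


def split_sentence(pinyin):
    """Convert space-separated toned pinyin into space-separated phoneme tokens."""
    return " ".join(p for s in pinyin.split() for p in split_syllable(s))


print(split_sentence("wo3 men1 can1 jia1 guo2 xu3 duo1 zhong1 da4 huo3 dong1"))
# prints: w o3 m en1 c an1 j ia1 g uo2 x u3 d uo1 zh ong1 d a4 h uo3 d ong1
```

Syllables without an initial, such as "an1" or "er2", are kept as a single token under this rule.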


MMMMichaelzhang avatar MMMMichaelzhang commented on August 12, 2024

Screenshot from 2022-06-18 08-27-07
word_index_dict.txt
train_list.txt
val_list.txt
train.log

My train loss became negative and I don't know why. @yl4579 @Charlottecuc


yl4579 avatar yl4579 commented on August 12, 2024

@MMMMichaelzhang This is expected, see #4


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

@yl4579 Something seems off with the eval loss: although acc is quite high, WER is almost 45% in my case. I used the dict with tones (1-5).
image


yl4579 avatar yl4579 commented on August 12, 2024

@Kristopher-Chen For some reason, your model overfits very badly: your evaluation loss starts to increase after the 40th epoch. You may want to add more data or use data augmentation. An ideal training curve should look like the one in the reply above yours.
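To see where the evaluation loss bottoms out without plotting, the text log can be parsed directly. The sketch below assumes the log layout pasted earlier in this thread ("--- epoch N ---" followed by "name : value" lines); the function names `parse_log` and `best_epoch` are hypothetical, and the three sample epochs are copied from that log.

```python
import re

EPOCH_RE = re.compile(r"^--- epoch (\d+) ---$")
METRIC_RE = re.compile(r"^(\S+)\s*:\s*(-?\d+(?:\.\d+)?)$")


def parse_log(lines):
    """Return {epoch: {metric_name: value}} for blocks like '--- epoch 125 ---'."""
    history, current = {}, None
    for raw in lines:
        line = raw.strip()
        epoch_match = EPOCH_RE.match(line)
        if epoch_match:
            current = history.setdefault(int(epoch_match.group(1)), {})
            continue
        metric_match = METRIC_RE.match(line)
        if metric_match and current is not None:
            current[metric_match.group(1)] = float(metric_match.group(2))
    return history


def best_epoch(history, metric="eval/loss"):
    """Epoch with the lowest recorded value of the given metric."""
    candidates = [epoch for epoch in history if metric in history[epoch]]
    return min(candidates, key=lambda epoch: history[epoch][metric])


log = """--- epoch 125 ---
train/loss : 6.4257
eval/loss : 10.0625
--- epoch 126 ---
train/loss : 6.4752
train/learning_rate: 0.0002
eval/loss : 9.7608
--- epoch 127 ---
train/loss : 6.4884
eval/loss : 10.1831
""".splitlines()

history = parse_log(log)
print(best_epoch(history))  # prints: 126
```

If the best epoch is far behind the last epoch while train/loss keeps falling, that pattern is consistent with the overfitting described above.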


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

@MMMMichaelzhang How many hours of data did you use?


MMMMichaelzhang avatar MMMMichaelzhang commented on August 12, 2024

About 20 hours. @Kristopher-Chen


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

about 20 hours @Kristopher-Chen

It seems too little data was used... LibriTTS includes more than 500 hours of data.


Charlottecuc avatar Charlottecuc commented on August 12, 2024

about 20 hours @Kristopher-Chen

More training data is needed. I used around 400 hours of data and the WER can reach about 0.08 after epoch 80.
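For readers comparing the WER figures quoted in this thread, the metric is conventionally the token-level edit distance divided by the reference length. The sketch below implements that standard definition over space-separated tokens; it is not necessarily how the repository computes eval/wer, and the function name `wer` is chosen only for illustration.

```python
def wer(reference, hypothesis):
    """Token error rate: edit distance between token lists / number of reference tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        current = [i]
        for j, hyp_token in enumerate(hyp, start=1):
            cost = 0 if ref_token == hyp_token else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / max(len(ref), 1)


print(wer("w o3 m en1", "w o3 m an1"))  # prints: 0.25
```

With phoneme tokens, one wrong final in a four-token reference already gives 0.25, so a WER of 0.08 corresponds to fewer than one token error in ten.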


Kristopher-Chen avatar Kristopher-Chen commented on August 12, 2024

More training data is needed. I used around 400 hours of data and the WER can reach about 0.08 after epoch 80.

Yes, thank you! I'm trying to use more training data. BTW, which open-source dataset are you using?


superhg avatar superhg commented on August 12, 2024

@Charlottecuc Did you add a space token between pinyin syllables, like this: ['b', 'iao1', ' ', 'g', 'an1', ' ', 'f', 'ang2', ' ', 'q', 'i3', ' ', 'b', 'i4', ' ', 'r', 'an2', ' ', 't', 'iao2', ' ', 'zh', 'eng3', ' ', 'sh', 'iii4', ' ', 'ch', 'ang3', ' ', 'zh', 'an4', ' ', 'l', 've4']?
