yeyupiaoling / ppasr

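End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to real-world use: a very simple introductory example and a highly practical enterprise-grade project. Supports the currently most popular DeepSpeech2, Conformer, and Squeezeformer models.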

License: Apache License 2.0

Python 97.35% CSS 0.29% JavaScript 1.40% HTML 0.97%
asr paddlepaddle deep-learning chinese speech-to-text speech speech-recognition streaming-asr conformer squeezeformer

ppasr's Introduction

Hello, developers!

Core projects

Project type                 PyTorch version                    PaddlePaddle version                    Notes
Speech recognition           MASR                               PPASR
Voiceprint recognition       VoiceprintRecognition-Pytorch      VoiceprintRecognition-PaddlePaddle
Sound classification         AudioClassification-Pytorch        AudioClassification-PaddlePaddle
Speech emotion recognition   SpeechEmotionRecognition-Pytorch   SpeechEmotionRecognition-PaddlePaddle
Speech synthesis             VITS-Pytorch                       VITS-PaddlePaddle

Speech projects

  1. Speech recognition project implemented with PaddlePaddle dynamic graphs: PPASR
  2. Speech recognition project implemented with PyTorch: MASR
  3. Fine-tuning Whisper models and accelerating inference: Whisper-Finetune
  4. Speech recognition project implemented with PaddlePaddle static graphs: PaddlePaddle-DeepSpeech
  5. Sound classification project implemented with PyTorch: AudioClassification-Pytorch
  6. Sound classification project implemented with PaddlePaddle: AudioClassification-PaddlePaddle
  7. Voiceprint recognition project implemented with PaddlePaddle: VoiceprintRecognition-PaddlePaddle
  8. Voiceprint recognition project implemented with PyTorch: VoiceprintRecognition-Pytorch
  9. Voiceprint recognition project implemented with TensorFlow: VoiceprintRecognition-Tensorflow
  10. Voiceprint recognition project implemented with Keras: VoiceprintRecognition-Keras
  11. Speech emotion recognition implemented with PaddlePaddle: SpeechEmotionRecognition-PaddlePaddle
  12. Speech emotion recognition implemented with PyTorch: SpeechEmotionRecognition-Pytorch
  13. VITS speech synthesis implemented with PaddlePaddle: VITS-PaddlePaddle
  14. VITS speech synthesis implemented with PyTorch: VITS-Pytorch

Vision projects

  1. Face recognition project implemented with PaddlePaddle: PaddlePaddle-MobileFaceNets
  2. Face recognition project implemented with PyTorch: Pytorch-MobileFaceNet
  3. SSD object detection model implemented with PaddlePaddle: PaddlePaddle-SSD
  4. MTCNN face keypoint detection model implemented with PyTorch: Pytorch-MTCNN
  5. MTCNN face keypoint detection model implemented with PaddlePaddle: PaddlePaddle-MTCNN
  6. CRNN text recognition model implemented with PaddlePaddle: PaddlePaddle-CRNN
  7. CrowdNet crowd density model implemented with PaddlePaddle: PaddlePaddle-CrowdNet
  8. Age and gender recognition project implemented with MXNet: Age-Gender-MXNET
  9. Deploying image classification models on Android with the TensorFlow Lite, Paddle Lite, MNN, and TNN frameworks: ClassificationForAndroid
  10. PP-YOLOE model implemented with PaddlePaddle: PP-YOLOE
  11. Face detection, mask recognition, and keypoint detection models deployed on Android: FaceKeyPointsMask
  12. Semantic segmentation model deployed on Android to replace a person's background: ChangeHumanBackground
  13. Face recognition project implemented with TensorFlow: Tensorflow-FaceRecognition

Tutorial series

  1. PaddlePaddle V2 tutorial series: LearnPaddle
  2. PaddlePaddle Fluid tutorial series: LearnPaddle2

Book source code

  1. Source code for the book 《PaddlePaddle从入门到实战》 (PaddlePaddle: From Getting Started to Practice): PaddlePaddleCourse
  2. Source code for the book 《深度学习应用实战之PaddlePaddle》 (Deep Learning Applications in Practice with PaddlePaddle): BookSource

ppasr's Issues

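GPU memory usage keeps growing on the WenetSpeech dataset

Hello,
I have a question. While training on the WenetSpeech dataset, I found that GPU memory usage keeps growing within a single epoch, from 2 GB at the start to 21 GB at the end. Is this normal? My training parameters are as follows: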

add_arg('batch_size',       int,    64,                       'Training batch size')
add_arg('num_workers',      int,    32,                       'Number of threads used to read data')
add_arg('num_epoch',        int,    65,                       'Number of training epochs')
add_arg('learning_rate',    float,  5e-5,                     'Initial learning rate')
add_arg('min_duration',     float,  0.5,                      'Filter out audio shorter than this duration')
add_arg('max_duration',     int,    30,                       'Filter out audio longer than this duration; -1 means no limit')
add_arg('alpha',            float,  2.2,                      'LM coefficient for beam search')
add_arg('beta',             float,  4.3,                      'WC coefficient for beam search')
add_arg('beam_size',        int,    300,                      'Beam search size, range: [5, 500]')
add_arg('num_proc_bsearch', int,    10,                       'Number of CPUs used by beam search')
add_arg('cutoff_prob',      float,  0.99,                     'Pruning probability')
add_arg('cutoff_top_n',     int,    40,                       'Pruning maximum')
add_arg('use_model',        str,    'deepspeech2',              'Model to use')
add_arg('train_manifest',   str,    'dataset/manifest.train',   'Path to the training data manifest')
add_arg('test_manifest',    str,    'dataset/manifest.test',    'Path to the test data manifest')
add_arg('dataset_vocab',    str,    'dataset/vocabulary.txt',   'Path to the vocabulary file')
add_arg('mean_std_path',    str,    'dataset/mean_std.npz',     'Path to the npz file with the dataset mean and standard deviation')
add_arg('augment_conf_path',str,    'conf/augmentation.json',   'Data augmentation config file, in JSON format')
add_arg('save_model_path',  str,    'models/',                  'Path where models are saved')
add_arg('decoder',          str,    'ctc_greedy',               'Decoding method', choices=['ctc_beam_search', 'ctc_greedy'])
add_arg('lang_model_path',  str,    'lm/zh_giga.no_cna_cmn.prune01244.klm',        "Path to the language model file")
add_arg('resume_model',     str,    None,                       'Resume training; when None, no pretrained model is used')
add_arg('pretrained_model', str,    None,                       'Path to the pretrained model; when None, no pretrained model is used')

python train.py exits without showing any error, and no models folder is created.

(asr) D:\ASR>python train.py
-----------  Configuration Arguments -----------
alpha: 2.2
augment_conf_path: conf/augmentation.json
batch_size: 32
beam_size: 300
beta: 4.3
cutoff_prob: 0.99
cutoff_top_n: 40
dataset_vocab: dataset/vocabulary.txt
decoder: ctc_beam_search
feature_method: linear
lang_model_path: lm/zh_giga.no_cna_cmn.prune01244.klm
learning_rate: 5e-05
max_duration: 20
mean_std_path: dataset/mean_std.npz
metrics_type: cer
min_duration: 0.5
num_epoch: 65
num_proc_bsearch: 10
num_workers: 8
pretrained_model: None
resume_model: None
save_model_path: models/
test_manifest: dataset/manifest.test
train_manifest: dataset/manifest.train
use_model: deepspeech2
------------------------------------------------
dataset/manifest.noise不存在,已经忽略噪声增强操作!
[2022-02-12 12:55:37.853290] 数据增强配置:{'type': 'speed', 'aug_type': 'audio', 'params': {'min_speed_rate': 0.9, 'max_speed_rate': 1.1, 'num_rates': 3}, 'prob': 1.0}
[2022-02-12 12:55:37.853290] 数据增强配置:{'type': 'shift', 'aug_type': 'audio', 'params': {'min_shift_ms': -5, 'max_shift_ms': 5}, 'prob': 1.0}
[2022-02-12 12:55:37.853290] 数据增强配置:{'type': 'volume', 'aug_type': 'audio', 'params': {'min_gain_dBFS': -15, 'max_gain_dBFS': 15}, 'prob': 1.0}
[2022-02-12 12:55:37.861290] 数据增强配置:{'type': 'specaug', 'aug_type': 'feature', 'params': {'W': 0, 'warp_mode': 'PIL', 'F': 10, 'n_freq_masks': 2, 'T': 50, 'n_time_masks': 2, 'p': 1.0, 'adaptive_number_ratio': 0, 'adaptive_size_ratio': 0, 'max_n_time_masks': 20, 'replace_with_zero': True}, 'prob': 1.0}
D:\XXXX\lib\site-packages\paddle\fluid\reader.py:355: UserWarning: DataLoader with multi-process mode is not supported on MacOs and Windows currently. Please use signle-process mode with num_workers = 0 instead
  warnings.warn(
W0212 12:55:37.974066 17176 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.6, Runtime API Version: 10.2
W0212 12:55:37.990077 17176 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-02-12 12:55:41.438553] 训练数据:13361
[2022-02-12 12:55:42.453420] Train epoch: [1/65], batch: [0/417], loss: 855.68878, learning rate: 0.00005000, eta: 7:38:27
**It exits right here, for no apparent reason.**
(asr) D:\ASR>
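
The log above includes Paddle's warning that multi-process DataLoader mode is not supported on macOS and Windows, with the suggestion to use num_workers = 0. Whether that warning is related to the silent exit is not established here, but it is cheap to rule out. A minimal sketch of choosing the worker count by platform; `pick_num_workers` is a hypothetical helper and not part of PPASR:

```python
import platform


def pick_num_workers(requested: int, system: str = "") -> int:
    """Return 0 on Windows/macOS, where the warning says multi-process loading is unsupported."""
    system = system or platform.system()
    if system in ("Windows", "Darwin"):
        return 0
    return max(0, requested)


print(pick_num_workers(8, system="Windows"))  # 0
print(pick_num_workers(8, system="Linux"))    # 8
```

The returned value would then be passed wherever the script takes its num_workers setting.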

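Why does the loss suddenly become NaN during training?

I have recently been trying to train a PPASR model using the WenetSpeech dataset together with the Aishell, Free ST-Chinese-Mandarin-Corpus, and THCHS-30 datasets. During training the loss suddenly becomes NaN while training keeps running. I lowered the learning rate, but it did not help. How can I determine what is causing this? My training configuration is shown below.
I am doing single-GPU training inside a Docker container with a GPU.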

-----------  Configuration Arguments -----------
alpha: 2.2
augment_conf_path: conf/augmentation.json
batch_size: 128
beam_size: 300
beta: 4.3
cutoff_prob: 0.99
cutoff_top_n: 40
dataset_vocab: dataset/vocabulary.txt
decoder: ctc_beam_search
lang_model_path: lm/zhidao_giga.klm
learning_rate: 5e-05
max_duration: 20
mean_std_path: dataset/mean_std.npz
min_duration: 0.5
num_epoch: 65
num_proc_bsearch: 10
num_workers: 6
pretrained_model: None
resume_model: None
save_model_path: models/
test_manifest: dataset/manifest.test
train_manifest: dataset/manifest.train
use_model: deepspeech2
------------------------------------------------

The image below shows the loss becoming NaN during training:
[image]

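Installing the packages in requirements fails

The packages visualdl, cn2an, zhconv, paddlespeech_feat, and webrtcvad cannot be installed, and I cannot find them on the Anaconda website either. Could you give me some pointers? Thanks.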

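mean_std.npz is not generated after running create_data.py

Hello, after I run create_data.py, mean_std.npz is not generated, and train.py then fails because the file is missing. I hope you can help. Thanks!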

E:\PyCharm2020\PycharmProjects\PPASR\Virtualenv_Environment\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
  0%|          | 0/13388 [00:00<?, ?it/s]-----------  Configuration Arguments -----------
annotation_path: dataset/annotation/
count_threshold: 2
dataset_vocab: dataset/vocabulary.txt
feature_method: linear
is_change_frame_rate: True
max_test_manifest: 10000
mean_std_path: dataset/mean_std.npz
noise_manifest_path: dataset/manifest.noise
noise_path: dataset/audio/noise
num_samples: 1000000
num_workers: 8
test_manifest: dataset/manifest.test
train_manifest: dataset/manifest.train
------------------------------------------------
开始生成数据列表...
100%|██████████| 13388/13388 [00:27<00:00, 489.01it/s]
完成生成数据列表,数据集总长度为34.16小时!
======================================================================
开始生成噪声数据列表...
噪声音频文件为空,已跳过!
======================================================================
开始生成数据字典...
100%|██████████| 13361/13361 [00:00<00:00, 16198.03it/s]
100%|██████████| 27/27 [00:00<00:00, 13728.48it/s]
数据字典生成完成!
======================================================================
开始抽取1000000条数据计算均值和标准值...
E:\PyCharm2020\PycharmProjects\PPASR\Virtualenv_Environment\lib\site-packages\paddle\fluid\reader.py:356: UserWarning: DataLoader with multi-process mode is not supported on MacOs and Windows currently. Please use signle-process mode with num_workers = 0 instead
  "DataLoader with multi-process mode is not supported on MacOs and Windows currently." \
100%|██████████| 209/209 [01:45<00:00,  1.50it/s]
进程已结束,退出代码 -1073741819 (0xC0000005)
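
The two numbers on the last line are the same value: the signed exit code -1073741819 reinterpreted as an unsigned 32-bit integer is 0xC0000005, the Windows status code for an access violation. That points to a crash in native code rather than a Python exception, which is why no traceback appears. A small sketch of the conversion (`to_ntstatus` is just an illustrative name):

```python
def to_ntstatus(exit_code: int) -> str:
    """Format a signed 32-bit Windows exit code as an unsigned hex value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741819))  # 0xC0000005
```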

Training loss is nan?

[2021-07-14 02:09:40.075967] Train epoch: 0, batch: 750/34799, loss: 6.70559, learning rate: 0.0001, train time: 4.319s
完成第750的保存
[2021-07-14 02:10:20.735872] Train epoch: 0, batch: 760/34799, loss: 6.64842, learning rate: 0.0001, train time: 4.218s
完成第760的保存
[2021-07-14 02:11:05.361007] Train epoch: 0, batch: 770/34799, loss: 6.76281, learning rate: 0.0001, train time: 6.372s
完成第770的保存
[2021-07-14 02:11:53.009762] Train epoch: 0, batch: 780/34799, loss: 6.61592, learning rate: 0.0001, train time: 3.155s
完成第780的保存
[2021-07-14 02:12:37.086634] Train epoch: 0, batch: 790/34799, loss: nan, learning rate: 0.0001, train time: 3.768s
完成第790的保存
[2021-07-14 02:13:21.365857] Train epoch: 0, batch: 800/34799, loss: nan, learning rate: 0.0001, train time: 3.402s
完成第800的保存
[2021-07-14 02:14:00.991322] Train epoch: 0, batch: 810/34799, loss: nan, learning rate: 0.0001, train time: 3.514s
完成第810的保存
[2021-07-14 02:14:50.522749] Train epoch: 0, batch: 820/34799, loss: nan, learning rate: 0.0001, train time: 3.567s
完成第820的保存
[2021-07-14 02:15:37.339313] Train epoch: 0, batch: 830/34799, loss: nan, learning rate: 0.0001, train time: 3.885s
How to deal with it?
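
A first step is to pin down where the loss stops being finite. The sketch below scans log lines in the format shown above and reports the last batch with a finite loss and the first batch with a non-finite one. The regular expression is written against this particular log format and is an assumption for any other version of the training script.

```python
import math
import re

LOSS_RE = re.compile(r"batch: (\d+)/\d+, loss: ([^,]+),")


def find_nan_onset(log_lines):
    """Return (last_finite_batch, first_non_finite_batch); either may be None."""
    last_ok = None
    for line in log_lines:
        match = LOSS_RE.search(line)
        if not match:
            continue  # skip lines without a loss value, e.g. checkpoint messages
        batch, loss = int(match.group(1)), float(match.group(2))
        if not math.isfinite(loss):
            return last_ok, batch
        last_ok = batch
    return last_ok, None


log = [
    "[2021-07-14 02:11:53.009762] Train epoch: 0, batch: 780/34799, loss: 6.61592, learning rate: 0.0001, train time: 3.155s",
    "[2021-07-14 02:12:37.086634] Train epoch: 0, batch: 790/34799, loss: nan, learning rate: 0.0001, train time: 3.768s",
]
print(find_nan_onset(log))  # (780, 790)
```

For the log above this narrows the problem to the batches after 780 and up to 790, so the audio files and transcripts in that range are the ones worth inspecting first.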

Error when running the code on AI Studio

from ppasr.trainer import PPASRTrainer

trainer = PPASRTrainer(mean_std_path="dataset/mean_std.npz",
                       train_manifest="dataset/manifest.train",
                       test_manifest="dataset/manifest.test",
                       dataset_vocab="dataset/vocabulary.txt",
                       num_workers=2)
trainer.create_data(annotation_path="dataset/annotation/")

The error message is as follows:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_17791/1498183113.py in <module>
      7                        num_workers=2)
      8 
----> 9 trainer.create_data(annotation_path="dataset/annotation/")

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ppasr/trainer.py in create_data(self, annotation_path, noise_manifest_path, noise_path, num_samples, count_threshold, is_change_frame_rate, max_test_manifest)
    108                         test_manifest_path=self.test_manifest,
    109                         is_change_frame_rate=is_change_frame_rate,
--> 110                         max_test_manifest=max_test_manifest)
    111         print('=' * 70)
    112         print('开始生成噪声数据列表...')

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ppasr/utils/utils.py in create_manifest(annotation_path, train_manifest_path, test_manifest_path, is_change_frame_rate, max_test_manifest)
     59             # 重新调整音频格式并保存
     60             if is_change_frame_rate:
---> 61                 change_rate(audio_path)
     62             # 获取音频长度
     63             audio_data, samplerate = soundfile.read(audio_path)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ppasr/utils/utils.py in change_rate(audio_path)
    103 # 改变音频采样率为16000Hz
    104 def change_rate(audio_path):
--> 105     data, sr = soundfile.read(audio_path)
    106     if sr != 16000:
    107         data = librosa.resample(data, sr, target_sr=16000)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py in read(file, frames, start, stop, dtype, always_2d, fill_value, out, samplerate, channels, format, subtype, endian, closefd)
    255     """
    256     with SoundFile(file, 'r', samplerate, channels,
--> 257                    subtype, endian, format, closefd) as f:
    258         frames = f._prepare_read(start, stop, frames)
    259         data = f.read(frames, dtype, always_2d, fill_value, out)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py in __init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
    627         self._info = _create_info_struct(file, mode, samplerate, channels,
    628                                          format, subtype, endian)
--> 629         self._file = self._open(file, mode_int, closefd)
    630         if set(mode).issuperset('r+') and self.seekable():
    631             # Move write position to 0 (like in Python file objects)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py in _open(self, file, mode_int, closefd)
   1182             raise TypeError("Invalid file: {0!r}".format(self.name))
   1183         _error_check(_snd.sf_error(file_ptr),
-> 1184                      "Error opening {0!r}: ".format(self.name))
   1185         if mode_int == _snd.SFM_WRITE:
   1186             # Due to a bug in libsndfile version <= 1.0.25, frames != 0

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py in _error_check(err, prefix)
   1355     if err != 0:
   1356         err_str = _snd.sf_error_number(err)
-> 1357         raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
   1358 
   1359 

RuntimeError: Error opening 'aset/audio/data_aishell/wav/test/S0764/BAC009S0764W0201.wav': System error.
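
Note that the path in the error starts with 'aset/' rather than 'dataset/', so the file being opened may simply not exist at that location. A minimal sketch for checking every audio path before calling create_data, assuming each annotation line has the form audio_path<TAB>transcript (an assumption about the annotation format; `find_missing_audio` is a hypothetical helper):

```python
import os


def find_missing_audio(annotation_lines, base_dir="."):
    """Return the audio paths from annotation lines that do not exist under base_dir."""
    missing = []
    for line in annotation_lines:
        line = line.strip()
        if not line:
            continue
        audio_path = line.split("\t", 1)[0]
        if not os.path.isfile(os.path.join(base_dir, audio_path)):
            missing.append(audio_path)
    return missing


# Path copied from the error message above; the transcript is a placeholder.
sample = ["aset/audio/data_aishell/wav/test/S0764/BAC009S0764W0201.wav\tplaceholder text"]
print(find_missing_audio(sample))
```

Any path printed here is one that soundfile would fail to open for the same reason.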

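How does speech synthesis use the custom audio under speaker_audio?

"Put the voices of the speakers you need in the tools/generate_audio/speaker_audio directory. You can use the dataset/test.wav file, or find audio from several people and put it in the tools/generate_audio/speaker_audio directory. Developers can also try putting their own audio in this directory, so that the trained model recognizes the developer's voice better. A sample rate of 16000 Hz is preferred."

I could not find any code that uses the audio in the speaker_audio directory. As far as I can see, the speakers of the synthesized audio all come from models/fastspeech2_nosil_aishell3_ckpt_0.4/speaker_id_map.txt in the downloaded model. Is this planned for later development, or did I misread the code? I would appreciate your clarification!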

scipy==1.6.1 ??

scipy==1.6.1 ??

ERROR: Could not find a version that satisfies the requirement scipy==1.6.1 (from versions: 0.8.0, 0.9.0, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.12.1, 0.13.0, 0.13.1, 0.13.2, 0.13.3, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.16.1, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0, 0.19.1, 1.0.0b1, 1.0.0rc1, 1.0.0rc2, 1.0.0, 1.0.1, 1.1.0rc1, 1.1.0, 1.2.0rc1, 1.2.0rc2, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.3.0rc1, 1.3.0rc2, 1.3.0, 1.3.1, 1.3.2, 1.3.3, 1.4.0rc1, 1.4.0rc2, 1.4.0, 1.4.1, 1.5.0rc1, 1.5.0rc2, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4)
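
The list of candidate versions stops at 1.5.4. SciPy 1.6 requires Python 3.7 or newer, so a likely explanation (not confirmed in this thread) is that pip is running under an older interpreter and only sees releases compatible with it. A quick check, using a hypothetical helper name:

```python
import sys


def can_install_scipy_1_6(version_info=sys.version_info) -> bool:
    """SciPy 1.6 requires Python 3.7 or newer."""
    return tuple(version_info[:2]) >= (3, 7)


print(can_install_scipy_1_6((3, 6)))  # False
print(can_install_scipy_1_6((3, 7)))  # True
```

Calling `can_install_scipy_1_6()` with no arguments checks the interpreter that is actually running.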

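Question about WenetSpeech training time

Hello,
I am training on the WenetSpeech dataset following the project instructions, currently on a single 3090 GPU. According to the log, finishing all 65 epochs would take at least half a year, and GPU utilization is not very high. I am a beginner and would like to ask where the main bottleneck might be, and whether there are any ways to optimize this, including upgrading hardware.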

inference

I found that the accuracy may be worse than before. What do you think?

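Language model question

I would like to strengthen the existing language model for proper nouns. I have two ideas:
The first is to directly modify the n-gram probabilities in the language model, but klm does not seem to support modification.
The second is to add text to the original corpus and retrain, but I do not know where the corpus comes from (it seems to be an internal Baidu corpus).

So I would like to ask: is there a good way to implement this?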

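Error during GPU inference: Aborted at 1637208819 (unix time) try "date -d @1637208819" if you are using GNU date

Hello,

The earlier steps (data preparation, model training, evaluation, and model export) all completed normally.

However, during quick inference, after running python infer_path.py --wav_path=./dataset/test.wav, the following error appears: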

root@pp:~/PPASR# python export_model.py --resume_model=models/deepspeech2/epoch_50/
/usr/local/lib/python3.7/dist-packages/numba/types/__init__.py:110: DeprecationWarning: `np.long` is a deprecated alias for `np.compat.long`. To silence this warning, use `np.compat.long` by itself. In the likely event your code does not need to work on Python 2 you can use the builtin `int` for which `np.compat.long` is itself an alias. Doing this will not modify any behaviour and is safe. When replacing `np.long`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  long_ = _make_signed(np.long)
/usr/local/lib/python3.7/dist-packages/numba/types/__init__.py:111: DeprecationWarning: `np.long` is a deprecated alias for `np.compat.long`. To silence this warning, use `np.compat.long` by itself. In the likely event your code does not need to work on Python 2 you can use the builtin `int` for which `np.compat.long` is itself an alias. Doing this will not modify any behaviour and is safe. When replacing `np.long`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ulong = _make_unsigned(np.long)
/usr/local/lib/python3.7/dist-packages/librosa/cache.py:49: DeprecationWarning: The 'cachedir' attribute has been deprecated in version 0.12 and will be remo
root@pp:~/PPASR# python infer_path.py --wav_path=./dataset/test.wav
/usr/local/lib/python3.7/dist-packages/numba/types/__init__.py:110: DeprecationWarning: `np.long` is a deprecated alias for `np.compat.long`. To silence this warning, use `np.compat.long` by itself. In the likely event your code does not need to work on Python 2 you can use the builtin `int` for which `np.compat.long` is itself an alias. Doing this will not modify any behaviour and is safe. When replacing `np.long`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  long_ = _make_signed(np.long)
/usr/local/lib/python3.7/dist-packages/numba/types/__init__.py:111: DeprecationWarning: `np.long` is a deprecated alias for `np.compat.long`. To silence this warning, use `np.compat.long` by itself. In the likely event your code does not need to work on Python 2 you can use the builtin `int` for which `np.compat.long` is itself an alias. Doing this will not modify any behaviour and is safe. When replacing `np.long`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ulong = _make_unsigned(np.long)
/usr/local/lib/python3.7/dist-packages/librosa/cache.py:49: DeprecationWarning: The 'cachedir' attribute has been deprecated in version 0.12 and will be removed in version 0.14.
Use os.path.join(memory.location, 'joblib') attribute instead.
  if self.cachedir is not None and self.level >= level:
/usr/local/lib/python3.7/dist-packages/librosa/cache.py:49: DeprecationWarning: The 'cachedir' attribute has been deprecated in version 0.12 and will be removed in version 0.14.
Use os.path.join(memory.location, 'joblib') attribute instead.
  if self.cachedir is not None and self.level >= level:
-----------  Configuration Arguments -----------
alpha: 1.2
beam_size: 10
beta: 0.35
cutoff_prob: 1.0
cutoff_top_n: 40
decoder: ctc_greedy
is_long_audio: False
lang_model_path: lm/zh_giga.no_cna_cmn.prune01244.klm
model_dir: models/deepspeech2/infer/
to_an: True
use_gpu: True
vocab_path: dataset/vocabulary.txt
wav_path: ./dataset/test.wav
------------------------------------------------


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::framework::SignalHandle(char const*, int)
1   paddle::platform::GetCurrentTraceBackString[abi:cxx11]()

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1637207454 (unix time) try "date -d @1637207454" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 10680 (TID 0x7f258fdb6740) from PID 0 ***]

Segmentation fault

Environment:

OS: Ubuntu 18.04 64-bit
GPU: NVIDIA P100 (single card)
Driver: 10.2.89
Memory: 60 GB
Python: 3.7.5
PaddlePaddle: 2.1.3 (installed via pip)
Project: PPASR
Model: thchs_30 (34 hours)
Noise model: none

NVIDIA information:

root@pp:~/PPASR# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
root@pp:~/PPASR# nvidia-smi
Thu Nov 18 12:05:23 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:00:08.0 Off |                    0 |
| N/A   27C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

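I run into this problem with GPU inference in both the PaddlePaddle-DeepSpeech and PPASR projects, while CPU inference works fine. GPU training also works fine.

What causes this problem, and could you tell me how to solve it?

Thank you for taking the time to look at this.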

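Punctuation

How can punctuation be added based on pauses? Is word segmentation needed?

train.py does not use the GPU for training

Hello, when I run train.py, training runs on the CPU and does not use the GPU. I checked with paddle.fluid.install_check.run_check(), which reports that Paddle runs well on GPU or CPU. I hope you can help. Thanks!

MFCC dimension

I noticed that 128 dimensions are used: mfccs = librosa.feature.mfcc(y=wav, sr=sr, n_mfcc=128, n_fft=512, hop_length=128).astype("float32"), but most references use 13 dimensions. How is this 128 obtained? Is it 128 triangular filters? What is the purpose?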

Test CER is always 1.0

Hello, I trained on the THCHS-30 training set for 50 epochs, but the test CER is always 1.0. Why is that?
I used the default parameters without any adjustment.

Machine:
Ubuntu 16.04
GPU: Titan Xp
12 GB GPU memory
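
For reference, character error rate is commonly defined as the edit distance between the reference and the hypothesis, divided by the reference length. Under that definition, a CER of exactly 1.0 is what an empty prediction for every utterance would produce. The sketch below is a generic implementation for non-empty references; whether PPASR computes its metric exactly this way is an assumption.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    return edit_distance(ref, hyp) / len(ref)


print(cer("abcd", "abcd"))  # 0.0
print(cer("abcd", ""))      # 1.0
print(cer("abcd", "abxd"))  # 0.25
```

Printing a few raw predictions next to their references shows quickly whether the model is emitting empty strings or unrelated characters.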

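vocabulary.txt and mean_std.npz files

Hello, I downloaded your pretrained model and want to export it for inference. How do I obtain these two files?
add_arg('dataset_vocab', str, 'dataset/vocabulary.txt', 'Path to the vocabulary file')
add_arg('mean_std_path', str, 'dataset/mean_std.npz', 'Path to the npz file with the dataset mean and standard deviation')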

Input shape problem in the last linear layer during training

AssertionError: Variable Shape not match, Variable [ linear_0.w_0_moment1_0 ] need tensor with shape (1024, 563) but load set tensor with shape (1024, 564)
After running create_data.py the vocabulary has 563 entries, but training fails with this error in the first epoch. I dug further:
In the _check_match(key, param) method in python3.7/site-packages/paddle/fluid/dygraph/layers.py, state_dict.get(key, None) has shape 564 when key is output.bias or output.weight, while the last few params in for key, param in self.state_dict().items(): have shape 563, matching the vocabulary. That is what causes the conflict.
Could you please take a look? Many thanks.

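Training on my own dataset

Hello, I trained on a dataset I prepared myself and found that the CER stops improving at around 0.24. Recognition quality is mediocre, not as good as with the model trained on THCHS-30.
I prepared about 1,000 hours of data for training.
When training on our own dataset, do we need to adjust any parameters?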

What should I do when single-machine multi-GPU training fails?

(paddle_env) D:\python\PPASR__TEST\PPASR>python -m paddle.distributed.launch --gpus '0,1' train.py
-----------  Configuration Arguments -----------
backend: auto
elastic_server: None
force: False
gpus: '0,1'
heter_devices:
heter_worker_num: None
heter_workers:
host: None
http_port: None
ips: 127.0.0.1
job_id: None
log_dir: log
np: None
nproc_per_node: None
run_mode: None
scale: 0
server_num: None
servers:
training_script: train.py
training_script_args: []
worker_num: None
workers:
------------------------------------------------
WARNING 2022-02-16 18:25:24,619 launch.py:423] Not found distinct arguments and compiled with cuda or xpu. Default use collective mode
launch train in GPU mode!
INFO 2022-02-16 18:25:24,621 launch_utils.py:528] Local start 2 processes. First process distributed environment info (Only For Debug):
    +=======================================================================================+
    |                        Distributed Envs                      Value                    |
    +---------------------------------------------------------------------------------------+
    |                       PADDLE_TRAINER_ID                        0                      |
    |                 PADDLE_CURRENT_ENDPOINT                 127.0.0.1:55021               |
    |                     PADDLE_TRAINERS_NUM                        2                      |
    |                PADDLE_TRAINER_ENDPOINTS         127.0.0.1:55021,127.0.0.1:55022       |
    |                     PADDLE_RANK_IN_NODE                        0                      |
    |                 PADDLE_LOCAL_DEVICE_IDS                       '0                      |
    |                 PADDLE_WORLD_DEVICE_IDS                      '0,1'                    |
    |                     FLAGS_selected_gpus                       '0                      |
    |             FLAGS_selected_accelerators                       '0                      |
    +=======================================================================================+

INFO 2022-02-16 18:25:24,621 launch_utils.py:532] details abouts PADDLE_TRAINER_ENDPOINTS can be found in log/endpoints.log, and detail running logs maybe found in log/workerlog.0
子目录或文件 -p 已经存在。
处理: -p 时出错。
子目录或文件 log 已经存在。
处理: log 时出错。
子目录或文件 -p 已经存在。
处理: -p 时出错。
子目录或文件 log 已经存在。
处理: log 时出错。
launch proc_id:16392 idx:0
launch proc_id:18756 idx:1
Traceback (most recent call last):
  File "train.py", line 4, in <module>
    from ppasr.trainer import PPASRTrainer
  File "D:\python\PPASR__TEST\PPASR\ppasr\trainer.py", line 11, in <module>
    import paddle
  File "D:\python\anaconda3\envs\paddle_env\lib\site-packages\paddle\__init__.py", line 293, in <module>
    from .hapi import Model  # noqa: F401
  File "D:\python\anaconda3\envs\paddle_env\lib\site-packages\paddle\hapi\__init__.py", line 25, in <module>
    logger.setup_logger()
  File "D:\python\anaconda3\envs\paddle_env\lib\site-packages\paddle\hapi\logger.py", line 47, in setup_logger
    local_rank = ParallelEnv().local_rank
  File "D:\python\anaconda3\envs\paddle_env\lib\site-packages\paddle\fluid\dygraph\parallel.py", line 121, in __init__
    self._device_id = int(selected_gpus[0])
ValueError: invalid literal for int() with base 10: "'0"
INFO 2022-02-16 18:25:30,788 launch_utils.py:341] terminate all the procs
ERROR 2022-02-16 18:25:30,788 launch_utils.py:604] ABORT!!! Out of all 2 trainers, the trainer process with rank=[0, 1] was aborted. Please check its log.
INFO 2022-02-16 18:25:33,790 launch_utils.py:341] terminate all the procs
INFO 2022-02-16 18:25:33,790 launch.py:311] Local processes completed.
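
The traceback shows int() receiving "'0": the single quotes from --gpus '0,1' reached Python as literal characters, which is expected in the Windows command prompt because it does not treat single quotes as quoting characters. The sketch below reproduces the failure and shows a tolerant parser; `parse_gpu_ids` is a hypothetical helper, not part of Paddle's launcher.

```python
def parse_gpu_ids(raw: str):
    """Split a comma-separated GPU list, tolerating stray quote characters."""
    cleaned = raw.strip().strip("'\"")
    return [int(part) for part in cleaned.split(",") if part.strip()]


# What effectively happens when the quotes are kept as part of the value:
try:
    int("'0,1'".split(",")[0])
except ValueError as err:
    print(err)  # invalid literal for int() with base 10: "'0"

print(parse_gpu_ids("'0,1'"))  # [0, 1]
```

Since the launcher itself cannot be patched this way from the command line, the simplest thing to try is passing the value without quotes, for example --gpus 0,1.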

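Long audio files cannot be recognized

I am using my own WAV audio files, but recognition always comes out wrong. Short audio can be recognized, but without punctuation. Also, the long audio file from the demo seems to be gone.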

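How to call the model for real-time speech recognition

The changelog mentions "2021.11.30: fully switched to streaming speech recognition models", so real-time speech recognition should be supported?

Is there any sample calling code for reference? Thanks 🙏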

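Extraction fails after downloading the dataset

Hello, I set filepath to the absolute path of the dataset and then ran aishell.py, which raised: EOFError: Compressed file ended before the end-of-stream marker was reached. What is the problem?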
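
This EOFError is raised by Python's decompression code when a compressed stream ends early, which typically means the archive is truncated, for example by an interrupted download. A minimal sketch for checking an archive before extracting it, assuming the dataset file is a tar-based archive such as .tgz; `archive_is_complete` is a hypothetical helper:

```python
import tarfile


def archive_is_complete(path: str) -> bool:
    """Read every member of a tar archive to detect truncation or corruption."""
    try:
        with tarfile.open(path, "r:*") as tar:
            for member in tar:
                if member.isfile():
                    fileobj = tar.extractfile(member)
                    # Read in 1 MiB chunks so large audio archives do not fill memory.
                    while fileobj.read(1024 * 1024):
                        pass
        return True
    except (EOFError, tarfile.TarError, OSError):
        return False
```

If this returns False for the downloaded file, downloading it again and comparing the file size with the published size is the usual next step.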

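VisualDL reports "no visualization results to display"

Hello,
While training on WenetSpeech I wanted to view the training log, but after running visualdl --logdir=./log --host=0.0.0.0 or visualdl --logdir=log --host=0.0.0.0, the VisualDL page says there are no visualization results to display. Apart from changing some training parameters, I have not modified the cloned code. I am not very familiar with the PaddlePaddle framework, so I would like to ask what might be causing this.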

With the GPU, the program hangs for a long time and finally terminates with Process finished with exit code -1073741819 (0xC0000005)

E:\Users\ikun\anaconda3\envs\speech\python.exe E:/Users/ikun/PycharmProjects/Speech_AI/MASR-master/infer_path.py
E:\Users\ikun\anaconda3\envs\speech\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
----------- Configuration Arguments -----------
alpha: 2.2
beam_size: 30
beta: 4.3
cutoff_prob: 0.99
cutoff_top_n: 40
decoder: ctc_beam_search
is_long_audio: False
lang_model_path: lm/zh_giga.no_cna_cmn.prune01244.klm
model_path: models/deepspeech2/inference.pt
real_time_demo: False
to_an: False
use_gpu: True
use_model: deepspeech2
vocab_path: dataset/vocabulary.txt
wav_path: ./dataset/040.wav

==================================================================
缺少 paddlespeech-ctcdecoders 库,请根据文档安装,如果是Windows系统,只能使用ctc_greedy。
【注意】已自动切换为ctc_greedy解码器。

W0111 22:20:02.530877 1384 analysis_predictor.cc:1353] Deprecated. Please use CreatePredictor instead.

Process finished with exit code -1073741819 (0xC0000005)

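The test sample is predicted incorrectly when exporting and predicting with the pre-downloaded model

When I export the pre-downloaded model and run prediction on test.wav, the output is: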

D:\anaconda3\python.exe F:/PPASR-master/infer_path.py
D:\anaconda3\lib\site-packages\librosa\core\constantq.py:1058: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.complex,
D:\anaconda3\lib\site-packages\pydub-0.25.1-py3.9.egg\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)

The paddlespeech-ctcdecoders library is missing; please install it. On Windows, only ctc_greedy can be used.
[Note] Automatically switched to the ctc_greedy decoder.

-----------  Configuration Arguments -----------
alpha: 2.2
beam_size: 300
beta: 4.3
cutoff_prob: 0.99
cutoff_top_n: 40
decoder: ctc_beam_search
feature_method: linear
is_long_audio: False
lang_model_path: lm/zh_giga.no_cna_cmn.prune01244.klm
model_dir: models/deepspeech2/infer/
pun_model_dir: models/pun_models/
real_time_demo: False
to_an: False
use_gpu: True
use_model: deepspeech2
use_pun: False
vocab_path: dataset/vocabulary.txt
wav_path: dataset/test.wav
------------------------------------------------
Time elapsed: 164ms, recognition result: 逐屈肮霸故罅咽罅鳟物鳟马悖鳟忑敌茧忑龙茧忑物裁唱勤疡掩物婷忑马窗马物鳟疡层悖忑混忑恐物层, score: 0

The model export output was:
-----------  Configuration Arguments -----------
dataset_vocab: dataset/vocabulary.txt
feature_method: linear
mean_std_path: dataset/mean_std.npz
resume_model: models/deepspeech2/epoch_50/
save_model: models/
use_model: deepspeech2
------------------------------------------------
W0121 01:57:48.756676 15560 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 11.2, Runtime API Version: 10.2
W0121 01:57:48.765652 15560 device_context.cc:465] device: 0, cuDNN Version: 7.6.
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for conv.conv1.conv.weight. conv.conv1.conv.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for conv.conv1.conv.bias. conv.conv1.conv.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for conv.conv2.conv.weight. conv.conv2.conv.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for conv.conv2.conv.bias. conv.conv2.conv.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.weight_ih_l0. rnn.rnn.0.rnn.weight_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.weight_hh_l0. rnn.rnn.0.rnn.weight_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.bias_ih_l0. rnn.rnn.0.rnn.bias_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.bias_hh_l0. rnn.rnn.0.rnn.bias_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.0.cell.weight_ih. rnn.rnn.0.rnn.0.cell.weight_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.0.cell.weight_hh. rnn.rnn.0.rnn.0.cell.weight_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.0.cell.bias_ih. rnn.rnn.0.rnn.0.cell.bias_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.rnn.0.cell.bias_hh. rnn.rnn.0.rnn.0.cell.bias_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.norm.weight. rnn.rnn.0.norm.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.0.norm.bias. rnn.rnn.0.norm.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.weight_ih_l0. rnn.rnn.1.rnn.weight_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.weight_hh_l0. rnn.rnn.1.rnn.weight_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.bias_ih_l0. rnn.rnn.1.rnn.bias_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.bias_hh_l0. rnn.rnn.1.rnn.bias_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.0.cell.weight_ih. rnn.rnn.1.rnn.0.cell.weight_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.0.cell.weight_hh. rnn.rnn.1.rnn.0.cell.weight_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.0.cell.bias_ih. rnn.rnn.1.rnn.0.cell.bias_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.rnn.0.cell.bias_hh. rnn.rnn.1.rnn.0.cell.bias_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.norm.weight. rnn.rnn.1.norm.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.1.norm.bias. rnn.rnn.1.norm.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.weight_ih_l0. rnn.rnn.2.rnn.weight_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.weight_hh_l0. rnn.rnn.2.rnn.weight_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.bias_ih_l0. rnn.rnn.2.rnn.bias_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.bias_hh_l0. rnn.rnn.2.rnn.bias_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.0.cell.weight_ih. rnn.rnn.2.rnn.0.cell.weight_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.0.cell.weight_hh. rnn.rnn.2.rnn.0.cell.weight_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.0.cell.bias_ih. rnn.rnn.2.rnn.0.cell.bias_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.rnn.0.cell.bias_hh. rnn.rnn.2.rnn.0.cell.bias_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.norm.weight. rnn.rnn.2.norm.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.2.norm.bias. rnn.rnn.2.norm.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.weight_ih_l0. rnn.rnn.3.rnn.weight_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.weight_hh_l0. rnn.rnn.3.rnn.weight_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.bias_ih_l0. rnn.rnn.3.rnn.bias_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.bias_hh_l0. rnn.rnn.3.rnn.bias_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.0.cell.weight_ih. rnn.rnn.3.rnn.0.cell.weight_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.0.cell.weight_hh. rnn.rnn.3.rnn.0.cell.weight_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.0.cell.bias_ih. rnn.rnn.3.rnn.0.cell.bias_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.rnn.0.cell.bias_hh. rnn.rnn.3.rnn.0.cell.bias_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.norm.weight. rnn.rnn.3.norm.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.3.norm.bias. rnn.rnn.3.norm.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.weight_ih_l0. rnn.rnn.4.rnn.weight_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.weight_hh_l0. rnn.rnn.4.rnn.weight_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.bias_ih_l0. rnn.rnn.4.rnn.bias_ih_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.bias_hh_l0. rnn.rnn.4.rnn.bias_hh_l0 is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.0.cell.weight_ih. rnn.rnn.4.rnn.0.cell.weight_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.0.cell.weight_hh. rnn.rnn.4.rnn.0.cell.weight_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.0.cell.bias_ih. rnn.rnn.4.rnn.0.cell.bias_ih is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.rnn.0.cell.bias_hh. rnn.rnn.4.rnn.0.cell.bias_hh is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.norm.weight. rnn.rnn.4.norm.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for rnn.rnn.4.norm.bias. rnn.rnn.4.norm.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for output.weight. output.weight is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
D:\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py:1436: UserWarning: Skip loading for output.bias. output.bias is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
[2022-01-21 01:57:51.640273] Successfully restored model parameters and optimizer parameters: models/deepspeech2/epoch_50/model.pdparams
D:\anaconda3\lib\site-packages\paddle\fluid\layers\utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  return (isinstance(seq, collections.Sequence) and
Inference model saved: models/deepspeech2\infer

Does this mean that my model export has a lot of problems, and that this is what broke my prediction results? How can I fix it?
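
In the export log above, every model parameter is reported as "Skip loading ... not found in the provided dict". That suggests none of the checkpoint weights were applied, so the exported model may still hold its initial weights, which would be consistent with the garbled recognition result. A minimal sketch of a check to run before exporting follows. `find_unloaded_keys` is a hypothetical helper, and the two key lists stand in for what `model.state_dict().keys()` and the keys of the loaded `model.pdparams` dict would provide.

```python
from typing import Iterable, List


def find_unloaded_keys(model_keys: Iterable[str],
                       checkpoint_keys: Iterable[str]) -> List[str]:
    """Return the model parameter names that the checkpoint does not provide."""
    available = set(checkpoint_keys)
    return [key for key in model_keys if key not in available]


# Illustrative names only: the model keys are taken from the warnings above,
# the checkpoint keys are invented to show a naming mismatch.
model_keys = ["conv.conv1.conv.weight", "rnn.rnn.0.norm.bias", "output.weight"]
checkpoint_keys = ["encoder.conv1.weight", "encoder.rnn0.bias", "decoder.weight"]

missing = find_unloaded_keys(model_keys, checkpoint_keys)
if missing:
    print(f"{len(missing)} of {len(model_keys)} parameters would be skipped:")
    for name in missing:
        print("  ", name)
```

If every key is missing, as in the log, the checkpoint and the code that builds the model probably come from different versions or model types; that is an inference from the warnings, not something the thread confirms.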

Error when generating the data list

Hi, I deployed this on AI Studio and ran into this problem when running create_data. I hope you can help, thanks!

annotation_path: dataset/annotation/
count_threshold: 2
dataset_vocab: dataset/vocabulary.txt
feature_method: linear
is_change_frame_rate: True
max_test_manifest: 10000
mean_std_path: dataset/mean_std.npz
noise_manifest_path: dataset/manifest.noise
noise_path: dataset/audio/noise
num_samples: 1000000
num_workers: 8
test_manifest: dataset/manifest.test
train_manifest: dataset/manifest.train
------------------------------------------------
Start generating the data list...
  0%|                                                                                                                                          | 0/7176 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "create_data.py", line 39, in <module>
    max_test_manifest=args.max_test_manifest)
  File "/home/aistudio/ppasr/trainer.py", line 110, in create_data
    max_test_manifest=max_test_manifest)
  File "/home/aistudio/ppasr/utils/utils.py", line 61, in create_manifest
    change_rate(audio_path)
  File "/home/aistudio/ppasr/utils/utils.py", line 105, in change_rate
    data, sr = soundfile.read(audio_path)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py", line 257, in read
    subtype, endian, format, closefd) as f:
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py", line 1184, in _open
    "Error opening {0!r}: ".format(self.name))
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'aset/audio/data_aishell/wav/test/S0764/BAC009S0764W0480.wav': System error.
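
The path in the error message starts with `aset/` instead of `dataset/`, so the file cannot be opened. A hedged sketch of a pre-check that reports such paths before any audio is read is shown below. It assumes each annotation line is an audio path and a transcript separated by a tab; that format and the helper name `find_missing_audio` are assumptions of this sketch, and the transcript is a placeholder.

```python
import os
from typing import Iterable, List


def find_missing_audio(annotation_lines: Iterable[str]) -> List[str]:
    """Return audio paths from annotation lines that do not exist on disk.

    Assumes each line is tab-separated: audio path first, transcript second.
    """
    missing = []
    for line in annotation_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        audio_path = line.split("\t", 1)[0]
        if not os.path.isfile(audio_path):
            missing.append(audio_path)
    return missing


# The path is copied from the RuntimeError above; the transcript is a placeholder.
lines = [
    "aset/audio/data_aishell/wav/test/S0764/BAC009S0764W0480.wav\tplaceholder transcript",
]
for path in find_missing_audio(lines):
    print("missing:", path)
```

Running a check like this on the annotation files makes it clear whether the truncated prefix is already present in the annotations or is introduced later, while the manifest is built.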

TypeError: object of type 'NoneType' has no len()

Running create_data.py raises an error:
Traceback (most recent call last):
  File "create_data.py", line 39, in <module>
    max_test_manifest=args.max_test_manifest)
  File "/usr/local/PPASR/ppasr/trainer.py", line 138, in create_data
    num_workers=self.num_workers)
  File "/usr/local/PPASR/ppasr/utils/utils.py", line 196, in compute_mean_std
    num_workers=num_workers)
  File "/usr/local/PPASR/ppasr/data_utils/normalizer.py", line 40, in __init__
    self._compute_mean_std(manifest_path, num_samples, num_workers)
  File "/usr/local/PPASR/ppasr/data_utils/normalizer.py", line 94, in _compute_mean_std
    for i in range(len(means)):
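
The traceback stops at `len(means)`, so `means` was `None` when the loop started. One plausible cause, which the report does not confirm, is that no feature frames were collected, for example because the training manifest was empty. The sketch below is not the project's normalizer; it only shows how a mean/std computation can fail early with a readable message. It uses the population standard deviation and plain Python lists to stay self-contained.

```python
from typing import List, Sequence, Tuple


def compute_mean_std(
    features: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and population standard deviation of feature frames."""
    if not features:
        # Fail early with a readable message instead of a later TypeError.
        raise ValueError(
            "no feature frames were collected; check that the training "
            "manifest is non-empty and that its audio files are readable"
        )
    dim = len(features[0])
    count = len(features)
    means = [sum(frame[i] for frame in features) / count for i in range(dim)]
    stds = [
        (sum((frame[i] - means[i]) ** 2 for frame in features) / count) ** 0.5
        for i in range(dim)
    ]
    return means, stds


means, stds = compute_mean_std([[1.0, 2.0], [3.0, 6.0]])
print(means, stds)  # prints [2.0, 4.0] [1.0, 2.0]
```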

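Problem encountered when generating data

Hello, when I ran this project on AI Studio I hit the problem below in the data-processing step. What is the cause?
Also, in the data split, the audio paths in test.txt are wrong: the paths written to test.txt start with aset/ instead of dataset/. Please fix this when you have time.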


IsADirectoryError Traceback (most recent call last)
/tmp/ipykernel_189/311691653.py in
7 num_workers=1)
8
----> 9 trainer.create_data(annotation_path='dataset/annotation/')

~/PPASR/ppasr/trainer.py in create_data(self, annotation_path, noise_manifest_path, noise_path, num_samples, count_threshold, is_change_frame_rate, max_test_manifest)
108 test_manifest_path=self.test_manifest,
109 is_change_frame_rate=is_change_frame_rate,
--> 110 max_test_manifest=max_test_manifest)
111 print('=' * 70)
112 print('开始生成噪声数据列表...')

~/PPASR/ppasr/utils/utils.py in create_manifest(annotation_path, train_manifest_path, test_manifest_path, is_change_frame_rate, max_test_manifest)
53 for annotation_text in os.listdir(annotation_path):
54 annotation_text_path = os.path.join(annotation_path, annotation_text)
---> 55 with open(annotation_text_path, 'r', encoding='utf-8') as f:
56 lines = f.readlines()
57 for line in tqdm(lines):

IsADirectoryError: [Errno 21] Is a directory: 'dataset/annotation/.ipynb_checkpoints'
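
The traceback shows that `os.listdir` also returned the `.ipynb_checkpoints` directory that Jupyter creates, and `open()` then failed on it. Besides deleting that directory, the listing can be filtered. The following is a minimal sketch of such a filter; `list_annotation_files` is a hypothetical helper, not a function of the project, and the file name `aishell.txt` is only an example.

```python
import os
import tempfile
from typing import List


def list_annotation_files(annotation_path: str) -> List[str]:
    """Return regular, non-hidden files in annotation_path, sorted by name."""
    files = []
    for name in sorted(os.listdir(annotation_path)):
        full_path = os.path.join(annotation_path, name)
        # Skip directories such as '.ipynb_checkpoints' and other hidden entries.
        if name.startswith(".") or not os.path.isfile(full_path):
            continue
        files.append(full_path)
    return files


# Demonstration on a temporary directory that mimics the failing layout.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, ".ipynb_checkpoints"))
    with open(os.path.join(root, "aishell.txt"), "w", encoding="utf-8") as f:
        f.write("")
    found = [os.path.basename(p) for p in list_annotation_files(root)]
    print(found)  # prints ['aishell.txt']
```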

The model trained on the 1300 dataset produces garbled output; what could be the cause?

----------- Configuration Arguments -----------
alpha: 1.2
beam_size: 10
beta: 0.35
cutoff_prob: 1.0
cutoff_top_n: 40
decoder: ctc_beam_search
is_long_audio: False
lang_model_path: D:\dnf�\zh_giga.no_cna_cmn.prune01244.klm
model_dir: D:\dnf\PPASR\infer\deepspeech2\infer
real_time_demo: False
to_an: True
use_gpu: True
use_model: deepspeech2
vocab_path: D:\dnf\PPASR\dataset\zh_vocab.txt
wav_path: C:\Users\qiegewala\Music\A2_2.wav

[4234, 5841, 1048, 3128, 4782, 2775, 4081, 3728, 2775, 5412, 5065, 3134, 1792, 3134, 2951, 1566, 1458, 1566, 1792, 1566, 5167, 1930, 3465, 5412, 1566, 4012, 2951, 5168, 1566, 2951, 5250, 48, 2290, 48, 2951, 5168, 1566, 5168, 5250, 2951, 2290, 2951, 5168, 769, 5168, 1458, 2951, 5177, 4497, 1566, 5658, 2760, 337, 3128, 2760]
Time elapsed: 1741ms, recognition result: 冼悕仍肪霈烯葳酌烯袢嚟婶曰婶怂旗朴旗曰旗埵柯谌袢旗呻怂阊旗怂踟电盼电怂阊旗阊踟怂盼怂阊引阊朴怂徂馊旗跱趴星肪趴, score: 0

Windows 10, paddlepaddle-gpu==2.1.3, cudatoolkit=10.2, PaddlePaddle 2.2.0

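Asking for help: training configuration for WenetSpeech

Could you please tell me how many GPUs of which model you used, and roughly how long it took, to train on the 10,000-hour WenetSpeech dataset? I would like to estimate my own situation. Many thanks!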

An error occurred when using the pretrained model to train on the WenetSpeech dataset

Hello,
I copied the pretrained model from "PPASR_大数据集/models/deepspeech2/best_model" to "/PPASR/models/deepspeech2/last_model" and then started training, but an error occurred. What is the cause of this problem?

AssertionError: Variable Shape not match, Variable [ linear_0.w_0_moment1_0 ] need tensor with shape (1024, 5451) but load set tensor with shape (1024, 6436)
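
The assertion says the model being built needs a tensor of shape (1024, 5451) while the checkpoint supplies (1024, 6436). Only the second dimension differs, which would be consistent with the local model and the pretrained model using different output sizes, for example vocabularies of different sizes; this is an inference from the two shapes, not something confirmed in the thread. The sketch below compares shapes before loading. `find_shape_mismatches` is a hypothetical helper, and the dictionaries stand in for name-to-shape maps taken from the model and from the checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(expected: Dict[str, Shape],
                          loaded: Dict[str, Shape]) -> List[str]:
    """Describe tensors present in both dicts whose shapes differ."""
    problems = []
    for name, shape in expected.items():
        if name in loaded and loaded[name] != shape:
            problems.append(
                f"{name}: model needs {shape}, checkpoint has {loaded[name]}"
            )
    return problems


# Shapes copied from the AssertionError above.
expected = {"linear_0.w_0_moment1_0": (1024, 5451)}
loaded = {"linear_0.w_0_moment1_0": (1024, 6436)}
for problem in find_shape_mismatches(expected, loaded):
    print(problem)
```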
