crnn_chinese_characters_rec's Introduction

Characters Recognition

A Chinese character recognition repository based on convolutional recurrent networks. (Scan the QR code below to join the WeChat group.)

Performance

Recognize characters in pictures

Dev Environments

  1. Windows 10 or Ubuntu 16.04
  2. PyTorch 1.2.0 (may fix CTC loss) with CUDA 10.0 🔥
  3. yaml
  4. easydict
  5. tensorboardX

Data

Synthetic Chinese String Dataset

  1. Download the dataset
  2. Edit DATASET:ROOT in lib/config/360CC_config.yaml to your image path
    DATASET:
      ROOT: 'to/your/images/path'
  3. Download the labels (password: eaqb)

  4. Put char_std_5990.txt in lib/dataset/txt/

  5. Put train.txt and test.txt in lib/dataset/txt/

    e.g. test.txt

    20456343_4045240981.jpg 89 201 241 178 19 94 19 22 26 656
    20457281_3395886438.jpg 120 1061 2 376 78 249 272 272 120 1061
    ...
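
To make the label format above concrete, here is a minimal sketch of how one such line could be parsed. It assumes each integer is an index into the character list read from char_std_5990.txt; whether indexing starts at 0 or 1 is an assumption that should be checked against the dataset loader. The helper names `parse_label_line` and `indices_to_text` are hypothetical, and the three-letter character list is a stand-in used only for illustration.

```python
def parse_label_line(line):
    """Split 'name.jpg i1 i2 ...' into the file name and a list of integer indices."""
    name, *fields = line.split()
    return name, [int(f) for f in fields]


def indices_to_text(indices, chars):
    """Map each index to a character (assumes 0-based positions in `chars`)."""
    return "".join(chars[i] for i in indices)


line = "20456343_4045240981.jpg 89 201 241 178 19 94 19 22 26 656"
name, indices = parse_label_line(line)
print(name, len(indices))

# Stand-in character list; the real one would hold one entry per line of char_std_5990.txt.
demo_chars = ["a", "b", "c"]
print(indices_to_text([0, 2, 2], demo_chars))
```

Every sample line in the example carries ten indices, which is consistent with the fixed-length training mentioned in this README.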

Or your own data

  1. Edit DATASET:ROOT in lib/config/OWN_config.yaml to your image path
    DATASET:
      ROOT: 'to/your/images/path'
  2. Put your train_own.txt and test_own.txt in lib/dataset/txt/

    e.g. test_own.txt

    20456343_4045240981.jpg 你好啊!祖国!
    20457281_3395886438.jpg 晚安啊!世界!
    ...

Note: fixed-length training is supported. You can modify the dataloader to support variable-length training.
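
Before training on your own data, it can help to check that every label has the same length and to see which characters the labels actually use. The sketch below assumes each line is an image name followed by its text, separated by whitespace; `parse_own_label_line` is a hypothetical helper, not a function from this repository.

```python
def parse_own_label_line(line):
    """Split 'name.jpg text' on the first run of whitespace only."""
    name, text = line.rstrip("\n").split(maxsplit=1)
    return name, text


lines = [
    "20456343_4045240981.jpg 你好啊!祖国!",
    "20457281_3395886438.jpg 晚安啊!世界!",
]
samples = [parse_own_label_line(l) for l in lines]

# For fixed-length training this set should contain exactly one value.
lengths = {len(text) for _, text in samples}

# Every distinct character that appears in the labels.
alphabet = "".join(sorted({ch for _, text in samples for ch in text}))
print(lengths, len(alphabet))
```

If `lengths` holds more than one value, the labels are not fixed-length and the dataloader would need the modification mentioned in the note above.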

Train

   [run] python train.py --cfg lib/config/360CC_config.yaml
or [run] python train.py --cfg lib/config/OWN_config.yaml

Loss curve

   [run] cd output/360CC/crnn/xxxx-xx-xx-xx-xx/
   [run] tensorboard --logdir log

Loss overview (first epoch)

Demo

   [run] python demo.py --image_path images/test.png --checkpoint output/checkpoints/mixed_second_finetune_acc_97P7.pth

crnn_chinese_characters_rec's People

Contributors

sierkinhane

crnn_chinese_characters_rec's Issues

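Question about nclass

A Chinese character takes up 3 bytes. If I have, say, 5 characters, wouldn't my nclass be 3×5+1=16? In that case there would be 16 label classes rather than 3+1=4.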
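
A short sketch may help separate characters from bytes here. It assumes, as is usual for CTC-based CRNN models, that nclass is the number of distinct characters plus one blank class; the five-character alphabet is hypothetical.

```python
# Hypothetical five-character alphabet, used only for illustration.
alphabet = "你好世界啊"

# Python 3 strings are sequences of Unicode code points, so len() counts characters.
num_chars = len(alphabet)

# The UTF-8 byte count is a storage detail and does not affect the class count.
num_bytes = len(alphabet.encode("utf-8"))

# One output class per character plus one extra class for the CTC blank.
nclass = num_chars + 1
print(num_chars, num_bytes, nclass)
```

Under that assumption, five characters give nclass = 6, regardless of how many bytes each character occupies on disk.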

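Poor recognition of slanted or curved text

@Sierkinhane Hi, I used your pretrained model. Recognition accuracy is high for clear, horizontal text, but very poor for text that is slanted, curved, or distorted. I tried rectifying the images in preprocessing before running the model, and the results were still poor. Should I add slanting and curving operations when generating samples? I noticed that the rotation, noise, and font-stretching functions in your generator.py are not filled in:

# rotation function
def rotate_func():
    pass

# noise function
def random_noise_func():
    pass

# font stretching function
def stretching_func():
    pass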

val_root: validation set path

Which path is this validation set path? I have only generated the lmdb files, and I don't know what this parameter means.

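Error in the multi-GPU version

image
After changing the code to multi-GPU, training runs normally but validation fails. The screenshot above was taken with 4 GPUs; the values differ by a factor of four.
image
This line was added to the GPU code.
Adding the commented-out line to the val code also raises an error.
image
After commenting out preds = preds.squeeze(2), the error becomes:
image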

A RuntimeError occurs

RuntimeError: Error(s) in loading state_dict for CRNN:
size mismatch for rnn.1.embedding.bias: copying a param of torch.Size([19997]) from checkpoint, where the shape is torch.Size([6736]) in current model.
size mismatch for rnn.1.embedding.weight: copying a param of torch.Size([19997, 512]) from checkpoint, where the shape is torch.Size([6736, 512]) in current model.
Sorry to bother you. I'm new to this area and ran into this problem when running the program. Is there a way to fix it?

Dataset character question

Hello, I see that the 3.6M-image dataset contains 6736 characters. My dataset has some special traditional characters, and after merging them with the characters in alphabet.py there are 9116 characters. However, with this new alphabet.py file the test no longer runs successfully. It reports the following error:

loading pretrained model from trained_models/mixed_second_finetune_acc97p7.pth
Traceback (most recent call last):
File "test.py", line 66, in <module>
model.load_state_dict(torch.load(crnn_model_path))
File "/home/imc/XR/anaconda3/envs/crnn-chinese/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for CRNN:
size mismatch for rnn.1.embedding.weight: copying a param with shape torch.Size([6736, 512]) from checkpoint, the shape in current model is torch.Size([9116, 512]).
size mismatch for rnn.1.embedding.bias: copying a param with shape torch.Size([6736]) from checkpoint, the shape in current model is torch.Size([9116]).
(crnn-chinese) imc@imc-NO108:~/XR/models/chenxu/crnn_chinese$
Is the number of characters hard-coded in the network?
Why does it report a shape mismatch, and how should I modify it?
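
The error suggests the checkpoint's output layer was sized for 6736 classes while the model built from the new alphabet expects 9116. One common workaround, sketched below, is to load only the parameters whose shapes match and retrain the mismatched output layer. This is not the author's method. `filter_matching` is a hypothetical helper, `FakeTensor` is a stand-in so the example runs without PyTorch, and `cnn.conv0.weight` is an invented parameter name.

```python
from collections import namedtuple

# Stand-in for a tensor: only the .shape attribute matters for this sketch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def filter_matching(checkpoint_state, model_state):
    """Keep checkpoint entries whose name and shape both match the model."""
    kept, skipped = {}, []
    for name, value in checkpoint_state.items():
        if name in model_state and tuple(value.shape) == tuple(model_state[name].shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


checkpoint = {
    "cnn.conv0.weight": FakeTensor((64, 1, 3, 3)),  # hypothetical layer name
    "rnn.1.embedding.weight": FakeTensor((6736, 512)),
    "rnn.1.embedding.bias": FakeTensor((6736,)),
}
model = {
    "cnn.conv0.weight": FakeTensor((64, 1, 3, 3)),
    "rnn.1.embedding.weight": FakeTensor((9116, 512)),
    "rnn.1.embedding.bias": FakeTensor((9116,)),
}
kept, skipped = filter_matching(checkpoint, model)
print(sorted(kept))
print(sorted(skipped))
```

With PyTorch, the same function could be applied to the dictionary returned by torch.load and to model.state_dict(), followed by model.load_state_dict(kept, strict=False). The skipped layer then starts from random initialization and needs fine-tuning on data that covers the new characters.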

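The loss dropped and then suddenly went back up

I trained yesterday; the loss dropped to between 0 and 1 and accuracy was around 88%. When I checked this morning the loss had gone back up to over 50 and accuracy was 0. What happened? =.= Also, did the author type all the dataset labels by hand, one by one?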

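Training on inverted samples

A question occurred to me today: suppose that when training the CRNN I start with black-on-white samples and later train on white-on-black samples, so the two sets have completely opposite styles. Would this hurt the model's accuracy?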

How to fine-tune?

Hi, thanks for your work. Could you please tell me how to fine-tune with your pretrained model?

Training problem

Why does the following error appear periodically during training? It does not affect the training process.
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fa67ff97f60>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 399, in __del__
self._shutdown_workers()
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
self.worker_result_queue.get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
File "/usr/local/lib/python3.5/dist-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 493, in Client
answer_challenge(c, authkey)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 737, in answer_challenge
response = connection.recv_bytes(256) # reject large message
File "/usr/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

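Tolmdb

When generating the lmdb from images and text, should the images and text be read in binary mode? In tolmdb.py, with
with open('.'+imagePath, 'r') as f:
    imageBin = f.read()
if 'rb' is not used, the image that is read triggers '%s is not a valid image' %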
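
Image files are raw bytes, so reading them in text mode either fails or alters the data. The sketch below writes a few JPEG-like bytes to a temporary file and reads them back both ways. The file name and payload are made up for illustration.

```python
import os
import tempfile

# The first bytes of a JPEG file (0xFF 0xD8 ...) are not valid UTF-8.
jpeg_like = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"fake image payload"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.jpg")
    with open(path, "wb") as f:
        f.write(jpeg_like)

    # Text mode tries to decode the bytes and fails on 0xFF.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_ok = True
    except UnicodeDecodeError:
        text_mode_ok = False

    # Binary mode returns the raw bytes unchanged.
    with open(path, "rb") as f:
        image_bin = f.read()

print(text_mode_ok, image_bin == jpeg_like)
```

Under Python 3 the text-mode read raises the same "can't decode byte 0xff in position 0" error that appears in other reports on this page, while the 'rb' read returns the original bytes.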

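Some questions about the training set

I'd like to ask: was your mixed_second_finetune_acc97p7.pth model trained in a single pass on the 3.6M-image open dataset plus a dataset of common local text combinations generated by your own script, or was a model first trained on the open dataset and then trained a second time?
Another question: I tried your model file. Recognition accuracy is quite good for black text on a white background, but only mediocre on more complex backgrounds, and it lags behind some other open-source models available online. Is this because the backgrounds in the 3.6M-image dataset you used are not diverse enough?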

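Training question

Hello, I used the same 3.6M-image dataset and label files with the same parameters, and training accuracy reaches 98%. But when I test on separately generated black-on-white data, accuracy is only sixty-something percent, which is also well below the accuracy of the model you provide. Did you apply any additional processing to the data? Also, the separately generated test data is very similar to the training data, yet accuracy is far below training accuracy. What could be the reason? Any pointers would be appreciated, thanks!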

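Accuracy

Using the 3.6M-image dataset with the same parameters as yours, the final accuracy oscillates around 50 whether I train from scratch or fine-tune. What is going on?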

python test.py error

When I try to run this code, test.py reports the error below, and I am not sure why.
I added map_location='cpu'. Could that be the cause?
model.load_state_dict(torch.load(crnn_model_path,map_location='cpu'))

Error message:

python test.py
loading pretrained model from trained_models/mixed_second_finetune_acc97p7.pth
Traceback (most recent call last):
File "test.py", line 64, in <module>
model.load_state_dict(torch.load(crnn_model_path,map_location='cpu'))
File "/home/seven/anaconda3/envs/chinese_ocr/lib/python2.7/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for CRNN:
size mismatch for rnn.1.embedding.bias: copying a param of torch.Size([19997]) from checkpoint, where the shape is torch.Size([6736]) in current model.
size mismatch for rnn.1.embedding.weight: copying a param of torch.Size([19997, 512]) from checkpoint, where the shape is torch.Size([6736, 512]) in current model.

Error when building the lmdb dataset

Traceback (most recent call last):
File "tolmdb.py", line 90, in <module>
createDataset(outputPath, imagePathList, labelList)
File "tolmdb.py", line 55, in createDataset
imageBin = f.read()
File "/usr/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

preprocessing

I'd like to ask what preprocessing.py is for. I ran it and nothing appeared.

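Is recognition quality related to font size?

I am recognizing a cropped region of an invoice and nothing is recognized at all. When I resize it slightly larger it is recognized, but when I enlarge it further the recognition becomes poor. Is this related to the size of the characters?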

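Purpose of the 5990 characters

I'd like to ask: char_std_5990.txt contains 5990 characters. What are all these characters for? Are they class codes? Is recognition being treated as a classification problem?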
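
In a CTC-trained CRNN, the character list does act as a set of classes. At each time step the network scores every character plus a blank, and the per-step predictions are then collapsed into a string. The sketch below shows greedy CTC decoding under the assumption that index 0 is the blank and index i maps to the i-th character. The alphabet and the per-step indices are hypothetical.

```python
def ctc_greedy_decode(indices, alphabet, blank=0):
    """Collapse repeated indices, drop blanks, and map the rest to characters."""
    chars = []
    previous = blank
    for idx in indices:
        if idx != blank and idx != previous:
            # Index i (i >= 1) maps to alphabet[i - 1] because 0 is the blank.
            chars.append(alphabet[idx - 1])
        previous = idx
    return "".join(chars)


alphabet = "你好啊"  # hypothetical 3-character alphabet
per_step = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3]  # hypothetical argmax index per time step
decoded = ctc_greedy_decode(per_step, alphabet)
print(decoded == "你好啊")
```

A blank between two identical indices keeps both characters, which is how a doubled character can still be produced: `[1, 0, 1]` decodes to two copies of the first character.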

AttributeError: Can't get attribute '_rebuild_tensor_v2'

AttributeError: Can't get attribute '_rebuild_tensor_v2' on <module 'torch._utils' from '/data8T/fangping/anaconda2/envs/pytorch_py36/lib/python3.6/site-packages/torch/_utils.py'>
I am using PyTorch 0.3.1. It looks like the PyTorch version is wrong. Which version are you using?

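Trained model and fonts

Hello author, first of all thank you very much for providing the trained model and code. I have two questions:

  1. What fonts were used for training? Were they the ones in the generator folder?

  2. I now want to add recognition of some special characters on top of your work. Roughly how much data would be needed? Also, for this modification, is it enough to change alphabets.py and then train according to the README?

I'd be very grateful if you could answer when you have time!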

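Recognizing images on Windows

I put the model on Windows and ran it on training images, but nothing was recognized. Is it because a model trained on Linux has problems on Windows?
Original image:
image
Recognition result:
image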

Error running crnn_main.py under Python 3

When building the lmdb, line 28 of tolmdb.py reports:
TypeError: Won't implicitly convert Unicode to bytes; use .encode()

25 def writeCache(env, cache):
26     with env.begin(write=True) as txn:
27         for k, v in cache.items():
28             txn.put(k, v)

I changed line 28 to: txn.put(str(k).encode('utf-8'), str(v).encode('utf-8'))
This lets the script continue, but crnn_main.py then reports during training:
Corrupted image for 2488925
Corrupted image for 3213999
Corrupted image for 3214001
Corrupted image for 2488927
Corrupted image for 2488929
Corrupted image for 3214003
Corrupted image for 2488931
Corrupted image for 3214005
Corrupted image for 2488933
Corrupted image for 3214007
Corrupted image for 2488935
Corrupted image for 3214009
Corrupted image for 2488937
Corrupted image for 3214011
Corrupted image for 2488939
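
One likely explanation for the "Corrupted image" messages is that wrapping a bytes value in str() stores its printable representation rather than the image data. The sketch below shows the difference and a hypothetical helper, `to_bytes`, that encodes only text and leaves bytes untouched. It is an illustration, not the repository's code.

```python
def to_bytes(value):
    """Encode str to UTF-8 bytes; pass bytes through untouched."""
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


image_bin = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # first bytes of a JPEG file

# str() on bytes yields the printable repr, not the original data.
wrapped = str(image_bin).encode("utf-8")
print(wrapped)
print(to_bytes(image_bin) == image_bin)
print(to_bytes("image-000000001"))
```

With such a helper, line 28 could read txn.put(to_bytes(k), to_bytes(v)), provided the image was read in 'rb' mode so that v is already bytes.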

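Saving the model and logs

I'd like to ask whether the model and log are only saved to the expr folder after training has finished. What should I do if I want to visualize the loss curve?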

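Image size question

Hello, and thank you for sharing. I have a question: the images in my dataset are not a fixed size and come in all sorts of dimensions. How should I use your code in that case?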
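
A common approach for CRNN-style models is to resize every image to one fixed height while keeping the aspect ratio, then pad or cap the width. The sketch below only computes the target width. The height of 32 and the helper name `scaled_width` are assumptions, not values taken from this repository's config.

```python
def scaled_width(width, height, target_height=32, max_width=None):
    """Width after resizing to target_height while keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    new_width = max(1, round(width * target_height / height))
    if max_width is not None:
        new_width = min(new_width, max_width)
    return new_width


for w, h in [(280, 32), (560, 64), (100, 50)]:
    print((w, h), "->", (scaled_width(w, h), 32))
```

Images of different widths would still need to be padded to a common width within a batch, or batched one at a time, which is related to the fixed-length note in the README.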

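Preprocessing for model training

Hello author, I looked at the training script and found that the training set images do not seem to be converted to grayscale. Did I miss it?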

Running python tolmdb.py

When I run this file I am using Python 3.6; the rest of my environment is the same as the author's.
Running it reports this error:
File "/home/hyc/anaconda3/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
How can I get it to run successfully?

Question about the crnn_main script

Hello, regarding line 158 of the crnn_main script:
image = torch.FloatTensor(params.batchSize, 3, params.imgH, params.imgH)
Is the image intentionally being resized to a square here?

finetune

When fine-tuning, was all the generated data you used 10 characters long?

KeyError: '\x00' during training

Start val
Traceback (most recent call last):
File "crnn_main.py", line 200, in <module>
training()
File "crnn_main.py", line 117, in training
val(crnn, test_dataset, criterion)
File "crnn_main.py", line 57, in val
t, l = converter.encode(cpu_texts)
File "/home/OCR/crnn_train/crnn_chinese_characters_rec-master/utils.py", line 101, in encode
index = self.dict[char]
KeyError: '\x00'
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f425248dc88>>
Traceback (most recent call last):
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
self._shutdown_workers()
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
self.worker_result_queue.get()
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/queues.py", line 337, in get
return _ForkingPickler.loads(res)
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
fd = df.detach()
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f47434f68d0>>
Traceback (most recent call last):
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 349, in __del__
self._shutdown_workers()
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
self.worker_result_queue.get()
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/queues.py", line 337, in get
return _ForkingPickler.loads(res)
File "/home/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
fd = df.detach()
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/home/anaconda3/envs/pytorch/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
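The main problem is that a char cannot be found in the dict. The line of data that triggers the error is “61688125_428659907.jpg 时,塔吉尔的手不由得”, which contains no characters missing from the dictionary. What could be causing this?
The training set I am using is Synthetic_Chinese_String_Dataset and the dictionary is char_std_5990.txt.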
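
The key in the error is '\x00', a NUL character, which is invisible when a label is printed. One way to investigate is to list every character of the failing text that is missing from the dictionary, shown with repr so that invisible characters become visible. Where the NULs come from is not known here. The helper `find_unknown_chars` and the miniature dictionary are hypothetical.

```python
def find_unknown_chars(text, char_dict):
    """Return (position, repr) for every character missing from char_dict."""
    return [(i, repr(ch)) for i, ch in enumerate(text) if ch not in char_dict]


# Hypothetical miniature dictionary; the real one would be built from char_std_5990.txt.
char_dict = {ch: i + 1 for i, ch in enumerate("时,塔吉尔的手不由得")}

# A label padded with NUL characters looks identical to the clean one when printed.
label = "时,塔吉尔的手不由得\x00\x00"
print(find_unknown_chars(label, char_dict))

# Stripping trailing NULs before encoding removes the unknown characters.
cleaned = label.rstrip("\x00")
print(find_unknown_chars(cleaned, char_dict))
```

If the first call reports '\x00' entries on real data, stripping them before converter.encode is one possible workaround. It would also be worth checking how the text tensors are decoded back to strings in the validation loop.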

Problem installing warp-ctc

yue@yue-Vostro-3668:/crnn_chinese_characters_rec$ cd warp-ctc/pytorch_binding
yue@yue-Vostro-3668:/crnn_chinese_characters_rec/warp-ctc/pytorch_binding$ python setup.py install
Torch was not built with CUDA support, not building warp-ctc GPU extensions.
generating build/warpctc_pytorch/_warp_ctc/__warp_ctc.c
(already up-to-date)
not modified: 'build/warpctc_pytorch/_warp_ctc/__warp_ctc.c'
running install
Checking .pth file support in /usr/local/lib/python3.5/dist-packages/
error: can't create or remove files in install directory

The following error occurred while trying to add or remove files in the
installation directory:

[Errno 13] Permission denied: '/usr/local/lib/python3.5/dist-packages/test-easy-install-8801.pth'

The installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

/usr/local/lib/python3.5/dist-packages/

Perhaps your account does not have write access to this directory? If the
installation directory is a system-owned directory, you may need to sign in
as the administrator or "root" account. If you do not have administrative
access to this machine, you may wish to choose a different installation
directory, preferably one that is listed in your PYTHONPATH environment
variable.

For information on other options, you may wish to consult the
documentation at:

https://pythonhosted.org/setuptools/easy_install.html

Please make the appropriate changes for your system and try again.

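GPU requirements for training? Out-of-memory errors

How much GPU capacity did the author use to train on the 3M-image dataset?
I have a dataset of 20,000 images, only about 800 MB in lmdb format. In CPU mode it reports a memory leak, and in GPU mode it reports a CUDA computation error. In both cases the failure sometimes only happens after one or two batches, so I suspect the intermediate variables are too large. After changing the batch size from 16 to 1 the GPU runs normally, but that seems rather pointless...
My GPU is an Alibaba Cloud P100 with 16 GB of memory.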

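Training loss stays high and results are very poor

I generated nearly 500,000 images from one hundred fonts, with 450,000 as the training set and 50,000 as the test set. The training loss hovers around 20-something. After training, only images from the training set are recognized; other images in the same fonts are not. Which parameters do I need to adjust?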

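Multi-GPU

I am training with crnn.pytorch and multi-GPU keeps causing problems. May I ask whether you trained with multiple GPUs?
Also, the code in crnn_main.py seems to have problems: the params don't match up. Is this really the final code that you ran successfully?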

I have an Intel graphics card. How do I get this running?

CTC is now fully configured, but running reports:
cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:51
Is there a solution?

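How long does training take?

I am using the training set provided in the CSDN article.
I changed batch_size = 128, worker = 8.
My system is Ubuntu 16.04 LTS with an 8-core i7 CPU, 16 GB of RAM, and one Nvidia 1070 graphics card (8 GB of memory).
After 8 hours training still has not finished; the current accuracy is 95%.
Test loss: 0.232426, accuray: 0.959453

Did I set worker or batch_size too high?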

Encoding problem

loading pretrained model from trained_models/mixed_second_finetune_acc97p7.pth
Traceback (most recent call last):
  File "test.py", line 73, in <module>
    crnn_recognition(image, model)
  File "test.py", line 56, in crnn_recognition
    print('results: {0}'.format(sim_pred))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 9-46: ordinal not in range(128)
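
This traceback is what Python raises when the output stream can only encode ASCII and the text contains Chinese characters. The sketch below reproduces the failure with an in-memory ASCII stream and then writes the same text through a UTF-8 stream. The result string is made up for illustration.

```python
import io

result = "results: 你好啊"

# Reproduce the failure: a stream that can only encode ASCII.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
try:
    ascii_stream.write(result)
    ascii_stream.flush()
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The same text written through a UTF-8 stream succeeds.
buffer = io.BytesIO()
utf8_stream = io.TextIOWrapper(buffer, encoding="utf-8")
utf8_stream.write(result)
utf8_stream.flush()
print(ascii_ok, buffer.getvalue())
```

In practice, setting the environment variable PYTHONIOENCODING=utf-8 before running test.py, or using a UTF-8 locale in the terminal, usually makes print use UTF-8 for standard output.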

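Problem compiling warp-ctc

Hello, during training I keep getting the message GPU execution requested, but not compiled with GPU support.
Looking back at the warp-ctc cmake step, I found it prints Building shared library with no GPU support. However, my CUDA path has already been added and WITH_GPU is TRUE, yet the build never enters the with-GPU branch.
What could be the reason? Thanks.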

Runtime Error cuda out of memory occurs while the gpu memory is empty

Detailed error description:

Traceback (most recent call last):
File "crnn_main.py", line 193, in <module>
training()
File "crnn_main.py", line 110, in training
cost = trainBatch(crnn, criterion, optimizer, train_iter)
File "crnn_main.py", line 96, in trainBatch
cost = criterion(preds, text, preds_size, length) / batch_size
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 82, in forward
self.length_average, self.blank)
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg/warpctc_pytorch/__init__.py", line 32, in forward
blank)
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/utils/ffi/__init__.py", line 202, in safe_call
result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: CUDA error: out of memory (allocate at /pytorch/aten/src/THC/THCCachingAllocator.cpp:510)
frame #0: THCudaMalloc + 0x79 (0x7f50f7b32e99 in /home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/lib/libcaffe2_gpu.so)
frame #1: gpu_ctc + 0x134 (0x7f50f61f92a4 in /home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg/warpctc_pytorch/_warp_ctc/$
_warp_ctc.cpython-35m-x86_64-linux-gnu.so)
frame #2: + 0x1ad2 (0x7f50f61f8ad2 in /home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg/warpctc_pytorc$
/_warp_ctc/__warp_ctc.cpython-35m-x86_64-linux-gnu.so)

frame #5: THPModule_safeCall(_object*, _object*, _object*) + 0x4c (0x7f511e7a67cc in /home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/_C.cpython-35m-x86_64-l$
nux-gnu.so)
frame #8: python() [0x5401ef]
frame #11: python() [0x4ec358]
frame #14: THPFunction_apply(_object*, _object*) + 0x38f (0x7f511eb9383f in /home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/_C.cpython-35m-x86_64-linux-gnu.so)
frame #18: python() [0x4ec3f7]
frame #22: python() [0x4ec2e3]
frame #24: python() [0x4fbfce]
frame #26: python() [0x574db6]
frame #31: python() [0x53fc97]
frame #33: python() [0x60cb42]
frame #38: __libc_start_main + 0xf0 (0x7f513430a830 in /lib/x86_64-linux-gnu/libc.so.6)
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f50ec9151d0>>
Traceback (most recent call last):
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
self._shutdown_workers()
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
self.worker_result_queue.get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
File "/home/ubuntu/suraj/TrainModel/venv/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

I am using:
cuda: 8.0
python: 3.5
pytorch: 0.4.1

I get this error when using CUDA. It runs fine on CPU.
