mariolew / deep-alignment-network-tensorflow
A re-implementation of Deep-Alignment-Network using TensorFlow
My stage 1 training failed with this message:
ptxas fatal : Memory allocation failure
My GPU has 12 GB of memory and the machine has 64 GB of RAM. Environment: Python 3.6, CUDA 9.2, tensorflow-gpu 1.10.0.
I was running trainDAN.py with stage=1, using the generated npz file.
Hi, I appreciate your work. Which TestErr and BatchErr values should I expect during stage 1 and stage 2 training to obtain a good model? How many epochs did you use?
I recently started stage 1 training. Best regards.

Training is impossible; it keeps running out of memory.
Could you please add the pre-trained model to this project? Or do you know how to use the pre-trained model from the original Theano project?
Does anyone know how many epochs are appropriate for stage 1 and stage 2? Many thanks.
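On the 300-W dataset, roughly how many epochs do stage 1 and stage 2 need?
Also, what error after training indicates a good model? So far I have trained only stage 1: batch_error stays around 0.8 and will not go lower, and mean_error is 0.07 when testing on 300W-test.
Finally, is there a problem with validationSet in the original code? Does it duplicate data from trainSet?
I would really appreciate an answer. Thank you.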
Hello, how should the training batch_size be set? Also, when I run trainDAN.py I get: FailedPreconditionError (see above for traceback): Attempting to use uninitialized value Stage2/beta1_power [[{{node Stage2/beta1_power/read}} = Identity[T=DT_FLOAT, _class=["loc:@Stage2/Adam/Assign_1"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](Stage2/beta1_power)]]. How can this error be avoided?
Hello, when I train with trainDAN and STAGE=2, the call to sess.run() that fetches data for training fails in models.py at S2_InputImage = AffineTransformLayer(InputImage, S2_AffineParam):
Caused by op 'Stage2/MatrixInverse', defined at:
File "D:/softmares/Pycharm_workplace/Deep-Alignment-Network-tensorflow-master/DAN-TF/python_test1.py", line 73, in <module>
dan = DAN(initLandmarks)
File "D:\softmares\Pycharm_workplace\Deep-Alignment-Network-tensorflow-master\DAN-TF\python_test2.py", line 102, in DAN
S2_InputImage = AffineTransformLayer(InputImage, S2_AffineParam) ## warp the original image with the transform matrix to get the corrected image
File "D:\softmares\Pycharm_workplace\Deep-Alignment-Network-tensorflow-master\DAN-TF\layers.py", line 60, in AffineTransformLayer
A = tf.matrix_inverse(A)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_linalg_ops.py", line 330, in matrix_inverse
name=name)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
op_def=op_def)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Input is not invertible.
[[Node: Stage2/MatrixInverse = MatrixInverseT=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/cpu:0"]]
Have you run into this? Did something go wrong in my initial data transformation? Could you advise?
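TensorFlow raises `Input is not invertible` when a matrix handed to `tf.matrix_inverse` is singular. One way to narrow this down is to fetch the affine parameters predicted after stage 1 and inspect them outside the graph. A minimal NumPy sketch, assuming the linear part of the transform is a 2 x 2 matrix stored in the first four parameters (an assumption, since `layers.py` is not shown here); `diagnose_affine` is a hypothetical helper:

```python
import numpy as np


def diagnose_affine(params, tol=1e-8):
    """Classify the 2 x 2 linear part of one affine parameter vector.

    Assumes (hypothetically) that params[:4] holds the 2 x 2 matrix.
    """
    a = np.asarray(params, dtype=np.float64)[:4].reshape(2, 2)
    if not np.all(np.isfinite(a)):
        return "non-finite"   # NaN or inf, e.g. from diverged training
    if abs(np.linalg.det(a)) < tol:
        return "singular"     # determinant is numerically zero
    return "ok"


if __name__ == "__main__":
    print(diagnose_affine([1.0, 0.0, 0.0, 1.0, 5.0, 5.0]))
    print(diagnose_affine([0.0, 0.0, 0.0, 0.0, 1.0, 1.0]))
```

If every parameter vector comes back as singular or non-finite, the stage 1 weights feeding stage 2 would be the first thing to check.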
Hello, running python trainDAN.py produces the following error:
Traceback (most recent call last):
File "trainDAN.py", line 12, in <module>
trainSet = ImageServer.Load(datasetDir + "dataset_nimgs=40_perturbations=[0.2, 0.2, 20, 0.25]_size=[112, 112].npz")
File "/home/zhanggl/dantf/DAN-TF/ImageServer.py", line 31, in Load
arrays = np.load(filename)
File "/home/zhanggl/anaconda3/envs/dantf/lib/python3.6/site-packages/numpy/lib/npyio.py", line 415, in load
fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: '../data/dataset_nimgs=40_perturbations=[0.2, 0.2, 20, 0.25]_size=[112, 112].npz'
What am I doing wrong? (I have already run TestSetPreparation.py and TrainingSetPreparation.py.)
Thank you very much.
Which npz files do I have to load in lines 15 and 16 in order to train the model? At the moment those files are not in the "data" directory.
(14)>>>datasetDir = "../data/"
(15)>>>trainSet = ImageServer.Load(datasetDir + "dataset_nimgs=40_perturbations=[0.2, 0.2, 20, 0.25]_size=[112, 112].npz")
(16)>>>validationSet = ImageServer.Load(datasetDir + "dataset_nimgs=9_perturbations=[]_size=[112, 112].npz")
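The file names in lines 15 and 16 appear to encode the settings used when the data set was generated (number of images, perturbations, image size), so they would exist only once the preparation scripts have written them with matching settings. A minimal sketch for checking which `.npz` files are actually present; `find_npz_files` is a hypothetical helper, not part of the repository:

```python
from pathlib import Path


def find_npz_files(dataset_dir):
    """Return the sorted names of all .npz files directly inside dataset_dir."""
    directory = Path(dataset_dir)
    if not directory.is_dir():
        return []  # a missing directory simply yields no files
    return sorted(p.name for p in directory.glob("*.npz"))


if __name__ == "__main__":
    # "../data/" is the datasetDir value quoted in the question.
    for name in find_npz_files("../data/"):
        print(name)
```

If the listed names differ from the ones hard-coded in trainDAN.py, either the strings in the script or the settings in the preparation scripts need to be changed so that they agree.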
Hi,
How large is the model when you train the Deep Alignment Network with MobileNet? And if I have a GPU, can I run your code on it?
Stupid mistake. 0.0
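Hello,
I used your MobileNet version to train DAN from scratch on the dataset generated by the scripts provided in the project. I trained only stage 1 for two days; the loss has stopped decreasing, but the test results are poor. Do you have any experience with training MobileNet that you could share? I previously trained MobileNet from scratch on other tasks, and the results were poor there too.
Thanks!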
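Do you know how to save this model to a pb file? The output tensor name is needed, but I don't know what it is.
Hello, the loss decreases very slowly during training, so I want to change the learning rate. I searched the whole code base but could not find where the learning rate is set. Could you tell me which parameter it is?
As the title says: I have non-face data and want to train keypoint detection on it. How should I go about this?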
There are two lines in testDAN that call format like this:
.format{np.mean(errs)})
which should of course be
.format(np.mean(errs)))
Training set: 300-W indoor
Performance is good in stage 0 and stage 1.
In stage 2 the test error explodes. Has anyone seen something similar?
When I run testDAN.py, there is an error at
nChannels = testSet.imgs.shape[1]
error: tuple index out of range
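`tuple index out of range` on `testSet.imgs.shape[1]` means the image array has fewer than two dimensions, which is what an empty array (shape `(0,)`) looks like when no test images were loaded. A minimal NumPy sketch of a guard that fails with a clearer message; `require_image_batch` is a hypothetical helper, the 4-D layout is an assumption, and the channel count at axis 1 is taken from the quoted line:

```python
import numpy as np


def require_image_batch(imgs):
    """Return the size of axis 1 of a non-empty 4-D image array.

    Raises ValueError with the offending shape instead of an IndexError.
    """
    imgs = np.asarray(imgs)
    if imgs.ndim != 4 or imgs.shape[0] == 0:
        raise ValueError(
            "expected a non-empty 4-D image array, got shape {}".format(imgs.shape)
        )
    return imgs.shape[1]


if __name__ == "__main__":
    print(require_image_batch(np.zeros((2, 1, 112, 112))))
    try:
        require_image_batch(np.array([]))
    except ValueError as exc:
        print(exc)
```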
1. Has this code reached the same accuracy as the Theano version?
2. In the Theano (Lasagne) documentation, batch_norm applied to a layer with a nonlinearity means conv + bn + relu, like this:
>>> from lasagne.layers import InputLayer, DenseLayer, batch_norm
>>> from lasagne.nonlinearities import tanh
>>> l1 = InputLayer((64, 768))
>>> l2 = batch_norm(DenseLayer(l1, num_units=500, nonlinearity=tanh))
This introduces batch normalization right before its nonlinearity:
>>> from lasagne.layers import get_all_layers
>>> [l.__class__.__name__ for l in get_all_layers(l2)]
['InputLayer', 'DenseLayer', 'BatchNormLayer', 'NonlinearityLayer']
So which ordering of bn is right?
def LandmarkImageLayer(Landmarks):
    def draw_landmarks(L):
        def draw_landmarks_helper(Point):
            intLandmark = tf.to_int32(Point)
            locations = Offsets + intLandmark
            dxdy = Point - tf.to_float(intLandmark)
            offsetsSubPix = tf.to_float(Offsets) - dxdy
            vals = 1 / (1 + tf.norm(offsetsSubPix, axis=2))
            img = tf.scatter_nd(locations, vals, shape=(IMGSIZE, IMGSIZE))
            return img
In this function, don't the dimensions of Offsets and intLandmark mismatch?
intLandmark has shape (number of images, 136, 2), while Offsets has shape (16, 16, 2).
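If `draw_landmarks_helper` is applied to one landmark at a time (for example through `tf.map_fn`, which the nested helper functions suggest but the snippet does not show), then `Point` has shape (2,) and broadcasts against the (16, 16, 2) `Offsets` grid. A NumPy sketch of the same arithmetic for a single point; the offset range of -8 to 7 and all names below are assumptions, since the definition of `Offsets` is not part of the snippet:

```python
import numpy as np

HALF_SIZE = 8  # assumed patch half-width, giving a 16 x 16 grid

# offsets[i, j] = (i - 8, j - 8): a 16 x 16 grid of integer 2-D offsets.
offsets = np.stack(
    np.meshgrid(np.arange(-HALF_SIZE, HALF_SIZE),
                np.arange(-HALF_SIZE, HALF_SIZE), indexing="ij"),
    axis=-1,
)


def landmark_patch(point):
    """Return pixel locations and heat values around one (2,) landmark."""
    point = np.asarray(point, dtype=np.float32)
    int_landmark = point.astype(np.int32)            # integer part of the point
    locations = offsets + int_landmark               # (16, 16, 2) + (2,) broadcasts
    dxdy = point - int_landmark.astype(np.float32)   # sub-pixel remainder
    offsets_sub_pix = offsets.astype(np.float32) - dxdy
    vals = 1.0 / (1.0 + np.linalg.norm(offsets_sub_pix, axis=2))
    return locations, vals


if __name__ == "__main__":
    locations, vals = landmark_patch([56.3, 40.7])
    print(locations.shape, vals.shape)
```

Under this reading the shapes only clash if the whole (number of images, 136, 2) tensor is passed to the helper at once rather than point by point.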
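Hello, I found a problem in my experiments. So far I have trained only the first-stage network, which can already predict the landmarks. In the first stage, the results with the mean shape added are somewhat worse than without it. Have you run into this?
Can ResNet be used in place of the VGG16 in the first stage?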
Hi
How is the performance of MobileNet on an iPhone 7?
Thanks
RT
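I did not find weight decay in any of the three implementations. Did I miss something?
After training s1 for 50 epochs and s2 for 50 epochs, the results still fall short of the Theano version. I trained s2 for another 40 epochs with no real change. I would like to continue training s1 to see whether it improves. Can I resume it directly, or do I have to train s1 from scratch?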
challengingSet model 140 0.08934
commonSet.npz model 140 0.04839
theano:
commonSet 0.04287
challengeSet 0.07040
The evaluation metric uses normalization = 'centers'.
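The 'centers' setting is read here as dividing the mean point-to-point error by the distance between the two eye centers. This is an assumption about the evaluation code, which is not quoted in the post. A NumPy sketch under that assumption, using the 68-point annotation where indices 36 to 41 and 42 to 47 outline the two eyes; `centers_normalized_error` is a hypothetical name:

```python
import numpy as np


def centers_normalized_error(gt_landmarks, pred_landmarks):
    """Mean point-to-point error divided by the eye-center distance.

    Both inputs are (68, 2) arrays in the 68-point annotation scheme.
    """
    gt = np.asarray(gt_landmarks, dtype=np.float64)
    pred = np.asarray(pred_landmarks, dtype=np.float64)
    eye_a = gt[36:42].mean(axis=0)   # center of one eye
    eye_b = gt[42:48].mean(axis=0)   # center of the other eye
    norm_dist = np.linalg.norm(eye_a - eye_b)
    point_errors = np.linalg.norm(gt - pred, axis=1)
    return point_errors.mean() / norm_dist
```

Under this reading, a value such as 0.04839 would correspond to an average landmark error of about 4.8% of the eye-center distance.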
Hi,
I get an error when running TestSetPreparation.py:
Traceback (most recent call last):
File "TestSetPreparation.py", line 19, in <module>
commonSet.PrepareData(commonSetImageDirs, commonSetBoundingBoxFiles, meanShape, 0, 1000, False)
File "/home/diffdeep/Documents/train_dan/Deep-Alignment-Network-tensorflow-master-2/DAN-TF/ImageServer.py", line 60, in PrepareData
boundingBoxDict = pickle.load(open(boundingBoxFiles[i], 'rb'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd3 in position 0: ordinal not in range(128)
Please give me some advice. Thanks.
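This error is typical when a pickle written by Python 2 is read under Python 3, because byte strings in the file are decoded as ASCII by default. Passing `encoding='latin1'` to `pickle.load` is the usual workaround. A minimal sketch, assuming the bounding-box files were pickled with Python 2 (the post does not confirm this); `load_py2_pickle` is a hypothetical helper:

```python
import io
import pickle


def load_py2_pickle(file_obj):
    """Load a pickle that may contain Python 2 byte strings.

    encoding='latin1' maps every byte 0-255 to a character, so decoding
    cannot fail the way the default ASCII decoding does.
    """
    return pickle.load(file_obj, encoding="latin1")


# A hand-written protocol-0 pickle of the Python 2 str '\xd3abc',
# used only to reproduce the failure without the real data files.
PY2_PICKLE = b"S'\\xd3abc'\np0\n."

if __name__ == "__main__":
    try:
        pickle.load(io.BytesIO(PY2_PICKLE))  # default encoding is ASCII
    except UnicodeDecodeError as exc:
        print("default load failed:", exc)
    print(repr(load_py2_pickle(io.BytesIO(PY2_PICKLE))))
```

Applied to the line in the traceback, the call would read `pickle.load(open(boundingBoxFiles[i], 'rb'), encoding='latin1')`.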
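Hello, miss ^^.
I am reading through the implementation details, but I don't quite understand why some of the matrix operations are done the way they are,
for example the matrix operations in TransformParamsLayer and AffineTransformLayer.
Could you point me to some reference material? Many thanks. ^^
Almost a year later, I still haven't figured out why STAGE2 cannot be trained after switching to MobileNetV2. Is it due to the characteristics of MobileNet itself?
I cannot find any explanation of face normalization. Can this project normalize faces? Could you provide some details, please?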
Hi,
When I run TrainingSetPreparation.py, I get:
Creating perturbations of 0 shapes
E:\anaconda\lib\site-packages\numpy\core\fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
E:\anaconda\lib\site-packages\numpy\core\_methods.py:80: RuntimeWarning: invalid value encountered in true_divide
ret = ret.dtype.type(ret / rcount)
E:\anaconda\lib\site-packages\numpy\core\_methods.py:135: RuntimeWarning: Degrees of freedom <= 0 for slice
keepdims=keepdims)
E:\anaconda\lib\site-packages\numpy\core\_methods.py:105: RuntimeWarning: invalid value encountered in true_divide
arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
E:\anaconda\lib\site-packages\numpy\core\_methods.py:127: RuntimeWarning: invalid value encountered in true_divide
ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
File "TrainingSetPreparation.py", line 17, in <module>
trainSet.NormalizeImages()  # subtract the mean, divide by the standard deviation
File "F:\code\Deep-Alignment-Network-tensorflow-master\DAN-TF\ImageServer.py", line 219, in NormalizeImages
plt.imshow(meanImg[:,:,0], cmap=plt.cm.gray)
IndexError: invalid index to scalar variable.
Can you help me?
Thank you!
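The line `Creating perturbations of 0 shapes` suggests that no images were loaded, so the mean and standard deviation are computed over an empty array and `meanImg` ends up as a scalar. One likely cause is image directories that do not point at the data, and a quick check before running the preparation script can confirm this. A minimal sketch; `count_images`, the extension list, and the example path are assumptions for illustration:

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed; adjust to the data set


def count_images(image_dirs):
    """Map each directory to the number of image files found directly in it."""
    counts = {}
    for image_dir in image_dirs:
        directory = Path(image_dir)
        if directory.is_dir():
            counts[image_dir] = sum(
                1 for p in directory.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
        else:
            counts[image_dir] = 0  # missing directory: nothing to load
    return counts


if __name__ == "__main__":
    # Hypothetical path; use the directories listed in TrainingSetPreparation.py.
    for image_dir, n in count_images(["../data/images/trainset/"]).items():
        print(image_dir, n)
```

A count of zero for every directory would explain both the warnings and the final IndexError.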
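Hello,
I would like to see the landmark results on a live video stream. The source only has error evaluation on the test sets. Could you give me some pointers on how to test on a video stream? Many thanks.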