feynman1999 / mai-vsr-diggers
Diggers solution of Mobile AI 2021 Real-Time Video Super-Resolution Challenge
2022-08-10 22:49:58,911 - edit - INFO - training gpus num: 2
2022-08-10 22:49:58,912 - edit - INFO - init distributed process group 0 / 2
2022-08-10 22:49:58,915 - edit - INFO - init distributed process group 1 / 2
2022-08-10 23:16:08,691 - rank0_edit - INFO - SRManyToManyDataset dataset load ok, mode: train len:24000
2022-08-10 23:16:08,691 - rank0_edit - INFO - use repeatdataset, repeat times: 1
2022-08-10 23:16:08,694 - rank0_edit - INFO - model: BasicVSR_v5 's total parameter nums: 23371
2022-08-10 23:16:08,698 - rank0_edit - INFO - syncing the model's parameters...
2022-08-10 23:16:08,991 - rank0_edit - INFO - SRManyToManyDataset dataset load ok, mode: eval len:3000
2022-08-10 23:16:08,992 - rank0_edit - INFO - 1500 iters for one epoch, trained iters: 0, total iters: 600000
2022-08-10 23:16:08,992 - rank0_edit - INFO - Start running, work_dir: ./workdirs/mai_training/20220810_224958, workflow: train, max epochs : 400
2022-08-10 23:16:08,992 - rank0_edit - INFO - registered hooks: [<edit.core.hook.logger.text.TextLoggerHook object at 0x7f89e5a0a290>, <edit.core.hook.checkpoint.checkpoint.CheckpointHook object at 0x7f89e5a0a2d0>, <edit.core.hook.evaluation.eval_hooks.EvalIterHook object at 0x7f89e7f49750>]
2022-08-10 23:16:25,548 - rank0_edit - INFO - epoch: 0, losses: [0.00301], losses_ma: [0.00301], iter: 4
2022-08-10 23:16:34,997 - rank0_edit - INFO - epoch: 0, losses: [0.00328], losses_ma: [0.00314], iter: 9
2022-08-10 23:16:44,411 - rank0_edit - INFO - epoch: 0, losses: [0.00399], losses_ma: [0.00343], iter: 14
2022-08-10 23:16:52,555 - rank0_edit - INFO - epoch: 0, losses: [0.00506], losses_ma: [0.00383], iter: 19
2022-08-10 23:17:02,273 - rank0_edit - INFO - epoch: 0, losses: [0.00280], losses_ma: [0.00363], iter: 24
2022-08-10 23:17:10,873 - rank0_edit - INFO - epoch: 0, losses: [0.00386], losses_ma: [0.00367], iter: 29
2022-08-10 23:17:19,736 - rank0_edit - INFO - epoch: 0, losses: [0.00313], losses_ma: [0.00359], iter: 34
2022-08-10 23:17:28,804 - rank0_edit - INFO - epoch: 0, losses: [0.00358], losses_ma: [0.00359], iter: 39
2022-08-10 23:17:38,345 - rank0_edit - INFO - epoch: 0, losses: [0.00376], losses_ma: [0.00361], iter: 44
2022-08-10 23:17:46,827 - rank0_edit - INFO - epoch: 0, losses: [0.00311], losses_ma: [0.00356], iter: 49
2022-08-10 23:17:55,623 - rank0_edit - INFO - epoch: 0, losses: [0.00406], losses_ma: [0.00360], iter: 54
2022-08-10 23:18:04,857 - rank0_edit - INFO - epoch: 0, losses: [0.00326], losses_ma: [0.00357], iter: 59
2022-08-10 23:18:14,276 - rank0_edit - INFO - epoch: 0, losses: [0.00368], losses_ma: [0.00358], iter: 64
2022-08-10 23:18:23,079 - rank0_edit - INFO - epoch: 0, losses: [0.00392], losses_ma: [0.00361], iter: 69
2022-08-10 23:18:31,822 - rank0_edit - INFO - epoch: 0, losses: [0.00393], losses_ma: [0.00363], iter: 74
2022-08-10 23:18:40,337 - rank0_edit - INFO - epoch: 0, losses: [0.00429], losses_ma: [0.00367], iter: 79
2022-08-10 23:18:50,203 - rank0_edit - INFO - epoch: 0, losses: [0.00378], losses_ma: [0.00368], iter: 84
2022-08-10 23:18:58,560 - rank0_edit - INFO - epoch: 0, losses: [0.00336], losses_ma: [0.00366], iter: 89
2022-08-10 23:19:08,135 - rank0_edit - INFO - epoch: 0, losses: [0.00406], losses_ma: [0.00368], iter: 94
2022-08-10 23:19:16,653 - rank0_edit - INFO - epoch: 0, losses: [0.00396], losses_ma: [0.00369], iter: 99
2022-08-10 23:19:26,492 - rank0_edit - INFO - epoch: 0, losses: [0.00315], losses_ma: [0.00367], iter: 104
2022-08-10 23:19:34,738 - rank0_edit - INFO - epoch: 0, losses: [0.00389], losses_ma: [0.00368], iter: 109
2022-08-10 23:19:43,368 - rank0_edit - INFO - epoch: 0, losses: [0.00297], losses_ma: [0.00365], iter: 114
2022-08-10 23:19:52,057 - rank0_edit - INFO - epoch: 0, losses: [0.00436], losses_ma: [0.00368], iter: 119
2022-08-10 23:20:02,484 - rank0_edit - INFO - epoch: 0, losses: [0.00338], losses_ma: [0.00367], iter: 124
2022-08-10 23:20:11,059 - rank0_edit - INFO - epoch: 0, losses: [0.00386], losses_ma: [0.00367], iter: 129
2022-08-10 23:20:20,270 - rank0_edit - INFO - epoch: 0, losses: [651753408617431171072.00000], losses_ma: [24139015133978931200.00000], iter: 134
2022-08-10 23:20:28,060 - rank0_edit - INFO - epoch: 0, losses: [nan], losses_ma: [nan], iter: 139
The loss starts low and then blows up to NaN (see iter 134, where it jumps to ~6.5e20, and iter 139 above).
For training I used:
python tools/train.py configs/restorers/BasicVSR/mai.py --gpuids 0,1 -d
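That jump from ~0.004 to ~6.5e20 at iter 134 looks like a gradient explosion rather than bad data. I can't tell from the log whether the mai.py config already clips gradients, but capping the gradient norm is the usual first remedy. A minimal MegEngine sketch (the framework behind the `edit` logs above), with a hypothetical stand-in model, data, and max_norm; this is not the repo's actual training loop:

import numpy as np
import megengine as mge
import megengine.functional as F
import megengine.module as M
import megengine.optimizer as optim
from megengine.autodiff import GradManager

model = M.Conv2d(3, 3, 3, padding=1)           # hypothetical stand-in for BasicVSR_v5
opt = optim.Adam(model.parameters(), lr=1e-4)
gm = GradManager().attach(model.parameters())

inp = mge.tensor(np.random.rand(1, 3, 64, 64).astype('float32'))
target = mge.tensor(np.random.rand(1, 3, 64, 64).astype('float32'))

with gm:
    loss = F.abs(model(inp) - target).mean()   # L1-style loss
    gm.backward(loss)
optim.clip_grad_norm(model.parameters(), max_norm=1.0)  # cap gradients before stepping
opt.step()
opt.clear_grad()

Lowering the learning rate is the other usual knob if clipping alone doesn't stabilize it.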
Hi, is this the same VSR model used in the AI Benchmark app?
How does this model compare with ESRGAN if we use that to upscale every single frame, in terms of runtime vs. output quality?
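The runtime half of that comparison is easy to measure directly on the released model; a minimal sketch, assuming the model_none.tflite checkpoint shipped in this repo and its (1, 180, 320, 30) ten-frame input:

import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='./MAI-VSR-Diggers/ckpt/model_none.tflite')
input_details = interpreter.get_input_details()
interpreter.resize_tensor_input(input_details[0]['index'], [1, 180, 320, 30], strict=False)
interpreter.allocate_tensors()

dummy = np.random.rand(1, 180, 320, 30).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()  # warm-up run

runs = 10
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(input_details[0]['index'], dummy)
    interpreter.invoke()
print(f'{(time.perf_counter() - start) / runs * 1000:.1f} ms per 10-frame clip')

Timing ESRGAN per frame the same way gives the runtime side; output quality still needs a PSNR/SSIM comparison on the same clips.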
Hello,
First, I would like to appreciate your work.
Actually, I have been trying to use this model in a custom project. I tried to obtain a single-frame result from model_none.tflite, but it takes 10 frames as input, which I provided. However, the result does not look good. Can you tell me why?
I have used the following:
import glob
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

all_frames = sorted(glob.glob('/content/drive/MyDrive/Mobile_communication/val_sharp_bicubic/X4/000/*.png'))
read_1 = tf.io.read_file(all_frames[0])
read_1 = tf.image.decode_png(read_1, channels=3)  # the frames are PNGs, not JPEGs
# Dummy 3-channel slab so tf.concat has something to append to; sliced off below.
stacked = tf.Variable(np.empty((1, read_1.shape[0], read_1.shape[1], read_1.shape[2]), dtype=np.float32))
for test_img_path in all_frames:
    lr1 = tf.io.read_file(test_img_path)
    lr = tf.image.decode_png(lr1, channels=3)
    lr = tf.expand_dims(lr, axis=0)
    lr = tf.cast(lr, tf.float32)
    stacked = tf.concat([stacked, lr], axis=-1)
stacked = stacked[:, :, :, 3:]  # drop the dummy slab
print(stacked.shape)
frames_10 = stacked[:, :, :, :30]  # first 10 frames x 3 channels -> (1, H, W, 30)
vsr_model_path = './MAI-VSR-Diggers/ckpt/model_none.tflite'
#vsr_model_path = './MAI-VSR-Diggers/ckpt/model.tflite'

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path=vsr_model_path)

# Get input and output tensor details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details, '\n', output_details)

# Run the model on the 10 stacked frames, shape (1, 180, 320, 30).
interpreter.resize_tensor_input(input_details[0]['index'], [1, 180, 320, 30], strict=False)
interpreter.allocate_tensors()
interpreter.set_tensor(input_details[0]['index'], frames_10.numpy())  # was `in_frames`, which is undefined
interpreter.invoke()

# Extract the output and postprocess it.
output_data = interpreter.get_tensor(output_details[0]['index'])
vsr = tf.squeeze(output_data, axis=0)
print(vsr.shape)
frame_1 = stacked[:, :, :, :3]  # first LR frame, for comparison
lr = tf.cast(tf.squeeze(frame_1, axis=0), tf.uint8)
print(lr.shape)

plt.figure(figsize=(5, 6))
plt.title('LR')
plt.imshow(lr.numpy())

tensor = vsr[:, :, :3]  # first output frame
tensor = tf.clip_by_value(tensor / 255.0, 0.0, 1.0)  # imshow wants floats in [0, 1]
print(tensor.shape)

plt.figure(figsize=(25, 15))
plt.subplot(1, 2, 1)
plt.title('VSR (x4)')
plt.imshow(tensor.numpy())

bicubic = tf.image.resize(lr, [720, 1280], tf.image.ResizeMethod.BICUBIC)
bicubic = tf.cast(bicubic, tf.uint8)
plt.subplot(1, 2, 2)
plt.title('Bicubic')
plt.imshow(bicubic.numpy())
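As for why it "does not look good": before blaming the model, it helps to quantify against the ground truth; if the SR output beats bicubic on PSNR, the network is fine and only the display/normalization is off. A sketch, where the val_sharp path and frame name are assumptions about your local dataset layout:

# Hypothetical ground-truth path; adjust to your val_sharp layout.
gt_path = '/content/drive/MyDrive/Mobile_communication/val_sharp/000/00000000.png'
gt = tf.cast(tf.image.decode_png(tf.io.read_file(gt_path), channels=3), tf.float32)

sr = tf.clip_by_value(vsr[:, :, :3], 0.0, 255.0)   # first SR output frame
bic = tf.cast(bicubic, tf.float32)

print('PSNR SR vs GT     :', tf.image.psnr(sr, gt, max_val=255.0).numpy())
print('PSNR bicubic vs GT:', tf.image.psnr(bic, gt, max_val=255.0).numpy())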
Nice work.
May I know the TensorFlow version used for the TFLite models converted from MegEngine?
Thanks.
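Until that's answered, an empirical check is to load the model under your own runtime; allocate_tensors() raises if the file uses ops or op versions your TensorFlow build is too old for:

import tensorflow as tf

print('runtime TF version:', tf.__version__)
interpreter = tf.lite.Interpreter(model_path='./MAI-VSR-Diggers/ckpt/model_none.tflite')
interpreter.allocate_tensors()  # fails here if the runtime lacks the required ops
print('model loads under this runtime')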