
zoom-learn-zoom

Code for CVPR 2019 paper: Zoom to Learn, Learn to Zoom

Project Website | Paper

This paper shows that when applying machine learning to digital zoom for photography, it is beneficial to use real, RAW sensor data for training. This code is based on TensorFlow (tested on v1.13.1) and has been tested on Ubuntu 16.04 LTS.

SR-RAW Dataset

Use SR-RAW

The SR-RAW training and testing sets are now available here.

(If you want to try the model without downloading the full train/test dataset, please see the Quick inference section.)

To download testing dataset (7 GB), run:

bash ./scripts/download.sh 19zlN1fqRRm7E_6i5J3B1OskJocVeuvzG test.zip
unzip test.zip
rm test.zip

We used the 35 mm images (mostly named '00006' in the sequences) for testing.

To download training dataset (58 GB), run:

bash ./scripts/download.sh 1qp6z3F4Ru9srwq1lNZr3pQ4kcVN-AOlM train.zip
unzip train.zip
rm train.zip

Training dataset on Baidu Drive: @llp1996 has kindly uploaded the dataset to Baidu Drive. The key is: wi02. The original issue was opened here.

Try with your own data

Our model is trained on raw data in the Sony Digital Camera Raw format. If you use another raw format, such as the DNG files produced by the iPhone (you can use the Halide app to capture raw on an iPhone), you will need to fine-tune the model with raw data in that format.
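
If you want to experiment with other raw formats before fine-tuning, the sketch below shows one common way to read a raw file and pack the Bayer mosaic into a 4-channel array. It assumes the rawpy package and an RGGB pattern, and the black/white levels are placeholders; the packing actually used by this repo is defined in utils.py and may differ.

# Minimal sketch: read a raw file with rawpy and pack the Bayer mosaic into a
# 4-channel (RGGB) array. Illustrative only; the exact layout expected by this
# repo's model is defined in utils.py and may differ.
import numpy as np
import rawpy

def pack_bayer(path, black_level=512, white_level=16383):
    raw = rawpy.imread(path)
    bayer = raw.raw_image_visible.astype(np.float32)
    # Normalize to [0, 1]; black/white levels here are camera-specific assumptions.
    bayer = np.clip((bayer - black_level) / (white_level - black_level), 0.0, 1.0)
    h, w = bayer.shape[0] // 2 * 2, bayer.shape[1] // 2 * 2
    bayer = bayer[:h, :w]
    # Assume an RGGB pattern; check raw.raw_pattern for the actual layout.
    packed = np.stack([bayer[0::2, 0::2],   # R
                       bayer[0::2, 1::2],   # G1
                       bayer[1::2, 0::2],   # G2
                       bayer[1::2, 1::2]],  # B
                      axis=-1)
    return packed  # shape (H/2, W/2, 4)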

Quick inference

The following commands download the pre-trained model and example raw data, then run inference:

git clone https://github.com/ceciliavision/zoom-learn-zoom.git
cd zoom-learn-zoom
bash ./scripts/download.sh 1iForbFhhWqrq22FA1xIusfUpdi8td4Kq model.zip
unzip model.zip
bash ./scripts/download.sh 1WVSGaKIJVHwphTKhcr9ajolEnBh3aUkR quick_inference.zip
unzip quick_inference.zip
rm *.zip
python3 inference.py

Notes about config/inference.yaml

  • To do inference on a folder, set mode to inference and set inference_root (e.g. ./quick_inference/)
  • To do inference on a single image, set mode to inference_single and set inference_path (e.g. ./quick_inference/00134.ARW)
  • Set task_folder (e.g. ./restore_4x)
  • Results are saved in ./[task_folder]/[mode]
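
Putting the options above together, here is a rough sketch of how the config could be read and resolved. It assumes PyYAML; only the key names come from the notes above, and the surrounding logic is illustrative rather than the repo's inference.py.

# Rough sketch of reading config/inference.yaml; key names follow the notes
# above, the rest (flow, file pattern) is illustrative, not the repo's code.
import os, glob, yaml

with open('config/inference.yaml') as f:
    cfg = yaml.safe_load(f)

mode = cfg['mode']                                  # 'inference' or 'inference_single'
out_dir = os.path.join(cfg['task_folder'], mode)    # e.g. ./restore_4x/inference
os.makedirs(out_dir, exist_ok=True)

if mode == 'inference':
    # assumed file pattern for the example data
    inputs = sorted(glob.glob(os.path.join(cfg['inference_root'], '*.ARW')))
else:  # 'inference_single'
    inputs = [cfg['inference_path']]                # e.g. ./quick_inference/00134.ARW

for path in inputs:
    print('would run the model on', path, '-> results in', out_dir)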

Training

CoBi loss

The implementation of the CoBi loss presented in the paper can be found in the ./CX directory. It is modified from the original contextual loss implementation. Refer to ./loss.py to see how it is used. The full training pipeline is under preparation and will be released around October.
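
For intuition, the NumPy sketch below illustrates the CoBi idea from the paper: each source feature is matched to the target feature that minimizes a combined feature (cosine) and spatial distance, and the matched distances are averaged. This is only an illustration; the TensorFlow implementation in ./CX and ./loss.py differs in details such as normalization, bandwidth, and patch extraction.

# Rough NumPy sketch of CoBi: for each feature p_i, find the target feature q_j
# minimizing D(p_i, q_j) + w_s * D'(i, j), where D is cosine distance and D' is
# normalized spatial distance, then average over i. Illustrative only.
import numpy as np

def cobi_loss(p_feats, q_feats, p_xy, q_xy, w_spatial=0.1):
    # p_feats, q_feats: (N, C)/(M, C) feature vectors; p_xy, q_xy: coordinates in [0, 1].
    p = p_feats / (np.linalg.norm(p_feats, axis=1, keepdims=True) + 1e-8)
    q = q_feats / (np.linalg.norm(q_feats, axis=1, keepdims=True) + 1e-8)
    d_feat = 1.0 - p @ q.T                                        # cosine distance, (N, M)
    d_spatial = np.linalg.norm(p_xy[:, None, :] - q_xy[None, :, :], axis=-1)
    d = d_feat + w_spatial * d_spatial
    return d.min(axis=1).mean()                                   # average nearest-neighbor distance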

Data Pre-processing

We provide alignment functions and scripts to account for hand motion when capturing the dataset. This is an optional step, as CoBi loss does not require pixel-wise aligned data pairs for training. However, we notice that having a preliminary (imprecise) alignment step leads to faster convergence. In summary, we provide:

  • ./scripts/run_align.sh is the script that calls ./main_crop.py and ./main_align_camera.py, which first aligns field of view and then accounts for hand motion misalignment among images
  • ./scripts/run_wb.sh is the script that calls ./main_wb.py to compute white balance applied to the processed images in the camera ISP

To run these scripts, fill in [TRAIN_PATH] with your local training data path and [TEST_PATH] with your local test data path. If you use your own collected data for training, you should either follow our data directory structure or modify these scripts.

bash ./scripts/run_align.sh [TRAIN_PATH]
bash ./scripts/run_wb.sh [TRAIN_PATH]
bash ./scripts/run_align.sh [TEST_PATH]
bash ./scripts/run_wb.sh [TEST_PATH]

After running these scripts, you can use the tform.txt and wb.txt inside each sequence during training. The folders called ./cropped, ./compare and ./aligned are only saved for visualization.
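
As an illustration only, the hypothetical sketch below shows how per-sequence white balance gains could be loaded and applied to a packed raw patch during training. The wb.txt line format assumed here (image name followed by four RGGB gains) is a guess; inspect the files written by ./main_wb.py for the actual layout before reusing this.

# Hypothetical sketch of consuming a sequence's wb.txt at training time.
# It assumes each line looks like "<image_name> r_gain g1_gain g2_gain b_gain";
# check the files produced by ./main_wb.py for the real layout.
import numpy as np

def load_wb(wb_path):
    gains = {}
    with open(wb_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 5:
                gains[parts[0]] = np.array([float(v) for v in parts[1:5]], np.float32)
    return gains

def apply_wb(packed_raw, gain):
    # packed_raw: (H, W, 4) RGGB in [0, 1]; gain: (4,) per-channel multipliers
    return np.clip(packed_raw * gain[None, None, :], 0.0, 1.0)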

[Update 05/15/2020] Preparing RAW-RGB pair demo

We added a documented Jupyter notebook, demo_train_patch_pair.ipynb, that walks you through the process of preparing RAW-RGB pairs. A few functions were added to utils.py; please check out the newest commit.
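
For orientation, here is a hypothetical sketch of cutting an aligned RAW-RGB patch pair. It assumes a packed raw patch of size p corresponds to an RGB ground-truth patch of size 2 * zoom * p (since packing halves the Bayer resolution); the notebook and utils.py define the correspondence actually used in this repo.

# Hypothetical sketch of cropping an aligned RAW-RGB training pair. The scale
# relation (2 * zoom) is an assumption; see demo_train_patch_pair.ipynb and
# utils.py for the exact correspondence.
import numpy as np

def crop_pair(packed_raw, rgb_gt, patch_size=64, zoom=4, rng=np.random):
    h, w, _ = packed_raw.shape
    y = rng.randint(0, h - patch_size + 1)
    x = rng.randint(0, w - patch_size + 1)
    raw_patch = packed_raw[y:y + patch_size, x:x + patch_size, :]
    s = 2 * zoom
    rgb_patch = rgb_gt[y * s:(y + patch_size) * s, x * s:(x + patch_size) * s, :]
    return raw_patch, rgb_patch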

Citation

If you find this work useful for your research, please cite:

@inproceedings{zhang2019zoom,
  title={Zoom to Learn, Learn to Zoom},
  author={Zhang, Xuaner and Chen, Qifeng and Ng, Ren and Koltun, Vladlen},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Contact

Please contact me if you have any questions (Cecilia Zhang, [email protected]).

zoom-learn-zoom's Issues

Queries about the dataset

Hi, thank you for sharing the code and the dataset.

Could you please tell me how many LR-HR image pairs there are in total in your dataset?

What scale factors are used in this dataset?

Thanks.

About the ratio of SR-RAW dataset

Thanks for sharing your great work! I have some questions about the SR-RAW dataset.

(1) May I know the corresponding ratio of each data pair?
For example, ./test/00134 contains 7 images. Do they correspond to the focal lengths in Fig. 2 of the paper? In Fig. 2 there are two 70 mm focal length settings; is there any difference between them? And for the folders with only 6 images, what are their settings (focal length, etc.)?

(2) Additionally, is there any description of the SR-RAW dataset (folder structure and file descriptions) other than the paper?

Really appreciate your help!
Best

Is the spatial loss calculation wrong?

In CX_helper.py, lines 13-14, you randomly sample patches at different locations across the whole image, but in CSFlow.py, line 125, you use meshgrid to describe the spatial distance. Is there something wrong with the spatial loss computation?

Question about CoBiRGB loss

Thanks for your work!
1. Does compute_patch_contextual_loss in loss.py correspond to CoBiRGB in your paper?
2. Is CoBiRGB a kind of pixel-level loss, like L1 in EDSR and RDN, but computing the cosine distance?

issue about calculating CoBi spatial loss

I found something strange in the CoBi loss implementation: in CSFlow.py, line 126, why do you use the same variable "features_grid" twice to compute the spatial coordinate loss?

An error occurs when downloading dataset

Hi,

Thanks for open sourcing your code and dataset.

I got an error when downloading the train.zip file with your script.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 388 0 388 0 0 1001 0 --:--:-- --:--:-- --:--:-- 1000
100 9.9G 0 9.9G 0 0 43.2M 0 --:--:-- 0:03:56 --:--:-- 33.9M
curl: (56) GnuTLS recv error (-9): A TLS packet with unexpected length was received.

I also tried downloading from Google Drive in a Windows 7 environment, but I got a network error there too.

By the way, I can download test.zip successfully.

Thanks.

Training Code Not Found

Hi,
I am not able to find the code that performs backpropagation. There are multiple losses in loss.py and I want to understand how to use them. Can you please provide the training/backpropagation code or explain how to use loss.py to run backpropagation?

Thanks

Have you aligned the training dataset?

Dear author, I found the aligned images in the test dataset; all the images are center-cropped and share the same size. But you have not provided aligned images for the training set. Could you release them, or explain how to produce the aligned images? Could I get them by running utils_align.py?

Thank you very much.

issue about CoBiRGB

When you introduce CoBiRGB, you write: "where we use n×n RGB patches as features for CoBiRGB, and n should be larger for the 8X zoom (optimal n = 15) than the 4X zoom model (optimal n = 10)".
I cannot understand the relation between zoom and n.
1. What is the relation between zoom and n?
2. Why use patches as the feature rather than single RGB values?
3. For 1X zoom, how should I set n?
Can you please tell me?

Question about training

I can't find the training code in the repository; could you provide your training code? Thanks.

Also, could I upload the training data to Baidu Drive in China?

About PSNR, SSIM and LPIPS on the released model

I have tried many methods to reproduce the metrics with the released model, but the results are still different from the metrics in the paper.

Can someone give me some advice?
Thanks in advance.

Generalization and calculating PSNR

Thanks for your great work!
I wonder whether there will be a reddish color cast when testing on raw files from other cameras.
Also, is the CoBi loss used when calculating PSNR?
I am looking forward to your reply! Thanks.

Dataset and code licence

I didn't find a license for either the dataset or the code. Would you please provide one, especially for the dataset? Thanks in advance.

Where is your train.py?

It is already 2024, and your train.py file still has not been released. I suppose you have forgotten about it...

Can't make inference work on CR2 or DNG raw data

When I try to run inference on different raw data such as CR2, it returns:

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Is there something I can do to fix this?

Best
Bruno

Question about released models

Hi,

First, thanks a lot for sharing your code and dataset! Here I just have two questions about your released model:

I checked its architecture via Tensorboard, and it seems to be an 8x but not 4x model. Is that correct?

Do you plan to release your 4x model as well? I would really appreciate it if you do so!

Many thanks again!

issue about calculating psnr

Thanks for your great work !
I wonder how the PSNR value in your paper was calculated, because I found that the output RGB image cannot be perfectly aligned with the HR image, so I always get a poor PSNR of around 16 dB even with the released model, although the visual quality is fine.
So I would like to know whether the data were perfectly aligned during training and testing, or whether you re-aligned the output and the ground truth before computing PSNR.

about train code

Hi, thank you for your excellent work.
Can you please provide the training code? I want to study it.

the sky is dirty issue

When I use CoBiVGG to train on my own data, the sky looks very dirty, with dark lines and dark borders. Do you know why this occurs? Can you tell me the possible reasons?

iPhoneX-DSLR data

Hi,
is it possible to share the iPhoneX-DSLR data and pre-processing code (alignment, cropping)?
I know it's only a small dataset, but it would really help me.
Thanks in advance.
Ofer
