
zoom-learn-zoom

Code for CVPR 2019 paper: Zoom to Learn, Learn to Zoom

Project Website | Paper

This paper shows that when applying machine learning to digital zoom for photography, it is beneficial to use real, RAW sensor data for training. This code is based on TensorFlow (tested on v1.13.1) and has been tested on Ubuntu 16.04 LTS.

SR-RAW Dataset

Use SR-RAW

The SR-RAW training and testing sets are now available here.

(If you want to try the model without downloading the full train/test dataset, please see the Quick inference section.)

To download testing dataset (7 GB), run:

bash ./scripts/download.sh 19zlN1fqRRm7E_6i5J3B1OskJocVeuvzG test.zip
unzip test.zip
rm test.zip

We used the 35 mm images (mostly named '00006' in the sequences) for testing.

To download training dataset (58 GB), run:

bash ./scripts/download.sh 1qp6z3F4Ru9srwq1lNZr3pQ4kcVN-AOlM train.zip
unzip train.zip
rm train.zip

Training dataset on Baidu Drive: @llp1996 has kindly uploaded the dataset to Baidu Drive. The key is: wi02. The original issue was opened here.

Try with your own data

Our model is trained on raw data in the Sony Digital Camera Raw format. If you use another raw format, such as the DNG files produced by the iPhone (you can use the Halide app to capture raw on an iPhone), you will need to fine-tune the model with raw data in that format.
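
If you want to experiment with other raw formats before fine-tuning, the sketch below shows one common way to read a raw file and pack the Bayer mosaic into a 4-channel array. It assumes the rawpy package and an RGGB pattern, and the black/white levels are placeholders; the packing actually used by this repo is defined in utils.py and may differ.

# Minimal sketch: read a raw file with rawpy and pack the Bayer mosaic into a
# 4-channel (RGGB) array. Illustrative only; the exact layout expected by this
# repo's model is defined in utils.py and may differ.
import numpy as np
import rawpy

def pack_bayer(path, black_level=512, white_level=16383):
    raw = rawpy.imread(path)
    bayer = raw.raw_image_visible.astype(np.float32)
    # Normalize to [0, 1]; black/white levels here are camera-specific assumptions.
    bayer = np.clip((bayer - black_level) / (white_level - black_level), 0.0, 1.0)
    h, w = bayer.shape[0] // 2 * 2, bayer.shape[1] // 2 * 2
    bayer = bayer[:h, :w]
    # Assume an RGGB pattern; check raw.raw_pattern for the actual layout.
    packed = np.stack([bayer[0::2, 0::2],   # R
                       bayer[0::2, 1::2],   # G1
                       bayer[1::2, 0::2],   # G2
                       bayer[1::2, 1::2]],  # B
                      axis=-1)
    return packed  # shape (H/2, W/2, 4)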

Quick inference

The following commands download the pre-trained model and example raw data, then run inference:

git clone https://github.com/ceciliavision/zoom-learn-zoom.git
cd zoom-learn-zoom
bash ./scripts/download.sh 1iForbFhhWqrq22FA1xIusfUpdi8td4Kq model.zip
unzip model.zip
bash ./scripts/download.sh 1WVSGaKIJVHwphTKhcr9ajolEnBh3aUkR quick_inference.zip
unzip quick_inference.zip
rm *.zip
python3 inference.py

Notes about config/inference.yaml

  • To do inference on a folder, set mode to inference and set inference_root (e.g. ./quick_inference/)
  • To do inference on a single image, set mode to inference_single and set inference_path (e.g. ./quick_inference/00134.ARW)
  • Set task_folder (e.g. ./restore_4x)
  • Results are saved in ./[task_folder]/[mode]
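
Putting the options above together, here is a rough sketch of how the config could be read and resolved. It assumes PyYAML; only the key names come from the notes above, and the surrounding logic is illustrative rather than the repo's inference.py.

# Rough sketch of reading config/inference.yaml; key names follow the notes
# above, the rest (flow, file pattern) is illustrative, not the repo's code.
import os, glob, yaml

with open('config/inference.yaml') as f:
    cfg = yaml.safe_load(f)

mode = cfg['mode']                                  # 'inference' or 'inference_single'
out_dir = os.path.join(cfg['task_folder'], mode)    # e.g. ./restore_4x/inference
os.makedirs(out_dir, exist_ok=True)

if mode == 'inference':
    # assumed file pattern for the example data
    inputs = sorted(glob.glob(os.path.join(cfg['inference_root'], '*.ARW')))
else:  # 'inference_single'
    inputs = [cfg['inference_path']]                # e.g. ./quick_inference/00134.ARW

for path in inputs:
    print('would run the model on', path, '-> results in', out_dir)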

Training

CoBi loss

The implementation of the CoBi loss presented in the paper can be found in the ./CX directory. It is modified from the original contextual loss implementation. Refer to ./loss.py to see how it is used. The full training pipeline is under preparation and will be released around October.
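
For intuition, the NumPy sketch below illustrates the CoBi idea from the paper: each source feature is matched to the target feature that minimizes a combined feature (cosine) and spatial distance, and the matched distances are averaged. This is only an illustration; the TensorFlow implementation in ./CX and ./loss.py differs in details such as normalization, bandwidth, and patch extraction.

# Rough NumPy sketch of CoBi: for each feature p_i, find the target feature q_j
# minimizing D(p_i, q_j) + w_s * D'(i, j), where D is cosine distance and D' is
# normalized spatial distance, then average over i. Illustrative only.
import numpy as np

def cobi_loss(p_feats, q_feats, p_xy, q_xy, w_spatial=0.1):
    # p_feats, q_feats: (N, C)/(M, C) feature vectors; p_xy, q_xy: coordinates in [0, 1].
    p = p_feats / (np.linalg.norm(p_feats, axis=1, keepdims=True) + 1e-8)
    q = q_feats / (np.linalg.norm(q_feats, axis=1, keepdims=True) + 1e-8)
    d_feat = 1.0 - p @ q.T                                        # cosine distance, (N, M)
    d_spatial = np.linalg.norm(p_xy[:, None, :] - q_xy[None, :, :], axis=-1)
    d = d_feat + w_spatial * d_spatial
    return d.min(axis=1).mean()                                   # average nearest-neighbor distance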

Data Pre-processing

We provide alignment functions and scripts to account for hand motion when capturing the dataset. This is an optional step, as CoBi loss does not require pixel-wise aligned data pairs for training. However, we notice that having a preliminary (imprecise) alignment step leads to faster convergence. In summary, we provide:

  • ./scripts/run_align.sh is the script that calls ./main_crop.py and ./main_align_camera.py, which first aligns field of view and then accounts for hand motion misalignment among images
  • ./scripts/run_wb.sh is the script that calls ./main_wb.py to compute white balance applied to the processed images in the camera ISP

To run these scripts, fill in [TRAIN_PATH] with your local training data path and [TEST_PATH] with your local test data path. If you use your own collected data for training, you should either follow our data directory structure or modify these scripts.

bash ./scripts/run_align.sh [TRAIN_PATH]
bash ./scripts/run_wb.sh [TRAIN_PATH]
bash ./scripts/run_align.sh [TEST_PATH]
bash ./scripts/run_wb.sh [TEST_PATH]

After running these scripts, you can use the tform.txt and wb.txt inside each sequence during training. The folders called ./cropped, ./compare and ./aligned are only saved for visualization.
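
As an illustration only, the hypothetical sketch below shows how per-sequence white balance gains could be loaded and applied to a packed raw patch during training. The wb.txt line format assumed here (image name followed by four RGGB gains) is a guess; inspect the files written by ./main_wb.py for the actual layout before reusing this.

# Hypothetical sketch of consuming a sequence's wb.txt at training time.
# It assumes each line looks like "<image_name> r_gain g1_gain g2_gain b_gain";
# check the files produced by ./main_wb.py for the real layout.
import numpy as np

def load_wb(wb_path):
    gains = {}
    with open(wb_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 5:
                gains[parts[0]] = np.array([float(v) for v in parts[1:5]], np.float32)
    return gains

def apply_wb(packed_raw, gain):
    # packed_raw: (H, W, 4) RGGB in [0, 1]; gain: (4,) per-channel multipliers
    return np.clip(packed_raw * gain[None, None, :], 0.0, 1.0)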

[Update 05/15/2020] Preparing RAW-RGB pair demo

We added a documented Jupyter notebook, demo_train_patch_pair.ipynb, that walks you through the process of preparing RAW-RGB pairs. A few functions were added to utils.py; please check out the newest commit.
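
For orientation, here is a hypothetical sketch of cutting an aligned RAW-RGB patch pair. It assumes a packed raw patch of size p corresponds to an RGB ground-truth patch of size 2 * zoom * p (since packing halves the Bayer resolution); the notebook and utils.py define the correspondence actually used in this repo.

# Hypothetical sketch of cropping an aligned RAW-RGB training pair. The scale
# relation (2 * zoom) is an assumption; see demo_train_patch_pair.ipynb and
# utils.py for the exact correspondence.
import numpy as np

def crop_pair(packed_raw, rgb_gt, patch_size=64, zoom=4, rng=np.random):
    h, w, _ = packed_raw.shape
    y = rng.randint(0, h - patch_size + 1)
    x = rng.randint(0, w - patch_size + 1)
    raw_patch = packed_raw[y:y + patch_size, x:x + patch_size, :]
    s = 2 * zoom
    rgb_patch = rgb_gt[y * s:(y + patch_size) * s, x * s:(x + patch_size) * s, :]
    return raw_patch, rgb_patch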

Citation

If you find this work useful for your research, please cite:

@inproceedings{zhang2019zoom,
  title={Zoom to Learn, Learn to Zoom},
  author={Zhang, Xuaner and Chen, Qifeng and Ng, Ren and Koltun, Vladlen},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Contact

Please contact me if you have any questions (Cecilia Zhang, [email protected]).

zoom-learn-zoom's Issues

Queries about the dataset

Hi, thank you for sharing the code and the dataset.

Could you please tell me how many LR-HR image pairs there are in total in your dataset?

What scale factors are used in this dataset?

Thanks.

About the ratio of SR-RAW dataset

Thanks for sharing your great work! I have some questions about the SR-RAW dataset.

(1) May I know the corresponding ratio of each data pair?
For example, ./test/00134 contains 7 images. Do they correspond to the focal lengths in Fig. 2 of the paper? In Fig. 2 there are two 70 mm focal length settings; is there any difference between them? And for the folders with only 6 images, what are their settings (focal length, etc.)?

(2) Additionally, is there any description of the SR-RAW dataset (folder structure and file descriptions) other than the paper?

Really appreciate your help!
Best

Is the spatial loss calculation wrong?

In CX_helper.py, lines 13-14, you randomly sample patches at different locations across the whole image, but in CSFlow.py, line 125, you use meshgrid to describe the spatial distance. Is there something wrong with the spatial loss computation?

Question about CoBiRGB loss

Thanks for your work!
1. Does compute_patch_contextual_loss in loss.py correspond to CoBiRGB in your paper?
2. Is CoBiRGB a kind of pixel-level loss, like L1 in EDSR and RDN, but computing the cosine distance?

issue about calculating CoBi spatial loss

I found something strange in the CoBi loss implementation: in CSFlow.py, line 126, why do you use the same variable "features_grid" twice to compute the spatial coordinate loss?

An error occurs when downloading dataset

Hi,

Thanks for open sourcing your code and dataset.

I got an error when downloading the train.zip file with your script.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 388 0 388 0 0 1001 0 --:--:-- --:--:-- --:--:-- 1000
100 9.9G 0 9.9G 0 0 43.2M 0 --:--:-- 0:03:56 --:--:-- 33.9M
curl: (56) GnuTLS recv error (-9): A TLS packet with unexpected length was received.

I also tried downloading from Google Drive in a Windows 7 environment, but I got a network error there too.

By the way, I can download test.zip successfully.

Thanks.

Training Code Not Found

Hi,
I am not able to find the code that performs backpropagation. There are multiple losses in loss.py and I want to understand how to use them. Can you please provide the training/backpropagation code or explain how to use loss.py to run backpropagation?

Thanks

Have you aligned the training dataset?

Dear author, I found the aligned images in the test dataset; all the images are center-cropped and share the same size. But you have not provided aligned images for the training set. Could you release them, or explain how to produce the aligned images? Could I get them by running utils_align.py?

Thank you very much.

issue about CoBiRGB

When you introduce CoBiRGB, you write: "where we use n×n RGB patches as features for CoBiRGB, and n should be larger for the 8X zoom (optimal n = 15) than the 4X zoom model (optimal n = 10)".
I cannot understand the relation between zoom and n.
1. What is the relation between zoom and n?
2. Why use patches as the feature rather than single RGB values?
3. For 1X zoom, how should I set n?
Can you please tell me?

Question about training

I can't find the training code in the repository; could you provide your training code? Thanks.

Also, could I upload the training data to Baidu Drive in China?

About PSNR, SSIM and LPIPS on the released model

I have tried many methods to reproduce the metrics with the released model, but the results are still different from the metrics in the paper.

Can someone give me some advice?
Thanks in advance.

Generalization and calculating PSNR

Thanks for your great work!
I wonder whether there will be a reddish color cast when testing on raw files from other cameras.
Also, is the CoBi loss used when calculating PSNR?
I am looking forward to your reply! Thanks.

Dataset and code licence

I didn't find a license for either the dataset or the code. Would you please provide one, especially for the dataset? Thanks in advance.

Where is your train.py?

It is already 2024, and your train.py file still has not been released. I suppose you have forgotten about it...

Can't make inference work on CR2 or DNG raw data

When I try to run inference on different raw data such as CR2, it returns:

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Is there something I can do to fix this?

Best
Bruno

Question about released models

Hi,

First, thanks a lot for sharing your code and dataset! Here I just have two questions about your released model:

I checked its architecture via Tensorboard, and it seems to be an 8x but not 4x model. Is that correct?

Do you plan to release your 4x model as well? I would really appreciate it if you do so!

Many thanks again!

issue about calculating psnr

Thanks for your great work !
I wonder how the PSNR value in your paper was calculated, because I found that the output RGB image cannot be perfectly aligned with the HR image, so I always get a poor PSNR of around 16 dB even with the released model, although the visual quality is fine.
So I would like to know whether the data were perfectly aligned during training and testing, or whether you re-aligned the output and the ground truth before computing PSNR.

about train code

Hi, thank you for your excellent work.
Can you please provide the training code? I want to study it.

the sky is dirty issue

When I use CoBiVGG to train on my own data, the sky looks very dirty, with dark lines and dark borders. Do you know why this occurs? Can you tell me the possible reasons?

iPhoneX-DSLR data

Hi,
is it possible to share the iPhoneX-DSLR data and pre-processing code (alignment, cropping)?
I know it's only a small dataset, but it would really help me.
Thanks in advance.
Ofer
