louieyang / deep-photo-styletransfer-tf

Tensorflow (Python API) implementation of Deep Photo Style Transfer

Python 100.00%

deep-photo-styletransfer-tf's Issues

I've gathered a group of friends who do data mining; add this QQ to chat with us: 1840658279

As the title says, thanks!

Number of iterations

Hello there! My hardware is not that good and I'm using the free version of Google Colab, but 2k iterations is just too much. How can I turn it down?
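The command lines quoted in other issues on this page pass `--max_iter 2000` and `--save_iter 50`, so lowering the iteration count is presumably a matter of passing a smaller `--max_iter`. A minimal sketch that builds such a command follows; the value 500 and the helper name `build_command` are illustrative assumptions, and fewer iterations will generally give a less converged result.

```python
import shlex

def build_command(content, style, content_seg, style_seg, max_iter=500, save_iter=50):
    """Build a deep_photostyle.py command line with a reduced iteration count.

    Only flags that appear in commands quoted on this page are used;
    the helper itself is hypothetical.
    """
    return [
        "python", "deep_photostyle.py",
        "--content_image_path", content,
        "--style_image_path", style,
        "--content_seg_path", content_seg,
        "--style_seg_path", style_seg,
        "--style_option", "2",
        "--max_iter", str(max_iter),    # lower this to shorten the run
        "--save_iter", str(save_iter),  # how often intermediate images are saved
    ]

cmd = build_command(
    "./examples/input/in11.png",
    "./examples/style/tar11.png",
    "./examples/segmentation/in11.png",
    "./examples/segmentation/tar11.png",
    max_iter=500,
)
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```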

Output image file saving and naming

The saved iteration outputs seem to be named after the total number of iterations rather than the current iteration. Or maybe I'm doing something wrong.

python deep_photostyle.py --content_image_path ./Test/input/input.png --style_image_path ./Test/style/style.png --content_seg_path ./Test/segmentation/input_seg.png --style_seg_path ./Test/segmentation/style_seg.png --style_option 2 --max_iter 2000 --save_iter 50 --output_image ./Test/results --lbfgs True --init_image_path ./Test/input/input.png

This results in the first image being named "out_iter_2000.png" and being saved in the working directory I launch python from rather than in the results directory.
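The behaviour described suggests that the file name is built from the total iteration count rather than the current one, and without the output directory. The script's actual saving code is not shown here, so the following is only a sketch of the naming the reporter expected; `iteration_output_path` is a hypothetical helper, not a function from the repository.

```python
import os

def iteration_output_path(output_dir, current_iter):
    """Return a per-iteration file path inside output_dir (hypothetical helper)."""
    return os.path.join(output_dir, "out_iter_{}.png".format(current_iter))

max_iter, save_iter = 2000, 50

# One path per save point: iterations 50, 100, ..., 2000.
paths = [
    iteration_output_path("./Test/results", i)
    for i in range(save_iter, max_iter + 1, save_iter)
]
print(paths[0])
print(len(paths))
```

Using the current iteration in the name keeps every intermediate image, and joining with the output directory keeps the files out of the working directory.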

OOM with a GTX 1080 ti

Can this actually happen? Am I missing something? The image files are about 2-3 MB.
I used an Anaconda environment with CUDA 9.0, cuDNN 7.0, Python 3.6, tensorflow-gpu 1.7, and all the other libraries up to date.

pycuda.driver.CompileError: nvcc preprocessing failed

I'm not sure whether you can help with this issue, or whether anyone else using this implementation has had the same problem, but after running the training (for example with --style_option 1, after 1000 iterations), I get the following error:

Traceback (most recent call last):
  File "deep_photostyle.py", line 115, in <module>
    main()
  File "deep_photostyle.py", line 84, in main
    best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, args.f_radius, args.f_edge).transpose(1, 2, 0)
  File "C:\Users\spenh\Deep Photo Style Transfer\smooth_local_affine.py", line 332, in smooth_local_affine
    """)
  File "C:\Users\spenh\AppData\Local\Programs\Python\Python36\lib\site-packages\pycuda\compiler.py", line 291, in __init__
    arch, code, cache_dir, include_dirs)
  File "C:\Users\spenh\AppData\Local\Programs\Python\Python36\lib\site-packages\pycuda\compiler.py", line 255, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
  File "C:\Users\spenh\AppData\Local\Programs\Python\Python36\lib\site-packages\pycuda\compiler.py", line 78, in compile_plain
    checksum.update(preprocess_source(source, options, nvcc).encode("utf-8"))
  File "C:\Users\spenh\AppData\Local\Programs\Python\Python36\lib\site-packages\pycuda\compiler.py", line 55, in preprocess_source
    cmdline, stderr=stderr)
pycuda.driver.CompileError: nvcc preprocessing of C:\Users\spenh\AppData\Local\Temp\tmpzhinmn14.cu failed
[command: nvcc --preprocess -arch sm_61 -m64 -Ic:\users\spenh\appdata\local\programs\python\python36\lib\site-packages\pycuda\cuda C:\Users\spenh\AppData\Local\Temp\tmpzhinmn14.cu --compiler-options -EP]

I couldn't find any help by googling, so maybe someone here knows what to do or try, or where this error even comes from?

My PC:

Windows 10 64 bit
Nvidia GTX 1080
Cuda 8.0
cudnn 6.1
pycuda 2017.1.1+cuda8061
Python 3.6.2rc1
tensorflow-gpu 1.4.0
numpy 1.13.3+mkl
Pillow 4.2.1
scipy 0.19.1

performance time

Hi, I'm running the code on one picture on a Tesla K80 (Ubuntu 16 + CUDA 8 + cuDNN 7 + TensorFlow 1.4), and 1000 steps take about half an hour.
Is that normal?
How long does it take you to run one example?

Style transfer without providing segmentation masks?

Is it possible to use the code without providing segmentation masks?
I tried this but it didn't work:

python deep_photostyle.py --content_image_path ./examples/input/in11.png - -style_image_path ./examples/style/tar11.png --style_option 1

Running with GPU

Hi, how can I run the code with PyCUDA? What do I need to modify in the code?
P.S. Sorry, I'm inexperienced with GPUs.

Option for training from scratch?

Can this code be used to train a new model? For instance, if I want to modify the loss term and retrain the model, is that possible?

Thanks

question about getlaplacian

When I set epsilon=1e-6, the affine loss is negative. Is there a precision issue with that?

TypeError: 'NoneType' object is not subscriptable

[ environment: ]
Ubuntu 17.10
1080Ti cuda 9.1 cudnn-9.1(libcudnn.so.7)
Conda:
cudnn: 7.0.5-cuda8.0_0
keras-gpu: 2.1.3-py36_0
tensorflow-gpu: 1.4.1-0
tensorflow-gpu-base: 1.4.1-py36h01caf0a_0

When I execute the sample code:

/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
2018-02-25 15:09:34.839241: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-25 15:09:35.468143: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-02-25 15:09:35.468599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-02-25 15:09:35.468626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Setting up style layer: <conv1_1/Relu:0>
Setting up style layer: <conv2_1/Relu:0>
Setting up style layer: <conv3_1/Relu:0>
Setting up style layer: <conv4_1/Relu:0>
Setting up style layer: <conv5_1/Relu:0>
Iteration 0 / 2000
Content loss: nan
Style 1 loss: nan
Style 2 loss: nan
Style 3 loss: nan
Style 4 loss: nan
Style 5 loss: nan
TV loss: 2.216766915807966e-05
Affine loss: 9.999999747378752e-06
Total loss: nan
Traceback (most recent call last):
  File "deep_photostyle.py", line 115, in <module>
    main()
  File "deep_photostyle.py", line 90, in main
    result = Image.fromarray(np.uint8(np.clip(tmp_image_bgr[:, :, ::-1], 0, 255.0)))
TypeError: 'NoneType' object is not subscriptable
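The log shows every content and style loss as nan from iteration 0, and the crash happens when the final image turns out to be None. One plausible explanation, assuming the script keeps the best image by checking whether the current loss is lower than the best so far (the tracking code is not shown here), is that comparisons with NaN are always false, so the best image is never set. A sketch of that failure mode with a hypothetical `track_best` helper:

```python
import math

def track_best(losses, images):
    """Keep the image with the lowest loss.

    Returns None if no loss ever compares lower than the running best.
    This is a hypothetical stand-in for the script's own bookkeeping.
    """
    best_loss, best_image = float("inf"), None
    for loss, image in zip(losses, images):
        if loss < best_loss:  # always False when loss is NaN
            best_loss, best_image = loss, image
    return best_image

nan = float("nan")
result = track_best([nan, nan, nan], ["img0", "img1", "img2"])
print(result)  # prints None

# Checking explicitly makes the real problem visible much earlier.
if any(math.isnan(x) for x in [nan, nan, nan]):
    print("loss is NaN; check inputs and weights before optimizing")
```

If this is what happens, the NaN losses would be the thing to investigate first, rather than the final TypeError.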

Question: is this a problem with the new CUDA 9.1, or is it caused by the missing PyCUDA package? Thanks.

How About Real Time Style Transfer?

This code seems to need a lot of setup, such as segmentation masks, and many iterations to generate a stylized image. How about implementing real-time style transfer? What are the main obstacles to achieving that?

Get error when running the command

I get the following error:

Traceback (most recent call last):
  File "deep_photostyle.py", line 115, in <module>
    main()
  File "deep_photostyle.py", line 63, in main
    best_image_bgr = stylize(args, False)
  File "/Users/william/Programming/myGithub/deep-photo-styletransfer-tf/photo_style.py", line 231, in stylize
    vgg_const = Vgg19()
  File "/Users/william/Programming/myGithub/deep-photo-styletransfer-tf/vgg19/vgg.py", line 16, in __init__
    self.data_dict = np.load(vgg19_npy_path, encoding='latin1').item()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy/lib/npyio.py", line 384, in load
    fid = open(file, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/Users/william/Programming/myGithub/deep-photo-styletransfer-tf/vgg19/vgg19.npy'
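The traceback shows that the script expects the VGG-19 weights at `vgg19/vgg19.npy` inside the repository, so the file has to be downloaded and placed there before running. A small pre-flight check along these lines gives a clearer message than the raw traceback; `find_vgg_weights` is a hypothetical helper, not part of the repository.

```python
import os
import tempfile

def find_vgg_weights(repo_root):
    """Return the path to vgg19/vgg19.npy under repo_root, or raise with a clear message."""
    expected = os.path.join(repo_root, "vgg19", "vgg19.npy")
    if not os.path.isfile(expected):
        raise FileNotFoundError(
            "VGG-19 weights not found at {}; download vgg19.npy and place it there".format(expected)
        )
    return expected

# Demonstration in a throwaway directory standing in for the repository root.
with tempfile.TemporaryDirectory() as repo_root:
    try:
        find_vgg_weights(repo_root)
    except FileNotFoundError as err:
        print("missing:", err)
    # Create an empty placeholder file to show the success path.
    os.makedirs(os.path.join(repo_root, "vgg19"))
    open(os.path.join(repo_root, "vgg19", "vgg19.npy"), "wb").close()
    print("found:", os.path.basename(find_vgg_weights(repo_root)))
```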

No permission to google drive to get the VGG weights

How can I get the weights? The Google Drive link doesn't allow direct downloading.

Problem with saving best results under Windows 10.

Windows 10. WinPython 3.5.3 with CUDA, PyCUDA, numpy+mkl, latest TensorFlow, etc. CL.exe from MS VS2015 VC++. (PyCUDA is set up correctly and runs its test examples successfully.)
When I run the program everything seems fine: temporary results are saved successfully in the results folder, but at the end I get this error:
Traceback (most recent call last):
  File "deep_photostyle.py", line 115, in <module>
    main()
  File "deep_photostyle.py", line 84, in main
    best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, args.f_radius, args.f_edge).transpose(1, 2, 0)
  File "D:\WinPython\notebooks\deep-photo-styletransfer\smooth_local_affine.py", line 385, in smooth_local_affine
    np.int32(h), np.int32(w), np.float32(epsilon), np.int32(radius), block=(256, 1, 1), grid=((h * w) / 256 + 1, 1)
  File "d:\WinPython\python-3.5.3.amd64\lib\site-packages\pycuda\driver.py", line 402, in function_call
    func._launch_kernel(grid, block, arg_buf, shared, None)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type float

So at the end I have only the temporary iteration images, not a final result.

Installing PyCUDA in Google Colab

I've been trying to install PyCUDA in the latest Google Colab notebook.

System Spec

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148

Tue Nov 13 12:12:40 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44                 Driver Version: 396.44                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P8    27W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I encountered the following error:

    src/cpp/curand.hpp:6:12: fatal error: curand.h: No such file or directory
       #include <curand.h>
                ^~~~~~~~~~
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Time taken for one image

Hi,
It takes about 15-20 minutes to transfer a style from one image to another content image. I have an RTX 2080 SUPER in my laptop; is this normal? My GPU runs at about 40%. I am trying to do style transfer for a dataset of 100,000 images, and I don't think that is feasible at this speed. Any suggestions?
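To put the numbers above in perspective, here is the back-of-the-envelope arithmetic, assuming images are processed one after another on a single GPU at the reported 15-20 minutes each (the helper name is illustrative):

```python
def total_days(num_images, minutes_per_image):
    """Total wall-clock days for sequential processing at a fixed per-image time."""
    return num_images * minutes_per_image / (60 * 24)

low = total_days(100_000, 15)
high = total_days(100_000, 20)
print("{:.0f} to {:.0f} days".format(low, high))  # prints "1042 to 1389 days"
```

At roughly three to four years of sequential compute, an iterative per-image optimization like this one is a poor fit for a dataset of that size unless the per-image time drops by orders of magnitude or the work is spread over many GPUs.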

Any plans to declare a license?

e.g. https://github.com/blog/1530-choosing-an-open-source-license

interrupt without reason

Hi @LouieYang, I ran your example:

python deep_photostyle.py --content_image_path ./examples/input/in11.png --style_image_path ./examples/style/tar11.png --content_seg_path ./examples/segmentation/in11.png --style_seg_path ./examples/segmentation/tar11.png --style_option 2
2018-07-29 16:28:27.869650: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-29 16:28:27.869702: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-29 16:28:27.869713: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-07-29 16:28:27.869722: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-07-29 16:28:27.869730: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-07-29 16:28:28.231340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: Tesla P40
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:84:00.0
Total memory: 22.38GiB
Free memory: 22.22GiB
2018-07-29 16:28:28.231407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0
2018-07-29 16:28:28.231419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y
2018-07-29 16:28:28.231438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P40, pci bus id: 0000:84:00.0)
Setting up style layer: <conv1_1/Relu:0>
Setting up style layer: <conv2_1/Relu:0>
Setting up style layer: <conv3_1/Relu:0>
Setting up style layer: <conv4_1/Relu:0>
Setting up style layer: <conv5_1/Relu:0>
Iteration 0 / 2000
Content loss: 951353.6875
Style 1 loss: 37615.7226562
Style 2 loss: 587025.8125
Style 3 loss: 140299.578125
Style 4 loss: 9803968.0
Style 5 loss: 1114.98840332
TV loss: 2.21390382649e-05
Affine loss: 9.99999974738e-06
Total loss: 11521378.0
Iteration 10 / 2000
Content loss: 947028.1875
Style 1 loss: 37615.7265625
Style 2 loss: 587003.625
Style 3 loss: 140294.0
Style 4 loss: 9801898.0
Style 5 loss: 1114.94946289
TV loss: 0.506700396538
Affine loss: 9.99999974738e-06
Total loss: 11514955.0
2018-07-29 16:28:58.889174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P40, pci bus id: 0000:84:00.0)
Setting up style layer: <conv1_1/Relu:0>
Setting up style layer: <conv2_1/Relu:0>
Setting up style layer: <conv3_1/Relu:0>
Setting up style layer: <conv4_1/Relu:0>
Setting up style layer: <conv5_1/Relu:0>
Iteration 20 / 2000
Content loss: 948244.75
Style 1 loss: 37606.8867188
Style 2 loss: 586896.3125
Style 3 loss: 140282.890625
Style 4 loss: 9802683.0
Style 5 loss: 1114.94970703
TV loss: 2.0472342968
Affine loss: [[932.59344]]
Total loss: [[11516831.]]
Iteration 30 / 2000
Content loss: 941090.3125
Style 1 loss: 37606.7304688
Style 2 loss: 586720.1875
Style 3 loss: 140242.375
Style 4 loss: 9798884.0
Style 5 loss: 1114.71008301
TV loss: 11.5854501724
Affine loss: [[6303.9805]]
Total loss: [[11505670.]]

The run stops at this point, with no reason or error shown. Could you help me fix this?
Help is greatly appreciated!

Error

  File "deep_photostyle.py", line 4, in <module>
    from photo_style import stylize
  File "/home/nvidia/Desktop/Deep-Style-Transfer/deep-photo-styletransfer-tf/photo_style.py", line 6, in <module>
    from vgg19.vgg import Vgg19
ImportError: No module named vgg19.vgg

Can the model be ported to the Android platform?

As the title says, thanks!

Creating high resolution images

Hey there,
first of all thanks for porting to tensorflow!

I was wondering how you managed to create images from 3500x2340 px inputs?
I am trying to use 1920x1080 as the input size, and that is already wrecking my GPU o.O

Maybe you could share which GPU you have been using?

error for applying on multiple images in a for loop

Hey,
I am trying to apply this method to many images (i.e. I have one reference style image and 50 input images whose style I want to change). I simply put the whole algorithm in a for loop. However, after a few passes through the loop (e.g. after 5 input images) I get an error like this:
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

Any ideas how to get rid of this error?
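This error typically appears in TensorFlow 1.x when the model is rebuilt inside the same default graph on every pass of the loop, so the graph keeps growing. Assuming that is the cause here, one option is to call `tf.reset_default_graph()` before each image; another, sketched below, is to run each image in its own process so that every run starts with an empty graph. The helper names are illustrative, and the sketch assumes each segmentation mask has the same file name as its content image; only flags quoted elsewhere on this page are used.

```python
import os
import subprocess
import sys

def build_commands(content_paths, style_path, content_seg_dir, style_seg_path):
    """Build one deep_photostyle.py command per content image (hypothetical helper)."""
    commands = []
    for content in content_paths:
        # Assumption: the mask shares the content image's file name.
        seg = os.path.join(content_seg_dir, os.path.basename(content))
        commands.append([
            sys.executable, "deep_photostyle.py",
            "--content_image_path", content,
            "--style_image_path", style_path,
            "--content_seg_path", seg,
            "--style_seg_path", style_seg_path,
            "--style_option", "2",
        ])
    return commands

def run_all(commands, runner=subprocess.run):
    """Run each command in a separate process and return the list of exit codes."""
    return [runner(cmd).returncode for cmd in commands]

commands = build_commands(
    ["./examples/input/in{}.png".format(i) for i in range(1, 4)],
    "./examples/style/tar11.png",
    "./examples/segmentation",
    "./examples/segmentation/tar11.png",
)
for cmd in commands:
    print(" ".join(cmd))
# run_all(commands)  # uncomment to actually launch the runs
```

Separate processes cost some startup time per image, since the weights are reloaded each time, but they also release GPU memory between runs.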

Could you share a baiduyun link of VGG19

As you know, Google Drive is not reachable in China.

docker

Hi, I wrote a Docker setup to run deep-photo-styletransfer-tf; if you're interested, you can integrate it.

https://github.com/dmaugis/docker-deep-photostyle-transfer

delete

Smooth local affine / TypeError

Hello,

I had trouble launching the transfer with smooth local affine. The main problem was that a float was being passed in the grid parameter, which gave me this error:

block=(256, 1, 1), grid=((h * w) / 256 + 1, 1)
  File "/home/username/local/miniconda/lib/python3.6/site-packages/pycuda/driver.py", line 402, in function_call
    func._launch_kernel(grid, block, arg_buf, shared, None)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type float

This can be easily fixed by casting the grid parameter to an integer: grid=(int((h * w) / 256 + 1), 1)

Tell me if you had this problem and if it's the right fix,
Otherend1
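The root cause is that `/` is true division in Python 3 and always returns a float, whereas in Python 2 it performed integer division on ints, which is presumably what the original line relied on. A minimal sketch of an integer-only grid computation equivalent to the fix proposed above; `launch_grid` is a hypothetical helper, not part of the repository:

```python
def launch_grid(h, w, threads_per_block=256):
    """Return an integer (grid_x, grid_y) tuple covering h * w elements."""
    blocks = (h * w) // threads_per_block + 1  # floor division keeps the result an int
    return (blocks, 1)

h, w = 480, 640
grid = launch_grid(h, w)
print(grid)                    # prints (1201, 1)
print(((h * w) / 256 + 1, 1))  # prints (1201.0, 1): the float that PyCUDA rejects
```

For positive sizes, `(h * w) // 256 + 1` gives the same value as `int((h * w) / 256 + 1)`, without going through a float.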

'best_image' is used prior to global declaration

The variable is used before its global declaration. Swapping lines 284 and 285 in photo_style.py fixes this.

default values for parameters

Could you list the default values for your parameters in the README? I'm using the script now and it seems to be working well, but I'd like to experiment with the content weight and style weight. I have no idea where to start, since I don't know their current values.