
ShapeMatchingGAN's Introduction

ShapeMatchingGAN

[Teaser figure: source text, adjustable stylistic degree of glyph, stylized text, and applications; examples of liquid and smoke artistic text rendering]

This is a PyTorch implementation of the following paper:

Shuai Yang, Zhangyang Wang, Zhaowen Wang, Ning Xu, Jiaying Liu and Zongming Guo. Controllable Artistic Text Style Transfer via Shape-Matching GAN, accepted by International Conference on Computer Vision (ICCV), 2019.

[Project] | [Paper] | More about artistic text style transfer [Link]

Please consider citing our paper if you find the software useful for your work.

Usage:

Prerequisites

  • Python 2.7
  • Pytorch 1.1.0
  • matplotlib
  • scipy
  • Pillow

Install

  • Clone this repo:
git clone https://github.com/TAMU-VITA/ShapeMatchingGAN.git
cd ShapeMatchingGAN/src

Testing Example

  • Download pre-trained models from [Google Drive] or [Baidu Cloud](code:rjpi) to ../save/
  • Artistic text style transfer using fire style with scale 0.0
    • Results can be found in ../output/

python test.py \
--scale 0.0 \
--structure_model ../save/fire-GS-iccv.ckpt \
--texture_model ../save/fire-GT-iccv.ckpt \
--gpu
  • Artistic text style transfer with specified parameters
    • setting scale to -1 means testing with multiple scales in [0,1] with step of scale_step
    • specify the input text name, output image path and name with text_name, result_dir and name, respectively
python test.py \
--text_name ../data/rawtext/yaheiB/val/0801.png \
--scale -1 --scale_step 0.2 \
--structure_model ../save/fire-GS-iccv.ckpt \
--texture_model ../save/fire-GT-iccv.ckpt \
--result_dir ../output --name fire-0801 \
--gpu

or just modify and run

sh ../script/launch_test.sh
  • For black and white text images, use option --text_type 1
    • utils.text_image_preprocessing will transform BW images into distance-based images
    • distance-based images help the network better handle saturated regions
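To illustrate the idea behind the distance-based preprocessing (utils.text_image_preprocessing is the actual implementation; the sketch below uses SciPy's Euclidean distance transform and is only an approximation, with `to_distance_image` as a hypothetical helper name):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def to_distance_image(bw):
    """Sketch of a distance-based transform for a black-and-white text mask.

    `bw` is a 2D uint8 array where 255 = text (foreground), 0 = background.
    Each foreground pixel is replaced by its normalized distance to the
    nearest background pixel, so flat saturated regions gain a gradient
    the network can work with.
    """
    fg = bw > 127
    dist = distance_transform_edt(fg)   # distance to the nearest background pixel
    if dist.max() > 0:
        dist = dist / dist.max()        # normalize to [0, 1]
    return dist

# A solid 5x5 square of text with a 1-pixel background border:
mask = np.zeros((7, 7), dtype=np.uint8)
mask[1:6, 1:6] = 255
d = to_distance_image(mask)
```

The center of the square maps to 1.0 and the stroke boundary to small values, instead of everything being a uniform 255.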

Training Examples

Training Sketch Module G_B

  • Download text dataset from [Google Drive] or [Baidu Cloud](code:rjpi) to ../data/

  • Train G_B with default parameters

    • Adding augmented images to the training set can make G_B more robust
python trainSketchModule.py \
--text_path ../data/rawtext/yaheiB/train --text_datasize 708 \
--augment_text_path ../data/rawtext/augment --augment_text_datasize 5 \
--batchsize 16 --Btraining_num 12800 \
--save_GB_name ../save/GB.ckpt \
--gpu

or just modify and run

sh ../script/launch_SketchModule.sh

Saved model can be found at ../save/

  • Use --help to view more training options
python trainSketchModule.py --help

Training Structure Transfer G_S

  • Train G_S with default parameters
    • step1: G_S is first trained with a fixed l = 1 to learn the greatest deformation
    • step2: we then use l ∈ {0, 1} to learn two extremes
    • step3: G_S is tuned on l ∈ {i/K}, i=0,...,K where K = 3 (i.e. --scale_num 4)
    • for structure with directional patterns, training without --Sanglejitter will be a good option
python trainStructureTransfer.py \
--style_name ../data/style/fire.png \
--batchsize 16 --Straining_num 2560 \
--step1_epochs 30 --step2_epochs 40 --step3_epochs 80 \
--scale_num 4 \
--Sanglejitter \
--save_path ../save --save_name fire \
--gpu

or just modify and run

sh ../script/launch_ShapeMGAN_structure.sh

Saved model can be found at ../save/
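The three-step scale curriculum described above can be sketched in plain Python (`scale_schedule` is a hypothetical helper for illustration, not a function in this repo):

```python
def scale_schedule(step, K=3):
    """Controllable-scale values l used at each training step (a sketch).

    step 1: only the maximum deformation, l = 1
    step 2: the two extremes, l in {0, 1}
    step 3: K+1 evenly spaced scales l = i/K, i = 0..K  (--scale_num = K+1)
    """
    if step == 1:
        return [1.0]
    if step == 2:
        return [0.0, 1.0]
    return [i / K for i in range(K + 1)]
```

With the default `--scale_num 4` (K = 3), step 3 tunes on l ∈ {0, 1/3, 2/3, 1}.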

  • To preserve the glyph legibility (Eq. (7) in the paper), use option --glyph_preserve
    • need to specify the text dataset --text_path ../data/rawtext/yaheiB/train and --text_datasize 708
    • need to load pre-trained G_B model --load_GB_name ../save/GB-iccv.ckpt
    • in most cases, --glyph_preserve is not necessary, since one can alternatively use a smaller l
  • Use --help to view more training options
python trainStructureTransfer.py --help

Training Texture Transfer G_T

  • Train G_T with default parameters
    • for complicated style or style with directional patterns, training without --Tanglejitter will be a good option
python trainTextureTransfer.py \
--style_name ../data/style/fire.png \
--batchsize 4 --Ttraining_num 800 \
--texture_step1_epochs 40 \
--Tanglejitter \
--save_path ../save --save_name fire \
--gpu

or just modify and run

sh ../script/launch_ShapeMGAN_texture.sh

Saved model can be found at ../save/

  • To train with style loss, use option --style_loss
    • need to specify the text dataset --text_path ../data/rawtext/yaheiB/train and --text_datasize 708
    • need to load pre-trained G_S model --load_GS_name ../save/fire-GS.ckpt
    • adding --style_loss can slightly improve the texture details
  • Use --help to view more training options
python trainTextureTransfer.py --help

More

Three training examples are in the Jupyter notebook ShapeMatchingGAN.ipynb

Have fun :-)

Try with your own style images

  • Style image preparation
    • Applicable style types: to keep the stylized text recognizable, there should be a clear distinction between the text and the background. If the texture has no distinct shape, the generated stylized text will blend into the background, so textures with distinct shapes are recommended as reference styles.
    • Prepare (X,Y): Use Image Matting Algorithm or the Quick Selection Tool in Photoshop to obtain the black and white structure map X (i.e. foreground mask) of the style image Y.
    • Prepare distance-based structure map: Use utils.text_image_preprocessing to transform black and white X into distance-based X.
    • Concatenate the distance-based X with Y, following the format of the images in ../data/style/, and copy the result to ../data/style/.
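A minimal sketch of the concatenation step, assuming the ../data/style/ format places X and Y side by side (check an existing style image to confirm the layout; `make_style_pair` is a hypothetical helper):

```python
from PIL import Image

def make_style_pair(x_path, y_path, out_path):
    """Concatenate structure map X and style image Y side by side.

    A sketch under the assumption that the images in ../data/style/
    store X on the left and Y on the right at the same resolution.
    """
    x = Image.open(x_path).convert('RGB')
    y = Image.open(y_path).convert('RGB')
    assert x.size == y.size, 'X and Y must have the same resolution'
    w, h = x.size
    pair = Image.new('RGB', (2 * w, h))
    pair.paste(x, (0, 0))   # distance-based structure map on the left
    pair.paste(y, (w, 0))   # style image on the right
    pair.save(out_path)
```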

Contact

Shuai Yang

[email protected]

ShapeMatchingGAN's People

Contributors

burning846, williamyang1991


ShapeMatchingGAN's Issues

Image Matting

Which algorithm have you used for image matting? Or did you use Photoshop?

run

Excuse me, I am a beginner with GANs and have a basic question: how can I run your code? Sorry to disturb you.

RuntimeError: The size of tensor a (256) must match the size of tensor b (254) at non-singleton dimension 1

Hi

when I run trainStructureTransfer.py

my style image size is 848x650

I got this error

Traceback (most recent call last):
  File "trainStructureTransfer.py", line 89, in <module>
    main()
  File "trainStructureTransfer.py", line 43, in main
    opts.Sanglejitter, opts.subimg_size, opts.subimg_size)
  File "C:\Users\Artificial Dimension\Desktop\ShapeMatchingGAN\src\utils.py", line 160, in cropping_training_batches
    input[:,0] = torch.clamp(input[:,0] + noise[:,0], -1, 1)
RuntimeError: The size of tensor a (256) must match the size of tensor b (254) at non-singleton dimension 1

OS:windows 10
env:Anaconda python 3.7

How can I fix it? Thanks.

float error

when I run :
sh ../script/launch_SketchModule.sh

Traceback (most recent call last):
  File "/home/lbl/work/ShapeMatchingGAN/src/trainSketchModule.py", line 46, in <module>
    main()
  File "/home/lbl/work/ShapeMatchingGAN/src/trainSketchModule.py", line 28, in main
    opts.text_datasize, trainnum=opts.Btraining_num)
  File "/home/lbl/work/ShapeMatchingGAN/src/utils.py", line 82, in load_train_batchfnames
    fnames = [('%04d.png' % (i%usenum)) for i in range(trainnum)]
TypeError: 'float' object cannot be interpreted as an integer
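A likely fix, assuming the counts end up as floats under Python 3 (where `/` returns a float): cast them to int before they reach `%` and `range()` in utils.py. A sketch:

```python
# utils.py, load_train_batchfnames -- sketch of the Python 3 fix.
# Under Python 2 these counts were ints; under Python 3 earlier arithmetic
# can leave them as floats, which range() rejects. Cast them first.
trainnum, usenum = 12800.0, 708.0              # example values that reproduce the error
trainnum, usenum = int(trainnum), int(usenum)  # the fix: cast before use
fnames = [('%04d.png' % (i % usenum)) for i in range(trainnum)]
```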

higher resolution

I assume if I retrain the sketch network on higher resolution samples and then train both the structure and texture models on ~same sized samples I can achieve a higher resolution output? Should I also modify the --subimg_size option?

Have you tried this or seen any issues with network size maxing out the GPU?

about "water" image

Hi, thank you for your work. Can you share "water.png" for the directory "./data/style"? I can only find "fire.png", "leaf.png", "maple.png", "sakura.png" and "smoke.png" there.
Thank you very much.

why training for 3 steps?

Hello, thanks for your excellent work. I wonder why we train in three steps rather than one? Is there any difference? Thank you.

I find some problem

As noted, the project is said to be implemented with Python 2.7 and PyTorch 1.1.0, but in fact PyTorch 1.10 is only available for Python versions >= 3.7, so I think it is actually 3.7 instead of 2.7.

Testing with multiple scales in [0,n]

Hello, I'm trying to test a model I trained in a bigger range than [0,1]. I tried adding an elif label == -2: with a while scale <= 2.0: in the test.py file but I'm not sure if it is the right way or if I should add/change something else. Could you help me, please?
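One way to sketch a generalized schedule (this is an illustration, not the actual logic in test.py; `scales_for_label` is a hypothetical helper and `label == -2` a hypothetical extension):

```python
def scales_for_label(label, scale_step=0.2, max_scale=1.0):
    """Sketch of a generalized multi-scale test schedule.

    label >= 0 : test a single scale
    label == -1: sweep [0, 1] in steps of scale_step (the README's behavior)
    label == -2: sweep [0, max_scale], e.g. max_scale=2.0
    """
    if label >= 0:
        return [label]
    hi = 1.0 if label == -1 else max_scale
    n = int(round(hi / scale_step))            # round to dodge float drift
    return [i * scale_step for i in range(n + 1)]
```

Note that the models are trained with l in [0, 1], so scales above 1 extrapolate outside the training range and output quality is not guaranteed.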

about testing multiple images

Hi! I want to test multiple images at the same time and generate a dataset under the same style map, but my coding skills are limited and I have tried many modifications without success. Please help me!
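A simple way to batch over a folder without touching test.py is to invoke it once per image from a wrapper script. A sketch (flag values follow the README's testing example; `build_test_cmd`/`batch_test` are hypothetical helpers):

```python
import glob
import os
import subprocess

def build_test_cmd(text_path, structure_model, texture_model, result_dir):
    """Build one test.py invocation for a single input image."""
    name = os.path.splitext(os.path.basename(text_path))[0]
    return ['python', 'test.py',
            '--text_name', text_path,
            '--scale', '0.0',                  # example setting; adjust as needed
            '--structure_model', structure_model,
            '--texture_model', texture_model,
            '--result_dir', result_dir,
            '--name', name,
            '--gpu']

def batch_test(text_dir, structure_model, texture_model, result_dir):
    """Run test.py once for every .png in text_dir."""
    for path in sorted(glob.glob(os.path.join(text_dir, '*.png'))):
        subprocess.run(build_test_cmd(path, structure_model,
                                      texture_model, result_dir), check=True)
```

This reloads the models per image, which is slow but requires no code changes; for speed, the model-loading and inference parts of test.py could instead be wrapped in a loop.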

questions about the deformation degree

Hi, I tested the pre-trained models you provided, but I feel that the result like the water.gif effect is not very dynamic. I want to make it more deformed.
(1)What can I do? Do I need to change this "--scale -1 --scale_step 0.2 " or something else?
(2) In your paper, it is mentioned that the range of parameter l is [0,1]. What is the reason for this?
Looking forward for your answer.
Thank you.

The stylization of a string

Hello, may I ask how the stylization of a string of characters is realized in the paper? Is it a combination of individually stylized characters?
[image]

question about different styles produce different deformation degrees

Hi, I tested my images on the pre-trained models you provided, but I found something strange in the results.
For example, at the maximum deformation, i.e. scale=1, the test results for fire and water are as follows.
[fire result image]
[water result image]
Why is the fire deformation so obvious while water shows almost no deformation?

about training

I have a question about training the texture and structure modules.
When training them, should the input style be the mask and the original image together (side by side), or should they be trained separately (structure on the mask, texture on the original image)?

balloon special effects

Hello, I tried to train balloon special effects myself, but it does not seem to work and the losses fail to converge. Could you provide me with the balloon style image? It would be nice if the model could be shared. Thank you.
The losses are as follows.
Step1, Epoch [40/40][193/200]: LDadv: +300.645, LGadv: +169.652, Lrec: +28.295, Lsty: +0.000
Step1, Epoch [40/40][194/200]: LDadv: +238.255, LGadv: +199.722, Lrec: +27.427, Lsty: +0.000
Step1, Epoch [40/40][195/200]: LDadv: +234.354, LGadv: +199.173, Lrec: +27.108, Lsty: +0.000
Step1, Epoch [40/40][196/200]: LDadv: +160.336, LGadv: +164.670, Lrec: +19.172, Lsty: +0.000
Step1, Epoch [40/40][197/200]: LDadv: +303.344, LGadv: +240.842, Lrec: +25.351, Lsty: +0.000

error with trainStructureTransfer.py

I get an error when trying the example trainStructureTransfer.py command listed in the README:

Traceback (most recent call last):
  File "trainStructureTransfer.py", line 89, in <module>
    main()
  File "trainStructureTransfer.py", line 35, in main
    Xl, X, _, Noise = load_style_image_pair(opts.style_name, scales, netSketch, opts.gpu)
  File "/content/ShapeMatchingGAN/src/utils.py", line 119, in load_style_image_pair
    Noise = torch.tensor(0).float().repeat(1, 1, 1).expand(3, ori_ht, ori_wd)
TypeError: expand(): argument 'size' must be tuple of ints, but found element of type float at pos 3
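A plausible fix, assuming `ori_ht` and `ori_wd` become floats through earlier scaling arithmetic: cast them to int before calling `expand()`, which only accepts integer sizes. A sketch:

```python
# utils.py, load_style_image_pair -- sketch of the fix.
# expand() requires integer sizes, but the height/width can arrive
# as floats under Python 3 division. Cast them first.
ori_ht, ori_wd = 256.0, 254.0          # example float sizes that trigger the error
size = (3, int(ori_ht), int(ori_wd))   # then call Noise.expand(*size)
```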

int and float

class GlyphGenerator(nn.Module):
    def __init__(self, ngf=32, n_layers=5):
        super(GlyphGenerator, self).__init__()

        encoder = []
        encoder.append(ReplicationPad2d(padding=4))
        encoder.append(Conv2d(out_channels=ngf, kernel_size=9, padding=0, in_channels=3))
        encoder.append(LeakyReLU(0.2))
        encoder.append(myGConv(ngf*2, 2, ngf))
        encoder.append(myGConv(ngf*4, 2, ngf*2))

        transformer = []
        print(n_layers/2-1)
        for n in range(n_layers/2-1):
            transformer.append(myGCombineBlock(ngf*4, p=0.0))
        # dropout to make model more robust
        transformer.append(myGCombineBlock(ngf*4, p=0.5))
        transformer.append(myGCombineBlock(ngf*4, p=0.5))
        for n in range(n_layers/2+1, n_layers):
            transformer.append(myGCombineBlock(ngf*4, p=0.0))

Some float values are passed to range() here.
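The fix is Python 3's floor-division operator, since `/` now returns a float that `range()` rejects. A sketch of the two affected lines:

```python
# GlyphGenerator.__init__ -- sketch of the Python 3 fix: replace '/' with '//'
# wherever the result feeds range().
n_layers = 5
head = list(range(n_layers // 2 - 1))              # was: range(n_layers/2-1)
tail = list(range(n_layers // 2 + 1, n_layers))    # was: range(n_layers/2+1, n_layers)
```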

about other style image training

Hi, I used another style image, shown below, which you have shown elsewhere.
[snow style image: snow_v2]
I trained on this snow style image with the same parameters as given in README.md. When I test the model, the results are as follows (the two images use different values of l):
[result images]
I don't know where the problem is. I hope you can give some suggestions or share the parameter settings of the snow style image you trained before.
Thank you very much.

Is distance-based process necessary?

Hello, thank you for your excellent work. I wonder about the importance of the text distance postprocessing. I find that the training text datasets are images with distance postprocessing, but it is not used unless we use "Lgly", which depends on the glyph_preserve option, and that option is not selected by default in the training code. Thank you.

train my own font dataset

Hi, thank you for your work!
I want to train on my own handwriting dataset; the loss results are as follows:

Epoch [4/10][1757/6000]: LDadv: +0.302, LGadv: +78.938, Lrec: +0.524
Epoch [4/10][1758/6000]: LDadv: +0.314, LGadv: +78.576, Lrec: +0.518
Epoch [4/10][1759/6000]: LDadv: +0.277, LGadv: +78.604, Lrec: +0.461
Epoch [4/10][1760/6000]: LDadv: +0.260, LGadv: +77.750, Lrec: +0.441
Epoch [4/10][1761/6000]: LDadv: +0.237, LGadv: +77.608, Lrec: +0.524
Epoch [4/10][1762/6000]: LDadv: +0.204, LGadv: +78.481, Lrec: +0.516
Epoch [4/10][1763/6000]: LDadv: +0.254, LGadv: +78.299, Lrec: +0.507
Epoch [4/10][1764/6000]: LDadv: +0.308, LGadv: +79.300, Lrec: +0.587
Epoch [4/10][1765/6000]: LDadv: +0.272, LGadv: +79.734, Lrec: +0.561
Epoch [4/10][1766/6000]: LDadv: +0.255, LGadv: +79.427, Lrec: +0.497
Epoch [4/10][1767/6000]: LDadv: +0.207, LGadv: +79.826, Lrec: +0.448
Epoch [4/10][1768/6000]: LDadv: +0.208, LGadv: +79.392, Lrec: +0.501
Epoch [4/10][1769/6000]: LDadv: +0.177, LGadv: +78.863, Lrec: +0.456
Epoch [4/10][1770/6000]: LDadv: +0.225, LGadv: +78.815, Lrec: +0.495
Epoch [4/10][1771/6000]: LDadv: +0.330, LGadv: +78.598, Lrec: +0.467
Epoch [4/10][1772/6000]: LDadv: +0.341, LGadv: +79.278, Lrec: +0.497
Epoch [4/10][1773/6000]: LDadv: +0.315, LGadv: +78.820, Lrec: +0.593
Epoch [4/10][1774/6000]: LDadv: +0.207, LGadv: +78.609, Lrec: +0.428
Epoch [4/10][1775/6000]: LDadv: +0.190, LGadv: +79.434, Lrec: +0.547
Epoch [4/10][1776/6000]: LDadv: +0.314, LGadv: +79.484, Lrec: +0.582
Epoch [4/10][1777/6000]: LDadv: +0.340, LGadv: +79.534, Lrec: +0.545
Epoch [4/10][1778/6000]: LDadv: +0.240, LGadv: +79.546, Lrec: +0.484
Epoch [4/10][1779/6000]: LDadv: +0.305, LGadv: +79.679, Lrec: +0.552
Epoch [4/10][1780/6000]: LDadv: +0.389, LGadv: +78.796, Lrec: +0.527
Epoch [4/10][1781/6000]: LDadv: +0.341, LGadv: +78.778, Lrec: +0.514
Epoch [4/10][1782/6000]: LDadv: +0.353, LGadv: +78.656, Lrec: +0.610

The LGadv seems too large; could you tell me whether this loss is normal? Thanks.

One model for one style?

Hello, I wonder: can a single model transfer more than one style, such as fire and water?
