Fast Style Transfer in TensorFlow

Add styles from famous paintings to any photo in a fraction of a second! You can even style videos!
  
It takes 100ms on a 2015 Titan X to style the MIT Stata Center (1024×680) like Udnie, by Francis Picabia.
Our implementation is based on a combination of Gatys' A Neural Algorithm of Artistic Style, Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution, and Ulyanov's Instance Normalization.

Running on FloydHub

It is easy to train, evaluate, and serve fast style transfer on [FloydHub](https://www.floydhub.com/). Follow the instructions below:
  1. Visit the FloydHub site and create an account if you do not have one.
  2. Clone this repository to your machine.
  3. Run floyd init <project name> inside the directory.
Now you can:
  1. Train a new model
  2. Evaluate an existing model
  3. Serve a trained model via a public API.
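The setup steps above can be run as follows (a sketch assuming the floyd CLI is already installed; the project name is up to you):

```shell
# Clone the repository and register it as a Floyd project
git clone https://github.com/lengstrom/fast-style-transfer
cd fast-style-transfer
floyd init fast-style-transfer
```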

Video Stylization

Here we transformed every frame in a video of a fox, then combined the results. The style is Udnie, as above. Click through to the full demo on YouTube!
See how to generate these videos here!

Image Stylization

We added styles from various paintings to a photo of Chicago.

Implementation Details

Our implementation uses TensorFlow to train a fast style transfer network. We use roughly the same transformation network as described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, and the scaling/offset of the output tanh layer is slightly different. We use a loss function close to the one described in Gatys, using VGG19 instead of VGG16 and typically using "shallower" layers than in Johnson's implementation (e.g. we use relu1_1 rather than relu1_2). Empirically, this results in larger scale style features in transformations.
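As a rough illustration of the two ingredients mentioned above, here is a minimal NumPy sketch of instance normalization and of the Gram matrices used in Gatys-style losses. The function names, shapes, and the omitted learned scale/offset are illustrative assumptions, not the code used in this repository:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize each feature map of each image over its own spatial
    dimensions (Ulyanov et al.). Unlike batch normalization, no
    statistics are shared across the batch.
    x: array of shape (batch, height, width, channels)."""
    mu = x.mean(axis=(1, 2), keepdims=True)     # per-image, per-channel mean
    var = x.var(axis=(1, 2), keepdims=True)     # per-image, per-channel variance
    return (x - mu) / np.sqrt(var + eps)

def gram_matrix(features):
    """Gram matrix of one activation map (e.g. VGG relu1_1). A style
    loss compares these matrices between the style image and the
    stylized output. features: shape (height, width, channels)."""
    h, w, c = features.shape
    f = features.reshape(h * w, c)
    return f.T @ f / (h * w * c)
```

After instance normalization, every feature map of every image has (approximately) zero mean and unit variance over its spatial extent, which is what makes the stylization independent of batch composition.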

Documentation

Training Style Transfer Networks

Use style.py to train a new style transfer network. Run python style.py to view all the possible parameters. Training takes 4-6 hours on a Maxwell Titan X. More detailed documentation here. Before you run this, you should run setup.sh. Example usage:
python style.py --style path/to/style/img.jpg \
  --checkpoint-dir checkpoint/path \
  --test path/to/test/img.jpg \
  --test-dir path/to/test/dir \
  --content-weight 1.5e1 \
  --checkpoint-iterations 1000 \
  --batch-size 20

To Train on Floyd

Run style.py as you normally would. The input data generated by setup.sh is available as a Floyd data source.
floyd run --data <vgg_and_training_data_source> "python style.py --style path/to/style/img.jpg \
  --checkpoint-dir /output \
  --test path/to/test/img.jpg \
  --test-dir path/to/test/dir \
  --content-weight 1.5e1 \
  --checkpoint-iterations 1000 \
  --batch-size 20"
Remember: the floyd command uploads the contents of your current directory, so use relative paths only. '/output' is a special path: contents stored there are saved after training is done, so checkpoints should be directed there.

Evaluating Style Transfer Networks

Use evaluate.py to evaluate a style transfer network. Run python evaluate.py to view all the possible parameters. Evaluation takes 100 ms per frame (when batch size is 1) on a Maxwell Titan X, and several seconds per frame on a CPU. More detailed documentation here. Models for evaluation are located here. Example usage:
python evaluate.py --checkpoint path/to/style/model.ckpt \
  --in-path dir/of/test/imgs/ \
  --out-path dir/for/results/

To evaluate on Floyd

floyd run --data <output_id_of_training> "python evaluate.py --checkpoint /input/model.ckpt \
  --in-path dir/of/test/imgs/ \
  --out-path dir/for/results/"
Remember: '/input' is a special path. Any data source included in the run command will be available at that path.

Stylizing Video

Use transform_video.py to transfer style into a video. Run python transform_video.py to view all the possible parameters. Requires ffmpeg. More detailed documentation here. Example usage:
python transform_video.py --in-path path/to/input/vid.mp4 \
  --checkpoint path/to/style/model.ckpt \
  --out-path out/video.mp4 \
  --device /gpu:0 \
  --batch-size 4

Requirements

You will need the following to run the above:
  • TensorFlow 0.11.0
  • Python 2.7.9, Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2
  • If you want to train (and don't want to wait for 4 months):
    • A decent GPU
    • All the required NVIDIA software to run TF on a GPU (cuda, etc)
  • ffmpeg 3.1.3 if you want to stylize video
Floyd requirements are specified in the floyd_requirements.txt file. They will be installed before running your code.
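As a hypothetical example, a floyd_requirements.txt pinning the versions listed above might look like the following (TensorFlow itself typically comes from the Floyd environment rather than this file):

```text
Pillow==3.4.2
scipy==0.18.1
numpy==1.11.2
```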

Citation

  @misc{engstrom2016faststyletransfer,
    author = {Logan Engstrom},
    title = {Fast Style Transfer},
    year = {2016},
    howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
    note = {commit xxxxxxx}
  }

Attributions/Thanks

  • This project could not have happened without the advice (and GPU access) given by Anish Athalye.
    • The project also borrowed some code from Anish's Neural Style
  • Some readme/docs formatting was borrowed from Justin Johnson's Fast Neural Style
  • The image of the Stata Center at the very beginning of the README was taken by Juan Paulo

License

Copyright (c) 2016 Logan Engstrom. Contact me for commercial use (email: engstrom at my university's domain dot edu). Free for research/noncommercial use, as long as proper attribution is given and this copyright notice is retained.
