{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "g_nWetWWd_ns"
      },
      "source": [
        "##### Copyright 2019 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "2pHVBk_seED1"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "M7vSdG6sAIQn"
      },
      "source": [
        "# Artistic Style Transfer with TensorFlow Lite"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fwc5GKHBASdc"
      },
      "source": [
        "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/lite/examples/style_transfer/overview\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/examples/style_transfer/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/examples/style_transfer/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/tensorflow/tensorflow/lite/g3doc/examples/style_transfer/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca href=\"https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/hub_logo_32px.png\" /\u003eSee TF Hub model\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "\u003c/table\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "31O0iaROAw8z"
      },
      "source": [
        "One of the most exciting developments in deep learning to come out recently is [artistic style transfer](https://arxiv.org/abs/1508.06576), or the ability to create a new image, known as a [pastiche](https://en.wikipedia.org/wiki/Pastiche), based on two input images: one representing the artistic style and one representing the content.\n",
        "\n",
        "![Style transfer example](https://storage.googleapis.com/download.tensorflow.org/models/tflite/arbitrary_style_transfer/formula.png)\n",
        "\n",
        "Using this technique, we can generate beautiful new artworks in a range of styles.\n",
        "\n",
        "![Style transfer example](https://storage.googleapis.com/download.tensorflow.org/models/tflite/arbitrary_style_transfer/table.png)\n",
        "\n",
        "If you are new to TensorFlow Lite and are working with Android, we\n",
        "recommend exploring the following example applications that can help you get\n",
        "started.\n",
        "\n",
        "\u003ca class=\"button button-primary\" href=\"https://github.com/tensorflow/examples/tree/master/lite/examples/style_transfer/android\"\u003eAndroid\n",
        "example\u003c/a\u003e \u003ca class=\"button button-primary\" href=\"https://github.com/tensorflow/examples/tree/master/lite/examples/style_transfer/ios\"\u003eiOS\n",
        "example\u003c/a\u003e\n",
        "\n",
        "If you are using a platform other than Android or iOS, or you are already\n",
        "familiar with the\n",
        "\u003ca href=\"https://www.tensorflow.org/api_docs/python/tf/lite\"\u003eTensorFlow Lite\n",
        "APIs\u003c/a\u003e, you can follow this tutorial to learn how to apply style transfer on any pair of content and style image with a pre-trained TensorFlow Lite model. You can use the model to add style transfer to your own mobile applications.\n",
        "\n",
        "The model is open-sourced on [GitHub](https://github.com/tensorflow/magenta/tree/master/magenta/models/arbitrary_image_stylization#train-a-model-on-a-large-dataset-with-data-augmentation-to-run-on-mobile). You can retrain the model with different parameters (e.g. increase content layers' weights to make the output image look more like the content image)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ak0S4gkOCSxs"
      },
      "source": [
        "## Understand the model architecture"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oee6G_bBCgAM"
      },
      "source": [
        "![Model Architecture](https://storage.googleapis.com/download.tensorflow.org/models/tflite/arbitrary_style_transfer/architecture.png)\n",
        "\n",
        "This Artistic Style Transfer model consists of two submodels:\n",
        "1. **Style Prediction Model**: A MobileNetV2-based neural network that maps an input style image to a 100-dimension style bottleneck vector.\n",
        "1. **Style Transform Model**: A neural network that applies a style bottleneck vector to a content image and creates a stylized image.\n",
        "\n",
        "If your app only needs to support a fixed set of style images, you can compute their style bottleneck vectors in advance, and exclude the Style Prediction Model from your app's binary."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "a7ZETsRVNMo7"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3n8oObKZN4c8"
      },
      "source": [
        "Import dependencies."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "xz62Lb1oNm97"
      },
      "outputs": [],
      "source": [
        "import tensorflow as tf\n",
        "print(tf.__version__)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1Ua5FpcJNrIj"
      },
      "outputs": [],
      "source": [
        "import IPython.display as display\n",
        "\n",
        "import matplotlib.pyplot as plt\n",
        "import matplotlib as mpl\n",
        "mpl.rcParams['figure.figsize'] = (12,12)\n",
        "mpl.rcParams['axes.grid'] = False\n",
        "\n",
        "import numpy as np\n",
        "import time\n",
        "import functools"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1b988wrrQnVF"
      },
      "source": [
        "Download the content and style images, and the pre-trained TensorFlow Lite models."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "16g57cIMQnen"
      },
      "outputs": [],
      "source": [
        "content_path = tf.keras.utils.get_file('belfry.jpg','https://storage.googleapis.com/khanhlvg-public.appspot.com/arbitrary-style-transfer/belfry-2611573_1280.jpg')\n",
        "style_path = tf.keras.utils.get_file('style23.jpg','https://storage.googleapis.com/khanhlvg-public.appspot.com/arbitrary-style-transfer/style23.jpg')\n",
        "\n",
        "style_predict_path = tf.keras.utils.get_file('style_predict.tflite', 'https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/int8/prediction/1?lite-format=tflite')\n",
        "style_transform_path = tf.keras.utils.get_file('style_transform.tflite', 'https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/int8/transfer/1?lite-format=tflite')"
      ]
    },
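    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kS3EYubVQrR1"
      },
      "source": [
        "As a quick sanity check, you can load the two models with the TensorFlow Lite interpreter and print the input shapes they expect. The cell below is a minimal sketch of that check; it should confirm the (1, 256, 256, 3) and (1, 384, 384, 3) input sizes used in the next section."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "kS3EYubVQrR2"
      },
      "outputs": [],
      "source": [
        "# Print the input tensor shape and dtype each downloaded model expects.\n",
        "for name, path in [('style prediction', style_predict_path),\n",
        "                   ('style transform', style_transform_path)]:\n",
        "  interpreter = tf.lite.Interpreter(model_path=path)\n",
        "  interpreter.allocate_tensors()\n",
        "  for detail in interpreter.get_input_details():\n",
        "    print(name, ':', detail['shape'], detail['dtype'])"
      ]
    },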
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MQZXL7kON-gM"
      },
      "source": [
        "## Pre-process the inputs\n",
        "\n",
        "* The content image and the style image must be RGB images with float32 pixel values in the range [0, 1].\n",
        "* The style image size must be (1, 256, 256, 3). We centrally crop and resize the image.\n",
        "* The content image size must be (1, 384, 384, 3). We centrally crop and resize the image."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Cg0Vi-rXRUFl"
      },
      "outputs": [],
      "source": [
        "# Function to load an image from a file, and add a batch dimension.\n",
        "def load_img(path_to_img):\n",
        "  img = tf.io.read_file(path_to_img)\n",
        "  img = tf.io.decode_image(img, channels=3)\n",
        "  img = tf.image.convert_image_dtype(img, tf.float32)\n",
        "  img = img[tf.newaxis, :]\n",
        "\n",
        "  return img\n",
        "\n",
        "# Function to pre-process an image by resizing and centrally cropping it.\n",
        "def preprocess_image(image, target_dim):\n",
        "  # Resize the image so that the shorter dimension becomes target_dim pixels.\n",
        "  shape = tf.cast(tf.shape(image)[1:-1], tf.float32)\n",
        "  short_dim = min(shape)\n",
        "  scale = target_dim / short_dim\n",
        "  new_shape = tf.cast(shape * scale, tf.int32)\n",
        "  image = tf.image.resize(image, new_shape)\n",
        "\n",
        "  # Central crop the image.\n",
        "  image = tf.image.resize_with_crop_or_pad(image, target_dim, target_dim)\n",
        "\n",
        "  return image\n",
        "\n",
        "# Load the input images.\n",
        "content_image = load_img(content_path)\n",
        "style_image = load_img(style_path)\n",
        "\n",
        "# Preprocess the input images.\n",
        "preprocessed_content_image = preprocess_image(content_image, 384)\n",
        "preprocessed_style_image = preprocess_image(style_image, 256)\n",
        "\n",
        "print('Style Image Shape:', preprocessed_style_image.shape)\n",
        "print('Content Image Shape:', preprocessed_content_image.shape)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xE4Yt8nArTeR"
      },
      "source": [
        "## Visualize the inputs"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ncPA4esJRcEu"
      },
      "outputs": [],
      "source": [
        "def imshow(image, title=None):\n",
        "  if len(image.shape) \u003e 3:\n",
        "    image = tf.squeeze(image, axis=0)\n",
        "\n",
        "  plt.imshow(image)\n",
        "  if title:\n",
        "    plt.title(title)\n",
        "\n",
        "plt.subplot(1, 2, 1)\n",
        "imshow(preprocessed_content_image, 'Content Image')\n",
        "\n",
        "plt.subplot(1, 2, 2)\n",
        "imshow(preprocessed_style_image, 'Style Image')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CJ7R-CHbjC3s"
      },
      "source": [
        "## Run style transfer with TensorFlow Lite"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "euu00ldHjKwD"
      },
      "source": [
        "### Style prediction"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "o3zd9cTFRiS_"
      },
      "outputs": [],
      "source": [
        "# Function to run style prediction on preprocessed style image.\n",
        "def run_style_predict(preprocessed_style_image):\n",
        "  # Load the model.\n",
        "  interpreter = tf.lite.Interpreter(model_path=style_predict_path)\n",
        "\n",
        "  # Set model input.\n",
        "  interpreter.allocate_tensors()\n",
        "  input_details = interpreter.get_input_details()\n",
        "  interpreter.set_tensor(input_details[0][\"index\"], preprocessed_style_image)\n",
        "\n",
        "  # Calculate style bottleneck.\n",
        "  interpreter.invoke()\n",
        "  style_bottleneck = interpreter.tensor(\n",
        "      interpreter.get_output_details()[0][\"index\"]\n",
        "      )()\n",
        "\n",
        "  return style_bottleneck\n",
        "\n",
        "# Calculate style bottleneck for the preprocessed style image.\n",
        "style_bottleneck = run_style_predict(preprocessed_style_image)\n",
        "print('Style Bottleneck Shape:', style_bottleneck.shape)"
      ]
    },
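    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fXbQh1JePrS1"
      },
      "source": [
        "As noted in the model architecture section, an app that only supports a fixed set of styles can precompute their style bottleneck vectors and ship without the Style Prediction Model. The cell below is a minimal sketch of that workflow; the single-entry style list and the `style_bottlenecks.npz` file name are illustrative, not part of the released example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "fXbQh1JePrS2"
      },
      "outputs": [],
      "source": [
        "# A minimal sketch: precompute style bottlenecks for a fixed set of\n",
        "# styles so an app can ship only the Style Transform Model.\n",
        "# The style list and output file name below are illustrative only.\n",
        "style_paths = {'style23': style_path}\n",
        "\n",
        "style_bottlenecks = {\n",
        "    name: run_style_predict(preprocess_image(load_img(path), 256))\n",
        "    for name, path in style_paths.items()\n",
        "}\n",
        "\n",
        "# Bundle this file with the app instead of the prediction model.\n",
        "np.savez('style_bottlenecks.npz', **style_bottlenecks)\n",
        "print({name: vector.shape for name, vector in style_bottlenecks.items()})"
      ]
    },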
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "00t8S2PekIyW"
      },
      "source": [
        "### Style transform"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cZp5bCj8SX1w"
      },
      "outputs": [],
      "source": [
        "# Function to run style transform on a preprocessed content image.\n",
        "def run_style_transform(style_bottleneck, preprocessed_content_image):\n",
        "  # Load the model.\n",
        "  interpreter = tf.lite.Interpreter(model_path=style_transform_path)\n",
        "\n",
        "  # Gather input details and allocate tensors.\n",
        "  input_details = interpreter.get_input_details()\n",
        "  interpreter.allocate_tensors()\n",
        "\n",
        "  # Set model inputs.\n",
        "  interpreter.set_tensor(input_details[0][\"index\"], preprocessed_content_image)\n",
        "  interpreter.set_tensor(input_details[1][\"index\"], style_bottleneck)\n",
        "  interpreter.invoke()\n",
        "\n",
        "  # Transform content image.\n",
        "  stylized_image = interpreter.tensor(\n",
        "      interpreter.get_output_details()[0][\"index\"]\n",
        "      )()\n",
        "\n",
        "  return stylized_image\n",
        "\n",
        "# Stylize the content image using the style bottleneck.\n",
        "stylized_image = run_style_transform(style_bottleneck, preprocessed_content_image)\n",
        "\n",
        "# Visualize the output.\n",
        "imshow(stylized_image, 'Stylized Image')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vv_71Td-QtrW"
      },
      "source": [
        "### Style blending\n",
        "\n",
        "We can blend the style of the content image into the stylized output, which in turn makes the output look more like the content image."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "eJcAURXQQtJ7"
      },
      "outputs": [],
      "source": [
        "# Calculate style bottleneck of the content image.\n",
        "style_bottleneck_content = run_style_predict(\n",
        "    preprocess_image(content_image, 256)\n",
        "    )"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4S3yg2MgkmRD"
      },
      "outputs": [],
      "source": [
        "# Define the content blending ratio in the range [0, 1].\n",
        "# 0.0: 0% of the style is extracted from the content image.\n",
        "# 1.0: 100% of the style is extracted from the content image.\n",
        "content_blending_ratio = 0.5 #@param {type:\"slider\", min:0, max:1, step:0.01}\n",
        "\n",
        "# Blend the style bottlenecks of the style image and the content image.\n",
        "style_bottleneck_blended = content_blending_ratio * style_bottleneck_content \\\n",
        "                           + (1 - content_blending_ratio) * style_bottleneck\n",
        "\n",
        "# Stylize the content image using the blended style bottleneck.\n",
        "stylized_image_blended = run_style_transform(style_bottleneck_blended,\n",
        "                                             preprocessed_content_image)\n",
        "\n",
        "# Visualize the output.\n",
        "imshow(stylized_image_blended, 'Blended Stylized Image')"
      ]
    },
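    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hGmVs7sOkpT1"
      },
      "source": [
        "To see how the blending ratio affects the result, you can sweep it over several values. The loop below is a small illustrative addition: it applies the same blending formula as above and re-runs the transform model once per ratio, so it takes several times longer than a single stylization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hGmVs7sOkpT2"
      },
      "outputs": [],
      "source": [
        "# Sweep the content blending ratio; higher values pull the output\n",
        "# toward the style extracted from the content image itself.\n",
        "for i, ratio in enumerate([0.0, 0.25, 0.5, 0.75, 1.0]):\n",
        "  blended = ratio * style_bottleneck_content \\\n",
        "            + (1 - ratio) * style_bottleneck\n",
        "  plt.subplot(1, 5, i + 1)\n",
        "  imshow(run_style_transform(blended, preprocessed_content_image),\n",
        "         'ratio=%.2f' % ratio)"
      ]
    },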
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9k9jGIep8p1c"
      },
      "source": [
        "## Performance Benchmarks\n",
        "\n",
        "Performance benchmark numbers are generated with the tool [described here](https://www.tensorflow.org/lite/performance/benchmarks).\n",
        "\u003ctable\u003e\u003cthead\u003e\u003ctr\u003e\u003cth\u003eModel name\u003c/th\u003e \u003cth\u003eModel size\u003c/th\u003e \u003cth\u003eDevice\u003c/th\u003e \u003cth\u003eNNAPI\u003c/th\u003e \u003cth\u003eCPU\u003c/th\u003e \u003cth\u003eGPU\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\n",
        "\u003ctr\u003e \u003ctd rowspan=3\u003e \u003ca href=\"https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/int8/prediction/1?lite-format=tflite\"\u003eStyle prediction model (int8)\u003c/a\u003e \u003c/td\u003e\n",
        "\u003ctd rowspan=3\u003e2.8 MB\u003c/td\u003e\n",
        "\u003ctd\u003ePixel 3 (Android 10)\u003c/td\u003e \u003ctd\u003e142ms\u003c/td\u003e\u003ctd\u003e14ms*\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003ePixel 4 (Android 10)\u003c/td\u003e \u003ctd\u003e5.2ms\u003c/td\u003e\u003ctd\u003e6.7ms*\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003eiPhone XS (iOS 12.4.1)\u003c/td\u003e \u003ctd\u003e\u003c/td\u003e\u003ctd\u003e10.7ms**\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e \u003ctd rowspan=3\u003e \u003ca href=\"https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/int8/transfer/1?lite-format=tflite\"\u003eStyle transform model (int8)\u003c/a\u003e \u003c/td\u003e\n",
        "\u003ctd rowspan=3\u003e0.2 MB\u003c/td\u003e\n",
        "\u003ctd\u003ePixel 3 (Android 10)\u003c/td\u003e \u003ctd\u003e\u003c/td\u003e\u003ctd\u003e540ms*\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003ePixel 4 (Android 10)\u003c/td\u003e \u003ctd\u003e\u003c/td\u003e\u003ctd\u003e405ms*\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003eiPhone XS (iOS 12.4.1)\u003c/td\u003e \u003ctd\u003e\u003c/td\u003e\u003ctd\u003e251ms**\u003c/td\u003e\u003ctd\u003e\u003c/td\u003e\u003c/tr\u003e\n",
        "\n",
        "\u003ctr\u003e \u003ctd rowspan=2\u003e \u003ca href=\"https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/fp16/prediction/1?lite-format=tflite\"\u003eStyle prediction model (float16)\u003c/a\u003e \u003c/td\u003e\n",
        "\u003ctd rowspan=2\u003e4.7 MB\u003c/td\u003e\n",
        "\u003ctd\u003ePixel 3 (Android 10)\u003c/td\u003e \u003ctd\u003e86ms\u003c/td\u003e\u003ctd\u003e28ms*\u003c/td\u003e\u003ctd\u003e9.1ms\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003ePixel 4 (Android 10)\u003c/td\u003e\u003ctd\u003e32ms\u003c/td\u003e\u003ctd\u003e12ms*\u003c/td\u003e\u003ctd\u003e10ms\u003c/td\u003e\u003c/tr\u003e\n",
        "\n",
        "\u003ctr\u003e \u003ctd rowspan=2\u003e \u003ca href=\"https://tfhub.dev/google/lite-model/magenta/arbitrary-image-stylization-v1-256/fp16/transfer/1?lite-format=tflite\"\u003eStyle transfer model (float16)\u003c/a\u003e \u003c/td\u003e\n",
        "\u003ctd rowspan=2\u003e0.4 MB\u003c/td\u003e\n",
        "\u003ctd\u003ePixel 3 (Android 10)\u003c/td\u003e \u003ctd\u003e1095ms\u003c/td\u003e\u003ctd\u003e545ms*\u003c/td\u003e\u003ctd\u003e42ms\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003ePixel 4 (Android 10)\u003c/td\u003e\u003ctd\u003e603ms\u003c/td\u003e\u003ctd\u003e377ms*\u003c/td\u003e\u003ctd\u003e42ms\u003c/td\u003e\u003c/tr\u003e\n",
        "\n",
        "\u003c/table\u003e\n",
        "\n",
        "*\u0026ast; 4 threads used.* \u003cbr/\u003e\n",
        "*\u0026ast;\u0026ast; 2 threads on iPhone for the best performance.*\n"
      ]
    },
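    {
      "cell_type": "markdown",
      "metadata": {
        "id": "q2beXcLBn9A1"
      },
      "source": [
        "The CPU numbers above were measured with multiple threads. In the Python API, you can control the thread count through the interpreter's `num_threads` argument. The timing sketch below reuses the variables defined earlier in this notebook; the latency it prints depends on your machine and is not comparable to the mobile benchmarks above."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "q2beXcLBn9A2"
      },
      "outputs": [],
      "source": [
        "# A minimal sketch of multi-threaded CPU inference, mirroring the\n",
        "# 4-thread setting noted in the benchmark footnote above.\n",
        "interpreter = tf.lite.Interpreter(model_path=style_transform_path,\n",
        "                                  num_threads=4)\n",
        "interpreter.allocate_tensors()\n",
        "input_details = interpreter.get_input_details()\n",
        "interpreter.set_tensor(input_details[0]['index'], preprocessed_content_image)\n",
        "interpreter.set_tensor(input_details[1]['index'], style_bottleneck)\n",
        "\n",
        "start = time.time()\n",
        "interpreter.invoke()\n",
        "print('Style transform latency: %.1f ms' % ((time.time() - start) * 1000))"
      ]
    }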
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "overview.ipynb",
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}