Stable Diffusion Tutorial: How to bring book characters to life with Stable Diffusion

Introduction

Chroma is the AI-native open-source embedding database. It makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs. Get inspired by other Chroma tutorials.

Cohere is a platform that allows you to build AI-powered applications with just a few lines of code. Cohere’s API supports a wide range of applications, including chatbots, question answering systems, and summarization tools. See what amazing Cohere apps the lablab.ai community has built!

Stable Diffusion is a latent text-to-image diffusion model that generates images from text prompts through a series of iterative denoising steps. Check out these amazing Stable Diffusion applications!

What are we going to do?

In this tutorial, I will show you how to use Chroma DB and Cohere embeddings to bring characters from books to life using the Stable Diffusion image generation model. Sit back, relax, and enjoy the tutorial! Don’t forget to make a cup of coffee; it may take a while to generate an image.

To make it clearer, let me split the tutorial into two parts:

  • Part 1 – Getting prompt for Stable Diffusion. In this part, we will go through Chroma DB and Cohere. We will load the document, split it into smaller chunks, embed them using Cohere, and then query the Chroma database to get the prompt to use in Part 2.
  • Part 2 – Generating images using Stable Diffusion. In this part, we will go through the Stability SDK and implement the code to generate images based on the prompt we got from Chroma DB in Part 1.

Learning outcomes

  • How to use Google Colab.
  • Getting familiar with Chroma, Cohere and Stable Diffusion.
  • How to use Cohere LLM to embed large files.
  • How to use Cohere embeddings.
  • How to use Chroma to store the embeddings.
  • How to use Chroma to query the database.
  • How to use the Stability SDK to generate images and bring characters from books to life.

Prerequisites

To use Cohere embeddings we need an API key. Go to Cohere, click TRY NOW in the top right corner, and log in or create an account. Once you have created an account, you will be redirected to the dashboard. Click API Keys in the left sidebar, then copy the API key and save it somewhere safe.

Cohere Dashboard
Cohere API Key

To use Stable Diffusion we need an API key. Go to Dream Studio and sign up for an account. Once you have created an account, you will be taken to your API key. Copy the API key and save it somewhere safe.

Stable Diffusion API Key

No prior knowledge of Google Colab is required. I will guide you through the whole process.

Getting started

Create a new project

Let’s start by creating a new notebook in Google Colab. Go to Google Colab, open the File menu, and click New notebook.

Creating new Notebook in Google Colab

This opens a new notebook in a new tab. Give it a name by clicking on Untitled0 and renaming it to Chroma Stable Diffusion Tutorial or whatever you want.

Great, we are ready to start coding!

Install dependencies

Add a new code cell. You can do it by clicking the + Code button or with the shortcut CMD/CTRL + M B.

Install the necessary libraries which we are going to use throughout the tutorial:

Click the Run button or press CMD/CTRL + Enter. This runs the active code cell and takes a few minutes to install all the necessary libraries. Make sure that you have a stable internet connection.

Now, if everything is installed correctly, we can move on to the next step.
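
To double-check the installation, one option is to ask Python which packages it can find. The sketch below uses only the standard library. The module names in `REQUIRED` are assumptions based on the tools named in this tutorial (Cohere, Chroma, LangChain's PyMuPDFLoader, the Stability SDK, and PIL), so adjust them to match what you actually installed.

```python
import importlib.util

# Assumed import names for the libraries used in this tutorial.
REQUIRED = ["cohere", "chromadb", "langchain", "fitz", "stability_sdk", "PIL"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

If the list is not empty, re-run the install cell before moving on.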

Import dependencies

Add new code cell.

Here we will import all the necessary libraries. Copy and paste the following lines of code:

Click Run or CMD/CTRL + Enter.

When you run the cell, you may see one or more warning messages. Don’t worry, we can ignore them.

Note: You don’t need to save the notebook every time you run a code cell; Google Colab will automatically save it for you. If you want to save it manually, click File > Save or use the shortcut CMD/CTRL + S.

Export environment variables:

Add new code cell.
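
A minimal sketch of what this cell could contain, assuming the keys are stored in environment variables. The variable names `COHERE_API_KEY`, `STABILITY_KEY`, and `STABILITY_HOST`, the host value, and the helper `export_keys` are assumptions for illustration; use whatever names the rest of your notebook reads.

```python
import os


def export_keys(cohere_key, stability_key):
    """Store the API keys and the Stability host in environment variables."""
    os.environ["COHERE_API_KEY"] = cohere_key
    os.environ["STABILITY_KEY"] = stability_key
    # Assumed gRPC endpoint for the Stability SDK; check your account settings.
    os.environ["STABILITY_HOST"] = "grpc.stability.ai:443"


# Replace the placeholders with the keys you saved in the Prerequisites step.
export_keys("YOUR_COHERE_API_KEY", "YOUR_STABILITY_API_KEY")
```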

Click Run or CMD/CTRL + Enter.

Part 1 – Getting prompt for Stable Diffusion

First, let’s upload the book to Google Colab. In this tutorial, we will go with Harry Potter and the Sorcerer’s Stone. You can download the PDF version here.

After downloading it, go back to Google Colab, open the Files tab on the left side of the screen, click Upload to session storage, and upload the file. Wait until the book is uploaded and then copy the path.

Path to uploaded document

Now, we can load the file.

Add new code cell.
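
As a sketch, loading could look like the following, assuming the PDF is read with LangChain's `PyMuPDFLoader` (the loader named in the Summary). The helper name `load_book`, the example path, and the exact import location are assumptions; the import path in particular depends on the LangChain version you installed.

```python
from pathlib import Path


def load_book(pdf_path):
    """Load a PDF into a list of LangChain documents (one per page)."""
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"No file found at {path}")
    # Imported lazily; the module location is an assumption that varies by version.
    try:
        from langchain_community.document_loaders import PyMuPDFLoader
    except ImportError:
        from langchain.document_loaders import PyMuPDFLoader
    return PyMuPDFLoader(str(path)).load()


# Hypothetical path; paste the one you copied from the Files tab.
# documents = load_book("/content/your-book.pdf")
```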

Click Run or CMD/CTRL + Enter.

Let’s split the document into smaller chunks.

Why? We need to make sure that the LLM can process the text. Models accept only a limited amount of input at once, so a whole book will not fit; smaller chunks can be embedded and retrieved one by one.

Add new code cell.
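
To show the idea behind chunking, here is a plain-Python sketch that cuts text into fixed-size pieces with a small overlap. It is a stand-in for illustration, not the splitter this tutorial necessarily used; the function name `split_text` and the default sizes are assumptions.

```python
def split_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a boundary
    appears in both neighbouring chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=1))  # ['abcd', 'defg', 'ghij']
```

A larger overlap makes it less likely that a description of a character is cut in half, at the cost of storing more chunks.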

Click Run or CMD/CTRL + Enter.

Next, we will create a vector store.

Add new code cell.
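
Conceptually, a vector store keeps each chunk next to its embedding and returns the chunks whose embeddings are closest to the query. The toy below illustrates that with word-count vectors and cosine similarity using only the standard library. It is not Chroma and not Cohere: `embed`, `cosine`, and `ToyVectorStore` are hypothetical stand-ins, and the sample sentences are made up rather than quoted from the book.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, texts):
        for text in texts:
            self._items.append((text, embed(text)))

    def query(self, question, k=2):
        """Return the k stored texts most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add([
    "Harry has untidy black hair and round glasses.",
    "Hagrid is a giant of a man with a wild beard.",
    "The train leaves from platform nine and three quarters.",
])
print(store.query("What does Harry look like?", k=1))
```

Real embeddings capture meaning rather than shared words, which is why the tutorial relies on Cohere for the vectors and Chroma for storage and search.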

Click Run or CMD/CTRL + Enter.

Now, we should create a chain.

Add new code cell.
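
A retrieval chain typically fetches the most relevant chunks and places them in the prompt sent to the LLM together with the question. The sketch below shows only that prompt-building step in plain Python; the function name `build_qa_prompt` and the prompt wording are assumptions, not the chain used in this tutorial.

```python
def build_qa_prompt(question, context_chunks):
    """Combine retrieved chunks and a question into a single LLM prompt."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


print(build_qa_prompt("Who is Harry?", ["first retrieved chunk", "second retrieved chunk"]))
```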

Click Run or CMD/CTRL + Enter.

Perfect! We are done with the chain. Now we can query the processed book. Let’s try asking a question about Harry Potter.

Add new code cell and copy/paste the following lines of code:

Click Run or CMD/CTRL + Enter.

You should see something like the image below, but don’t worry if you see something different.

About Harry Potter

Part 2 – Generating image using Stable Diffusion

Now, we will generate an image using Stable Diffusion. We will use the Stability SDK to generate it. Let’s create a Stability SDK client.

Add new code cell.
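
A sketch of creating the client, assuming the key is stored in a `STABILITY_KEY` environment variable and using the SDK's `client.StabilityInference` class. The helper name `make_stability_client` and the engine ID are assumptions; pick an engine that is available to your account.

```python
import os


def make_stability_client(engine="stable-diffusion-xl-1024-v1-0"):
    """Create a Stability SDK client from the STABILITY_KEY environment variable."""
    key = os.environ.get("STABILITY_KEY")
    if not key:
        raise RuntimeError("Set the STABILITY_KEY environment variable first")
    # Imported lazily so the key check works even before the SDK is installed.
    from stability_sdk import client
    return client.StabilityInference(key=key, verbose=True, engine=engine)


# stability_api = make_stability_client()
```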

Click Run or CMD/CTRL + Enter.

Next, paste the prompt you got from the Chroma chain in Part 1.

Add new code cell.

Click Run or CMD/CTRL + Enter.

Now we can generate an image based on the prompt.

Add new code cell.
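
The generation step could be sketched as below, assuming a client created with the Stability SDK. The helper names `collect_images` and `generate_images`, and the `steps`, `cfg_scale`, `width`, `height`, and `seed` values, are illustrative assumptions. The offline demo at the end uses fake response objects, so it runs without the SDK or an API key.

```python
from types import SimpleNamespace


def collect_images(answers, image_type):
    """Return (seed, binary) pairs for every image artifact in the responses."""
    images = []
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.type == image_type:
                images.append((artifact.seed, artifact.binary))
    return images


def generate_images(stability_api, prompt, seed=42):
    """Request one image for the prompt and return (seed, image_bytes) pairs."""
    # Lazy import: requires the stability-sdk package.
    import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
    answers = stability_api.generate(
        prompt=prompt, seed=seed, steps=30, cfg_scale=8.0,
        width=512, height=512, samples=1,
    )
    return collect_images(answers, generation.ARTIFACT_IMAGE)


# Offline demo with fake responses (no network call is made here).
IMAGE = 1  # stand-in for generation.ARTIFACT_IMAGE in this demo
fake_answers = [SimpleNamespace(artifacts=[
    SimpleNamespace(type=IMAGE, seed=1234, binary=b"png-bytes"),
    SimpleNamespace(type=2, seed=1234, binary=b""),
])]
print(collect_images(fake_answers, IMAGE))  # [(1234, b'png-bytes')]
```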

Click Run or CMD/CTRL + Enter.

It will take a while to generate an image. Once it’s done, we can save the image.

Add new code cell and copy/paste the following lines of code:
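
One way to do the saving step is sketched below. The tutorial uses PIL for image handling; this sketch instead writes the returned bytes straight to disk with the standard library, which assumes the API already returns PNG-encoded data. The helper name `save_image` is hypothetical, and the demo writes placeholder bytes to a temporary folder.

```python
import tempfile
from pathlib import Path


def save_image(seed, image_bytes, directory="."):
    """Write image bytes to <directory>/<seed>.png and return the path."""
    path = Path(directory) / f"{seed}.png"
    path.write_bytes(image_bytes)
    return path


# Demo with placeholder bytes in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_image(1234, b"placeholder bytes", directory=tmp)
    print(saved.name)  # 1234.png
```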

It will save the image with the seed number as the filename in the same directory as this notebook.

Click Run or CMD/CTRL + Enter.

Now you can download the image and view it right away.

Harry Potter generated art by Stable Diffusion

Add new code cell.

Click Run or CMD/CTRL + Enter.

Harry Potter generated art by Stable Diffusion

Congratulations! You’ve successfully brought a character to life using Stable Diffusion with the prompt generated by Chroma based on the Cohere embeddings.

Summary

Throughout the tutorial, we used various tools and libraries, including Chroma, Cohere embeddings, PyMuPDFLoader, Stability SDK, and the PIL library for image manipulation. We also discussed the prerequisites, which include obtaining API keys for Cohere and Stable Diffusion.

By following this tutorial, you should now have a better understanding of how to leverage Chroma DB and Cohere embeddings to generate images using Stable Diffusion. Feel free to explore further and experiment with different books and settings to generate unique and creative images.

Remember to refer to the respective documentation for Chroma, Cohere, and Stable Diffusion for more in-depth information and advanced usage. You can find them above in the Introduction.

Happy generating!

Thank you for following along with this tutorial. I hope you learned something new today. If you have any questions, feel free to reach out to me on LinkedIn or Twitter. I’d love to hear from you!

made with 💜 by abdibrokhim for lablab.ai tutorials.