How I created talking RobotEldar using AI

Marcus Strömberg

Published:

AI

In this article, I'll explain the basics of the AI services I used, what they do, and how to effectively use them through a technique known as "prompt engineering".

This little AI experiment takes about 30 minutes and is a fun introduction to artificial intelligence. Each of these AI services can be used in a more sophisticated way to generate even more complex things, but for an introduction, I'll keep it simple and cover the basics.

What is prompt engineering?

Prompt engineering is a way to create a conversation between you and the AI engine you are using, by feeding in task-specific prompts. The information you give to the AI engine is the starting point for what will be generated and is crucial for the outcome.

Tips for prompt engineering:

  • Keep it short and specific
  • Use relevant context in conversation
  • Be clear about the desired outcome if you have a specific type in mind
  • Avoid giving too much specific information
  • Experiment with different requests to see what works best

What you need to create a talking 3D character

For my talking 3D character, I used the following AI services:

  • ChatGPT (created the story)
  • Midjourney (created the image)
  • D-ID (created the video, with the narrative voice)

ChatGPT is free to use, but Midjourney and D-ID have a limit on how much you can use the services before you have to pay.

Before you start, as always, it's smart to have a rough idea of what you want to create. Some questions you might ask yourself are: What kind of story do I want to create? Do I want any specific events? How do I envision the character? Having this somewhat clear in your mind is helpful when instructing the AI services.

In this case, I wanted to create a character inspired by my colleague Eldar, and turn him into an AI robot who gets a job at an advertising agency!

Here's how you do it too, step by step:

Step 1: ChatGPT

Create a story with ChatGPT:

  • Provide a detailed "prompt"
  • Use the generated story as a starting point, if you're not satisfied, ask ChatGPT to edit it
  • Manually adjust the story if desired

Step 2: Midjourney

Create a character with Midjourney:

  • I used a prompt called "IMG + Text". This means I used the URL of a picture plus a text description.
  • Find a reference image
  • Paste the URL of the reference image and a text description to generate a character
  • Choose the best result by clicking on U1, U2, U3, U4 ("U = Upscale"), or generate more images until you are satisfied

Step 3: D-ID

Create animation and voice with D-ID:

  • Upload the image generated in step 2 to D-ID
  • Paste the story generated in step 1
  • Choose a preferred voice for the animation
  • Press "Play Sound" and listen to the story. Adjust words as needed for a smooth flow
  • Click "Generate Video" and wait for the result

The result is impressive, but not perfect

Over the past month, I've spent countless hours trying different AI services and experimenting to see what's possible and what's not. I've generated code, all kinds of data tables, ideas for social media posts, frameworks for text writing, stories, e-books, pictures, bedtime stories, videos, meal plans including shopping lists and recipes, and so on.

But a few days ago, when I first tried using this combination of AI services, I was really impressed with the result. Although the animation isn't perfect, or customizable, and the voice-over is the same old robot, it's impressive what can be achieved in no more than 30 minutes.

If we can create this now, try to imagine what we can create using AI as a tool in the future. Best to stay updated!