Prompting Guide

⏻ Getting Started with Prompts

A prompt is simply a description you write to tell the AI what image you want. Think of it as giving clear instructions to an artist.

Universal Best Practices

Be specific. Instead of "a dog," try "golden retriever sitting in green grass." The more details you provide, the more closely the result will match your vision.

Use natural language. Write descriptive phrases or full sentences, as if you were talking to someone, rather than keyword lists. "A cozy coffee shop with vintage furniture" works better than "coffee, shop, vintage, cozy."

Different models, different styles. Each AI model has its own strengths. Some prefer short, precise descriptions. Others work better with longer, narrative prompts. The sections below will show you what works best for each model.

💡 Start Simple: Begin with a basic description, generate an image, then refine your prompt based on what you see. Prompting is iterative—you'll improve with practice.

🍌 Gemini 2.5 Flash Image Preview

Gemini 2.5 Flash prefers narrative, descriptive paragraphs over keyword lists. Think of it as telling a story about the image you want, with rich contextual details and specific technical language.

Key Differences from Seedream 4.0

Narrative over concise: Longer, more descriptive prompts work better
Photographic language: Use camera, lens, and lighting terminology
Context matters: Explain the image's purpose and intent

Prompt Structure for Different Types

Photorealistic Scenes

Include these elements:

Camera angle (e.g., "low angle shot", "bird's eye view")
Lens type (e.g., "50mm lens", "wide-angle")
Lighting (e.g., "golden hour", "soft window light", "dramatic backlighting")
Mood and atmosphere
Texture details

Example:
✅ Good: "A photorealistic close-up portrait of an elderly Japanese ceramicist with deep, sun-etched wrinkles. Captured during golden hour with soft window light streaming from the left, emphasizing the clay texture on their weathered hands. Shot with a 50mm lens at f/2.8 for shallow depth of field."
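
If you build prompts in a script, the elements above can be combined into a narrative paragraph. The following is only a minimal sketch: the `PhotoPrompt` class, its field names, and the sentence template are hypothetical and not part of any Gemini API.

```python
from dataclasses import dataclass


@dataclass
class PhotoPrompt:
    """Hypothetical container for the photorealistic elements listed above."""
    subject: str
    camera_angle: str
    lens: str
    lighting: str
    mood: str
    texture: str

    def to_narrative(self) -> str:
        # Join the elements into full sentences rather than a keyword list.
        return (
            f"A photorealistic {self.camera_angle} of {self.subject}. "
            f"The scene is lit by {self.lighting}, creating a {self.mood} mood. "
            f"Shot with a {self.lens}, with fine detail in {self.texture}."
        )


prompt = PhotoPrompt(
    subject="an elderly ceramicist shaping a bowl",
    camera_angle="low angle shot",
    lens="50mm lens",
    lighting="soft window light",
    mood="calm, focused",
    texture="the wet clay and weathered hands",
)
print(prompt.to_narrative())
```

The template is deliberately rigid. In practice you would likely rewrite the generated paragraph by hand so it reads naturally.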

Stylized Illustrations

Be specific about:

Exact style (e.g., "kawaii", "minimalist line art", "watercolor")
Color palette (e.g., "pastel pink and mint green", "monochromatic blue")
Line characteristics (e.g., "bold outlines", "delicate pencil strokes")
Background style (e.g., "simple white background", "abstract geometric patterns")

Example:
✅ Good: "A kawaii-style illustration of a small robot character with round, expressive eyes. Drawn with smooth vector lines in a pastel color palette of soft pink, baby blue, and cream. Simple white background with subtle sparkle accents around the character."

Best Practices

1. Provide Context and Intent

Explain what the image is for:

✅ Good: "Create a hero image for a tech startup website featuring..."

2. Use Photographic Vocabulary

Technical terms help control composition:

Aperture: "f/1.4 for bokeh", "f/8 for sharpness"
Composition: "rule of thirds", "centered composition", "negative space"
Lighting: "Rembrandt lighting", "three-point lighting", "natural diffused light"

3. Iterate and Refine

Start with a base prompt and add details progressively based on the results. Gemini responds well to this kind of incremental refinement.

4. Use Semantic Negative Prompts

Instead of listing what to avoid, describe what you want:

❌ Avoid: "No blur, no distortion, no artifacts"
✅ Better: "Sharp focus, clean lines, professional quality"
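
As a rough illustration of the rewrite above, a small lookup can turn negative phrases into positive descriptions. This is a sketch under assumptions: the `POSITIVE_EQUIVALENTS` table and the `rephrase_negatives` function are hypothetical, and the one-to-one pairing of phrases is inferred from the example rather than stated by any model documentation.

```python
# Hypothetical lookup table; the pairs are an assumed mapping based on the example above.
POSITIVE_EQUIVALENTS = {
    "no blur": "sharp focus",
    "no distortion": "clean lines",
    "no artifacts": "professional quality",
}


def rephrase_negatives(negative_prompt: str) -> str:
    """Replace known negative phrases with positive descriptions."""
    parts = [p.strip().lower() for p in negative_prompt.split(",") if p.strip()]
    # Unknown phrases are skipped instead of being guessed at.
    positives = [POSITIVE_EQUIVALENTS[p] for p in parts if p in POSITIVE_EQUIVALENTS]
    return ", ".join(positives).capitalize()


print(rephrase_negatives("No blur, no distortion, no artifacts"))
```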

⚠️ Known Limitations: Complex typography can be challenging - keep text simple. Character feature consistency may require multiple iterations. Very specific celebrity likenesses are restricted.

📚 Learn More: Check out the official Google guide for additional examples and advanced techniques.

🌱 Seedream 4.0

Seedream 4.0 favors concise and precise descriptions over verbose, ornate language. The model understands prompts better than Seedream 3.0, so you can achieve better results with clearer, simpler prompts.

Basic Structure

Use this formula: subject + action + environment

For aesthetic images, add: style, color, lighting, composition

Example:
✅ Good: "A girl in a lavish dress walking under a parasol along a tree-lined path, in the style of a Monet oil painting."
❌ Avoid: "Girl, umbrella, tree-lined street, oil painting texture."
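
The formula above can be sketched as a small helper. The function name `seedream_prompt` and its parameters are hypothetical, chosen only to mirror the subject + action + environment structure. It does not call Seedream itself.

```python
from typing import Sequence


def seedream_prompt(subject: str, action: str, environment: str,
                    aesthetics: Sequence[str] = ()) -> str:
    """Compose subject + action + environment, then optional aesthetic phrases."""
    sentence = f"{subject} {action} {environment}"
    # Style, color, lighting, and composition phrases are appended in order.
    for phrase in aesthetics:
        sentence += f", {phrase}"
    return sentence + "."


print(seedream_prompt(
    "A girl in a lavish dress",
    "walking under a parasol",
    "along a tree-lined path",
    aesthetics=["in the style of a Monet oil painting"],
))
```

With these inputs the helper reproduces the good example above.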

Essential Techniques

1. Text Rendering

Use double quotes for text that should appear in the image:

✅ Good: Generate a poster with the title "Seedream 4.0"
❌ Avoid: Generate a poster titled Seedream 4.0
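
When prompts are assembled in code, it is easy to drop the quotes by accident. A minimal sketch of a helper that always wraps the rendered text in double quotes follows. The function names are hypothetical.

```python
def quote_image_text(text: str) -> str:
    """Wrap text that should be rendered in the image in double quotes."""
    # Remove surrounding whitespace and any quotes already present.
    stripped = text.strip().strip('"')
    return f'"{stripped}"'


def poster_prompt(title: str) -> str:
    """Build a poster prompt whose title is quoted for text rendering."""
    return f"Generate a poster with the title {quote_image_text(title)}"


print(poster_prompt("Seedream 4.0"))
```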

2. Be Specific, Not Vague

Use specific identifiers instead of vague pronouns:

✅ Good: "Dress the tallest panda in pink"
❌ Avoid: "Put that one in pink" or "Dress it"

3. Specify What Changes AND What Stays

When editing, clearly state what should be preserved:

✅ Good: "Replace the bread with a croissant, keeping the action and expression unchanged"

4. State Purpose/Type Explicitly

Tell the model what kind of image you need:

✅ Good: "Design a logo for a gaming company..."
❌ Avoid: "An abstract image with..."

5. Multi-Image References

Clearly label which image does what:

✅ Good: "Replace the character in Image 2 with the character from Image 1"

Editing Operations

Use operation prefixes to clarify your intent:

[Addition] - "Add matching silver earrings and a necklace to the girl"
[Deletion] - "Remove the girl's hat"
[Replacement] - "Replace the largest bread man with a croissant man, keeping the action and expression unchanged"
[Modification] - "Turn the three robots into transparent crystal, colored red, yellow and green from left to right"

❌ Common Mistakes to Avoid: Using fragmented keywords without context • Using vague pronouns (it, that one, this) • Stacking ornate, complex vocabulary unnecessarily • Missing quotes around text content • Not specifying what should stay unchanged during edits
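
Two of these mistakes can be caught with simple checks before a prompt is sent. The sketch below is a heuristic only: the function name, the pronoun list, and the thresholds for detecting a keyword list are assumptions, so it will produce false positives (for example, it flags any use of "this").

```python
import re
from typing import List

# Pronouns called out as vague in the list above.
VAGUE_PRONOUNS = re.compile(r"\b(it|that one|this)\b", re.IGNORECASE)


def lint_seedream_prompt(prompt: str) -> List[str]:
    """Return warnings for vague pronouns and keyword-list style prompts."""
    warnings = []
    if VAGUE_PRONOUNS.search(prompt):
        warnings.append("vague pronoun")
    # Assumed heuristic: three or more short comma-separated chunks
    # (three words or fewer each) suggest fragmented keywords.
    chunks = [c.strip() for c in prompt.rstrip(".").split(",") if c.strip()]
    if len(chunks) >= 3 and all(len(c.split()) <= 3 for c in chunks):
        warnings.append("fragmented keywords")
    return warnings


print(lint_seedream_prompt("Put that one in pink"))
print(lint_seedream_prompt("Girl, umbrella, tree-lined street, oil painting texture."))
print(lint_seedream_prompt("Dress the tallest panda in pink"))
```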

🖌️ Inpainting (General)

Inpainting prompts are different from regular generation prompts. Instead of commanding the AI what to do, describe the final outcome you want in the masked area.

Key Principles

No commands: Don't say "change X to Y" or "remove Z"
Describe the result: Tell the AI what should be there, not what to do
Be specific: Include colors, materials, textures, and styles
Keep it concise: Short but informative descriptions work best

Best Practices

1. Avoid Commanding Language

❌ Bad: "Change the shoes to brown"
✅ Good: "Dark brown leather shoes"

2. Be Detailed and Descriptive

Provide specific details about what you want:

❌ Bad: "Add a chair"
✅ Good: "Vintage wooden rocking chair with a cushioned seat"

3. Specify Materials and Textures

❌ Bad: "Make this a fancy dress"
✅ Good: "Elegant black evening gown with silk fabric"

4. Use Clear, Concise Language

Eliminate vague descriptions and unnecessary words:

❌ Bad: "Put a dog in the image"
✅ Good: "Golden retriever sitting on grass"
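
A quick way to spot commanding language is to check the first word of an inpainting prompt. This is a minimal sketch: the verb list is an assumption drawn from the bad examples in this section, and `starts_with_command` is a hypothetical name.

```python
# Command verbs taken from the "Bad" examples above (an assumed, incomplete list).
COMMAND_VERBS = ("change", "remove", "add", "make", "put")


def starts_with_command(prompt: str) -> bool:
    """Return True when the prompt opens with a command verb."""
    words = prompt.strip().lower().split()
    return bool(words) and words[0] in COMMAND_VERBS


for text in ["Change the shoes to brown", "Dark brown leather shoes"]:
    label = "command" if starts_with_command(text) else "description"
    print(f"{text!r}: {label}")
```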

Working with Complex Edits

For complex changes, break them into multiple separate edits:

First edit: Change clothing
Second edit: Adjust hairstyle
Third edit: Modify accessories

💡 Tip: This approach maintains better image quality and gives you more control over each element.

Ensure your mask covers exactly the area you want to change
Specify precise materials and finishes
Use clear shape and style descriptions
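
In a scripted workflow, the same idea looks like a loop that applies one masked edit at a time. The sketch below assumes you supply your own `inpaint` callable. No real inpainting API is used here, and `fake_inpaint` is a stand-in that only records each step.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Image = TypeVar("Image")
Mask = TypeVar("Mask")


def apply_edits_in_sequence(
    image: Image,
    edits: Iterable[Tuple[Mask, str]],
    inpaint: Callable[[Image, Mask, str], Image],
) -> Image:
    """Apply each (mask, prompt) edit to the output of the previous edit."""
    for mask, prompt in edits:
        image = inpaint(image, mask, prompt)
    return image


def fake_inpaint(image, mask, prompt):
    # Stand-in for a real model call: records the edit instead of changing pixels.
    return image + [f"{mask}: {prompt}"]


history = apply_edits_in_sequence(
    [],
    [
        ("clothing mask", "Elegant black evening gown with silk fabric"),
        ("hair mask", "Wavy shoulder-length bob"),
    ],
    fake_inpaint,
)
print(history)
```

Because each edit receives the result of the previous one, you can inspect the intermediate image and adjust the next mask or prompt before continuing.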

Example Prompts

Goal	Bad Prompt	Good Prompt
Change color	"make shoes brown"	"dark brown leather shoes"
Add object	"put a dog in the image"	"golden retriever sitting on grass"
Change style	"make this a fancy dress"	"elegant black evening gown"
Adjust hair	"shorter hair"	"wavy shoulder-length bob"

FLUX.1 Pro Fill

FLUX.1 Fill is a specialized text-driven image inpainting model designed for precise, targeted editing. It excels at preserving surrounding image context while seamlessly filling masked areas.

Primary Use Cases

Selective region editing: Replace objects while maintaining scene integrity
Text and element changes: Modify specific graphical elements or text
Object replacement: Swap objects while preserving the overall composition
Targeted modifications: Make precise changes to specific image regions

Prompting Best Practices

Be clear and specific: Describe exactly what should appear in the masked area
Focus on the content: FLUX Fill automatically preserves context from surrounding areas
Use descriptive language: Provide details about colors, materials, and characteristics
Trust the context awareness: The model understands the scene and maintains consistency

💡 How It Works: Paint over the areas you want to change with the brush tool. FLUX Fill will regenerate only those masked areas while preserving everything else, maintaining seamless integration with the surrounding image.

📚 Learn More: Check out the official FLUX.1 Fill documentation from Black Forest Labs.

FLUX.1 Kontext

⚠️ Reference Image Required: FLUX.1 Kontext requires a reference image. Upload one before using this model.

FLUX.1 Kontext is a specialized inpainting model designed for iterative, context-aware edits. Use this model when you have a reference image that you want to use to maintain style, character identity, or visual consistency during inpainting.

How Reference Images Work

Think of the reference image as a visual dictionary containing details the AI can transfer to your edited image. You must specify which details you want to use in your prompt.

Example: Adding Specific Shoes

If you have a reference image of red Converse All-Star sneakers and want those exact sneakers on your character's feet:

Create a mask over the current shoes/feet of your character
Upload the red Converse sneakers image as your reference
Prompt: "Red Converse All-Star sneakers"

Important: Add more description if needed. The reference image contains many details (color, style, laces, texture, etc.), and the model will not automatically use all of them.

⚠️ Be Specific About Details: Always name the key characteristics you want transferred from the reference image. If your prompt omits an important detail such as "red", the model may ignore it entirely, even something as basic as color.

💡 When to Use Kontext: Choose FLUX Kontext when you need to preserve style, character identity, or specific visual elements from a reference image while making edits to your main image.

Core Principles

Be explicitly clear: State exactly what should change and what should stay the same
Iterative approach: Start with simple edits and progressively refine
Precision over transformation: Use specific action words and avoid vague verbs like "transform"

Key Techniques

1. Object Replacement with Reference Details

Use the reference image to guide what should appear in the masked region:

Specify the main object or element from the reference
Describe key characteristics (color, style, material)
Be explicit about which details from the reference to use
Mention how it should fit into the existing scene

Example: "Red Converse All-Star sneakers with white rubber soles and white laces"
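
The checklist above can be sketched as two small helpers: one that joins the main object with its key characteristics, and one that reports required details missing from a prompt. Both function names are hypothetical, and neither calls FLUX.1 Kontext.

```python
import re
from typing import List


def reference_prompt(main_object: str, details: List[str]) -> str:
    """Join the main object with the reference details that matter."""
    if not details:
        return main_object
    return f"{main_object} with {' and '.join(details)}"


def missing_details(prompt: str, required: List[str]) -> List[str]:
    """Return required words or phrases that do not appear in the prompt."""
    return [
        d for d in required
        if not re.search(rf"\b{re.escape(d)}\b", prompt, re.IGNORECASE)
    ]


print(reference_prompt("Red Converse All-Star sneakers",
                       ["white rubber soles", "white laces"]))
print(missing_details("Converse All-Star sneakers", ["red", "sneakers"]))
```

The first call reproduces the example prompt above. The second shows how a prompt that forgets the color would be flagged.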

2. Blending Reference Elements with Your Scene

The reference image provides visual details to incorporate into the masked area:

Describe how the reference element should appear in context
Specify which visual characteristics from the reference to use
The unmasked areas of your main image are automatically preserved
Focus your prompt on describing what goes in the masked region

3. Visual Editing Strategies

For targeted modifications:

Use the painted mask to guide targeted changes
Be explicit about preservation of surrounding elements
Break complex transformations into sequential steps
Maintain original composition and positioning

Prompt Structure Best Practices

Do:

Use precise action words
Explicitly state what should be preserved
Reference specific elements from your reference image
Keep edits focused and manageable per iteration

Avoid:

Vague transformation verbs
Overly complex multi-step instructions in one prompt
Assuming context without explicit statement
Ignoring the need to specify preservation

✅ Guiding Principle: "Making things more explicitly clear never hurts if the number of instructions per edit is not too complicated." Break complex edits into multiple sequential steps for best results.

📚 Learn More: Check out the official FLUX Kontext guide from Black Forest Labs for additional examples and techniques.

FLUX.1 General

FLUX.1 is a powerful family of image generation models excelling at both artistic and photorealistic outputs. It works best with detailed, structured prompts that combine multiple elements.

Prompt Structure Components

Build comprehensive prompts using these elements:

Subject: What is the main focus of the image
Style: Artistic approach or visual aesthetic
Composition: How elements are arranged
Lighting: Light quality and direction
Color Palette: Dominant colors and tones
Mood/Atmosphere: Emotional quality
Technical Details: Camera settings, lens info
Additional Elements: Supporting details
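
One way to keep track of these components is a small record that reports which ones are still empty and joins the rest into sentences. This is a sketch only: the `FluxPrompt` class and its field names are hypothetical, and it assumes each component is written as a full sentence.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FluxPrompt:
    """Hypothetical record of the prompt components listed above."""
    subject: str
    style: Optional[str] = None
    composition: Optional[str] = None
    lighting: Optional[str] = None
    color_palette: Optional[str] = None
    mood: Optional[str] = None
    technical_details: Optional[str] = None
    additional_elements: Optional[str] = None

    def missing(self) -> List[str]:
        # Names of components that have not been filled in yet.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def to_prompt(self) -> str:
        # Keep the component order above and end each part with a period.
        parts = [getattr(self, f.name) for f in fields(self)]
        return " ".join(p.rstrip(".") + "." for p in parts if p)


draft = FluxPrompt(
    subject="A middle-aged woman with curly red hair, green eyes, and freckles reads in a cafe",
    lighting="Soft morning light falls from a window on the left",
    technical_details="Shot with a wide-angle lens (24mm) at f/1.8",
)
print(draft.to_prompt())
print(draft.missing())
```

Not every image needs all eight components, so the `missing()` list is a reminder rather than an error.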

Key Techniques

1. Be Specific and Descriptive

Provide detailed information about your subject:

❌ Avoid: "a woman"
✅ Good: "a middle-aged woman with curly red hair, green eyes, and freckles"

2. Use Artistic References

Reference specific artists, movements, or styles to guide the creative direction:

Example: "Create an image in the style of Vincent van Gogh's 'Starry Night'"

3. Specify Technical Details

Include camera settings and technical aspects for photorealistic images:

Example: "Shot with a wide-angle lens (24mm) at f/1.8, shallow depth of field"

4. Blend Concepts

Combine different ideas or themes to create unique images:

Example: "Reimagine 'The Last Supper' with robots in a futuristic setting"

5. Use Contrast and Juxtaposition

Create visually striking images with contrasting elements:

Example: "A delicate cherry blossom tree growing through a cracked concrete sidewalk in an urban alley"

Best Practices

Use natural language: Write in complete sentences, not keyword lists
Balance specificity: Be detailed but leave room for creative interpretation
Iterate and refine: Start simple, generate, then add details based on results

Vertex AI Imagen 3

Imagen 3 is Google Cloud's powerful image generation model. It excels at photorealistic images and can generate text within images (up to 25 characters).

Core Prompt Structure

Build your prompts with three key components:

Subject: The primary object or scene
Context: Background and environment details
Style: Artistic or photographic approach

Key Techniques

1. Use Descriptive Language

Be clear and specific with your descriptions:

Example:
"A park next to a lake" → "A park next to a lake, sun setting across the lake, golden hour, red wildflowers"

2. Photography-Specific Modifiers

Use technical photography terms for better control:

Camera proximity: "close-up", "zoomed out", "extreme close-up"
Camera position: "aerial view", "from below", "eye level"
Lighting: "natural light", "dramatic lighting", "soft diffused light"
Lens types: "macro lens", "wide-angle", "50mm lens"
Film styles: "polaroid", "black and white", "35mm film"
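
Adding modifiers to a base description can be sketched as a simple join. The `add_modifiers` helper is hypothetical and only builds the prompt string. It does not call the Vertex AI API.

```python
from typing import Iterable


def add_modifiers(base: str, modifiers: Iterable[str]) -> str:
    """Append descriptive or photographic modifiers to a base description."""
    return ", ".join([base, *modifiers])


print(add_modifiers(
    "A park next to a lake",
    ["sun setting across the lake", "golden hour", "red wildflowers"],
))
print(add_modifiers("A bee on a sunflower", ["extreme close-up", "macro lens", "natural light"]))
```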

3. Iterative Refinement

Start simple and progressively add details:

Begin with the core concept
Generate and review the result
Add specific details or modifiers
Regenerate and refine further

Text Generation in Images

Imagen 3 can generate text within images, but with limitations:

Limit text to 25 characters or fewer
Use no more than 2-3 distinct text phrases
Text placement is experimental, so results may vary
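
These limits are easy to check before generating. The sketch below assumes the 25-character limit applies to each phrase separately, which the guidance above does not state explicitly. The constant and function names are hypothetical.

```python
from typing import List

MAX_TEXT_CHARS = 25  # limit stated above; assumed here to apply per phrase
MAX_PHRASES = 3      # upper end of the "2-3 distinct phrases" guidance


def check_image_text(phrases: List[str]) -> List[str]:
    """Return a list of problems with the text requested for an image."""
    problems = []
    if len(phrases) > MAX_PHRASES:
        problems.append(f"{len(phrases)} phrases requested, more than {MAX_PHRASES}")
    for phrase in phrases:
        if len(phrase) > MAX_TEXT_CHARS:
            problems.append(
                f"'{phrase}' has {len(phrase)} characters, more than {MAX_TEXT_CHARS}"
            )
    return problems


print(check_image_text(["Summer Sale"]))
print(check_image_text(["Grand Opening Celebration Weekend"]))
```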

Photorealistic Tips

For highly realistic images:

Use specific lens and focal length descriptors
Add technical photography terms (exposure, focus, depth of field)
Specify precise lighting conditions
Reference real-world photography styles

💡 Pro Tip: Experiment with different aspect ratios (1:1, 4:3, 16:9) to see which works best for your composition. Use negative prompts to exclude unwanted elements from your images.

📚 Learn More: Check out the official Vertex AI Imagen guide for additional examples and advanced techniques.
