Prompting Guide
Learn how to write effective prompts for different AI models and tasks
Quick Navigation
- Getting Started with Prompts: Learn the basics
- Gemini 2.5 Flash (a.k.a. nanobanana)
- Seedream 4.0: Concise & precise prompts
- Inpainting: General best practices
- FLUX.1 Pro Fill: Recommended inpainting model
- FLUX.1 Kontext: Inpainting with reference images
- FLUX.1 General: General-purpose inpainting
- Vertex AI Imagen 3: Inpainting model
Getting Started with Prompts
A prompt is simply a description you write to tell the AI what image you want. Think of it as giving clear instructions to an artist.
Universal Best Practices
Be specific. Instead of "a dog," try "golden retriever sitting in green grass." The more details you provide, the more closely the result will match your vision.
Use natural language. Write in complete sentences like you're talking to someone, not keyword lists. "A cozy coffee shop with vintage furniture" works better than "coffee, shop, vintage, cozy."
Different models, different styles. Each AI model has its own strengths. Some prefer short, precise descriptions. Others work better with longer, narrative prompts. The sections below will show you what works best for each model.
Gemini 2.5 Flash Image Preview
Gemini 2.5 Flash prefers narrative, descriptive paragraphs over keyword lists. Think of it as telling a story about the image you want, with rich contextual details and specific technical language.
Key Differences from Seedream 4.0
- Narrative over concise: Longer, more descriptive prompts work better
- Photographic language: Use camera, lens, and lighting terminology
- Context matters: Explain the image's purpose and intent
Prompt Structure for Different Types
Photorealistic Scenes
Include these elements:
- Camera angle (e.g., "low angle shot", "bird's eye view")
- Lens type (e.g., "50mm lens", "wide-angle")
- Lighting (e.g., "golden hour", "soft window light", "dramatic backlighting")
- Mood and atmosphere
- Texture details
Good: "A photorealistic close-up portrait of an elderly Japanese ceramicist with deep, sun-etched wrinkles. Captured during golden hour with soft window light streaming from the left, emphasizing the clay texture on their weathered hands. Shot with a 50mm lens at f/2.8 for shallow depth of field."
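The element list above can be turned into a small template. The following is a minimal sketch, not an official API: the `PhotoPrompt` class and its field names are hypothetical, and the sentence wording is only one way to keep the output narrative rather than a keyword list.

```python
from dataclasses import dataclass


@dataclass
class PhotoPrompt:
    """Hypothetical container for the photorealistic elements listed above."""
    subject: str
    camera_angle: str
    lens: str
    lighting: str
    mood: str
    texture: str

    def render(self) -> str:
        # Build full sentences so the prompt reads as a description,
        # not as a comma-separated keyword list.
        return (
            f"A photorealistic {self.camera_angle} of {self.subject}. "
            f"Lit by {self.lighting}, with a {self.mood} mood. "
            f"The image emphasizes {self.texture}. "
            f"Shot with a {self.lens}."
        )


prompt = PhotoPrompt(
    subject="an elderly Japanese ceramicist at a pottery wheel",
    camera_angle="low angle shot",
    lens="50mm lens",
    lighting="soft window light",
    mood="calm, contemplative",
    texture="the clay texture on weathered hands",
)
print(prompt.render())
```

Keeping each element as a separate field makes it easy to change one thing at a time (for example, only the lighting) between generations.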
Stylized Illustrations
Be specific about:
- Exact style (e.g., "kawaii", "minimalist line art", "watercolor")
- Color palette (e.g., "pastel pink and mint green", "monochromatic blue")
- Line characteristics (e.g., "bold outlines", "delicate pencil strokes")
- Background style (e.g., "simple white background", "abstract geometric patterns")
Good: "A kawaii-style illustration of a small robot character with round, expressive eyes. Drawn with smooth vector lines in a pastel color palette of soft pink, baby blue, and cream. Simple white background with subtle sparkle accents around the character."
Best Practices
1. Provide Context and Intent
Explain what the image is for (for example, a logo, a product photo, or a book cover) so the model can match the intent.
2. Use Photographic Vocabulary
Technical terms help control composition:
- Aperture: "f/1.4 for bokeh", "f/8 for sharpness"
- Composition: "rule of thirds", "centered composition", "negative space"
- Lighting: "Rembrandt lighting", "three-point lighting", "natural diffused light"
3. Iterate and Refine
Start with a base prompt and progressively add details based on results. Gemini responds well to this kind of step-by-step refinement.
4. Use Semantic Negative Prompts
Instead of listing what to avoid, describe what you want:
Better: "Sharp focus, clean lines, professional quality"
Seedream 4.0
Seedream 4.0 favors concise and precise descriptions over verbose, ornate language. The model understands prompts better than Seedream 3.0, so you can achieve better results with clearer, simpler prompts.
Basic Structure
Use this formula: subject + action + environment
For aesthetic images, add: style, color, lighting, composition
Good: "A girl in a lavish dress walking under a parasol along a tree-lined path, in the style of a Monet oil painting."
Avoid: "Girl, umbrella, tree-lined street, oil painting texture."
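The subject + action + environment formula can be sketched as a small helper. The function name and parameters below are illustrative only and are not part of any Seedream API.

```python
from typing import Optional


def seedream_prompt(
    subject: str,
    action: str,
    environment: str,
    style: Optional[str] = None,
) -> str:
    """Compose subject + action + environment, with an optional style clause."""
    prompt = f"{subject} {action} {environment}"
    if style:
        # Aesthetic additions (style, color, lighting) go after the core formula.
        prompt += f", in the style of {style}"
    return prompt + "."


print(
    seedream_prompt(
        "A girl in a lavish dress",
        "walking under a parasol",
        "along a tree-lined path",
        style="a Monet oil painting",
    )
)
```

With these arguments the helper reproduces the "Good" prompt above.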
Essential Techniques
1. Text Rendering
Use double quotes for text that should appear in the image:
Good: 'Generate a poster titled "Seedream 4.0"'
Avoid: "Generate a poster titled Seedream 4.0"
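When prompts are built in code, the double quotes are easy to lose. A minimal sketch of a helper that always quotes the text to be rendered (the `poster_prompt` name is hypothetical):

```python
def poster_prompt(title: str) -> str:
    """Wrap the literal text in double quotes so it is marked as text to render."""
    # Strip any quotes the caller already added to avoid doubled quotes.
    clean = title.strip().strip('"')
    return f'Generate a poster titled "{clean}"'


print(poster_prompt("Seedream 4.0"))
```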
2. Be Specific, Not Vague
Use specific identifiers instead of vague pronouns:
Avoid: "Put that one in pink" or "Dress it"
3. Specify What Changes AND What Stays
When editing, clearly state what should be preserved, for example by ending the instruction with "keeping the action and expression unchanged."
4. State Purpose/Type Explicitly
Tell the model what kind of image you need:
Avoid: "An abstract image with..."
5. Multi-Image References
Clearly label which image does what, for example which image supplies the subject and which one supplies the style.
Editing Operations
Use operation prefixes to clarify your intent:
- [Addition] - "Add matching silver earrings and a necklace to the girl"
- [Deletion] - "Remove the girl's hat"
- [Replacement] - "Replace the largest bread man with a croissant man, keeping the action and expression unchanged"
- [Modification] - "Turn the three robots into transparent crystal, colored red, yellow and green from left to right"
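A minimal sketch of tagging edit instructions with these prefixes. It assumes the bracketed label is placed directly before the instruction text; the guide does not spell out the exact format, and the `EditOp` and `edit_prompt` names are hypothetical.

```python
from enum import Enum


class EditOp(Enum):
    ADDITION = "Addition"
    DELETION = "Deletion"
    REPLACEMENT = "Replacement"
    MODIFICATION = "Modification"


def edit_prompt(op: EditOp, instruction: str) -> str:
    """Prefix an edit instruction with its bracketed operation label."""
    # Assumption: the label goes immediately before the instruction.
    return f"[{op.value}] {instruction.strip()}"


print(edit_prompt(EditOp.DELETION, "Remove the girl's hat"))
```

Using an enum keeps the set of operations closed, so a typo in the label fails in code instead of reaching the model.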
Inpainting (General)
Inpainting prompts are different from regular generation prompts. Instead of telling the AI what to do, describe the final outcome you want in the masked area.
Key Principles
- No commands: Don't say "change X to Y" or "remove Z"
- Describe the result: Tell the AI what should be there, not what to do
- Be specific: Include colors, materials, textures, and styles
- Keep it concise: Short but informative descriptions work best
Best Practices
1. Avoid Commanding Language
Good: "Dark brown leather shoes"
2. Be Detailed and Descriptive
Provide specific details about what you want:
Good: "Vintage wooden rocking chair with a cushioned seat"
3. Specify Materials and Textures
Good: "Elegant black evening gown with silk fabric"
4. Use Clear, Concise Language
Eliminate vague descriptions and unnecessary words:
Good: "Golden retriever sitting on grass"
Working with Complex Edits
For complex changes, break them into multiple separate edits:
- First edit: Change clothing
- Second edit: Adjust hairstyle
- Third edit: Modify accessories
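The sequence above can be sketched as a loop in which each edit's output becomes the next edit's input. Everything here is illustrative: `apply_edits` is a hypothetical helper, the mask names are placeholders, and `fake_inpaint` is a stand-in that only records the request instead of calling a real model.

```python
from typing import Callable, List, Tuple

# (mask description, prompt) pairs, one focused change per step.
STEPS: List[Tuple[str, str]] = [
    ("clothing mask", "elegant black evening gown with silk fabric"),
    ("hair mask", "wavy shoulder-length bob"),
    ("accessories mask", "matching silver earrings and a necklace"),
]


def apply_edits(
    image: str,
    steps: List[Tuple[str, str]],
    inpaint: Callable[[str, str, str], str],
) -> str:
    """Run each edit in order, feeding every result into the next step."""
    for mask, prompt in steps:
        image = inpaint(image, mask, prompt)
    return image


def fake_inpaint(image: str, mask: str, prompt: str) -> str:
    # Stand-in for a real inpainting call: it only records what was requested.
    return f"{image} -> [{mask}: {prompt}]"


result = apply_edits("portrait.png", STEPS, fake_inpaint)
print(result)
```

Note that each prompt describes the desired result in the masked area rather than issuing a command.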
Example Prompts
| Goal | Bad Prompt | Good Prompt |
|---|---|---|
| Change color | "make shoes brown" | "dark brown leather shoes" |
| Add object | "put a dog in the image" | "golden retriever sitting on grass" |
| Change style | "make this a fancy dress" | "elegant black evening gown" |
| Adjust hair | "shorter hair" | "wavy shoulder-length bob" |
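A rough way to catch commanding language before sending an inpainting prompt is to check its first word. This is a heuristic sketch only: the verb list is an assumption, not an exhaustive rule, and it will not flag vague prompts such as "shorter hair" that are not phrased as commands.

```python
# Assumed list of imperative verbs that signal a command rather than a description.
COMMAND_VERBS = {"make", "put", "change", "remove", "add", "replace", "turn"}


def looks_like_command(prompt: str) -> bool:
    """Return True when the prompt starts with one of the listed imperative verbs."""
    words = prompt.strip().lower().split()
    return bool(words) and words[0] in COMMAND_VERBS


for candidate in ["make shoes brown", "dark brown leather shoes"]:
    label = "command" if looks_like_command(candidate) else "description"
    print(f"{candidate!r}: {label}")
```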
FLUX.1 Pro Fill
FLUX.1 Fill is a specialized text-driven image inpainting model designed for precise, targeted editing. It excels at preserving surrounding image context while seamlessly filling masked areas.
Primary Use Cases
- Selective region editing: Replace objects while maintaining scene integrity
- Text and element changes: Modify specific graphical elements or text
- Object replacement: Swap objects while preserving the overall composition
- Targeted modifications: Make precise changes to specific image regions
Prompting Best Practices
- Be clear and specific: Describe exactly what should appear in the masked area
- Focus on the content: FLUX.1 Fill automatically preserves context from surrounding areas
- Use descriptive language: Provide details about colors, materials, and characteristics
- Trust the context awareness: The model understands the scene and maintains consistency
FLUX.1 Kontext
FLUX.1 Kontext is a specialized inpainting model designed for iterative, context-aware edits. Use it when you have a reference image and want to maintain style, character identity, or visual consistency during inpainting.
How Reference Images Work
Think of the reference image as a visual dictionary containing details the AI can transfer to your edited image. You must specify which details you want to use in your prompt.
Example: Adding Specific Shoes
If you have a reference image of red Converse All-Star sneakers and want those exact sneakers on your character's feet:
- Create a mask over the current shoes/feet of your character
- Upload the red Converse sneakers image as your reference
- Prompt: "Red Converse All-Star sneakers"
Important: You can add more description if needed. The reference image contains many details (color, style, laces, texture, etc.), but you need to specify which ones matter. If left too vague, the model may ignore important details, even critical ones like color!
Core Principles
- Be explicitly clear: State exactly what should change and what should stay the same
- Iterative approach: Start with simple edits and progressively refine
- Precision over transformation: Use specific action words, avoid vague verbs like "transform"
Key Techniques
1. Object Replacement with Reference Details
Use the reference image to guide what should appear in the masked region:
- Specify the main object or element from the reference
- Describe key characteristics (color, style, material)
- Be explicit about which details from the reference to use
- Mention how it should fit into the existing scene
2. Blending Reference Elements with Your Scene
The reference image provides visual details to incorporate into the masked area:
- Describe how the reference element should appear in context
- Specify which visual characteristics from the reference to use
- The unmasked areas of your main image are automatically preserved
- Focus your prompt on describing what goes in the masked region
3. Visual Editing Strategies
For targeted modifications:
- Use the painted mask to guide targeted changes
- Be explicit about preservation of surrounding elements
- Break complex transformations into sequential steps
- Maintain original composition and positioning
Prompt Structure Best Practices
Do:
- Use precise action words
- Explicitly state what should be preserved
- Reference specific elements from your reference image
- Keep edits focused and manageable per iteration
Avoid:
- Vague transformation verbs
- Overly complex multi-step instructions in one prompt
- Assuming context without explicit statement
- Ignoring the need to specify preservation
FLUX.1 General
FLUX.1 is a powerful family of image generation models excelling at both artistic and photorealistic outputs. It works best with detailed, structured prompts that combine multiple elements.
Prompt Structure Components
Build comprehensive prompts using these elements:
- Subject: What is the main focus of the image
- Style: Artistic approach or visual aesthetic
- Composition: How elements are arranged
- Lighting: Light quality and direction
- Color Palette: Dominant colors and tones
- Mood/Atmosphere: Emotional quality
- Technical Details: Camera settings, lens info
- Additional Elements: Supporting details
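One way to keep these components organized is to store them by name and join them in a fixed order. This is a sketch under assumptions: the key names and the `flux_prompt` helper are hypothetical, and joining with commas is just one option. Each value should still be a descriptive phrase rather than a single keyword.

```python
from typing import Dict

# Order follows the component list above; the key names are hypothetical.
COMPONENT_ORDER = [
    "subject", "style", "composition", "lighting",
    "color_palette", "mood", "technical", "additional",
]


def flux_prompt(components: Dict[str, str]) -> str:
    """Join the supplied components in a fixed order, skipping missing ones."""
    unknown = set(components) - set(COMPONENT_ORDER)
    if unknown:
        raise KeyError(f"unknown components: {sorted(unknown)}")
    parts = [components[k].strip() for k in COMPONENT_ORDER if components.get(k)]
    if not parts:
        raise ValueError("at least one component is required")
    return ", ".join(parts) + "."


print(
    flux_prompt(
        {
            "lighting": "lit by soft morning light from a window",
            "subject": "A middle-aged woman with curly red hair, green eyes, and freckles",
            "technical": "shot with a 50mm lens",
        }
    )
)
```

Because the order is fixed, the subject always leads the prompt regardless of the order in which the components were supplied.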
Key Techniques
1. Be Specific and Descriptive
Provide detailed information about your subject:
Good: "a middle-aged woman with curly red hair, green eyes, and freckles"
2. Use Artistic References
Reference specific artists, movements, or styles to guide the creative direction, for example "in the style of a Monet oil painting."
3. Specify Technical Details
Include camera settings and technical aspects for photorealistic images, for example "shot with a 50mm lens at f/2.8."
4. Blend Concepts
Combine different ideas or themes to create unique images.
5. Use Contrast and Juxtaposition
Create visually striking images with contrasting elements.
Best Practices
- Use natural language: Write in complete sentences, not keyword lists
- Balance specificity: Be detailed but leave room for creative interpretation
- Iterate and refine: Start simple, generate, then add details based on results
Vertex AI Imagen 3
Imagen 3 is Google Cloud's powerful image generation model. It excels at photorealistic images and can generate text within images (up to 25 characters).
Core Prompt Structure
Build your prompts with three key components:
- Subject: The primary object or scene
- Context: Background and environment details
- Style: Artistic or photographic approach
Key Techniques
1. Use Descriptive Language
Be clear and specific with your descriptions:
Instead of "A park next to a lake," try "A park next to a lake, sun setting across the lake, golden hour, red wildflowers."
2. Photography-Specific Modifiers
Use technical photography terms for better control:
- Camera proximity: "close-up", "zoomed out", "extreme close-up"
- Camera position: "aerial view", "from below", "eye level"
- Lighting: "natural light", "dramatic lighting", "soft diffused light"
- Lens types: "macro lens", "wide-angle", "50mm lens"
- Film styles: "polaroid", "black and white", "35mm film"
3. Iterative Refinement
Start simple and progressively add details:
- Begin with the core concept
- Generate and review the result
- Add specific details or modifiers
- Regenerate and refine further
Text Generation in Images
Imagen 3 can generate text within images, but with limitations:
- Limit text to 25 characters or fewer
- Can handle 2-3 distinct text phrases
- Text placement is experimental, so results may vary
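These limits can be checked before a prompt is sent. The sketch below assumes the 25-character limit applies to each phrase separately, which the guidance above does not state explicitly; `check_image_text` is a hypothetical helper, not part of the Vertex AI SDK.

```python
from typing import List

MAX_CHARS = 25    # limit stated above (assumed to apply per phrase)
MAX_PHRASES = 3   # upper end of the "2-3 phrases" guidance


def check_image_text(phrases: List[str]) -> List[str]:
    """Return a list of human-readable problems; empty means the phrases fit."""
    problems = []
    if len(phrases) > MAX_PHRASES:
        problems.append(f"{len(phrases)} phrases exceeds the limit of {MAX_PHRASES}")
    for phrase in phrases:
        if len(phrase) > MAX_CHARS:
            problems.append(
                f"{phrase!r} is {len(phrase)} characters (limit {MAX_CHARS})"
            )
    return problems


print(check_image_text(["Summer Sale", "Grand Opening Celebration Weekend"]))
```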
Photorealistic Tips
For highly realistic images:
- Use specific lens and focal length descriptors
- Add technical photography terms (exposure, focus, depth of field)
- Specify precise lighting conditions
- Reference real-world photography styles