The Developer's Guide to Google's Nano Banana 2: AI Image Generation for Apps


Google's generative AI platform provides models that produce images from text prompts and integrate into web and mobile applications. By the end of this guide, you will have a working Next.js app that sends a text prompt to Google's image generation API, displays the result, and deploys to Vercel.
How to Build an AI Image Generation App with Google's API
- Verify your model identifier and SDK requirements against the official Google AI or Vertex AI documentation.
- Scaffold a Next.js App Router project with TypeScript and install the Google AI SDK at a pinned version.
- Store your GOOGLE_AI_API_KEY in .env.local and create a shared client initialization utility.
- Create a Server Action that validates the prompt, calls the generation endpoint with a timeout, and checks for safety filter responses.
- Build a React client component with a prompt input, loading and error states, and base64 image rendering with MIME type validation.
- Configure advanced options such as style presets, resolution, and safety settings using documented parameter names for your model.
- Implement rate limiting on the Server Action to prevent API quota exhaustion and runaway costs.
- Deploy to Vercel with the API key set in the dashboard and the Node.js runtime enforced, then verify with vercel --prod.
Table of Contents
- What This Guide Covers and Why It Matters
- Prerequisites and Environment Setup
- Installing the Google AI SDK and Configuring Your Project
- Generating Your First Image
- Building a React UI for Prompt Input and Image Display
- Advanced Options: Style, Size, and Safety Controls
- Deploying to Vercel
- Implementation Checklist
- Next Steps
What This Guide Covers and Why It Matters
By the end of this guide, you will have a working Next.js app that sends a text prompt to Google's image generation API, displays the result, and deploys to Vercel. You skip provisioning GPUs, managing model weights, and running inference servers. The SDK handles the network layer; you write the prompt logic and the UI.
Important: Google's image generation capabilities are evolving rapidly. At the time of writing, image generation from text prompts is available through models such as Imagen on Vertex AI. Before starting, consult the official Google AI documentation and Vertex AI Imagen documentation to confirm the current model identifier, SDK requirements, and supported parameters for your use case. The code examples below use placeholder model and parameter names that must be replaced with verified values from the official documentation before use.
What Modern Google Image Generation Models Offer
Recent Google image generation models produce output at up to 1024×1024 resolution (check the model card for your specific version). The models follow multi-clause prompts without requiring reweighting syntax, which cuts the back-and-forth prompt editing that older models demanded. Style controls now include dedicated API parameters for artistic and photographic presets; consult the API reference for the full list of accepted style values.
The API exposes configurable safety filters that let developers tune content moderation per category and threshold. The SDK itself reduced boilerplate compared to earlier versions by combining initialization and configuration into fewer method calls.
Prerequisites and Environment Setup
What You'll Need
Before starting, ensure the following are in place:
- Node.js 18.17+ installed locally. Earlier 18.x releases have incomplete fetch support, which the SDK depends on.
- A Google Cloud or AI Studio account with API key provisioning enabled and access to an image generation model
- Working familiarity with Next.js App Router (v13.4+), React, and TypeScript
- A Vercel account for deployment
- A code editor and terminal access
Cost warning: Image generation API calls incur per-request charges depending on your Google Cloud billing plan and the model used. Review your plan's pricing and free-tier limits before generating images at scale.
Installing the Google AI SDK and Configuring Your Project
Scaffolding a Next.js + TypeScript App
A standard Next.js project with TypeScript and the App Router provides the foundation. If you already have a project, skip scaffolding and proceed directly to SDK installation.
If you are adding this to an existing project (not created with create-next-app), ensure your tsconfig.json includes "paths": {"@/*": ["./*"]} for the @/ import alias used throughout this guide.
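For reference, a minimal tsconfig.json excerpt enabling that alias looks like the following (a project generated by create-next-app already includes an equivalent block, alongside other compiler options):

```json
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": { "@/*": ["./*"] }
  }
}
```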
Adding the Google AI SDK
The Google AI SDK provides the client library for communicating with Google's generative AI endpoints. Installation, environment configuration, and client initialization take three steps:
npx create-next-app@latest nano-banana-demo --typescript --app --tailwind
cd nano-banana-demo
npm install @google/generative-ai@0.21.0
Note: This guide was tested with @google/generative-ai version 0.21.0. Pin your version to avoid breaking changes as the SDK evolves. Check npm for the latest version and review the changelog before upgrading.
Create a file named .env.local in your project root and add the following line. Do not run this in your terminal. Open the file in your editor and paste:
GOOGLE_AI_API_KEY=your-api-key-here
Then create a shared utility file for SDK client initialization:
// lib/google-ai.ts
import { GoogleGenerativeAI } from "@google/generative-ai";
const CONFIGURED_MODEL_ID = "REPLACE_WITH_VERIFIED_MODEL_ID";
function getGenAI(): GoogleGenerativeAI {
if (!process.env.GOOGLE_AI_API_KEY) {
throw new Error("GOOGLE_AI_API_KEY is not set in environment variables");
}
return new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY);
}
// IMPORTANT: Replace the value of CONFIGURED_MODEL_ID above with a verified
// model name from the official Google AI documentation before running this code.
// For example, Imagen models on Vertex AI use identifiers like
// "imagen-3.0-generate-002". The correct identifier depends on your
// API access and the generation endpoint you are targeting.
export function getImageModel() {
if (CONFIGURED_MODEL_ID === "REPLACE_WITH_VERIFIED_MODEL_ID") {
throw new Error(
"Model ID has not been configured. Set CONFIGURED_MODEL_ID in lib/google-ai.ts."
);
}
return getGenAI().getGenerativeModel({ model: CONFIGURED_MODEL_ID });
}
export default getGenAI;
This centralizes authentication and model selection. The API key check and model instantiation are deferred to call time rather than module-load time, so that next build succeeds even when the environment variable is absent (for example, in CI or preview deploys where the key is only available at runtime).
Which SDK do you need? The @google/generative-ai package targets Google AI Studio endpoints. If your image generation model is hosted on Vertex AI (as Imagen models typically are), you need the @google-cloud/aiplatform SDK instead. Consult the documentation for your specific model to confirm the correct SDK and endpoint.
Generating Your First Image
Understanding the API Request Shape
You send a prompt string and optional parameters to the generation endpoint. You specify the model identifier when you initialize the client. The exact set of supported parameters, including size presets, style modifiers, and seed, varies by model and SDK version. Consult the official API reference for the model you are using to determine which parameters are available.
Seed-based reproducibility is best-effort if the model supports a seed parameter. Results will differ across model versions or infrastructure changes.
Building a Server Action for Image Generation
A Next.js Server Action keeps the API key server-side and avoids exposing it to the browser. The "use server" directive requires Next.js 13.4 or later. The action accepts a prompt, calls the generation endpoint, and returns image data.
Before deploying to production, you must add rate limiting. The action below enforces a maximum prompt length (2000 characters) and rejects empty strings, but does not throttle requests. Implement rate limiting (for example, using Vercel KV with a sliding window) to prevent abuse and runaway API costs.
// app/actions/generate-image.ts
"use server";
import { getImageModel } from "@/lib/google-ai";
import { FinishReason } from "@google/generative-ai";
// Note: a "use server" module may only export async functions, so segment
// config such as export const runtime = "nodejs" belongs in the page or
// route file that invokes this action, not here.
interface GenerateImageRequest {
prompt: string;
seed?: number;
}
interface GenerateImageResponse {
success: boolean;
imageBase64?: string;
mimeType?: string;
error?: string;
}
const MAX_PROMPT_LENGTH = 2000;
const GENERATION_TIMEOUT_MS = 25_000;
export async function generateImage(
request: GenerateImageRequest
): Promise<GenerateImageResponse> {
const trimmedPrompt = request.prompt?.trim() ?? "";
if (trimmedPrompt.length === 0) {
return { success: false, error: "Prompt cannot be empty" };
}
if (trimmedPrompt.length > MAX_PROMPT_LENGTH) {
return {
success: false,
error: `Prompt must be ${MAX_PROMPT_LENGTH} characters or fewer`,
};
}
try {
const imageModel = getImageModel();
// Enforce a timeout so a hung API call does not consume the entire
// serverless function lifetime (up to 30 s on Vercel).
const timeoutPromise = new Promise<never>((_, reject) =>
setTimeout(
() => reject(new Error("Image generation timed out")),
GENERATION_TIMEOUT_MS
)
);
// IMPORTANT: No generationConfig is passed here. If your model supports
// parameters such as size, style, negative prompt, or seed (see
// request.seed above), add a generationConfig using the exact field
// names documented in the official API reference for your chosen model.
const generationPromise = imageModel.generateContent({
contents: [{ role: "user", parts: [{ text: trimmedPrompt }] }],
});
const result = await Promise.race([generationPromise, timeoutPromise]);
const response = result.response;
const candidate = response.candidates?.[0];
// Check whether the safety filter blocked the response before
// attempting to read image data.
if (candidate?.finishReason === FinishReason.SAFETY) {
return {
success: false,
error:
"Your prompt was flagged by content safety filters. Try rephrasing.",
};
}
const imagePart = candidate?.content?.parts?.find(
(part) => part.inlineData
);
if (!imagePart?.inlineData) {
return { success: false, error: "No image data returned from model" };
}
return {
success: true,
imageBase64: imagePart.inlineData.data,
mimeType: imagePart.inlineData.mimeType,
};
} catch (error) {
const message =
error instanceof Error ? error.message : String(error);
console.error("[generateImage] Generation failed", {
error: message,
promptLength: trimmedPrompt.length,
});
return { success: false, error: message };
}
}
The validation trims the prompt before both the empty check and the length check, so leading/trailing whitespace cannot bypass the limit. A 25-second Promise.race timeout prevents a hung API call from silently consuming the entire serverless function lifetime.
The FinishReason.SAFETY check is built into the action so that filtered content produces a specific user message rather than a generic "no image data" error. Errors log to console.error with contextual metadata before returning, keeping production failures visible in server logs. getImageModel() runs per-request rather than as a module-level singleton, which avoids cross-request state concerns and ensures the build does not fail when the API key is absent.
The action returns a typed response object, making downstream error handling explicit rather than relying on thrown exceptions that client components cannot easily catch across the server boundary.
Building a React UI for Prompt Input and Image Display
Rendering and Downloading the Generated Image
The client component ties together prompt submission, loading state, error display, and image rendering with a download option:
// components/ImageGenerator.tsx
"use client";
import { useState } from "react";
import { generateImage } from "@/app/actions/generate-image";
const ALLOWED_MIME_TYPES = ["image/png", "image/jpeg", "image/webp"] as const;
type AllowedMimeType = (typeof ALLOWED_MIME_TYPES)[number];
function isAllowedMimeType(value: string): value is AllowedMimeType {
return (ALLOWED_MIME_TYPES as readonly string[]).includes(value);
}
function triggerDownload(base64Data: string, mimeType: AllowedMimeType): void {
const ext = mimeType.split("/")[1] ?? "png";
const link = document.createElement("a");
link.href = `data:${mimeType};base64,${base64Data}`;
link.download = `generated-image-${Date.now()}.${ext}`;
document.body.appendChild(link);
try {
link.click();
} finally {
document.body.removeChild(link);
}
}
export default function ImageGenerator() {
const [prompt, setPrompt] = useState("");
const [submittedPrompt, setSubmittedPrompt] = useState("");
const [imageData, setImageData] = useState<string | null>(null);
const [mimeType, setMimeType] = useState<AllowedMimeType>("image/png");
const [loading, setLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
if (!prompt.trim()) return;
setLoading(true);
setError(null);
setImageData(null);
try {
const result = await generateImage({ prompt });
if (result.success && result.imageBase64) {
const safeMime = isAllowedMimeType(result.mimeType ?? "")
? (result.mimeType as AllowedMimeType)
: "image/png";
setImageData(result.imageBase64);
setMimeType(safeMime);
setSubmittedPrompt(prompt);
} else {
setError(result.error ?? "An unknown error occurred");
}
} catch {
setError("Request failed. Check your connection and try again.");
} finally {
setLoading(false);
}
};
const handleDownload = () => {
if (!imageData) return;
triggerDownload(imageData, mimeType);
};
return (
<div className="max-w-2xl mx-auto p-6">
<form onSubmit={handleSubmit} className="flex gap-3 mb-6">
<input
type="text"
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder="Describe the image you want to generate..."
className="flex-1 px-4 py-2 border rounded-lg"
disabled={loading}
maxLength={2000}
/>
<button
type="submit"
disabled={loading || !prompt.trim()}
className="px-6 py-2 bg-blue-600 text-white rounded-lg disabled:opacity-50"
>
{loading ? "Generating..." : "Generate"}
</button>
</form>
{error && (
<div className="p-4 mb-4 bg-red-50 text-red-700 rounded-lg">
{error}
</div>
)}
{imageData && (
<div className="space-y-4">
<img
src={`data:${mimeType};base64,${imageData}`}
alt={`Generated image for: ${submittedPrompt.slice(0, 100)}`}
className="w-full rounded-lg shadow-lg"
/>
<button
onClick={handleDownload}
className="px-4 py-2 bg-gray-800 text-white rounded-lg"
>
Download Image
</button>
</div>
)}
</div>
);
}
This component uses a raw <img> tag rather than the Next.js Image component. This is intentional: the generated image arrives as a base64 data: URI, which next/image's optimization pipeline does not handle.
The try/finally block around generateImage ensures setLoading(false) runs even if the server action throws due to a network error crossing the server boundary, preventing the UI from being permanently spinner-locked. The submittedPrompt state variable captures the prompt at submission time so the alt attribute stays accurate even if the user edits the input field after generation.
The MIME type returned from the server is validated against an explicit allowlist before interpolation into a data: URI. Unrecognized values fall back to image/png, preventing a malformed or unexpected MIME type (such as text/html) from causing unintended browser behavior. The download filename uses the extension derived from the validated MIME type (e.g., .jpeg, .webp) rather than hardcoding .png. The triggerDownload helper wraps link.click() in a try/finally block so the temporary <a> element is removed from the DOM even if .click() throws.
The maxLength attribute on the input element enforces the same 2000-character limit used by the server action, preventing obviously oversized payloads from submission. The component uses a controlled input paired with explicit disabled states during generation, preventing duplicate submissions. Error messages from the server action surface directly in the UI rather than failing silently.
Advanced Options: Style, Size, and Safety Controls
Configuring Output Parameters
Depending on the model you are using, you can configure resolution presets, aspect ratio selection, style modifiers, and negative prompts. The exact parameter names and accepted values vary by model and must be sourced from the official API reference.
The generationConfig object in the @google/generative-ai SDK does not include fields named imageSize, imageStyle, negativePrompt, or safetyThreshold. These are not valid GenerationConfig properties. If your model supports such parameters, it exposes them through a different API surface or SDK. Consult the documentation for your specific model to find the correct parameter names and where they belong in the request structure.
For safety settings specifically, the @google/generative-ai SDK uses a safetySettings array at the request level, not a safetyThreshold string inside generationConfig. Here is the correct pattern:
import {
HarmCategory,
HarmBlockThreshold,
} from "@google/generative-ai";
// Example: configuring safety settings (verify enum values against your SDK version)
const result = await imageModel.generateContent({
contents: [{ role: "user", parts: [{ text: prompt }] }],
generationConfig: {
// Placeholder: add only generation parameters documented for your model.
},
safetySettings: [
{
category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
},
// Add additional category/threshold pairs as needed.
],
});
Handling Content Safety Filters Gracefully
When the safety threshold triggers, the API returns a candidate with a finishReason indicating the content was filtered rather than throwing an exception. The generateImage server action already checks for this (see the implementation above), but if you are building additional generation endpoints, apply the same pattern. Check the value explicitly and present a user-friendly message rather than displaying a generic failure:
import { FinishReason } from "@google/generative-ai";
const candidate = result.response.candidates?.[0];
if (candidate?.finishReason === FinishReason.SAFETY) {
return {
success: false,
error:
"Your prompt was flagged by content safety filters. Try rephrasing.",
};
}
After installing the SDK, confirm that FinishReason.SAFETY exists in your installed version by inspecting the type definitions:
grep -A 20 "FinishReason" node_modules/@google/generative-ai/dist/types.d.ts
Deploying to Vercel
Environment Variables and Edge Considerations
Set GOOGLE_AI_API_KEY in the Vercel dashboard under project settings. Server Actions require the Node.js runtime, so ensure the deployment does not default to the Edge runtime, which lacks full Node.js API compatibility. Add export const runtime = "nodejs" to the page or route file that invokes the action; Next.js reads segment config only from page, layout, and route files, and a "use server" module may only export async functions, so the directive cannot live in the action file itself.
Deploy with vercel --prod from the project root.
Cold-start latency on serverless functions typically adds 1 to 3 seconds to the first request after an idle period. For applications where responsiveness matters, consider implementing optimistic UI patterns to mask generation time.
Before deploying, ensure you have implemented rate limiting on your Server Action to prevent abuse. An unprotected generation endpoint can exhaust your API quota and incur significant costs within minutes. Consider using Vercel KV with a sliding window, middleware-based throttling, or a similar mechanism.
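As a minimal sketch of the sliding-window idea, the helper below keeps per-key request timestamps in process memory. The names (isRateLimited, WINDOW_MS, MAX_REQUESTS) are illustrative, and on serverless platforms each instance holds its own map, so a production deployment should back this with a shared store such as Vercel KV or Redis:

```typescript
const WINDOW_MS = 60_000; // 1-minute sliding window
const MAX_REQUESTS = 5;   // allowed requests per key per window

// Per-key timestamps of accepted requests within the current window.
const requestLog = new Map<string, number[]>();

// Returns true when the caller identified by `key` (e.g. an IP address or
// user ID) has exhausted its quota. Only accepted requests are recorded.
export function isRateLimited(key: string, now: number = Date.now()): boolean {
  const cutoff = now - WINDOW_MS;
  const recent = (requestLog.get(key) ?? []).filter((t) => t > cutoff);
  if (recent.length >= MAX_REQUESTS) {
    requestLog.set(key, recent); // keep the pruned list
    return true;
  }
  recent.push(now);
  requestLog.set(key, recent);
  return false;
}
```

Call isRateLimited at the top of the Server Action before touching the generation endpoint, and return an error response when it reports true.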
Implementation Checklist
- Verified the correct model identifier and SDK from official Google AI documentation
- Google AI API key provisioned and stored in .env.local
- Google AI SDK installed (version pinned) and client initialized in lib/google-ai.ts with lazy initialization
- Server Action created for image generation with typed request/response
- Input validation (trim-first) and prompt length limits added to Server Action
- Generation timeout configured in Server Action
- Safety filter (FinishReason.SAFETY) check integrated in Server Action
- Server-side error logging added to Server Action catch block
- Rate limiting implemented on the Server Action
- Node.js runtime enforced for the route that invokes the Server Action
- React prompt form with loading (try/finally) and error states
- MIME type allowlist enforced before rendering data: URIs
- Download filename extension derived from validated MIME type
- alt attribute uses captured submittedPrompt, not live input state
- maxLength attribute on prompt input mirrors server-side limit
- Output parameters configured using verified, documented field names
- Content safety filter fallback UI implemented using SDK enum values
- Environment variables set in Vercel dashboard
- Production deployment verified with vercel --prod
Next Steps
For further exploration, consult the Google AI documentation and the Vertex AI Imagen documentation for details on available models, batch generation workflows, and prompt strategies. Natural extensions include persisting generated images to a gallery with database storage or gating generation behind user authentication.