The Complete Guide to Nano Banana Pro: 10 Tips for Professional Asset Production


Nano-Banana Pro is a significant leap forward from previous generation models, moving from “fun” image generation to “functional” professional asset production. It excels in text rendering, character consistency, visual synthesis, world knowledge (Search), and high-resolution (4K) output.

Nano-Banana Pro 相比上一代模型有了显著的飞跃,从“有趣”的图像生成转向了“功能性”的专业资产生产。它在文本渲染、角色一致性、视觉合成、世界知识(搜索)和高分辨率(4K)输出方面表现出色。

Following the developer guide on how to get started with AI Studio and the API, this guide covers the core capabilities and how to prompt them effectively.

继关于如何开始使用 AI Studio 和 API 的 开发者指南 之后,本指南将涵盖核心功能以及如何有效地提示它们。

By Guillaume Vernade, Gemini Developer Advocate, Google DeepMind


Table of Contents | 目录

  0. The Golden Rules of Prompting (提示词黄金法则)

  1. Text Rendering, Infographics & Visual Synthesis (文本渲染、信息图表与视觉合成)

  2. Character Consistency & Viral Thumbnails (角色一致性与病毒式缩略图)

  3. Grounding with Google Search (基于 Google 搜索的现实锚定)

  4. Advanced Editing, Restoration & Colorization (高级编辑、修复与上色)

  5. Dimensional Translation (2D ↔ 3D) (维度转换)

  6. High-Resolution & Textures (高分辨率与材质)

  7. Thinking & Reasoning (思考与推理)

  8. One-Shot Storyboarding & Concept Art (单次生成故事板与概念艺术)

  9. Structural Control & Layout Guidance (结构控制与布局指导)

  10. What’s Next? (下一步)


0. The Golden Rules of Prompting

0. 提示词黄金法则

Nano-Banana Pro is a “Thinking” model. It doesn’t just match keywords; it understands intent, physics, and composition. To get the best results, stop using “tag soups” (e.g., dog, park, 4k, realistic) and start acting like a Creative Director.

Nano-Banana Pro 是一个“思考型”模型。它不只是匹配关键词;它理解意图、物理规律和构图。为了获得最佳结果,请停止使用“标签堆砌”(例如:dog, park, 4k, realistic),开始像创意总监一样行事。

The model is exceptionally good at understanding conversational edits. If an image is 80% correct, do not generate a new one from scratch. Instead, simply ask for the specific change you need.

该模型非常擅长理解对话式编辑。如果图像有 80% 是正确的,不要从头开始生成新的。相反,只需通过对话要求进行特定的更改。

Example: “That’s great, but change the lighting to sunset and make the text neon blue.”

示例:“这很棒,但把光线改成日落,并把文字改成霓虹蓝。”

Talk to the model as if you were briefing a human artist. Use proper grammar and descriptive adjectives.

像给人类艺术家下简报一样与模型交谈。使用正确的语法和描述性形容词。

❌ Bad: “Cool car, neon, city, night, 8k.”

❌ 差评:“酷车,霓虹,城市,夜晚,8k。”

✅ Good: “A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car’s metallic chassis.”

✅ 好评:“充满电影感的广角镜头,拍摄一辆未来派跑车在雨夜的东京街道上疾驰。霓虹灯牌的倒影映在湿漉漉的路面和汽车的金属底盘上。”

Vague prompts yield generic results. Define the subject, the setting, the lighting, and the mood.

模糊的提示会产生通用的结果。定义主体、背景、灯光和情绪。

Subject: Instead of “a woman,” say “a sophisticated elderly woman wearing a vintage Chanel-style suit.”

主体: 不要只说“一个女人”,要说“一位身穿复古香奈儿风格套装的精致老妇人”。

Materiality: Describe textures. “Matte finish,” “brushed steel,” “soft velvet,” “crumpled paper.”

材质: 描述纹理。“哑光表面”、“拉丝钢”、“柔软的天鹅绒”、“揉皱的纸”。

Because the model “thinks,” giving it context helps it make logical artistic decisions.

因为模型会“思考”,给它提供背景信息有助于它做出符合逻辑的艺术决策。

Example: “Create an image of a sandwich for a Brazilian high-end gourmet cookbook.” (The model will infer professional plating, shallow depth of field, and perfect lighting).

示例:“为一本巴西高端美食食谱制作一张三明治的图片。”(模型会推断出专业的摆盘、浅景深和完美的灯光)。
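The rules above can be sketched as a small helper that turns a brief into full sentences rather than a tag list. This is only an illustration: the `Brief` class, its field names, and the example values are assumptions made for this sketch, not part of any Nano-Banana Pro API.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    """Fields a creative brief should cover (hypothetical names)."""
    subject: str
    setting: str
    lighting: str
    mood: str
    context: str = ""  # optional "why" or "for whom"


def to_prompt(brief: Brief) -> str:
    """Join the fields into full sentences instead of comma-separated tags."""
    sentences = [
        f"{brief.subject} in {brief.setting}.",
        f"The lighting is {brief.lighting}, and the mood is {brief.mood}.",
    ]
    if brief.context:
        sentences.append(f"The image is for {brief.context}.")
    return " ".join(sentences)


# Illustrative values only; replace them with your own brief.
example = Brief(
    subject="A sophisticated elderly woman wearing a vintage Chanel-style suit",
    setting="a quiet hotel lobby",
    lighting="soft window light",
    mood="calm and elegant",
    context="a high-end fashion editorial",
)
print(to_prompt(example))
```

Keeping the context as its own sentence makes it easy to change the audience (cookbook, editorial, ad) without touching the description of the scene.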


1. Text Rendering, Infographics & Visual Synthesis

1. 文本渲染、信息图表与视觉合成

Nano-Banana Pro has state-of-the-art (SOTA) capabilities for rendering legible, stylized text and for synthesizing complex information into visual formats.

Nano-Banana Pro 在渲染清晰、风格化的文本以及将复杂信息合成为视觉格式方面拥有最先进(SOTA)的能力。

Best Practices | 最佳实践:

  • Compression: Ask the model to “compress” dense text or PDFs into visual aids.

    • 压缩:
       要求模型将密集的文本或 PDF “压缩”成视觉辅助材料。
  • Style: Specify if you want a “polished editorial,” a “technical diagram,” or a “hand-drawn whiteboard” look.

    • 风格:
       指定你想要“精美的社论”、“技术图表”还是“手绘白板”风格。
  • Quotes: Put the exact text you want rendered inside quotation marks.

    • 引用:
       明确指定你想要引用的文本。

Example Prompts | 提示词示例:

Earnings Report Infographic (Data Ingestion): [Input PDF of Google’s latest earnings report] “Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for ‘Revenue Growth’ and ‘Net Income’, and highlight the CEO’s key quote in a stylized pull-quote box.”

财报信息图(数据摄取): [输入 Google 最新财报的 PDF] “生成一张清晰、现代的信息图表,总结这份财报的关键财务亮点。包括‘收入增长’和‘净收入’的图表,并在一个风格化的引用框中突出显示 CEO 的关键语录。”

Earnings Report Infographic

Retro Infographic: “Make a retro, 1950s-style infographic about the history of the American diner. Include distinct sections for ‘The Food,’ ‘The Jukebox,’ and ‘The Decor.’ Ensure all text is legible and stylized to match the period.”

复古信息图: “制作一张复古的、1950 年代风格的关于美式餐厅历史的信息图表。包括‘食物’、‘点唱机’和‘装饰’等不同部分。确保所有文字清晰可读,并具有符合那个时代的风格。”

Retro Infographic

Technical Diagram: “Create an orthographic blueprint that describes this building in plan, elevation, and section. Label the ‘North Elevation’ and ‘Main Entrance’ clearly in technical architectural font. Format 16:9.”

技术图纸: “创建一个正交蓝图,以平面图、立面图和剖面图描述这座建筑。用技术建筑字体清晰地标注‘北立面’和‘主入口’。格式 16:9。”

Technical Diagram

Whiteboard Summary (Educational): “Summarize the concept of ‘Transformer Neural Network Architecture’ as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for ‘Self-Attention’ and ‘Feed Forward’.”

白板摘要(教育用): “将‘Transformer 神经网络架构’的概念总结为适合大学讲座的手绘白板图。使用不同颜色的马克笔绘制编码器和解码器模块,并包含清晰的‘自注意力’和‘前馈’标签。”

Whiteboard Summary

2. Character Consistency & Viral Thumbnails

2. 角色一致性与病毒式缩略图

Nano-Banana Pro supports up to 14 reference images (6 with high fidelity). This allows for “Identity Locking”—placing a specific person or character into new scenarios without facial distortion.

Nano-Banana Pro 支持多达 14 张参考图像(其中 6 张为高保真)。这允许“身份锁定”——将特定的人物或角色放入新场景中,而不会发生面部变形。

Best Practices | 最佳实践:

  • Identity Locking: Explicitly state: “Keep the person’s facial features exactly the same as Image 1.”

    • 身份锁定:
       明确说明:“保持人物的面部特征与图 1 完全一致。”
  • Expression/Action: Describe the change in emotion or pose while maintaining the identity.

    • 表情/动作:
       描述情绪或姿势的变化,同时保持身份不变。
  • Viral Composition: Combine subjects with bold graphics and text in a single pass.

    • 病毒式构图:
       在一次生成中结合主体、大胆的图形和文字。

Example Prompts | 提示词示例:

The “Viral Thumbnail” (Identity + Text + Graphics): “Design a viral video thumbnail using the person from Image 1. Face Consistency: Keep the person’s facial features exactly the same as Image 1, but change their expression to look excited and surprised. Action: Pose the person on the left side, pointing their finger towards the right side of the frame. Subject: On the right side, place a high-quality image of a delicious avocado toast. Graphics: Add a bold yellow arrow connecting the person’s finger to the toast. Text: Overlay massive, pop-style text in the middle: ‘3分钟搞定!’ (Done in 3 mins!). Use a thick white outline and drop shadow. Background: A blurred, bright kitchen background. High saturation and contrast.”

“病毒式缩略图”(身份+文字+图形): “使用图 1 中的人物设计一个病毒式视频缩略图。面部一致性:保持人物的面部特征与图 1 完全一致,但将表情改为兴奋和惊讶。动作:将人物放在左侧,手指向画面右侧。主体:在右侧放置一张美味牛油果吐司的高质量图片。图形:添加一个粗大的黄色箭头,连接人物的手指和吐司。文字:在中间叠加巨大的流行风格文字:‘3分钟搞定!’。使用粗白色轮廓和投影。背景:模糊、明亮的厨房背景。高饱和度和对比度。”

Viral Thumbnail
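The thumbnail prompt above follows a "Label: instruction" pattern, which is easy to assemble in code when the same layout is reused across many videos. The following is a minimal sketch; `labelled_prompt` is a hypothetical helper, and the section labels are simply those used in the example prompt.

```python
def labelled_prompt(goal: str, sections: dict[str, str]) -> str:
    """Append 'Label: instruction' sections to an opening goal sentence."""
    parts = [goal.strip()]
    for label, instruction in sections.items():
        parts.append(f"{label}: {instruction.strip()}")
    return " ".join(parts)


thumbnail = labelled_prompt(
    "Design a viral video thumbnail using the person from Image 1.",
    {
        "Face Consistency": (
            "Keep the person's facial features exactly the same as Image 1, "
            "but change their expression to look excited and surprised."
        ),
        "Action": (
            "Pose the person on the left side, pointing their finger "
            "towards the right side of the frame."
        ),
        "Background": "A blurred, bright kitchen background. High saturation and contrast.",
    },
)
print(thumbnail)
```

Because Python dictionaries preserve insertion order, the sections appear in the prompt in the order they are written.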

The “Fluffy Friends” Scenario (Group Consistency): [Input 3 images of different plush creatures] “Create a funny 10-part story with these 3 fluffy friends going on a tropical vacation. The story is thrilling throughout with emotional highs and lows and ends in a happy moment. Keep the attire and identity consistent for all 3 characters, but their expressions and angles should vary throughout all 10 images. Make sure to only have one of each character in each image.”

“毛茸茸的朋友”场景(群体一致性): [输入 3 张不同毛绒生物的图片] “用这 3 个毛茸茸的朋友去热带度假创作一个有趣的 10 部分故事。故事全程惊险刺激,情绪跌宕起伏,最后以幸福时刻结尾。保持所有 3 个角色的服装和身份一致,但他们的表情和角度在所有 10 张图片中应有所不同。确保每张图片中每个角色只出现一次。”

Fluffy Friends

Brand Asset Generation: [Input 1 image of a product] “Create 9 stunning fashion shots as if they’re from an award-winning fashion editorial. Use this reference as the brand style but add nuance and variety to the range so they convey a professional design touch. Please generate nine images, one at a time.”

品牌资产生成: [输入 1 张产品图片] “创作 9 张令人惊叹的时尚照片,就像出自获奖时尚杂志一样。使用此参考作为品牌风格,但增加细微差别和多样性,使其传达出专业的设计感。请一张接一张地生成九张图片。”

Brand Asset

3. Grounding with Google Search

3. 基于 Google 搜索的现实锚定

Nano-Banana Pro can use Google Search to ground imagery in real-time data, current events, or verified facts, which reduces hallucinations on timely topics.

Nano-Banana Pro 使用 Google Search 根据实时数据、时事或事实验证生成图像,减少对时效性话题的幻觉。

Best Practices | 最佳实践:

  • Ask for visualizations of dynamic data (weather, stocks, news).

    • 要求可视化动态数据(天气、股票、新闻)。
  • The model will “Think” (reason) about the search results before generating the image.

    • 模型在生成图像之前会“思考”(推理)搜索结果。

Example Prompts | 提示词示例:

Event Visualization: “Generate an infographic of the best times to visit the U.S. National Parks in 2025 based on current travel trends.”

事件可视化: “根据当前的旅游趋势,生成一张 2025 年游览美国国家公园最佳时间的信息图表。”

Event Visualization

4. Advanced Editing, Restoration & Colorization

4. 高级编辑、修复与上色

The model excels at complex edits via conversational prompting. This includes “In-painting” (removing/adding objects), “Restoration” (fixing old photos), “Colorization” (Manga/B&W photos), and “Style Swapping.”

该模型擅长通过对话式提示进行复杂编辑。这包括“重绘”(移除/添加物体)、“修复”(修复旧照片)、“上色”(漫画/黑白照片)和“风格转换”。

Best Practices | 最佳实践:

  • Semantic Instructions: You do not need to manually mask; simply tell the model what to change naturally.

    • 语义指令:
       你不需要手动遮罩;只需自然地告诉模型要更改什么。
  • Physics Understanding: You can ask for complex changes like “fill this glass with liquid” to test physics generation.

    • 物理理解:
       你可以要求进行复杂的更改,如“把这个杯子装满液体”,以测试物理生成能力。

Example Prompts | 提示词示例:

Object Removal & In-painting: “Remove the tourists from the background of this photo and fill the space with logical textures (cobblestones and storefronts) that match the surrounding environment.”

物体移除与重绘: “从这张照片的背景中移除游客,并用符合周围环境的逻辑纹理(鹅卵石和店面)填充空间。”

Object Removal

Manga/Comic Colorization: [Input black and white manga panel] “Colorize this manga panel. Use a vibrant anime style palette. Ensure the lighting effects on the energy beams are glowing neon blue and the character’s outfit is consistent with their official colors.”

漫画/连环画上色: [输入黑白漫画分镜] “给这个漫画分镜上色。使用充满活力的动漫风格调色板。确保能量束的灯光效果是发光的霓虹蓝,并且角色的服装与他们的官方颜色一致。”

Manga Colorization

Localization (Text Translation + Cultural Adaptation): [Input image of a London bus stop ad] “Take this concept and localize it to a Tokyo setting, including translating the tagline into Japanese. Change the background to a bustling Shibuya street at night.”

本地化(文本翻译+文化适应): [输入伦敦公交车站广告图片] “采用这个概念并将其本地化到东京场景,包括将标语翻译成日语。将背景改为夜晚熙熙攘攘的涩谷街道。”

Localization

Lighting/Seasonal Control: [Input image of a house in summer] “Turn this scene into winter time. Keep the house architecture exactly the same, but add snow to the roof and yard, and change the lighting to a cold, overcast afternoon.”

光照/季节控制: [输入夏天的房子图片] “将此场景变为冬天。保持房屋结构完全相同,但在屋顶和院子里添加积雪,并将光线改为寒冷、阴沉的下午。”

Lighting/Seasonal Control

5. Dimensional Translation (2D ↔ 3D)

5. 维度转换 (2D ↔ 3D)

A powerful new capability is translating 2D schematics into 3D visualizations, or vice versa. This is ideal for interior designers, architects, and meme creators.

一个强大的新功能是将 2D 示意图转化为 3D 可视化,反之亦然。这对室内设计师、建筑师和梗图创作者来说非常理想。

Example Prompts | 提示词示例:

2D Floor Plan to 3D Interior Design Board: “Based on the uploaded 2D floor plan, generate a professional interior design presentation board in a single image. Layout: A collage with one large main image at the top (wide-angle perspective of the living area), and three smaller images below (Master Bedroom, Home Office, and a 3D top-down floor plan). Style: Apply a Modern Minimalist style with warm oak wood flooring and off-white walls across ALL images. Quality: Photorealistic rendering, soft natural lighting.”

2D 平面图转 3D 室内设计板: “根据上传的 2D 平面图,生成一张专业的室内设计展示板。布局:拼贴画,顶部有一张大的主图(起居区的广角透视),下面有三张较小的图片(主卧室、家庭办公室和 3D 俯视平面图)。风格:在所有图片中应用现代极简主义风格,搭配温暖的橡木地板和灰白色墙壁。质量:照片级渲染,柔和的自然光。”

2D Floor Plan to 3D

2D to 3D Meme Conversion: “Turn the ‘This is Fine’ dog meme into a photorealistic 3D render. Keep the composition identical but make the dog look like a plush toy and the fire look like realistic flames.”

2D 转 3D 梗图转换: “将‘This is Fine’狗的梗图变成照片级真实的 3D 渲染。保持构图完全相同,但让狗看起来像毛绒玩具,火焰看起来像真实的火焰。”

2D to 3D Meme

6. High-Resolution & Textures

6. 高分辨率与材质

Nano-Banana Pro supports native 1K to 4K image generation. This is particularly useful for detailed textures or large-format prints.

Nano-Banana Pro 支持原生 1K 到 4K 图像生成。这对于细节纹理或大幅面打印特别有用。

Best Practices | 最佳实践:

  • Explicitly request high resolutions (2K or 4K) if your API/Interface allows.

    • 如果你的 API/界面允许,明确要求高分辨率(2K 或 4K)。
  • Describe high-fidelity details (imperfections, surface textures).

    • 描述高保真细节(瑕疵、表面纹理)。

Example Prompts | 提示词示例:

4K Texture Generation: “Harness native high-fidelity output to craft a breathtaking, atmospheric environment of a mossy forest floor. Command complex lighting effects and delicate textures, ensuring every strand of moss and beam of light is rendered in pixel-perfect resolution suitable for a 4K wallpaper.”

4K 纹理生成: “利用原生高保真输出,打造令人惊叹、氛围浓厚的长满苔藓的森林地面环境。掌控复杂的灯光效果和细腻的纹理,确保每一缕苔藓和光束都以适合 4K 壁纸的像素级完美分辨率呈现。”

4K Texture

Complex Logic (Thinking Mode): “Create a hyper-realistic infographic of a gourmet cheeseburger, deconstructed to show the texture of the toasted brioche bun, the seared crust of the patty, and the glistening melt of the cheese. Label each layer with its flavor profile.”

复杂逻辑(思考模式): “制作一张超写实的美食芝士汉堡信息图,将其解构以展示烤奶油蛋卷面包的质感、肉饼的焦香外壳以及芝士融化的光泽。用风味简介标注每一层。”

Complex Logic Burger

7. Thinking & Reasoning

7. 思考与推理

Nano-Banana Pro defaults to a “Thinking” process where it generates interim thought images (not charged) to refine composition before rendering the final output. This allows for data analysis and solving visual problems.

Nano-Banana Pro 默认为“思考”过程,在渲染最终输出之前,它会生成中间思维图像(不收费)来优化构图。这允许进行数据分析和解决视觉问题。

Example Prompts | 提示词示例:

Solve Equations: “Solve log_{x^2+1}(x^4-1)=2 in C on a white board. Show the steps clearly.”

解方程: “在白板上求解 log_{x^2+1}(x^4-1)=2 (在复数集 C 中)。清晰地展示步骤。”

Solve Equations
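To check the model's whiteboard, the equation can be worked by hand. The sketch below assumes the usual convention that a logarithm's base must be nonzero and different from 1; under that assumption, the only candidates are rejected and the equation has no solution.

```latex
\begin{align*}
\log_{x^2+1}\left(x^4-1\right) = 2
  &\implies (x^2+1)^2 = x^4 - 1 \\
  &\implies x^4 + 2x^2 + 1 = x^4 - 1 \\
  &\implies x^2 = -1 \\
  &\implies x = \pm i
\end{align*}
% Check: for x = \pm i the base is x^2 + 1 = 0 and the argument is x^4 - 1 = 0,
% so neither candidate is admissible and the solution set is empty.
```

A correct whiteboard should therefore end by discarding both candidates rather than reporting x = ±i as the answer.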

Visual Reasoning: “Analyze this image of a room and generate a ‘before’ image that shows what the room might have looked like during construction, showing the framing and unfinished drywall.”

视觉推理: “分析这张房间的图片,并生成一张‘之前’的图片,展示该房间在施工期间可能的样子,显示框架和未完成的干墙。”

Visual Reasoning

8. One-Shot Storyboarding & Concept Art

8. 单次生成故事板与概念艺术

You can generate sequential art or storyboards without a grid, ensuring a cohesive narrative flow in a single session. This is also popular for “Movie Concept Art” (e.g., fake leaks of upcoming films).

你可以在没有网格的情况下生成连续艺术或故事板,确保在单次会话中实现连贯的叙事流程。这在“电影概念艺术”(例如,即将上映电影的虚假泄露)中也很流行。

Example Prompt | 提示词示例:

“Create an addictively intriguing 9-part story with 9 images featuring a woman and man in an award-winning luxury luggage commercial. The story should have emotional highs and lows, ending on an elegant shot of the woman with the logo. The identity of the woman and man and their attire must stay consistent throughout but they can and should be seen from different angles and distances. Please generate images one at a time. Make sure every image is in a 16:9 landscape format.”

“创作一个引人入胜的 9 部分故事,包含 9 张图片,主角是一男一女,拍摄一支获奖的豪华行李箱广告。故事应该有情绪的起伏,最后以女人与 Logo 的优雅镜头结束。女人和男人的身份及其服装必须全程保持一致,但可以且应该从不同的角度和距离看到他们。请一张接一张地生成图片。确保每张图片都是 16:9 的横向格式。”

Storyboarding

9. Structural Control & Layout Guidance

9. 结构控制与布局指导

Input images aren’t limited to character references or subjects to edit. You can use them to strictly control the composition and layout of the final output. This is a game-changer for designers who need to turn a napkin sketch, a wireframe, or a specific grid layout into a polished asset.

输入图像不仅限于角色参考或要编辑的主体。你可以利用它们来严格控制最终输出的构图和布局。对于需要将餐巾纸草图、线框图或特定网格布局转化为精美资产的设计师来说,这是一个颠覆性的功能。

Best Practices | 最佳实践:

  • Drafts & Sketches: Upload a hand-drawn sketch to define exactly where the text and object should sit.

    • 草稿与素描:
       上传手绘草图,精确定义文字和物体的位置。
  • Wireframes: Use screenshots of existing layouts or wireframes to generate high-fidelity UI mockups.

    • 线框图:
       使用现有布局或线框图的截图来生成高保真 UI 模型。
  • Grids: Use grid images to force the model to generate assets for tile-based games or LED displays.

    • 网格:
       使用网格图像强制模型为基于瓦片的游戏或 LED 显示屏生成资产。

Example Prompts | 提示词示例:

Sketch to Final Ad: “Create an ad for a [product] following this sketch.”

草图转最终广告: “按照这个草图为 [产品] 制作一个广告。”

Sketch to Final Ad

UI Mockup from Wireframe: “Create a mock-up for a [product] following these guidelines.”

线框图转 UI 模型: “按照这些指南为 [产品] 创建一个模型。”

UI Mockup

Pixel Art & LED Displays: “Generate a pixel art sprite of a unicorn that fits perfectly into this 64x64 grid image. Use high contrast colors.” (Tip: Developers can then programmatically extract the center color of each cell to drive a connected 64x64 LED matrix display).

像素艺术与 LED 显示屏: “生成一个独角兽的像素艺术精灵,完美适配这个 64x64 的网格图像。使用高对比度的颜色。”(提示:开发人员可以通过编程提取每个单元格的中心颜色,以驱动连接的 64x64 LED 矩阵显示屏)。

Pixel Art
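The tip about sampling cell centers can be sketched in a few lines. This is an illustration under stated assumptions: the generated image has already been decoded into rows of RGB tuples by an imaging library of your choice, and `cell_center_colors` is a hypothetical helper name. A tiny 2x2 grid stands in for the 64x64 case.

```python
Pixel = tuple[int, int, int]


def cell_center_colors(pixels: list[list[Pixel]], grid: int) -> list[list[Pixel]]:
    """Sample the pixel at the center of each cell of a grid x grid layout."""
    height = len(pixels)
    width = len(pixels[0])
    cell_h = height / grid
    cell_w = width / grid
    frame = []
    for row in range(grid):
        y = int((row + 0.5) * cell_h)  # vertical center of this row of cells
        frame.append(
            [pixels[y][int((col + 0.5) * cell_w)] for col in range(grid)]
        )
    return frame


# Synthetic 4x4-pixel image made of four solid 2x2-pixel cells.
RED, GREEN, BLUE, WHITE = (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)
image = [
    [RED, RED, GREEN, GREEN],
    [RED, RED, GREEN, GREEN],
    [BLUE, BLUE, WHITE, WHITE],
    [BLUE, BLUE, WHITE, WHITE],
]
frame = cell_center_colors(image, grid=2)
print(frame)
```

Sampling the center rather than averaging the cell avoids picking up the grid lines of the reference image; for the display described above, call the function with `grid=64`.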

Sprites: “Sprite sheet of a woman doing a backflip on a drone, 3x3 grid, sequence, frame by frame animation, square aspect ratio. Follow the structure of the attached reference image exactly…” (Tip: You can then extract each cell and make a gif)

精灵图: “一个女人在无人机上做后空翻的精灵图表,3x3 网格,序列,逐帧动画,正方形纵横比。完全遵循所附参考图像的结构。”(提示:你可以提取每个单元格并制作 gif)

Sprites

(Resulting GIF / 结果 GIF)
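Extracting each cell of the sprite sheet comes down to computing one crop box per frame. The sketch below is a minimal example, assuming the sheet is an evenly divided 3x3 grid; `grid_boxes` is a hypothetical helper, and the 1024x1024 size is only an example. The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` accepts.

```python
def grid_boxes(width: int, height: int, rows: int = 3, cols: int = 3):
    """Return one (left, upper, right, lower) box per cell, in reading order."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            upper = r * height // rows
            right = (c + 1) * width // cols
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


# Example size only; use the real dimensions of the generated sheet.
boxes = grid_boxes(1024, 1024)
for index, box in enumerate(boxes):
    print(index, box)
```

Using integer division on the cumulative edge positions keeps neighboring boxes flush with each other even when the sheet size is not divisible by three. Each cropped frame can then be saved in order and assembled into a GIF.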


10. What’s Next?

10. 下一步

Now that you have mastered the basics of prompting, here is how you can start building:

现在你已经掌握了提示词的基础知识,以下是你开始构建的方法:

  • Experiment in the UI: Google AI Studio is the fastest way to test prompts and parameters.

    • 在 UI 中实验:
      Google AI Studio 是测试提示词和参数的最快方式。

Original article: https://mp.weixin.qq.com/s/XUFXU_lu-deZwNvUQ7kfqg
