Produce a clean set of 2D references that AI can accurately turn into 3D: one finalized front view + a multi-view turnaround (ideally side / back / 3-4 views) + optional part breakouts. These images are the foundation for the next 8 steps — the cleaner they are, the cleaner your 3D character will be.
Image-to-3D tools like Meshy and Tripo feed almost entirely on the reference you give them. The cleaner your reference, the cleaner the mesh that comes out.
A "pretty illustration" with dramatic lighting, a dynamic pose, and a busy background becomes a blob of melted geometry in a 3D generator's hands — armpits fused, proportions warped, and the background baked into the mesh too.
So the goal of this step isn't to paint something beautiful; it's to paint something a machine can read. Every minute you invest here pays back many times over across the next 8 steps.
Every operation below exists to make your final reference satisfy these 5 rules. Burn the rules into memory first, then start.
With arms tight against the torso, the generator can't tell arm from body and the armpits fuse. An A-pose (arms hanging 30–45° out) is usually more stable than a T-pose, and the wrists are less likely to clip into the thighs.
Dramatic and rim lighting get baked straight into the geometry and textures, and once in 3D you can't remove them. What you want is even, soft studio light.
Pure white or pure gray, no ground, no cast shadow, no props. The cleaner the background, the more accurate the cutout and generation — otherwise the background gets modeled right along with the character.
Eye level, no strong high or low angles. Perspective makes the generator misjudge proportions: whatever is nearer the camera renders larger and whatever is farther renders smaller, so you end up with a big head and small feet.
Front / side / back are the same person: same colors, same gear, same proportions. This is the hardest thing for AI and the core battleground of this section — solved with a consistency model + reference inputs.
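The clean-background rule above is one of the few that can be checked mechanically before you spend a generation on an image. The following is a minimal sketch, not a complete validator: it assumes the image has already been decoded (with any image library) into rows of `(r, g, b)` tuples, and the names `border_pixels`, `has_clean_background`, and the `tolerance` default are hypothetical. It only inspects the outer ring of pixels, so it is a rough proxy for "pure white or pure gray, no ground, no cast shadow."

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (r, g, b) values in the range 0-255


def border_pixels(image: Image) -> List[Pixel]:
    """Collect the outermost ring of pixels (top row, bottom row, side columns)."""
    top, bottom = image[0], image[-1]
    left = [row[0] for row in image[1:-1]]
    right = [row[-1] for row in image[1:-1]]
    return list(top) + list(bottom) + left + right


def has_clean_background(image: Image, tolerance: int = 12) -> bool:
    """True if every border pixel is a neutral white/gray close to the first border pixel.

    The tolerance value is an assumption; tune it for your own images.
    """
    pixels = border_pixels(image)
    ref = pixels[0]
    for r, g, b in pixels:
        # Neutral: the three channels are nearly equal (white or gray, no color tint).
        neutral = max(r, g, b) - min(r, g, b) <= tolerance
        # Uniform: the pixel matches the reference border color (no gradient or shadow).
        uniform = all(abs(c - rc) <= tolerance for c, rc in zip((r, g, b), ref))
        if not (neutral and uniform):
            return False
    return True
```

A ground shadow or gradient that touches the edge of the frame fails this check; a character that stays inside a uniform white or gray margin passes.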
Don't start pulling the slot machine right away. First use ChatGPT to write the "design" out clearly: a one-line concept, silhouette features, color palette, signature details. The clearer the words, the easier and more consistent the images. Just drop in the structured prompt below:
You are a senior character designer for a stylized 3D action game.
Lock a production-ready character concept from my seed idea. Output:
1. One-paragraph concept (who they are, their world)
2. Silhouette features (what makes them readable as a black shape)
3. Color palette: 3-4 hex colors with roles (primary / secondary / accent)
4. 5 distinctive design details (gear, marks, materials)
5. A ready-to-paste FRONT-VIEW image prompt: full-body, A-pose,
plain white background, neutral studio lighting.
Seed idea: [your one-line idea, e.g. a wandering blue-cat spirit swordsman]
Keep it production-oriented: clean silhouette, separable parts,
no extreme anatomy that is hard to model.
Use a structured prompt to generate the front view, make a batch at once, and pick the one with the strongest silhouette that best matches the design. The structure is always: subject + style + pose + framing + light + background.
full-body character concept of [character description], front orthographic view, A-pose with arms held ~35° away from the body, legs slightly apart, character reference sheet, flat even studio lighting, soft shadows only, plain solid white background, clean readable design, [STYLE], symmetrical, entire figure visible head to feet, no cropping, high detail, sharp focus
avoid: dramatic lighting, rim light, busy background, props, ground shadow, action pose, foreshortening, extreme perspective, multiple characters, text, watermark, cropped limbs
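If you generate many batches, it can help to keep the six slots (subject + style + pose + framing + light + background) as separate fields so that only the subject and style change between runs. The sketch below is one way to do that, assuming plain string assembly; the `FrontViewPrompt` class, the `AVOID` list, and `render_avoid` are hypothetical names, and the default phrases are copied from the prompt above.

```python
from dataclasses import dataclass

# Terms taken from the "avoid:" line of the front-view prompt.
AVOID = [
    "dramatic lighting", "rim light", "busy background", "props",
    "ground shadow", "action pose", "foreshortening", "extreme perspective",
    "multiple characters", "text", "watermark", "cropped limbs",
]


@dataclass
class FrontViewPrompt:
    """One field per slot; only subject and style normally vary between characters."""
    subject: str
    style: str
    pose: str = "A-pose with arms held ~35° away from the body, legs slightly apart"
    framing: str = "front orthographic view, entire figure visible head to feet, no cropping"
    light: str = "flat even studio lighting, soft shadows only"
    background: str = "plain solid white background"

    def render(self) -> str:
        """Join the slots in the fixed order: subject, style, pose, framing, light, background."""
        parts = [
            f"full-body character concept of {self.subject}",
            self.style,
            self.pose,
            self.framing,
            self.light,
            self.background,
        ]
        # Skip empty slots so a missing style does not leave a stray comma.
        return ", ".join(p.strip() for p in parts if p.strip())


def render_avoid() -> str:
    """Format the avoid list as a single line to paste after the main prompt."""
    return "avoid: " + ", ".join(AVOID)
```

For example, `FrontViewPrompt(subject="a wandering blue-cat spirit swordsman", style="anime style, cel-shaded").render()` yields a prompt with the pose, framing, light, and background wording locked, which keeps a batch comparable from one image to the next.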
Once you've picked one, use image editing to clean it up, then lock it — no more design changes. This is your master.
Using this image, keep the EXACT same character design, outfit and colors.
Clean it up only:
- remove the background to pure white
- complete any cropped limbs so the full body (head to feet) is visible
- flatten dramatic lighting into even neutral studio light
- correct obvious proportion issues
A-pose, front orthographic view. Do not redesign anything.
The core trick: feed the master as a reference image to Nano Banana Pro (Gemini 3 Pro Image) — its strength is exactly "locking character identity and staying consistent across images," and it takes up to 14 reference images. Have it change only the viewing angle, never the design.
Use the provided reference image as the single source of truth for this
character. Generate the SAME character in a [side profile / back]
orthographic view. Keep identical outfit, colors, proportions, gear and
design details - this must read as the EXACT same character, just rotated.
A-pose, flat neutral studio lighting, plain white background,
character turnaround sheet style. Do not change or add any design element.
Create a character turnaround / model sheet of THIS exact character showing front, side and back views in a row. All A-pose, identical design and colors, flat neutral lighting, plain white background, evenly aligned at the same height and scale.
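The "evenly aligned at the same height and scale" requirement in the turnaround prompt can be spot-checked once the sheet is cut into one panel per view. The sketch below makes several assumptions: every panel has the same pixel dimensions, the background is pure white, and each panel is already decoded into rows of `(r, g, b)` tuples. The names `figure_rows`, `views_aligned`, and `max_row_drift` are hypothetical. It compares only where the figure starts and ends vertically, so it catches scale and height drift but not design drift.

```python
from typing import Dict, List, Optional, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (r, g, b) values in the range 0-255


def is_background(pixel: Pixel, background: Pixel, tolerance: int) -> bool:
    """True if the pixel is within tolerance of the background color on every channel."""
    return all(abs(c - b) <= tolerance for c, b in zip(pixel, background))


def figure_rows(
    image: Image,
    background: Pixel = (255, 255, 255),
    tolerance: int = 12,
) -> Optional[Tuple[int, int]]:
    """Return (first_row, last_row) that contain any non-background pixel, or None if empty."""
    rows = [
        y for y, row in enumerate(image)
        if any(not is_background(p, background, tolerance) for p in row)
    ]
    return (rows[0], rows[-1]) if rows else None


def views_aligned(views: Dict[str, Image], max_row_drift: int = 2) -> bool:
    """True if each view's figure starts and ends within max_row_drift rows of the first view."""
    spans = [figure_rows(img) for img in views.values()]
    if not spans or any(span is None for span in spans):
        return False
    ref_top, ref_bottom = spans[0]
    return all(
        abs(top - ref_top) <= max_row_drift and abs(bottom - ref_bottom) <= max_row_drift
        for top, bottom in spans
    )
```

If a view fails this check, regenerating that single view against the master is usually cheaper than trying to rescale it by hand.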
Things like weapons, helmets, capes, and large accessories: isolate them on a pure white background, orthographic. Later you generate each in higher quality as separate 3D and assemble — far cleaner than brute-forcing one whole complex character. Split whatever you can.
Isolate the [weapon / helmet / cape] from this character.
Show it ALONE on a plain white background, orthographic front and side
views, clean even studio lighting, no character, no background props,
product-shot style, high detail.
Rules unchanged, just add style words to the prompt. The friendliest category for 3D generation.
semi-realistic game character, PBR-friendly, neutral expression, balanced anatomy
Watch out: big eyes and exaggerated proportions make 3D generation harder, so keep the silhouette more restrained.
anime style, cel-shaded, clean lineart, flat colors, readable silhouette
Proportions can be exaggerated, but the limbs must stay separated and the silhouette clear, or the rounded blob will fuse together.
chibi / stylized cartoon, exaggerated proportions, separated limbs, bold shapes
▸ All three pass → on to Module 02: 3D model generation.