The two problems no-code tools quietly skip
There are exactly two hard parts to illustrating a book with AI, and nearly every consumer tool glosses over both.
The first is character consistency. A children's picture book lives or dies on the reader believing the brave little mouse on page 3 is the same mouse on page 20 — same face, same scarf, same proportions. Text-to-image alone won't give you that. You get a new mouse every prompt. The fix is to design one master character first, then feed that image back in as a reference on every subsequent page.
The second is resolution, and specifically what print resolution costs. Amazon KDP and most print-on-demand services require interior art at 300 DPI. An 8.5″ × 8.5″ square page — the most common picture-book trim — needs 2,550 × 2,550 pixels to hit that. Plenty of AI tools happily generate at 1024 × 1024 and call it “print-ready”; it isn't, and you find out after you've laid out the whole book. Both models below can reach print resolution. The real difference is how much each one charges you to do it across an entire book. That's where the two-model split earns its keep.
Prerequisites
You'll need a GPT Proto API key (generate one in the dashboard under API Keys — it looks like sk-xxxxx), Python 3 with requests, and a decision about your print target before you generate a single pixel. Pick that target now, because regenerating 32 pages at the wrong size is the most common and most avoidable mistake here.
A few numbers worth committing to up front, all from standard Amazon KDP interior requirements: picture books run 24–32 pages, and the total page count must be a multiple of 4. Interior art should be 300 DPI. If your art runs to the page edge (full bleed), add 0.125″ (about 3.2 mm) of bleed on every side, and keep anything important at least 0.25″ inside the trim. Translate your trim size to pixels at 300 DPI and that's your generation size:
KDP trim sizes and their pixel dimensions at 300 DPI
| Trim size | Pixels at 300 DPI | Notes |
| --- | --- | --- |
| 8.5″ × 8.5″ (square) | 2550 × 2550 | Most common picture-book format |
| 8.25″ × 8.25″ (square) | 2475 × 2475 | KDP standard square |
| 8″ × 10″ (portrait) | 2400 × 3000 | Good for vertical scenes |
| 8.5″ × 11″ (letter) | 2550 × 3300 | Activity books; see the resolution note below |
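The table is just inches multiplied by DPI, so a trim size that isn't listed can be worked out the same way. A minimal sketch; `trim_to_pixels` is a name made up for illustration, not part of any API.

```python
from typing import Tuple


def trim_to_pixels(width_in: float, height_in: float, dpi: int = 300) -> Tuple[int, int]:
    """Convert a trim size in inches to pixel dimensions at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# Reproduce the rows of the table above.
for label, (w_in, h_in) in {
    "8.5 x 8.5": (8.5, 8.5),
    "8.25 x 8.25": (8.25, 8.25),
    "8 x 10": (8, 10),
    "8.5 x 11": (8.5, 11),
}.items():
    w_px, h_px = trim_to_pixels(w_in, h_in)
    print(f"{label} in -> {w_px} x {h_px} px")
```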
One quirk to plan around: gpt-image-2's generation is capped at a maximum edge of 3840 px and a total of 8,294,400 pixels (per the GPT Proto endpoint spec). A 2550 × 2550 square is 6.5M pixels — comfortably inside that. But 8.5″ × 11″ at 300 DPI is 8.42M pixels, which exceeds the cap. For that largest trim you'll want Seedream 5.0 (native 4K, larger canvas) or a separate upscale step. For standard square and portrait picture books, both models clear the bar.
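To check a target size against those limits before spending anything, something like the following may help. It is a sketch under the figures quoted in this article (3840 px maximum edge, 8,294,400 total pixels, and the multiple-of-16 rule mentioned in Step 2); `gpt_image_2_size` is a hypothetical helper, not an SDK function.

```python
from typing import Optional

MAX_EDGE = 3840          # maximum edge length quoted above
MAX_PIXELS = 8_294_400   # total pixel cap quoted above
STEP = 16                # dimensions must be multiples of 16 (per Step 2)


def round_up(value: int, step: int = STEP) -> int:
    """Round value up to the nearest multiple of step."""
    return -(-value // step) * step


def gpt_image_2_size(width: int, height: int) -> Optional[str]:
    """Return a 'WxH' size string rounded up to multiples of 16,
    or None if the rounded size breaks either limit."""
    w, h = round_up(width), round_up(height)
    if max(w, h) > MAX_EDGE or w * h > MAX_PIXELS:
        return None
    return f"{w}x{h}"


print(gpt_image_2_size(2550, 2550))  # 8.5 in square
print(gpt_image_2_size(2400, 3000))  # 8 x 10 in portrait
print(gpt_image_2_size(2550, 3300))  # 8.5 x 11 in letter
```

A `None` result is the signal to switch that trim to Seedream or plan an upscale step.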
Set up a shared helper first. Both models on GPT Proto use the same authentication header — the raw key, no Bearer prefix — and the same asynchronous pattern: you submit a job, get back an id and a result URL, then poll until it's done.
import time

import requests

BASE = "https://gptproto.com/api/v3"
HEADERS = {
    "Authorization": "GPTPROTO_API_KEY",  # paste your sk-xxxxx key here, no "Bearer"
    "Content-Type": "application/json",
}


def generate(endpoint: str, payload: dict, poll_every: float = 2.0):
    """Submit a generation job and poll the prediction endpoint until it completes."""
    resp = requests.post(f"{BASE}/{endpoint}", headers=HEADERS, json=payload, timeout=60)
    resp.raise_for_status()
    data = resp.json()["data"]
    result_url = data["urls"]["get"]  # e.g. .../api/v3/predictions/abc/result
    while True:
        poll = requests.get(result_url, headers=HEADERS, timeout=60)
        poll.raise_for_status()  # surface HTTP errors before parsing the body
        res = poll.json()["data"]
        status = res["status"]
        if status == "completed":
            return res["outputs"]  # list of image URLs
        if status in ("failed", "error") or res.get("error"):
            raise RuntimeError(res.get("error") or "generation failed")
        time.sleep(poll_every)
That generate() helper is the spine of everything below. Each step is just a different endpoint and payload passed into it.
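One thing the helper doesn't do is give up: if a job never leaves the queue, the loop polls forever. Here is a minimal sketch of a bounded variant, assuming the same response fields (`status`, `outputs`, `error`). `poll_until_done` and its `fetch` argument are hypothetical names; `fetch` stands in for the GET request so the logic can be exercised without touching the network.

```python
import time
from typing import Callable, Dict, List


def poll_until_done(
    fetch: Callable[[], Dict],
    poll_every: float = 2.0,
    max_wait: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch() until the job completes, fails, or max_wait seconds have been slept."""
    waited = 0.0
    while True:
        res = fetch()
        status = res.get("status")
        if status == "completed":
            return res["outputs"]
        if status in ("failed", "error") or res.get("error"):
            raise RuntimeError(res.get("error") or "generation failed")
        if waited >= max_wait:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(poll_every)
        waited += poll_every
```

Inside `generate()`, `fetch` would be a small lambda wrapping the `requests.get(result_url, ...)` call and returning its `data` object.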
Step 1 — Design the master character with gpt-image-2
Do this step once, and do it carefully — every page in the book inherits it. I reach for gpt-image-2 here specifically because it follows long, fussy instructions better than anything else in this workflow, and getting the character exactly right is worth paying for when you only pay for it once.
Generate a clean character reference sheet on a plain background. Describe the character in concrete, repeatable detail — the specific colors and props you'll be naming again on every page.
character = generate("openai/gpt-image-2/text-to-image", {
    "prompt": (
        "Character reference sheet for a children's picture book: a small round owl "
        "named Pip, soft teal feathers, large amber eyes, a tiny red knitted scarf, "
        "gentle friendly expression. Show a front view and a three-quarter view. "
        "Flat watercolor storybook style, plain white background, no text."
    ),
    "size": "1024x1024",
    "quality": "high",
    "response_format": "url",
})
master_url = character[0]
print("Master character:", master_url)
Note the size format here uses a lowercase x (1024x1024) and quality accepts low, medium, high, or auto. Keep this reference URL. It's the thread that holds the whole book together.
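Because the same traits get named again on every page, it can help to hold them in one place and build prompts from that, so a typo on page 17 can't quietly change the scarf color. This is only a sketch of one way to do it; the `PIP` dictionary and both function names are invented for illustration.

```python
PIP = {
    "name": "Pip",
    "kind": "small round owl",
    "traits": ["soft teal feathers", "large amber eyes", "a tiny red knitted scarf"],
    "style": "flat watercolor storybook style",
}


def trait_list(character: dict) -> str:
    """Join the character's fixed traits into one comma-separated phrase."""
    return ", ".join(character["traits"])


def sheet_prompt(character: dict) -> str:
    """Prompt for the one-off character reference sheet."""
    return (
        f"Character reference sheet for a children's picture book: a {character['kind']} "
        f"named {character['name']}, {trait_list(character)}. "
        "Show a front view and a three-quarter view. "
        f"{character['style'].capitalize()}, plain white background, no text."
    )


def page_prompt(character: dict, scene: str) -> str:
    """Prompt for a single page that restates every fixed trait verbatim."""
    return (
        f"The same character, {character['name']}, from the reference image. "
        f"Keep these features identical: {trait_list(character)}. "
        f"New scene: {scene}. Full-page children's picture-book illustration, "
        f"{character['style']}."
    )


print(page_prompt(PIP, "Pip flying over a moonlit forest"))
```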
Step 2 — Lock the character into every scene
This is the part the one-click tools fake and you're about to do for real. gpt-image-2's image-edit endpoint takes an images array of reference URLs and generates a new scene that preserves the subject. Pass your master_url as the reference, then describe the new page — and explicitly name the features that must stay identical.
def illustrate_page(scene: str, size: str = "2560x2560"):
    out = generate("openai/gpt-image-2/image-edit", {
        "prompt": (
            "The same owl character, Pip, from the reference image. Keep the teal "
            "feathers, amber eyes and red knitted scarf identical. New scene: " + scene +
            ". Full-page children's picture-book watercolor illustration."
        ),
        "images": [master_url],
        "size": size,
        "quality": "high",
        "response_format": "url",
    })
    return out[0]

page_03 = illustrate_page("Pip flying over a moonlit forest, looking down curiously")
A 2560x2560 request lands you at print resolution for an 8.5″ square page (2560 is a multiple of 16, which gpt-image-2 requires, and 6.55M pixels sits under the cap). The honest catch: gpt-image-2 bills by tokens, and token usage climbs with both resolution and quality. A high-quality 2560-pixel page costs meaningfully more than a 1024-pixel draft, and the exact figure varies per image. Which is the whole reason for Step 3.
Step 3 — Mass-produce the interior pages with Seedream 5.0
gpt-image-2 is the right tool for the one character design and the cover. It's the wrong tool for grinding out 30 full-resolution pages, because every one of those pages is metered. Seedream 5.0, ByteDance's image model, charges a flat $0.0298 per image on GPT Proto regardless of size — a 4K page costs exactly the same as a thumbnail. It also supports the same reference-image consistency workflow through its own image-edit endpoint, and generates native 4K (up to 3840 × 3840).
So: hand Seedream the same master_url, loop over your scenes, and let it produce print-resolution pages at a fixed price. One format detail differs from gpt-image-2 and will bite you if you miss it: Seedream's size uses an asterisk (2560*2560, not 2560x2560). Reference images go in the same images array of URLs.
scenes = [
    "Pip waking up in a cozy treehouse at sunrise",
    "Pip meeting a shy hedgehog by a stream",
    "Pip and the hedgehog sharing acorns under an oak",
    # ... one entry per interior page
]
pages = []
for i, scene in enumerate(scenes, start=1):
    out = generate("bytedance/seedream-5-0-260128/image-edit", {
        "prompt": (
            "Keep the exact same owl character from the reference image — teal "
            "feathers, amber eyes, red knitted scarf, same proportions. New scene: "
            + scene + ". Full-page children's picture-book watercolor, consistent style."
        ),
        "images": [master_url],
        "size": "2560*2560",  # asterisk format; print-res 8.5in square
    })
    pages.append(out[0])
    print(f"page {i:>2}: {out[0]}")
print(f"\n{len(pages)} interior pages generated.")
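Since the size separator is the detail most likely to get mixed up between the two models, a tiny formatter can keep it out of your head. A sketch only: the dictionary keys are labels I picked for illustration, not official model identifiers.

```python
# Separator each model expects in its "size" field, as described in this article.
SIZE_SEPARATORS = {
    "gpt-image-2": "x",   # e.g. "2560x2560"
    "seedream-5.0": "*",  # e.g. "2560*2560"
}


def size_string(model: str, width: int, height: int) -> str:
    """Format width and height with the separator the given model expects."""
    try:
        sep = SIZE_SEPARATORS[model]
    except KeyError:
        raise ValueError(f"unknown model label: {model!r}") from None
    return f"{width}{sep}{height}"


print(size_string("gpt-image-2", 2560, 2560))
print(size_string("seedream-5.0", 2560, 2560))
```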
Now the arithmetic that no consumer tool will show you. A 32-page book at Seedream's flat rate is 32 × $0.0298 ≈ $0.95 for every interior illustration at print resolution. Add the master character and a few variations on gpt-image-2, plus a cover, and a complete book's worth of AI art lands around one to two dollars. Compare that to “200 credits per book” pricing where you can never quite work out the real cost.
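The same arithmetic as a few lines of code, in case you want to plug in your own page count. The flat rate and the token prices are the ones quoted in this article; the token counts passed to `gpt_image_2_cost` are placeholders, because real usage varies per image.

```python
SEEDREAM_PER_IMAGE = 0.0298   # flat USD per image, as quoted in this article
GPT_INPUT_PER_M = 6.4         # USD per 1M input tokens, as quoted in this article
GPT_OUTPUT_PER_M = 24.0       # USD per 1M output tokens, as quoted in this article


def seedream_cost(pages: int) -> float:
    """Flat-rate cost of generating `pages` images on Seedream."""
    return pages * SEEDREAM_PER_IMAGE


def gpt_image_2_cost(input_tokens: int, output_tokens: int) -> float:
    """Token-metered cost of one or more gpt-image-2 calls."""
    return (input_tokens / 1_000_000) * GPT_INPUT_PER_M + (
        output_tokens / 1_000_000
    ) * GPT_OUTPUT_PER_M


print(f"32 interior pages: ${seedream_cost(32):.2f}")
# Placeholder token counts, purely to show the formula:
print(f"example gpt-image-2 call: ${gpt_image_2_cost(1_000, 10_000):.4f}")
```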
One trade-off to budget for: both endpoints are asynchronous and rate-limited, so a 32-page loop isn't instant, and hammering the API in a tight loop can trip a 429. The polling helper handles the waiting; for large books, add a short delay between submissions rather than firing all 32 at once.
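If you do hit a 429, retrying with a growing delay is the usual remedy. A minimal sketch, assuming the submit call returns an object with a `status_code` attribute (as `requests` responses do); `submit_with_backoff` is a hypothetical helper and the delays are arbitrary starting points, not values from the GPT Proto docs.

```python
import time
from typing import Any, Callable


def submit_with_backoff(
    submit: Callable[[], Any],
    max_retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call submit() and retry with exponential backoff while it returns HTTP 429."""
    for attempt in range(max_retries + 1):
        resp = submit()
        if resp.status_code != 429:
            return resp  # success or a different error; let the caller decide
        if attempt == max_retries:
            break
        sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
    raise RuntimeError(f"still rate-limited after {max_retries} retries")
```

In the pipeline above, `submit` would wrap the `requests.post(...)` call inside `generate()`.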
Step 4 — Make the files actually printable
A sharp 2560-pixel PNG is necessary but not sufficient. Before layout, run the checklist that gets books rejected when skipped. Confirm each image is the right pixel size for your trim at 300 DPI (the table above). For full-bleed pages, your art needs to extend into the 0.125″ bleed zone, so generate slightly larger than the trim or compose with margin to spare. Keep faces and key action out of the outer 0.25″ so nothing important gets trimmed. Convert to CMYK for the print interior (screens are RGB; print is not), and make sure your final page count is a multiple of 4.
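The size and page-count parts of that checklist are easy to automate. The sketch below follows this article's rule of 0.125″ bleed on every side; `preflight` is an invented name, and it checks only pixel dimensions and page count, not color mode or safe-zone content.

```python
from typing import List


def preflight(
    width_px: int,
    height_px: int,
    trim_w_in: float,
    trim_h_in: float,
    page_count: int,
    full_bleed: bool = False,
    dpi: int = 300,
    bleed_in: float = 0.125,
) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    extra = 2 * bleed_in if full_bleed else 0.0  # bleed on both sides of each axis
    need_w = round((trim_w_in + extra) * dpi)
    need_h = round((trim_h_in + extra) * dpi)
    if width_px < need_w or height_px < need_h:
        problems.append(
            f"image is {width_px}x{height_px}, needs at least {need_w}x{need_h}"
        )
    if page_count % 4 != 0:
        problems.append(f"page count {page_count} is not a multiple of 4")
    return problems


print(preflight(2560, 2560, 8.5, 8.5, page_count=32))
print(preflight(2560, 2560, 8.5, 8.5, page_count=32, full_bleed=True))
```

Under that rule a 2560 × 2560 page passes for a non-bleed 8.5″ square layout but falls short of the 2625 × 2625 a full-bleed page would need.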
If you're publishing the 8.5″ × 11″ format that exceeds gpt-image-2's pixel cap, generate those pages on Seedream at up to 3840 × 3840 and downscale to your exact target, or upscale a smaller generation — downsizing keeps detail, upsizing invents it, so prefer the former.
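A square generation doesn't match a letter page's aspect ratio, so downscaling to an exact target also means cropping. Here is one way to work out the numbers, assuming a scale-to-cover followed by a center crop (my choice for the sketch, not something the models require); `cover_fit` is a hypothetical helper and only does the arithmetic.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def cover_fit(
    gen_w: int, gen_h: int, target_w: int, target_h: int
) -> Tuple[float, Tuple[int, int], Box]:
    """Scale the generated image until it covers the target, then center-crop.

    Returns (scale, resized_size, crop_box), with crop_box as
    (left, top, right, bottom) in resized-image coordinates.
    A scale above 1.0 means the image would be upscaled.
    """
    scale = max(target_w / gen_w, target_h / gen_h)
    new_w, new_h = round(gen_w * scale), round(gen_h * scale)
    left = (new_w - target_w) // 2
    top = (new_h - target_h) // 2
    return scale, (new_w, new_h), (left, top, left + target_w, top + target_h)


scale, resized, box = cover_fit(3840, 3840, 2550, 3300)
print(scale, resized, box)
```

The crop discards the left and right edges of a square image, so compose scenes with the important action near the center. With an imaging library such as Pillow, the resized size and crop box would feed its resize and crop calls.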
gpt-image-2 vs Seedream 5.0: which model does what
People want one model. For this job you want two, and the split is not arbitrary — each wins on the axis that matters for a different part of the book.
gpt-image-2 vs Seedream 5.0 on GPT Proto — what each is for
| | gpt-image-2 | Seedream 5.0 |
| --- | --- | --- |
| Best at | Following long, precise prompts; rendering legible text in-image | Flat-rate high-resolution output at volume |
| Max native resolution | 3840 px max edge, ≤ 8.29M pixels total (~2880² square) | Native 4K, up to 3840 × 3840 |
| Pricing on GPT Proto | Token-based: input $6.4 / 1M, output $24 / 1M (varies per image) | Flat $0.0298 per image, any size |
| Character consistency | Reference images via image-edit | Reference images via image-edit |
| Format gotcha | size: "1024x1024" (x) | size: "2560*2560" (asterisk) |
My recommendation, stated plainly: use gpt-image-2 for the master character and the cover, where instruction-following and clean title text are worth a variable, higher price you only pay a handful of times. Use Seedream 5.0 for the 24–32 interior pages, where a flat $0.0298 per print-resolution image is the difference between a book that costs a dollar and a book that costs ten.
The honest caveats, because no model is free of them. gpt-image-2's edge in following complex English prompts is real — Seedream is strong, but for an intricate cover with exact text I trust gpt-image-2 more. Seedream's win is price and native resolution, not universal superiority; it earns its place here because cost-per-page and print size happen to be precisely what a 32-page book stresses. Against the broader field — models like Google's Nano Banana or Midjourney — the same logic holds: this workflow optimizes for consistent characters and print-ready pages at a predictable cost, and the gpt-image-2-plus-Seedream pairing covers both better than any single model I've tested for this specific task.
Common errors and how to fix them
A 401 means the key is missing or wrong: check that you pasted the raw sk-xxxxx value into the Authorization header with no Bearer prefix. A 403 usually means insufficient balance, not a broken key. A 429 is rate limiting; generate() raises on it rather than retrying, so space out your submissions and retry after a pause. A 400/503 flagged as a content policy violation means the prompt tripped a safety filter, so rephrase it. And if your images come back blurry in print, the cause is almost always size: you generated below your trim's 300 DPI pixel count. Regenerate at the correct dimensions rather than scaling up afterward.
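If you'd rather have the script tell you this than look it up, the status codes above fit in a small lookup. A sketch: the wording of each hint is mine, and the mapping reflects only what this article describes, not an official error reference.

```python
# Status code -> likely cause, as described in this article.
ERROR_HINTS = {
    401: "Key missing or wrong: use the raw sk-... value with no 'Bearer' prefix.",
    403: "Probably insufficient balance rather than a broken key.",
    429: "Rate limited: space out submissions and retry after a pause.",
    400: "If flagged as a content policy violation, rephrase the prompt.",
    503: "If flagged as a content policy violation, rephrase the prompt.",
}


def explain(status_code: int) -> str:
    """Return a human-readable hint for a failed request's status code."""
    return ERROR_HINTS.get(status_code, f"Unrecognized status {status_code}; check the response body.")


print(explain(403))
```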
FAQ
Can you legally sell a book illustrated with AI?
Generally yes for commercial use, but the specifics depend on the platform's terms and your jurisdiction's stance on AI-generated work, which is still shifting — confirm against your printer's and marketplace's current policies before you publish.
How many illustrations does a children's book need?
Picture books typically run 24–32 pages, with a page count that is a multiple of 4, so plan for roughly that many interior illustrations plus a cover.
What does a full AI-illustrated book actually cost in API calls?
At Seedream's flat $0.0298 per page, 32 interior pages are about $0.95; with the gpt-image-2 character work and cover, a complete book lands around one to two dollars in generation cost.
Will the character really stay consistent?
Using a master reference image through the image-edit endpoints holds character features far better than re-prompting from scratch, but it's strong, not perfect — expect to regenerate the occasional page where a detail drifts.
Do I need to code, or is there an online option?
You can try both models in the browser on their model pages first to feel out the style, then move to the API when you're producing a whole book and want batch control.
Start here
Try the two models in-browser to find your style, then wire up the pipeline above: