AI pricing transparency: how to estimate your translation costs accurately

Jan 8, 2026

When Maya opened the spreadsheet, the numbers looked like static on an old TV screen. Three vendors had sent proposals for converting her startup’s onboarding guides into Spanish and Japanese, each with a different billing unit: one priced by word, another by character, the third by tokens with a vague note about model input versus output. The calendar was tight, the budget was tighter, and the team could not afford guesswork. What Maya wanted was simple: a clear path from content volume and quality expectations to a reliable price. The promise seemed to shimmer just out of reach, especially with AI in the mix.

She was not alone. If you have ever tried to budget for language work, you know how fast the hidden meters spin—rush fees, specialized terminology, file formatting, and that mysterious line item called model usage. This piece is your map. We will unpack the moving parts, translate them into a practical estimating method, and apply it to real scenarios so you can talk to vendors and internal stakeholders with confidence. Think of it as AI pricing transparency: how to estimate your translation costs accurately, not in theory but in practice.

By the end, you will be able to read quotes like a tech spec, build your own estimate from scratch, and pressure-test the result with a simple playbook. Clear, calm numbers beat anxiety every time.

Cost clarity begins with knowing what you are actually buying.

Before you can estimate, you need to see what drives the meter. AI-enabled language work is not one thing; it is a bundle of tasks with distinct cost drivers.

Unit of measure. Classic quotes use per-word or per-character pricing. AI models think in tokens—fragments of text, roughly 0.75 words in English on average but highly variable across languages. You may pay for input tokens, output tokens, or both. If you receive a token-based quote, ask how many tokens the vendor assumes per 1,000 words of your source text and what output expansion factor they expect in the target language.
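One way to compare quotes across units is to normalize everything to a cost per 1,000 source words. The sketch below does that conversion; the tokens-per-word, characters-per-word, and expansion factors are illustrative assumptions to replace with your own calibration, not vendor figures.

```python
# Normalize per-word, per-character, and per-token quotes to a common
# basis: estimated cost per 1,000 source words.
# All conversion factors here are assumptions for English-like text.

def cost_per_1000_words(unit, rate, tokens_per_word=1.3,
                        chars_per_word=5.7, output_expansion=1.1):
    """Estimate cost per 1,000 source words from a rate in the given unit."""
    if unit == "word":
        return rate * 1000
    if unit == "char":
        return rate * 1000 * chars_per_word
    if unit == "token":
        # Simplification: assumes one blended rate for input and output
        # tokens; split rates need two terms.
        tokens_in = 1000 * tokens_per_word
        tokens_out = tokens_in * output_expansion
        return rate * (tokens_in + tokens_out)
    raise ValueError(f"unknown pricing unit: {unit}")

print(cost_per_1000_words("word", 0.02))      # 20.0
print(cost_per_1000_words("token", 0.0008))   # ~2.18
```

With a normalization like this, a 0.02 USD/word quote and a 0.0008 USD/token quote stop being apples and oranges, and the conversation shifts to which assumed factors are realistic for your content.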

Volume and duplication. Not all content is new. Repeated phrases, boilerplate, and templated UI strings can be partially or fully reused. Good workflows recognize repetition and reduce cost for matches, but only if your content is segmented consistently. If you have a past bilingual database of approved segments, leveraging it reduces both price and turnaround.

Subject matter and quality tier. A casual blog post, a regulated medical leaflet, and software release notes do not demand the same effort. Define quality in terms of consequences of error and required review. Typical tiers are machine output only, machine plus human edit, and human-verified specialist review. The higher the risk, the more human time you should budget.

Turnaround time. Short timelines trigger parallelization, extra reviewers, and extended hours. The technology might be fast, but coordination still costs.

Formats and extras. PDFs, InDesign files, subtitling, right-to-left scripts, and complex markup add prep and desktop publishing time. Terminology creation, style guides, and brand voice tuning also add steps—and value.

All these variables compound. A clear quote tells you which ones apply. If any line item feels abstract, convert it to a crisp assumption: tokens per word, expected reuse, hours of editing per thousand words, file prep time per document. The moment you reframe fuzzy items as measurable assumptions, the fog lifts and the pricing becomes something you can check.

Use a simple worksheet to turn fuzzy quotes into a concrete estimate.

My go-to method is a five-step worksheet you can reuse for any project. You can build it in a spreadsheet in ten minutes.

Step 1: Measure volume in two ways. Count words or characters in the source, then estimate tokens. A practical rule of thumb is tokens ≈ 1.3 × word count for English-like texts, but adjust by language and content type. When in doubt, run a small sample through a tokenizer to calibrate.
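The calibration step can be as simple as this: take the measured token count of a small sample (from your provider's tokenizer or an API dry run), derive a tokens-per-word factor, and scale up. The sample numbers below are made up for illustration.

```python
# Calibrate the tokens-per-word factor from a small representative sample.
# sample_token_count should come from your provider's actual tokenizer;
# these particular numbers are hypothetical.

def calibrate_tokens_per_word(sample_word_count, sample_token_count):
    return sample_token_count / sample_word_count

factor = calibrate_tokens_per_word(sample_word_count=412,
                                   sample_token_count=548)
estimated_tokens = round(8_000 * factor)  # scale to the full project
print(f"tokens/word ≈ {factor:.2f}, full-project estimate ≈ {estimated_tokens}")
```

A measured factor of roughly 1.33 instead of the generic 1.3 is a small difference on one document and a meaningful one across a quarter of content.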

Step 2: Define the workflow. Pick your quality tier: machine only, machine + human edit, or specialist review. List each stage as a row with its pricing unit: model usage per 1,000 tokens, human editing per word or per hour, file prep per hour, engineering per page, and so on.

Step 3: Set assumptions explicitly. For the model, note input token rate and output token rate, plus the expected expansion factor (some target languages expand by 10–30 percent). For human editing, decide on effort units: for example, 700–1,200 words per hour for light editing of clean machine output, 400–700 for technical material, lower if safety-critical. For file prep, estimate minutes per file based on prior projects. For reuse, set match bands or a flat reuse percentage and discount accordingly.

Step 4: Do the math. Example: 8,000-source-word help center article set. Estimated tokens in: 8,000 × 1.3 = 10,400. Output expansion 1.1 ⇒ tokens out ≈ 11,440. Suppose a high-quality model costs 0.40 USD per 1,000 input tokens and 1.20 USD per 1,000 output tokens. Model usage cost ≈ (10.4 × 0.40) + (11.44 × 1.20) = 4.16 + 13.73 ≈ 17.89 USD. Add human editing at, say, 0.02 USD per source word after a 20 percent reuse deduction: billable words = 6,400 ⇒ 128 USD. Add 2 hours of term alignment at 35 USD/hour = 70 USD. Working subtotal ≈ 215.89 USD. Then add 10–15 percent for project management and QA spot checks.
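The Step 4 arithmetic is easy to encode so you can tweak one assumption at a time. This sketch reproduces the example above; all rates and factors are the example's assumptions, not universal prices.

```python
# Reproduce the Step 4 worksheet math with every assumption as a
# named parameter. Rates mirror the worked example in the text.

def estimate_cost(words, tokens_per_word=1.3, expansion=1.1,
                  in_rate=0.40, out_rate=1.20,    # USD per 1,000 tokens
                  edit_rate=0.02, reuse=0.20,     # USD per source word
                  extra_hours=2.0, hourly=35.0,   # term alignment, etc.
                  overhead=0.12):                 # PM + QA margin
    tokens_in = words * tokens_per_word
    tokens_out = tokens_in * expansion
    model = tokens_in / 1000 * in_rate + tokens_out / 1000 * out_rate
    editing = words * (1 - reuse) * edit_rate
    extras = extra_hours * hourly
    subtotal = model + editing + extras
    return round(subtotal, 2), round(subtotal * (1 + overhead), 2)

subtotal, total = estimate_cost(8_000)
print(subtotal, total)  # 215.89 and 241.79 with a 12% overhead
```

When a vendor's number differs from yours, rerun the function with their assumed values and see which parameter accounts for the gap.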

Step 5: Validate against a pilot. Run 1–2 representative pages end-to-end. Measure actual tokens, editing time, and quality results. Adjust your assumptions and re-run the estimate for the full set.

Why this works: every number points back to an assumption you can verify—tokens, hours, or files. If a vendor proposes a higher rate, you can ask which assumption differs: Is the model heavier? Is editing time higher due to domain complexity? Do files need extra engineering? That conversation moves you from haggling to problem solving.

Put the numbers to work on a real project.

Let’s apply the worksheet to a scenario many teams face: an ecommerce catalog, 100 product pages, each about 120 words of copy plus specs and bullet points, destined for four target languages. Your goals are brand consistency, low error risk for measurements and materials, and a two-week deadline.

1) Measure volume. Core copy: 100 × 120 = 12,000 words. Specs and bullets: assume an additional 8,000 words. Total source volume ≈ 20,000 words. Estimated tokens in ≈ 26,000 using the 1.3 factor. Output expansion varies; assume 1.15 on average, so tokens out ≈ 29,900 per language.

2) Map the workflow. You choose a balanced approach: model generation, human editing for clarity and accuracy, and a final brand pass on the top 20 product pages. Stages and units: model usage per 1,000 tokens, editing per source word, brand pass per page, file prep per hour.

3) Set assumptions. Model pricing: 0.30 USD per 1,000 input tokens, 0.90 USD per 1,000 output tokens. Editing effort: 0.015 USD per source word after reuse. Because SKUs repeat, expect 30 percent reuse from one page to another if your content is segmented cleanly. Brand pass: 6 minutes per high-priority page at 40 USD/hour. File prep: 1 minute per page to normalize bullet symbols, convert units, and check markup, at 30 USD/hour.

4) Calculate per language, then multiply by four. Model cost per language ≈ (26 × 0.30) + (29.9 × 0.90) = 7.80 + 26.91 = 34.71 USD. Editing base words per language after 30 percent reuse: 20,000 × 0.70 = 14,000 words ⇒ 210 USD. Brand pass: 20 pages × 0.1 hours × 40 USD = 80 USD. File prep: 100 pages × 1 minute = 100 minutes ≈ 1.67 hours; at 30 USD/hour that is 50 USD. Subtotal per language ≈ 34.71 + 210 + 80 + 50 = 374.71 USD.

Multiply by four target languages: ≈ 1,498.84 USD. Add 12 percent for project management and QA spot checks: ≈ 1,678.70 USD total estimate.
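The whole catalog estimate fits in a few lines, which makes it easy to re-run after the pilot. Every rate below is an assumption from the scenario above.

```python
# End-to-end catalog estimate: 20,000 source words, four target
# languages. All rates are the scenario's assumptions.

words = 20_000
tokens_in = words * 1.3                  # ≈ 26,000 tokens in
tokens_out = tokens_in * 1.15            # ≈ 29,900 tokens out
model = tokens_in / 1000 * 0.30 + tokens_out / 1000 * 0.90   # USD
editing = words * (1 - 0.30) * 0.015     # 30% reuse, 0.015 USD/word
brand_pass = 20 * (6 / 60) * 40          # 20 pages, 6 min each, 40 USD/h
file_prep = 100 * (1 / 60) * 30          # 100 pages, 1 min each, 30 USD/h
per_language = model + editing + brand_pass + file_prep
total = per_language * 4 * 1.12          # four languages + 12% PM/QA
print(f"{per_language:.2f} USD per language, {total:.2f} USD total")
```

After the pilot, replace the 1.15 expansion and 30 percent reuse with measured values and the total updates itself.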

5) Pilot and adjust. Process five product pages in one language. Track actual tokens and editing time. Maybe you discover the specs section expands more than expected in one language and less in another. Update the expansion factor and editing effort. If reuse is higher than 30 percent after you clean and standardize attribute names, lower the editing cost accordingly and re-run the numbers. If measurement units need conversion and validation, add a small engineering buffer.

As you iterate, log your assumptions in plain language: “Editing effort 0.015 USD/word based on 55 minutes per 1,000 words in pilot,” “Tokens out 1.17× tokens in for language A,” “Brand pass required for lifestyle copy only.” These notes are gold when you revisit the catalog next quarter.

Conclusion

Pricing for AI-driven language work stops being mysterious the moment you express every element as a checkable assumption: tokens in, tokens out, hours per task, and reuse percentage. That clarity turns negotiations into collaboration. Vendors can tell you exactly which assumptions they challenge and why, and you can test alternatives without emotional drama. Most importantly, you gain control over scope, quality, and schedule instead of hoping the final invoice is kind.

Here is the simple rhythm to remember: measure the text, pick the workflow, write down your assumptions, do the math, then validate with a pilot. Once you do this a couple of times, the pattern becomes second nature. You will start to see opportunities to reduce spend without cutting quality—by improving segmentation, building a living glossary, or aligning your brand voice so editors work faster and more confidently.

If this blueprint helped, try it on a small piece of your current project today and share what you discover. What assumptions held? Which ones broke? Drop your questions and experiences, and let’s refine the method together so the next estimate you build is not just accurate—it’s defensible, teachable, and calm.

