Coining Golden Terms with AI
Terminology: the linguistic wild child of localization. Synonyms, homonyms, preferred vs. forbidden terms, part of speech, gender, definitions, context, usage examples – each term in a concept-based terminology database (termbase) is like a bauble on an overdecorated Christmas tree. And just like trimming a tree, it takes dedication to keep everything well-balanced and up to date. Back in the early ’90s, pioneers wielded Trados MultiTerm (yes, the DOS version!) to prune, nurture, and cultivate terminology gardens with surprising efficiency.
Yet, in many cases, terminology remains the neglected stepchild among language assets. Will AI finally be its fairy godparent?
One of espell’s clients is a master craftsman of terminology, meticulously coining terms like a linguistic goldsmith. Their marketing and tech writers follow a rigorous process: from drafting a skeleton concept with a definition and context to selecting the perfect English term, then layering it with metadata, acronyms, synonyms, and use cases. These polished English concepts arrive at espell in weekly batches.
Our job? Finding and adding corresponding translations in six languages, adjusting metadata, and adding synonyms or forbidden equivalents as needed. Once validated, these terms become the backbone of localization and QA, injected into content by linguists, DeepL, or LLMs, depending on the workflow.
But terminology is a living entity.
Product features evolve, branding trends shift, and marketing loves to bestow fancy (sometimes utterly baffling) names on relabeled features and components – not to mention household acronyms. Managing this lifecycle means more than just adding and translating terms – it requires constant vigilance: updating usage statuses, proposing and approving new terms, and gracefully retiring outdated ones.
The challenge? This kind of meticulous terminology change management is both brain-heavy and labor-intensive, hindered by less-than-ideal software support. With our current workflow, we painstakingly harvest a few hundred fully curated concepts per year – one weekly batch at a time. There had to be a better way.
Enter AI. Our client challenged us to explore whether AI could help extract meaningful term candidates from large volumes of existing content, feeding into their terminology-building cycle.
From experience, we knew that term extraction is like gold panning: after the initial automated sift, most of the work lies in shaking out the fool’s gold – filtering out noise and redundancy. We previously relied on Power Query and Excel’s deduplication tools for this task. Now, it was time to put AI to the test. Could smart prompting and LLMs do the job faster and more efficiently?
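For the curious, here is a minimal sketch of the kind of spreadsheet-style filtering this involved, assuming the raw extraction output has been exported to a CSV; the file names and the term and frequency columns are illustrative, not our actual setup:

```python
import pandas as pd

# Hypothetical export of raw term candidates (file and column names are illustrative).
candidates = pd.read_csv("raw_candidates.csv")  # expected columns: term, frequency

# Normalize for comparison, then drop exact and case-only duplicates.
candidates["term_key"] = candidates["term"].str.strip().str.casefold()
candidates = candidates.drop_duplicates(subset="term_key")

# Crude noise filters: single characters, pure numbers, one-off rarities.
mask = (
    (candidates["term"].str.len() > 1)
    & ~candidates["term"].str.fullmatch(r"\d+")
    & (candidates["frequency"] >= 2)
)
shortlist = candidates.loc[mask, ["term", "frequency"]]
shortlist.to_csv("candidate_shortlist.csv", index=False)
```

Useful, but blunt: rules like these catch obvious noise, while judging whether a phrase is a genuine term still falls to a human – or, as we hoped, to a well-prompted LLM.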
What Didn’t Work
Like any good experiment, our first attempts taught us what not to do:
- Dumping 100-200k words of XLIFF content into a chat AI – feeding the model that much context in one go turned out to be an invitation for hallucinations, no matter how precisely we structured our prompts with strict term extraction criteria.
- Using the AI-powered term extraction tool in the client’s cloud-based TMS – this system was neither configurable nor scalable. It functioned as a black box with hardwired logic, leaving us with little control over the output.
The Winning Approach
Then came our old reliable: memoQ. This tool provided a manageable volume of extracted terms, which we could refine further using AI prompts crafted by an espell terminologist. After three days of iterative testing, we discovered an important insight: running the same prompts recursively improved the results.
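To make the recursive refinement idea concrete, here is a minimal sketch, assuming an OpenAI-style chat completion API; the model name, prompt wording, and number of passes are illustrative rather than our production configuration:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REFINE_PROMPT = (
    "You are a terminologist. From the list below, keep only genuine term "
    "candidates: domain-specific nouns and noun phrases. Remove generic words, "
    "verbs, duplicates, and near-duplicates. Return one candidate per line."
)

def refine_candidates(candidates: list[str], passes: int = 3) -> list[str]:
    """Run the same filtering prompt over its own output several times."""
    current = candidates
    for _ in range(passes):
        response = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[
                {"role": "system", "content": REFINE_PROMPT},
                {"role": "user", "content": "\n".join(current)},
            ],
        )
        current = [
            line.strip()
            for line in response.choices[0].message.content.splitlines()
            if line.strip()
        ]
    return current
```

Each pass simply hands the previous pass’s output back to the same prompt, so the list gets a little cleaner with every round – which is exactly the effect we saw when re-running our prompts.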
Once we filtered out non-terms and duplicates already present in the termbase, we were left with 386 “nuggets”, i.e., promising term candidates. These underwent further QA and refinement, leaving us with 47 true “golden” terms – meticulously vetted by our client’s terminologist, enriched with metadata, and integrated into their terminology workflow for translation.
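The termbase deduplication step can be as simple as a set difference, sketched below under the assumption that both the existing termbase and the refined candidate list can be exported as plain text, one English term per line (file names are hypothetical):

```python
def load_terms(path: str) -> set[str]:
    """Load one term per line and normalize for comparison."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().casefold() for line in f if line.strip()}

existing = load_terms("termbase_en_export.txt")    # current termbase entries
candidates = load_terms("refined_candidates.txt")  # AI-refined candidate list

# Keep only candidates not already covered by the termbase.
new_nuggets = sorted(candidates - existing)
print(f"{len(new_nuggets)} new term candidates for review")
```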
Lessons Learned and Future Optimizations
While this AI-assisted process didn’t necessarily save time in its first run compared to a fully manual approach, it did yield valuable efficiencies for future iterations. The key advantage? The expertise of our terminologists is now embedded in structured AI prompts, reducing the amount of manual intervention needed going forward.
This won’t turn terminology extraction into a gold factory overnight – there will always be a back-and-forth between manual curation and semi-automated processing. But well-refined terminology, when managed intelligently, delivers strong ROI. It also enhances AI-augmented translation, post-editing, and QA workflows, making it an invaluable asset in modern localization.