Content Conversion through the Eyes of an Ex-Expat

This is a guest post from Péter Ács of, experts in creating information solutions that augment your product with superior user experience. Today, Péter shares his experience with converting unstructured content to DITA and offers an unusual but very convincing perspective...

As business expectations and organizations evolve, content development processes must follow.

Sometimes an existing toolset can no longer meet the changing needs, and the content format must be converted. Typically, conversion is seen as a purely technical challenge, but it is more complex than that, and often a fantastic opportunity to improve the content and processes along with the tooling.

We have helped numerous companies with their content conversion challenges and, having lived in China with my family for three years, I can best compare the content conversion process to relocating to a distant country. I will use this analogy to present my case.

Know your destination

As with all major undertakings involving unknown factors, proper preparation is key. You may not yet know exactly how and when you will travel, but you must know the destination you are heading to:

Understand the business case - even if related decisions are beyond your responsibility.

What do you want to achieve by leaving your home and living elsewhere? What is the main driver for the conversion? What were the key pain points with the existing format, and does the destination format properly address them? Define and agree on exact success criteria for the intended output.

Challenge the conversion scope. Do you really need to take all those suitcases with you, or can you leave behind some things you won’t necessarily need? There is no better opportunity to stop, review, and declutter actively maintained content than a conversion process.

Align the output requirements carefully. You already know you will relocate to a first-tier city in China, but Beijing, Shanghai and Hong Kong are worlds apart in climate, culture, and opportunities. A typical example: do you really need to convert glossaries and indices if you are switching to exclusively web-based output deliverables? Can you reuse your existing metadata model, converting it as data, or is it time to adjust it?

Get samples and investigate. This is equivalent to reading reviews from other travelers and is the most important preparation step.

  • It’s always advisable to start with a pilot conversion to see if your theory works in practice, just as a shorter visit to the target destination is advised before the actual relocation.

  • Never extrapolate your effort estimates from a single product or document type. Get a sample set that is as varied as possible: even if you understand the challenges of US expats in China, if you were raised in Europe, you will face different hardships.

  • Plan for the worst and multiply by two. We are not joking here. Unless you have done the exact same conversion before, surprises will happen. Have a plan B ready and allocate enough time for it in your estimates. I will not offer a travel analogy here, as there were far too many surprises to choose from… The more time you spend planning the conversion process carefully, the less you will deviate from the path you planned.

Travel with open eyes

The equivalent of the conversion process is relocation – the journey and the first few weeks of settling in.

Consider the following aspects:

Automate – but don’t over-automate

There is no question that on such a long trip, traveling by plane is clearly better than taking a ship or a train, just as automation is generally preferred over manual work during a conversion. But when it comes to the last bit of your journey, you may consider using China’s excellent train network or the Shanghai Maglev instead of a private minibus transfer.

The analogy I’m trying to draw here is that automation saves time and effort, but there are always those little, rare, or inconsistent bits and pieces that are better handled through some manual work than with a custom script.

Screen the market for existing conversion tools

You are not the only one with your problem… Many others have successfully made the journey, for example from Word to DITA (which can also be automated with DITAToo!), and you can often rely on customizing existing tools for your use case. Sometimes, with lesser-known tools or custom content architectures, custom scripts are the only way, so you need to be prepared to do some scripting or have someone available who has the right skills.

Think of this as a real-estate agent specializing in expatriates: they already know the typical needs, so you don’t have to explain everything from scratch.

Pre-process or post-process

As in – taking your belongings in countless boxes or buying equivalents locally.

Post-processing converted content is unavoidable: there is always some content that even the best script can’t catch, and no conversion is complete without a rigorous QA review of the output.

To help post-processing, we typically adjust whatever tool does the automation to output warnings (or in some cases even validation errors) for ambiguous or inconsistent content.
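
As a rough illustration of this warning approach, the sketch below maps paragraph style names to DITA element names and records a warning whenever a style has no clear mapping. The style names, the `STYLE_TO_ELEMENT` table, and the `convert_paragraphs` helper are all hypothetical and do not reflect how DITAToo or any other specific tool works.

```python
# Hypothetical mapping from source paragraph styles to DITA element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "p",
    "List Bullet": "li",
    "Note": "note",
}


def convert_paragraphs(paragraphs):
    """Convert (style, text) pairs; return (converted, warnings)."""
    converted, warnings = [], []
    for number, (style, text) in enumerate(paragraphs, start=1):
        element = STYLE_TO_ELEMENT.get(style)
        if element is None:
            # Ambiguous input: fall back to a plain paragraph and flag it for QA.
            warnings.append(
                f"paragraph {number}: unmapped style {style!r}, converted as <p>"
            )
            element = "p"
        converted.append((element, text.strip()))
    return converted, warnings


sample = [
    ("Heading 1", "Installing the pump"),
    ("My Custom Style", "Check the valve."),
    ("Body Text", "Tighten the bolts."),
]
converted, warnings = convert_paragraphs(sample)
for warning in warnings:
    print("WARNING:", warning)
```

The list of warnings can then be handed to whoever does the post-processing, so they know exactly where to look.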

Pre-processing is often overlooked, yet it can come to the rescue: if there are dozens of instances of a tedious manual post-processing clean-up, think twice and see if you can modify the input content in a way that allows the conversion tool to output more reliable content.

Don’t be afraid to experiment with pre-processing and to iterate on the conversion script or configuration many times. Depending on the conversion volume, this may be worth the effort.
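
A pre-processing pass can be as simple as the sketch below, which cleans up inconsistent style names and stray whitespace before the content reaches the conversion tool. The alias table and the `normalize_style` and `preprocess` helpers are made up for illustration; a real project would build the table from the inconsistencies found in its own sample set.

```python
import re

# Hypothetical aliases: inconsistent style names in the source -> canonical name.
STYLE_ALIASES = {
    "heading1": "Heading 1",
    "h1": "Heading 1",
    "bodytext": "Body Text",
    "normal": "Body Text",
}


def normalize_style(style):
    """Map a messy style name to its canonical form; keep unknown names as-is."""
    key = re.sub(r"[\s_-]+", "", style).lower()
    return STYLE_ALIASES.get(key, style.strip())


def preprocess(paragraphs):
    """Return (style, text) pairs with normalized styles and collapsed whitespace."""
    return [
        (normalize_style(style), " ".join(text.split()))
        for style, text in paragraphs
    ]


messy = [
    ("HEADING 1", "Installing   the pump"),
    ("body_text", " Tighten the bolts. "),
    ("Caption", "Figure 1"),
]
cleaned = preprocess(messy)
```

Each time a new inconsistency turns up during post-processing, adding one line to the alias table and rerunning the conversion is usually cheaper than fixing every instance by hand.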

Sometimes the quickest way is not the straight way

Once I was flying from Chicago to Shanghai, and after take-off the captain told us shocked passengers not to worry: we were on the right plane and deliberately heading northeast instead of southwest to reach Shanghai. It turned out that, due to favorable winds, it was quicker to fly over the North Pole than over the Pacific Ocean.

During content conversion you may feel that even the best tool is not reliable enough or requires too much post-processing. An intermediary format can come to the rescue. A typical example: if all you have are PDFs, it is better to convert the PDFs to Word and then Word to DITA with DITAToo than to take the direct PDF-to-DITA route.

Write your travel diary

I don’t think the importance of good documentation must be explained to the audience of this newsletter.

During our China years, it often took an unreasonable amount of time to update our blog about what was happening to us, but it helped us keep close ties with our friends and relatives. Now it helps us recollect our experience, and I also hope that our kids, as they grow up, will be grateful for the comprehensive documentation of their early childhood spent on the other side of the world.

The same goes for conversion documentation. Post-processing typically involves quick hacks that are forgotten if not documented. So, take care of the following:

  • Have a generic overview of the end-to-end conversion process ready. You will forget quickly.

  • If you use custom scripts during the process, comment the code well to avoid scratching your head trying to understand what it does.

  • Define your success criteria and make sure they are fully agreed on by all parties.

  • Most often, post-processing is done by someone other than the person responsible for automation. Help them with the following:

    • Checklists work best for quality assurance. Check your input sources and deliverables, compare output equivalents, and note everything that must match or is different.

    • Build a QA knowledge base of all output warnings and errors and the required post-processing steps.

  • Two full QA reviews seldom fit conversion budgets, but at least have someone with a sharp eye conduct random spot-checks.
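
For the random spot-checks, a few lines of scripting can pick the files to review so that the choice is unbiased and repeatable. The sketch below is only one possible approach; the `pick_spot_checks` helper and the file names are hypothetical.

```python
import random


def pick_spot_checks(topic_files, sample_size, seed=None):
    """Return a random sample of converted files for manual review.

    Passing the same seed yields the same sample, so the selection
    can be recorded in the conversion documentation and reproduced.
    """
    rng = random.Random(seed)
    size = min(sample_size, len(topic_files))
    return sorted(rng.sample(topic_files, size))


# Hypothetical list of converted topics.
topics = [f"topic_{n:03d}.dita" for n in range(1, 201)]
to_review = pick_spot_checks(topics, sample_size=10, seed=42)
for name in to_review:
    print(name)
```

Noting the seed and the sample size in the QA checklist makes it easy to show later which files were actually reviewed.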

Overall, much like living in China, content conversion may have a bad reputation. It is often seen as a boring, repetitive, must-do activity. With proper planning and a careful, detailed approach, it can be a pain-free process and, more importantly, an enabler of more efficient content operations.

If you plan to relocate to China 😊, need help in elevating your content operations to the next level, or someone to provide cost-efficient and bulletproof content conversion, do not hesitate to reach out.

We are glad to help!
