Is Your Content Semantically Rich? Why You May Want to Care about Semantics

Updated: Feb 20, 2023

When your content reaches your customers, usually they have no idea how you created it.

Basically, they don't care whether you used DITA, Markdown, Word, or Notepad to write the piece of information they are looking for. So why should you care whether your content is semantically rich if, once it reaches the customer, it all looks the same?

Well, it doesn't have to look the same. Here are a few use cases in which semantics is essential not only for faster content creation, but also for more efficient content consumption.

Interactive real-time flight documentation for pilots

In aerospace, there's an XML content standard for flight operations called ATA2300. It's semantically richer than DITA and provides about a dozen information types with a very granular semantic structure. On the other hand, DITA is much easier to implement and adjust for specific needs. To get the best of both worlds, we've married DITA and ATA2300.

We've created a DITA specialization that mimics ATA2300 features and customized our DITAToo DITA CMS to support ATA2300. DITAToo exports DITA to ATA2300, which is then uploaded to an electronic flight bag (EFB). An EFB is an electronic device with a touch screen that lets pilots access the flight documentation during the flight. Not only are flight procedures, checklists, and other mission-critical documentation now assembled automatically for a particular modification of the aircraft, but rich semantics also makes them interactive.

For example, some procedures include actions performed by the captain and the first officer simultaneously. Because the markup allows for labeling which of them performs each action, the EFB can display to each crew member only the actions assigned to their role (on different screens). This wouldn’t be possible without markup that lets you assign a meaning to any piece of content, and that meaning can determine how the content behaves and interacts with the user.
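
A minimal sketch of this kind of role-based filtering, written in Python. The element and attribute names (procedure, action, role) and the action texts are hypothetical placeholders chosen for illustration; they are not taken from ATA2300 or from our DITA specialization.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: names and content are illustrative placeholders,
# not actual ATA2300 or DITA specialization elements.
PROCEDURE_XML = """
<procedure>
  <action role="captain">Perform action A.</action>
  <action role="first-officer">Perform action B.</action>
  <action role="captain">Perform action C.</action>
  <action role="first-officer">Perform action D.</action>
</procedure>
"""

def actions_for_role(xml_text, role):
    """Return the text of every action labeled with the given crew role."""
    root = ET.fromstring(xml_text)
    return [
        action.text.strip()
        for action in root.iter("action")
        if action.get("role") == role
    ]

# Each screen would show only the actions assigned to its crew member.
captain_screen = actions_for_role(PROCEDURE_XML, "captain")
first_officer_screen = actions_for_role(PROCEDURE_XML, "first-officer")
```

The point of the sketch is that the filter never looks at the wording of an action, only at the role label carried by the markup.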

Similarly, when an electronic message is displayed on the EFB, it's assumed that the pilot knows exactly what it means, and no further explanation is required. However, to ensure that pilots have complete information, the EFB lets the pilot tap the screen to see a detailed description of the message and the expected response (which could differ depending on the phase of flight). This is possible because the ATA2300 markup and our DITA specialization have separate elements for the message and its description. As a result, the EFB behaves differently for different elements, which makes interaction with the documentation live and context-dependent.
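
The same idea can be sketched for messages. In the Python example below, the element names (message, text, description), the phase attribute, and the expanded flag are all assumptions made for illustration; the actual ATA2300 elements and EFB behavior may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a message with one description per phase of flight.
MESSAGE_XML = """
<message id="MSG-1">
  <text>EXAMPLE MESSAGE</text>
  <description phase="takeoff">Expected response during takeoff.</description>
  <description phase="cruise">Expected response during cruise.</description>
</message>
"""

def render_message(xml_text, phase, expanded=False):
    """Return the lines to display: the message alone, or the message
    plus the description for the current phase once the pilot taps it."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("text")]
    if expanded:
        for description in root.findall("description"):
            if description.get("phase") == phase:
                lines.append(description.text)
    return lines

collapsed = render_message(MESSAGE_XML, "cruise")
tapped = render_message(MESSAGE_XML, "cruise", expanded=True)
```

Because the message and its description are separate elements, the display logic can treat them differently without parsing the text itself.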

Semantically rich content for chatbots

To let chatbots provide users with smart answers, chatbots should be fed with smart content. This requires labeling content in a way that enables the chatbot to match a specific piece of content to a specific user’s context.

Part of this task is solved by our DITAToo natural language understanding (NLU) engine. It analyzes the contents of DITA topics and adds semantic hierarchical metadata to each topic. For example, a procedure from a printer's user guide that explains how to replace a black-and-white cartridge would be automatically tagged with something like this:

Maintenance > Cartridge > Black-and-white > Replacing

Similarly, a procedure that explains how to utilize a color cartridge would be automatically tagged with something like this:

Maintenance > Cartridge > Color > Utilizing

Semantic markup allows the DITAToo NLU engine not only to analyze the text itself, but also to extract additional information from the markup (especially when that information isn’t explicitly mentioned in the content): product family, product name, target audience, and so on.

Combining the information retrieved by the NLU engine directly from the text with the information based on semantic markup allows for tagging topics even more specifically:

Product Family > Printers > Printer ABC > End-Users > Maintenance > Cartridge > Black-and-white > Replacing

All this information allows the chatbot to find the content that addresses a specific issue more precisely.
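
To illustrate how the two sources of information could be combined, here is a minimal Python sketch. It is not the DITAToo implementation: the topic is a simplified, not schema-complete DITA task, the AUDIENCE_LABELS mapping is hypothetical, the use of the series element for the product family is an assumption, and the NLU output is passed in as a ready-made list rather than computed.

```python
import xml.etree.ElementTree as ET

# Simplified DITA-like topic; the prolog metadata is what the markup contributes.
TOPIC_XML = """
<task id="replace-bw-cartridge">
  <title>Replacing a Black-and-White Cartridge</title>
  <prolog>
    <metadata>
      <audience type="user"/>
      <prodinfo>
        <prodname>Printer ABC</prodname>
        <series>Printers</series>
      </prodinfo>
    </metadata>
  </prolog>
</task>
"""

# Hypothetical mapping from audience types to display labels.
AUDIENCE_LABELS = {"user": "End-Users", "administrator": "Administrators"}

def markup_tags(xml_text):
    """Collect tags that come from the markup rather than from the text."""
    root = ET.fromstring(xml_text)
    family = root.findtext(".//prodinfo/series")
    product = root.findtext(".//prodinfo/prodname")
    audience = root.find(".//audience")
    label = AUDIENCE_LABELS.get(audience.get("type")) if audience is not None else None
    return [tag for tag in ("Product Family", family, product, label) if tag]

def full_tag(xml_text, nlu_tags):
    """Prefix the NLU-derived hierarchy with the markup-derived one."""
    return " > ".join(markup_tags(xml_text) + list(nlu_tags))

# Stand-in for what the NLU engine would extract from the topic text.
nlu_tags = ["Maintenance", "Cartridge", "Black-and-white", "Replacing"]
combined = full_tag(TOPIC_XML, nlu_tags)
```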

Finding Content within a Topic to Address the User’s Question

Semantic markup is also useful when the answer to a user’s question is a certain part of a topic rather than the entire topic. While semantic markup differentiates between, for example, a procedure, an individual step, and a step result, the NLU analysis captures what a particular step is about.

Consider, for example, the following procedure:

Restarting Your Printer

  1. Disconnect the printer from your computer.

  2. Hold the Power button for three seconds.

  3. Connect the printer back to your computer.

  4. Press the Power button again.

If the user asks “How to restart a printer”, providing the user with the entire topic would be just fine. But if the user asks “For how long should I hold the power button if I want to restart the printer?”, the answer should be more precise. Having semantic markup that allows for identifying individual steps within the procedure and the meaning of each of them would enable decomposing the topic and retrieving only a specific piece of content that addresses the question.
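
As a rough Python sketch of such decomposition, the procedure above can be written as a simplified DITA task and searched step by step. The word-overlap scoring below is only a crude stand-in for real NLU analysis, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The procedure above, as a simplified DITA task.
TASK_XML = """
<task id="restart-printer">
  <title>Restarting Your Printer</title>
  <taskbody>
    <steps>
      <step><cmd>Disconnect the printer from your computer.</cmd></step>
      <step><cmd>Hold the Power button for three seconds.</cmd></step>
      <step><cmd>Connect the printer back to your computer.</cmd></step>
      <step><cmd>Press the Power button again.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

def words(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def best_step(xml_text, question):
    """Return the step whose wording overlaps most with the question.
    Word overlap is a placeholder for a real NLU match."""
    root = ET.fromstring(xml_text)
    question_words = words(question)
    steps = [cmd.text for cmd in root.iter("cmd")]
    return max(steps, key=lambda step: len(question_words & words(step)))

answer = best_step(
    TASK_XML,
    "For how long should I hold the power button if I want to restart the printer?",
)
```

The step and cmd elements are what make each step individually addressable; without them, the matcher would have only the whole topic to return.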

As you can see, while content reuse and single-source publishing are arguments that relate more to how content authors benefit, semantics focuses more on your customers and how their experience can be improved.
