Boosting DITA XML Workflows with Artificial Intelligence

Alex Jitianu, Syncro Soft/Oxygen XML Editor
March 15, 2025

Artificial Intelligence (AI) has become a transformative force across industries, and the field of technical documentation is no exception. However, while AI offers immense potential, it’s important to approach its integration thoughtfully. Not every problem requires an AI solution, and over-engineering can lead to inefficiencies. The key is to identify specific pain points in your workflow and leverage AI’s strengths to address them effectively. In this article, we’ll explore some common challenges in DITA XML workflows and how AI can help overcome them.

Pain Point: Generating Short Descriptions

Technical writers often face the challenge of crafting meaningful short descriptions (shortdesc). These brief overviews provide a snapshot of a topic’s content and serve as the initial entry point for readers. Writing effective short descriptions is a demanding task, as they must be both concise and informative.

Short descriptions are essential for several reasons:
– They act as a gateway, helping users decide whether to engage with a topic.
– For tasks, they should address key questions such as: What are the benefits and purpose of the task? When and by whom should the task be performed?

Despite their significance, some companies opt to omit short descriptions due to a lack of expertise among their writers. This decision represents a missed opportunity, especially as short descriptions play an increasingly important role in Retrieval-Augmented Generation (RAG). In RAG, external data is retrieved and integrated into prompts to produce more accurate and context-aware responses. Short descriptions help narrow the focus, enhancing the precision and usefulness of the generated output.

AI, with its strength in summarization, offers an effective solution for generating short descriptions. By utilizing AI, you can efficiently create concise and meaningful overviews for your topics, saving time and improving the quality of your content.
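
As a minimal sketch of how this could be wired up, the Python below builds a summarization prompt from a topic's title and body, then inserts the result as a shortdesc element right after the title. The names add_shortdesc, body_text, and summarize are assumptions made for illustration: summarize stands in for whatever AI service you call, and the prompt wording is only a starting point. A real pipeline would also need to preserve the DOCTYPE declaration, which this sketch ignores.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def body_text(topic: ET.Element) -> str:
    """Return the whitespace-normalized text of the topic's body element."""
    # DITA body elements are named body, conbody, taskbody, refbody, and so on.
    body = next((el for el in topic if el.tag.endswith("body")), None)
    if body is None:
        return ""
    return " ".join("".join(body.itertext()).split())


def add_shortdesc(topic_xml: str, summarize: Callable[[str], str]) -> str:
    """Insert a generated <shortdesc> after <title> when the topic has none.

    `summarize` is a hypothetical callable that sends a prompt to an AI
    model and returns its text answer.
    """
    topic = ET.fromstring(topic_xml)
    if topic.find("shortdesc") is not None:
        return topic_xml  # never overwrite a human-written description

    title = topic.findtext("title", default="")
    prompt = (
        "Write a one- or two-sentence DITA short description for the topic "
        "below. Say what the topic covers and why a reader would need it.\n\n"
        f"Title: {title}\n"
        f"Content: {body_text(topic)}"
    )

    shortdesc = ET.Element("shortdesc")
    shortdesc.text = summarize(prompt).strip()

    # Place the new element directly after <title>, or first if there is none.
    children = list(topic)
    position = next(
        (i + 1 for i, el in enumerate(children) if el.tag == "title"), 0
    )
    topic.insert(position, shortdesc)
    return ET.tostring(topic, encoding="unicode")
```

Skipping topics that already have a short description keeps the writer in control: the AI only fills gaps, and every generated description should still be reviewed before publishing.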

Ensuring AI Adheres to Your Style Guide

When using AI for content generation, a common concern is whether it will follow your style guide. By default, AI may not automatically adhere to specific guidelines, but there are several strategies to ensure compliance:

1. Do Nothing: Allow the AI to generate content freely and rely on existing tools, such as terminology checkers or Schematron rules, to identify and correct errors. Schematron’s human-readable messages can even be fed back to the AI for further refinement.

2. Prompt Engineering: Improve the AI’s output by refining its instructions. This iterative process helps align the AI’s responses with your expectations and ensures better adherence to your style guide.

3. Prompt Chaining: For complex requirements, divide tasks into smaller, more manageable prompts. This approach minimizes the risk of the AI overlooking important details. Since style guides often contain numerous rules, breaking them into smaller prompts prevents the creation of an overly lengthy and cumbersome instruction set.

4. Retrieval-Augmented Generation (RAG) with Style Guide: Make your style guide accessible to the AI for reference. For instance, you can upload the style guide to a vector database or use tools like OpenAI’s Assistants API. This allows the AI to retrieve relevant sections of the style guide when generating content. For example, if you ask the AI to create a short description, it can reference the relevant guidelines on crafting meaningful short descriptions and generate content accordingly.

5. Fine-Tuning: Train the AI model using a dataset based on your style guide. While this method can produce highly customized results, it requires expertise and carries the risk of degrading the model’s overall performance if not executed properly. Therefore, fine-tuning should be considered a last resort.
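
Two of the strategies above, prompt chaining and feeding validation messages back to the AI, can be sketched in a few lines of Python. Everything named here is an assumption for illustration: llm is a hypothetical callable that sends a prompt to a model and returns its answer, and check is a stand-in for whatever produces human-readable messages in your setup (for example, a Schematron validation run).

```python
from typing import Callable, Iterable, List

LLM = Callable[[str], str]  # hypothetical: prompt in, model answer out


def chain_style_rules(draft: str, rule_groups: Iterable[List[str]], llm: LLM) -> str:
    """Prompt chaining: revise the draft once per small group of rules."""
    text = draft
    for rules in rule_groups:
        bullet_list = "\n".join(f"- {rule}" for rule in rules)
        text = llm(
            "Revise the text so it follows these style rules. "
            "Change nothing else.\n"
            f"Rules:\n{bullet_list}\n\nText:\n{text}"
        )
    return text


def refine_with_checker(
    text: str,
    check: Callable[[str], List[str]],
    llm: LLM,
    max_rounds: int = 3,
) -> str:
    """Feed validation messages back to the model until the checker is silent.

    Stops after `max_rounds` attempts so a stubborn error cannot loop forever.
    """
    for _ in range(max_rounds):
        messages = check(text)
        if not messages:
            break
        issues = "\n".join(f"- {message}" for message in messages)
        text = llm(
            "Fix the issues reported by validation and return only the "
            f"corrected content.\nIssues:\n{issues}\n\nText:\n{text}"
        )
    return text
```

Keeping each rule group small is the point of the chain: the model sees a short instruction set at every step instead of the whole style guide at once. The round limit in the feedback loop is a design choice of this sketch; whatever still fails after the last round is left for a human to fix.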

Pain Point: Generating Image Alternate Text

Creating alternate text for images is another challenging task. Effective alt text must balance several factors:
– Context Dependence: The description should align with the image’s context, which may vary depending on its use.
– Conciseness vs. Detail: It should be brief yet sufficiently detailed to convey the image’s essential meaning.
– Relevance: Focus on what’s important about the image rather than describing every visual detail.

AI’s ability to analyze and understand images makes it well-suited for generating alt text. By providing additional context from the document along with the image, you can achieve even better results.
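
One way to gather that context is sketched below in Python: the code finds images that have no alternate text and builds, for each one, a prompt containing the topic title, the figure title (if any), and the nearest preceding paragraph. The function name alt_text_requests and the prompt wording are assumptions for illustration, and the sketch prepares only the text part of the request; sending the image file itself to a vision-capable model is left to whichever AI service you use.

```python
import xml.etree.ElementTree as ET


def alt_text_requests(topic_xml: str) -> list[dict]:
    """Build one prompt per <image> that lacks alternate text."""
    topic = ET.fromstring(topic_xml)
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in topic.iter() for child in parent}
    topic_title = topic.findtext("title", default="")

    requests = []
    last_paragraph = ""
    for el in topic.iter():  # document order
        if el.tag == "p":
            last_paragraph = " ".join("".join(el.itertext()).split())
        elif el.tag == "image":
            alt = el.find("alt")
            has_alt_element = alt is not None and (alt.text or "").strip()
            if has_alt_element or el.get("alt"):
                continue  # already described, leave it alone
            parent = parents.get(el)
            fig_title = ""
            if parent is not None and parent.tag == "fig":
                fig_title = parent.findtext("title", default="")
            requests.append({
                "href": el.get("href", ""),
                "prompt": (
                    "Write concise alt text for the attached image. "
                    "Describe what matters in this context, not every "
                    "visual detail.\n"
                    f"Topic title: {topic_title}\n"
                    f"Figure title: {fig_title}\n"
                    f"Nearby text: {last_paragraph}"
                ),
            })
    return requests
```

Each returned entry pairs the image reference with its prompt, so the same image used in two different topics gets two different prompts, which matches the context dependence described above.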

Pain Point: Creating Drafts or Updating Documentation

Writing and updating documentation can be a challenging task, particularly in large-scale projects. At Oxygen XML Editor, for instance, we use JIRA as an issue tracker for feature development. Once a developer completes a feature, it transitions to the documentation phase. This process often presents several challenges:

– Developers may provide incomplete or unclear documentation notes, assuming the writer has prior knowledge of the issue and omitting critical details.
– Valuable information is often buried within JIRA ticket comments, making it difficult to extract relevant insights.
– Identifying which topics require updates, including outdated images, and determining how to address them can be time-consuming and complex.

AI can simplify this process by analyzing all content within a JIRA ticket, including comments and attachments, to deliver a comprehensive understanding of the feature. By leveraging Retrieval-Augmented Generation (RAG) techniques, AI can also identify related topics and suggest updates, significantly streamlining the documentation workflow.

DITA’s structured and semantic nature makes it particularly well-suited for RAG. Features like short descriptions, map hierarchies, relationship tables, classification maps, and ontologies provide valuable context for AI, enabling it to understand resource relationships and requirements more effectively.
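
To make the retrieval step concrete, here is a deliberately simple Python sketch that indexes each topic by its title and short description and ranks topics against the text of a ticket. It scores by plain word overlap, which is a stand-in chosen to keep the example self-contained; a production RAG setup would more likely use embeddings and a vector database. The function names and the sample file paths are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercase bag of words; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def index_topics(topics: dict[str, str]) -> dict[str, Counter]:
    """Map each topic path to the words of its title and short description."""
    index = {}
    for path, xml in topics.items():
        root = ET.fromstring(xml)
        title = root.findtext("title", default="")
        shortdesc_el = root.find("shortdesc")
        shortdesc = "".join(shortdesc_el.itertext()) if shortdesc_el is not None else ""
        index[path] = tokens(f"{title} {shortdesc}")
    return index


def related_topics(ticket_text: str, index: dict[str, Counter], k: int = 3) -> list[str]:
    """Return up to k topic paths that share the most words with the ticket."""
    query = tokens(ticket_text)
    scored = []
    for path, bag in index.items():
        overlap = sum((query & bag).values())  # count of shared words
        if overlap:
            scored.append((overlap, path))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:k]]


# Usage with two made-up topics and a made-up ticket summary.
topics = {
    "install.dita": "<task id='a'><title>Installing the add-on</title>"
                    "<shortdesc>Install the spell checker add-on from the "
                    "update site.</shortdesc></task>",
    "profile.dita": "<concept id='b'><title>Profiling attributes</title>"
                    "<shortdesc>Filter content with conditional attributes."
                    "</shortdesc></concept>",
}
index = index_topics(topics)
candidates = related_topics(
    "Spell checker add-on fails to install from the new update site", index
)
```

The topics returned as candidates would then be passed to the AI together with the ticket content, so that it can suggest which ones need updating. Because only the title and short description are indexed, topics without a short description are harder to find, which is one more reason not to omit them.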

Quick Refactoring

Don’t forget the small things. Small tasks that you perform frequently are ones AI can take off your back, such as:
– Converting lists to definition lists or tables.
– Adding menu cascade markup.

These minor improvements can save time and reduce repetitive work.
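
For readers unfamiliar with menu cascade markup, the short Python sketch below shows the target of such a refactoring: a plain-text menu path such as "File > Save As" becomes a menucascade element with one uicontrol per step. The conversion is written deterministically here only to make the expected result easy to see (and to give you something to compare AI output against); the function name to_menucascade and the assumption that steps are separated by ">" are ours.

```python
import xml.etree.ElementTree as ET


def to_menucascade(path_text: str) -> str:
    """Turn a menu path like 'File > Save As' into DITA <menucascade> markup."""
    cascade = ET.Element("menucascade")
    for label in path_text.split(">"):
        label = label.strip()
        if label:  # skip empty steps caused by stray separators
            ET.SubElement(cascade, "uicontrol").text = label
    return ET.tostring(cascade, encoding="unicode")


markup = to_menucascade("File > Save As")
# markup == "<menucascade><uicontrol>File</uicontrol>"
#           "<uicontrol>Save As</uicontrol></menucascade>"
```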

Editorial Support

As documentation evolves, the likelihood of inconsistencies and logical errors increases. AI can serve as a valuable reviewer, identifying conflicting information and logical flaws. For instance, we have successfully used AI to detect such issues in our own documentation, particularly in areas like installation procedures. This is also a quick way to start using AI: give it a few of your topics to analyze, then review what it reports.

During the review process, valuable insights are often buried within comment threads. Once the review is complete, the writer must carefully integrate all the feedback into the documentation, ensuring nothing is overlooked. Why not enlist the help of an “AI assistant”? AI can streamline this process by incorporating these insights into the document, resulting in a more comprehensive and accurate final product.

Takeaways

Here are some key points to consider in your daily tasks when leveraging AI assistance:

1. Set Realistic Expectations: AI is not a magical solution that flawlessly handles everything. It is a tool that requires careful and thoughtful application to be effective.
2. Align AI Strengths with Your Needs: Focus on areas where AI can deliver the most value, rather than attempting to use it for every task indiscriminately.
3. Expertise Enhances AI Performance: The quality of AI output depends directly on the quality of the input, so poor input leads to poor results.
4. Combine Tools for Better Results: Use AI in conjunction with existing tools like Schematron to address specific challenges. For instance, Schematron can identify images missing alt text, while AI can generate appropriate alt text for those images.

By understanding AI’s capabilities and limitations, you can harness its power to enhance your DITA XML workflows, boost productivity, and improve the quality of your documentation.