Agentic AI for a Zero-Accident Future: Accelerating MISRA Code Compliance to Strengthen Software Reliability

November 19, 2025

Authored by: Suigen Koide (Senior MLOps Engineer), Yuya Mochimaru (Senior MLOps Engineer), and Yosuke Sawai (Senior Engineer)

Leveraging Azure OpenAI, Woven by Toyota has demonstrated, at the proof-of-concept stage, the automated correction of ~80% of MISRA violations in automotive software.
This early-stage breakthrough is expected to save several hundred million JPY (several million USD) and to accelerate advances in vehicle software reliability.

The automotive industry is being reshaped by an unprecedented surge in software complexity. In 2000, a typical car relied on about one million lines of code. By 2025, that figure is expected to reach 600 million*. This is a 600-fold increase in just 25 years. Modern vehicles now rival massive IT systems in complexity, making the design, development, and validation of reliable software one of the sector’s greatest hurdles.

Meeting this challenge requires new tools and approaches. Coding standards such as those defined by MISRA (The Motor Industry Software Reliability Association) are one way to improve reliability in embedded systems. However, strict adherence often comes at a steep cost in both time and resources.

To help close this gap, Woven by Toyota — working in collaboration with Microsoft — has created MISRA Copilot, an AI-powered tool that fuses deep automotive software expertise with cutting-edge generative AI models. This blog explores how the tool is built and how it can transform the way automotive software is developed.

*Nissay Asset Management, “クルマは鉄の塊からソフトウェアの塊へ｜アナリストの眼” [“Cars: From Lumps of Iron to Lumps of Software | Analyst's Eye”], September 21, 2022, https://www.nam.co.jp/market/column/analyst/2022/220921.html.

The MISRA Challenge

MISRA’s guidelines are among several established C and C++ coding standards for embedded systems.

MISRA compliance, in particular, demands deep C and C++ expertise and mastery of hundreds of pages of specifications. Proofs of concept, such as for an ADAS module, often have a large number of violations* that need to be fixed before the code enters production. Static analysis tools can flag violations, but fixing them still demands significant manual effort by large engineering teams.

As software-defined vehicles grow in scale and complexity, these costs are only expected to rise. Even a modest violation rate (such as one violation per 20 lines of code) could result in tens of billions of JPY (hundreds of millions of USD) in annual remediation costs. An efficient, effective and above all else trustworthy countermeasure is therefore vital.
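
The scale behind that estimate can be sketched with a few lines of arithmetic. This is a minimal illustration, not a figure from the project: the 600 million lines of code come from the industry estimate cited earlier in this post, the function names are hypothetical, and the cost per fix is deliberately left as a parameter because the post does not state one.

```python
def estimate_violations(lines_of_code: int, lines_per_violation: int) -> int:
    """Number of violations at a rate of one per `lines_per_violation` lines."""
    return lines_of_code // lines_per_violation


def estimate_cost_jpy(violations: int, cost_per_fix_jpy: int) -> int:
    """Total remediation cost; `cost_per_fix_jpy` is an input you must supply."""
    return violations * cost_per_fix_jpy


# 600 million lines at one violation per 20 lines (both figures from the text).
violations = estimate_violations(600_000_000, 20)
print(f"{violations:,} violations")
```

At that rate a vehicle-scale codebase would carry 30 million violations, so even a small manual cost per fix multiplies into a very large total.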

*This article prioritizes technical accuracy. In this context, ‘violations’ refer specifically to MISRA compliance violations identified during validation or development. MISRA guidelines are intended to improve code reliability, and this article explains how we are streamlining the process of achieving compliance.

Three (AI) Heads Are Better Than One

Initial proof-of-concept experiments with Azure OpenAI were promising, automatically resolving about 50% of MISRA violations. But while general-purpose OpenAI models excel at understanding and generating code, they lack the automotive-specific knowledge needed to apply MISRA’s complex rules with precision.

Moreover, the results from a single general-purpose model were a black box: opaque and hard to trust. Key details such as the rationale for a fix or the confidence level of a correction remained hidden.

By applying Retrieval-Augmented Generation (RAG)* to inject domain-specific knowledge, and by separately evolving the architecture from a single generative model to a multi-agent system with shared context and persistent memory, the team saw a way to deliver both higher accuracy and greater transparency for engineers.

With this vision, Woven by Toyota collaborated with Microsoft, leveraging Microsoft's cutting-edge language models to develop MISRA Copilot.

*Retrieval-Augmented Generation (RAG) is a technique that enables LLMs to pull in relevant information from external sources, such as databases, documents, or specifications, and use it to generate more accurate, context-aware responses.

MISRA Copilot: Agentic AI for Automotive Software Development

The team first systematized the MISRA compliance expertise accumulated at Woven by Toyota (WbyT) and Toyota into a structured database of historical code fixes. This knowledge base was then integrated into the generative AI workflow using RAG, enabling the model to ground its outputs in a domain-specific context rather than relying solely on general-purpose training data.
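
To make the retrieval step concrete, here is a minimal sketch of how past fixes might be looked up and placed into a prompt. It is an illustration under stated assumptions, not MISRA Copilot's implementation: the `FixRecord` schema, the placeholder rule labels, the sample entries, and the token-overlap similarity (standing in for whatever search or embedding method the real system uses) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixRecord:
    """One hypothetical entry in a knowledge base of historical fixes."""
    rule_id: str   # placeholder label, not a real MISRA rule number
    before: str    # code as flagged by static analysis
    after: str     # code after an accepted fix
    note: str      # short rationale for the fix


# Illustrative entries only; they are not vetted MISRA guidance.
KNOWLEDGE_BASE = [
    FixRecord("RULE-A", "uint8_t total = count + offset;",
              "uint8_t total = (uint8_t)(count + offset);",
              "cast the sum back to uint8_t"),
    FixRecord("RULE-A", "int16_t speed = raw * scale;",
              "int16_t speed = (int16_t)(raw * scale);",
              "cast the product back to int16_t"),
    FixRecord("RULE-B", "if (ready) start();",
              "if (ready) { start(); }",
              "wrapped the body in braces"),
]


def tokenize(text: str) -> set:
    """Lowercase whitespace tokens; a crude stand-in for an embedding."""
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two snippets."""
    left, right = tokenize(a), tokenize(b)
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def retrieve(records, rule_id: str, snippet: str, k: int = 2):
    """Return up to k past fixes for the same rule, most similar first."""
    same_rule = [r for r in records if r.rule_id == rule_id]
    ranked = sorted(same_rule,
                    key=lambda r: similarity(snippet, r.before),
                    reverse=True)
    return ranked[:k]


def build_prompt(rule_id: str, snippet: str, examples) -> str:
    """Assemble a prompt that grounds the model in the retrieved fixes."""
    parts = [f"Fix the violation of {rule_id} in the code below."]
    for ex in examples:
        parts.append(f"Past fix ({ex.note}):\n"
                     f"  before: {ex.before}\n"
                     f"  after:  {ex.after}")
    parts.append(f"Code to fix:\n  {snippet}")
    return "\n\n".join(parts)


snippet = "uint8_t total = count + step;"
examples = retrieve(KNOWLEDGE_BASE, "RULE-A", snippet, k=1)
prompt = build_prompt("RULE-A", snippet, examples)
print(prompt)
```

Filtering by rule before ranking keeps unrelated fixes out of the prompt, which is one simple way to keep the retrieved context domain-specific.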

To further enhance both the accuracy and the interpretability of AI-generated corrections, a multi-agent architecture was implemented. In this design, each component functions as a specialized reasoning module within a shared context, collaboratively handling different stages of the remediation pipeline. This division of roles enables complex tasks to be decomposed, evaluated, and recomposed with greater rigor and traceability than a monolithic model could provide.

Within MISRA Copilot, three specialized agents interact in sequence:

The Coder proposes fixes that adhere to MISRA guidelines
The Reviewer evaluates the structure, readability, and maintainability of proposed fixes
The Evaluator checks compliance, provides rationales, and assigns confidence scores for each correction

Rather than relying on a single general-purpose LLM to juggle all tasks at once, this architecture enables each agent to be optimized for its own domain. Each agent can then engage in deeper, rule-specific reasoning, which is essential when working with MISRA’s nuanced guardrails.

For example, the Coder can propose a fix based on local context, while the Reviewer evaluates it for clarity and maintainability across the broader codebase. The Evaluator then checks the fix against formal MISRA rules and provides LLM-generated justifications. Finally, a human reviews the results and decides whether to accept the proposed corrections.
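
The sequence described above can be sketched as a small pipeline in which each agent reads and updates a shared context. This is a simplified illustration, not the actual architecture: the `SharedContext` fields, the function names, and the confidence values are assumptions, and each agent is a deterministic stub where a real system would call a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SharedContext:
    """State handed from agent to agent (hypothetical schema)."""
    rule_id: str
    original: str
    proposed: str = ""
    review_notes: List[str] = field(default_factory=list)
    rationale: str = ""
    confidence: float = 0.0
    log: List[str] = field(default_factory=list)


Agent = Callable[[SharedContext], SharedContext]


def stub_coder(ctx: SharedContext) -> SharedContext:
    """Stand-in for the Coder; a real one would ask an LLM for a fix."""
    ctx.proposed = ctx.original.replace("start();", "{ start(); }")
    ctx.log.append("coder")
    return ctx


def stub_reviewer(ctx: SharedContext) -> SharedContext:
    """Stand-in for the Reviewer; line length is a crude readability proxy."""
    for line in ctx.proposed.splitlines():
        if len(line) > 80:
            ctx.review_notes.append("line exceeds 80 characters")
    ctx.log.append("reviewer")
    return ctx


def stub_evaluator(ctx: SharedContext) -> SharedContext:
    """Stand-in for the Evaluator; records a rationale and a confidence."""
    changed = ctx.proposed != ctx.original
    ctx.rationale = f"Body wrapped in braces to address {ctx.rule_id}."
    # Arbitrary illustrative scores, not values used by the real tool.
    ctx.confidence = 0.9 if changed and not ctx.review_notes else 0.3
    ctx.log.append("evaluator")
    return ctx


def run_pipeline(ctx: SharedContext, agents: List[Agent]) -> SharedContext:
    """Run the agents in order; the result still awaits human review."""
    for agent in agents:
        ctx = agent(ctx)
    return ctx


result = run_pipeline(
    SharedContext(rule_id="RULE-B", original="if (ready) start();"),
    [stub_coder, stub_reviewer, stub_evaluator],
)
print(result.proposed, result.confidence, result.log)
```

Because every agent writes into the same context object, the rationale, review notes, and confidence travel with the fix, so the engineer who makes the final decision can see them.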

Human Judgment In The Loop

In the Toyota Production System, the principle of jidoka (often translated as “automation with a human touch”) is central to the complete elimination of “waste” and “inefficiency”. Applied to knowledge work, this principle draws attention to the many inefficiencies in daily tasks — especially the repetitive ones that AI agents can take over. However, since generative AI can still produce errors, the role of human judgment in the loop is vital.

MISRA Copilot is designed with this principle in mind. Tasks suitable for automation are delegated to AI agents, while final judgment and accountability remain with human engineers. MISRA Copilot never alters code automatically; instead, it generates annotated proposals as GitHub pull requests. Acceptance or rejection of these proposals depends on the expertise and discernment of seasoned engineers.
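
One way to picture this gate is a proposal object that is only ever rendered as text for review, with the code changing solely on an explicit human decision. The sketch below is hypothetical: the `Proposal` fields, the wording of the pull request body, and the `resolve` helper are assumptions, and the step that would open the pull request on GitHub is intentionally omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """An annotated fix awaiting human review (hypothetical schema)."""
    file_path: str
    rule_id: str
    original: str
    proposed: str
    rationale: str
    confidence: float


def render_pull_request_body(p: Proposal) -> str:
    """Format the proposal as a pull request description for reviewers."""
    return "\n".join([
        f"Proposed fix for {p.rule_id} in {p.file_path}",
        "",
        f"Rationale: {p.rationale}",
        f"Confidence: {p.confidence:.2f}",
        "",
        "Before:",
        f"    {p.original}",
        "After:",
        f"    {p.proposed}",
        "",
        "This change is a proposal only. An engineer must review and merge it.",
    ])


def resolve(p: Proposal, approved: bool) -> str:
    """Return the code to keep; the fix is applied only when approved."""
    return p.proposed if approved else p.original


proposal = Proposal(
    file_path="src/example.c",  # placeholder path
    rule_id="RULE-B",           # placeholder rule label
    original="if (ready) start();",
    proposed="if (ready) { start(); }",
    rationale="Body wrapped in braces.",
    confidence=0.9,
)
body = render_pull_request_body(proposal)
print(body)
```

Keeping `resolve` dependent on an explicit `approved` flag mirrors the point made above: the tool proposes, and the engineer decides.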

This integration of machine precision with human knowledge and responsibility exemplifies Toyota’s concept of jidoka — a process design essential in safety-critical environments.

The Results

Proof-of-concept evaluation on internal ADAS development code showed MISRA Copilot delivering strong results:

81.5% of violations resolved automatically
97.1% of generated code syntactically correct

Following promising early proof-of-concept results, the tool is now in further stages of evaluation for potential deployment. Early projections indicate cost savings on the scale of several hundred million JPY (several million USD).

Collaboration with Microsoft is also advancing, with future initiatives being explored.

Toward a Zero-Accident Future, Together

Strategic collaborations such as these are essential to advancing mobility for the benefit of society, combining deep automotive expertise with cutting-edge AI and cloud technologies to accelerate progress toward ever-safer, more reliable vehicles.

Building on MISRA Copilot, Woven by Toyota is developing an ecosystem of AI-assisted tools that enable engineers to create trustworthy automotive software with greater efficiency and reliability — all in pursuit of a zero-accident future.

*All monetary figures are based on the exchange rate at the time of writing.