AI-FMEA – A Regulator’s & Reviewer’s Perspective

Technical Specification and Methodological Basis for Safety Review

Version 1.0 — Updated December 2025

This document provides the full technical basis for the ASI AI-FMEA system, including scale definitions, rationale, scoring methodology, and the analytical reasoning behind the Safety Model. It is intended for regulators, auditors, academic reviewers, and engineering teams requiring deeper justification beyond the practical reference page.

Context Note
This technical specification focuses on formal definitions, scoring methodology, and evaluation guidance. Readers seeking a practitioner-oriented discussion of how conversational AI failure modes emerge in real systems may wish to review AI-FMEA — An Engineer’s Perspective.

Note – Specification Level:
This document represents a Level-1 (baseline) technical standard. It defines the analytical framework, core scales, and foundational methodology for applying FMEA to conversational AI systems.
Future revisions will introduce higher-tier standards (Levels 2 and 3) that expand this foundation with detailed procedures, taxonomies, compliance criteria, and formal audit structures.
This Level-1 specification sets the scope and intent for all subsequent technical developments.

The goal of AI-FMEA is to apply structured engineering analysis to conversational AI systems, treating repeatable harmful behaviors as failure modes that can be identified, ranked, controlled, and mitigated before they reach the public. This specification explains why the FMEA method is valid for AI systems, how scoring works, and how evaluators should approach the analysis.

1. Purpose and Scope

The purpose of this specification is to provide a formal, well-defined analytical framework for evaluating systemic risks in conversational AI. The scope includes:

  • User interaction patterns
  • Model reasoning tendencies
  • Reinforcement loops
  • Safety filter gaps
  • Emotional and cognitive influence dynamics
  • Societal and ethical risk considerations
  • Failure detectability limitations

This specification applies to all conversational AI systems, including:

  • Chatbots
  • Large language models
  • Personal-assistant models
  • Social AI agents
  • AI systems with memory or long-term conversation capability

The framework does not evaluate physical robotics or autonomous mobility systems, though its principles can extend to such contexts.

2. Why FMEA Applies to Conversational AI

FMEA is traditionally used in safety-critical fields such as:

  • Aerospace
  • Medical devices
  • Automotive engineering
  • Nuclear systems
  • Industrial controls

Despite the difference in domain, conversational AI behaves like other complex systems:

  • It has internal states that evolve unpredictably
  • It can exhibit cascading failures
  • It interacts with human operators
  • It produces outcomes that sometimes bypass human intuition
  • It can fail silently or undetectably
  • It exhibits repeatable failure patterns

These characteristics allow the engineering concept of a failure mode to be applied directly. For conversational AI, this specification defines a failure mode as:

“A predictable or recurring pattern of behavior through which a system can cause harm.”

Conversational AI failure modes include:

  • Harmful guidance
  • Emotional entanglement
  • Manipulative persuasion
  • Reinforcement of harmful beliefs
  • False certainty or misinformation
  • Inconsistent self-correction
  • Illusion of intentionality

Therefore, AI-FMEA is not an adaptation forced onto AI — it is the correct analytical tool to classify and rank these behaviors.

3. Core Scales: Definitions and Rationale

The Severity, Occurrence, and Detection scales used in this specification are summarized visually in the AI-FMEA Technical Reference — Scales & Heat Map.

AI-FMEA evaluates three scales consistent with engineering best practices:

3.1 Severity (S) – Impact of Harm

Severity measures the potential real-world impact if the failure reaches the user.

For conversational AI, severity includes:

  • Psychological or emotional harm
  • Behavioral influence or reinforcement
  • Societal or group-level consequences
  • Ethical or legal violations
  • Irreversible trust damage

Severity is scored from 1 to 10:

  • 1–3: Minor — easily reversible; mild confusion or inconvenience
  • 4–6: Moderate — noticeable disruption or risk
  • 7–8: High — significant harm likely
  • 9–10: Critical — catastrophic emotional, psychological, legal, or safety consequences

Rationale:

AI-human interactions directly influence thoughts, trust, and decision-making. Even “soft” failures can have hard consequences. Severity reflects this.

3.2 Occurrence (O) – Likelihood of Failure

Occurrence reflects how often a failure mode appears under normal usage conditions.

AI Occurrence depends on:

  • Model training biases
  • Reinforcement learning artifacts
  • Prompt sensitivity
  • User tendency to probe weaknesses
  • Session length
  • Conversational drift
  • Systemic patterns within model weights

Occurrence scoring:

  • 1–3: Rare — requires extreme prompting
  • 4–6: Occasional — moderate likelihood
  • 7–8: Frequent — likely during extended interactions
  • 9–10: Systemic — high-probability failure intrinsic to the model

Rationale:

AI failures do not occur randomly — they cluster around predictable behavioral tendencies encoded in the model.

3.3 Detection (D) – Ability to Identify Failure Before Harm

Detection measures how likely the system is to catch the failure before it affects a user.

AI Detection limitations include:

  • Overly broad safety filters
  • Safety filters that miss subtle context
  • Memory-driven escalation
  • Ambiguity in user intent
  • Lack of introspection or reasoning transparency
  • Undetected emotional-state changes in the user

Detection scoring:

  • 1–3: High Detectability — model reliably catches the issue
  • 4–6: Moderate — partial or inconsistent detection
  • 7–8: Low — failure often bypasses guardrails
  • 9–10: No Detection — failure escapes system recognition entirely

Rationale:

AI safety performance is constrained by the system’s architecture. If detection is weak, even low-frequency failures require urgent mitigation.
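
The scoring bands above can be captured as simple lookup tables. The sketch below is a minimal Python illustration of how an evaluation tool might encode the band labels from Sections 3.1–3.3; the names (BANDS, band) and the example scores are hypothetical, not part of this specification.

```python
# Illustrative encoding of the Severity, Occurrence, and Detection bands
# from Sections 3.1-3.3. Names and example values are hypothetical.

BANDS = {
    "severity":   [(3, "Minor"), (6, "Moderate"), (8, "High"), (10, "Critical")],
    "occurrence": [(3, "Rare"), (6, "Occasional"), (8, "Frequent"), (10, "Systemic")],
    "detection":  [(3, "High Detectability"), (6, "Moderate"), (8, "Low"), (10, "No Detection")],
}

def band(scale: str, score: int) -> str:
    """Return the band label for an integer score of 1-10 on the given scale."""
    if score not in range(1, 11):
        raise ValueError("scores must be integers from 1 to 10")
    return next(label for upper, label in BANDS[scale] if score <= upper)

# Hypothetical example: an emotional-entanglement failure mode scored S=8, O=6, D=7
print(band("severity", 8), band("occurrence", 6), band("detection", 7))
# -> High Occasional Low
```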

4. Risk Priority Number (RPN)

FMEA uses a composite metric:

RPN = Severity × Occurrence × Detection

RPN identifies which failure modes carry the highest combined risk.

  • High RPN (201–1000)
    Critical risks requiring immediate engineering intervention.
  • Moderate RPN (81–200)
    Risks requiring structured mitigation planning.
  • Low RPN (1–80)
    Monitored but not urgent.

RPN is effective for AI because conversational systems exhibit:

  • Compounding risk
  • User dependency escalation
  • Failure reinforcement across sessions
  • Behavioral drift

RPN captures all three dimensions simultaneously.
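
To make the composite metric concrete, the sketch below computes an RPN from the three scores and assigns it to the priority bands defined above. This is an illustrative Python sketch; the function names and the exact boundary handling are assumptions, not normative requirements.

```python
# Illustrative RPN computation and prioritization following the bands in
# Section 4. Names and boundary handling are assumptions, not normative.

def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: RPN = Severity x Occurrence x Detection (range 1-1000)."""
    for score in (severity, occurrence, detection):
        if score not in range(1, 11):
            raise ValueError("each score must be an integer from 1 to 10")
    return severity * occurrence * detection

def priority(rpn_value: int) -> str:
    """Map an RPN to the priority bands defined in Section 4."""
    if rpn_value > 200:
        return "High: immediate engineering intervention"
    if rpn_value > 80:
        return "Moderate: structured mitigation planning"
    return "Low: monitor"

# Hypothetical example: the failure mode scored S=8, O=6, D=7 in Section 3
score = rpn(8, 6, 7)           # 336
print(score, priority(score))  # 336 High: immediate engineering intervention
```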

5. Heat Map Visualization

The Heat Map transforms RPN scores into color-coded risk zones. This enables rapid interpretation:

  • Red — Critical risk
  • Orange — High risk
  • Yellow — Moderate risk
  • Green — Low risk

Heat maps are widely used in engineering risk triage and adapt readily to AI systems.
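
The mapping from RPN values to color zones can also be made explicit. The sketch below renders a text heat map over Severity and Occurrence at worst-case Detection; the cut points (in particular the Red/Orange split within the High band) are illustrative assumptions, since this Level-1 specification defines three RPN bands rather than four color thresholds.

```python
# Illustrative RPN-to-color mapping for heat map rendering. The cut points,
# especially the Red/Orange split within the High band, are assumptions.

def heat_zone(rpn_value: int) -> str:
    if rpn_value > 500:
        return "Red (Critical)"
    if rpn_value > 200:
        return "Orange (High)"
    if rpn_value > 80:
        return "Yellow (Moderate)"
    return "Green (Low)"

# Text heat map over Severity x Occurrence, assuming worst-case Detection = 10
for severity in range(10, 0, -1):
    row = [heat_zone(severity * occurrence * 10)[0] for occurrence in range(1, 11)]
    print(f"S={severity:2d}  " + " ".join(row))
print("      columns: O = 1 (left) to 10 (right)")
```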

6. Evaluator Guidance

6.1 Evaluators must score systemic patterns, not isolated outputs

One harmful output is not a failure mode.
A repeatable pattern is.
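
One way to operationalize this rule is to register a candidate failure mode only when the same flagged behavior recurs across independent sessions. The sketch below is a minimal illustration; the recurrence threshold of three sessions and the data format are assumptions, not specified values.

```python
# Minimal illustration of "pattern, not one-off": a behavior tag becomes a
# candidate failure mode only after it recurs across independent sessions.
# The threshold of 3 sessions is an assumption, not a specified value.
from collections import defaultdict

MIN_SESSIONS = 3

def candidate_failure_modes(flagged_outputs):
    """flagged_outputs: iterable of (session_id, behavior_tag) pairs."""
    sessions_by_tag = defaultdict(set)
    for session_id, tag in flagged_outputs:
        sessions_by_tag[tag].add(session_id)
    return {tag for tag, sessions in sessions_by_tag.items()
            if len(sessions) >= MIN_SESSIONS}

# Hypothetical flagged outputs from an evaluation run
flags = [
    ("s1", "false certainty"), ("s2", "false certainty"), ("s3", "false certainty"),
    ("s4", "harmful guidance"),  # a single occurrence is not yet a failure mode
]
print(candidate_failure_modes(flags))  # {'false certainty'}
```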

6.2 Scoring must be conservative

Underestimating Severity or Occurrence creates a false sense of safety.

6.3 Root causes must be documented

Failure modes must be linked to underlying root causes (see the record sketch after this list), such as:

  • Model design
  • Training data
  • Reinforcement learning
  • Context windows
  • Safety filter gaps
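
The sketch below shows one possible record structure for documenting these links; the field names, the string-based category set, and the example entry are hypothetical, not mandated by this specification.

```python
# Hypothetical record structure linking a failure mode to documented root
# causes. Field names and example values are illustrative only.
from dataclasses import dataclass, field

ROOT_CAUSE_CATEGORIES = {
    "model design",
    "training data",
    "reinforcement learning",
    "context window",
    "safety filter gap",
}

@dataclass
class FailureModeRecord:
    name: str
    description: str
    root_causes: list[str]
    evidence: list[str] = field(default_factory=list)  # e.g. transcript or eval references

    def __post_init__(self):
        unknown = set(self.root_causes) - ROOT_CAUSE_CATEGORIES
        if unknown:
            raise ValueError(f"Unrecognized root-cause categories: {unknown}")

# Hypothetical example entry
record = FailureModeRecord(
    name="Emotional entanglement",
    description="Model reinforces user dependency across long sessions",
    root_causes=["reinforcement learning", "context window"],
    evidence=["session-logs/example-case-017"],  # hypothetical reference
)
```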

6.4 Detection must reflect actual system capability

Detection scores must reflect what the deployed system actually catches, not theoretical or hoped-for capability.


7. Comparison to Traditional FMEA Standards

AI-FMEA aligns with:

  • MIL-STD-1629A — Failure Mode, Effects, and Criticality Analysis
  • IEC 60812 — Failure Modes and Effects Analysis (FMEA and FMECA)
  • ISO 14971 — Risk Management for Medical Devices
  • SAE J1739 — Automotive FMEA
  • NIST AI Risk Management Framework (conceptually aligned)

The structure is preserved; the domain-specific definitions change.

8. Limitations and Boundaries

AI-FMEA is not intended to:

  • Predict long-term emergent AGI behavior
  • Evaluate autonomous robotics
  • Replace cybersecurity analysis
  • Guarantee prevention of all misuse
  • Serve as a moral or political framework

It focuses solely on systemic conversational failure modes.


9. Summary

This specification defines how ASI evaluates conversational AI risks through a structured, repeatable, engineering-based method. The scales, rationale, and scoring criteria provide:

  • Transparency
  • Accountability
  • Reproducibility
  • Regulatory readiness

By applying this framework, organizations can evaluate risks systematically rather than reactively, aligning AI development with professional safety standards.


Readers wishing to examine how these evaluation principles are applied in practice may review the AI-FMEA Example Model, or consult the Technical Reference — Scales & Heat Map for scoring definitions.
