ISACA · AAIA · 102 concepts

AAIA Cheat Sheet

Quick reference for the ISACA Advanced in AI Audit exam.

Quick Navigation

Regulatory Frameworks — Know the Difference
AI Governance Structures
Data Governance for AI
AI/ML Lifecycle Audit Control Points
Model Drift — Types and Detection
Algorithmic Bias and Fairness Testing
AI-Specific Threats and Vulnerabilities
AI Change Management
AI Incident Response
MLOps and AI Operations
AI Audit Planning and Scoping
AI Audit Evidence and Testing
AI-Enabled Audit Analytics
Explainability and Transparency
Vendor and Third-Party AI Risk

Regulatory Frameworks — Know the Difference

NIST AI RMF 1.0 — Voluntary Guidance: Voluntary US framework with four core functions (Govern, Map, Measure, Manage) for managing AI risks; no enforcement mechanism, no certification, no financial penalties.
ISO/IEC 42001 — Certifiable Standard: International standard for AI Management Systems (AIMS) against which an organization can be formally certified; certification requires an audit by an independent third-party body, unlike the voluntary, non-certifiable NIST AI RMF.
EU AI Act — Legally Binding Regulation: EU law classifying AI systems into four tiers (unacceptable, high, limited, minimal) with mandatory conformity assessments and financial penalties; the only legally enforceable framework among the three.
EU AI Act — Unacceptable Risk (Banned): AI uses that are prohibited rather than regulated: real-time remote biometric identification in publicly accessible spaces for law enforcement (the only item with narrow exceptions), social scoring, and predictive policing based solely on profiling.
EU AI Act — High Risk (Conformity Required): High-risk AI systems (credit scoring, hiring, medical, law enforcement) are NOT banned but require mandatory conformity assessments, transparency, human oversight, and registration.
NIST AI RMF: Four Core Functions: Govern (establish policies), Map (identify AI risks in context), Measure (analyze and assess risks), Manage (prioritize and treat risks) — all four apply across the AI lifecycle.
OECD AI Principles: International non-binding guidelines for trustworthy AI (transparency, accountability, robustness, safety) adopted by 40+ countries; not a regulation and carries no enforcement.

AI Governance Structures

AI Center of Excellence (CoE): Organizational body that implements AI standards, best practices, and governance processes across the enterprise — distinct from an AI Ethics Board, which only advises.
AI Ethics Board: Advisory body that provides guidance on ethical AI considerations; it advises but does not implement standards — implementation is the CoE's responsibility.
RACI for AI Roles: Model Owners own production model outcomes; Data Scientists build models; MLOps Engineers operate pipelines; Risk Management assesses risks; Internal Audit provides independent assurance.
Separation of Duties for AI: The team that develops AI models must not also be the team that approves them for production; independent model validation and approval prevents conflicts of interest.
AI Acceptable Use Policy: Formal policy defining sanctioned vs. unsanctioned AI tools, data classification requirements for AI inputs, and prohibited AI use cases within the organization.
Right-to-Audit Clause: Contractual provision requiring AI vendors/third-party providers to permit the organization (or its auditors) to assess the vendor's AI controls, data handling, and SLA compliance.
Workforce Impact Assessment: Evaluation of how AI deployment affects job roles, required skills, and organizational change readiness — a governance control required before major AI implementations.

Data Governance for AI

Data Lineage: The record of HOW data was transformed — every processing step, modification, aggregation, and enrichment applied from raw ingestion through model training.
Data Provenance: The record of WHERE data originated — source, ownership, consent basis, and chain of custody from original collection to current use; distinct from lineage.
Data Minimization: Collect and retain only the minimum data necessary for the AI task; required by GDPR and EU AI Act to reduce privacy exposure and regulatory risk.
Consent Management: Tracking and enforcing the permissions under which personal data was collected; AI training data must respect the consent scope under which it was originally gathered.
De-identification Techniques: Anonymization (irreversible removal of PII), pseudonymization (replacement with tokens), and differential privacy (mathematical noise addition) for protecting training data.
Synthetic Data: Artificially generated data mimicking real data's statistical properties; benefits are privacy preservation and augmentation, but risks include quality gaps and non-representativeness.
Federated Learning: Training ML models across decentralized devices without sharing raw data; a privacy-preserving technique where only model updates (not data) are transmitted.
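
The difference between pseudonymization and differential privacy is easier to see in code. The following is a minimal sketch, not a production design: the key, the e-mail address, the epsilon value, and the function names are all hypothetical. Pseudonymization is shown as a keyed HMAC token, and differential privacy as Laplace noise added to a count (a counting query changes by at most 1 when one person is added or removed, so the noise scale is 1/epsilon).

```python
import hashlib
import hmac
import random


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed token (HMAC-SHA256).

    The same input and key always give the same token, so records can still
    be joined. Anyone holding the key can re-link tokens to candidate
    identifiers, which is why this is pseudonymization, not anonymization.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise scaled to sensitivity 1 / epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


key = b"hypothetical-demo-key"  # assumption: real keys live in a secrets manager
token = pseudonymize("alice@example.com", key)
noisy = dp_count(true_count=120, epsilon=0.5, rng=random.Random(7))
print(token[:16], round(noisy, 2))
```

For audit purposes the point is the contrast: the token is repeatable and reversible by lookup for whoever holds the key, while the noisy count deliberately differs from the true value on every release.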

AI/ML Lifecycle Audit Control Points

Stage 1: Business Case & Scoping: Audit verifies: clear problem definition, success criteria, ROI justification, risk assessment, and ethical impact review before any data collection begins.
Stage 2: Data Collection & Preparation: Audit verifies: data provenance documentation, consent validation, data quality controls, bias assessment in training data, and feature engineering review.
Stage 3: Model Development: Audit verifies: reproducibility of training runs, hyperparameter documentation, algorithm selection rationale, train/validation/test split integrity, and version control.
Stage 4: Model Validation (Pre-Deployment): Audit verifies: independent validation team separate from developers, fairness testing results, adversarial testing completion, and performance threshold sign-off.
Stage 5: Deployment: Audit verifies: approved change management workflow, canary or A/B deployment procedures, rollback capability, and human-in-the-loop thresholds for high-stakes decisions.
Stage 6: Monitoring (Post-Deployment): Audit verifies: drift detection alerts, bias recheck schedules, hallucination rate monitoring, performance dashboards, and escalation procedures for metric breaches.
Stage 7: Decommissioning: Audit verifies: formal model retirement approval, data disposal procedures, documentation archival, and replacement model transition plan.

Model Drift — Types and Detection

Data Drift: Changes in the statistical distribution of INPUT data over time; the model is unchanged but incoming data no longer matches training-time patterns — requires retraining on updated data.
Concept Drift: Changes in the RELATIONSHIP between inputs and the target variable; the real-world phenomenon has changed, making the model's learned patterns obsolete — may require model redesign.
Model Drift (Performance Degradation): General decline in model accuracy or reliability over time, which may be caused by data drift, concept drift, or model parameter decay; the umbrella term covering degradation from any of these causes.
Drift Detection Methods: Statistical tests (PSI — Population Stability Index, KL divergence, Kolmogorov-Smirnov test) that compare current input/output distributions against baseline training distributions.
Data Drift vs. Concept Drift Exam Trap: If input distributions are STABLE but predictions worsen, it is concept drift (the input-to-target relationship changed). If input distributions CHANGED, suspect data drift first.
Cross-Validation: Model evaluation technique used DURING DEVELOPMENT to estimate generalization performance; it is NOT a production monitoring tool and does not detect post-deployment drift.
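
As an illustration of the drift detection methods listed above, here is a minimal sketch of the Population Stability Index for one numeric feature. The binning scheme (ten equal-width bins over the baseline range), the epsilon floor, and the sample data are assumptions made for the example; the 0.25 cutoff in the last line is a commonly quoted rule of thumb, not a value taken from this cheat sheet.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the baseline range; values outside that range
    fall into the first or last bin. eps avoids log(0) for empty bins.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(sample), eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


baseline = [i / 100 for i in range(100)]       # stand-in for training-time inputs
stable = list(baseline)                        # production inputs, unchanged
shifted = [0.5 + i / 200 for i in range(100)]  # production inputs, shifted upward

print(psi(baseline, stable))          # identical distributions
print(psi(baseline, shifted) > 0.25)  # large shift exceeds the rule-of-thumb cutoff
```

A stable PSI on inputs combined with worsening predictions points toward concept drift, which an input-only statistic like this cannot see.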

Algorithmic Bias and Fairness Testing

Algorithmic Bias: Systematic errors in AI outputs producing unfair outcomes for specific groups, arising from biased training data, flawed feature selection, or inadequate testing across demographic segments.
Demographic Parity: Fairness metric requiring equal positive prediction rates across demographic groups (e.g., loan approval rates must be equal across races); tests for output-level discrimination.
Equal Opportunity: Fairness metric requiring equal true positive rates (sensitivity) across demographic groups; ensures the model is equally good at correctly identifying positive cases for all groups.
Disparate Impact Analysis: Statistical test measuring whether model outcomes disproportionately harm protected groups; the 80% rule (four-fifths rule) flags adverse impact when a group's rate is below 80% of the highest group's rate.
Bias Recheck Schedule: Post-deployment periodic reassessment of fairness metrics; required because models can develop bias over time as input data distributions shift even if they were fair at launch.
SHAP (SHapley Additive exPlanations): Explainability tool that quantifies each feature's contribution to an individual prediction using game theory; produces global and local model explanations.
LIME (Local Interpretable Model-agnostic Explanations): Explainability tool that creates simple local approximation models around individual predictions to explain why a specific output was produced.
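
The three fairness measures above (demographic parity, equal opportunity, and the four-fifths rule) can be computed from the same stratified output table. The sketch below assumes a hypothetical record layout of (group, actual label, predicted label) and invented counts for two groups; it is an illustration of the arithmetic, not a complete fairness test.

```python
from collections import defaultdict


def group_rates(records):
    """Per-group selection rate and true positive rate.

    Each record is (group, y_true, y_pred) with 0/1 labels.
    """
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "actual_pos": 0, "true_pos": 0})
    for group, y_true, y_pred in records:
        s = stats[group]
        s["n"] += 1
        s["pred_pos"] += y_pred
        s["actual_pos"] += y_true
        s["true_pos"] += y_true * y_pred
    return {
        g: {
            "selection_rate": s["pred_pos"] / s["n"],  # demographic parity input
            "tpr": s["true_pos"] / s["actual_pos"] if s["actual_pos"] else None,
        }
        for g, s in stats.items()
    }


def disparate_impact_ratios(rates):
    """Each group's selection rate divided by the highest group's rate."""
    top = max(r["selection_rate"] for r in rates.values())
    if top == 0:
        return {g: None for g in rates}  # nobody selected; ratio undefined
    return {g: r["selection_rate"] / top for g, r in rates.items()}


# Hypothetical model outputs: (group, actual outcome, predicted outcome)
records = (
    [("A", 1, 1)] * 4 + [("A", 0, 1)] * 2 + [("A", 1, 0)] * 1 + [("A", 0, 0)] * 3
    + [("B", 1, 1)] * 2 + [("B", 0, 1)] * 1 + [("B", 1, 0)] * 2 + [("B", 0, 0)] * 5
)

rates = group_rates(records)
ratios = disparate_impact_ratios(rates)
flagged = [g for g, r in ratios.items() if r is not None and r < 0.8]
print(rates)
print(ratios, flagged)
```

In this invented data, group B is selected at half the rate of group A and also has a lower true positive rate, so it fails both the four-fifths check and the equal opportunity comparison.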

AI-Specific Threats and Vulnerabilities

Data Poisoning: Attack on the TRAINING phase — adversaries inject malicious data into training datasets to corrupt learning, introduce backdoors, or create systematically biased outputs.
Prompt Injection: Attack on the INFERENCE phase — crafted inputs at query time override system prompts or safety guardrails; direct (user input) or indirect (malicious content in retrieved documents).
Model Inversion: Reconstructing sensitive training data by systematically querying the model; a privacy attack that can reveal PII used during training.
Model Theft (Model Extraction): Systematically querying a proprietary model to train a clone/surrogate; violates IP and enables offline adversarial attacks against the cloned model.
Adversarial Examples: Inputs with imperceptible perturbations that cause misclassification; used to evade detection models (malware classifiers, fraud detectors, image recognition systems).
Hallucination: Model-generated plausible but factually incorrect content; AI-specific risk with no traditional IT equivalent — auditors must verify hallucination rate measurement and acceptable thresholds.
MITRE ATLAS: Adversarial Threat Landscape for AI Systems — knowledge base of real-world adversary tactics targeting AI/ML, modeled after MITRE ATT&CK; primary AI threat taxonomy for audit use.
OWASP Top 10 for LLM Applications: Industry standard list of top security risks for LLM-based applications including prompt injection, training data poisoning, sensitive info disclosure, model DoS, and excessive agency.

AI Change Management

Model Versioning: Tracking all versions of a model (weights, code, training data, configuration) in a model registry so any version can be reproduced or rolled back on demand.
Model Registry: Centralized catalog storing approved model versions, metadata, performance metrics, and deployment status; the source of truth for production model inventory.
Change Approval Workflow for AI: Formal human authorization required before model updates are deployed to production; automated retraining without approval is a change management control deficiency.
Rollback Procedures: Documented and tested procedures to revert to a prior approved model version if a new deployment degrades performance, introduces bias, or causes incidents.
Impact Assessment for Retraining: Evaluating how retraining on new data changes model behavior, fairness metrics, and downstream business processes before approving the updated model for production.
AI Change Management vs. Traditional IT: AI requires more than code review — model retraining can change behavior without code changes, so traditional change processes focused only on code are insufficient for AI.
Model Card: Standardized documentation artifact for AI models covering intended use, performance metrics, training data characteristics, limitations, and ethical considerations; supports audit and transparency.
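
One way to picture why code-only change control misses AI changes: fingerprint every component of a model version and compare registry entries. The sketch below is a hypothetical, in-memory illustration; the field names, byte strings, and approver label are invented and do not correspond to any particular model registry product.

```python
import hashlib
import json


def fingerprint(artifact: bytes) -> str:
    """SHA-256 digest used to pin an artifact to a registry entry."""
    return hashlib.sha256(artifact).hexdigest()


def registry_entry(version, weights, code, training_data, config, approved_by=None):
    """Build a record that pins every component of one model version."""
    config_bytes = json.dumps(config, sort_keys=True).encode("utf-8")
    return {
        "version": version,
        "weights_sha256": fingerprint(weights),
        "code_sha256": fingerprint(code),
        "training_data_sha256": fingerprint(training_data),
        "config_sha256": fingerprint(config_bytes),
        "approved_by": approved_by,  # None means no human approval is recorded
    }


def changed_components(old, new):
    """List which pinned components differ between two versions."""
    return sorted(k for k in old if k.endswith("_sha256") and old[k] != new[k])


code = b"def train(): ..."          # stand-in for the training code
config = {"learning_rate": 0.01}    # stand-in for hyperparameters

v1 = registry_entry("1.0", b"weights-jan", code, b"data-jan", config, approved_by="model-risk")
v2 = registry_entry("1.1", b"weights-feb", code, b"data-feb", config)  # automated retrain

print(changed_components(v1, v2))
print(v2["approved_by"] is None)
```

Here the code and configuration digests are identical across versions, yet the model changed because the data and weights did, and version 1.1 carries no approval record.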

AI Incident Response

AI Incident vs. Traditional IT Incident: AI incidents include model drift below thresholds, bias emergence, hallucination spikes, adversarial compromise, and data poisoning — none of which have direct equivalents in traditional IT.
Automated Rollback Trigger: Pre-defined performance or safety metric threshold that automatically reverts a model to its prior version when breached, without requiring manual intervention.
Root Cause Analysis for AI Failures: Investigating AI incidents requires examining data distribution changes, training data integrity, model updates, and third-party API changes — broader scope than traditional RCA.
Bias Emergence Escalation: Defined procedure for escalating when post-deployment bias monitoring detects fairness metric degradation below acceptable thresholds; must include stakeholder notification.
Hallucination Rate Threshold: A pre-defined acceptable rate of hallucinations for a deployed model; auditors verify that thresholds are defined, monitored, and that breaches trigger escalation procedures.
Vendor AI Incident Notification: SLA requirement for third-party AI providers to notify the organization of incidents affecting their models or APIs within a defined timeframe; part of vendor risk management.
Supervision of AI Outputs: Human oversight mechanisms that review or override AI decisions whose confidence scores fall below defined thresholds, or that arise in high-stakes scenarios; auditors verify that escalation criteria and human-review procedures are documented and tested.
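
An automated rollback trigger reduces to comparing monitored metrics against pre-defined limits. The following sketch assumes hypothetical metric names, limits, and version labels; treating a missing metric as a breach is a design choice made for the example, not a stated requirement.

```python
def breached_thresholds(metrics, thresholds):
    """Return the names of metrics that violate their limits.

    thresholds maps a metric name to ("min", limit) or ("max", limit).
    A metric missing from `metrics` counts as a breach so that a silent
    monitoring gap cannot pass unnoticed.
    """
    breaches = []
    for name, (kind, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            breaches.append(name)
        elif kind == "min" and value < limit:
            breaches.append(name)
        elif kind == "max" and value > limit:
            breaches.append(name)
    return breaches


def decide(metrics, thresholds, current_version, prior_version):
    """Serve the prior approved version whenever any threshold is breached."""
    breaches = breached_thresholds(metrics, thresholds)
    if breaches:
        return {"action": "rollback", "serve": prior_version, "breaches": breaches}
    return {"action": "keep", "serve": current_version, "breaches": []}


# Hypothetical limits an organization might document for one model
THRESHOLDS = {
    "accuracy": ("min", 0.90),
    "hallucination_rate": ("max", 0.02),
    "disparate_impact_ratio": ("min", 0.80),
}

observed = {"accuracy": 0.93, "hallucination_rate": 0.05, "disparate_impact_ratio": 0.85}
decision = decide(observed, THRESHOLDS, current_version="2.1", prior_version="2.0")
print(decision)
```

The returned breach list doubles as evidence for the incident record: it shows which threshold fired and therefore which escalation path applies.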

MLOps and AI Operations

MLOps: Operationalizing ML models through automated pipelines for training, testing, deployment, and monitoring — analogous to DevOps but extended for AI-specific lifecycle stages.
Feature Engineering: Selecting, transforming, and creating input variables from raw data; feature selection can introduce bias and requires audit validation for relevance and fairness.
Feature Store: Centralized repository for storing, versioning, and sharing computed features across multiple models; enables reproducibility and auditable feature lineage.
Train / Validation / Test Split: Partitioning data into three non-overlapping sets: training (model learns), validation (hyperparameter tuning), and test (final unbiased performance estimate); splits must be documented.
A/B Deployment (Champion/Challenger): Running two model versions simultaneously to compare performance in production; the challenger replaces the champion only if it consistently outperforms it.
Canary Deployment: Gradually routing a small percentage of traffic to a new model version before full rollout; limits blast radius of a bad deployment and enables early performance monitoring.
AI Bill of Materials (AI BOM): Comprehensive inventory of all AI system components — data sources, libraries, frameworks, pre-trained models, and third-party dependencies — for supply chain risk management.
Model Performance Metrics: Accuracy, precision, recall, F1-score, AUC-ROC for classification; RMSE, MAE for regression; auditors verify that appropriate metrics are chosen and thresholds are defined.
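
Two of the items above lend themselves to a short worked example: a documented, non-overlapping train/validation/test split and the classification metrics computed from a confusion matrix. This is a minimal sketch with an assumed 70/15/15 split, a fixed seed, and invented labels.

```python
import random


def three_way_split(items, seed=42, train=0.70, validation=0.15):
    """Shuffle once with a fixed seed, then cut into three disjoint slices."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


train_ids, val_ids, test_ids = three_way_split(range(100))
print(len(train_ids), len(val_ids), len(test_ids))

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recording the seed and the split fractions is what makes the split auditable: a reviewer can rerun the function and confirm that no record appears in more than one set.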

AI Audit Planning and Scoping

AI Audit Scope Definition: Defining the system boundary, audit objectives, applicable control frameworks, stakeholders, and risk-based priorities for an AI audit — more complex than traditional IT scope due to data dependencies.
Risk-Based Scope Prioritization: High-risk AI systems (high-stakes decisions, regulated domains, large affected populations) receive more audit coverage proportional to their risk level.
Control Framework Selection for AI Audits: Choosing which frameworks to apply (NIST AI RMF, ISO/IEC 42001, EU AI Act, internal policy) based on the organization's jurisdiction, industry, and AI risk profile.
AI Audit Stakeholder Identification: Identifying model owners, data scientists, MLOps engineers, risk management, legal/compliance, and end-user representatives as audit stakeholders — broader than traditional IT audits.
Adversarial Testing — Pre-Deployment Requirement: Adversarial testing must be performed BEFORE deployment as a proactive security measure; it is not a reactive incident response activity and cannot substitute for post-deployment monitoring.
Model Validation — Pre-Deployment: Independent testing confirming a model meets performance thresholds, fairness criteria, and business requirements BEFORE production deployment; performed by a team separate from developers.
Confidence Score Thresholds: Pre-defined minimum confidence levels below which AI model decisions must be escalated to human review rather than automated; auditors verify thresholds are documented, enforced, and appropriate for the risk level of the decision.
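
A confidence score threshold is, in effect, a routing rule. The sketch below shows one possible shape for it; the decision types, threshold values, and returned field names are hypothetical and chosen only to show a stricter bar for a higher-risk decision.

```python
def route_decision(prediction, confidence, threshold):
    """Automate only when confidence meets the threshold; otherwise escalate."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= threshold:
        return {"outcome": prediction, "handled_by": "model"}
    return {
        "outcome": None,  # no automated outcome is issued
        "handled_by": "human_review",
        "model_suggestion": prediction,
    }


# Hypothetical thresholds: the riskier the decision, the higher the bar.
THRESHOLDS_BY_DECISION = {"marketing_offer": 0.60, "loan_denial": 0.95}

low_risk = route_decision("send_offer", 0.72, THRESHOLDS_BY_DECISION["marketing_offer"])
high_risk = route_decision("deny", 0.72, THRESHOLDS_BY_DECISION["loan_denial"])
print(low_risk)
print(high_risk)
```

The same confidence score is automated in one context and escalated in the other, which is the property an auditor would test when checking that thresholds are appropriate for the risk level of the decision.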

AI Audit Evidence and Testing

AI Audit Evidence — Sufficiency: Enough evidence must be gathered to support audit conclusions; AI-specific evidence includes training logs, fairness test results, drift alerts, and approval records.
AI Audit Evidence — Reliability: Evidence must come from trustworthy sources; model output logs generated by the system being audited must be verified for integrity and tamper-evidence.
AI Audit Evidence — Reproducibility: AI evidence must be independently verifiable; stochastic model behavior (different outputs for same input) must be documented as a limitation in the working papers.
Training Log Preservation: Model training logs must be retained and not overwritten; overwriting logs during retraining is a control deficiency in audit trail management, not an acceptable storage practice.
Disparate Impact Analysis as Audit Procedure: Analyzing model outputs stratified by demographic group using fairness metrics is the preferred audit procedure for bias testing — code review and developer interviews are insufficient substitutes.
Output Sampling for AI Audits: Risk-based sampling of model predictions to verify outputs meet quality, accuracy, and policy standards; stratified sampling across demographic groups is required for bias assessment.
Configuration and Code Review: Reviewing model configuration, hyperparameters, and code for control gaps; must be combined with output testing since biased or incorrect behavior can emerge from data, not code.
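
Stratified output sampling can be sketched in a few lines. The example assumes each logged prediction is a dictionary with a hypothetical "group" field and that the auditor wants a fixed number of items per group; a fixed seed keeps the draw reproducible for the working papers.

```python
import random
from collections import defaultdict


def stratified_sample(records, group_key, per_group, seed=2024):
    """Draw up to per_group records from every group, reproducibly."""
    by_group = defaultdict(list)
    for record in records:
        by_group[record[group_key]].append(record)
    rng = random.Random(seed)
    sample = {}
    for group in sorted(by_group):  # fixed order keeps the draw repeatable
        members = by_group[group]
        sample[group] = rng.sample(members, min(per_group, len(members)))
    return sample


# Hypothetical prediction log: group B is only 10% of the population
log = [{"id": i, "group": "A", "approved": i % 2} for i in range(90)]
log += [{"id": 90 + i, "group": "B", "approved": i % 3 == 0} for i in range(10)]

picked = stratified_sample(log, group_key="group", per_group=5)
print({g: [r["id"] for r in rows] for g, rows in picked.items()})
```

A simple random sample of ten items from this log could easily contain no group B records at all, which is why stratification matters for bias assessment.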

AI-Enabled Audit Analytics

Full-Population Testing with AI: Using AI analytics to test 100% of a population rather than sampling; enables detection of rare anomalies but does NOT eliminate the need for professional judgment.
Anomaly Detection in Audits: AI-powered identification of transactions, behaviors, or patterns that deviate significantly from baseline; flagged items still require auditor evaluation before classifying as findings.
NLP for Document Review: Natural language processing tools that automatically analyze contracts, policies, and evidence documents; reduces time for bulk document review while maintaining auditor oversight.
Auditor Independence When Using AI Tools: If the same AI system being audited also provides the audit analytics tools, there is a conflict of interest; auditors must use independent tools or document the impairment.
Auditing AI Systems vs. Using AI in Auditing: Domain 2 covers auditing AI systems (AI is the audit target). Domain 3 covers using AI as an audit tool (AI assists the auditor). These are opposite directions — confusing them leads to wrong answers.
Professional Skepticism with AI-Generated Findings: AI analytics flags anomalies; auditors must still evaluate whether flagged items represent actual control deficiencies — over-reliance on AI findings without judgment is a key exam trap.
Risk-Based vs. Random Sampling for AI Audits: Risk-based sampling is preferred over random sampling for AI audits because AI systems have non-uniform risk distributions — certain model behaviors (edge cases, demographic sub-groups) carry disproportionately higher risk.
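
Full-population anomaly detection can be illustrated with a simple robust outlier score. The sketch below uses the median-based modified z-score with a 3.5 cutoff, which is one common statistical choice and not a method prescribed by the exam material; the transaction amounts are invented. Flagged items are labeled as needing auditor review rather than as findings.

```python
from statistics import median


def flag_anomalies(values, cutoff=3.5):
    """Score every item in the population and flag robust outliers.

    Uses the modified z-score: 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation. Returns an empty list if MAD is zero.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    flagged = []
    for index, value in enumerate(values):
        score = 0.6745 * (value - med) / mad
        if abs(score) > cutoff:
            flagged.append({
                "index": index,
                "value": value,
                "score": round(score, 1),
                "status": "needs_auditor_review",  # a flag, not a finding
            })
    return flagged


# Hypothetical payment amounts; every item is tested, none are sampled out
amounts = [100, 102, 98, 101, 99, 103, 97, 100, 5000, 101]
flags = flag_anomalies(amounts)
print(flags)
```

A median-based score is used here because a single extreme value inflates the mean and standard deviation enough to hide itself from an ordinary z-score test on a population this small.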

Explainability and Transparency

Explainability (Technical): The ability to understand HOW a specific model makes individual decisions, using technical tools like SHAP and LIME to interpret model behavior at a granular level.
Transparency (Organizational): The broader practice of being OPEN about AI system usage — disclosing that AI is used, what data it processes, its limitations, and how decisions can be challenged or appealed.
Explainability vs. Transparency Exam Trap: Explainability is TECHNICAL (how the model decides). Transparency is ORGANIZATIONAL (disclosing AI practices). Auditors must verify both independently — they are separate controls.
High-Risk AI Explainability Requirement: The EU AI Act requires, and the voluntary NIST AI RMF recommends, that high-risk AI systems provide adequate explanations for decisions affecting individuals; auditors verify this control exists and functions.
Right to Explanation: Legal and ethical right of individuals affected by automated decisions to receive a meaningful explanation of how the decision was reached; auditors verify the mechanism exists.
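
The game-theory idea behind SHAP can be shown with a brute-force Shapley value calculation on a toy model. This is a sketch of the underlying definition only: it enumerates every feature subset, so it is practical for a handful of features, and it is not how the SHAP library computes its approximations. The scoring function, feature values, and all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley value of each feature for one prediction.

    A feature that is "absent" from a coalition takes its baseline value.
    """
    n = len(x)

    def value(present):
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        contributions.append(total)
    return contributions


def toy_score(features):
    """Hypothetical linear scoring model: 2*income + 3*tenure - debt."""
    income, tenure, debt = features
    return 2 * income + 3 * tenure - debt


applicant = [1.0, 2.0, 3.0]
reference = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, applicant, reference)
print([round(p, 6) for p in phi])
print(round(sum(phi), 6), toy_score(applicant) - toy_score(reference))
```

The contributions sum to the difference between the applicant's score and the baseline score, which is the property that makes this kind of local explanation usable as evidence for an individual decision.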

Vendor and Third-Party AI Risk

Third-Party AI Risk Assessment: Due diligence evaluating vendor AI models and APIs for security, bias, explainability, SLA compliance, and data handling before onboarding or renewal.
API Versioning Risk: Third-party APIs may update model versions without notice, changing behavior or performance of dependent systems; organizations must track API version changes as a change management control.
Vendor Lock-In Assessment: Evaluating dependency on a single AI vendor's proprietary model or API, assessing the feasibility and cost of switching providers if the vendor fails or underperforms.
SLA Monitoring for AI APIs: Ongoing tracking of third-party AI provider compliance with latency, uptime, accuracy, and data handling SLA terms; SLA breaches are audit findings requiring remediation.
Data Sharing Governance with Vendors: Policies controlling what data is sent to third-party AI APIs; includes classification of permissible data, de-identification requirements, and cross-border transfer restrictions.
