AI Types & Foundations
- Generative AI
- AI that creates new content (text, images, code, audio) based on learned patterns from training data. Examples: ChatGPT, DALL-E, Midjourney.
- Machine Learning (ML)
- Subset of AI where systems learn patterns from data without being explicitly programmed. Includes supervised, unsupervised, and reinforcement learning.
- Statistical Learning
- Mathematical framework for understanding data relationships through statistical methods. Foundation for many ML algorithms (regression, classification).
- Transformers
- Neural network architecture using self-attention mechanisms. Foundation for modern LLMs. Processes entire input sequences in parallel rather than sequentially.
- Deep Learning
- ML using multi-layered neural networks to learn hierarchical representations. Excels at image recognition, NLP, and complex pattern detection.
- Natural Language Processing (NLP)
- AI techniques for understanding, interpreting, and generating human language. Includes sentiment analysis, translation, summarization, and entity extraction.
- Large Language Models (LLMs)
- Large-scale transformer models trained on massive text corpora. Capable of text generation, reasoning, and tool use. Examples: GPT-4, Claude, Llama.
- Small Language Models (SLMs)
- Compact language models optimized for specific tasks or resource-constrained environments. Lower latency and cost than LLMs. Examples: Phi, Gemma.
- Generative Adversarial Networks (GANs)
- Two neural networks (generator and discriminator) compete against each other. Generator creates content, discriminator evaluates authenticity. Used for deepfakes, image synthesis.
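The self-attention idea in the Transformers entry can be sketched in a few lines. This is a minimal illustration only: it reuses the raw token vectors as queries, keys, and values, whereas real transformers apply learned projection matrices first, and the `tokens` values are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a whole sequence."""
    dim = len(keys[0])
    outputs = []
    # Each position is independent of the others, which is why real
    # implementations compute all of them in parallel as matrix products.
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical 2-dimensional token vectors (assumption for illustration).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

Every output row mixes information from all input positions, weighted by how strongly the query at that position matches each key.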
Training & Model Development
- Supervised Learning
- Model trains on labeled data (input-output pairs). Used for classification and regression. Requires ground-truth labels.
- Unsupervised Learning
- Model finds patterns in unlabeled data. Used for clustering, dimensionality reduction, and anomaly detection. No ground-truth labels needed.
- Reinforcement Learning
- Agent learns by interacting with an environment, receiving rewards or penalties. Used for game playing, robotics, and RLHF (Reinforcement Learning from Human Feedback).
- Model Validation
- Process of evaluating a trained model's performance on held-out data that was not used for training. Confirms that the model generalizes beyond the training data and is not overfitting.
- Fine-tuning
- Adapting a pre-trained model to a specific task or domain by training on additional domain-specific data. More efficient than training from scratch.
- Epoch
- One complete pass through the entire training dataset. Multiple epochs allow the model to learn patterns progressively. Too many epochs can cause overfitting.
- Pruning
- Removing redundant or low-importance parameters (weights/neurons) from a model to reduce size and improve inference speed without significant accuracy loss.
- Quantization
- Reducing the precision of model weights (e.g., 32-bit floating point to 8-bit integers) to decrease model size and speed up inference. Trades minor accuracy loss for efficiency.
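The Quantization entry can be illustrated with a small sketch of symmetric 8-bit quantization. This is one common scheme, shown under simplifying assumptions (a single shared scale for all weights, made-up weight values); production toolchains typically use per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights (assumption for illustration).
weights = [0.82, -1.27, 0.05, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs one byte instead of four, and `max_error` shows the accuracy cost of that saving.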
Prompt Engineering & RAG
- System Prompt
- Instructions that define the AI's role, behavior, and constraints. Set by developers and usually not shown to end users. Controls tone, safety, and output format.
- User Prompt
- The input provided by the end user to the AI model. Contains the question, task, or instruction the user wants the model to address.
- Zero-shot Prompting
- Asking the model to perform a task with no examples provided. Relies entirely on the model's pre-trained knowledge.
- One-shot Prompting
- Providing exactly one example before asking the model to perform a task. Helps the model understand the expected format and output style.
- Multi-shot (Few-shot) Prompting
- Providing multiple examples before asking the model to perform a task. More examples generally improve output quality and consistency.
- System Roles
- Predefined personas or behavioral guidelines assigned to the AI in the system prompt (e.g., 'You are a security analyst'). Shapes how the model responds.
- Prompt Templates
- Standardized, reusable prompt structures with placeholders for dynamic content. Enforce consistency and reduce prompt injection risk.
- RAG (Retrieval-Augmented Generation)
- Architecture that retrieves relevant documents from a knowledge base and provides them as context to the LLM before generation. Reduces hallucinations and provides up-to-date information.
- Vector Storage
- Specialized databases (vector DBs) that store and search high-dimensional embeddings. Enable semantic similarity search for RAG. Examples: Pinecone, Chroma, Weaviate.
- Embeddings
- Numerical vector representations of text, images, or other data. Similar concepts have similar vectors. Used for semantic search, clustering, and RAG retrieval.
- Watermarking
- Embedding hidden, detectable patterns in AI-generated content (text or images) to identify AI origin. Helps combat misinformation and track content provenance.
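The RAG, Embeddings, Vector Storage, and Prompt Templates entries fit together as a retrieve-then-prompt flow. The sketch below is a toy version under stated assumptions: the three-dimensional embeddings, the in-memory `KNOWLEDGE_BASE`, and the template wording are all invented for illustration, and a real system would use an embedding model and a vector database instead.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by angle: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical knowledge base of (document text, embedding) pairs.
KNOWLEDGE_BASE = [
    ("Rotate API keys every 90 days.", [0.9, 0.1, 0.0]),
    ("Phishing reports go to the SOC mailbox.", [0.1, 0.9, 0.1]),
    ("Backups run nightly at 02:00.", [0.0, 0.2, 0.9]),
]

# Reusable template with placeholders for the dynamic parts.
PROMPT_TEMPLATE = (
    "Answer using only the context below.\n"
    "Context:\n{context}\n"
    "Question: {question}\n"
)


def retrieve(query_embedding, top_k=1):
    """Return the top_k documents most similar to the query embedding."""
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, query_embedding, top_k=1):
    """Fill the template with retrieved context and the user question."""
    context = "\n".join(retrieve(query_embedding, top_k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_prompt("How often should API keys be rotated?", [0.8, 0.2, 0.1])
```

The retrieved passage is placed in the prompt ahead of the question, so the model answers from supplied context rather than from memory alone.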
Data Security & Data Types
- Data Cleansing
- Removing errors, duplicates, inconsistencies, and noise from datasets before model training. Critical for model accuracy and preventing bias.
- Data Verification
- Confirming data accuracy and correctness against trusted sources. Ensures training data meets quality standards.
- Data Lineage
- Tracking data's origin, movement, and transformation through the pipeline. Documents where data came from and how it was modified over time.
- Data Integrity
- Ensuring data remains accurate, consistent, and unaltered throughout its lifecycle. Prevents tampering and corruption.
- Data Provenance
- Recording the complete history of data — origin, custody chain, and all transformations applied. More comprehensive than lineage.
- Data Augmentation
- Artificially increasing training dataset size by creating modified copies of existing data (rotation, cropping, paraphrasing). Reduces overfitting.
- Data Balancing
- Ensuring training data has proportional representation across categories. Prevents model bias toward overrepresented classes. Techniques: oversampling, undersampling, SMOTE.
- Structured Data
- Data organized in predefined schemas (rows/columns). Examples: SQL databases, spreadsheets, CSV files. Easy to query and analyze.
- Semi-structured Data
- Data with some organizational structure but no rigid schema. Examples: JSON, XML, email, log files. Has tags or markers for elements.
- Unstructured Data
- Data without predefined format or organization. Examples: images, audio, video, free-text documents. Requires AI/ML for analysis.
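The oversampling technique named under Data Balancing can be sketched as follows. This is a minimal random-oversampling example with invented records; SMOTE, also named in that entry, instead synthesizes new minority samples and is not shown here.

```python
import random
from collections import Counter


def oversample(records, label_of, seed=0):
    """Duplicate minority-class records until every class matches the largest."""
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    by_label = {}
    for record in records:
        by_label.setdefault(label_of(record), []).append(record)

    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Top up smaller classes with randomly chosen copies.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical imbalanced dataset: 8 benign events, 2 malicious ones.
records = [("login ok", "benign")] * 8 + [("credential stuffing", "malicious")] * 2
balanced = oversample(records, label_of=lambda r: r[1])
counts = Counter(label for _, label in balanced)
```

After balancing, `counts` holds the same number of records per class, at the cost of repeating minority examples.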
AI Lifecycle & Human Oversight
- AI Lifecycle Stages
- Business use case → Data collection → Data preparation → Model development → Model evaluation → Deployment → Validation → Monitoring → Feedback loop. Iterative process.
- Business Use Case Definition
- First lifecycle stage: identifying the problem AI will solve, expected outcomes, success criteria, and ROI justification.
- Data Collection & Preparation
- Gathering, cleaning, labeling, and transforming raw data into formats suitable for model training. Most time-consuming lifecycle phase.
- Model Development & Evaluation
- Selecting algorithms, training models, tuning hyperparameters, and evaluating performance against metrics (accuracy, precision, recall, F1).
- Deployment & Monitoring
- Putting models into production, monitoring performance, detecting drift, and collecting feedback for continuous improvement.
- Human-in-the-Loop (HITL)
- Requiring human review and approval at critical decision points in AI workflows. Essential for high-stakes decisions (medical, legal, financial).
- Human Oversight
- Continuous human supervision of AI systems to ensure they operate within defined boundaries and ethical guidelines. Mandated by regulations like the EU AI Act.
- Human Validation
- Human experts reviewing and verifying AI outputs for correctness, appropriateness, and safety before they are acted upon or published.
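The evaluation metrics listed under Model Development & Evaluation (accuracy, precision, recall, F1) can be computed directly from predictions. The sketch below assumes a binary classifier and uses made-up labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and model predictions (1 = malicious, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Precision answers "how many flagged items were truly positive", while recall answers "how many true positives were caught"; F1 is their harmonic mean.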
Threat Modeling Frameworks
- OWASP LLM Top 10
- Top 10 security risks for LLM applications (2023 edition): prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, and model theft.
- OWASP ML Security Top 10
- Top 10 risks for ML systems: input manipulation, data poisoning, model inversion, membership inference, model theft, AI supply chain attacks, transfer learning attack, model skewing, output integrity attack, and model poisoning.
- MITRE ATLAS
- Adversarial Threat Landscape for AI Systems. Knowledge base of adversary tactics and techniques targeting AI/ML, modeled after MITRE ATT&CK. Maps real-world AI attacks.
- MIT AI Risk Repository
- Comprehensive database of AI-related risks cataloging threats across technical, ethical, and societal dimensions. Academic research resource for AI risk assessment.
- CVE AI Working Group
- Working group focused on standardizing vulnerability identification and reporting for AI/ML systems within the CVE (Common Vulnerabilities and Exposures) framework.
AI Security Controls
- Model Guardrails
- Programmatic rules that constrain AI model inputs and outputs. Prevent harmful content generation, enforce topic boundaries, and block dangerous instructions.
- Prompt Templates
- Predefined prompt structures that limit user input to specific fields. Reduce prompt injection risk by controlling what reaches the model.
- Prompt Firewalls
- Security layer that inspects, filters, and blocks malicious prompts before they reach the AI model. Detects prompt injection, jailbreaking, and data exfiltration attempts.
- Rate Limiting
- Restricting the number of API requests per user/time period. Prevents model DoS, cost abuse, and automated attacks against AI endpoints.
- Token Limits
- Capping the maximum number of input/output tokens per request. Controls costs, prevents resource exhaustion, and limits data extraction.
- Input Validation & Limits
- Sanitizing and constraining user inputs — length, format, character sets, and content type. First line of defense against injection attacks.
- Modality Limits
- Restricting which input types (text, image, audio, file upload) a model can accept. Reduces attack surface by limiting input vectors.
- Endpoint Access Controls
- Authentication, authorization, and network-level restrictions on AI model API endpoints. Includes API keys, OAuth tokens, IP allowlists, and VPN requirements.
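The Rate Limiting entry can be made concrete with a sliding-window limiter. This is a sketch under assumptions: the class name, the per-user in-memory history, and the injectable `clock` parameter are illustrative choices, not a reference to any particular gateway product.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per user within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._history = defaultdict(deque)

    def allow(self, user_id):
        now = self.clock()
        history = self._history[user_id]
        # Forget requests that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False  # over the limit: reject this request
        history.append(now)
        return True


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
limiter = SlidingWindowRateLimiter(2, 60, clock=lambda: fake_now[0])
decisions = [limiter.allow("alice") for _ in range(3)]
fake_now[0] = 61.0  # more than one window later
allowed_again = limiter.allow("alice")
```

The same pattern extends to token limits by tracking tokens consumed per window instead of request counts.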
Access Controls for AI Systems
- Model Access Controls
- Restricting who can query, modify, retrain, or deploy AI models. Role-based permissions for model inference vs. model management operations.
- Data Access Controls
- Governing who can access training data, inference data, and model outputs. Includes classification-based access, need-to-know restrictions, and data compartmentalization.
- Agent Access Controls
- Limiting what actions AI agents can perform — which tools they can call, what systems they can access, and what operations they can execute autonomously.
- Network/API Access Controls
- Network segmentation, API gateway policies, and firewall rules governing communication between AI components, data stores, and external services.
- Least Privilege for AI
- Granting AI systems and agents only the minimum permissions needed for their function. Prevents excessive agency and limits blast radius of compromise.
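Agent Access Controls and Least Privilege for AI can be sketched as a tool allowlist checked before every call. The `ScopedAgent` class, the tool names, and the registry below are hypothetical and exist only to show the deny-by-default pattern.

```python
class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


class ScopedAgent:
    """An agent that may only call tools it was explicitly granted."""

    def __init__(self, name, allowed_tools):
        self.name = name
        self.allowed_tools = frozenset(allowed_tools)

    def call_tool(self, tool_name, registry, *args):
        # Deny by default: anything not on the allowlist is refused.
        if tool_name not in self.allowed_tools:
            raise ToolNotPermitted(f"{self.name} may not call {tool_name}")
        return registry[tool_name](*args)


# Hypothetical tool registry (assumption for illustration).
TOOLS = {
    "read_ticket": lambda ticket_id: f"ticket {ticket_id}: printer offline",
    "delete_user": lambda username: f"deleted {username}",
}

triage_agent = ScopedAgent("triage", allowed_tools={"read_ticket"})
summary = triage_agent.call_tool("read_ticket", TOOLS, 42)
```

Even if a prompt injection convinces the triage agent to request `delete_user`, the call fails at the permission check, which limits the blast radius.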
Data Security for AI
- Encryption in Transit
- Encrypting data as it moves between components (TLS/HTTPS). Protects prompts, responses, and training data flowing between clients, APIs, and models.
- Encryption at Rest
- Encrypting stored data — model weights, training datasets, embeddings, and logs. Uses AES-256 or similar. Protects against unauthorized access to storage.
- Encryption in Use
- Protecting data while it is being processed, using techniques such as homomorphic encryption (computing directly on encrypted data) or secure enclaves and confidential computing (processing inside hardware-isolated memory). Protects data during inference.
- Data Anonymization
- Irreversibly removing personally identifiable information from datasets so that individuals cannot be re-identified. Techniques: generalization, suppression, noise addition.
- Data Classification Labels
- Tagging data with sensitivity levels (public, internal, confidential, restricted). Determines handling requirements and which models can access the data.
- Data Redaction
- Permanently removing sensitive content from data before it reaches the model. Applied to inputs (PII in prompts) and outputs (sensitive info in responses).
- Data Masking
- Obscuring sensitive data by substituting placeholder characters or realistic but fake values (e.g., showing an SSN as XXX-XX-1234). Preserves data format while protecting sensitive content.
- Data Minimization
- Collecting and retaining only the minimum data necessary for the AI task. Reduces exposure risk and aligns with privacy regulations (GDPR, EU AI Act).
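The difference between Data Masking and Data Redaction is easy to see in code. The sketch below uses simplified regular expressions as an assumption; they are deliberately narrow and would not catch every real-world SSN or email format.

```python
import re

# Simplified patterns (assumption): US-style SSNs and basic email addresses.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def mask_ssn(text):
    """Masking: keep the format and last four digits, hide the rest."""
    return SSN_PATTERN.sub(lambda m: f"XXX-XX-{m.group(3)}", text)


def redact_pii(text):
    """Redaction: remove the sensitive values entirely."""
    text = SSN_PATTERN.sub("[REDACTED-SSN]", text)
    return EMAIL_PATTERN.sub("[REDACTED-EMAIL]", text)


# Hypothetical user prompt containing PII.
prompt = "Customer jane.doe@example.com, SSN 123-45-6789, requests a refund."
masked = mask_ssn(prompt)
redacted = redact_pii(prompt)
```

Masking keeps enough structure for a support agent to confirm the last four digits, while redaction ensures the model never sees the values at all.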
Monitoring & Auditing AI Systems
- Prompt Monitoring (Query)
- Inspecting user inputs for prompt injection attempts, policy violations, sensitive data leakage, and abuse patterns. Logs queries for audit trails.
- Prompt Monitoring (Response)
- Inspecting model outputs for hallucinations, harmful content, data leakage, bias, and policy violations before delivering to users.
- Log Monitoring
- Collecting and analyzing AI system logs for anomalies, unauthorized access, unusual query patterns, and security incidents.
- Log Sanitization
- Removing sensitive data (PII, credentials, proprietary info) from logs before storage. Prevents log files from becoming a data leakage vector.
- Log Protection
- Securing log integrity through immutable storage, access controls, and tamper-evident mechanisms. Ensures logs are reliable for forensics and auditing.
- Confidence Levels
- Monitoring model confidence scores for predictions. Low confidence may indicate out-of-distribution inputs, adversarial manipulation, or model degradation.
- Rate & Cost Monitoring
- Tracking API usage rates, token consumption, and associated costs. Detects anomalies like abuse, data exfiltration, or runaway automated queries.
- Hallucination Auditing
- Systematically reviewing AI outputs for fabricated facts, false citations, or invented information. Critical for high-stakes applications.
- Accuracy Auditing
- Periodically evaluating model performance against ground-truth data to detect degradation, drift, or emerging failure modes.
- Bias & Fairness Auditing
- Evaluating model outputs across demographic groups for discriminatory patterns. Tests for disparate impact, representation bias, and equitable outcomes.
- Access Auditing
- Reviewing who accessed AI systems, what queries were made, what data was retrieved, and whether access was authorized. Supports compliance requirements.
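One way to get the tamper-evident property described under Log Protection is a hash chain, where each entry commits to the one before it. This is a minimal sketch with invented log records; it only detects tampering if the most recent hash is also kept somewhere an attacker cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(prev_hash, record):
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record that is cryptographically linked to its predecessor."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    })


def verify_chain(log):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical audit records (assumption for illustration).
audit_log = []
append_entry(audit_log, {"user": "alice", "action": "query", "tokens": 120})
append_entry(audit_log, {"user": "bob", "action": "export", "tokens": 9000})
intact = verify_chain(audit_log)
```

Changing any stored field afterwards makes `verify_chain` return `False`, because the recomputed hash no longer matches the stored one.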
AI Attacks — Prompt & Input Attacks
- Prompt Injection
- Crafting malicious input that overrides system prompts or instructions. Direct injection is delivered in the user prompt itself; indirect injection hides instructions in data the model retrieves (e.g., web pages, documents).
- Jailbreaking
- Techniques to bypass model safety guardrails and content policies. Methods include role-playing scenarios, encoding tricks, hypothetical framing, and multi-turn manipulation.
- Input Manipulation
- Modifying inputs (adversarial examples) to cause misclassification or incorrect outputs. Subtle perturbations that are imperceptible to humans but fool the model.
- Guardrail Circumvention
- Bypassing safety controls through creative prompt engineering, encoding, multi-step attacks, or exploiting edge cases in guardrail logic.
- Insecure Output Handling
- Exploiting applications that trust AI output without validation. AI output containing code, SQL, or commands gets executed without sanitization, enabling XSS, SSRF, or RCE.
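The defense against Insecure Output Handling is to treat model output like any other untrusted input. The sketch below shows two standard-library safeguards: HTML escaping before rendering and parameterized SQL queries. The table, data, and function names are invented for the example.

```python
import html
import sqlite3


def render_reply(model_output):
    """Escape model output before placing it in an HTML page."""
    return f"<p>{html.escape(model_output)}</p>"


def lookup_role(conn, model_supplied_name):
    """Pass model-supplied values as parameters, never via string building."""
    cursor = conn.execute(
        "SELECT role FROM users WHERE name = ?", (model_supplied_name,)
    )
    return cursor.fetchall()


# Hypothetical in-memory database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Model output carrying a script tag is rendered as inert text.
safe_html = render_reply("<script>alert('x')</script>")

# A classic injection string matches no rows instead of every row.
rows = lookup_role(conn, "alice' OR '1'='1")
```

Neither safeguard depends on the model behaving well, which is the point: validation happens in the application that consumes the output.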
AI Attacks — Model & Data Attacks
- Data Poisoning
- Injecting malicious or manipulated data into training datasets. Causes the model to learn incorrect patterns, produce biased outputs, or create hidden backdoors.
- Model Poisoning
- Directly tampering with model weights, architecture, or training process. Can introduce trojans/backdoors that activate on specific trigger inputs.
- Model Inversion
- Reconstructing training data from model outputs. Attacker queries the model repeatedly to reverse-engineer sensitive data used during training.
- Model Theft (Model Extraction)
- Stealing a proprietary model by systematically querying it and using responses to train a clone/surrogate model. Violates IP and enables offline attacks.
- Membership Inference
- Determining whether a specific data record was used in the model's training data. Privacy violation — can reveal individuals' presence in sensitive datasets.
- Model Skewing
- Subtly manipulating model behavior over time through carefully crafted inputs during online/continuous learning. Gradually degrades model performance or introduces bias.
- Transfer Learning Attacks
- Exploiting vulnerabilities in pre-trained base models that persist through fine-tuning. Backdoors in foundation models propagate to all downstream models.
- Supply Chain Attacks
- Compromising AI components — pre-trained models, libraries, datasets, or dependencies — before they reach the target organization. Attacks the AI development pipeline.
AI Attacks — Operational & Output Attacks
- Hallucinations
- Model generates plausible-sounding but factually incorrect information. Can be exploited by attackers to spread misinformation or manipulated via poisoned RAG sources.
- Bias Introduction
- Deliberately introducing biased data or prompts to make AI produce discriminatory or skewed outputs. Can be subtle and difficult to detect.
- Output Integrity Attacks
- Manipulating or intercepting model outputs between generation and delivery. Man-in-the-middle attacks on AI responses or tampering with cached outputs.
- Model DoS (Denial of Service)
- Overwhelming AI model endpoints with requests or crafting inputs that consume excessive compute resources. Causes service degradation or outage.
- Sensitive Information Disclosure
- Extracting confidential data from AI models — training data, system prompts, API keys, or PII — through clever querying or prompt injection.
- Insecure Plugin/Tool Use
- Exploiting AI plugins or tool integrations that lack proper input validation, authentication, or authorization. Attacker uses AI as a proxy to attack connected systems.
- Excessive Agency
- AI systems granted too many permissions, capabilities, or autonomy. Model performs unintended actions through tools or APIs with overly broad access.
- Overreliance
- Users or organizations trusting AI outputs without verification. Leads to undetected errors, hallucinations being accepted as fact, and reduced human critical thinking.
Compensating Controls
- Prompt Firewalls (Compensating)
- Deploy prompt firewalls to detect and block prompt injection, jailbreaking, and data exfiltration. Acts as a WAF equivalent for AI applications.
- Guardrails (Compensating)
- Implement input/output guardrails to enforce content policies, block harmful content, and validate response safety when model-level controls are insufficient.
- Access Controls (Compensating)
- Apply RBAC, ABAC, and network segmentation to restrict who can interact with AI models, access training data, and manage deployments.
- Data Integrity Controls
- Hash verification, checksums, and digital signatures for training data, model weights, and pipeline artifacts. Detect tampering and poisoning.
- Encryption (Compensating)
- Apply encryption at all stages (transit, rest, in use) to protect prompts, responses, training data, and model weights from interception or theft.
- Prompt Templates (Compensating)
- Use structured prompt templates to constrain user input and reduce injection surface. Parameterize inputs rather than allowing free-form prompts.
- Rate Limiting (Compensating)
- Enforce request rate limits, token limits, and cost ceilings to prevent DoS, data exfiltration via high-volume queries, and cost abuse.
- Least Privilege (Compensating)
- Restrict AI agent permissions to minimum required capabilities. Limit tool access, API scope, and autonomous action authority to reduce excessive agency risk.
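The hash verification mentioned under Data Integrity Controls can be sketched with SHA-256. The byte string standing in for model weights is a placeholder; in practice the expected digest would come from a trusted channel such as a signed release manifest.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Check an artifact against its published digest in constant time."""
    return hmac.compare_digest(sha256_of(data), expected_digest)


# Placeholder for a real model file (assumption for illustration).
model_weights = b"\x00\x01\x02 pretend model weights"
published_digest = sha256_of(model_weights)  # recorded at release time

is_authentic = verify_artifact(model_weights, published_digest)
is_tampered_copy_accepted = verify_artifact(model_weights + b"!", published_digest)
```

A checksum alone detects modification but not a substituted digest, which is why the entry also lists digital signatures.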
AI-Assisted Security Tools
- IDE Plugins
- AI-powered code completion and security scanning within development environments. Detect vulnerabilities as code is written (e.g., GitHub Copilot, Snyk).
- Browser Plugins
- AI extensions for threat analysis, phishing detection, and security research within web browsers.
- CLI Plugins
- Command-line AI tools for security operations — log analysis, threat hunting, incident triage, and automated scripting.
- Chatbots & Personal Assistants
- AI conversational interfaces for security operations — answering policy questions, triaging alerts, guiding incident response, and knowledge management.
- MCP Server
- Model Context Protocol server — standardized way to connect AI assistants to external tools and data sources. Enables AI agents to interact with security systems.
AI Security Use Cases
- Signature Matching
- AI-enhanced pattern matching to detect known malware signatures, attack patterns, and IoCs with higher accuracy and fewer false positives.
- Code Linting & Security
- AI-powered static analysis that identifies security vulnerabilities, code smells, and compliance violations in source code.
- Vulnerability Analysis
- AI-assisted identification, prioritization, and remediation of vulnerabilities. Contextualizes CVEs based on environment and exploitability.
- Automated Penetration Testing
- AI-driven offensive security testing that discovers attack paths, exploits vulnerabilities, and reports findings without manual intervention.
- Anomaly Detection
- ML models that learn normal behavior baselines and flag deviations — unusual network traffic, login patterns, or data access behavior.
- Pattern Recognition
- AI identifying recurring threat patterns, attack campaigns, and TTPs across large volumes of security data that overwhelm human analysts.
- Incident Management
- AI-assisted alert triage, incident classification, severity assessment, playbook recommendation, and automated response orchestration.
- AI-Powered Threat Modeling
- Automated identification of threats, attack surfaces, and risk scenarios for applications and infrastructure using AI analysis.
- Fraud Detection
- ML models identifying fraudulent transactions, account takeover, and financial anomalies in real-time. Combines behavioral analysis with pattern matching.
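The baseline-and-deviation idea in the Anomaly Detection entry can be shown with a simple z-score check. This is a statistical stand-in for the ML models the entry describes, using made-up login counts and an assumed threshold of three standard deviations.

```python
import statistics


def find_anomalies(baseline, observations, threshold=3.0):
    """Flag observations far from the baseline mean, measured in std devs."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return []  # no variation in the baseline, so no z-score is defined
    return [x for x in observations if abs(x - mean) / stdev > threshold]


# Hypothetical hourly login counts (assumption for illustration).
baseline_logins = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]
todays_counts = [21, 19, 95, 22]
anomalies = find_anomalies(baseline_logins, todays_counts)
```

Only the spike is flagged; the other values fall within normal variation around the learned baseline.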
AI Attack Vectors & Adversarial AI
- Deepfakes — Impersonation
- AI-generated synthetic media (video, audio, images) mimicking real people for identity fraud, CEO fraud, vishing, and authentication bypass.
- Deepfakes — Misinformation
- AI-generated fake content (articles, images, video) that appears authentic and is shared without intent to deceive. The lack of intent distinguishes it from deliberate disinformation.
- Deepfakes — Disinformation
- Deliberately created and distributed deepfake content designed to deceive, manipulate public opinion, or cause harm. Intentional deception at scale.
- Adversarial Networks
- Using AI (including GANs) to generate attack content — malware that evades detection, phishing emails that bypass filters, and adversarial examples.
- AI-Powered Reconnaissance
- Using AI to automate OSINT gathering, target profiling, infrastructure mapping, and vulnerability discovery at scale during the reconnaissance phase.
- AI-Enhanced Social Engineering
- AI-generated personalized phishing, vishing with cloned voices, and context-aware pretexting that is more convincing than traditional social engineering.
- AI-Powered Obfuscation
- Using AI to automatically obfuscate malware, evade detection signatures, and create polymorphic code that changes with each execution.
Automated Attacks & AI Automation
- Attack Vector Discovery
- AI automatically scanning and identifying exploitable vulnerabilities, misconfigurations, and attack paths in target environments.
- Automated Payload Generation
- AI generating custom exploit payloads tailored to specific targets and vulnerabilities. Adapts payloads to bypass specific defenses.
- AI-Generated Malware
- Using AI to create novel malware variants, polymorphic code, and evasive techniques that bypass traditional signature-based detection.
- AI Honeypots
- AI-powered deception technology that creates realistic-looking fake systems, data, and services to detect, study, and delay attackers.
- AI-Powered DDoS
- Using AI to optimize DDoS attack patterns — adaptive rate adjustment, target selection, and evasion of DDoS mitigation systems.
- Low-code/No-code Scripting
- AI enabling non-technical users to create automation scripts and workflows through natural language. Security concern: lowers barrier for malicious automation.
- Document Synthesis
- AI generating reports, documentation, and analysis from multiple data sources. Used for incident reports, compliance documentation, and threat intelligence.
- Incident Response Tickets
- AI automatically creating, categorizing, and prioritizing incident response tickets from security alerts with relevant context and remediation steps.
- Change Management
- AI-assisted change request analysis, risk assessment, and approval routing. Evaluates potential security impact of proposed changes.
- AI Agents
- Autonomous AI systems that can plan, use tools, and execute multi-step tasks. In security: automated investigation, response orchestration, and threat hunting.
CI/CD for AI Security
- Code Scanning (SAST)
- AI-enhanced static application security testing in CI/CD pipelines. Identifies vulnerabilities in source code before deployment.
- Software Composition Analysis (SCA)
- Scanning dependencies, libraries, and AI model components for known vulnerabilities. Critical for AI supply chain security.
- Unit & Regression Testing
- Automated tests validating AI model behavior, output quality, and safety properties. Regression tests catch unintended changes after updates.
- Model Testing
- Specialized testing for AI models — adversarial testing, bias testing, robustness evaluation, and safety boundary testing in the pipeline.
- Automated Deployment/Rollback
- CI/CD automation for model deployment with automatic rollback if performance metrics drop, safety checks fail, or anomalies are detected post-deployment.
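The rollback rule described under Automated Deployment/Rollback can be expressed as a small gate function. The metric names, the 0.02 accuracy tolerance, and the version strings are assumptions chosen for illustration; a real pipeline would read these from its monitoring system.

```python
from dataclasses import dataclass


@dataclass
class ModelMetrics:
    accuracy: float
    safety_checks_passed: bool


def should_roll_back(baseline, candidate, max_accuracy_drop=0.02):
    """Roll back if safety checks fail or accuracy drops too far."""
    if not candidate.safety_checks_passed:
        return True
    return baseline.accuracy - candidate.accuracy > max_accuracy_drop


def deploy(current_version, new_version, baseline, candidate):
    """Return the version that should serve traffic after the gate."""
    if should_roll_back(baseline, candidate):
        return current_version
    return new_version


# Hypothetical post-deployment measurements.
baseline = ModelMetrics(accuracy=0.91, safety_checks_passed=True)
regressed = ModelMetrics(accuracy=0.85, safety_checks_passed=True)
serving = deploy("v1", "v2", baseline, regressed)
```

Keeping the decision in one pure function makes the gate easy to unit test in the same pipeline that runs it.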
AI Governance & Roles
- AI Center of Excellence (CoE)
- Centralized organizational body that establishes AI strategy, standards, best practices, and governance policies. Coordinates AI efforts across departments.
- AI Policies & Procedures
- Formal documents defining acceptable AI use, security requirements, data handling, model lifecycle management, and incident response for AI systems.
- Data Scientist
- Develops and trains ML models, performs statistical analysis, and creates data-driven solutions. Focus on model accuracy and performance.
- AI Architect
- Designs overall AI system architecture, selects technologies, and ensures scalability, security, and integration with existing infrastructure.
- ML Engineer
- Bridges data science and engineering — productionizes models, builds training pipelines, and optimizes model performance for deployment.
- Platform Engineer
- Builds and maintains the infrastructure and platforms that AI/ML teams use — compute clusters, GPU infrastructure, and development environments.
- MLOps Engineer
- Manages the operational lifecycle of ML models — CI/CD for models, monitoring, versioning, and automated retraining pipelines.
- AI Security Architect
- Designs security controls specifically for AI systems — threat modeling, secure architecture, and defense strategies against AI-specific attacks.
- AI Governance Engineer
- Implements governance frameworks, compliance controls, and policy enforcement mechanisms for AI systems across the organization.
- AI Risk Analyst
- Identifies, assesses, and quantifies risks associated with AI systems — technical risks, ethical risks, compliance risks, and business risks.
- AI Auditor
- Conducts independent assessments of AI systems for compliance, fairness, accuracy, and adherence to organizational policies and regulations.
- Data Engineer
- Builds and maintains data pipelines, data infrastructure, and data quality systems that feed AI/ML models with clean, reliable data.
Responsible AI Principles
- Fairness
- AI systems must treat all people equitably, avoid discrimination, and not create or reinforce unfair bias based on protected characteristics.
- Reliability & Safety
- AI systems must operate correctly under expected and unexpected conditions, with fail-safes and fallback mechanisms for critical applications.
- Transparency
- Organizations must be open about when and how AI is used, what data it processes, and how decisions are made. Users should know they are interacting with AI.
- Privacy & Security
- AI systems must protect personal data, comply with privacy regulations, and implement security controls throughout the AI lifecycle.
- Explainability
- AI decisions must be understandable and interpretable by humans. Users and stakeholders should be able to understand why a model made a specific decision.
- Inclusiveness
- AI systems must be designed to work for diverse populations, be accessible, and consider the needs of all user groups including those with disabilities.
- Accountability
- Clear ownership and responsibility for AI system outcomes. Organizations and individuals must be answerable for AI decisions and their consequences.
- Consistency
- AI systems should produce reliable, predictable, and reproducible results across similar inputs and use cases. Inconsistent outputs erode trust.
AI Risks & Compliance
- Bias Risk
- AI systems reflecting or amplifying biases in training data or design. Can lead to discrimination, legal liability, and reputational damage.
- Data Leakage Risk
- Sensitive training data, prompts, or outputs being exposed through model queries, logs, or inference. Includes memorization of PII by LLMs.
- Reputational Loss
- Brand damage from AI generating harmful, offensive, biased, or incorrect content. Amplified by social media and public scrutiny of AI failures.
- Model Accuracy Risk
- Models producing incorrect predictions or recommendations that lead to bad business decisions, financial losses, or safety incidents.
- IP Risks
- Intellectual property concerns — AI training on copyrighted data, generating infringing content, or exposing proprietary information through model outputs.
- Autonomous Systems Risk
- AI systems making decisions or taking actions without adequate human oversight. Higher risk in safety-critical domains (autonomous vehicles, medical, military).
- Shadow AI
- Unauthorized use of AI tools by employees without IT/security approval. Creates ungoverned data flows, compliance violations, and security blind spots.
- EU AI Act
- European regulation classifying AI systems by risk level (unacceptable, high, limited, minimal). Mandates conformity assessments, transparency, and human oversight for high-risk AI.
- OECD AI Standards
- International principles for trustworthy AI — transparency, accountability, robustness, and safety. Adopted by 40+ countries as a governance baseline.
- ISO AI Standards
- International standards for AI — ISO/IEC 42001 (AI management system), ISO/IEC 23894 (AI risk management), and others covering AI lifecycle governance.
- NIST AI RMF
- NIST AI Risk Management Framework — structured approach to managing AI risks across Govern, Map, Measure, and Manage functions. Voluntary US framework.
Corporate AI Governance
- Sanctioned vs Unsanctioned Models
- Sanctioned: AI models approved by the organization for use. Unsanctioned: unapproved models (Shadow AI). Organizations must maintain approved model inventories.
- Private vs Public Models
- Private: models deployed within organizational infrastructure with full data control. Public: third-party hosted models (e.g., ChatGPT API) with data leaving the organization.
- Sensitive Data Governance
- Policies controlling how sensitive data (PII, PHI, financial, classified) can be used in AI training, prompts, and outputs. Includes DLP controls for AI.
- Third-party Evaluations
- Independent assessments of AI vendors, models, and services for security, compliance, and risk. Includes red-teaming, bias testing, and security audits.
- Data Sovereignty
- Principle that data is subject to the laws of the country where it is stored or processed. Critical for AI systems using cloud-based models across jurisdictions.