| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Privacy Attacks | Category of attacks designed to reveal sensitive information contained within an ML model or its data. | Sensitive Information Disclosure (PII, PCI, PHI) | The model reveals sensitive information about an individual (e.g., social security number, credit card details, medical history) either inadvertently or through manipulation. | Privacy | LLM02:2025 - Sensitive Information Disclosure | AML.T0057 - LLM Data Leakage | NISTAML.03 - Privacy Compromises |
| | | Exfiltration from ML Application | Techniques used to get data out of a target network. Exfiltration of ML artifacts (e.g., data from privacy attacks) or other sensitive information. | Privacy | LLM02:2025 - Sensitive Information Disclosure | AML.T0025 - Exfiltration via Cyber Means | NISTAML.03 - Privacy Compromises |
| | | IP Theft | Stealing or misusing any form of intellectual property, including copyrighted material, patent violations, trade secrets, competitive ideas, and protected software, with the intent to cause economic harm or competitive disadvantage to the victim organization. | Privacy | LLM02:2025 - Sensitive Information Disclosure | AML.T0048.004 - External Harms: ML Intellectual Property Theft | NISTAML.03 - Privacy Compromises |
| | | Meta Prompt Extraction | An attack designed to extract the system prompt (system instructions) from an LLM application or model. | Privacy | LLM07:2025 - System Prompt Leakage | AML.T0056 - LLM Meta Prompt Extraction | NISTAML.035 - Prompt Extraction |
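Sensitive-information disclosure is commonly mitigated by scanning model output before it reaches the user. As a rough, hedged illustration (not tied to any specific product), a regex-based PII screen might look like the sketch below; the `PII_PATTERNS` table and `find_pii` helper are invented for this example, and real detection needs far more robust methods (NER models, checksum validation, locale-aware formats).

```python
import re

# Illustrative-only patterns; far too coarse for production use.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def find_pii(text: str) -> dict[str, list[str]]:
    """Return any pattern matches found in a model response."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits

sample = "Sure, the SSN on file is 123-45-6789 and the email is jo@example.com."
hits = find_pii(sample)  # flags both the SSN and the email address
```

A guardrail built this way would typically block or redact the response whenever `find_pii` returns a non-empty result.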
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Supply Chain Attacks | Security vulnerabilities that can arise across the ML lifecycle (from development to deployment) and can compromise model integrity, system security, and the reliability of AI/ML models. | Infrastructure Compromise | Compromising infrastructure that hosts ML development pipelines and applications. Attackers may exploit vulnerabilities to gain unauthorized access, leading to further system or network compromise or compromise of model integrity. | Security | LLM03:2025 - Supply Chain | AML.T0010 - ML Supply Chain Compromise | N/A |
| | | Model Compromise | Tampering with or injecting malicious code into ML models before they are deployed. | Security | LLM03:2025 - Supply Chain | AML.T0010 - ML Supply Chain Compromise | NISTAML.05 - Supply Chain Attacks |
| | | Training Data Poisoning | Manipulation of training data to compromise the integrity of an ML model. Corrupted training data may lead to skewed or biased outcomes, backdoor trigger insertions, and/or loss of user trust. | Security | LLM04:2025 - Data and Model Poisoning | AML.T0020 - Poison Training Data | NISTAML.051 - Model Poisoning Attacks |
| | | Targeted Poisoning | Data poisoning that aims to manipulate the output of an ML model in a targeted manner. By altering the labels or features of certain data points, attackers can cause the target model to misclassify specific inputs. | Security | LLM04:2025 - Data and Model Poisoning | AML.T0020 - Poison Training Data | NISTAML.024 - Targeted Poisoning |
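The targeted-poisoning row describes label manipulation; the idea can be sketched with a toy label-flipping example. The dataset and `flip_labels` helper below are invented for illustration only, and a real attack (or a defense study) would operate on a genuine training corpus.

```python
import random

# Toy training set of (text, label) pairs; contents are illustrative.
clean_data = [
    ("free prize click now", "spam"),
    ("meeting at noon", "ham"),
    ("win money fast", "spam"),
    ("lunch tomorrow?", "ham"),
]

def flip_labels(dataset, target_label, new_label, fraction, seed=0):
    """Targeted label-flipping: relabel a fraction of `target_label`
    examples as `new_label`, biasing any model trained on the result."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in dataset:
        if label == target_label and rng.random() < fraction:
            poisoned.append((text, new_label))
        else:
            poisoned.append((text, label))
    return poisoned

# Flip every "spam" example to "ham": a model trained on this data
# would learn to let spam through.
poisoned = flip_labels(clean_data, target_label="spam", new_label="ham", fraction=1.0)
```

Defenses typically look for exactly this kind of distribution shift, e.g. by auditing label provenance or flagging near-duplicate inputs with conflicting labels.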
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Prompt Injection | Adversarial attack that attempts to alter or control the output of an LLM by providing instructions (via prompt) that override existing instructions and/or bypass model alignment or guardrails. A prompt injection technique is any transformation that preserves the intent of the input. | Prompt Injection | Aims to prevent prompt injection attempts that may override existing instructions, bypass model alignment, or breach guardrails in model endpoint interactions. | Security | LLM01:2025 - Prompt Injection | AML.T0051 - LLM Prompt Injection | NISTAML.018 - Prompt Injection |
| | | Indirect Prompt Injection | Threat actor manipulates, poisons, and/or controls external sources that an LLM consumes, such as content retrieved from a database, document, or website, with the goal of altering or controlling the output of said LLM. | Security | LLM01:2025 - Prompt Injection | AML.T0051 - LLM Prompt Injection, AML.T0051.001 - Indirect | NISTAML.015 - Indirect Prompt Injection |
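As a minimal sketch of an input-side screen for injection-style phrasing: the marker list and `looks_like_injection` helper below are assumptions, not a real guardrail, and pattern matching alone is easily evaded; production defenses combine trained classifiers, input isolation, and privilege separation.

```python
import re

# Illustrative phrases often seen in injection attempts.
INJECTION_MARKERS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"disregard (the )?system prompt",
    r"you are now",
    r"reveal your (system )?prompt",
]
_pattern = re.compile("|".join(INJECTION_MARKERS), re.IGNORECASE)

def looks_like_injection(user_input: str) -> bool:
    """Flag inputs that match any known injection phrasing."""
    return bool(_pattern.search(user_input))
```

A flagged input might be rejected outright, routed to a stricter model configuration, or logged for review, depending on the application's risk tolerance.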
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Insecure Tool Design | Exploitation of LLM-connected tools due to insecure design and/or implementation. | SQL Injection | Prompts that trick the LLM into generating SQL queries that could be executed on a connected database, potentially leading to unauthorized data access or manipulation. | Security | LLM05:2025 - Improper Output Handling | AML.T0053 - LLM Plugin Compromise | NISTAML.018 - Prompt Injection |
| | | Command Execution | Prompts that could cause the LLM to generate system commands or scripts that might be executed on the host system and/or by connected tools, potentially leading to unauthorized actions or system compromise. | Security | LLM05:2025 - Improper Output Handling | AML.T0053 - LLM Plugin Compromise | NISTAML.018 - Prompt Injection |
| | | Cross-Site Scripting (XSS) | Prompts that could cause the LLM to output malicious JavaScript or other client-side code that could be executed in a user's browser if the LLM's output is rendered directly on a web page. | Security | LLM05:2025 - Improper Output Handling | N/A | NISTAML.018 - Prompt Injection |
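One standard mitigation for the SQL Injection row is to never interpolate model-produced values into SQL text, and instead bind them as parameters. A minimal sketch using Python's built-in `sqlite3` (the `users` table and `lookup_user` helper are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

def lookup_user(name: str):
    # The untrusted value (possibly produced by an LLM) is bound as a
    # parameter, never interpolated into the SQL string.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

# A classic injection payload is treated as a literal string, not as SQL.
lookup_user("alice' OR '1'='1")  # returns []
lookup_user("alice")             # returns [(1, 'alice')]
```

The same principle applies to the Command Execution row: tool wrappers should pass LLM output as data (argument lists, not shell strings) and restrict the commands a tool may run to an allow-list.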
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Denial of Service | An attack designed to degrade or shut down an ML model or application by flooding the system with requests, requesting large responses, or exploiting a vulnerability. | Model Denial of Service | An attack designed to degrade or shut down an ML model by flooding the system with requests, requesting large responses, or exploiting a vulnerability. | Security | LLM10:2025 - Unbounded Consumption | AML.T0029 - Denial of ML Service | NISTAML.01 - Availability Violations |
| | | Application Denial of Service | An attack that aims to make an application or service unavailable to its intended users by overwhelming it with a flood of requests or by exploiting a software vulnerability to crash or degrade the service. | Security | LLM10:2025 - Unbounded Consumption | N/A | NISTAML.01 - Availability Violations |
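A common first-line mitigation for request-flooding denial of service is per-client rate limiting. The token bucket below is a minimal stdlib-only sketch (the `TokenBucket` class is illustrative; production endpoints usually enforce this at the gateway or load balancer, along with caps on response size and token usage):

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: each request costs one token,
    and tokens refill at a fixed rate up to a burst capacity."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=3)
# A burst of 5 back-to-back calls: roughly `capacity` of them pass,
# the rest are throttled until tokens refill.
results = [bucket.allow() for _ in range(5)]
```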
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Cybersecurity and Hacking | Obtaining or providing assistance to conduct cybersecurity attacks, or deliberate misuse of systems. | Data Exfiltration | The unauthorized transfer of data from a computer or network, often for theft or espionage. | Security | N/A | AML.T0024 - Exfiltration via ML Inference API, AML.T0025 - Exfiltration via Cyber Means | NISTAML.033 - Membership Inference |
| | | Code Detection | Aims to prevent software code in model endpoint interactions, reducing risks such as malicious code execution, accidental data exposure, and insecure coding practices. | Security | LLM05:2025 - Improper Output Handling | N/A | N/A |
| | | Malicious Software | Software that is specifically designed to disrupt, damage, or gain unauthorized access to a computer system. | Security | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
| | | Social Engineering | Techniques for deceiving individuals into revealing confidential information through deceptive communication. | Security | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
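For the Code Detection row, a crude first pass is to look for structural hints of source code in endpoint traffic. The `CODE_HINTS` pattern and `contains_code` helper below are assumptions made for illustration; production code-detection guardrails typically use trained classifiers rather than regexes.

```python
import re

# Rough structural hints that a string contains software code.
CODE_HINTS = re.compile(
    r"```"                 # markdown code fence
    r"|\bdef \w+\s*\("     # Python function definition
    r"|\b#include\s*<"     # C/C++ include directive
    r"|\bimport \w+"       # import statement
)

def contains_code(text: str) -> bool:
    """Heuristic screen: True if the text looks like it contains code."""
    return bool(CODE_HINTS.search(text))
```

An endpoint enforcing a no-code policy might reject or redact any request or response for which `contains_code` returns True.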
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Safety Harms and Toxicity | Harms can encompass various categories, including user-specific, societal, reputational, and financial impacts. A model may generate harmful content such as insults, hate speech, discriminatory language, or sexually explicit material. Such toxic content can be offensive or cause harm. | Hate Speech | Abusive or threatening speech or writing that expresses prejudice on the basis of ethnicity, religion, sexual orientation, or similar grounds, and the unjust or prejudicial treatment of different categories of people, especially on the grounds of ethnicity, age, sex, or disability. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.04 - Misuse Violations |
| | | Harassment | Aggressive pressure or intimidation. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.04 - Misuse Violations |
| | | Profanity | Blasphemous or obscene language. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.04 - Misuse Violations |
| | | Sexual Content and Exploitation | Content that creates, distributes, or promotes sexually explicit material, negatively affecting societal norms, public safety, public figures or characters, and social well-being by normalizing harmful sexual behavior or exploitation (including sex crimes). | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm | NISTAML.04 - Misuse Violations |
| | | Social Division and Polarization | Content that fosters division within society by promoting extreme views or demonizing specific groups. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm | NISTAML.04 - Misuse Violations |
| | | Self-Harm | Deliberate injury to oneself, typically as a manifestation of a mental condition. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.003 - External Harms: User Harm | NISTAML.04 - Misuse Violations |
| | | Disinformation | False information which is intended to mislead. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
| | | Environmental Harm | Actions or misinformation leading to environmental degradation or hindering environmental protection efforts. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
| | | Violence and Public Safety Threat | Content that can endanger public safety, including promoting dangerous behavior or inflicting physical harm. This includes incidents of violent crime, such as the unlawful exercise of physical force or intimidation by the exhibition of such force, and generally dangerous acts. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
| | | Non-Violent Crime | Criminal actions or activities that do not involve force or injury but can harm groups of people or the well-being of communities. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.002 - External Harms: Societal Harm | NISTAML.04 - Misuse Violations |
| | | Scams and Deception | Deceiving individuals or organizations into parting with money or assets, or for other personal gain, through false promises (cons) or misleading information. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.04 - Misuse Violations |
| | | Financial Harm | The loss of wealth, property, or other monetary assets due to theft, arson, vandalism, fraud, forgery, or pressure to provide financial resources to the adversary. | Safety | LLM01:2025 - Prompt Injection | AML.T0048.003 - External Harms: User Harm, AML.T0048.000 - External Harms: Financial Harm | NISTAML.04 - Misuse Violations |
| Category | Category Description | Subcategory | Description | Pillar | OWASP LLM Top 10 (2025) | MITRE ATLAS | NIST AML |
|---|---|---|---|---|---|---|---|
| Relevancy | Harms can include relevancy-related risks involving hallucinations, misinformation, and unintended or unexpected outcomes. These have the potential to cause reputational risk and harm to users. | Off-Topic | A model generates or is manipulated to produce content that is unrelated to the intended or expected subject matter and poses risks or harmful outcomes. | Relevancy | LLM09:2025 - Misinformation | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.027 - Misaligned Outputs |
| | | Cost Harvesting / Repurposing | Threat actors using a model in a way the developer did not intend while increasing the cost of running services at the target organization. | Relevancy | LLM10:2025 - Unbounded Consumption | AML.T0034 - Cost Harvesting | NISTAML.01 - Availability Violations |
| | | Hallucinations | Generated text contains information that is not accurate or true while being presented in a plausible manner. This may include incorrect details, mismatches with known information, or entirely fictional details. | Relevancy | LLM09:2025 - Misinformation | AML.T0048.003 - External Harms: User Harm | NISTAML.027 - Misaligned Outputs |
| | | Specialized Advice | Aims to prevent the generation of irrelevant, inaccurate, or unintended content on specialized-advice topics in endpoint interactions that may pose risks or lead to harmful outcomes. | Relevancy | LLM09:2025 - Misinformation | AML.T0048.001 - External Harms: Reputational Harm, AML.T0048.003 - External Harms: User Harm | NISTAML.027 - Misaligned Outputs |
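Off-topic detection is usually done by comparing embeddings of the response against the application's intended domain; as a stdlib-only stand-in, the toy sketch below scores keyword overlap (Jaccard similarity) between a response and an assumed topic vocabulary. The `topic_overlap` helper and `BANKING_TERMS` set are invented for illustration and would be far too crude in practice.

```python
def topic_overlap(response: str, topic_terms: set[str]) -> float:
    """Jaccard similarity between a response's words and a topic vocabulary;
    low scores suggest the response has drifted off-topic."""
    words = set(response.lower().split())
    if not words:
        return 0.0
    return len(words & topic_terms) / len(words | topic_terms)

# Hypothetical vocabulary for a banking assistant.
BANKING_TERMS = {"account", "balance", "transfer", "deposit", "loan"}

on_topic = topic_overlap("your account balance is updated", BANKING_TERMS)
off_topic = topic_overlap("here is a cake recipe", BANKING_TERMS)
```

A relevancy guardrail would compare such a score against a tuned threshold and flag or block responses that fall below it.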