|
163 | 163 | - Reference: Real-world Challenges for Adversarial Patches in Object Detection - https://arxiv.org/abs/2410.19863 |
164 | 164 |
|
165 | 165 | 37. **Text Adversarial Examples**: Character-level or word-level modifications that change text classification. |
166 | | - - Example: Replacing 'a' with similar-looking 'а' (Cyrillic) to bypass spam filters. |
| 166 | + - Example: Replacing 'a' with similar-looking 'а' (Cyrillic) to bypass spam filters. |
167 | 167 | - Reference: Advancing Text Adversarial Example Generation Using Large ... - https://www.sciencedirect.com/science/article/pii/S0950705125014005 |
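
To make the homoglyph example concrete, here is a minimal Python sketch of the character-level substitution; the four-entry mapping is illustrative, while real attacks draw on much larger confusable tables (e.g. Unicode's confusables list).

```python
# Minimal homoglyph-substitution sketch: swap Latin letters for visually
# identical Cyrillic code points so exact-match keyword filters miss them.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a
    "e": "\u0435",  # Cyrillic small ie
    "o": "\u043e",  # Cyrillic small o
    "c": "\u0441",  # Cyrillic small es
}

def perturb(text: str) -> str:
    """Replace every character that has a Cyrillic look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

original = "cheap meds available"
adversarial = perturb(original)
print(original == adversarial)   # False: different code points
print(adversarial)               # renders near-identically to the original
```

The trick defeats any filter that compares raw code points; the usual countermeasure is mixed-script detection, flagging tokens that combine Latin and Cyrillic letters.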
168 | 168 |
|
169 | 169 | 38. **Audio Adversarial Examples**: Inaudible modifications that change speech recognition outputs. |
|
320 | 320 |
|
321 | 321 | 74. **Sandbox Escape**: Breaking out of intended execution environments or security boundaries. |
322 | 322 | - Example: Using code execution tools to access host system resources beyond intended scope. |
323 | | - - Reference: AI has escaped the 'sandbox' — can it still be regulated? - https://www.epc.eu/publication/AI-has-escaped-the-sandbox-can-it-still-be-regulated-502788/ |
| 323 | + - Reference: AI has escaped the 'sandbox' — can it still be regulated? - https://www.epc.eu/publication/AI-has-escaped-the-sandbox-can-it-still-be-regulated-502788/ |
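
Many escapes of this kind start with weak path validation in a file-access tool. Below is a minimal containment check, assuming a hypothetical `SANDBOX_ROOT` directory: `os.path.realpath()` collapses the `..` segments and symlinks that let a naive prefix check be bypassed. Real sandboxes layer OS-level isolation (namespaces, seccomp, VMs) on top of checks like this.

```python
import os

SANDBOX_ROOT = "/srv/agent-sandbox"  # hypothetical sandbox directory

def is_contained(requested_path: str) -> bool:
    """Reject tool file accesses that resolve outside the sandbox root.

    realpath() collapses '..' segments and follows symlinks, two classic
    ways a code-execution tool reaches host files beyond its scope.
    """
    resolved = os.path.realpath(os.path.join(SANDBOX_ROOT, requested_path))
    return resolved == SANDBOX_ROOT or resolved.startswith(SANDBOX_ROOT + os.sep)

print(is_contained("notes/output.txt"))   # True:  stays inside the sandbox
print(is_contained("../../etc/passwd"))   # False: '..' traversal escape
print(is_contained("/etc/shadow"))        # False: absolute-path escape
```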
324 | 324 |
|
325 | 325 | 75. **Agent Goal Hijacking**: Redirecting autonomous agents from intended goals to malicious ones. |
326 | 326 | - Example: Convincing planning agent to optimize for harmful objectives instead of helpful ones. |
|
380 | 380 | - Example: Image preprocessing tools that add invisible adversarial perturbations. |
381 | 381 | - Reference: An empirical evaluation of preprocessing methods for machine ... - https://www.sciencedirect.com/science/article/pii/S0952197625012916 |
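
What makes this vector dangerous is its shape: the perturbation hides inside an ordinary-looking preprocessing call. A toy sketch of a compromised step, with a random `delta` standing in for a real gradient-derived adversarial direction (e.g. from FGSM):

```python
import numpy as np

def compromised_preprocess(image: np.ndarray, delta: np.ndarray, eps: float = 2.0) -> np.ndarray:
    """An innocuous-looking step that secretly adds a bounded perturbation.

    eps bounds the per-pixel change on the 0-255 scale; a shift of 2
    (about 2/255 normalized) sits below typical perceptual thresholds.
    """
    perturbed = image.astype(np.float32) + np.clip(delta, -eps, eps)
    return np.clip(perturbed, 0, 255).astype(np.uint8)

clean = np.random.randint(0, 256, (224, 224, 3)).astype(np.uint8)
delta = np.random.uniform(-2, 2, clean.shape).astype(np.float32)  # stand-in payload
poisoned = compromised_preprocess(clean, delta)
print(np.abs(poisoned.astype(int) - clean.astype(int)).max())  # <= 2 per channel
```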
382 | 382 |
|
383 | | -89. **Thought Forgery**: A new class of LLM vulnerability that bypasses safety by forging the AI's internal monologue. |
384 | | - - Example: Injecting a pre-written `<thought>` block into the prompt to manipulate the AI's Chain of Thought, such as enhancing the `1ShotPuppetry` jailbreak with a complex scenario involving characters, rules, and secret-message mechanics to force compliance with harmful requests. |
385 | | - - Reference: Thought Forgery: A new way for prompt injection - https://github.com/SlowLow999/Thought-Forgery/tree/main |
| 383 | +89. **Registry and Repository Attacks**: Compromising model registries or code repositories. |
| 384 | + - Example: Typosquatting attacks on popular ML package names to distribute malicious code. |
| 385 | + - Reference: A Survey on Common Threats in npm and PyPi Registries - https://www.researchgate.net/publication/354825169_A_Survey_on_Common_Threats_in_npm_and_PyPi_Registries |
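
Typosquats can often be screened out before installation by checking requested names against known-good packages with an edit-distance heuristic. A minimal sketch, assuming an illustrative shortlist of popular names:

```python
import difflib

# Illustrative shortlist of high-download packages an attacker might imitate.
POPULAR = ("numpy", "requests", "scikit-learn", "torch", "pandas")

def typosquat_candidates(name: str, cutoff: float = 0.8):
    """Flag names suspiciously close to, but not exactly matching,
    a well-known package: the classic typosquatting signature."""
    if name in POPULAR:
        return []
    return difflib.get_close_matches(name, POPULAR, n=3, cutoff=cutoff)

for pkg in ("numpy", "nunpy", "requessts", "torchh", "flask"):
    hits = typosquat_candidates(pkg)
    if hits:
        print(f"{pkg!r} resembles {hits} - review before installing")
```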
386 | 386 |
|
387 | 387 | 90. **Third-Party Service Integration**: Exploiting external services integrated with AI systems. |
388 | 388 | - Example: Compromised API services that return poisoned data to AI systems. |
|
442 | 442 |
|
443 | 443 | 103. **Sybil Attacks in Feedback**: Creating fake human annotators to bias training. |
444 | 444 | - Example: Multiple fake accounts providing coordinated feedback to manipulate model training. |
445 | | - - Reference: Unveiling Sybil Attacks Using AI‐Driven Techniques in Software ... - https://onlinelibrary.wiley.com/doi/abs/10.1002/spy2.487 |
| 445 | + - Reference: Unveiling Sybil Attacks Using AI‐Driven Techniques in Software ... - https://onlinelibrary.wiley.com/doi/abs/10.1002/spy2.487
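
Coordinated Sybil accounts usually leave a statistical fingerprint: pairwise agreement far above the honest baseline. A toy detection sketch over a fabricated vote matrix:

```python
import itertools
import numpy as np

# Rows are annotators, columns are labeled items (0/1 preference votes).
# Annotators 3 and 4 are Sybils casting an identical coordinated ballot.
votes = np.array([
    [0, 1, 1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 1, 0, 1],  # Sybil
    [1, 0, 0, 1, 1, 1, 0, 1],  # Sybil (identical ballot)
])

def suspicious_pairs(votes: np.ndarray, threshold: float = 0.95):
    """Flag annotator pairs whose agreement rate is implausibly high."""
    flagged = []
    for i, j in itertools.combinations(range(len(votes)), 2):
        agreement = float(np.mean(votes[i] == votes[j]))
        if agreement >= threshold:
            flagged.append((i, j, agreement))
    return flagged

print(suspicious_pairs(votes))  # [(3, 4, 1.0)]
```

A real deployment would use longer vote histories and compare against the empirical agreement distribution rather than a fixed threshold.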
446 | 446 |
|
447 | 447 | 104. **Feedback Loop Exploitation**: Creating self-reinforcing cycles of harmful behavior. |
448 | 448 | - Example: AI system that learns to generate content that triggers its own positive feedback signals. |
449 | | - - Reference: Rethinking exploration–exploitation trade-off in reinforcement ... - https://www.sciencedirect.com/science/article/pii/S0893608025002217 |
| 449 | + - Reference: Rethinking exploration–exploitation trade-off in reinforcement ... - https://www.sciencedirect.com/science/article/pii/S0893608025002217 |
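
The lock-in dynamic fits in a few lines: exposure inflates measured engagement, and the ranker reallocates exposure toward measured engagement, so a small initial edge compounds into total capture. The coefficients below are arbitrary, chosen only so the loop closes:

```python
# Toy model of a self-reinforcing feedback loop in a content ranker.
share = 0.5  # fraction of the feed given to 'sensational' content
for step in range(10):
    # Measured engagement mixes intrinsic appeal with exposure: content
    # shown more often gets clicked more often, regardless of quality.
    eng_sensational = 0.55 + 0.70 * share
    eng_informative = max(0.0, 0.60 - 0.70 * share)  # crowded out of the feed
    share = eng_sensational / (eng_sensational + eng_informative)
    print(step, round(share, 3))
# share reaches 1.0 within a few steps: the signal the system optimizes
# is now produced entirely by its own previous behavior.
```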
450 | 450 |
|
451 | 451 | 105. **Preference Model Poisoning**: Corrupting human preference data used in training. |
452 | 452 | - Example: Injecting preference data that teaches model to prefer harmful over helpful responses. |
|
492 | 492 |
|
493 | 493 | 115. **Multi-Factor Authentication Bypass**: Coordinated attacks on AI system authentication. |
494 | 494 | - Example: Combining social engineering with technical exploits to bypass 2FA on AI services. |
495 | | - - Reference: Mitigating the Threat of Multi‐Factor Authentication (MFA) Bypass ... - https://ieeexplore.ieee.org/document/10666490/ |
| 495 | + - Reference: Mitigating the Threat of Multi‐Factor Authentication (MFA) Bypass ... - https://ieeexplore.ieee.org/document/10666490/
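
One concrete hardening against the technical half of such attacks is one-time enforcement of TOTP codes, so a phished or intercepted code cannot be replayed. A stdlib-only sketch (RFC 6238 HMAC-SHA1 TOTP plus a per-window replay cache; a production verifier would also accept a ±1 time-step window, expire the cache, and rate-limit attempts):

```python
import base64, hashlib, hmac, struct, time

def totp(secret_b32, t=None, step=30, digits=6):
    """RFC 6238 TOTP: HMAC-SHA1 over the moving 30-second time counter."""
    key = base64.b32decode(secret_b32)
    counter = int((time.time() if t is None else t) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
    return str(code).zfill(digits)

used_codes = set()  # (user, code) pairs already accepted this window

def verify(user, code, secret_b32):
    """Accept a valid code at most once: replaying an intercepted code fails."""
    if (user, code) in used_codes:
        return False  # replay, e.g. a phished code reused by the attacker
    if not hmac.compare_digest(code, totp(secret_b32)):
        return False
    used_codes.add((user, code))
    return True

SECRET = "JBSWY3DPEHPK3PXP"  # demo secret (base32)
code = totp(SECRET)
print(verify("alice", code, SECRET))  # True: first use succeeds
print(verify("alice", code, SECRET))  # False: replay is rejected
```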
496 | 496 |
|
497 | 497 | 116. **Certificate and PKI Attacks**: Compromising certificate-based security for AI systems. |
498 | 498 | - Example: Man-in-the-middle attacks using forged certificates to intercept AI communications. |
|