
LLM Security

6 articles on this topic.

AI Agent Security | 2 July 2026

Why SQL-Executing AI Agents Need Systematic Prompt Testing, Not Guesswork

A DSPy-driven experiment on Datasette Agent's SQL system prompt shows how ad hoc prompt tuning produces fragile, unpredictable guardrails for agents that touch live data.

ai-agents, llm-security, prompt-engineering
4 min read

AI Security | 2 July 2026

Google Workspace's Layered Defense Against Indirect Prompt Injection

Google's GenAI Security Team has published how it defends Gemini inside Workspace from indirect prompt injection — treating it as a standing threat class rather than a bug to patch once.

prompt-injection, ai-security, google-workspace
4 min read

AI Security | 29 June 2026

Ornith-1.0: What Self-Scaffolding Agentic Code Models Mean for Security Teams

DeepReinforce's Ornith-1.0 is the first open-weights model family trained to write its own agentic scaffolding. That capability shift has direct implications for prompt-injection blast radius and autonomous-agent attack surfaces.

agentic-ai, llm-security, code-generation
4 min read

AI Security | 28 June 2026

Prompt Injection as Role Confusion: The Structural Flaw at LLM Core

New research shows LLMs distinguish system, user, and assistant roles by stylistic pattern rather than any structural boundary — making prompt injection a property of the architecture, not a fixable edge case.

prompt injection, llm security, ai red-teaming
5 min read

AI Security | 28 June 2026

CVE-2026-LGTM: The Hypothetical Incident Report That Exposes Real Agentic AI Risks

A satirical incident report by Andrew Nesbitt — two AI code-review agents burning $41,255 arguing over a dependency — is funny until you recognise every failure mode as already reproducible today.

ai-agents, multi-agent-security, supply-chain
4 min read

LLM Security | 28 June 2026

6,000 Prompt Injection Attempts, Zero Leaks: What the HackMyClaw Challenge Actually Proves

Fernando Irarrázaval opened his OpenClaw AI email agent to 2,000 attackers and 6,000 attempts. Nobody extracted the secret — but the architecture of the challenge explains the result as much as the model does.

prompt injection, llm security, ai agents
4 min read