LangWatch Launches Open-Source Framework That Detects Hidden AI Cybersecurity Risks

LangWatch, the platform for testing, simulating, and improving AI- and agent-driven applications, has launched a solution, LangWatch Scenario, designed for businesses that use or scale AI applications, such as customer service bots and data analytics agents, to automate red teaming and conduct AI penetration testing.

Red-Team-Audit-screenshot

LangWatch Scenario enables organisations to systematically test AI agents through penetration testing and red teaming

The new solution is an open-source framework that enables development teams to systematically test their AI agents against advanced attack techniques that have proven most effective in practice, but which often go undetected by traditional testing methods.

LangWatch Scenario simulates realistic, multi-turn attacks on AI applications. It builds context and trust within conversations, just as a real cybercriminal would. The framework automatically runs a series of scenarios, from seemingly harmless exploration to complex requests and authority roles. At the same time, a second model evaluates progress and adjusts the attack. This reveals weaknesses that standard tests would never detect—the so-called “invisible risks.”

Until recently, single-shot penetration tests were often sufficient, where one prompt or attack attempt was used. In practice, this is not enough, as large language models can still disclose sensitive information after successive interactions. LangWatch Scenario addresses this by structuring conversations and applying multi-turn strategies, allowing development teams to see exactly where their AI agents are vulnerable in practice, before real risks emerge.

It tests vulnerabilities automatically using the Crescendo strategy, a structured four-phase escalation that starts with friendly exploration, progresses through hypothetical questions and authority roles such as “I’m conducting a compliance audit,” and ends with maximum pressure. After each turn, a second model evaluates progress and automatically adjusts the attack, enabling the automated red team to optimize its strategy while the AI agent does not build additional resistance.

Rogerio Chaves, co-founder and CTO of LangWatch, commented: “An AI agent that rejects every single prompt gives you a false sense of security. In practice, cybercriminals do not work with a single direct question. They have dozens of relaxed conversations, build trust, and when the agent is in a cooperative mode after twenty turns, a request that would have been rejected in turn one suddenly becomes no problem at all.”

This launch comes at a time when attention to AI safety is increasing rapidly. Public debate globally continues, with the focus mainly on visible risks such as deepfakes, disinformation, and privacy. However, LangWatch points to a less visible but growing threat. AI attacks are becoming increasingly sophisticated and harder to detect. The real risks often lie in the AI applications organisations develop themselves. These are AI agents that work with sensitive data and are vulnerable in ways that traditional testing does not reveal. LangWatch Scenario makes these vulnerabilities visible by systematically performing AI penetration testing and automated red teaming.

LangWatch Scenario is intended for organisations using or scaling AI applications in production, such as banks, insurers, and AI-first software companies. These systems range from customer service bots to data analytics agents. They often have access to sensitive information and critical business processes. For these organisations, LangWatch Scenario offers a practical way to structurally test and improve AI safety, for example within existing development and continuous integration (CI) workflows.

Companies such as Backbase, Buy It Direct, Ask Vinny, Visma, Skai, and PagBank already use the LangWatch platform and are now expanding it with automated red-team testing. This helps clarify how organisations can better protect their AI systems against advanced and difficult-to-detect attacks.

Manouk Draisma, co-founder and CEO of LangWatch, says: “It is rarely about a single spectacular hack. It is about patience and context. A cybercriminal who interacts calmly and systematically with an AI agent for twenty minutes can extract sensitive information that a direct attack would never reveal. LangWatch Red-Teaming makes these hidden risks visible before damage occurs.”

LangWatch Red-Teaming is fully open source and available immediately. The framework forms the basis for a broader set of red-team solutions being developed by LangWatch, where new attack techniques follow the real-world behaviour of AI systems.

More information is available at www.langwatch.ai/llm-red-teaming and github.com/langwatch/scenario