Offensive Security
We test whether your AI systems are resilient to attacks and compliant with regulations
We test LLM-based applications and AI systems for security - from prompt injection and jailbreaks through data leakage to model extraction. We combine offensive testing with AI Act compliance assessment and GDPR data processing requirements. Our methodology is grounded in the OWASP Top 10 for LLM Applications (2025), using tools like PyRIT, Giskard, and promptfoo.
Last updated:April 2026
30 minutes with an expert. We'll discuss your challenge, scope the engagement, and provide a preliminary estimate.
AI Security Testingis a SEQRED service coveringsecurity testing of ai and llm systems. prompt injection, data leakage, ai act compliance. owasp llm top 10 methodology.
Offensive Security
Organizations are deploying AI systems and chatbots faster than they can assess the associated risks. According to IBM's 2025 report, 13% of organizations experienced breaches of AI models or applications - and 97% of those compromised lacked proper AI access controls. Samsung employees leaked proprietary source code through ChatGPT prompts. Air Canada was held legally liable when its chatbot provided incorrect bereavement fare information - confirming that companies bear responsibility for AI-generated advice. The AI Act introduces four risk categories (unacceptable, high, limited, minimal) with documentation and transparency requirements, and GDPR imposes constraints on personal data processing by models. You need a partner who combines offensive security expertise with regulatory knowledge and can test your AI system before it reaches users - or before someone else finds the gaps.
Want to know if this service fits your needs? Tell us about your challenge - we'll tailor the scope.
Let's talk →Scope
Security testing of chatbots and LLM-based applications - aligned with OWASP Top 10 for LLM Applications (2025)
Prompt injection testing - direct (manipulating the prompt itself) and indirect (injection via external data sources, e.g. documents in RAG pipelines)
Data leakage risk assessment - training data, user data, system prompt leakage (LLM07: System Prompt Leakage)
Data and Model Poisoning (LLM04) - resilience testing against corruption of training data, fine-tuning datasets, and RAG sources
Insecure Output Handling (LLM05) - validation and sanitization of model outputs before passing to downstream systems
Excessive Agency (LLM06) - permissions analysis of AI agents, tool chains, and trust boundaries
Unbounded Consumption (LLM10) - denial-of-service resilience testing through excessive inference resource consumption
AI Act compliance audit - risk classification (unacceptable, high, limited, minimal), documentation, transparency requirements
Personal data processing assessment for AI systems (GDPR)
Bias and fairness testing - identifying biases in model outputs
Process
Scoping
We identify AI systems in the organization, their roles, input data, and integrations. We determine which systems fall under AI Act regulation and in which risk category. We agree on the testing scope.
Threat modelling
We analyse the AI system architecture, tool chains, data sources, and potential attack vectors. We map risks to the OWASP Top 10 for LLM - from prompt injection (LLM01) through sensitive data disclosure (LLM02) to excessive agency (LLM06).
Automated and manual testing
We conduct offensive tests using tools such as PyRIT (Microsoft), Giskard, and promptfoo. We test for prompt injection, jailbreak, data leakage, model extraction, and insecure output handling. We combine automation with manual exploration of scenarios specific to the client's business logic.
Report
We deliver a report with prioritised findings, risk assessment, mapping to OWASP LLM Top 10, and actionable remediation recommendations. For high-risk AI Act systems, we provide documentation required by the regulator.
Remediation support
We help implement defences - guardrails, input/output filtering, agent permission restrictions, and anomaly monitoring for model queries.
Why SEQRED
IT + OT in one team
Most firms do either IT or OT. Our team combines both - from Active Directory pentesting to PLC firmware analysis. That's rare in the market.
We demonstrate, not just report
We deliver proof-of-concept exploits, not scanner output. Your engineering team gets actionable fixes. Your board gets a risk briefing they understand.
Compliance + security together
Our reports satisfy auditors (NIS2, DORA, IEC 62443) AND give engineers real data to improve defenses. One engagement, two outcomes.
We stand with you
We present findings to your board or supervisory board side by side with the responsible person. Or we prepare them for a solo presentation.
Who we serve
We've worked with national energy grid operators, systemically important banks, industrial automation manufacturers, renewable energy operators, and US DoD contractors. Projects anonymized at client request.
Team certifications
Technology partnerships



FAQ
What AI systems do you test?+
We test chatbots, AI assistants, autonomous agents, RAG systems (Retrieval-Augmented Generation), applications built on OpenAI and Anthropic APIs, open-source models, and proprietary solutions. Any system that processes user data or makes decisions requires security testing.
What is the OWASP Top 10 for LLM Applications?+
It is a list of the 10 most critical security risks for applications built on large language models, published by OWASP in its 2025 edition. It covers prompt injection (LLM01), sensitive information disclosure (LLM02), data poisoning (LLM04), insecure output handling (LLM05), excessive agency (LLM06), and system prompt leakage (LLM07), among others. We use this framework as the foundation of our testing methodology.
Does the AI Act apply to my organization?+
The AI Act applies to providers and deployers of AI systems on the EU market. The regulation defines four risk categories: unacceptable (e.g. social scoring, subliminal manipulation), high (e.g. recruitment systems, credit scoring, critical infrastructure), limited (chatbots - must inform users they are interacting with AI), and minimal (e.g. spam filters). We help assess which category your systems fall into and what obligations follow.
What tools do you use for testing?+
We use PyRIT (Microsoft's Red Teaming Framework for AI), Giskard (open-source testing for bias and security), promptfoo (prompt testing framework), and custom test scenarios. We combine automated tools with manual exploration, because many vulnerabilities - such as indirect prompt injection through documents in RAG - require a creative, context-aware approach.
How long does an AI security test take?+
A typical project takes 2 to 4 weeks, depending on system complexity and the number of integrations. A simple chatbot without external integrations takes about 2 weeks. A RAG system with multiple data sources and autonomous agents requires 3 to 4 weeks.
How does SEQRED price its services?+
Pricing is based on an individual estimate of our consultants' time, considering the project scope and complexity. We present the offer broken down by phases - so you see exactly what you are paying for and can make decisions at each stage.
Can I speak with an expert before making a decision?+
Yes - an initial consultation is always welcome and free of charge. We help define the actual scope of your needs, which allows us to prepare a rational offer tailored to your organization.
We'll discuss scope, methodology, and timeline.
Free consultation, no strings attached.
