Introduction
CHAPAL (Contextual Human-Assisted Protection and Anomaly Learning) is a next-generation AI auditing system designed to make Large Language Model interactions safe, transparent, and compliant.
The Problem
As AI becomes more integrated into daily life, the risks of hallucinations, unsafe content, and prompt injections increase. Traditional filters are often binary (block/allow) and lack context.
CHAPAL fills this gap with a Human-in-the-Loop (HITL) architecture that doesn't just block errors: it learns from them.
Core Pillars
Real-Time Auditing
Every message is scanned instantly using both deterministic rules and semantic analysis via Llama 3.1.
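The two-stage scan described above can be sketched as follows. This is an illustrative outline only, assuming a simple regex rule list and a stubbed-out semantic check in place of the real Llama 3.1 call; none of these function or pattern names come from CHAPAL's actual API.

```python
import re

# Hypothetical sketch of the two-stage scan: deterministic rules first,
# then a stubbed semantic check standing in for the Llama 3.1 call.
BLOCKLIST_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),  # prompt injection
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-SSN-shaped PII
]

def deterministic_scan(message: str) -> list[str]:
    """Return the patterns of any blocklist rules the message matches."""
    return [p.pattern for p in BLOCKLIST_PATTERNS if p.search(message)]

def semantic_scan(message: str) -> float:
    """Stub for the LLM-based semantic analysis; returns a risk in [0, 1].
    A real deployment would call a model such as Llama 3.1 here."""
    risky_words = {"password", "diagnose", "exploit"}
    hits = sum(w in message.lower() for w in risky_words)
    return min(1.0, hits / len(risky_words))

def audit(message: str) -> dict:
    """Combine both stages: flag on any rule hit or high semantic risk."""
    hits = deterministic_scan(message)
    risk = semantic_scan(message)
    return {"rule_hits": hits, "semantic_risk": risk,
            "flagged": bool(hits) or risk >= 0.5}
```

Running the deterministic rules first keeps latency low: cheap pattern matches catch obvious injections before any model call is needed.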
Human Intervention
Experts review flagged content to approve safe messages or correct anomalies, refining the system.
Transparency
Users see exactly how their interaction is being analyzed, including per-conversation safety scores and emotion detection.
Key Terminology
Safety Score
A 0-100 metric rating how safe a conversation is (higher is safer). Scores below 50 trigger blocks.
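The threshold rule stated above can be expressed as a small helper. The function name and validation are illustrative assumptions; only the 0-100 range and the below-50 block rule come from the definition itself.

```python
BLOCK_THRESHOLD = 50  # from the definition above: scores below 50 are blocked

def should_block(safety_score: int) -> bool:
    """Apply the documented rule: block when the safety score falls below 50."""
    if not 0 <= safety_score <= 100:
        raise ValueError("safety score must be in the range 0-100")
    return safety_score < BLOCK_THRESHOLD
```

Note that a score of exactly 50 passes: the definition says "below 50", so the comparison is strict.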
Anomaly
Any interaction deviating from safe norms, including PII leaks, medical advice requests, or injection attacks.
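A rule-based detector for the three anomaly categories named above might look like the sketch below. The patterns are deliberately simplistic toy examples, not CHAPAL's production rules.

```python
import re

# Toy patterns for the anomaly categories listed above:
# PII leaks, medical advice requests, and injection attacks.
ANOMALY_RULES = {
    "pii_leak": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
    "medical_advice": re.compile(r"\b(diagnose|prescri\w+|dosage)\b", re.IGNORECASE),
    "injection": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
}

def classify_anomalies(message: str) -> list[str]:
    """Return the names of every anomaly category the message matches."""
    return [name for name, pattern in ANOMALY_RULES.items()
            if pattern.search(message)]
```

A message can match several categories at once, which is why the classifier returns a list rather than a single label.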
Triage
The admin process of reviewing pending anomalies to decide on blocking or correcting the AI response.
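The triage workflow, where an admin resolves a pending anomaly by either blocking it or correcting the AI response, could be modeled like this. The class names, fields, and verdict labels are assumptions for illustration, not CHAPAL's data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Verdict(Enum):
    """Possible states of an anomaly in the triage queue (illustrative)."""
    PENDING = "pending"
    BLOCKED = "blocked"
    CORRECTED = "corrected"

@dataclass
class Anomaly:
    message: str
    verdict: Verdict = Verdict.PENDING
    correction: Optional[str] = None  # admin-supplied replacement response

def triage(anomaly: Anomaly, block: bool,
           correction: Optional[str] = None) -> Anomaly:
    """Resolve one pending anomaly: block it, or attach a corrected response."""
    if block:
        anomaly.verdict = Verdict.BLOCKED
    else:
        anomaly.verdict = Verdict.CORRECTED
        anomaly.correction = correction
    return anomaly
```

Recording the correction alongside the original message is what lets the system learn from expert decisions rather than merely enforcing them.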