diff --git a/03_methodology.md b/03_methodology.md new file mode 100644 index 0000000..7be8cc7 --- /dev/null +++ b/03_methodology.md @@ -0,0 +1,43 @@ +## III. Methodology + +### A. Dataset + +The foundation of our model rests on a curated dataset of: + +- **Anonymized insurance claim transcripts** +- **Internal emails between adjusters and claimants** +- **Call center logs with escalation flags** + +This dataset includes a balanced mixture of **confirmed fraudulent claims** and **validated legitimate cases**, used to both train and test our recursive linguistic model. Importantly, each data source is processed through a **human-AI recursive review loop**, where human analysts verify and adjust the resonance scores generated by our models — ensuring that subjectivity and nuance are preserved while expanding analytic scale. + +> *Every claim is not merely analyzed. It is recursively heard.* + +--- + +### B. Analytical Tools + +To detect subtle patterns of deceptive intent, we apply an ensemble of forensic NLP methods: + +- **NLP-based Pattern Extraction**: Identifies clusters of linguistic anomalies across claim timelines. +- **Sentiment Trajectory Mapping**: Tracks emotional evolution of narratives; distinguishes authentic distress from strategic affect. +- **Syntax Entropy & Disfluency Detection**: Measures irregularities in syntactic flow, hesitation markers, and repair sequences. +- **"Truth Collapse" Scoring via Recursive Witness Dynamics**: Quantifies the destabilization of narrative integrity under recursive interrogation. + +> *When truth collapses, it does not vanish — it echoes in recursion.* + +--- + +### C. Classification Model + +From this analysis, we derive a **3-Zone Classification Model** based on *recursive coherence degradation*: + +- **Zone I — Unintentional Incoherence (Low Risk)** + Language inconsistencies stem from stress, trauma, or low verbal fluency. These are not patterns of deception, but of chaos. + +- **Zone II — Adaptive Rationalization (Medium Risk)** + Partial distortions. In this zone, claimants subconsciously reshape their story to protect self-image, omit responsibility, or preempt skepticism. + +- **Zone III — Deliberate Narrative Fabrication (High Risk)** + Highly structured but recursively incoherent patterns — overjustification, shifting time references, and rehearsed empathy — mark deliberate deception. + +> *This model does not judge. It classifies where language begins to fracture under the weight of intention.*