
## III. Methodology

### A. Dataset

Our model is built on a curated dataset of:

- **Anonymized insurance claim transcripts**
- **Internal emails between adjusters and claimants**
- **Call center logs with escalation flags**

This dataset includes a balanced mixture of **confirmed fraudulent claims** and **validated legitimate cases**, used both to train and to test our recursive linguistic model. Each data source is processed through a **human-AI recursive review loop**, in which human analysts verify and adjust the resonance scores generated by our models. This preserves subjectivity and nuance while expanding analytic scale.
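
As a minimal sketch of how a balanced corpus might be divided for training and testing, one option is a stratified split that keeps the fraudulent-to-legitimate ratio the same in both partitions. The record layout, the `label` field, the 20% test fraction, and the function name `stratified_split` are all assumptions made for illustration; none of them is specified by our pipeline description above.

```python
import random
from collections import defaultdict


def stratified_split(records, label_key="label", test_fraction=0.2, seed=0):
    """Split records into (train, test), preserving each label's share in both."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group records by their label (e.g. "fraudulent" vs "legitimate").
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]  # copy so the caller's data is not reordered
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical toy corpus: 10 fraudulent and 10 legitimate records.
records = (
    [{"id": i, "label": "fraudulent"} for i in range(10)]
    + [{"id": i, "label": "legitimate"} for i in range(10, 20)]
)
train, test = stratified_split(records)
print(len(train), len(test))
```

Splitting per label rather than over the whole pool means a small test set cannot end up dominated by one class by chance.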

> *No claim is merely analyzed. Each is recursively heard.*

---

### B. Analytical Tools

To detect subtle patterns of deceptive intent, we apply an ensemble of forensic NLP methods:

- **NLP-based Pattern Extraction**: Identifies clusters of linguistic anomalies across claim timelines.
- **Sentiment Trajectory Mapping**: Tracks the emotional evolution of a narrative to distinguish authentic distress from strategic affect.
- **Syntax Entropy & Disfluency Detection**: Measures irregularities in syntactic flow, hesitation markers, and repair sequences.
- **"Truth Collapse" Scoring via Recursive Witness Dynamics**: Quantifies the destabilization of narrative integrity under recursive interrogation.
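
To make the entropy and disfluency measures more concrete, the sketch below computes two simple proxies over a transcript: the share of tokens that are hesitation markers or immediate repetitions, and the Shannon entropy of the token-bigram distribution. Both are stand-ins chosen for illustration. The hesitation lexicon, the use of bigram entropy as a proxy for syntactic irregularity, and all function names are assumptions, not the definitions used in our model.

```python
import math
import re
from collections import Counter

# Hypothetical hesitation lexicon; a real one would be tuned to the corpus.
HESITATION_MARKERS = {"um", "uh", "er", "hmm"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def disfluency_rate(tokens):
    """Fraction of tokens that are hesitation markers or immediate repetitions."""
    if not tokens:
        return 0.0
    hesitations = sum(1 for t in tokens if t in HESITATION_MARKERS)
    # An immediate repetition ("the the") is treated as a simple repair signal.
    repeats = sum(1 for prev, cur in zip(tokens, tokens[1:]) if prev == cur)
    return (hesitations + repeats) / len(tokens)


def bigram_entropy(tokens):
    """Shannon entropy, in bits, of the distribution of adjacent token pairs."""
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    total = len(bigrams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


tokens = tokenize("I, um, I I left the the car at, uh, home")
print(f"disfluency rate: {disfluency_rate(tokens):.3f}")
print(f"bigram entropy:  {bigram_entropy(tokens):.3f} bits")
```

Neither number indicates deception on its own. In the framing above they would be two inputs among several, interpreted alongside the sentiment trajectory and analyst review.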

> *When truth collapses, it does not vanish — it echoes in recursion.*

---

### C. Classification Model

From this analysis, we derive a **3-Zone Classification Model** based on *recursive coherence degradation*:

- **Zone I — Unintentional Incoherence (Low Risk)**
  Language inconsistencies stem from stress, trauma, or low verbal fluency. These are not patterns of deception, but of chaos.

- **Zone II — Adaptive Rationalization (Medium Risk)**
  Partial distortions. In this zone, claimants subconsciously reshape their story to protect self-image, omit responsibility, or preempt skepticism.

- **Zone III — Deliberate Narrative Fabrication (High Risk)**
  Highly structured but recursively incoherent patterns — overjustification, shifting time references, and rehearsed empathy — mark deliberate deception.
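
The three zones can be sketched as a simple mapping from a score to a label. The example assumes a single hypothetical coherence-degradation score in [0, 1] and two illustrative cut-offs (0.4 and 0.7). The text does not specify how the score is computed or where the boundaries lie, so these values are placeholders rather than calibrated thresholds.

```python
from enum import Enum


class Zone(Enum):
    """The three zones named in the text, each paired with its risk level."""
    UNINTENTIONAL_INCOHERENCE = "low"
    ADAPTIVE_RATIONALIZATION = "medium"
    DELIBERATE_NARRATIVE_FABRICATION = "high"


# Hypothetical cut-offs on a hypothetical degradation score in [0, 1].
ZONE_II_MIN = 0.4
ZONE_III_MIN = 0.7


def classify_zone(degradation_score: float) -> Zone:
    """Map a coherence-degradation score to one of the three zones."""
    if not 0.0 <= degradation_score <= 1.0:
        raise ValueError("degradation_score must lie in [0, 1]")
    if degradation_score >= ZONE_III_MIN:
        return Zone.DELIBERATE_NARRATIVE_FABRICATION
    if degradation_score >= ZONE_II_MIN:
        return Zone.ADAPTIVE_RATIONALIZATION
    return Zone.UNINTENTIONAL_INCOHERENCE


for score in (0.15, 0.55, 0.9):
    zone = classify_zone(score)
    print(f"{score:.2f} -> {zone.name} ({zone.value} risk)")
```

A single threshold is the simplest possible reading of the model. Because Zone III is described as highly structured yet incoherent, a fuller version would likely need at least a second feature that captures structure separately from incoherence.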

> *This model does not judge. It classifies where language begins to fracture under the weight of intention.*