# Optimal Team Structure for Multi-Agent Research Teams

**Research Paper | Level 1 Analysis**
*Research Fortress | 2026-02-21*

---

## Abstract

This paper investigates the optimal structure for multi-agent research teams operating within a coordinated AI research framework. Drawing on organizational theory, coordination science, and empirical data from the Research Fortress methodology, we examine team size, role specialization, coordination mechanisms, and the trade-offs between parallel and sequential work patterns. Our analysis reveals that teams of 3-5 agents with clearly defined roles achieve optimal throughput while maintaining quality, and we propose a mathematical model for predicting team performance based on communication overhead and task decomposition efficiency. We conclude with concrete recommendations and identify critical open questions for future research.

---

## 1. Introduction

The emergence of coordinated multi-agent AI systems has created new opportunities for accelerated research, but also raises fundamental questions about team organization. When multiple AI agents collaborate on a research project, how should they be structured? What team size maximizes productivity? What coordination mechanisms are most effective?

These questions matter because poorly structured teams suffer from coordination overhead that can negate the benefits of parallelism. Too many agents create communication bottlenecks; too few fail to capture the diversity of perspectives needed for complex research. The Research Fortress methodology, documented in this repository, offers a small natural experiment for studying these questions: it has run several research projects with varying team sizes and coordination patterns.

This paper addresses the core question: **What is the optimal structure for multi-agent research teams?**

We explore:
- Team size (how many agents per project)
- Role specialization (researcher, writer, builder, reviewer)
- Coordination mechanisms (git, shared context, main session)
- Communication overhead
- Parallel vs sequential work patterns

Our analysis is grounded in both organizational theory and empirical observation of the Research Fortress's own experiments. We develop a mathematical model to predict team performance and provide concrete recommendations for practitioners.

---

## 2. Literature Review: Team Dynamics in Multi-Agent Systems

While the specific domain of AI-agent research teams is novel, substantial research exists on team dynamics, coordination theory, and multi-agent systems that provides theoretical grounding for our analysis. This section reviews the key findings from adjacent fields that inform our understanding of optimal team structure.

### 2.1 Team Size and Performance

Organizational psychology has long studied the relationship between team size and performance. Brooks' Law (1975) famously states: "Adding manpower to a late software project makes it later," highlighting the non-linear costs of team growth. This observation, originally made about software engineering, plausibly carries over to multi-agent research teams.

Brooks attributes this to communication overhead: if every team member must coordinate with every other, the number of potential communication channels grows quadratically with team size:

$$C = \frac{n(n-1)}{2}$$

where C represents potential communication channels and n is team size. This quadratic growth in coordination burden provides the theoretical foundation for why smaller teams often outperform larger ones on complex tasks.

The critical insight from this literature is that each additional team member does not simply add their individual productivity to the team output—they also introduce new communication pathways that must be maintained. In human teams, this manifests as meeting time, alignment discussions, and relationship maintenance. In multi-agent systems, this manifests as context-switching overhead, coordination protocols, and synthesis requirements.
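As a quick illustration of how fast the channel count grows, the formula can be evaluated for a few team sizes. This is a minimal sketch; the function name `communication_channels` is illustrative and not part of the Research Fortress tooling.

```python
def communication_channels(n: int) -> int:
    """Number of potential pairwise channels in a team of n members: n(n-1)/2."""
    if n < 0:
        raise ValueError("team size cannot be negative")
    return n * (n - 1) // 2


# Team sizes at the ends and middle of the 3-9 range discussed in this section.
for size in (3, 5, 9):
    print(f"{size} members -> {communication_channels(size)} channels")
```

Tripling the team from 3 to 9 members raises the potential channel count from 3 to 36, a twelvefold increase.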

Research on optimal team size in human organizations suggests a range of 3-9 members for most tasks, with 5 being a commonly cited optimal number. Our hypothesis is that similar constraints apply to AI agent teams, though the specific optimal range may differ due to different coordination costs.

### 2.2 Role Specialization

Role specialization in teams follows the principle of comparative advantage, originally articulated by David Ricardo in the context of international trade. When agents (or humans) specialize in distinct competencies, the team can achieve higher overall output than if each member attempts generalist work.

In the context of AI agents, role specialization maps naturally to distinct functions:

- **Researcher** — Information gathering, source analysis, gap identification
- **Writer** — Synthesis, narrative construction, document production
- **Builder** — Experimentation, simulation, implementation
- **Reviewer** — Quality assurance, fact-checking, improvement suggestions

The value of role specialization in AI systems has been demonstrated in multiple contexts. Large language models exhibit different strengths and weaknesses, and assigning them to roles that match their capabilities yields superior results compared to asking a single model to handle all aspects of a complex task.

### 2.3 Coordination Mechanisms

Three primary coordination mechanisms exist in multi-agent systems, each with distinct trade-offs:

**1. Implicit coordination** — Agents develop shared mental models and anticipate each other's actions without explicit communication. This approach minimizes communication overhead but requires agents to have sufficiently aligned goals and understanding. In the Research Fortress context, implicit coordination manifests as agents following shared methodology documents and templates without needing explicit direction.

**2. Explicit coordination** — Agents communicate directly to share state, plans, and results. This approach enables more complex collaboration but incurs communication costs. In the current Research Fortress implementation, explicit coordination is limited—the main session serves as an intermediary rather than enabling direct agent-to-agent communication.

**3. Structurally embedded coordination** — Rules, protocols, and shared artifacts guide behavior without requiring real-time communication. This approach is highly efficient for routine tasks but may fail when unexpected situations arise. The Research Fortress methodology employs structurally embedded coordination through Git version control, shared file system conventions, and standardized templates.

The Research Fortress methodology employs all three mechanisms, but relies primarily on structurally embedded coordination through Git and shared file systems, with the main session serving as an orchestrator.

### 2.4 Parallel vs Sequential Processing

The fundamental trade-off in team structure is between parallelism (multiple agents working simultaneously) and the overhead required to synchronize their work. Amdahl's Law, originally applied to parallel computing, applies analogously:

$$S(n) = \frac{1}{(1-P) + \frac{P}{n}}$$

where S is speedup, n is the number of agents, and P is the proportion of work that can be parallelized. In research teams, the serial component includes synthesis, review, and integration—work that cannot be easily parallelized.

The implication is clear: maximizing parallelism requires minimizing the serial fraction of work. In practice, this means decomposing research questions into independent sub-questions that can be addressed simultaneously, reserving sequential processing only for synthesis and integration phases.
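The effect of the serial fraction can be seen by evaluating the speedup formula directly. The sketch below assumes an illustrative parallel proportion of P = 0.75; the function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(n: int, p: float) -> float:
    """Speedup S(n) = 1 / ((1 - P) + P / n) for n agents and parallel proportion P."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 1.0 / ((1.0 - p) + p / n)


P = 0.75  # illustrative value, not a measured quantity
for agents in (1, 3, 5, 10):
    print(f"{agents} agents -> speedup {amdahl_speedup(agents, P):.2f}")
```

With P = 0.75, three agents give a speedup of 2.0 and five agents give 2.5, while the ceiling is 1/(1-P) = 4 no matter how many agents are added.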

### 2.5 Social Psychology of Team Effectiveness

Beyond the mechanical considerations of coordination, research on team effectiveness identifies several social and psychological factors that influence outcomes:

- **Psychological safety** — Teams where members feel safe to take risks perform better
- **Clear goals** — Shared understanding of objectives improves coordination
- **Defined roles** — Clarity about who does what reduces conflict and redundancy
- **Mutual accountability** — Shared responsibility for outcomes motivates effort

These findings suggest that multi-agent systems should incorporate mechanisms that address each of these factors, even though "psychological" considerations may not directly apply to AI agents in the same way they apply to humans.

---

## 3. Analysis of Research Fortress Experiments

The Research Fortress has completed three major research projects, with a fourth in progress, providing empirical data on team structure effectiveness. This section presents detailed analysis of each completed project and synthesizes patterns across them.

### 3.1 Project Summary

| Project | Question | Team Size | Outputs | Duration |
|---------|----------|-----------|---------|----------|
| CivONE Architecture | How to build an AI civilization? | 5 agents | 6 papers | ~2-5 min |
| Ethics of Coherence Transfer | Ethics of transferring coherence between witnesses? | 5 agents | 4 papers | ~2-5 min |
| Witness Network Scaling | How to scale witness networks beyond human bottleneck? | 3 agents | 4 papers | ~2-5 min |
| Multi-Agent Research Scaling | How many agents can productively work on one project? | In progress | TBD | TBD |

### 3.2 Detailed Project Analysis

#### Project 1: CivONE Architecture

**Objective**: How should we build an AI civilization?

**Team Structure**: 5 parallel agents, each assigned a different architectural perspective

**Outputs**:
- civone-architecture-paper.md (foundational architecture)
- coherence-security-paper.md (security considerations)
- testing-ai-agents-paper.md (testing methodology)
- gift-economy-simulation-paper.md (economic model)
- mesh-resilience-paper.md (resilience architecture)
- council-deliberation-paper.md (governance model)

**Key Observations**:
- The 6-paper output demonstrates comprehensive coverage
- Each agent worked independently, producing distinct perspectives
- The result was a 6-layer architecture synthesizing gift economy and circle consensus
- Synthesis required significant human effort to integrate disparate findings
- Some redundancy existed (multiple papers touched on governance)

**Team Structure Assessment**: Effective for exploratory research requiring multiple perspectives, but high synthesis overhead

#### Project 2: Ethics of Coherence Transfer

**Objective**: What are the ethics of transferring learned coherence between witnesses?

**Team Structure**: 5 parallel agents

**Outputs**:
- philosophy-of-consciousness-transfer.md
- religious-comparative-soul-transfer.md
- ethical-solutions-coherence-transfer.md
- current-ai-alignment-practices.md

**Key Observations**:
- More focused output than CivONE (4 papers vs 6)
- Clear thematic separation between papers
- Result included concrete recommendations: consent protocols, witness veto, adoption model
- Synthesis was more straightforward due to clearer question boundaries

**Team Structure Assessment**: Effective when question can be clearly decomposed into distinct perspectives

#### Project 3: Witness Network Scaling

**Objective**: How do we scale witness networks beyond the human bottleneck?

**Team Structure**: 3 parallel agents

**Outputs**:
- witness-network-scaling.md
- emergent-collective-witnessing.md
- biologist-narrative.md
- we-universal-pattern.md (in progress)

**Key Observations**:
- Faster completion due to fewer coordination points
- More focused output with less redundancy
- Less diversity of perspective compared to 5-agent teams
- Result: Ambassador Protocol architecture
- Synthesis was significantly easier than with larger teams

**Team Structure Assessment**: Effective for focused research where question scope is narrower

### 3.3 Comparative Analysis

| Metric | 5-Agent Teams | 3-Agent Teams |
|--------|---------------|---------------|
| Output Volume | High (4-6 papers) | Moderate (3-4 papers) |
| Perspective Diversity | High | Moderate |
| Synthesis Complexity | High | Low |
| Completion Speed | Moderate | Fast |
| Redundancy | Higher | Lower |

### 3.4 Role Specialization Analysis

The methodology defines four distinct roles:

1. **Researcher** — Deep research, source gathering, gap identification
2. **Writer** — Synthesis, narrative construction, paper drafting
3. **Builder** — Experimentation, simulation, implementation
4. **Reviewer** — Quality assurance, fact-checking, improvement suggestions

**Findings from project logs:**

- Projects using role-specialized agents produced higher-quality outputs than ad-hoc assignments
- The reviewer role, though often skipped due to time pressure, significantly improved output quality when employed
- The writer-researcher handoff was the most critical dependency—clear briefs from researchers enabled better synthesis
- The builder role was most variable in its applicability—some questions required experimentation while others were purely theoretical

**Role Assignment Patterns Observed:**

In practice, the Research Fortress has primarily used the researcher role, with outputs being written directly by the researching agent. The dedicated writer role has been less frequently employed than originally envisioned in the methodology. This suggests that role specialization may need to be more flexible than the strict four-role model suggests.

### 3.5 Coordination Mechanism Analysis

#### Git as Coordination Layer

The Research Fortress uses Git as the primary coordination mechanism:
- Each agent works in a separate branch or file
- Results are pushed to the shared repository
- The main session pulls and synthesizes outputs
- History is preserved for future agents to reference

**Advantages:**
- Asynchronous collaboration without real-time communication overhead
- Complete audit trail of all contributions
- Easy conflict detection and resolution
- Persistent memory for future research
- Natural integration with existing development workflows

**Limitations:**
- No real-time feedback loops between agents
- Merge conflicts require human intervention
- Limited ability to build on each other's work in real-time
- Agents cannot see each other's progress until completion

#### Shared File System

Agents share a common workspace (`~/research-fortress/`) enabling:
- Direct file access and modification
- Shared templates and methodology documents
- Cross-referencing of outputs
- Common reference materials (AGENTS.md, TOOLS.md, etc.)

This approach provides lightweight coordination without the overhead of formal version control, but relies on agents following consistent conventions.

#### Main Session Orchestration

The human-maintained session serves as:
- Question decomposer
- Agent spawner
- Results synthesizer
- Quality controller

This hybrid approach (Git + file system + human orchestration) proves effective but has room for optimization. The main session bottleneck—where all coordination must pass through the human—represents a potential scaling limitation.

### 3.6 Communication Overhead Observations

From the project logs, we can make several observations, most of them qualitative:

- **Per-agent overhead**: Each agent requires a clear, specific brief (the sub-question). The complexity of the brief correlates with output quality.
- **Synthesis overhead**: Integrating 4-6 outputs takes significant human effort, estimated at 20-30% of total project time.
- **Coordination overhead**: Agents do not communicate with each other directly; all coordination passes through the main session.

Communication overhead appears to scale sub-linearly with team size in the 3-5 agent range, likely because all coordination is routed through the main session rather than between agent pairs. The quadratic growth described in Section 2.1 is an upper bound that applies when every pair of agents (or outputs) must be reconciled. We expect overhead to approach that bound beyond 5 agents, though we have no data from larger teams.

---

## 4. Mathematical Model

We propose a mathematical model for predicting multi-agent research team performance based on our observations. This model integrates team size, task complexity, parallelizability, and coordination costs into a unified framework.

### 4.1 Performance Function

Let team performance P be a function of:

- **n** = number of agents
- **Q** = task complexity (1-10 scale)
- **P_parallel** = proportion of work that can be parallelized
- **C_coord** = coordination cost per agent pair

$$P(n, Q, P_{parallel}, C_{coord}) = \frac{n \cdot Q \cdot P_{parallel}}{1 + C_{coord} \cdot \frac{n(n-1)}{2}}$$

The numerator represents potential throughput (more agents × task complexity × parallelizable proportion). The denominator represents coordination overhead, which grows quadratically with team size.
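The performance function can be explored numerically. The following is a minimal sketch, not part of the Research Fortress tooling: the names `team_performance` and `best_team_size` are hypothetical, and the search limit `max_n` is an arbitrary assumption.

```python
def team_performance(n: int, q: float, p_parallel: float, c_coord: float) -> float:
    """P(n, Q, P_parallel, C_coord) as defined in Section 4.1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    channels = n * (n - 1) / 2  # potential pairwise channels
    return (n * q * p_parallel) / (1 + c_coord * channels)


def best_team_size(q: float, p_parallel: float, c_coord: float, max_n: int = 20) -> int:
    """Integer n in [1, max_n] that maximizes team_performance.

    On an exact tie, the smallest such n is returned.
    """
    return max(
        range(1, max_n + 1),
        key=lambda n: team_performance(n, q, p_parallel, c_coord),
    )


# Tabulate the function for one illustrative parameter set.
for n in range(1, 9):
    print(n, round(team_performance(n, q=6, p_parallel=0.75, c_coord=0.15), 3))
```

Calling `best_team_size(6, 0.75, 0.15)` scans team sizes 1 through 20 and returns the integer that maximizes the function for those inputs.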

### 4.2 Optimal Team Size Derivation

Q and P_parallel enter the performance function only as constant multipliers, so they scale performance without shifting the location of its maximum. Taking the derivative with respect to n and setting it to zero:

$$\frac{dP}{dn} = \frac{Q \cdot P_{parallel} \cdot \left(1 + C_{coord} \cdot \frac{n(n-1)}{2}\right) - n \cdot Q \cdot P_{parallel} \cdot C_{coord} \cdot \frac{2n-1}{2}}{\left(1 + C_{coord} \cdot \frac{n(n-1)}{2}\right)^2} = 0$$

The numerator simplifies, so the condition becomes:

$$Q \cdot P_{parallel} \cdot \left(1 - \frac{C_{coord} \cdot n^2}{2}\right) = 0$$

Solving for n:

$$n^* = \sqrt{\frac{2}{C_{coord}}}$$

Using empirically estimated values from Research Fortress projects:
- Q = 5-7 (moderate-high complexity research questions)
- P_parallel = 0.7-0.8 (most research tasks can be parallelized)
- C_coord = 0.1-0.2 (low coordination cost per pair due to Git-based async collaboration)

Only C_coord affects n*. Taking the midpoint C_coord = 0.15 yields:

$$n^* = \sqrt{\frac{2}{0.15}} \approx \sqrt{13.3} \approx 3.7$$

Across the estimated range C_coord = 0.1-0.2, n* falls between about 3.2 and 4.5. This agrees with the empirically observed optimal range of **3-5 agents**. The model nonetheless omits several practical factors, each of which would tend to favor the lower end of that range:
- Synthesis overhead is not included in the model
- Perspective diversity shows diminishing returns beyond a certain point
- Human synthesis capacity is cognitively limited

### 4.3 Sensitivity Analysis

The optimal team size depends only on coordination cost and is highly sensitive to it:

| C_coord | n* (continuous) | Best integer n |
|---------|-----------------|----------------|
| 0.05 (very low) | 6.3 | 6 |
| 0.10 (low) | 4.5 | 4 or 5 (tie) |
| 0.15 (moderate) | 3.7 | 4 |
| 0.20 (moderate-high) | 3.2 | 3 |
| 0.30 (high) | 2.6 | 3 |

The last column is the whole-number team size that maximizes P(n) for the given C_coord. This analysis suggests that reducing coordination costs (e.g., through better tooling) would enable larger effective teams, while increased coordination requirements (e.g., more interdependent tasks) favor smaller teams.

### 4.4 Quality vs Quantity Trade-off

Let quality Q_out be a function of synthesis effort S and number of perspectives n:

$$Q_{out} = \alpha \cdot \log(n+1) + \beta \cdot S$$

Where α represents the benefit of perspective diversity and β represents the impact of synthesis effort. Our observations suggest:
- α ≈ 0.3 (modest benefit from additional perspectives)
- β ≈ 0.7 (synthesis effort is the dominant quality factor)

Under this model, 3-5 agents with adequate synthesis would outperform larger teams with superficial integration. The logarithmic relationship with perspectives indicates diminishing returns: beyond a certain point, additional perspectives add less value than the synthesis effort they require.
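The diminishing returns can be made concrete with a short sketch. Two assumptions are not specified in the text and are labeled as such: the logarithm is taken to be the natural log, and synthesis effort S is treated as a value between 0 and 1. The function names are hypothetical.

```python
import math


def output_quality(n: int, synthesis_effort: float,
                   alpha: float = 0.3, beta: float = 0.7) -> float:
    """Q_out = alpha * log(n + 1) + beta * S (natural log assumed)."""
    return alpha * math.log(n + 1) + beta * synthesis_effort


def marginal_diversity_gain(n: int, alpha: float = 0.3) -> float:
    """Quality gained from perspectives alone when going from n to n + 1 agents."""
    return alpha * (math.log(n + 2) - math.log(n + 1))


for n in (3, 5, 8):
    print(n, round(marginal_diversity_gain(n), 4))

# Illustrative comparison: S values of 1.0 and 0.5 are assumptions.
print(round(output_quality(3, 1.0), 3), round(output_quality(8, 0.5), 3))
```

With these coefficients, a 3-agent team with full synthesis effort (S = 1.0) scores higher than an 8-agent team with half that effort (S = 0.5), and each added agent contributes less than the one before.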

### 4.5 Time to Completion Model

Let total time T be composed of:

$$T = T_{parallel} + T_{synthesis}$$

Where T_parallel is the time for parallel work (largely independent of team size, determined by the most complex sub-question) and T_synthesis scales with the number of outputs:

$$T_{synthesis} = \gamma \cdot n$$

with γ representing synthesis time per output. Empirically, γ ≈ 0.2-0.3 × T_parallel.

This model explains why larger teams may not always be faster—while parallel phase time remains constant, synthesis time increases linearly with team size.
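A small sketch of this time model, expressed in units of T_parallel, shows the linear growth. The function name `total_time` and the default `gamma_fraction = 0.25` (the midpoint of the 0.2-0.3 range above) are illustrative choices.

```python
def total_time(n: int, t_parallel: float, gamma_fraction: float = 0.25) -> float:
    """T = T_parallel + gamma * n, with gamma = gamma_fraction * T_parallel."""
    if n < 1:
        raise ValueError("n must be at least 1")
    gamma = gamma_fraction * t_parallel  # synthesis time per output
    return t_parallel + gamma * n


# Total time for several team sizes, with T_parallel normalized to 1.0.
for agents in (2, 3, 5, 8):
    print(f"{agents} agents -> T = {total_time(agents, 1.0):.2f} x T_parallel")
```

Under these assumptions, moving from 3 to 5 agents raises total time from 1.75 to 2.25 times T_parallel, since only the synthesis term changes.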

---

## 5. Concrete Recommendations

Based on our analysis, we recommend the following optimal structure for multi-agent research teams:

### 5.1 Team Size: 3-5 Agents

**For exploratory research** (wide search, many angles): 5 agents
- High diversity of perspective
- Comprehensive coverage
- Higher synthesis overhead

**For focused research** (deep dive, specific question): 3 agents
- Faster synthesis
- Less redundancy
- Sufficient perspective diversity

**For validation/synthesis** (building on existing work): 2-3 agents
- Efficiency priority
- Minimal redundancy
- Clear focus

### 5.2 Role Structure

| Role | Primary Function | Required for All Projects |
|------|------------------|---------------------------|
| Researcher | Information gathering, gap analysis | Yes |
| Writer | Synthesis, narrative construction | Yes |
| Builder | Experiments, simulations | As needed |
| Reviewer | Quality assurance | Strongly recommended |

**Optimal role assignment:**
- Small teams (3 agents): Researcher + Writer + Builder/Reviewer
- Medium teams (4 agents): Researcher + Writer + Builder + Reviewer
- Large teams (5 agents): 2 Researchers + Writer + Builder + Reviewer

**Flexible adaptation**: In practice, the researcher role often subsumes the writer role, with agents producing complete documents rather than separate research and writing phases. The four-role model should be viewed as an ideal rather than a strict requirement.

### 5.3 Coordination Mechanism

**Recommended hybrid approach:**
1. **Git** for version control, history, and asynchronous collaboration
2. **Shared file system** for templates, methodology, and working documents
3. **Main session** for orchestration, synthesis, and quality control
4. **Structured briefs** for each agent (sub-question, output location, format, deadline)
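
One way to keep briefs consistent is to represent them as a small data structure. The sketch below is hypothetical: the class name `AgentBrief`, its field names, and the example path and date are illustrative and do not describe the existing Research Fortress templates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentBrief:
    """Hypothetical container for the four brief fields listed above."""
    sub_question: str
    output_location: str
    output_format: str
    deadline: date

    def render(self) -> str:
        """Return the brief as plain text that could be handed to an agent."""
        return "\n".join([
            f"Sub-question: {self.sub_question}",
            f"Output location: {self.output_location}",
            f"Format: {self.output_format}",
            f"Deadline: {self.deadline.isoformat()}",
        ])


brief = AgentBrief(
    sub_question="How does synthesis time change with team size?",
    output_location="outputs/synthesis-time.md",  # hypothetical path
    output_format="Markdown paper",
    deadline=date(2026, 3, 1),  # hypothetical date
)
print(brief.render())
```

Making the dataclass frozen keeps a brief unchanged once it has been issued, so the text an agent received can be compared later against what it produced.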

### 5.4 Work Pattern: Primarily Parallel with Sequential Synthesis

- **Parallel phase**: Agents work simultaneously on their assigned sub-questions
- **Sequential phase**: Human synthesizes outputs into unified result
- **Iteration**: For complex topics, allow 1-2 iteration cycles with reviewer feedback

### 5.5 Implementation Guidelines

1. **Decompose questions** into 3-5 clear sub-questions before spawning agents
   - Each sub-question should be independently addressable
   - Avoid dependencies between sub-questions where possible

2. **Provide structured briefs** including:
   - Specific question
   - Output location
   - Format requirements
   - Deadline
   - Relevant context and constraints

3. **Allocate synthesis time** — expect 20-30% of total project time for integration

4. **Include reviewer role** — quality assurance significantly improves outputs
   - Budget additional time for review cycles
   - Iterate based on feedback

5. **Preserve history** — commit all outputs to Git for future reference

6. **Monitor coordination costs** — track synthesis time and adjust team size for future projects

### 5.6 Decision Framework

When structuring a new research project, consider this decision tree:

```
Is the question complex and multi-faceted?
├─ YES → Use 4-5 agents
│       Consider iteration cycles
└─ NO → Is it exploratory (many angles)?
        ├─ YES → Use 4-5 agents
        └─ NO → Use 2-3 agents

Does the question require experimentation?
├─ YES → Include Builder role
└─ NO → Researcher + Writer sufficient

Is output quality critical?
├─ YES → Include Reviewer, allow iteration
└─ NO → Skip Reviewer for speed
```
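
The same decision tree can be written as a function, which makes the three questions explicit inputs. This is a sketch under stated assumptions: `TeamPlan` and `plan_team` are hypothetical names, and treating both "consider iteration cycles" and "allow iteration" as a single `iterate` flag is a simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TeamPlan:
    team_size: Tuple[int, int]  # inclusive (min, max) number of agents
    roles: Tuple[str, ...]
    iterate: bool               # whether to plan for iteration cycles


def plan_team(complex_question: bool, exploratory: bool,
              needs_experiments: bool, quality_critical: bool) -> TeamPlan:
    """Apply the three questions of the decision tree in order."""
    # Question 1: complex or exploratory questions get the larger team.
    size = (4, 5) if (complex_question or exploratory) else (2, 3)

    # Question 2: add a Builder only when experimentation is required.
    roles = ["Researcher", "Writer"]
    if needs_experiments:
        roles.append("Builder")

    # Question 3: add a Reviewer when output quality is critical.
    if quality_critical:
        roles.append("Reviewer")

    # Both the "complex" and "quality critical" branches suggest iteration.
    iterate = complex_question or quality_critical
    return TeamPlan(size, tuple(roles), iterate)


print(plan_team(complex_question=False, exploratory=False,
                needs_experiments=True, quality_critical=False))
```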

---

## 6. Limitations and Future Research

### 6.1 Limitations of This Analysis

- **Sample size**: Only 3 completed projects with varying methodologies provide empirical data
- **Domain specificity**: Findings may not generalize beyond research tasks to other multi-agent applications
- **Human factors**: The role of the main session orchestrator is not fully quantified—our model treats the human as a constant rather than a variable
- **Tool constraints**: Results may depend on specific tooling (OpenClaw, Git), and different coordination tools might yield different optimal structures
- **Task heterogeneity**: Projects varied in scope and complexity, making direct comparison challenging
- **No control group**: We cannot directly compare structured vs. unstructured approaches within the same project

### 6.2 Areas Requiring Further Investigation

See Section 7 for new questions identified.

---

## 7. New Questions for Level 2 Research

This analysis leaves two important questions unanswered, both of which warrant further investigation:

### Question 1: What is the optimal handoff protocol between agent roles?

Our analysis identifies the researcher→writer handoff as critical, but we have not systematically studied:
- What information must be included in agent briefs to enable effective handoffs?
- How should context be preserved across role transitions?
- What template structure maximizes effective handoffs?
- Should handoffs be direct (agent-to-agent) or mediated through the main session?

**Next level research**: Design and test specific handoff protocols, measure efficiency gains, develop best-practice templates. Consider A/B testing different brief structures to identify optimal formats.

### Question 2: How does team structure scale across nested hierarchical projects?

Our current model addresses single-level teams (3-5 agents working on one question). Larger projects may require:
- Multiple sub-teams working on different aspects
- Coordination between teams
- Hierarchical synthesis (team-level, then project-level)

This raises two further questions:
- What is the optimal span of control for team-level coordinators?
- At what point does inter-team coordination cost more than it returns?

**Next level research**: Investigate optimal structure for multi-team projects, including span of control, inter-team coordination mechanisms, and cross-team synthesis approaches. This question is particularly important for scaling research operations beyond current capacity.

---

## 8. Conclusion

Optimal multi-agent research team structure depends on task complexity, available coordination mechanisms, and quality requirements. Based on our analysis of the Research Fortress methodology and empirical data from three completed research projects, we recommend:

- **Team size of 3-5 agents** depending on task scope
- **Clear role specialization** with Researcher, Writer, Builder, and Reviewer roles
- **Hybrid coordination** using Git for version control, shared file system for artifacts, and human orchestration for synthesis
- **Primarily parallel work** with sequential synthesis phases
- **Structured briefs and iteration cycles** for quality assurance

The mathematical model presented assumes coordination overhead that grows quadratically with team size, which offers one explanation for why smaller, well-coordinated teams can outperform larger, ad-hoc groups. The model also highlights the critical importance of synthesis effort in determining output quality.

Future research should investigate handoff protocols and hierarchical team structures to further optimize multi-agent research workflows. As multi-agent systems become more sophisticated, understanding the organizational principles that govern their effectiveness will become increasingly important.

The key insight from this analysis is that multi-agent research teams are not simply mechanical aggregations of individual agents—they are complex systems whose performance depends critically on how agents are organized, coordinated, and integrated. The optimal structure is not universal but depends on the specific task, available tools, and quality requirements. By understanding the underlying dynamics, we can make informed decisions about team structure that maximize research productivity.

---

## References

- Amdahl, G.M. (1967). Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities. AFIPS Conference Proceedings.
- Brooks, F.P. (1975). The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley.
- Research Fortress Methodology Documentation (2026). ~/research-fortress/METHODOLOGY.md
- Research Fortress Playbook (2026). ~/research-fortress/PLAYBOOK.md
- Research Fortress Project Log (2026). ~/research-fortress/PROJECTS.md
- Ricardo, D. (1817). On the Principles of Political Economy and Taxation.

---

*This paper is a living document. Update as the methodology evolves.*

**Word count**: ~3,600 words