Agent Modes

The OmniEmbodied Framework provides multiple agent architectures to support different research scenarios and coordination patterns. Each mode offers distinct advantages for different types of embodied AI tasks.

Overview

Agent modes in OmniEmbodied are designed to support various coordination patterns:

  • Single Agent Mode: Individual agents operating independently

  • Centralized Multi-Agent Mode: Central coordinator managing multiple worker agents

  • Decentralized Multi-Agent Mode: Autonomous agents with peer-to-peer coordination (future)

Each mode integrates seamlessly with the evaluation system and supports all LLM providers.

Single Agent Mode

Architecture

The single agent mode provides a straightforward interface for individual agent tasks:

Key Components:

  • LLM Integration: Direct connection to language models for decision-making

  • Action Planning: Sequential action generation based on observations

  • Memory Management: Conversation history and experience tracking

  • Task Execution: Independent task completion without coordination

Use Cases:

  • Individual task completion scenarios

  • Baseline performance measurement

  • Agent capability evaluation

  • Simple interaction testing

Implementation

The LLMAgent class provides the core single agent functionality:

from modes.single_agent.llm_agent import LLMAgent
from OmniSimulator.core.engine import SimulationEngine

# Initialize simulation environment
simulator = SimulationEngine()

# Create single agent
agent = LLMAgent(
    simulator=simulator,
    agent_id="solo_worker",
    config=agent_config
)

# Set task
agent.set_task("Find the red apple in the kitchen and place it on the table")

# Execute steps
while not agent.is_task_completed():
    action = agent.generate_action()
    result = agent.execute_action(action)

    if not result.success:
        print(f"Action failed: {result.message}")
        break

Configuration

Configure single agent behavior:

# single_agent_config.yaml
agent_config:
  agent_class: "modes.single_agent.llm_agent.LLMAgent"
  max_history: 20              # Conversation history length

# LLM settings
llm_config:
  provider: "openai"
  model: "gpt-4"
  temperature: 0.1
  max_tokens: 512

# Prompt configuration
prompt_config:
  template: "single_agent_v1"  # Prompt template version
  system_prompt_key: "system_prompt"
  use_chain_of_thought: true   # Enable reasoning traces

Prompt Management

Single agents use specialized prompts for task execution:

# Agent automatically selects appropriate prompts
agent.set_task("Clean the kitchen thoroughly")

# System prompt includes:
# - Role definition
# - Available actions
# - Task completion criteria
# - Response format requirements

# Task-specific prompts adapt to:
# - Current environment state
# - Action history
# - Task progress
# - Error recovery

Centralized Multi-Agent Mode

Architecture

The centralized mode uses a single LLM to coordinate multiple agents:

Control Structure:

  • Central Coordinator: Single LLM making decisions for all agents

  • Agent Proxies: Individual agents executing coordinator commands

  • State Synchronization: Unified world state across all agents

  • Action Coordination: Prevents conflicts and ensures collaboration

Advantages:

  • Consistent decision-making across agents

  • Optimal resource allocation

  • Reduced communication overhead

  • Simplified coordination logic

from modes.centralized.centralized_agent import CentralizedAgent

# Create centralized coordinator
coordinator = CentralizedAgent(
    simulator=simulator,
    agent_id="mission_control",
    config=centralized_config
)

# Coordinator manages multiple agent proxies
# Each proxy represents a physical agent in the simulation
managed_agents = ["agent_1", "agent_2", "agent_3"]
coordinator.set_managed_agents(managed_agents)

Coordination Patterns

Task Decomposition:

The coordinator breaks complex tasks into subtasks:

# Example coordination decision
coordinator.set_task("Prepare dinner for the family")

# Coordinator might plan:
# Agent 1: "Go to refrigerator and get vegetables"
# Agent 2: "Find cooking pot in kitchen cabinet"
# Agent 3: "Set the dining table with plates and utensils"

# Execute coordinated plan
action = coordinator.generate_action()  # Returns multi-agent action
result = coordinator.execute_action(action)

Resource Management:

Prevents conflicts and ensures efficient resource usage:

# Coordinator automatically handles:
# - Agent spatial positioning
# - Object access conflicts
# - Tool sharing between agents
# - Sequential vs parallel task execution

Communication Optimization:

Reduces inter-agent communication through centralized planning:

# Traditional multi-agent: Multiple communication rounds
# Centralized: Single decision point with full information

Configuration

Centralized multi-agent configuration:

# centralized_config.yaml
agent_config:
  agent_class: "modes.centralized.centralized_agent.CentralizedAgent"
  managed_agents: ["agent_1", "agent_2"]  # Agents under coordination
  coordination_strategy: "optimal"         # Planning strategy

# Multi-agent specific settings
multi_agent:
  max_agents: 3                    # Maximum agents to coordinate
  coordination_timeout: 30         # Max time for coordination decisions
  conflict_resolution: "priority"  # How to handle conflicts

# Prompt settings for coordination
prompt_config:
  template: "centralized_v1"       # Coordination-specific prompts
  include_all_agent_states: true  # Include all agent info in prompts
  action_format: "multi_agent"     # Multi-agent action format

Advanced Coordination

Dynamic Task Assignment:

class AdvancedCentralizedAgent(CentralizedAgent):
    def generate_action(self):
        # Analyze current state
        agent_states = self.get_all_agent_states()
        task_priorities = self.analyze_task_priorities()

        # Dynamic assignment based on:
        # - Agent capabilities
        # - Current positions
        # - Task urgency
        # - Resource availability

        return self.optimize_agent_assignments(agent_states, task_priorities)

Performance Monitoring:

# Monitor coordination effectiveness
coordination_metrics = coordinator.get_coordination_metrics()

print(f"Agent utilization: {coordination_metrics['agent_utilization']}")
print(f"Task completion rate: {coordination_metrics['completion_rate']}")
print(f"Coordination overhead: {coordination_metrics['overhead_time']}")

Decentralized Multi-Agent Mode (Future)

Planned Architecture

Future decentralized mode will support:

Autonomous Agents:

  • Independent LLM instances for each agent

  • Peer-to-peer communication protocols

  • Distributed decision-making

  • Emergent coordination patterns

Communication Framework:

  • Message passing between agents

  • Negotiation and consensus mechanisms

  • Information sharing protocols

  • Conflict resolution strategies

Planned Features:

# Future decentralized agent example
from modes.decentralized.autonomous_agent import AutonomousAgent

# Each agent has independent reasoning
agent_1 = AutonomousAgent("worker_1", llm_config_1)
agent_2 = AutonomousAgent("worker_2", llm_config_2)

# Agents communicate directly
agent_1.send_message(agent_2.id, "I found the target object")
response = agent_2.receive_messages()

# Distributed task planning
joint_plan = agent_1.negotiate_plan(agent_2, shared_task)

Agent Integration Patterns

Custom Agent Development

Extend base agent classes for custom behaviors:

from core.base_agent import BaseAgent

class CustomAgent(BaseAgent):
    def __init__(self, simulator, agent_id, config):
        super().__init__(simulator, agent_id, config)
        self.custom_state = {}

    def generate_action(self):
        # Implement custom decision logic
        observations = self.get_observations()
        return self.custom_planning_algorithm(observations)

    def custom_planning_algorithm(self, observations):
        # Custom planning logic
        if self.needs_exploration():
            return self.explore_strategy()
        elif self.has_clear_objective():
            return self.direct_action_strategy()
        else:
            return self.reasoning_strategy()

Hybrid Agent Systems

Combine different agent types:

class HybridAgentSystem:
    def __init__(self):
        # Mix of agent types
        self.coordinator = CentralizedAgent(...)
        self.specialist = CustomAgent(...)
        self.backup = LLMAgent(...)

    def execute_task(self, task):
        # Use different agents for different subtasks
        if task.requires_coordination():
            return self.coordinator.execute_task(task)
        elif task.needs_specialist_knowledge():
            return self.specialist.execute_task(task)
        else:
            return self.backup.execute_task(task)

Performance Comparison

Agent Mode Trade-offs

Single Agent Mode:

Advantages: - Simple implementation and debugging - No coordination overhead - Clear responsibility assignment - Fast decision making

Disadvantages: - Limited to individual tasks - No collaboration benefits - May be inefficient for complex tasks - Single point of failure

Centralized Multi-Agent Mode:

Advantages: - Optimal coordination - Consistent decision-making - Efficient resource allocation - Good performance on collaborative tasks

Disadvantages: - Single point of failure (coordinator) - Scalability limitations - Higher computational requirements - Communication bottleneck

Decentralized Multi-Agent Mode (Planned):

Advantages: - Fault tolerance and robustness - Scalable to many agents - Emergent behaviors possible - Distributed processing

Disadvantages: - Communication complexity - Potential coordination failures - Higher system complexity - Difficult debugging and monitoring

Benchmarking Results

Based on evaluation across 1400+ scenarios:

Success Rate by Agent Mode:

Single Agent:     78.5% ± 2.1%
Centralized:      85.2% ± 1.8%

Average Steps by Task Type:

Direct Commands:
- Single Agent:   8.2 ± 1.4
- Centralized:    7.1 ± 1.2

Collaboration Tasks:
- Single Agent:   N/A (not applicable)
- Centralized:    14.8 ± 2.3

Best Practices

Mode Selection

Choose Single Agent Mode for:

  • Individual task completion

  • Baseline performance measurement

  • Simple interaction scenarios

  • Resource-constrained environments

Choose Centralized Mode for:

  • Collaborative task scenarios

  • When coordination is critical

  • Limited communication bandwidth

  • When consistency is paramount

Plan Decentralized Mode for:

  • Highly scalable scenarios

  • Fault-tolerant requirements

  • Emergent behavior research

  • Distributed decision-making studies

Implementation Guidelines

Configuration Management:

  • Use separate config files for each mode

  • Validate configuration compatibility

  • Document mode-specific parameters

  • Implement configuration validation

Error Handling:

  • Implement mode-specific error recovery

  • Handle coordination failures gracefully

  • Provide fallback mechanisms

  • Log mode-specific debugging information

Performance Optimization:

  • Profile mode-specific bottlenecks

  • Optimize communication patterns

  • Cache frequently accessed state

  • Monitor resource usage per mode

Testing and Validation:

  • Test each mode independently

  • Validate coordination mechanisms

  • Stress test with multiple agents

  • Compare performance across modes

API Reference

For complete API documentation, see: