Troubleshooting
This guide helps you diagnose and resolve common issues with OmniEmbodied. Issues are organized by category with detailed solutions and prevention tips.
Installation Issues
ImportError: No module named ‘OmniSimulator’
Symptoms:
- Python can’t find the OmniSimulator module
- Import statements fail
Causes:
- OmniSimulator package not installed correctly
- Python path issues
- Virtual environment problems
Solutions:
Reinstall OmniSimulator:
cd OmniEmbodied/OmniSimulator
pip install -e .
Check Python path:
import sys
print(sys.path)
# Ensure OmniEmbodied directory is in the path
Verify virtual environment:
which python
pip list | grep -i omnisimulator
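The path and environment checks above can also be run from inside Python, which confirms which interpreter is actually in use. The following is a minimal sketch using only the standard library; `describe_module` is a hypothetical helper name, not part of OmniEmbodied.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` is importable from the running interpreter."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for malformed or partially imported modules
        spec = None
    return {
        "module": name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
        "interpreter": sys.executable,
    }


if __name__ == "__main__":
    # If "found" is False, the package is not installed for this interpreter.
    print(describe_module("OmniSimulator"))
```

If `interpreter` does not point inside your virtual environment, the environment is probably not activated.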
Permission Denied Errors
Symptoms:
- Can’t write to directories
- Installation fails with permission errors
Solutions:
Use virtual environment (recommended):
python -m venv omniembodied-env
source omniembodied-env/bin/activate
pip install -e .
Install for current user only:
pip install --user -e .
Check directory permissions:
ls -la # Ensure you have write permissions
YAML Configuration Errors
Symptoms:
- “yaml.scanner.ScannerError” messages
- Configuration not loading
Solutions:
Validate YAML syntax:
python -c "import yaml; yaml.safe_load(open('config.yaml'))"
Check indentation (use spaces, not tabs):
# Correct
dataset:
  default: "eval_single"
# Incorrect (mixed tabs/spaces)
dataset:
	default: "eval_single"
Escape special characters:
# For strings with special characters
message: "Task: \"find the key\""
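Tab characters in indentation are hard to spot by eye. The sketch below scans a YAML text for them using only the standard library; `find_tab_indented_lines` is an illustrative helper name, not part of OmniEmbodied.

```python
import re


def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = re.match(r"[ \t]*", line).group(0)
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = 'dataset:\n\tdefault: "eval_single"\n'
    print(find_tab_indented_lines(sample))  # prints [2]
```

To check a real file, pass the contents of your configuration file (for example `open("config.yaml").read()`) instead of `sample`.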
Runtime Errors
Simulation Hangs or Times Out
Symptoms:
- Simulation appears stuck
- No progress for extended periods
- Timeout errors
Diagnostic Steps:
Enable debug logging:
import logging
logging.basicConfig(level=logging.DEBUG)
Check LLM connectivity:
curl -I https://api.openai.com/v1/models
# Or test your LLM endpoint
Monitor system resources:
top   # Linux/Mac
htop  # Enhanced version
# Check CPU, memory usage
Solutions:
Set reasonable timeouts:
execution:
  max_steps_per_task: 35
  timeout_seconds: 300
Check API rate limits:
llm_config:
  timeout: 30
  max_retries: 3
Use faster models for testing:
llm_config:
  model_name: "gpt-3.5-turbo"  # Faster than GPT-4
Invalid Action Errors
Symptoms:
- Agent attempts impossible actions
- Action validation failures
- “Action not allowed” messages
Diagnostic Steps:
Check action logs:
grep -i "action" simulation.log
Verify environment state:
# Add debug prints in your agent
print(f"Current room: {agent.current_room}")
print(f"Available objects: {environment.get_objects()}")
Solutions:
Improve agent prompting:
agent_config:
  environment_description:
    detail_level: 'full'
    show_object_properties: true
Add action validation:
# In custom agent code
if not self.can_execute_action(action, target):
    return self.fallback_action()
Enable step-by-step verification:
task_verification:
  enabled: true
  mode: "step_by_step"
LLM API Issues
Authentication Errors
Symptoms:
- “Invalid API key” errors
- “Authentication failed” messages
- HTTP 401 responses
Solutions:
Verify API key:
echo $OPENAI_API_KEY # Should show your actual API key
Test API access:
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/models
Check key permissions:
- Ensure API key has required permissions
- Check account billing status
- Verify key hasn’t expired
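Echoing the full key leaves it in your terminal history and in any logs you share. As an alternative, this sketch reports whether the variable is set and shows only a masked form. `describe_api_key` is a hypothetical helper, and the masking format is an arbitrary choice.

```python
import os


def describe_api_key(env_var="OPENAI_API_KEY"):
    """Describe an API key environment variable without revealing its value."""
    value = os.environ.get(env_var, "")
    if not value:
        return f"{env_var} is not set"
    if value != value.strip():
        # Stray whitespace from copy/paste is a common cause of 401 errors
        return f"{env_var} has leading or trailing whitespace"
    masked = value[:3] + "..." + value[-4:] if len(value) > 8 else "***"
    return f"{env_var} is set ({len(value)} characters, {masked})"


if __name__ == "__main__":
    print(describe_api_key())
```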
Rate Limit Errors
Symptoms:
- “Rate limit exceeded” messages
- HTTP 429 responses
- Slow or failed requests
Solutions:
Reduce request frequency:
parallel_evaluation:
  scenario_parallelism:
    max_parallel_scenarios: 2  # Reduce from default
Add request delays:
import time
time.sleep(1)  # Add delay between requests
Upgrade API plan:
- Consider higher tier for increased limits
- Monitor usage in API provider dashboard
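A fixed delay between requests is often slower than necessary. A common alternative is to retry only after a rate-limit failure, waiting longer each time. The sketch below shows exponential backoff with jitter. `RateLimitError` is a hypothetical stand-in for whatever exception your LLM client raises on HTTP 429, and `call_with_backoff` is not an OmniEmbodied function.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 exception."""


def call_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In normal use, call `call_with_backoff(lambda: client_request(...))` with your own request function.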
Model Not Found Errors
Symptoms:
- “Model not found” errors
- Invalid model name responses
Solutions:
Check available models:
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/models
Use correct model names:
llm_config:
  model_name: "gpt-4-turbo-preview"  # Check exact name
Verify model access:
- Some models require special access
- Check account eligibility
Performance Issues
Slow Simulation Speed
Symptoms:
- Simulations take much longer than expected
- High CPU or memory usage
- System becomes unresponsive
Diagnostic Tools:
Profile execution:
import cProfile
pr = cProfile.Profile()
pr.enable()
# Run simulation
pr.disable()
pr.print_stats()
Monitor resources:
# Memory usage
ps aux | grep python
# Disk I/O
iotop
# Network activity
netstat -i
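Unsorted profiler output can be long and hard to read. As a sketch of the profiling step, the helper below wraps any callable, sorts the results by cumulative time, and returns only the top entries as text. `profile_call` is an illustrative name; pass it whatever function starts your simulation.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report of the `top` slowest entries)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(sorted, range(100000), reverse=True)
    print(report)
```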
Solutions:
Optimize configuration:
agent_config:
  max_history: 10  # Reduce from default 20
execution:
  max_steps_per_task: 25  # Reduce if appropriate
Use parallel processing wisely:
parallel_evaluation:
  scenario_parallelism:
    max_parallel_scenarios: 4  # Based on your CPU cores
Clean up regularly:
# Remove old logs
find . -name "*.log" -mtime +7 -delete
# Clear temporary files
rm -rf /tmp/omniembodied_*
Memory Issues
Symptoms:
- “Out of memory” errors
- System swapping excessively
- Process killed by OS
Solutions:
Reduce memory usage:
agent_config:
  max_history: 5  # Smaller history
logging:
  level: "WARNING"  # Less verbose logging
Process scenarios in batches:
# Instead of processing all at once
scenarios = get_all_scenarios()
batch_size = 10
for i in range(0, len(scenarios), batch_size):
    batch = scenarios[i:i+batch_size]
    process_batch(batch)
Monitor memory usage:
import psutil
process = psutil.Process()
print(f"Memory usage: {process.memory_info().rss / 1024 / 1024:.2f} MB")
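If psutil is not installed, the standard library's `tracemalloc` gives a rough picture of peak memory for a single call. This is a sketch under one important caveat: `tracemalloc` counts only memory allocated through Python's allocator, so its figures will be lower than the process RSS reported above. `measure_peak_memory` is a hypothetical helper name.

```python
import tracemalloc


def measure_peak_memory(func, *args, **kwargs):
    """Run func and return (result, peak Python-allocated memory in MB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


if __name__ == "__main__":
    _, peak_mb = measure_peak_memory(lambda: [0] * 1_000_000)
    print(f"Peak traced memory: {peak_mb:.2f} MB")
```

Wrapping one batch at a time with this helper can show whether memory grows with batch size.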
Data and File Issues
Missing Dataset Files
Symptoms:
- “File not found” errors for scenarios
- Empty evaluation results
Solutions:
Verify data directory structure:
ls -la data/ # Should contain eval/, sft/, data-all/ directories
Check file paths in configuration:
dataset:
  default: "eval_single"  # Must match directory structure
Download missing data:
# If data is in a separate repository
git submodule update --init --recursive
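The directory check can be scripted so it runs before every evaluation. The sketch below assumes the three subdirectories named above (`eval/`, `sft/`, `data-all/`) are the complete expected layout; adjust `EXPECTED_SUBDIRS` if your checkout differs. `missing_data_dirs` is an illustrative helper, not part of OmniEmbodied.

```python
from pathlib import Path

# Assumed layout, taken from the directory listing described above
EXPECTED_SUBDIRS = ("eval", "sft", "data-all")


def missing_data_dirs(data_root="data", expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectory names that are absent under data_root."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    print("All data directories present" if not missing else f"Missing: {missing}")
```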
Corrupted JSON Files
Symptoms:
- JSON parsing errors
- “Invalid JSON” messages
- Partial data loading
Diagnostic Steps:
Validate JSON files:
python -m json.tool scenario.json > /dev/null
echo $?  # Should be 0 for valid JSON
Find corrupted files:
find data/ -name "*.json" -exec sh -c 'python -m json.tool "$1" > /dev/null || echo "Invalid: $1"' _ {} \;
Solutions:
Restore from backup:
git checkout HEAD -- data/corrupted_file.json
Fix manually:
- Use JSON validator to identify issues
- Common problems: missing commas, unescaped quotes
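When fixing files by hand, it helps to know where in the file the parser gave up. As a portable alternative to the shell one-liner, the following sketch walks a directory and reports each invalid file together with the parser's message, which includes the line and column. `find_invalid_json` is a hypothetical helper name.

```python
import json
from pathlib import Path


def find_invalid_json(root):
    """Return (path, error message) pairs for every unparsable .json file under root."""
    problems = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            problems.append((str(path), str(exc)))
    return problems


if __name__ == "__main__":
    for path, message in find_invalid_json("data"):
        print(f"Invalid: {path}: {message}")
```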
Logging and Debugging
Enable Detailed Logging
For general debugging:
import logging
# Enable debug for all modules
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
For specific components:
# Simulator core
logging.getLogger("OmniSimulator.core").setLevel(logging.DEBUG)
# Agent decisions
logging.getLogger("modes.single_agent").setLevel(logging.DEBUG)
# LLM interactions
logging.getLogger("llm").setLevel(logging.DEBUG)
In configuration file:
logging:
  level: "DEBUG"
  show_llm_details: true
Save Debug Information
# Save detailed state
import json
debug_info = {
    'agent_state': agent.get_state(),
    'environment_state': env.get_state(),
    'action_history': agent.get_history(),
    'error_context': str(exception)
}
with open('debug_output.json', 'w') as f:
    json.dump(debug_info, f, indent=2)
Getting Help
Before asking for help, collect:
System information:
python --version
pip list | grep -E "(omni|llm|yaml)"
uname -a  # Linux/Mac
# Windows: systeminfo
Error details:
- Complete error messages
- Stack traces
- Configuration files (remove sensitive data)
- Steps to reproduce
Log files:
- Enable debug logging
- Include relevant log excerpts
- Timestamp information
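The shell commands for system information differ between Linux, macOS, and Windows. The sketch below gathers roughly the same information on any platform using only the standard library. `collect_environment_info` is an illustrative helper, and the keyword filter mirrors the `grep` pattern above.

```python
import platform
import sys
from importlib import metadata


def collect_environment_info(keywords=("omni", "llm", "yaml")):
    """Return Python version, platform string, and installed packages matching keywords."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name") or ""
        except Exception:
            continue  # Skip distributions with unreadable metadata
        if any(key in name.lower() for key in keywords):
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": packages,
    }


if __name__ == "__main__":
    for key, value in collect_environment_info().items():
        print(f"{key}: {value}")
```

Paste the output into your bug report under the environment section.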
Where to get help:
Check this troubleshooting guide first
Search existing GitHub issues
Create new issue with detailed information
Ask in GitHub Issues for usage questions
Creating effective bug reports:
Clear title: Describe the problem concisely
Environment: System details, versions
Steps to reproduce: Exact sequence of actions
Expected vs actual: What should happen vs what does
Logs and errors: Relevant error messages
Minimal example: Simplest case that shows the problem
Common Error Patterns
Pattern: “Attribute ‘X’ not found”
- Usually indicates missing configuration
- Check spelling and indentation in YAML
- Verify all required fields are present
Pattern: “Connection refused” or “Timeout”
- Network connectivity issues
- API endpoint problems
- Firewall or proxy blocking requests
Pattern: “Permission denied”
- File system permissions
- Virtual environment not activated
- Trying to modify read-only files
Pattern: “Module not found”
- Installation incomplete
- Python path issues
- Wrong virtual environment
Remember: most issues have been encountered before. Take time to search existing solutions before creating new issues.