Performance Guide
This comprehensive guide covers performance optimization, monitoring, and best practices for the homodyne package.
Performance Overview (v0.6.5+)
The homodyne package includes performance optimizations for classical and robust optimization methods. Key features include JIT compilation, vectorized NumPy operations, performance monitoring, and automated benchmarking.
Key Performance Features
- JIT Compilation (Numba)
3-5x speedup for core computational kernels
Automatic warmup and caching
Optimized for chi-squared calculations and correlation functions
- Vectorized NumPy Operations
High-performance array computations
Optimized memory access patterns
- Performance Monitoring
Built-in profiling decorators
Memory usage tracking
Performance regression detection
Automated benchmarking with statistical analysis
- Optimization-Specific Performance
Classical: Optimized angle filtering, vectorized operations
Robust: CVXPY solver optimization, caching, progressive optimization
Method Performance Comparison
Speed Ranking (fastest to slowest):
Classical Optimization (Nelder-Mead, Gurobi) - ~seconds to minutes - Best for: Exploratory analysis, parameter screening - Trade-offs: No uncertainty quantification, sensitive to local minima
Robust Optimization (Wasserstein DRO, Scenario-based, Ellipsoidal) - ~2-5x classical - Best for: Noisy data, outlier resistance, measurement uncertainty - Trade-offs: Slower than classical, requires CVXPY
Best for: Full uncertainty quantification, publication-quality results
Trade-offs: Slowest method, requires careful convergence assessment
Performance Optimization Strategies
Classical Optimization
Angle Filtering Optimization:
# Enable smart angle filtering for faster optimization
config = {
"optimization_config": {
"angle_filtering": {
"enabled": True,
"target_ranges": [[-10, 10], [170, 190]]
}
}
}
Gurobi Trust Region Optimization:
# Iterative Gurobi with trust region for improved convergence
config = {
"optimization_config": {
"classical_optimization": {
"methods": ["Gurobi", "Nelder-Mead"], # Gurobi with trust regions tried first
"method_options": {
"Gurobi": {
"max_iterations": 50, # Outer trust region iterations
"tolerance": 1e-6,
"trust_region_initial": 0.1,
"trust_region_min": 1e-8,
"trust_region_max": 1.0
}
}
}
}
}
Robust Optimization
Solver Optimization:
# CLARABEL is typically fastest, followed by SCS
config = {
"optimization_config": {
"robust_optimization": {
"solver_settings": {
"preferred_solver": "CLARABEL",
"enable_caching": True,
"enable_progressive_optimization": True
}
}
}
}
Method Selection by Speed:
Ellipsoidal - Fastest robust method
Wasserstein DRO - Moderate speed, good uncertainty modeling
Scenario-based - Slowest, most robust to outliers
Optimization Performance Configuration
Classical Optimization Configuration:
# Configure for optimal CPU performance
config = {
"optimization_config": {
"classical_optimization": {
"methods": ["Nelder-Mead"],
"method_options": {
"Nelder-Mead": {
"maxiter": 5000,
"xatol": 1e-6,
"fatol": 1e-6
}
}
}
},
"performance_settings": {
"num_threads": 4, # Multi-core CPU parallelism
"enable_jit": True, # Numba JIT compilation
"data_type": "float64" # Precision control
}
}
Optimization Strategy by Problem Size:
# Static mode (3 parameters) - Faster convergence
static_config = {
"optimization_config": {
"classical_optimization": {
"methods": ["Nelder-Mead"],
"method_options": {
"Nelder-Mead": {"maxiter": 2000}
}
}
}
}
# Laminar flow (7 parameters) - More iterations needed
flow_config = {
"optimization_config": {
"classical_optimization": {
"methods": ["Nelder-Mead"],
"method_options": {
"Nelder-Mead": {"maxiter": 5000}
}
}
},
"performance_settings": {
"num_threads": 8 # More parallelism for complex problems
}
}
Memory Optimization:
# For memory-constrained systems
memory_config = {
"draws": 5000,
"tune": 1000,
"thin": 5, # Effective samples: 1000, lower memory usage
"chains": 2
}
Performance Monitoring
Built-in Profiling
Function-level Monitoring:
from homodyne.core.profiler import performance_monitor
@performance_monitor(monitor_memory=True, log_threshold_seconds=0.5)
def my_analysis_function(data):
return process_data(data)
# Get performance statistics
from homodyne.core.profiler import get_performance_summary
summary = get_performance_summary()
print(f"Function called {summary['my_analysis_function']['calls']} times")
print(f"Average time: {summary['my_analysis_function']['avg_time']:.3f}s")
Benchmarking Utilities:
from homodyne.core.profiler import stable_benchmark
# Reliable performance measurement with statistical analysis
results = stable_benchmark(my_function, warmup_runs=5, measurement_runs=15)
print(f"Mean time: {results['mean']:.4f}s, CV: {results['std']/results['mean']:.3f}")
Performance Testing
Automated Performance Tests:
# Run performance validation
python -m pytest -m performance
# Run regression detection
python -m pytest -m regression
# Benchmark with statistical analysis
python -m pytest -m benchmark --benchmark-only
Performance Baselines:
The package maintains performance baselines with excellent stability:
Chi-squared calculation: ~0.8-1.2ms (CV ≤ 0.09)
Correlation calculation: ~0.26-0.28ms (CV ≤ 0.16)
Memory efficiency: Automatic cleanup prevents >50MB accumulation
Stability: 95%+ improvement in coefficient of variation
Environment Optimization
Threading Configuration:
# Conservative threading for numerical stability (automatically set)
export NUMBA_NUM_THREADS=4
export OPENBLAS_NUM_THREADS=4
JIT Optimization:
# Balanced optimization (automatically configured)
export NUMBA_FASTMATH=0 # Disabled for numerical stability
export NUMBA_LOOP_VECTORIZE=1
export NUMBA_OPT=2 # Moderate optimization level
Memory Management:
# Numba caching for faster startup
export NUMBA_CACHE_DIR=~/.numba_cache
Troubleshooting Performance Issues
Common Issues and Solutions:
Enable JIT compilation: Already included with Numba
Reduce problem size: Use angle filtering
High Memory Usage - Use progressive optimization:
"enable_progressive_optimization": true- Monitor with:@performance_monitor(monitor_memory=True)Classical Optimization Convergence - Try improved Gurobi solver:
pip install gurobipy(requires license, uses iterative trust region) - Adjust tolerances: Lowerxatolandfatolin config - Enable angle filtering: Reduces parameter space complexity - Configure trust region: Adjusttrust_region_initialin Gurobi optionsRobust Optimization Solver Issues - Install preferred solvers:
pip install clarabel- Enable fallback:"fallback_to_classical": true- Adjust regularization: Lowerregularization_alpha
Performance Profiling:
# Profile a complete analysis
from homodyne.core.profiler import performance_monitor
@performance_monitor(monitor_memory=True)
def full_analysis():
analysis = HomodyneAnalysisCore(config)
return analysis.optimize_all()
result = full_analysis()
# Check logs for performance breakdown
Best Practices
Development Workflow:
Start with classical methods for rapid prototyping
Use angle filtering to reduce computational complexity
Enable robust methods for noisy/uncertain data
Monitor performance with built-in profiling tools
Production Deployment:
Install performance extras:
pip install homodyne-analysis[performance]Configure environment variables for optimal threading
Enable caching in robust optimization settings
Validate with benchmarks before deployment
Code Quality and Maintenance
Code Quality Standards (v0.6.5+):
The homodyne package maintains high code quality standards with comprehensive tooling:
Formatting and Style:
# All code formatted with Black (88-character line length)
black homodyne --line-length 88
# Import sorting with isort
isort homodyne --profile black
# Linting with flake8
flake8 homodyne --max-line-length 88
# Type checking with mypy
mypy homodyne --ignore-missing-imports
Quality Improvements (Recent):
✅ Black formatting: 100% compliant across all files
✅ Import organization: Consistent import sorting with isort
✅ Code reduction: Removed 308 lines of unused fallback implementations
✅ Type annotations: Improved import patterns to resolve mypy warnings
✅ Critical fixes: Resolved comparison operators and missing function definitions
Code Statistics:
Tool |
Status |
Issues |
Notes |
|---|---|---|---|
Black |
✅ 100% |
0 |
88-char line length |
isort |
✅ 100% |
0 |
Sorted and optimized |
flake8 |
⚠️ ~400 |
E501, F401 |
Mostly line length and data scripts |
mypy |
⚠️ ~285 |
Various |
Missing library stubs, annotations |
Development Workflow:
Pre-commit hooks: Automatic formatting and linting
Continuous integration: Code quality checks on all PRs
Performance regression detection: Automated benchmarking
Test coverage: Comprehensive test suite with 95%+ coverage
Documentation: Sphinx-based documentation with examples
Performance and Quality Balance:
The package achieves both high performance and maintainable code through:
Optimized algorithms: Trust region Gurobi, vectorized operations
Clean architecture: Modular design with clear separation of concerns
Comprehensive testing: Unit, integration, and performance tests
Documentation: Detailed API documentation and user guides
The homodyne package is designed for high-performance scientific computing with comprehensive optimization strategies and maintainable, high-quality code.