Performance Guide

This guide covers performance optimization, monitoring, and best practices for the homodyne package.

Performance Overview (v0.6.5+)

The homodyne package includes performance optimizations for classical and robust optimization methods. Key features include JIT compilation, vectorized NumPy operations, performance monitoring, and automated benchmarking.

Key Performance Features

JIT Compilation (Numba)
  • 3-5x speedup for core computational kernels

  • Automatic warmup and caching

  • Optimized for chi-squared calculations and correlation functions

Vectorized NumPy Operations
  • High-performance array computations

  • Optimized memory access patterns

Performance Monitoring
  • Built-in profiling decorators

  • Memory usage tracking

  • Performance regression detection

  • Automated benchmarking with statistical analysis

Optimization-Specific Performance
  • Classical: Optimized angle filtering, vectorized operations

  • Robust: CVXPY solver optimization, caching, progressive optimization
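As a rough illustration of the JIT and vectorization features listed above, the sketch below computes the same chi-squared sum twice: once as an explicit loop that Numba can compile, and once as a vectorized NumPy expression. The function names are hypothetical and are not part of the homodyne API. If Numba is not installed, the loop version simply runs as plain Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no speedup in that case).
    def njit(func):
        return func


@njit
def chi_squared_loop(observed, model, sigma):
    """Explicit loop; Numba compiles this on the first call."""
    total = 0.0
    for i in range(observed.shape[0]):
        residual = (observed[i] - model[i]) / sigma[i]
        total += residual * residual
    return total


def chi_squared_vectorized(observed, model, sigma):
    """Same quantity expressed as whole-array NumPy operations."""
    residuals = (observed - model) / sigma
    return float(np.dot(residuals, residuals))


observed = np.array([1.0, 2.0, 3.0])
model = np.array([1.1, 1.9, 3.2])
sigma = np.array([0.1, 0.1, 0.2])

# Warmup: the first call pays the compilation cost, later calls do not.
chi_squared_loop(observed, model, sigma)

loop_value = chi_squared_loop(observed, model, sigma)
vector_value = chi_squared_vectorized(observed, model, sigma)
print(f"loop: {loop_value:.6f}, vectorized: {vector_value:.6f}")
```

Calling the compiled function once on small arrays of the same dtype before timing anything keeps compilation time out of the measurements.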

Method Performance Comparison

Speed Ranking (fastest to slowest):

  1. Classical Optimization (Nelder-Mead, Gurobi) - roughly seconds to minutes

    • Best for: Exploratory analysis, parameter screening

    • Trade-offs: No uncertainty quantification, sensitive to local minima

  2. Robust Optimization (Wasserstein DRO, Scenario-based, Ellipsoidal) - roughly 2-5x the classical run time

    • Best for: Noisy data, outlier resistance, measurement uncertainty

    • Trade-offs: Slower than classical, requires CVXPY

  3. Sampling-based analysis (see the draws/tune/chains settings under Memory Optimization)

    • Best for: Full uncertainty quantification, publication-quality results

    • Trade-offs: Slowest method, requires careful convergence assessment

Performance Optimization Strategies

Classical Optimization

Angle Filtering Optimization:

# Enable smart angle filtering for faster optimization
config = {
    "optimization_config": {
        "angle_filtering": {
            "enabled": True,
            "target_ranges": [[-10, 10], [170, 190]]
        }
    }
}
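To make the effect of target_ranges concrete, here is a minimal sketch of the selection step, assuming angles are given in degrees and compared directly against each [low, high] range without wrapping modulo 360. The helper filter_angles is hypothetical and only illustrates the idea. It is not the package's implementation.

```python
from typing import List, Sequence


def filter_angles(
    angles: Sequence[float],
    target_ranges: Sequence[Sequence[float]],
) -> List[int]:
    """Return indices of angles that fall inside any [low, high] range."""
    selected = []
    for index, angle in enumerate(angles):
        if any(low <= angle <= high for low, high in target_ranges):
            selected.append(index)
    return selected


target_ranges = [[-10, 10], [170, 190]]
angles = [0.0, 45.0, 175.0, 200.0, -5.0]

kept = filter_angles(angles, target_ranges)
print(f"kept {len(kept)} of {len(angles)} angles: indices {kept}")
# prints: kept 3 of 5 angles: indices [0, 2, 4]
```

Fewer angles means fewer terms in each chi-squared evaluation, which is where the speedup would come from.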

Gurobi Trust Region Optimization:

# Iterative Gurobi with trust region for improved convergence
config = {
    "optimization_config": {
        "classical_optimization": {
            "methods": ["Gurobi", "Nelder-Mead"],  # Gurobi with trust regions tried first
            "method_options": {
                "Gurobi": {
                    "max_iterations": 50,  # Outer trust region iterations
                    "tolerance": 1e-6,
                    "trust_region_initial": 0.1,
                    "trust_region_min": 1e-8,
                    "trust_region_max": 1.0
                }
            }
        }
    }
}
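The trust region options above can be read against a generic textbook trust-region loop. The sketch below is a one-dimensional illustration under stated assumptions (a convex objective with known gradient and curvature, and the common 0.25/0.75 acceptance thresholds). It does not call Gurobi and is not the package's solver. It only shows how trust_region_initial, trust_region_min, trust_region_max, tolerance, and max_iterations typically interact.

```python
def trust_region_minimize(f, grad, hess, x0, options):
    """Minimize a convex 1-D function with a basic trust-region scheme."""
    x = x0
    radius = options["trust_region_initial"]
    iterations = 0
    for _ in range(options["max_iterations"]):
        iterations += 1
        g, h = grad(x), hess(x)  # h is assumed positive (convex model)
        # Minimizer of the quadratic model, clipped to the trust region.
        step = max(-radius, min(radius, -g / h))
        if abs(step) < options["tolerance"]:
            break
        predicted = -(g * step + 0.5 * h * step * step)
        actual = f(x) - f(x + step)
        rho = actual / predicted if predicted > 0 else 0.0
        if rho > 0.75 and abs(step) >= radius:
            # Model was accurate and the step hit the boundary: expand.
            radius = min(2.0 * radius, options["trust_region_max"])
        elif rho < 0.25:
            # Model was poor: shrink the region.
            radius = max(0.25 * radius, options["trust_region_min"])
        if rho > 0:
            x += step  # accept only steps that reduce the objective
    return x, iterations


options = {
    "max_iterations": 50,
    "tolerance": 1e-6,
    "trust_region_initial": 0.1,
    "trust_region_min": 1e-8,
    "trust_region_max": 1.0,
}

x_best, n_iter = trust_region_minimize(
    f=lambda x: (x - 3.0) ** 2,
    grad=lambda x: 2.0 * (x - 3.0),
    hess=lambda x: 2.0,
    x0=0.0,
    options=options,
)
print(f"minimum near x = {x_best:.6f} after {n_iter} iterations")
```

A small trust_region_initial makes the first steps cautious. The radius then grows toward trust_region_max as long as the model keeps predicting the objective well.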

Robust Optimization

Solver Optimization:

# CLARABEL is typically fastest, followed by SCS
config = {
    "optimization_config": {
        "robust_optimization": {
            "solver_settings": {
                "preferred_solver": "CLARABEL",
                "enable_caching": True,
                "enable_progressive_optimization": True
            }
        }
    }
}
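One way to think about preferred_solver is as the head of a preference list that is checked against whatever solvers are actually installed. The helper below is a hypothetical sketch of that selection logic, not the package's code. With CVXPY available, the installed list could come from cvxpy.installed_solvers().

```python
from typing import Iterable, Optional, Sequence


def pick_solver(
    preferred: str,
    installed: Iterable[str],
    fallback_order: Sequence[str] = ("CLARABEL", "SCS"),
) -> Optional[str]:
    """Return the preferred solver if installed, else the first fallback."""
    installed_set = set(installed)
    for name in (preferred, *fallback_order):
        if name in installed_set:
            return name
    return None  # nothing usable; the caller decides how to proceed


# Example with a hand-written list standing in for the installed solvers.
choice = pick_solver("CLARABEL", ["SCS", "OSQP"])
print(f"selected solver: {choice}")
# prints: selected solver: SCS
```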

Method Selection by Speed:

  1. Ellipsoidal - Fastest robust method

  2. Wasserstein DRO - Moderate speed, good uncertainty modeling

  3. Scenario-based - Slowest, most robust to outliers

Optimization Performance Configuration

Classical Optimization Configuration:

# Configure for optimal CPU performance
config = {
    "optimization_config": {
        "classical_optimization": {
            "methods": ["Nelder-Mead"],
            "method_options": {
                "Nelder-Mead": {
                    "maxiter": 5000,
                    "xatol": 1e-6,
                    "fatol": 1e-6
                }
            }
        }
    },
    "performance_settings": {
        "num_threads": 4,              # Multi-core CPU parallelism
        "enable_jit": True,            # Numba JIT compilation
        "data_type": "float64"         # Precision control
    }
}

Optimization Strategy by Problem Size:

# Static mode (3 parameters) - Faster convergence
static_config = {
    "optimization_config": {
        "classical_optimization": {
            "methods": ["Nelder-Mead"],
            "method_options": {
                "Nelder-Mead": {"maxiter": 2000}
            }
        }
    }
}

# Laminar flow (7 parameters) - More iterations needed
flow_config = {
    "optimization_config": {
        "classical_optimization": {
            "methods": ["Nelder-Mead"],
            "method_options": {
                "Nelder-Mead": {"maxiter": 5000}
            }
        }
    },
    "performance_settings": {
        "num_threads": 8  # More parallelism for complex problems
    }
}
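Because the static and flow configurations differ in only a few keys, it can be convenient to keep one base configuration and overlay per-problem overrides. The sketch below uses a hypothetical merge_config helper built on the standard library. It is an illustration, not part of homodyne.

```python
import copy
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay override onto a copy of base."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base_config = {
    "optimization_config": {
        "classical_optimization": {
            "methods": ["Nelder-Mead"],
            "method_options": {"Nelder-Mead": {"maxiter": 2000, "xatol": 1e-6}},
        }
    },
    "performance_settings": {"num_threads": 4, "enable_jit": True},
}

# Overrides for the 7-parameter laminar flow case.
flow_overrides = {
    "optimization_config": {
        "classical_optimization": {
            "method_options": {"Nelder-Mead": {"maxiter": 5000}}
        }
    },
    "performance_settings": {"num_threads": 8},
}

flow_config = merge_config(base_config, flow_overrides)
nelder_mead = flow_config["optimization_config"]["classical_optimization"][
    "method_options"
]["Nelder-Mead"]
print(nelder_mead, flow_config["performance_settings"])
```

The base dictionary is left untouched, so the same base can be reused for the static case.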

Memory Optimization:

# For memory-constrained systems
memory_config = {
    "draws": 5000,
    "tune": 1000,
    "thin": 5,        # Keeps every 5th draw: 1000 retained per chain, lower memory usage
    "chains": 2
}
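The arithmetic behind these settings can be checked directly. The sketch below assumes that thinning keeps every thin-th draw, that tuning draws are discarded, and that each retained value is stored as float64 (8 bytes). The helper names are hypothetical.

```python
def retained_samples(draws: int, thin: int, chains: int):
    """Return (samples kept per chain, samples kept in total)."""
    per_chain = draws // thin
    return per_chain, per_chain * chains


def sample_memory_bytes(total_samples: int, n_parameters: int, bytes_per_value: int = 8) -> int:
    """Approximate storage for the retained parameter samples."""
    return total_samples * n_parameters * bytes_per_value


memory_config = {"draws": 5000, "tune": 1000, "thin": 5, "chains": 2}

per_chain, total = retained_samples(
    memory_config["draws"], memory_config["thin"], memory_config["chains"]
)
size = sample_memory_bytes(total, n_parameters=7)  # 7-parameter laminar flow model
print(f"{per_chain} per chain, {total} total, about {size} bytes of samples")
# prints: 1000 per chain, 2000 total, about 112000 bytes of samples
```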

Performance Monitoring

Built-in Profiling

Function-level Monitoring:

from homodyne.core.profiler import get_performance_summary, performance_monitor

@performance_monitor(monitor_memory=True, log_threshold_seconds=0.5)
def my_analysis_function(data):
    return process_data(data)  # process_data stands in for your own analysis code

# Get performance statistics (after my_analysis_function has run at least once)
summary = get_performance_summary()
print(f"Function called {summary['my_analysis_function']['calls']} times")
print(f"Average time: {summary['my_analysis_function']['avg_time']:.3f}s")
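For readers curious how a decorator of this kind can work, here is a minimal standard-library sketch. It is not the homodyne implementation. The names timing_monitor and get_summary are hypothetical, and the memory tracking is simplified (nested monitored calls would interfere with each other because tracemalloc is started and stopped per call).

```python
import functools
import time
import tracemalloc
from collections import defaultdict

_stats = defaultdict(lambda: {"calls": 0, "total_time": 0.0, "peak_bytes": 0})


def timing_monitor(monitor_memory=False):
    """Record call count, total time, and optionally peak memory per function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if monitor_memory:
                tracemalloc.start()
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = _stats[func.__name__]
                entry["calls"] += 1
                entry["total_time"] += time.perf_counter() - start
                if monitor_memory:
                    _, peak = tracemalloc.get_traced_memory()
                    tracemalloc.stop()
                    entry["peak_bytes"] = max(entry["peak_bytes"], peak)
        return wrapper
    return decorator


def get_summary():
    """Return per-function statistics including the average time per call."""
    return {
        name: {**entry, "avg_time": entry["total_time"] / entry["calls"]}
        for name, entry in _stats.items()
    }


@timing_monitor(monitor_memory=True)
def build_squares(n):
    return [i * i for i in range(n)]


build_squares(1000)
build_squares(1000)
summary = get_summary()
print(f"build_squares called {summary['build_squares']['calls']} times")
```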

Benchmarking Utilities:

from homodyne.core.profiler import stable_benchmark

# Reliable performance measurement with statistical analysis
results = stable_benchmark(my_function, warmup_runs=5, measurement_runs=15)
print(f"Mean time: {results['mean']:.4f}s, CV: {results['std']/results['mean']:.3f}")
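The warmup-then-measure pattern can be sketched with the standard library alone. The function simple_benchmark below is a hypothetical stand-in that mirrors the warmup_runs and measurement_runs arguments shown above. The real stable_benchmark may do more (for example outlier handling), which this sketch does not attempt.

```python
import statistics
import time


def simple_benchmark(func, warmup_runs=5, measurement_runs=15):
    """Time func after discarding warmup calls; report mean, std, and CV."""
    for _ in range(warmup_runs):
        func()  # untimed: absorbs JIT compilation and cache effects

    times = []
    for _ in range(measurement_runs):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)

    mean = statistics.mean(times)
    std = statistics.stdev(times) if len(times) > 1 else 0.0
    cv = std / mean if mean > 0 else 0.0
    return {"mean": mean, "std": std, "cv": cv, "times": times}


call_count = 0


def workload():
    global call_count
    call_count += 1
    return sum(range(10_000))


results = simple_benchmark(workload, warmup_runs=5, measurement_runs=15)
print(f"Mean time: {results['mean']:.6f}s, CV: {results['cv']:.3f}")
```

A low CV (standard deviation divided by mean) indicates that the timings are stable enough to compare against a baseline.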

Performance Testing

Automated Performance Tests:

# Run performance validation
python -m pytest -m performance

# Run regression detection
python -m pytest -m regression

# Benchmark with statistical analysis
python -m pytest -m benchmark --benchmark-only

Performance Baselines:

The package tracks the following performance baselines (CV is the coefficient of variation, std / mean):

  • Chi-squared calculation: ~0.8-1.2ms (CV ≤ 0.09)

  • Correlation calculation: ~0.26-0.28ms (CV ≤ 0.16)

  • Memory efficiency: Automatic cleanup prevents >50MB accumulation

  • Stability: 95%+ improvement in coefficient of variation
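A baseline of this form can be turned into a simple pass/fail check. The sketch below uses the chi-squared limits quoted above (upper bound 1.2 ms, CV ≤ 0.09). The timing lists are made-up inputs for illustration only, and check_against_baseline is a hypothetical helper rather than the package's regression test.

```python
import statistics


def check_against_baseline(times_ms, max_mean_ms, max_cv):
    """Compare measured timings against a mean limit and a CV limit."""
    mean = statistics.mean(times_ms)
    cv = statistics.stdev(times_ms) / mean
    return {"mean_ms": mean, "cv": cv, "ok": mean <= max_mean_ms and cv <= max_cv}


# Illustrative inputs, not real measurements.
stable_run = check_against_baseline([1.0, 1.1, 0.9, 1.0], max_mean_ms=1.2, max_cv=0.09)
noisy_run = check_against_baseline([1.0, 1.6, 0.8, 1.2], max_mean_ms=1.2, max_cv=0.09)

print(f"stable: cv={stable_run['cv']:.3f} ok={stable_run['ok']}")
print(f"noisy:  cv={noisy_run['cv']:.3f} ok={noisy_run['ok']}")
```

The second run passes the mean limit but fails on CV, which is the kind of instability a regression check is meant to flag.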

Environment Optimization

Threading Configuration:

# Conservative threading for numerical stability (automatically set)
export NUMBA_NUM_THREADS=4
export OPENBLAS_NUM_THREADS=4

JIT Optimization:

# Balanced optimization (automatically configured)
export NUMBA_FASTMATH=0      # Disabled for numerical stability
export NUMBA_LOOP_VECTORIZE=1
export NUMBA_OPT=2           # Moderate optimization level

Memory Management:

# Numba caching for faster startup
export NUMBA_CACHE_DIR=~/.numba_cache
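The same settings can be applied from Python when shell exports are inconvenient, as long as this happens before Numba and NumPy are imported, since the libraries read these variables when they load. The helper apply_defaults below is a hypothetical sketch. It only fills in values that are not already set, so explicit shell exports still take precedence.

```python
import os

THREAD_DEFAULTS = {
    "NUMBA_NUM_THREADS": "4",
    "OPENBLAS_NUM_THREADS": "4",
}


def apply_defaults(env, defaults):
    """Set each variable only if it is missing; return what was applied."""
    applied = {}
    for name, value in defaults.items():
        if name not in env:
            env[name] = value
            applied[name] = value
    return applied


# Run this at the very top of a script, before importing numba or numpy.
applied = apply_defaults(os.environ, THREAD_DEFAULTS)
print(f"applied defaults: {sorted(applied)}")
```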

Troubleshooting Performance Issues

Common Issues and Solutions:

  1. Slow Optimization

    • Enable JIT compilation: Set "enable_jit": True in performance_settings (requires Numba)

    • Reduce problem size: Use angle filtering

  2. High Memory Usage

    • Use progressive optimization: "enable_progressive_optimization": true

    • Monitor with: @performance_monitor(monitor_memory=True)

  3. Classical Optimization Convergence

    • Try the Gurobi solver: pip install gurobipy (requires license, uses iterative trust region)

    • Adjust tolerances: Lower xatol and fatol in config

    • Enable angle filtering: Reduces parameter space complexity

    • Configure trust region: Adjust trust_region_initial in Gurobi options

  4. Robust Optimization Solver Issues

    • Install preferred solvers: pip install clarabel

    • Enable fallback: "fallback_to_classical": true

    • Adjust regularization: Lower regularization_alpha

Performance Profiling:

# Profile a complete analysis
from homodyne.core.profiler import performance_monitor

@performance_monitor(monitor_memory=True)
def full_analysis():
    # Assumes HomodyneAnalysisCore has been imported and config is already loaded
    analysis = HomodyneAnalysisCore(config)
    return analysis.optimize_all()

result = full_analysis()
# Check logs for performance breakdown

Best Practices

Development Workflow:

  1. Start with classical methods for rapid prototyping

  2. Use angle filtering to reduce computational complexity

  3. Enable robust methods for noisy/uncertain data

  4. Monitor performance with built-in profiling tools

Production Deployment:

  1. Install performance extras: pip install "homodyne-analysis[performance]" (quoted so shells such as zsh do not expand the brackets)

  2. Configure environment variables for optimal threading

  3. Enable caching in robust optimization settings

  4. Validate with benchmarks before deployment

Code Quality and Maintenance

Code Quality Standards (v0.6.5+):

The homodyne package maintains high code quality standards with comprehensive tooling:

Formatting and Style:

# All code formatted with Black (88-character line length)
black homodyne --line-length 88

# Import sorting with isort
isort homodyne --profile black

# Linting with flake8
flake8 homodyne --max-line-length 88

# Type checking with mypy
mypy homodyne --ignore-missing-imports

Quality Improvements (Recent):

  • Black formatting: 100% compliant across all files

  • Import organization: Consistent import sorting with isort

  • Code reduction: Removed 308 lines of unused fallback implementations

  • Type annotations: Improved import patterns to resolve mypy warnings

  • Critical fixes: Resolved comparison operators and missing function definitions

Code Statistics:

Code Quality Metrics

Tool      Status     Issues        Notes
------    -------    ----------    -----------------------------------
Black     ✅ 100%    0             88-char line length
isort     ✅ 100%    0             Sorted and optimized
flake8    ⚠️ ~400    E501, F401    Mostly line length and data scripts
mypy      ⚠️ ~285    Various       Missing library stubs, annotations

Development Workflow:

  1. Pre-commit hooks: Automatic formatting and linting

  2. Continuous integration: Code quality checks on all PRs

  3. Performance regression detection: Automated benchmarking

  4. Test coverage: Comprehensive test suite with 95%+ coverage

  5. Documentation: Sphinx-based documentation with examples

Performance and Quality Balance:

The package achieves both high performance and maintainable code through:

  • Optimized algorithms: Trust region Gurobi, vectorized operations

  • Clean architecture: Modular design with clear separation of concerns

  • Comprehensive testing: Unit, integration, and performance tests

  • Documentation: Detailed API documentation and user guides

The homodyne package is designed for high-performance scientific computing with comprehensive optimization strategies and maintainable, high-quality code.