lowess
A high-performance implementation of LOWESS (Locally Weighted Scatterplot Smoothing) in Rust. This crate provides a robust, production-ready implementation with support for confidence intervals, multiple kernel functions, and optimized execution modes.
[!IMPORTANT]
For parallelization or ndarray support, use fastLowess.
Features
- Robust Statistics: IRLS with Bisquare, Huber, or Talwar weighting for outlier handling.
- Uncertainty Quantification: Point-wise standard errors, confidence intervals, and prediction intervals.
- Optimized Performance: Delta optimization for skipping dense regions and streaming/online modes for large or real-time datasets.
- Parameter Selection: Built-in cross-validation for automatic smoothing fraction selection.
- Flexibility: Multiple weight kernels (Tricube, Epanechnikov, etc.) and no_std support (requires alloc).
- Validated: Numerical agreement with R's stats::lowess and Python's statsmodels.
Robustness Advantages
This implementation is more robust than statsmodels due to two key design choices:
MAD-Based Scale Estimation
For robustness weight calculations, this crate uses Median Absolute Deviation (MAD) for scale estimation:
s = median(|r_i - median(r)|)
In contrast, statsmodels uses median of absolute residuals:
s = median(|r_i|)
Why MAD is more robust:
- MAD is a breakdown-point-optimal estimator—it remains valid even when up to 50% of data are outliers.
- The median-centering step removes asymmetric bias from residual distributions.
- MAD provides consistent outlier detection regardless of whether residuals are centered around zero.
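The difference is easiest to see on residuals that are not centered at zero. The sketch below is purely illustrative (ad-hoc helpers, not code from this crate) and computes both scale estimates side by side:

// Illustrative comparison of the two scale estimates (ad-hoc helpers,
// not part of this crate's API).
fn median(values: &mut [f64]) -> f64 {
    values.sort_by(|a, b| a.total_cmp(b));
    let n = values.len();
    if n % 2 == 1 {
        values[n / 2]
    } else {
        (values[n / 2 - 1] + values[n / 2]) / 2.0
    }
}

fn main() {
    // Residuals that sit around 2.0 rather than 0.0, plus one gross outlier
    let residuals = [2.1, 1.9, 2.2, 1.8, 2.0, 30.0];

    // statsmodels-style scale: median of absolute residuals
    let mut abs_r: Vec<f64> = residuals.iter().map(|r| r.abs()).collect();
    let s_plain = median(&mut abs_r); // 2.05: tracks the residual level, not the spread

    // MAD-style scale: center on the median first, then take absolute deviations
    let mut r = residuals.to_vec();
    let center = median(&mut r);
    let mut dev: Vec<f64> = residuals.iter().map(|r| (r - center).abs()).collect();
    let s_mad = median(&mut dev); // 0.15: reflects the actual scatter

    println!("median(|r|) = {s_plain:.2}, MAD = {s_mad:.2}");
}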
Boundary Padding
This crate applies boundary policies (Extend, Reflect, Zero) at dataset edges:
- Extend: Repeats edge values to maintain local neighborhood size.
- Reflect: Mirrors data symmetrically around boundaries.
- Zero: Pads with zeros (useful for signal processing).
statsmodels does not apply boundary padding, which can lead to:
- Biased estimates near boundaries due to asymmetric local neighborhoods.
- Increased variance at the edges of the smoothed curve.
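As a rough illustration of the idea (not this crate's internal implementation), the three policies can be pictured as different ways of extending a series past its left edge:

// Conceptual sketch of the three boundary policies at the left edge of a series.
#[derive(Clone, Copy)]
enum EdgePolicy {
    Extend,
    Reflect,
    Zero,
}

fn pad_left(data: &[f64], pad: usize, policy: EdgePolicy) -> Vec<f64> {
    let mut out = Vec::with_capacity(data.len() + pad);
    for i in 0..pad {
        out.push(match policy {
            EdgePolicy::Extend => data[0],        // repeat the edge value
            EdgePolicy::Reflect => data[pad - i], // mirror around the first point
            EdgePolicy::Zero => 0.0,              // pad with zeros
        });
    }
    out.extend_from_slice(data);
    out
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0, 5.0];
    println!("{:?}", pad_left(&data, 2, EdgePolicy::Extend));  // [1.0, 1.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    println!("{:?}", pad_left(&data, 2, EdgePolicy::Reflect)); // [3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    println!("{:?}", pad_left(&data, 2, EdgePolicy::Zero));    // [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
}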
Gaussian Consistency Factor
For interval estimation (confidence/prediction), residual scale is computed using:
sigma = 1.4826 * MAD
The factor 1.4826 = 1/Phi^-1(3/4) ensures consistency with the standard deviation under Gaussian assumptions.
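In code form this is a single constant multiple; a minimal sketch (the MAD value here stands in for the estimate defined above):

// Sketch: Gaussian-consistency scaling of a MAD estimate.
fn gaussian_sigma(mad: f64) -> f64 {
    // 1.4826 ≈ 1 / Phi^-1(3/4): makes the MAD comparable to a standard deviation
    1.4826 * mad
}

fn main() {
    println!("sigma = {:.4}", gaussian_sigma(0.15)); // 0.2224
}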
Performance Advantages
Benchmarked against Python's statsmodels, with no regressions in any tested scenario. Most benchmark categories run 113-2813× faster, while the delta-optimized and genomic workloads see smaller but still substantial gains (roughly 5-10×). Performance gains scale dramatically with dataset size.
Summary
| Category | Matched | Median Speedup | Mean Speedup |
|---|---|---|---|
| Scalability | 5 | 481× | 1057× |
| Financial | 4 | 270× | 301× |
| Iterations | 6 | 238× | 248× |
| Pathological | 4 | 234× | 220× |
| Scientific | 4 | 212× | 239× |
| Fraction | 6 | 218× | 268× |
| Genomic | 4 | 6.9× | 10.4× |
| Delta | 4 | 5.0× | 5.0× |
Top 10 Performance Wins
| Benchmark | statsmodels | Rust | Speedup |
|---|---|---|---|
| scale_100000 | 43.7s | 15.5ms | 2813× |
| scale_50000 | 11.2s | 7.6ms | 1466× |
| fraction_0.05 | 197.2ms | 0.38ms | 516× |
| financial_10000 | 497.1ms | 0.97ms | 512× |
| scale_10000 | 663.1ms | 1.38ms | 481× |
| scientific_10000 | 777.2ms | 1.86ms | 418× |
| financial_5000 | 170.9ms | 0.49ms | 346× |
| fraction_0.1 | 227.9ms | 0.67ms | 339× |
| scale_5000 | 229.9ms | 0.69ms | 334× |
| iterations_0 | 74.2ms | 0.26ms | 289× |
Check Benchmarks for detailed results and reproducible benchmarking code.
Installation
Add this to your Cargo.toml:
[dependencies]
lowess = "0.6"
For no_std environments:
[dependencies]
lowess = { version = "0.6", default-features = false }
Quick Start
use lowess::prelude::*;

fn main() -> Result<(), LowessError> {
    let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
    let y = vec![2.0, 4.1, 5.9, 8.2, 9.8];

    // Basic smoothing
    let result = Lowess::new()
        .fraction(0.5)
        .adapter(Batch)
        .build()?
        .fit(&x, &y)?;

    println!("Smoothed values: {:?}", result.y);
    Ok(())
}
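Intervals follow the same pattern. The sketch below assumes the builder options and result fields documented later in this README, and requests 95% confidence intervals alongside the smoothed values:

use lowess::prelude::*;

fn main() -> Result<(), LowessError> {
    let x = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    let y = vec![2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 13.9, 16.2];

    // Smoothing with 95% confidence intervals
    let result = Lowess::new()
        .fraction(0.5)
        .confidence_intervals(0.95)
        .adapter(Batch)
        .build()?
        .fit(&x, &y)?;

    // Interval bounds are only populated when requested
    if let (Some(lower), Some(upper)) = (&result.confidence_lower, &result.confidence_upper) {
        for i in 0..result.y.len() {
            println!("{:.2} in [{:.2}, {:.2}]", result.y[i], lower[i], upper[i]);
        }
    }
    Ok(())
}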
Builder Methods
use lowess::prelude::*;

Lowess::new()
    // Smoothing span (0, 1]
    .fraction(0.5)
    // Robustness iterations
    .iterations(3)
    // Interpolation threshold
    .delta(0.01)
    // Kernel selection
    .weight_function(Tricube)
    // Robustness method
    .robustness_method(Bisquare)
    // Zero-weight fallback behavior
    .zero_weight_fallback(UseLocalMean)
    // Boundary handling (for edge effects)
    .boundary_policy(Extend)
    // Confidence intervals
    .confidence_intervals(0.95)
    // Prediction intervals
    .prediction_intervals(0.95)
    // Diagnostics
    .return_diagnostics()
    .return_residuals()
    .return_robustness_weights()
    // Cross-validation (for parameter selection)
    .cross_validate(KFold(5, &[0.3, 0.5, 0.7]).seed(123))
    // Convergence
    .auto_converge(1e-4)
    // Execution mode
    .adapter(Batch)
    // Build the model
    .build()?;
Result Structure
pub struct LowessResult<T> {
    // Sorted x values
    pub x: Vec<T>,
    // Smoothed y values
    pub y: Vec<T>,
    // Point-wise standard errors
    pub standard_errors: Option<Vec<T>>,
    // Confidence intervals
    pub confidence_lower: Option<Vec<T>>,
    pub confidence_upper: Option<Vec<T>>,
    // Prediction intervals
    pub prediction_lower: Option<Vec<T>>,
    pub prediction_upper: Option<Vec<T>>,
    // Residuals
    pub residuals: Option<Vec<T>>,
    // Final IRLS weights
    pub robustness_weights: Option<Vec<T>>,
    // Diagnostics
    pub diagnostics: Option<Diagnostics<T>>,
    // Actual iterations used
    pub iterations_used: Option<usize>,
    // Selected fraction
    pub fraction_used: T,
    // CV RMSE per fraction
    pub cv_scores: Option<Vec<T>>,
}
Streaming Processing
For datasets that don't fit in memory:
let mut processor = Lowess::new()
    .fraction(0.3)
    .iterations(2)
    .adapter(Streaming)
    .chunk_size(1000)
    .overlap(100)
    .build()?;

// Process data in chunks
for chunk in data_chunks {
    let result = processor.process_chunk(&chunk.x, &chunk.y)?;
}

// Finalize processing
let final_result = processor.finalize()?;
Online Processing
For real-time data streams:
let mut processor = Lowess::new()
    .fraction(0.2)
    .iterations(1)
    .adapter(Online)
    .window_capacity(100)
    .build()?;

// Process points as they arrive
for (x, y) in data_stream {
    if let Some(output) = processor.add_point(x, y)? {
        println!("Smoothed: {}", output.smoothed);
    }
}
Parameter Selection Guide
Fraction (Smoothing Span)
- 0.1-0.3: Local, captures rapid changes (wiggly)
- 0.4-0.6: Balanced, general-purpose
- 0.7-1.0: Global, smooth trends only
- Default: 0.67 (2/3, Cleveland's choice)
- Use CV when uncertain
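A sketch of letting cross-validation pick the fraction, using the cross_validate builder option and reading the selected value back from the result (fields as documented in Result Structure above):

use lowess::prelude::*;

fn main() -> Result<(), LowessError> {
    let x: Vec<f64> = (0..200).map(|i| i as f64 * 0.05).collect();
    let y: Vec<f64> = x.iter().map(|v| v.sin() + 0.1 * v).collect();

    // 5-fold CV over a small grid of candidate fractions
    let result = Lowess::new()
        .cross_validate(KFold(5, &[0.2, 0.4, 0.6, 0.8]).seed(42))
        .adapter(Batch)
        .build()?
        .fit(&x, &y)?;

    println!("Selected fraction: {}", result.fraction_used);
    if let Some(scores) = &result.cv_scores {
        println!("CV RMSE per candidate: {:?}", scores);
    }
    Ok(())
}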
Robustness Method
- Bisquare (default): Redescending weights that drive gross outliers to zero; a strong all-around choice.
- Huber: Downweights large residuals without fully rejecting them; suited to light-to-moderate contamination.
- Talwar: Hard cutoff (full weight inside the threshold, zero outside); simple and fast.
Robustness Iterations
- 0: Clean data, speed critical
- 1-2: Light contamination
- 3: Default, good balance (recommended)
- 4-5: Heavy outliers
- >5: Diminishing returns
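For heavily contaminated data, a sketch combining the default Bisquare method with extra iterations and inspecting the final IRLS weights (options and fields as documented above):

use lowess::prelude::*;

fn main() -> Result<(), LowessError> {
    let x: Vec<f64> = (0..50).map(|i| i as f64).collect();
    let mut y: Vec<f64> = x.iter().map(|v| 0.5 * v).collect();
    y[25] = 100.0; // inject a gross outlier

    let result = Lowess::new()
        .fraction(0.4)
        .robustness_method(Bisquare)
        .iterations(4)
        .return_robustness_weights()
        .adapter(Batch)
        .build()?
        .fit(&x, &y)?;

    // The outlier should receive a near-zero weight in the final pass
    if let Some(weights) = &result.robustness_weights {
        println!("weight at the outlier: {:.4}", weights[25]);
    }
    Ok(())
}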
Kernel Function
- Tricube (default): Best all-around, smooth, efficient
- Epanechnikov: Theoretically optimal MSE
- Gaussian: Very smooth, no compact support
- Uniform: Fastest, least smooth (moving average)
Delta Optimization
- None: Small datasets (n < 1000)
- 0.01 × range(x): Good starting point for dense data (see the sketch below)
- Manual tuning: Adjust based on data density
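For example, a sketch computing delta as 1% of the x-range before fitting:

use lowess::prelude::*;

fn main() -> Result<(), LowessError> {
    let x: Vec<f64> = (0..10_000).map(|i| i as f64 * 0.1).collect();
    let y: Vec<f64> = x.iter().map(|v| v.sin()).collect();

    // delta = 0.01 × range(x): points closer than this are interpolated, not refit
    let delta = 0.01 * (x[x.len() - 1] - x[0]);

    let result = Lowess::new()
        .fraction(0.3)
        .delta(delta)
        .adapter(Batch)
        .build()?
        .fit(&x, &y)?;

    println!("Smoothed {} points", result.y.len());
    Ok(())
}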
Examples
Check the examples directory for more complex scenarios:
cargo run --example batch_smoothing
cargo run --example online_smoothing
cargo run --example streaming_smoothing
MSRV
Rust 1.85.0 or later (2024 Edition).
Validation
Validated against:
- Python (statsmodels): Passed on 44 distinct test scenarios.
- Original Paper: Reproduces Cleveland (1979) results.
Check Validation for more information. Small variations in results are expected due to differences in scale estimation and padding.
Related Work
- fastLowess: companion crate built on top of this one, adding parallelization and ndarray support.
Contributing
Contributions are welcome! Please see the CONTRIBUTING.md file for more information.
License
Dual-licensed under AGPL-3.0 (Open Source) or Commercial License.
Contact <thisisamirv@gmail.com> for commercial inquiries.
References
- Cleveland, W.S. (1979). "Robust Locally Weighted Regression and Smoothing Scatterplots". JASA.
- Cleveland, W.S. (1981). "LOWESS: A Program for Smoothing Scatterplots". The American Statistician.