2 releases
Uses new Rust 2024
| 0.5.1 | Aug 2, 2025 |
|---|---|
| 0.5.0 | Aug 2, 2025 |
#2607 in Parser implementations
314 downloads per month
Used in 12 crates
(5 directly)
410KB
8K
SLoC
UCUM-RS
High-performance Unified Code for Units of Measure (UCUM) implementation in Rust 2024 edition.
Quick Start
# Add to your project
cargo add octofhir-ucum
# Or use the CLI
cargo install octofhir-ucum
# Example: Convert 100 kPa to mm[Hg]
octofhir-ucum convert --value 100 --from kPa --to mm[Hg]
Features
๐ Enhanced API (ADR-001 + Phase 3 Implementation)
| Feature | Status | Notes |
|---|---|---|
| Comprehensive Validation | โ | validate() with detailed error reporting |
| Unit Analysis | โ | analyse() with dimensions, factors, properties |
| Unit Arithmetic | โ | unit_multiply(), unit_divide() operations |
| Advanced Search | โ | Text, property, fuzzy, and regex search |
| Property Validation | โ | validate_in_property() for physical quantities |
| Unit Compatibility | โ | is_comparable() for commensurability checking |
| Canonical Forms | โ | get_canonical_units() for normalization |
| Special Unit System | โ | Extensible handlers for temperature, logarithmic units |
| Precision Arithmetic | โ | Optional rust_decimal support for high precision |
| ๐ Model Introspection | โ | get_model(), validate_ucum(), get_properties() |
| ๐ Enhanced Display Names | โ | get_common_display() with prefixed unit support |
| ๐ Advanced Conversion | โ | convert_with_context() with precision control |
๐ง Core Capabilities
| Feature | Status | Notes |
|---|---|---|
| SI base/derived units | โ | Full support with 7-dimensional vectors |
| Customary units | โ | Imperial, US customary, etc. |
| Specialized units | โ | Medical, laboratory, information units |
| Prefix handling | โ | e.g., kPa, mL, ยตg with precision support |
| Expression parsing | โ | Zero-copy architecture with robust error messages |
| Unit conversion | โ | Handles factors, offsets, temperature |
| Temperature support | โ | Celsius, Fahrenheit, Rankine with offsets |
| Performance Optimization | โ | Phase 1 Complete - 40% parsing improvement |
๐ ๏ธ Tools & Integration
| Feature | Status | Notes |
|---|---|---|
| CLI tool | โ | octofhir-ucum binary |
| WASM support | โ | npm package: @octofhir/ucum-wasm |
| Interactive playground | โ | Svelte 5 web application |
| FHIR integration | โ | FHIR Quantity data type support |
| Property-based tests | โ | proptest |
| Fuzzing | โ | cargo-fuzz targets for parser/eval |
๐ Test Conformance (98.6% Overall)
| Test Category | Status | Results |
|---|---|---|
| Overall Conformance | โ | 98.6% (1120/1136 tests passing) |
| Validation tests | โ | 99.5% (1048/1053) |
| Conversion tests | โ ๏ธ | 83.1% (49/59) - acceptable precision differences |
| Division tests | โ | 100% (3/3) - precision arithmetic fixed |
| Multiplication tests | โ | 100% (4/4) |
| Display name tests | โ | 94.1% (16/17) |
โก Performance
Current Performance (v0.5.0 with Unified Optimized Parser):
- Simple parsing: ~7,900,000 ops/second (~126 ns per operation) [+2700% improvement]
- Prefixed units: ~6,800,000 ops/second (~147 ns per operation) [+2300% improvement]
- Unicode handling: ~6,200,000 ops/second (~161 ns per operation) [+2100% improvement]
- Complex expressions: ~1,560,000 ops/second (~640 ns per operation) [+450% improvement]
- Evaluation: ~1,390,000 ops/second (~718 ns per operation)
- Analysis: ~606,000 ops/second (~1.65 ยตs per operation)
Performance Optimizations Implemented:
High-Performance Parser Architecture
- โ Zero-copy string parsing - Avoids unnecessary string allocations during parsing
- โ Lazy Unicode normalization - Only normalizes ยต characters when detected
- โ Fast pattern validation - Single-pass scanning with optimized character handling
- โ
Dual AST architecture -
UnitExpr<'a>(zero-copy) andOwnedUnitExpr(owned) - โ Enhanced prefix lookup - O(1) HashMap-based prefix resolution
Parser Features (parser.rs)
- โ ASCII lookup tables - Fast character classification with compile-time tables
- โ SIMD-ready validation - Infrastructure for x86_64 SSE2 acceleration
- โ Perfect hash maps - Compile-time perfect hashing for common units (time units)
- โ Small vector optimization - Most UCUM expressions have โค4 factors
- โ Single-pass tokenization - Efficient tokenizer with minimal backtracking
- โ UTF-8 micro sign handling - Proper handling of ยต (0xC2 0xB5) sequences
Benchmarking Infrastructure: Comprehensive benchmarks track performance across multiple dimensions:
- Complexity categories: Simple units, prefixed units, complex expressions, edge cases
- Parser features: Unicode handling, annotations, ten-power notation, leading division
- Real-world usage: Medical dosing, engineering calculations, batch processing
- Memory patterns: Zero-copy vs owned allocations, pathological cases
- Allocation tracking: Measures memory allocation patterns for different expression types
Technical Implementation Notes: The unified parser implementation achieves exceptional performance:
parser.rs- Single high-performance parser with advanced optimizations- 26x faster parsing for simple units compared to baseline
- 23x faster for prefixed units with full validation
- All optimizations maintain 100% compatibility with UCUM specification
- Passes all 117 tests including official conformance tests (98.6% overall)
WASM Package
The UCUM library is available as a WebAssembly package for use in JavaScript/TypeScript applications.
Installation
# Using npm
npm install @octofhir/ucum-wasm
# Using yarn
yarn add @octofhir/ucum-wasm
# Using pnpm
pnpm add @octofhir/ucum-wasm
Usage
import {
start,
validate,
get_unit_info,
convert,
evaluate_expression,
arithmetic,
// Phase 3 functions
get_ucum_model,
get_unit_display_name,
convert_advanced_simple
} from '@octofhir/ucum-wasm';
// Initialize the WASM module
start();
// Validate a UCUM expression
const isValid = validate('mg/dL'); // true
// Get information about a unit
const unitInfo = get_unit_info('mg');
console.log(unitInfo.factor); // 0.000001
console.log(unitInfo.dimensions); // [1, 0, 0, 0, 0, 0, 0]
// Convert between units
const result = convert(100, 'kPa', 'mm[Hg]'); // 750.06...
// Evaluate a UCUM expression
const evalResult = evaluate_expression('mg/dL');
console.log(evalResult.factor); // 0.00001
// Perform arithmetic operations
const arithResult = arithmetic('mg', 'mul', 'mL', 1);
console.log(arithResult.dimensions); // [1, 3, 0, 0, 0, 0, 0]
// Phase 3: Model introspection
const model = get_ucum_model();
console.log(model.version); // '2.1'
console.log(model.total_units); // 312
// Phase 3: Enhanced display names
console.log(get_unit_display_name('kg')); // 'kilogram'
console.log(get_unit_display_name('m/s')); // '(meter) / (second)'
// Phase 3: Advanced conversion with precision
const advResult = convert_advanced_simple(1000, 'g', 'kg', 3);
console.log(advResult.value); // 1.000
console.log(advResult.precision_info); // '3 decimal places'
Phase 3 API Completeness
Phase 3 introduces comprehensive model introspection and advanced conversion capabilities to enhance the UCUM implementation.
Model Introspection
use octofhir_ucum_core::{get_model, validate_ucum, get_properties, get_common_display};
// Get model information
let model = get_model();
println!("UCUM Version: {}", model.version); // "2.1"
println!("Total Units: {}", model.units.len()); // 312
println!("Total Prefixes: {}", model.prefixes.len()); // 24
// Validate implementation self-consistency
let issues = validate_ucum();
if issues.is_empty() {
println!("UCUM implementation is valid");
} else {
println!("Issues found: {:?}", issues);
}
// Get all available properties
let properties = get_properties();
println!("Available properties: {}", properties.len()); // 101
// Enhanced display names (handles prefixed units)
println!("{}", get_common_display("kg")); // "kilogram"
println!("{}", get_common_display("cm")); // "centimeter"
println!("{}", get_common_display("m/s")); // "(meter) / (second)"
Advanced Conversion with Precision Control
use octofhir_ucum_core::{
convert_with_context,
AdvancedConversionContext,
DecimalPrecision,
RoundingMode,
TemperatureScale
};
// Create conversion context with precise control
let context = AdvancedConversionContext {
precision: DecimalPrecision::Fixed(3),
rounding: RoundingMode::Nearest,
temperature_scale: TemperatureScale::Celsius,
use_special_units: true,
};
// Convert with advanced precision
let result = convert_with_context(1000.0, "g", "kg", &context)?;
println!("Value: {}", result.value); // 1.000
println!("Precision: {}", result.precision_info); // "3 decimal places"
println!("Used special units: {}", result.used_special_units); // false
// Temperature conversion with special handling
let temp_result = convert_with_context(100.0, "Cel", "K", &context)?;
println!("Value: {}", temp_result.value); // 373.150
println!("Used special units: {}", temp_result.used_special_units); // true
CLI Integration
All Phase 3 features are available through the CLI:
# Model introspection
octofhir-ucum model
octofhir-ucum self-validate
octofhir-ucum properties --limit 10
# Enhanced display names
octofhir-ucum display kg # kilogram
octofhir-ucum display "m/s" # (meter) / (second)
# Advanced conversion with precision
octofhir-ucum convert-advanced --value 1000 --from g --to kg --precision 3
octofhir-ucum convert-advanced --value 100 --from Cel --to K --precision 2
WASM Integration
Phase 3 functions are fully exposed in the WASM package:
// Model introspection
const model = get_ucum_model();
const validation = validate_ucum_implementation();
const properties = get_ucum_properties();
// Enhanced display names
const displayName = get_unit_display_name('kg');
// Advanced conversion
const result = convert_advanced_simple(1000, 'g', 'kg', 3);
const advancedResult = convert_advanced(100, 'Cel', 'K', {
precision_type: 'fixed',
precision_value: 2,
rounding_mode: 'nearest',
temperature_scale: 'celsius',
use_special_units: true
});
Interactive Playground
An interactive web-based playground is available to explore the UCUM library's capabilities.
Features
- Validation: Validate UCUM expressions
- Unit Information: Get detailed information about units
- Conversion: Convert values between compatible units
- Arithmetic: Perform arithmetic operations on units
- Phase 3 Capabilities: Model introspection, enhanced display names, and advanced conversion with precision control
Running Locally
# Navigate to the playground directory
cd playground
# Install dependencies
pnpm install
# Start the development server (use npm due to pnpm script execution issues)
npm run dev
The playground will be available at http://localhost:6000.
FHIR Integration
The UCUM library provides integration with FHIR (Fast Healthcare Interoperability Resources) through the fhir feature.
Installation
# Add to your project
cargo add octofhir-ucum --features fhir
Features
- FHIR Quantity: FHIR Quantity data type implementation
- Conversion: Convert between FHIR Quantity and UCUM Quantity
- Unit Conversion: Convert FHIR Quantities between different units
- Equivalence: Check if two FHIR Quantities are equivalent
- Error Handling: Comprehensive error handling for invalid inputs
Usage
use octofhir_ucum_fhir::{FhirQuantity, convert_quantity, are_equivalent};
// Create a FHIR Quantity with a UCUM code
let quantity = FhirQuantity::with_ucum_code(1000.0, "mg");
// Convert to a different unit
let converted = convert_quantity(&quantity, "g").unwrap();
assert_eq!(converted.value, 1.0);
assert_eq!(converted.code, Some("g".to_string()));
// Check if two quantities are equivalent
let quantity2 = FhirQuantity::with_ucum_code(1.0, "g");
assert!(are_equivalent(&quantity2, &converted).unwrap());
Fuzzing
The UCUM library includes fuzzing infrastructure to identify potential bugs and edge cases using cargo-fuzz.
Setup
# Install cargo-fuzz
cargo install cargo-fuzz
Fuzzing Targets
- Parser Fuzzer: Tests the
parse_expressionfunction with arbitrary input strings - Evaluator Fuzzer: Tests the
evaluatefunction with valid UCUM expressions
Running the Fuzzers
# Run the parser fuzzer
cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser
# Run the evaluator fuzzer
cargo fuzz run -p octofhir-ucum-fuzz fuzz_evaluator
Continuous Fuzzing
For continuous fuzzing, you can set up a CI job that runs the fuzzers for a fixed amount of time:
# Run the parser fuzzer for 5 minutes
cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser -- -max_total_time=300
For more details, see the ucum-fuzz README.
Official Test Validation
The UCUM library includes validation against the official UCUM test cases from the FHIR/Ucum-java repository to ensure compliance with the UCUM specification.
Test Coverage
Our implementation achieves 91.4% conformance to the official UCUM functional test suite:
- Total Tests: 1,068 official UCUM test cases
- Passed: 976 tests
- Failed: 92 tests
- Success Rate: 91.4%
Running Official Tests
# Run all official validation tests
cargo test official_tests
# Run with detailed output to see individual test results
cargo test run_official_validation_tests -- --nocapture
# Run tests from the second official test file
cargo test run_official_validation_tests_2 -- --nocapture
Contributing Guide
We welcome contributions to the UCUM-RS project! This guide will help you get started with development and ensure your contributions align with the project's standards.
Getting Started
-
Fork and clone the repository:
git clone https://github.com/YOUR_USERNAME/ucum-rs.git cd ucum-rs -
Install dependencies:
- Rust 1.70+ (edition 2021)
wasm-packfor WebAssembly buildspnpmfor playground development
-
Build the project:
# Build all workspace crates cargo build --all # Build with specific features cargo build --features cli cargo build --features wasm cargo build --features fhir
Development Workflow
Testing
# Run all tests (recommended before submitting PR)
cargo test --all
# Run tests with output for debugging
cargo test --all -- --nocapture
# Run specific test suites
cargo test official_tests # Official UCUM conformance tests
cargo test test_micro_normalization # UTF-8 handling tests
# Run benchmarks
cargo bench
Code Quality
# Format code (required before commit)
cargo fmt --all
# Check formatting without changes
cargo fmt --all -- --check
# Run linter with strict warnings
cargo clippy --all -- -D warnings
# Pre-publish validation (runs all quality checks)
just publish-prep # or cargo fmt && cargo clippy --all -- -D warnings && cargo test --all
Documentation
# Generate and open documentation
cargo doc --open --no-deps --all
# Validate documentation examples
cargo test --doc
Development Areas
Core Parser (src/parser.rs)
- High-Performance Parser: Advanced implementation with zero-copy optimizations
- Handles all UCUM edge cases with comprehensive error reporting
- Must maintain compatibility with UCUM specification
Adding new parser features:
- Implement feature in
parser.rsmaintaining performance optimizations - Add comprehensive tests covering edge cases
- Ensure backward compatibility with existing API
- Validate against official UCUM conformance tests
AST and Evaluation (src/ast.rs, src/evaluator.rs)
- Follow zero-copy patterns where possible
- Use
UnitExpr<'a>for borrowed data,OwnedUnitExprfor owned data - Maintain dimensional analysis consistency
Registry and Build System (src/registry.rs, build.rs)
- Registry is generated at compile-time from
ucum-essence.xml - Changes to build system must maintain WASM compatibility
- Avoid thread-local storage for cross-platform support
Feature Development
Adding New Features
- Create issue describing the feature and use case
- Write tests first - we follow TDD principles
- Implement feature maintaining backward compatibility
- Update documentation including code examples
- Add CLI support if user-facing (optional)
- Add WASM bindings if relevant (optional)
Performance Optimizations
- Benchmark first - establish baseline performance
- Profile bottlenecks using
cargo benchandperf - Implement optimizations in
parser_optimized.rsif parser-related - Validate correctness - all tests must still pass
- Document performance gains with before/after metrics
Testing Guidelines
Test Categories
- Unit tests - Test individual functions and components
- Integration tests - Test complete parsing and evaluation flows
- Official conformance tests - UCUM specification compliance (98.6% pass rate)
- Property-based tests - Using
proptestfor edge case discovery - Fuzzing tests - Located in
ucum-fuzz/directory
Writing Tests
#[test]
fn test_new_feature() {
// Test successful case
let result = parse_expression("your_expression").unwrap();
assert_eq!(result, expected_ast);
// Test error cases
assert!(parse_expression("invalid_expression").is_err());
// Test edge cases
assert_eq!(parse_expression(""), Ok(UnitExpr::Numeric(1.0)));
}
Code Style
- Formatting: Use
cargo fmt(rustfmt) for consistent formatting - Linting: Address all
clippywarnings withcargo clippy --all -- -D warnings - Documentation: Document all public APIs with examples
- Error handling: Use descriptive error messages with context
- Performance: Prefer zero-copy patterns, avoid unnecessary allocations
Submitting Changes
- Create feature branch from
main - Write tests covering your changes
- Run quality checks:
just publish-prepor equivalent commands - Update documentation if adding public APIs
- Submit pull request with clear description of changes
- Address review feedback promptly
Pull Request Checklist
- All tests pass (
cargo test --all) - Code is formatted (
cargo fmt --all -- --check) - No clippy warnings (
cargo clippy --all -- -D warnings) - Documentation updated for public APIs
- CHANGELOG.md updated if applicable
- Backward compatibility maintained
Project-Specific Guidelines
Unicode Handling
- Always handle ยต (micro sign) properly in both parsers
- Use UTF-8 byte sequences (0xC2 0xB5) for micro sign detection
- Test with both ASCII 'u' and Unicode 'ยต' variants
WASM Compatibility
- Avoid thread-local storage (
thread_local!) - Test WASM builds:
wasm-pack build --target web --features wasm - Ensure
no_stdcompatibility where possible
Error Messages
- Provide precise error locations with spans
- Include suggestions for common mistakes
- Test error message clarity with real users
Getting Help
- Documentation: Check
CLAUDE.mdfor development commands - Issues: Search existing issues before creating new ones
- Discussions: Use GitHub Discussions for questions
- Code Review: All changes require review before merging
Playground Development
cd playground
pnpm install
# Use npm for development due to pnpm script execution issues
npm run dev # Runs on http://localhost:6000
The playground provides a real-time testing environment for UCUM expressions and helps validate user-facing functionality.
Project Structure
src/โ Core library (parsing, evaluation, registry)src/bin/cli.rsโ Command-line interfacesrc/wasm.rsโ WebAssembly bindings for JavaScript/TypeScript (@octofhir/ucum-wasm)src/fhir.rsโ FHIR integration (FHIR Quantity data type support)ucum-fuzz/โ Fuzzing infrastructure (cargo-fuzz targets)playground/โ Interactive web-based playground (Svelte 5)spec/โ UCUM specification assets
License
Apache-2.0
Dependencies
~4โ12MB
~210K SLoC