cdbennett/transaction-processor-rust

Transaction Processor

A program that applies financial transactions to client accounts and produces the final account states. It uses CSV files for input and output.

Goals

  • Correctness: use rust_decimal to avoid floating-point error in account balances.
  • Performance: the task is fundamentally simple, so users will expect it to process millions of transactions per second.
  • Scalability: should be able to handle any number of transactions or accounts without failure. Since transaction IDs are 32-bit integers, presumably the program should handle up to 2^32 transactions.

Challenges

Balancing correctness, performance, and scalability is difficult. For example, using temporary disk storage instead of RAM for the transaction log increases scalability but decreases performance.

Development Notes

  • Scaling to billions of transactions: because client IDs are u16, we can easily keep all clients in memory. Transaction IDs, however, are u32, and if we want to handle 4 billion transactions it is not practical to store them all in memory (we probably don't want to require 16 GB to 32 GB of RAM). Future goal: use a file on disk for very large data sets.

  • Minimal storage: store only the data that the program specification actually requires. Specifically, we don't keep a log of all transactions, only deposits, because the current requirements refer back to past deposits but never to withdrawals.

  • Numeric representation: using the rust_decimal crate to avoid floating-point error. Potential optimization if this proves to be a significant performance penalty: implement a custom numeric type using i64 with a fixed 4-digit scale. Starting with rust_decimal, though, provides a proven and complete solution. Floating-point (f64) seems a poor choice for representing financial balances, which must be exact (at the very least, careful analysis would be needed to prove that no floating-point error could accumulate and cause incorrect behavior).

  • Future optimization opportunities: parallelize processing of separate accounts, or parsing of records.

  • AI disclosure: Did not use generative AI for any significant portion of code; only used in the context of JetBrains RustRover code completion.
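The custom numeric type mentioned under "Numeric representation" could look roughly like the following. This is only a sketch of the idea, not code from this repository: the `Amount` type, its `SCALE` constant, and its method names are hypothetical, and the project itself currently uses rust_decimal.

```rust
use std::fmt;

/// Hypothetical fixed-point amount: an integer count of ten-thousandths,
/// i.e. an i64 with a fixed scale of 4 decimal digits.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub struct Amount(i64);

impl Amount {
    /// Number of raw units per whole unit (4 decimal digits).
    pub const SCALE: i64 = 10_000;

    /// Wrap a raw value already expressed in ten-thousandths.
    pub fn from_scaled(raw: i64) -> Amount {
        Amount(raw)
    }

    /// Exact addition; returns None on i64 overflow instead of wrapping.
    pub fn checked_add(self, other: Amount) -> Option<Amount> {
        self.0.checked_add(other.0).map(Amount)
    }

    /// Exact subtraction; returns None on i64 overflow instead of wrapping.
    pub fn checked_sub(self, other: Amount) -> Option<Amount> {
        self.0.checked_sub(other.0).map(Amount)
    }
}

impl fmt::Display for Amount {
    /// Always prints exactly four fractional digits, e.g. "1.0005".
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let sign = if self.0 < 0 { "-" } else { "" };
        let abs = self.0.unsigned_abs();
        let scale = Self::SCALE as u64;
        write!(f, "{}{}.{:04}", sign, abs / scale, abs % scale)
    }
}
```

Because every value is an integer, sums such as 0.1 + 0.2 compare equal to 0.3 exactly, which is the property f64 cannot guarantee. The trade-off is a fixed range and scale, whereas rust_decimal handles varying scales.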

Behavior Questions

The input data format leaves some potential pitfalls open; these should either be checked for or have defined behavior.

  • What should happen if a Dispute is submitted when the account balance (or even available balance) is less than the amount of the disputed Deposit transaction? Example: a new client deposits 1000, then withdraws 600, and then the first Deposit is disputed while the account balance is only 400. Currently, we allow the held funds to be increased beyond the total balance in order to keep the numbers balanced, but this isn't necessarily how we'd want to do it in the real world.
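The dispute example above can be traced with a small model. This is a sketch under assumptions: the `Account` struct and its methods are hypothetical and use whole-number amounts for brevity, and letting `available` go negative is one reading of "held funds increased beyond the total balance", not necessarily how the program stores it.

```rust
/// Hypothetical account model; amounts are whole units for brevity.
#[derive(Debug, Default, PartialEq)]
pub struct Account {
    pub available: i64,
    pub held: i64,
}

impl Account {
    /// Total balance is always available plus held funds.
    pub fn total(&self) -> i64 {
        self.available + self.held
    }

    pub fn deposit(&mut self, amount: i64) {
        self.available += amount;
    }

    /// Returns false (and changes nothing) if available funds are insufficient.
    pub fn withdraw(&mut self, amount: i64) -> bool {
        if self.available < amount {
            return false;
        }
        self.available -= amount;
        true
    }

    /// Move the disputed deposit's full amount from available to held,
    /// even if that drives `available` below zero (assumed behavior).
    pub fn dispute(&mut self, deposit_amount: i64) {
        self.available -= deposit_amount;
        self.held += deposit_amount;
    }
}
```

With this model, depositing 1000, withdrawing 600, and then disputing the deposit leaves held at 1000, available at -600, and the total unchanged at 400.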

Performance Optimization

To avoid premature optimization, the goal is to measure performance first and optimize only where it's worthwhile. The tools are criterion for running benchmarks and samply for profiling.

Results as of 0.1.0:

Time to process 1M transactions: median 604 ms

Profiling:

  • 98% apply_transactions_to_db
    • 75% csv deserialize records iterator next (target for first optimizations)
    • 22% Database::process
  • 1.6% write_account_states (insignificant, so will not try to optimize at the moment)

Plan:

  • Avoid unnecessary allocations by reusing a record ("amortize allocations")
  • Instead of using serde deserialization into a struct, read into CSV byte records
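The "amortize allocations" idea in the plan can be illustrated with the standard library alone. The sketch below is not the project's code: `for_each_row` is a hypothetical helper that reuses a single byte buffer for every line and hands raw bytes to a callback. With the csv crate, the analogous approach is to call `Reader::read_byte_record` repeatedly with one reused `ByteRecord`.

```rust
use std::io::{self, BufRead};

/// Call `on_row` with the raw bytes of each non-empty line, reusing one
/// buffer for all lines. Returns the number of rows passed to the callback.
/// The header line, if any, is passed through like any other row.
pub fn for_each_row<R, F>(mut input: R, mut on_row: F) -> io::Result<usize>
where
    R: BufRead,
    F: FnMut(&[u8]),
{
    let mut line: Vec<u8> = Vec::new(); // allocated once, grown as needed
    let mut rows = 0;
    loop {
        line.clear(); // keeps the capacity, so no per-row allocation
        if input.read_until(b'\n', &mut line)? == 0 {
            return Ok(rows); // end of input
        }
        // Strip the trailing "\n" or "\r\n" without copying.
        let mut row: &[u8] = &line;
        while let Some((&last, rest)) = row.split_last() {
            if last == b'\n' || last == b'\r' {
                row = rest;
            } else {
                break;
            }
        }
        if row.is_empty() {
            continue; // skip blank lines
        }
        on_row(row);
        rows += 1;
    }
}
```

A caller can split each row on `b','` and parse the fields directly from bytes. Note that this simple splitter ignores CSV quoting rules, which a real CSV reader handles.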

Results as of 0.2.0:

Time to process 1M transactions: 405 ms (improved 34.4%)

Possible experiments:

  • Try switching to f64 instead of rust_decimal to see if it makes a significant improvement.

  • UTF-8 validation: first try an unsafe str conversion from the byte string to see whether skipping UTF-8 validation is worth it; if it is, find a safe number-parsing method that works directly on byte strings.
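One possible shape for a safe parser that works directly on byte strings is sketched below. It is an illustration only: `parse_amount` is a hypothetical function that returns the amount as an i64 count of ten-thousandths rather than a rust_decimal value, and it assumes non-negative amounts with at most four fractional digits and no surrounding whitespace.

```rust
/// Parse an ASCII decimal such as b"12.3456" into ten-thousandths without
/// converting to `str` first, so no UTF-8 validation and no unsafe code.
/// Returns None for empty input, non-digit bytes, more than four
/// fractional digits, or i64 overflow.
pub fn parse_amount(bytes: &[u8]) -> Option<i64> {
    let mut parts = bytes.splitn(2, |b| *b == b'.');
    let int_part: &[u8] = parts.next()?;
    let frac_part: &[u8] = match parts.next() {
        Some(f) => f,
        None => &[],
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None; // nothing to parse ("" or ".")
    }
    if frac_part.len() > 4 {
        return None; // more precision than the fixed scale can hold
    }

    let mut value: i64 = 0;
    for &b in int_part {
        if !b.is_ascii_digit() {
            return None;
        }
        value = value.checked_mul(10)?.checked_add((b - b'0') as i64)?;
    }
    // Consume exactly four fractional positions, padding with zeros.
    for i in 0..4 {
        let digit = match frac_part.get(i) {
            Some(&b) if b.is_ascii_digit() => (b - b'0') as i64,
            Some(_) => return None,
            None => 0,
        };
        value = value.checked_mul(10)?.checked_add(digit)?;
    }
    Some(value)
}
```

For example, `parse_amount(b"1.5")` yields `Some(15_000)`. Whether this beats parsing through `str` would still need to be measured with the benchmarks described above.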
