Official pure Rust typed client for ClickHouse DB.
- Uses serde for encoding/decoding rows.
- Supports serde attributes: skip_serializing, skip_deserializing, rename.
- Uses RowBinaryWithNamesAndTypes or RowBinary formats over HTTP transport.
  - By default, RowBinaryWithNamesAndTypes with database schema validation is used.
  - It is possible to switch to RowBinary, which can potentially lead to increased performance (see below).
  - There are plans to implement Native format over TCP.
- Supports TLS (see native-tls and rustls-tls features below).
- Supports compression and decompression (LZ4 and LZ4HC).
- Provides API for selecting.
- Provides API for inserting.
- Provides API for infinite transactional (see below) inserting.
- Provides mocks for unit testing.
Note: ch2rs is useful to generate a row type from ClickHouse.
Starting from 0.14.0, the crate uses the RowBinaryWithNamesAndTypes format by default, which allows row types to be validated against the ClickHouse schema. This enables clearer error messages in case of a schema mismatch, at the cost of performance. Additionally, with validation enabled, the crate supports structs with correct field names and matching types but a different field order, with an additional slight (5-10%) performance penalty.

If you are looking to maximize performance, you can disable validation using Client::with_validation(false). When validation is disabled, the client switches to the RowBinary format instead.

The downside of plain RowBinary is that, instead of clearer error messages, a mismatch between Row and the database schema will result in a NotEnoughData error without specific details. However, depending on the shape and volume of the dataset, disabling validation might yield a 1.1x to 3x performance improvement.

It is always recommended to measure the performance impact of validation in your specific use case. Additionally, if you plan to disable validation in your application, writing smoke tests to ensure that the row types match the ClickHouse schema is highly recommended.
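To illustrate why plain RowBinary can only report a generic "not enough data" failure, here is a minimal standard-library sketch of a RowBinary-style codec (little-endian integers, strings prefixed with a LEB128 length). It is not the crate's implementation: the DecodeError enum and all function names are hypothetical and exist only for this illustration. The byte stream carries no column names or types, so a reader that expects the wrong type simply runs out of bytes.

```rust
/// Illustrative error type; only loosely mirrors the crate's `NotEnoughData`.
#[derive(Debug, PartialEq)]
enum DecodeError {
    NotEnoughData,
    InvalidData,
}

/// Appends an unsigned LEB128 varint (used as the string length prefix).
fn write_varint(buf: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            buf.push(byte);
            break;
        }
        buf.push(byte | 0x80);
    }
}

/// Encodes a `(UInt32, String)` row: no names, no types, just values.
fn encode_row(no: u32, name: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&no.to_le_bytes());
    write_varint(&mut buf, name.len() as u64);
    buf.extend_from_slice(name.as_bytes());
    buf
}

/// Splits `n` bytes off the front of `input`, or fails if fewer remain.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], DecodeError> {
    let slice: &'a [u8] = *input;
    if slice.len() < n {
        return Err(DecodeError::NotEnoughData);
    }
    let (head, tail) = slice.split_at(n);
    *input = tail;
    Ok(head)
}

fn read_u32(input: &mut &[u8]) -> Result<u32, DecodeError> {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(take(input, 4)?);
    Ok(u32::from_le_bytes(raw))
}

fn read_u64(input: &mut &[u8]) -> Result<u64, DecodeError> {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(take(input, 8)?);
    Ok(u64::from_le_bytes(raw))
}

fn read_varint(input: &mut &[u8]) -> Result<u64, DecodeError> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        if shift >= 64 {
            return Err(DecodeError::InvalidData);
        }
        let byte = take(input, 1)?[0];
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Ok(value);
        }
        shift += 7;
    }
}

fn read_string(input: &mut &[u8]) -> Result<String, DecodeError> {
    let len = read_varint(input)? as usize;
    let bytes = take(input, len)?;
    String::from_utf8(bytes.to_vec()).map_err(|_| DecodeError::InvalidData)
}
```

Reading the row `(7, "ab")` back with read_u32 followed by read_string succeeds, whereas a reader that assumes the first column is a UInt64 and calls read_u64 fails with NotEnoughData, because the whole row is only seven bytes long. With longer rows the wrong reader might instead consume unrelated bytes silently, which is one reason smoke tests are worth having when validation is off.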
To use the crate, add this to your Cargo.toml:
[dependencies]
clickhouse = "0.13.3"
[dev-dependencies]
clickhouse = { version = "0.13.3", features = ["test-util"] }
use clickhouse::Client;
let client = Client::default()
    .with_url("http://localhost:8123")
    .with_user("name")
    .with_password("123")
    .with_database("test");
- Reuse created clients or clone them in order to reuse a connection pool.
use serde::Deserialize;
use clickhouse::Row;
#[derive(Row, Deserialize)]
struct MyRow<'a> {
    no: u32,
    name: &'a str,
}

let mut cursor = client
    .query("SELECT ?fields FROM some WHERE no BETWEEN ? AND ?")
    .bind(500)
    .bind(504)
    .fetch::<MyRow<'_>>()?;
while let Some(row) = cursor.next().await? { .. }
- Placeholder ?fields is replaced with no, name (fields of Row).
- Placeholder ? is replaced with values in the following bind() calls.
- Convenient fetch_one::<Row>() and fetch_all::<Row>() can be used to get the first row or all rows, respectively.
- sql::Identifier can be used to bind table names.

Note that cursors can return an error even after producing some rows. To avoid this, use client.with_option("wait_end_of_query", "1") in order to enable buffering on the server side. The buffer_size option can be useful too.
use serde::Serialize;
use clickhouse::Row;
#[derive(Row, Serialize)]
struct MyRow {
    no: u32,
    name: String,
}
let mut insert = client.insert("some")?;
insert.write(&MyRow { no: 0, name: "foo".into() }).await?;
insert.write(&MyRow { no: 1, name: "bar".into() }).await?;
insert.end().await?;
- If end() isn't called, the INSERT is aborted.
- Rows are sent progressively to spread network load.
- ClickHouse inserts batches atomically only if all rows fit in the same partition and their number is less than max_insert_block_size.
Requires the inserter feature.
use std::time::Duration;

let mut inserter = client.inserter("some")?
    .with_timeouts(Some(Duration::from_secs(5)), Some(Duration::from_secs(20)))
    .with_max_bytes(50_000_000)
    .with_max_rows(750_000)
    .with_period(Some(Duration::from_secs(15)));

inserter.write(&MyRow { no: 0, name: "foo".into() })?;
inserter.write(&MyRow { no: 1, name: "bar".into() })?;

// `commit()` ends the active INSERT only if a threshold has been reached.
let stats = inserter.commit().await?;
if stats.rows > 0 {
    println!(
        "{} bytes, {} rows, {} transactions have been inserted",
        stats.bytes, stats.rows, stats.transactions,
    );
}
Please read the examples to understand how to use it properly in different real-world cases.
- Inserter ends an active insert in commit() if thresholds (max_bytes, max_rows, period) are reached.
- The interval between ending active INSERTs can be biased by using with_period_bias to avoid load spikes by parallel inserters.
- Inserter::time_left() can be used to detect when the current period ends. Call Inserter::commit() again to check limits if your stream emits items rarely.
- Time thresholds are implemented by using the quanta crate to speed the inserter up. It is not used if test-util is enabled (thus, time can be managed by tokio::time::advance() in custom tests).
- All rows between commit() calls are inserted in the same INSERT statement.
- Do not forget to flush if you want to terminate inserting:
inserter.end().await?;
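The threshold rule described in the list above can be modelled with a few lines of standard-library Rust. This is only a sketch of the decision, not the crate's code: the Limits and Pending types and the limits_reached function are hypothetical names, and the period bias is left out.

```rust
use std::time::Duration;

/// Hypothetical mirror of the thresholds configured on an inserter.
#[derive(Debug, Clone, Copy)]
struct Limits {
    max_bytes: u64,
    max_rows: u64,
    period: Option<Duration>,
}

/// What has been written since the last finished INSERT.
#[derive(Debug, Clone, Copy, Default)]
struct Pending {
    bytes: u64,
    rows: u64,
}

/// Returns true if any threshold is reached, i.e. a `commit()` call at this
/// point would be expected to end the active INSERT.
fn limits_reached(limits: &Limits, pending: &Pending, elapsed: Duration) -> bool {
    // A missing period means that only the size thresholds apply.
    let period_over = limits.period.map_or(false, |p| elapsed >= p);
    pending.bytes >= limits.max_bytes || pending.rows >= limits.max_rows || period_over
}
```

Under this model, calling commit() after two small rows and one second would leave the INSERT open, while the same call after 15 seconds (the configured period) would end it. This is also why calling commit() periodically matters for streams that emit items rarely.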
- lz4 (enabled by default): enables Compression::Lz4. If enabled, Compression::Lz4 is used by default for all queries.
- inserter: enables client.inserter().
- test-util: adds mocks. See the example. Use it only in dev-dependencies.
- uuid: adds serde::uuid to work with the uuid crate.
- time: adds serde::time to work with the time crate.
- chrono: adds serde::chrono to work with the chrono crate.
By default, TLS is disabled and one or more of the following features must be enabled to use HTTPS URLs:
- native-tls: uses native-tls, utilizing dynamic linking (e.g. against OpenSSL).
- rustls-tls: enables the rustls-tls-aws-lc and rustls-tls-webpki-roots features.
- rustls-tls-aws-lc: uses rustls with the aws-lc cryptography implementation.
- rustls-tls-ring: uses rustls with the ring cryptography implementation.
- rustls-tls-webpki-roots: uses rustls with certificates provided by the webpki-roots crate.
- rustls-tls-native-roots: uses rustls with certificates provided by the rustls-native-certs crate.

If multiple features are enabled, the following priority is applied:
- native-tls > rustls-tls-aws-lc > rustls-tls-ring
- rustls-tls-native-roots > rustls-tls-webpki-roots
How to choose between all these features? Here are some considerations:
- A good starting point is rustls-tls, e.g. if you use ClickHouse Cloud.
- To be more environment-agnostic, prefer rustls-tls over native-tls.
- Enable rustls-tls-native-roots or native-tls if you want to use self-signed certificates.
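As a concrete starting point, the rustls-tls recommendation could look like this in Cargo.toml. The version number simply mirrors the one used in the installation snippet of this document; adjust it to the release you actually depend on.

```toml
[dependencies]
clickhouse = { version = "0.13.3", features = ["rustls-tls"] }
```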
- (U)Int(8|16|32|64|128) maps to/from corresponding (u|i)(8|16|32|64|128) types or newtypes around them.
- (U)Int256 aren't supported directly, but there is a workaround for it.
- Float(32|64) maps to/from corresponding f(32|64) or newtypes around them.
- Decimal(32|64|128) maps to/from corresponding i(32|64|128) or newtypes around them. It's more convenient to use fixnum or another implementation of signed fixed-point numbers.
- Boolean maps to/from bool or newtypes around it.
- String maps to/from any string or bytes types, e.g. &str, &[u8], String, Vec<u8> or SmartString. Newtypes are also supported. To store bytes, consider using serde_bytes, because it's more efficient.
  Example
    #[derive(Row, Debug, Serialize, Deserialize)]
    struct MyRow<'a> {
        str: &'a str,
        string: String,
        #[serde(with = "serde_bytes")]
        bytes: Vec<u8>,
        #[serde(with = "serde_bytes")]
        byte_slice: &'a [u8],
    }
- FixedString(N) is supported as an array of bytes, e.g. [u8; N].
  Example
    #[derive(Row, Debug, Serialize, Deserialize)]
    struct MyRow {
        fixed_str: [u8; 16], // FixedString(16)
    }
- Enum(8|16) are supported using serde_repr. You could use #[repr(i8)] for Enum8 and #[repr(i16)] for Enum16.
  Example
    use serde_repr::{Deserialize_repr, Serialize_repr};

    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        level: Level,
    }

    #[derive(Debug, Serialize_repr, Deserialize_repr)]
    #[repr(i8)]
    enum Level {
        Debug = 1,
        Info = 2,
        Warn = 3,
        Error = 4,
    }
- UUID maps to/from uuid::Uuid by using serde::uuid. Requires the uuid feature.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        #[serde(with = "clickhouse::serde::uuid")]
        uuid: uuid::Uuid,
    }
- IPv6 maps to/from std::net::Ipv6Addr.
- IPv4 maps to/from std::net::Ipv4Addr by using serde::ipv4.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        #[serde(with = "clickhouse::serde::ipv4")]
        ipv4: std::net::Ipv4Addr,
    }
- Date maps to/from u16 or a newtype around it and represents a number of days elapsed since 1970-01-01. The following external types are supported:
  - time::Date is supported by using serde::time::date, requiring the time feature.
  - chrono::NaiveDate is supported by using serde::chrono::date, requiring the chrono feature.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        days: u16,
        #[serde(with = "clickhouse::serde::time::date")]
        date: Date,
        // if you prefer using chrono:
        #[serde(with = "clickhouse::serde::chrono::date")]
        date_chrono: NaiveDate,
    }
- Date32 maps to/from i32 or a newtype around it and represents a number of days elapsed since 1970-01-01. The following external types are supported:
  - time::Date is supported by using serde::time::date32, requiring the time feature.
  - chrono::NaiveDate is supported by using serde::chrono::date32, requiring the chrono feature.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        days: i32,
        #[serde(with = "clickhouse::serde::time::date32")]
        date: Date,
        // if you prefer using chrono:
        #[serde(with = "clickhouse::serde::chrono::date32")]
        date_chrono: NaiveDate,
    }
- DateTime maps to/from u32 or a newtype around it and represents a number of seconds elapsed since UNIX epoch. The following external types are supported:
  - time::OffsetDateTime is supported by using serde::time::datetime, requiring the time feature.
  - chrono::DateTime<Utc> is supported by using serde::chrono::datetime, requiring the chrono feature.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        ts: u32,
        #[serde(with = "clickhouse::serde::time::datetime")]
        dt: OffsetDateTime,
        // if you prefer using chrono:
        #[serde(with = "clickhouse::serde::chrono::datetime")]
        dt_chrono: DateTime<Utc>,
    }
- DateTime64(_) maps to/from i64 or a newtype around it and represents a time elapsed since UNIX epoch. The following external types are supported:
  - time::OffsetDateTime is supported by using serde::time::datetime64::*, requiring the time feature.
  - chrono::DateTime<Utc> is supported by using serde::chrono::datetime64::*, requiring the chrono feature.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        ts: i64, // elapsed s/us/ms/ns depending on `DateTime64(X)`
        #[serde(with = "clickhouse::serde::time::datetime64::secs")]
        dt64s: OffsetDateTime, // `DateTime64(0)`
        #[serde(with = "clickhouse::serde::time::datetime64::millis")]
        dt64ms: OffsetDateTime, // `DateTime64(3)`
        #[serde(with = "clickhouse::serde::time::datetime64::micros")]
        dt64us: OffsetDateTime, // `DateTime64(6)`
        #[serde(with = "clickhouse::serde::time::datetime64::nanos")]
        dt64ns: OffsetDateTime, // `DateTime64(9)`
        // if you prefer using chrono:
        #[serde(with = "clickhouse::serde::chrono::datetime64::secs")]
        dt64s_chrono: DateTime<Utc>, // `DateTime64(0)`
        #[serde(with = "clickhouse::serde::chrono::datetime64::millis")]
        dt64ms_chrono: DateTime<Utc>, // `DateTime64(3)`
        #[serde(with = "clickhouse::serde::chrono::datetime64::micros")]
        dt64us_chrono: DateTime<Utc>, // `DateTime64(6)`
        #[serde(with = "clickhouse::serde::chrono::datetime64::nanos")]
        dt64ns_chrono: DateTime<Utc>, // `DateTime64(9)`
    }
- Tuple(A, B, ...) maps to/from (A, B, ...) or a newtype around it.
- Array(_) maps to/from any slice, e.g. Vec<_>, &[_]. Newtypes are also supported.
- Map(K, V) can be deserialized as HashMap<K, V> or Vec<(K, V)>.
- LowCardinality(_) is supported seamlessly.
- Nullable(_) maps to/from Option<_>. For clickhouse::serde::* helpers add ::option.
  Example
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        #[serde(with = "clickhouse::serde::ipv4::option")]
        ipv4_opt: Option<Ipv4Addr>,
    }
- Nested is supported by providing multiple arrays with renaming.
  Example
    // CREATE TABLE test(items Nested(name String, count UInt32))
    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        #[serde(rename = "items.name")]
        items_name: Vec<String>,
        #[serde(rename = "items.count")]
        items_count: Vec<u32>,
    }
- Geo types are supported. Point behaves like a tuple (f64, f64), and the rest of the types are just slices of points.
  Example
    type Point = (f64, f64);
    type Ring = Vec<Point>;
    type Polygon = Vec<Ring>;
    type MultiPolygon = Vec<Polygon>;
    type LineString = Vec<Point>;
    type MultiLineString = Vec<LineString>;

    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        point: Point,
        ring: Ring,
        polygon: Polygon,
        multi_polygon: MultiPolygon,
        line_string: LineString,
        multi_line_string: MultiLineString,
    }
- Variant data type is supported as a Rust enum. As the inner Variant types are always sorted alphabetically, Rust enum variants should be defined in exactly the same order as in the data type; their names are irrelevant, only the order of the types matters. The following example has a column defined as Variant(Array(UInt16), Bool, Date, String, UInt32):
  Example
    #[derive(Serialize, Deserialize)]
    enum MyRowVariant {
        Array(Vec<u16>), // Array(UInt16)
        Boolean(bool),
        #[serde(with = "clickhouse::serde::time::date")]
        Date(time::Date),
        String(String),
        UInt32(u32),
    }

    #[derive(Row, Serialize, Deserialize)]
    struct MyRow {
        id: u64,
        var: MyRowVariant,
    }
- New JSON data type is currently supported as a string when using ClickHouse 24.10+. See this example for more details.
- Dynamic data type is not supported for now.
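Several of the mappings above expose the raw integer that ClickHouse stores: a day count for Date and Date32, a tick count for DateTime64, and a scaled integer for Decimal. The following standard-library sketch shows what those raw values mean. All function names are hypothetical helpers written for this illustration and are not part of the crate; in real code the serde::time or serde::chrono helpers and a fixed-point crate are the more convenient route.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date: the raw `Date32` value.
fn days_since_epoch(year: i64, month: u32, day: u32) -> i64 {
    // Shift the year so that it starts in March; leap days then fall at the end.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (i64::from(month) + 9) % 12; // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + i64::from(day) - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Raw `Date` value: the same day count, but it has to fit into a `u16`.
fn date_raw(year: i64, month: u32, day: u32) -> Option<u16> {
    let days = days_since_epoch(year, month, day);
    if (0..=i64::from(u16::MAX)).contains(&days) {
        Some(days as u16)
    } else {
        None
    }
}

/// Raw `DateTime64(precision)` value: ticks of 10^-precision seconds since
/// the UNIX epoch. Assumes a non-negative timestamp and a precision of 0..=9.
fn datetime64_raw(secs: i64, subsec_nanos: u32, precision: u32) -> i64 {
    assert!(precision <= 9, "precision must be in 0..=9");
    secs * 10i64.pow(precision) + i64::from(subsec_nanos) / 10i64.pow(9 - precision)
}

/// Renders the raw integer of a `Decimal(P, scale)` column as a decimal string.
/// Assumes a scale of at most 38, the largest that fits a 128-bit decimal.
fn decimal_to_string(raw: i128, scale: u32) -> String {
    let sign = if raw < 0 { "-" } else { "" };
    let abs = raw.unsigned_abs();
    if scale == 0 {
        return format!("{}{}", sign, abs);
    }
    let factor = 10u128.pow(scale);
    format!(
        "{}{}.{:0width$}",
        sign,
        abs / factor,
        abs % factor,
        width = scale as usize
    )
}
```

For instance, a timestamp of 1.5 seconds after the epoch is stored as 1500 in a DateTime64(3) column, which is why the plain i64 field in the DateTime64 example is only meaningful together with the column's precision.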
See also the additional examples:
The crate provides utils for mocking the ClickHouse server and testing DDL, SELECT and INSERT queries.
The functionality can be enabled with the test-util feature. Use it only in dev-dependencies.
See the example.