This repository was archived by the owner on Apr 18, 2022. It is now read-only.

Commit 3f90452

Wrap long lines in README
1 parent 40648fc commit 3f90452

File tree

1 file changed: +68 -17 lines changed

README.md

Lines changed: 68 additions & 17 deletions
@@ -1,20 +1,47 @@
# crypto-async
Native Cipher, Hash, and HMAC operations executed in Node's threadpool for
multi-core throughput.

## Motivation
#### Some issues with parts of the `crypto` module
* `crypto` cipher, hash and hmac streams are not really asynchronous. They
execute in C++, but only in the main thread and so they still block the event
loop. Encrypting 64 MB of data might block the event loop for +/- 70ms. Hashing
64 MB of data might block the event loop for +/- 190ms.
* These `crypto` operations do not take advantage of multiple CPU cores. Your
server may have 4 cores available but `crypto` will use only 1 of these 4 cores
for all encrypting and hashing operations.
* These `crypto` operations were not designed to use statically allocated
buffers. They allocate a new output buffer when encrypting or hashing data, even
if you already have an output buffer available. If you want to hash only a
portion of a buffer you must first create a slice. Thousands of JS object
allocations put unnecessary strain on the GC. This in turn leads to longer GC
pauses which also block the event loop.
* These `crypto` operations require multiple roundtrips between JS and C++ even
if you are only encrypting or hashing a single buffer.
* These `crypto` operations are not suitable for high-throughput network
protocols or filesystems which need to checksum and encrypt/decrypt large
amounts of data. Such a user-space network protocol or filesystem using `crypto`
might actually saturate a single CPU core with crypto operations before
saturating a fast local network or SSD disk.

#### Some new ideas with the `crypto-async` module
* Truly asynchronous. All calls execute asynchronously in the `node.js`
threadpool. This keeps the main thread and event loop free without blocking.
* Scalable across multiple CPU cores. While `crypto-async` is a fraction slower
per call than `crypto` (possibly because of the overhead of interacting with the
threadpool), for buffers larger than 1024 bytes it shines and provides N-cores
more throughput. `crypto-async` achieves up to 3x more throughput compared to
`crypto`.
* Zero-copy. All keys, ivs, source and target arguments can be passed directly
using offsets into existing buffers, without requiring any slices and without
allocating any temporary output buffers. This enables predictable memory usage
for programs with tight memory budgets.
* Designed to support the common use-case of encrypting or hashing a single
buffer, where memory is adequate and buffers are already in memory. This avoids
multiple round-trips between JS and C++.
* Separates the control plane and the data plane to enable high-throughput
applications.

## Performance
```
@@ -100,15 +127,33 @@ npm install crypto-async
## Usage

#### Adjust threadpool size and control concurrency
Node runs filesystem and DNS operations in the threadpool. The threadpool
consists of 4 threads by default. This means that at most 4 operations can be
running at any point in time. If any operation is slow to complete, it will
cause head-of-line blocking. The size of the threadpool should therefore be
increased at startup time (at the top of your script, before requiring any
modules) by setting the `UV_THREADPOOL_SIZE` environment variable (the absolute
maximum is 128 threads, which requires only ~1 MB memory in total according to
the [libuv docs](http://docs.libuv.org/en/v1.x/threadpool.html)).

Conventional wisdom would set the number of threads to the number of CPU cores,
but most operations running in the threadpool are not run hot, they are not
CPU-intensive and block mostly on IO. Issuing more IO operations than there are
CPU cores will increase throughput and will decrease latency per operation by
decreasing queueing time. On the other hand, `crypto-async` operations are
CPU-intensive. Issuing more `crypto-async` operations than there are CPU cores
will not increase throughput and will increase latency per operation by
increasing queueing time.

You should therefore:

1. Set the threadpool size to `IO` + `N`, where `IO` is the number of filesystem
and DNS operations you expect to be running concurrently, and where `N` is the
number of CPU cores available. This will reduce head-of-line blocking.

2. Allow or design for at most `N` `crypto-async` operations to be running
concurrently, where `N` is the number of CPU cores available. This will keep
latency within reasonable bounds.

```javascript
process.env['UV_THREADPOOL_SIZE'] = 128;
@@ -164,6 +209,11 @@ cryptoAsync.hmac(algorithm, key, source,
);
```

### Zero-Copy Methods

The following method alternatives require more arguments but support zero-copy
crypto operations, for reduced memory overhead and GC pressure.
#### Cipher (Zero-Copy)
```javascript
var cryptoAsync = require('crypto-async');
@@ -271,4 +321,5 @@ node benchmark.js

## AEAD Ciphers

AEAD ciphers such as GCM are currently not supported and may be added in future
as an `aead` method.

0 commit comments

Comments
 (0)