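When running benchmarks of the forward transform (with the MW sampling scheme), the compile time appears to grow roughly linearly with the bandlimit L, and somewhat faster at the largest bandlimits: between L = 4096 and L = 8192 it roughly triples, from 1.5e+02s to 4.6e+02s.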
forward
(method: jax, L: 16, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.0011s, max(run times): 0.0011s, compile time: 1.1s, peak memory: 1.2e+04B, max(abs(error)): 2.0e-14, floating point ops: 4.3e+05, mem access: 7.9e+05B
(method: jax, L: 32, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.0018s, max(run times): 0.0018s, compile time: 1.3s, peak memory: 1.2e+04B, max(abs(error)): 1.5e-13, floating point ops: 2.0e+06, mem access: 3.2e+06B
(method: jax, L: 64, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.0037s, max(run times): 0.0037s, compile time: 1.8s, peak memory: 1.3e+04B, max(abs(error)): 3.7e-13, floating point ops: 9.2e+06, mem access: 1.3e+07B
(method: jax, L: 128, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.0078s, max(run times): 0.0078s, compile time: 2.8s, peak memory: 1.2e+04B, max(abs(error)): 1.3e-12, floating point ops: 4.1e+07, mem access: 5.1e+07B
(method: jax, L: 256, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.019s, max(run times): 0.019s, compile time: 5.0s, peak memory: 1.2e+04B, max(abs(error)): 6.9e-12, floating point ops: 1.8e+08, mem access: 2.0e+08B
(method: jax, L: 512, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.065s, max(run times): 0.065s, compile time: 8.7s, peak memory: 1.4e+04B, max(abs(error)): 3.9e-11, floating point ops: 8.2e+08, mem access: 8.1e+08B
(method: jax, L: 1024, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 0.20s, max(run times): 0.20s, compile time: 21.s, peak memory: 1.1e+04B, max(abs(error)): 1.6e-10, floating point ops: 3.6e+09, mem access: 3.2e+09B
(method: jax, L: 2048, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 1.2s, max(run times): 1.2s, compile time: 52.s, peak memory: 1.2e+04B, max(abs(error)): 1.1e-09, floating point ops: 1.5e+10, mem access: 1.3e+10B
(method: jax, L: 4096, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 10.s, max(run times): 10.s, compile time: 1.5e+02s, peak memory: 1.2e+04B, max(abs(error)): 5.8e-09, floating point ops: 6.7e+10, mem access: 5.2e+10B
(method: jax, L: 8192, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False, n_iter: None):
min(run times): 78.s, max(run times): 78.s, compile time: 4.6e+02s, peak memory: 1.2e+04B, max(abs(error)): 3.9e-08, floating point ops: 2.9e+11, mem access: 2.0e+11B
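To put a number on how the forward compile time scales, here is a minimal sketch that computes the log-log slope between consecutive measurements. The bandlimits and compile times are copied from the output above; the helper name `local_exponents` is only illustrative and is not part of the benchmarked library.

```python
import math

# Bandlimits and forward-transform compile times (seconds), copied from the
# benchmark output above.
bandlimits = [16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192]
compile_times = [1.1, 1.3, 1.8, 2.8, 5.0, 8.7, 21.0, 52.0, 150.0, 460.0]


def local_exponents(sizes, times):
    """Return the log-log slope between each pair of consecutive measurements.

    A slope p means the time scales roughly like size**p over that interval.
    """
    points = list(zip(sizes, times))
    return [
        math.log(t1 / t0) / math.log(n1 / n0)
        for (n0, t0), (n1, t1) in zip(points, points[1:])
    ]


exponents = local_exponents(bandlimits, compile_times)
for (low, high), p in zip(zip(bandlimits, bandlimits[1:]), exponents):
    print(f"L {low:>5} -> {high:>5}: compile time ~ L^{p:.2f}")
```

An exponent of 1 corresponds to linear growth. Values below 1 at small L are consistent with a fixed overhead dominating, and values above 1 at large L would indicate faster-than-linear growth.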
In comparison, for the inverse transform the compile time remains roughly constant:
inverse
(method: jax, L: 16, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.00072s, max(run times): 0.00078s, compile time: 0.43s, peak memory: 1.3e+04B, floating point ops: 3.0e+04, mem access: 1.3e+05B
(method: jax, L: 32, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.0015s, max(run times): 0.0015s, compile time: 0.48s, peak memory: 1.3e+04B, floating point ops: 1.4e+05, mem access: 5.1e+05B
(method: jax, L: 64, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.0032s, max(run times): 0.0032s, compile time: 0.45s, peak memory: 1.3e+04B, floating point ops: 6.1e+05, mem access: 2.2e+06B
(method: jax, L: 128, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.0068s, max(run times): 0.0068s, compile time: 0.44s, peak memory: 1.2e+04B, floating point ops: 2.7e+06, mem access: 8.7e+06B
(method: jax, L: 256, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.016s, max(run times): 0.017s, compile time: 0.46s, peak memory: 1.4e+04B, floating point ops: 1.2e+07, mem access: 3.5e+07B
(method: jax, L: 512, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.059s, max(run times): 0.060s, compile time: 0.48s, peak memory: 1.2e+04B, floating point ops: 5.1e+07, mem access: 1.3e+08B
(method: jax, L: 1024, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.13s, max(run times): 0.13s, compile time: 0.56s, peak memory: 1.2e+04B, floating point ops: 2.2e+08, mem access: 5.0e+08B
(method: jax, L: 2048, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 0.48s, max(run times): 0.48s, compile time: 0.57s, peak memory: 1.2e+04B, floating point ops: 9.5e+08, mem access: 2.0e+09B
(method: jax, L: 4096, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 3.5s, max(run times): 3.6s, compile time: 0.64s, peak memory: 1.7e+04B, floating point ops: 4.1e+09, mem access: 8.0e+09B
(method: jax, L: 16384, L_lower: 0, sampling: mw, spin: 0, L_to_nside_ratio: 2, reality: True, spmd: False):
min(run times): 2.2e+02s, max(run times): 2.2e+02s, compile time: 0.68s, peak memory: 1.3e+04B, floating point ops: 7.4e+10, mem access: 1.4e+11B
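For anyone wanting to reproduce the split between compile time and run time outside the benchmark harness, the following is a rough sketch using JAX's ahead-of-time lowering API (`jax.jit(...).lower(...).compile()`). It is not how the figures above were produced: `toy_transform` is a hypothetical stand-in rather than the library's transform, and the input shape is chosen only for illustration.

```python
import time

import jax
import jax.numpy as jnp


def toy_transform(x):
    # Hypothetical stand-in workload; NOT the benchmarked library function.
    return jnp.fft.fft(x, axis=-1).real.sum(axis=0)


def time_compile_and_run(fn, *args):
    """Return (compile_time, run_time) in seconds for fn applied to args.

    The compile time here includes tracing, lowering and XLA compilation.
    """
    start = time.perf_counter()
    compiled = jax.jit(fn).lower(*args).compile()
    compile_time = time.perf_counter() - start

    start = time.perf_counter()
    # Block so that asynchronous dispatch does not hide the execution time.
    compiled(*args).block_until_ready()
    run_time = time.perf_counter() - start
    return compile_time, run_time


if __name__ == "__main__":
    for L in (16, 32, 64):
        x = jnp.ones((L, 2 * L - 1))  # illustrative input shape (assumption)
        compile_time, run_time = time_compile_and_run(toy_transform, x)
        print(f"L: {L}, compile time: {compile_time:.3g}s, run time: {run_time:.3g}s")
```

Timing the two phases separately like this makes it easier to see whether the growth comes from tracing and compilation or from execution. If compile time grows with L for a given function, one possible cause (an assumption here, not something established above) is Python-level loops being unrolled during tracing.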