Congestion-Aware Initial Accumulated Cost #3031

Open
wants to merge 52 commits into base: master

soheilshahrouz
Contributor

@soheilshahrouz soheilshahrouz commented May 10, 2025

This PR adds a new option to bias routing away from congested channels during early iterations by adjusting the initial accumulated cost based on estimated CHANX/CHANY utilization.
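
The idea described above can be illustrated with a minimal sketch. It is not the code from this PR: the `WireNode` type, the `init_acc_cost` function, the `threshold` and `scale` parameters, and the base cost of 1.0 are all hypothetical names and assumptions chosen for illustration. The sketch raises the starting accumulated cost of CHANX/CHANY nodes in proportion to how far the estimated utilization of their channel exceeds a threshold, and leaves every other node at the base cost.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a routing-resource node; not VPR's RR-graph types.
enum class ChanType { CHANX, CHANY, OTHER };

struct WireNode {
    ChanType type;
    std::size_t x;  // grid location of the channel segment
    std::size_t y;
};

// Estimated utilization per grid location, indexed [x][y]
// (expected used tracks divided by available tracks).
using UtilGrid = std::vector<std::vector<double>>;

// Returns one initial accumulated cost per node.
// Assumption: a node starts at base_cost, and channel nodes get an extra
// penalty of scale * max(0, utilization - threshold).
std::vector<double> init_acc_cost(const std::vector<WireNode>& nodes,
                                  const UtilGrid& chanx_util,
                                  const UtilGrid& chany_util,
                                  double threshold,
                                  double scale,
                                  double base_cost = 1.0) {
    std::vector<double> acc_cost(nodes.size(), base_cost);
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const WireNode& n = nodes[i];
        if (n.type == ChanType::OTHER) {
            continue;  // pins, sources and sinks keep the base cost
        }
        const UtilGrid& util =
            (n.type == ChanType::CHANX) ? chanx_util : chany_util;
        const double excess = std::max(0.0, util[n.x][n.y] - threshold);
        acc_cost[i] += scale * excess;
    }
    return acc_cost;
}
```

With a threshold of 0.5 and a scale of 2.0, for example, a CHANX node in a channel estimated at 0.9 utilization would start at 1.8 instead of 1.0, so the router sees it as more expensive from the first iteration.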

@github-actions github-actions bot added the VPR (VPR FPGA Placement & Routing Tool), lang-cpp (C/C++ code), and libvtrutil labels May 10, 2025
@soheilshahrouz
Contributor Author

soheilshahrouz commented May 11, 2025

To-do list

  1. compute channel widths by iterating over RR nodes ---> done
  2. support 3D ---> CHANX/CHANY utilization is computed for all layers. CHANX nodes that represent vertical connections are mistakenly initialized with the acc_cost of the CHANX at that location.
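
Item 1 above, computing channel widths by iterating over RR nodes rather than reading them from the architecture file, might look roughly like the sketch below. The `RRWire` struct and `count_channel_widths` function are hypothetical simplifications, not VPR's RR-graph API. The sketch covers a single layer and assumes every wire's coordinates lie inside the grid; a 3D device would need an additional layer index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of a routing wire; not VPR's RR-graph API.
enum class WireKind { CHANX, CHANY, OTHER };

struct RRWire {
    WireKind kind;
    std::size_t xlow, ylow, xhigh, yhigh;  // inclusive span on the grid
};

struct ChannelWidths {
    std::vector<std::vector<int>> chanx;  // chanx[x][y] = CHANX tracks at (x, y)
    std::vector<std::vector<int>> chany;  // chany[x][y] = CHANY tracks at (x, y)
};

// Counts, for every grid location, how many CHANX and CHANY wires pass
// through it. Assumes all wire coordinates are within the grid bounds.
ChannelWidths count_channel_widths(const std::vector<RRWire>& wires,
                                   std::size_t grid_width,
                                   std::size_t grid_height) {
    ChannelWidths widths{
        std::vector<std::vector<int>>(grid_width, std::vector<int>(grid_height, 0)),
        std::vector<std::vector<int>>(grid_width, std::vector<int>(grid_height, 0))};
    for (const RRWire& w : wires) {
        if (w.kind == WireKind::OTHER) {
            continue;  // only channel wires contribute to channel width
        }
        auto& grid = (w.kind == WireKind::CHANX) ? widths.chanx : widths.chany;
        // A wire spanning several locations adds one track to each of them.
        for (std::size_t x = w.xlow; x <= w.xhigh; ++x) {
            for (std::size_t y = w.ylow; y <= w.yhigh; ++y) {
                ++grid[x][y];
            }
        }
    }
    return widths;
}
```

Because the counts come from the wires themselves, the same routine would work for a read-in rr-graph whose channel widths differ from the architecture file.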

@soheilshahrouz
Contributor Author

soheilshahrouz commented May 12, 2025

titan_quick_qor

| channel width | place_time | wl | cpd | route_time | pack_time | heap_pops | total_swaps |
|---|---|---|---|---|---|---|---|
| 300 | 1.0007 | 0.9991 | 0.9965 | 0.9993 | 0.9950 | 1.0067 | 1 |
| 270 | 1.0054 | 0.9984 | 1.0047 | 0.9615 | 0.9965 | 0.9914 | 1 |
| 250 | 0.9876 | 0.9981 | 1.0031 | 0.8403 | 0.9924 | 0.8510 | 1 |

In experiments with a channel width of 250, bitcoin_miner failed for both the master branch and this branch.

@soheilshahrouz soheilshahrouz changed the title from "[WIP] Initialize acc_cost with post-placement channel utilization estimate" to "Congestion-Aware Initial Accumulated Cost" May 16, 2025
@soheilshahrouz soheilshahrouz requested a review from vaughnbetz May 20, 2025 21:18
Contributor

@vaughnbetz vaughnbetz left a comment

This needs a lot of testing before it can be on by default:

  1. Koios QoR.
  2. VTR QoR on the binary search and on final routing at 1.3Wmin.
  3. 3D FPGAs work and have reasonable QoR.

Possible follow-on work: check whether drawing depends on the arch file channels, and update it if needed.

@vaughnbetz
Contributor

We could use this as an opportunity to make:

  1. Drawing use channel widths calculated from the rr-graph (not the arch file), so that it draws properly with a read-in rr-graph. See
    j += device_ctx.chan_width.y_list[i] + 1; /* N wires need N+1 units of space */
  2. Placement cost (linear congestion / scaling of wiring by average channel width over the bb) also respond to the rr-graph, not the arch file.

@vaughnbetz
Contributor

Probably moving this code out of stats.cpp would also be good -- it'll become more fundamental.

@soheilshahrouz
Contributor Author

VTR benchmarks minimum channel width search

Metric feature.txt
vtr_flow_elapsed_time 1.109599868
odin_synth_time
parmys_synth_time 0.9942209611
abc_depth 1
abc_synth_time 1.001840561
num_clb 1
num_memories 1
num_mult 1
max_vpr_mem 0.9914816969
num_pre_packed_blocks 1
num_post_packed_blocks 1
device_grid_tiles 1
pack_time 1.000728945
placed_wirelength_est 1
place_time 1.025983213
placed_CPD_est 1
min_chan_width 0.9906351292
routed_wirelength 1.002237975
min_chan_width_route_time 1.352100907
crit_path_routed_wirelength 1.001354999
critical_path_delay 1.000111178
geomean_nonvirtual_intradomain_critical_path_delay 1.000111178
crit_path_route_time 0.9794380194

Minimum channel width was reduced in two circuits.

@soheilshahrouz
Contributor Author

VTR benchmarks
1.3 x minimum channel width (1.3Wmin)

circuit route_time cpd heap_pops wl place_time pack_time
mcml.v 0.9625 0.9999531016 1.019235354 1.000281936 0.9766849315 1.001701838
LU32PEEng.v 0.9503321734 1.000014647 0.990690786 1.00014595 0.9954708105 0.9987463762
bgm.v 0.9737302977 0.9996530479 0.997730871 1.000230386 1.023058997 1.002453988
LU8PEEng.v 1.014973262 1.000074797 1.00165337 0.9988697785 1.0022553 0.9980750722
arm_core.v 1 1.000592746 0.9855161027 0.9969095411 1.004594181 1.045816253
stereovision2.v 1.005747126 0.9996107903 1.000784028 0.9993899826 1.076231263 1
stereovision1.v 1.088560886 1.00187781 1.121788749 1.00002572 1.027617952 0.9646504321

@vaughnbetz
Contributor

Looks good, thanks. We should get Koios results as well, since this is a big change.

@soheilshahrouz
Contributor Author

soheilshahrouz commented Jun 1, 2025

W=300

circuit CPD WL route_time heap_pops
tpu_like.small.os.v 1 1 1.015254237 1
tpu_like.small.ws.v 1 1 1.007668712 1
dla_like.small.v 1 0.999768608 1.018815717 1.004269293
bnn.v 1 0.9994662975 1.009292352 1.002568516
attention_layer.v 1 1 1.010739857 1
conv_layer_hls.v 1 1 0.9164619165 1
conv_layer.v 1 0.9993863978 1.053030303 1.00142193
eltwise_layer.v 1 0.9999351936 1.01025641 0.9879354833
robot_rl.v 1 1 0.9909090909 1
reduction_layer.v 1 1 0.9980430528 1
spmv.v 1 1 1.015789474 1
softmax.v 1 1 1.006349206 1
lenet.v 1.039334614 0.9979187016 1.005524862 0.9996246427
clstm_like.small.v 1 1 1.01551481 1
clstm_like.medium.v 0.9050457455 1.001017806 1.082175466 1.023845394
clstm_like.large.v 0.9523361003 1.000231865 0.974962828 0.9789848833
lstm.v 1 1 1.010317256 1
gemm_layer.v 0.8776401523 1.002186012 0.9898989899 1.013215695
tpu_like.large.os.v 1 1 1.0078075 1
tpu_like.large.ws.v 0.9912466445 0.9995306955 1.050833995 1.000956747
tdarknet_like.large.v 1 1 1.033892216 1
dla_like.medium.v 1 0.9998589942 1.004281637 0.994174936
bwave_like.float.small.v 1.020427253 1.000608795 0.9642172524 0.9816747451
bwave_like.fixed.small.v 1 1 1.013428827 1
dnnweaver.v 0.9881222569 1.001165979 0.9958448753 1.027681813
tdarknet_like.small.v 1 1 1.010583283 1
bwave_like.float.large.v 0.9734685948 1.001004839 0.9627392414 0.961273596
bwave_like.fixed.large.v 1 1.001696565 1.037675234 1.061526364
dla_like.large.v 0.9755507261 1.000620627 1.000162533 0.9959337593
proxy.1.v 1 0.9999891434 1.009653311 0.9993780357
proxy.2.v 1 1.000104018 0.9421366655 0.9980725353
proxy.3.v 1 1.000798333 0.9704296935 0.9980512289
proxy.4.v 1.009819358 1.000263529 1.012186732 1.009834591
proxy.5.v 1 1 0.9941634241 1
proxy.6.v 1 0.9998606597 1.00491763 1.00369854
proxy.7.v 1 1.000165903 1.00448682 1.013426537
proxy.8.v 1 0.9994262478 0.9944341373 0.9971751271
geomean 0.9927835526 1.000135276 1.003915663 1.001479038
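
The geomean row summarizes each column of per-circuit ratios. As a minimal sketch of how such a summary is typically computed (the `geomean` function below is illustrative and is not taken from the VTR scripts that produced this table), the geometric mean can be evaluated in log space to avoid overflow or underflow when multiplying many ratios:

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Geometric mean of strictly positive ratios, computed as
// exp(mean(log(r_i))) so that long products stay numerically stable.
double geomean(const std::vector<double>& ratios) {
    if (ratios.empty()) {
        throw std::invalid_argument("geomean of an empty list is undefined");
    }
    double log_sum = 0.0;
    for (double r : ratios) {
        if (r <= 0.0) {
            throw std::invalid_argument("geomean requires positive values");
        }
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

A geometric mean is the usual choice for normalized ratios because a 10% slowdown on one circuit and a matching speedup on another (1.1 and 1/1.1) cancel to exactly 1, which an arithmetic mean would not give.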

@soheilshahrouz
Contributor Author

vtr_reg_nightly_test7/3d_sb_titan_other_auto_bb

circuit CPD WL route_time heap_pops
carpat_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.03
CH_DFSIN_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
CHERI_stratixiv_arch_timing.blif 1.00 1.00 1.05 1.02
EKF-SLAM_Jacobians_stratixiv_arch_timing.blif 1.00 1.00 1.04 1.03
fir_cascade_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
jacobi_stratixiv_arch_timing.blif 1.00 1.00 1.07 1.12
JPEG_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.01
leon2_stratixiv_arch_timing.blif 1.00 1.00 0.96 0.93
leon3mp_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.01
MCML_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.00
MMM_stratixiv_arch_timing.blif 1.00 1.00 1.05 1.00
radar20_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
random_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
Reed_Solomon_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
smithwaterman_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
stap_steering_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
sudoku_check_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.01
SURF_desc_stratixiv_arch_timing.blif 1.00 1.00 0.98 1.01
ucsb_152_tap_fir_stratixiv_arch_timing.blif 1.00 1.00 1.04 1.00
uoft_raytracer_stratixiv_arch_timing.blif 1.00 1.00 1.03 1.04
wb_conmax_stratixiv_arch_timing.blif 1.00 1.00 0.99 1.00
picosoc_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.00
murax_stratixiv_arch_timing.blif 1.00 1.00 1.04 1.00
geomean 1.00 1.00 1.01 1.01

@soheilshahrouz
Contributor Author

vtr_reg_nightly_test7/3d_cb_titan_other_auto_bb

circuit CPD WL route_time heap_pops
carpat_stratixiv_arch_timing.blif 1.00 1.00 1.00 0.99
CH_DFSIN_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
CHERI_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.01
EKF-SLAM_Jacobians_stratixiv_arch_timing.blif 1.00 1.00 1.11 1.15
fir_cascade_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.00
jacobi_stratixiv_arch_timing.blif 1.00 1.00 1.03 1.01
JPEG_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
leon2_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.01
leon3mp_stratixiv_arch_timing.blif 1.00 1.00 0.98 0.99
MCML_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
MMM_stratixiv_arch_timing.blif 1.00 1.00 1.02 0.99
radar20_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
random_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
Reed_Solomon_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
smithwaterman_stratixiv_arch_timing.blif 1.00 1.00 1.01 1.00
stap_steering_stratixiv_arch_timing.blif 1.00 1.00 0.99 1.00
sudoku_check_stratixiv_arch_timing.blif 1.00 1.00 1.00 0.99
SURF_desc_stratixiv_arch_timing.blif 1.00 1.00 1.04 1.03
ucsb_152_tap_fir_stratixiv_arch_timing.blif 1.00 1.00 1.00 1.00
uoft_raytracer_stratixiv_arch_timing.blif 1.00 1.00 0.99 1.01
wb_conmax_stratixiv_arch_timing.blif 1.00 1.00 0.98 1.00
picosoc_stratixiv_arch_timing.blif 1.00 1.00 0.99 1.00
murax_stratixiv_arch_timing.blif 1.00 1.00 1.02 1.00
geomean 1.00 1.00 1.01 1.01

@soheilshahrouz
Contributor Author

vtr_reg_nightly_test7/vtr_reg_qor_large_run_flat

Metric feature.txt
vtr_flow_elapsed_time 1.043
odin_synth_time
parmys_synth_time 1.038
abc_depth 1
abc_synth_time 1.025
num_clb 1
num_memories 1
num_mult 1
max_vpr_mem 1.003
num_pre_packed_blocks 1
num_post_packed_blocks 1
device_grid_tiles 1
pack_time 1.005
placed_wirelength_est 1
place_time 1.024
placed_CPD_est 1
routed_wirelength 1.011
critical_path_delay 0.993
geomean_nonvirtual_intradomain_critical_path_delay 0.993
crit_path_route_time 1.049

@soheilshahrouz
Contributor Author

@vaughnbetz
I added QoR results for flat routing and 3D. This PR is ready to merge.
