Lower total allocations #20

Merged 3 commits on Jun 24, 2024

FinleyMcIlwaine
Contributor

In #19, I mentioned that the changes lowered max residency but increased total allocations on Haddock. I've now investigated where the increased total allocations were coming from, and this patch reduces total allocations while keeping the reduced max residency.

Before this patch (and some patches on Haddock), the stats on the Agda codebase were:

  17,310,445,440 bytes allocated in the heap
   2,697,017,992 bytes copied during GC
     358,309,432 bytes maximum residency (13 sample(s))
       7,586,632 bytes maximum slop
             998 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3412 colls,     0 par    1.092s   1.105s     0.0003s    0.0032s
  Gen  1        13 colls,     0 par    0.906s   1.007s     0.0775s    0.2389s

  TASKS: 5 (1 bound, 4 peak workers (4 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.006s  (  0.006s elapsed)
  MUT     time    2.646s  (  3.243s elapsed)
  GC      time    1.998s  (  2.112s elapsed)
  EXIT    time    0.011s  (  0.006s elapsed)
  Total   time    4.661s  (  5.367s elapsed)

  Alloc rate    6,542,622,877 bytes per MUT second

  Productivity  56.8% of total user, 60.4% of total elapsed

        5.42 real         4.84 user         0.45 sys
          1215266816  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              211730  page reclaims
                   1  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                 123  voluntary context switches
                3482  involuntary context switches
            58613513  instructions retired
            13780861  cycles elapsed
             1458752  peak memory footprint

The patched stats are now:

  12,297,534,536 bytes allocated in the heap
   2,970,492,280 bytes copied during GC
     356,726,824 bytes maximum residency (13 sample(s))
       7,430,184 bytes maximum slop
             991 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2883 colls,     0 par    1.183s   1.197s     0.0004s    0.0031s
  Gen  1        13 colls,     0 par    0.902s   0.994s     0.0765s    0.2219s

  TASKS: 5 (1 bound, 4 peak workers (4 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.007s  (  0.007s elapsed)
  MUT     time    2.385s  (  2.978s elapsed)
  GC      time    2.084s  (  2.191s elapsed)
  EXIT    time    0.012s  (  0.010s elapsed)
  Total   time    4.487s  (  5.185s elapsed)

  Alloc rate    5,157,258,529 bytes per MUT second

  Productivity  53.1% of total user, 57.4% of total elapsed

        5.24 real         4.66 user         0.45 sys
          1207451648  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
              211180  page reclaims
                   1  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                 114  voluntary context switches
                3549  involuntary context switches
            58692171  instructions retired
            13607299  cycles elapsed
             1557120  peak memory footprint

Much lower total allocations (from 17,310,445,440 to 12,297,534,536 bytes, about 29% less), and even slightly lower max residency and runtime.

`isValidHtmlITag` was allocating a lot because it was converting `Builder` to
`ByteString` over and over. This patch refactors it to take a `ByteString`
directly and changes the tag part of HTML elements to be `ByteString`.
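
As a rough illustration of why this kind of refactor reduces allocations, here is a minimal sketch. It is not the code from this patch: `validTags`, `isValidTagBuilder`, and `isValidTagBS` are hypothetical names, and the tag set is made up. The `Builder` version has to run the builder and materialize a fresh `ByteString` on every check, while the `ByteString` version looks up the value it was given.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Char8 as BS8
import qualified Data.ByteString.Lazy as BL
import qualified Data.Set as Set

-- Hypothetical tag set, for illustration only (not the library's real list).
validTags :: Set.Set BS8.ByteString
validTags = Set.fromList ["a", "br", "div", "p", "span"]

-- "Before" shape: the tag is a Builder, so each check runs the builder
-- and copies the result into a new strict ByteString before the lookup.
isValidTagBuilder :: B.Builder -> Bool
isValidTagBuilder b =
  Set.member (BL.toStrict (B.toLazyByteString b)) validTags

-- "After" shape: the tag is already a ByteString, so the lookup needs
-- no intermediate conversion.
isValidTagBS :: BS8.ByteString -> Bool
isValidTagBS t = Set.member t validTags

main :: IO ()
main = do
  let tags = ["div", "br", "blink"] :: [BS8.ByteString]
  -- Both versions agree on the answer; only the allocation behaviour differs.
  print (map isValidTagBS tags)
  print (map (isValidTagBuilder . B.byteString) tags)
```

Running both variants with `+RTS -s` on a large input should make the difference in "bytes allocated in the heap" visible, though the exact numbers depend on the GHC version and optimization flags.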
@FinleyMcIlwaine
Contributor Author

The haddock test suite still passes with this patch as well, and the total allocation stats for Haddock in the GHC testsuite look good:

                    Test    Metric  Baseline value       New value Change
------------------------------------------------------------------------------
   haddock.Cabal(normal) run/alloc  25,079,445,576  16,158,150,784 -35.6% GOOD
    haddock.base(normal) run/alloc  44,144,831,840  40,661,879,512  -7.9% GOOD
haddock.compiler(normal) run/alloc 202,580,755,144 177,985,477,808 -12.1% GOOD

@mpickering

Very good work @FinleyMcIlwaine

@FinleyMcIlwaine
Contributor Author

@cdornan Gentle ping. Can this be merged?

@Bodigrim
Contributor

@cdornan just another reminder about this.

@Kleidukos
Member

Alright, let's do this.

@Kleidukos Kleidukos merged commit 2dc0c48 into haskell:master Jun 24, 2024
Kleidukos added a commit that referenced this pull request Nov 17, 2024