@gdalle
Contributor

gdalle commented Feb 16, 2025

Fixes #1853

Todo:

  • Add docs of what I learned

Related:

@codecov

codecov bot commented Feb 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 28.01%. Comparing base (e8cec0c) to head (533fd4e).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2309   +/-   ##
=======================================
  Coverage   28.01%   28.01%           
=======================================
  Files           2        2           
  Lines         207      207           
=======================================
  Hits           58       58           
  Misses        149      149           


@vchuravy
Member

In some way, this feels equivalent to implementing autodiff but for non-scalar returns and fixing my old mistake of always passing in one(T) as the seed.

I think of autodiff(Reverse, ...) generally as a VJP (with the convention that the output is updated in place).
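
For reference, the vector-Jacobian product (VJP) reading of reverse mode mentioned here can be written out as below. This is the standard textbook definition rather than anything specific to this PR, and the symbols $f$, $J_f$, $\bar{z}$ and $\bar{x}$ are notation chosen for this sketch.

```latex
% f : R^n -> R^m, with Jacobian J_f(x) in R^{m x n}
% \bar{z} in R^m is the output seed, \bar{x} in R^n is the input shadow
\bar{x} \leftarrow \bar{x} + J_f(x)^{\top} \bar{z}

% Scalar special case (m = 1) with the seed fixed to \bar{z} = 1:
\bar{x} \leftarrow \bar{x} + \nabla f(x)
```

Under this reading, always passing one(T) corresponds to the scalar special case, while a user-supplied seed covers non-scalar returns.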

@gdalle
Contributor Author

gdalle commented Feb 17, 2025

We could also call it autodiff, but then we'd need to figure out how the output seed is passed. It can't be inside a Duplicated or BatchDuplicated because we have no primal.

@vchuravy
Member

> It can't be inside a Duplicated or BatchDuplicated because we have no primal

Seed and BatchSeed? Just throwing some ideas into the air.

@gdalle
Contributor Author

gdalle commented Feb 17, 2025

And how would you see the order of arguments?

autodiff(Reverse, f, seed, args...)
autodiff(Reverse, f, args...; seed=...)

@vchuravy
Member

The first variant, since that is already the convention used for the activity of the return.

gdalle marked this pull request as ready for review March 9, 2025 10:10

@github-actions
Contributor

github-actions bot commented Mar 9, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Click here to view the suggested changes.
diff --git a/src/sugar.jl b/src/sugar.jl
index f15e6e2..3ed38de 100644
--- a/src/sugar.jl
+++ b/src/sugar.jl
@@ -1367,10 +1367,10 @@ julia> Enzyme.batchify_activity(Duplicated{Vector{Float64}}, Val(2))
 BatchDuplicated{Vector{Float64}, 2}

"""
-batchify_activity(::Type{Active{T}}, ::Val{B}) where {T,B} = Active{T}
-batchify_activity(::Type{Duplicated{T}}, ::Val{B}) where {T,B} = BatchDuplicated{T,B}
-batchify_activity(::Type{DuplicatedNoNeed{T}}, ::Val{B}) where {T,B} = BatchDuplicatedNoNeed{T,B}
-batchify_activity(::Type{MixedDuplicated{T}}, ::Val{B}) where {T,B} = BatchMixedDuplicated{T,B}
+batchify_activity(::Type{Active{T}}, ::Val{B}) where {T, B} = Active{T}
+batchify_activity(::Type{Duplicated{T}}, ::Val{B}) where {T, B} = BatchDuplicated{T, B}
+batchify_activity(::Type{DuplicatedNoNeed{T}}, ::Val{B}) where {T, B} = BatchDuplicatedNoNeed{T, B}
+batchify_activity(::Type{MixedDuplicated{T}}, ::Val{B}) where {T, B} = BatchMixedDuplicated{T, B}

"""
@@ -1378,12 +1378,12 @@ batchify_activity(::Type{MixedDuplicated{T}}, ::Val{B}) where {T,B} = BatchMixed

Wrapper for a single adjoint to the return value in reverse mode.
"""
-struct Seed{T,A}
+struct Seed{T, A}
     dval::T

-    function Seed(dval::T) where T
+    function Seed(dval::T) where {T}
         A = guess_activity(T, Reverse)
-        return new{T,A}(dval)
+        return new{T, A}(dval)
     end
 end

@@ -1395,10 +1395,10 @@ Wrapper for a tuple of adjoints to the return value in reverse mode.
 struct BatchSeed{N, T, AB}
     dvals::NTuple{N, T}

-    function BatchSeed(dvals::NTuple{N,T}) where {N,T}
+    function BatchSeed(dvals::NTuple{N, T}) where {N, T}
         A = guess_activity(T, Reverse)
         AB = batchify_activity(A, Val(N))
-        return new{N,T,AB}(dvals)
+        return new{N, T, AB}(dvals)
     end
 end

@@ -1417,7 +1417,7 @@ Useful for computing pullbacks / VJPs for functions whose output is not a scalar
 function autodiff(
         rmode::Union{ReverseMode{ReturnPrimal}, ReverseModeSplit{ReturnPrimal}},
         f::FA,
-        dresult::Seed{RT,RA},
+        dresult::Seed{RT, RA},
         args::Vararg{Annotation, N},
     ) where {ReturnPrimal, FA <: Annotation, RT, RA, N}
     rmode_split = Split(rmode)
@@ -1454,7 +1454,7 @@ Useful for computing pullbacks / VJPs for functions whose output is not a scalar
 function autodiff(
         rmode::Union{ReverseMode{ReturnPrimal}, ReverseModeSplit{ReturnPrimal}},
         f::FA,
-        dresults::BatchSeed{B,RT,RA},
+        dresults::BatchSeed{B, RT, RA},
         args::Vararg{Annotation, N},
     ) where {ReturnPrimal, B, FA <: Annotation, RT, RA, N}
     rmode_split_rightwidth = ReverseSplitWidth(Split(rmode), Val(B))
diff --git a/test/seeded.jl b/test/seeded.jl
index 68c3962..664a6fa 100644
--- a/test/seeded.jl
+++ b/test/seeded.jl
@@ -4,9 +4,9 @@ using Test

 @testset "Batchify activity" begin
     @test batchify_activity(Active{Float64}, Val(2)) == Active{Float64}
-    @test batchify_activity(Duplicated{Vector{Float64}}, Val(2)) == BatchDuplicated{Vector{Float64},2}
-    @test batchify_activity(DuplicatedNoNeed{Vector{Float64}}, Val(2)) == BatchDuplicatedNoNeed{Vector{Float64},2}
-    @test batchify_activity(MixedDuplicated{Tuple{Float64,Vector{Float64}}}, Val(2)) == BatchMixedDuplicated{Tuple{Float64,Vector{Float64}},2}
+    @test batchify_activity(Duplicated{Vector{Float64}}, Val(2)) == BatchDuplicated{Vector{Float64}, 2}
+    @test batchify_activity(DuplicatedNoNeed{Vector{Float64}}, Val(2)) == BatchDuplicatedNoNeed{Vector{Float64}, 2}
+    @test batchify_activity(MixedDuplicated{Tuple{Float64, Vector{Float64}}}, Val(2)) == BatchMixedDuplicated{Tuple{Float64, Vector{Float64}}, 2}
 end

 # the base case is a function returning (a(x, y), b(x, y))

@@ -56,11 +56,11 @@ dx_ref = da * 2x * y .+ db * abs2(y)
 dy_ref = da * sum(abs2, x) + db * sum(x) * 2y
 dxs_ref = (
     das[1] * 2x * y .+ dbs[1] * abs2(y),
-    das[2] * 2x * y .+ dbs[2] * abs2(y)
+    das[2] * 2x * y .+ dbs[2] * abs2(y),
 )
 dys_ref = (
     das[1] * sum(abs2, x) + dbs[1] * sum(x) * 2y,
-    das[2] * sum(abs2, x) + dbs[2] * sum(x) * 2y
+    das[2] * sum(abs2, x) + dbs[2] * sum(x) * 2y,
 )

 # input derivatives, (a+b) case

@@ -69,11 +69,11 @@ dx1_ref = (da + db) * (2x * y .+ abs2(y))
 dy1_ref = (da + db) * (sum(abs2, x) + sum(x) * 2y)
 dxs1_ref = (
     (das[1] + dbs[1]) * (2x * y .+ abs2(y)),
-    (das[2] + dbs[2]) * (2x * y .+ abs2(y))
+    (das[2] + dbs[2]) * (2x * y .+ abs2(y)),
 )
 dys1_ref = (
     (das[1] + dbs[1]) * (sum(abs2, x) + sum(x) * 2y),
-    (das[2] + dbs[2]) * (sum(abs2, x) + sum(x) * 2y)
+    (das[2] + dbs[2]) * (sum(abs2, x) + sum(x) * 2y),
 )

 # output seeds, weird cases

@@ -99,7 +99,7 @@ dzs6 = (MyMixedStruct(das[1], [dbs[1]]), MyMixedStruct(das[2], [dbs[2]]))

 # validation

 function validate_seeded_autodiff(f, dz, dzs)
-    @testset for mode in (Reverse, ReverseWithPrimal, ReverseSplitNoPrimal, ReverseSplitWithPrimal)
+    return @testset for mode in (Reverse, ReverseWithPrimal, ReverseSplitNoPrimal, ReverseSplitWithPrimal)
         @testset "Simple" begin
             dx = make_zero(x)
             dinputs_and_maybe_result = autodiff(mode, Const(f), Seed(dz), Duplicated(x, dx), Active(y))

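
The reference values in test/seeded.jl (dx_ref and dy_ref) are consistent with the hand derivation below. It assumes the base-case outputs are $a(x, y) = y \sum_i x_i^2$ and $b(x, y) = y^2 \sum_i x_i$. Those definitions are not shown in the diff and are inferred here from the reference expressions, so treat them as an assumption. The seeds da and db are written $\bar{a}$ and $\bar{b}$.

```latex
\begin{aligned}
\frac{\partial a}{\partial x_i} &= 2 x_i y, &
\frac{\partial b}{\partial x_i} &= y^2, \\
\frac{\partial a}{\partial y} &= \sum_i x_i^2, &
\frac{\partial b}{\partial y} &= 2 y \sum_i x_i, \\
\bar{x}_i &= \bar{a}\, 2 x_i y + \bar{b}\, y^2, &
\bar{y} &= \bar{a} \sum_i x_i^2 + \bar{b}\, 2 y \sum_i x_i .
\end{aligned}
```

The last row matches dx_ref = da * 2x * y .+ db * abs2(y) and dy_ref = da * sum(abs2, x) + db * sum(x) * 2y term by term.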

@gdalle
Contributor Author

gdalle commented Mar 9, 2025

@vchuravy how do I generalize addition beyond arrays to accumulate the adjoint into the shadow?

gdalle requested a review from vchuravy March 10, 2025 13:27
gdalle requested a review from wsmoses March 14, 2025 08:49

@gdalle
Contributor Author

gdalle commented Mar 14, 2025

@vchuravy any further comments?

@vchuravy
Member

Only, when looking at the examples, I feel like we should fuse Active and Seed(dresult), maybe as Seed{Active}(dresult), and then we could add Seed(x::Float64) = Seed{Active}(x).

@gdalle
Contributor Author

gdalle commented Mar 14, 2025

I guess that requires some form of automatic activity detection in the general case. Should I use guess_activity?

@gdalle
Contributor Author

gdalle commented Mar 20, 2025

Bump @vchuravy, in general do we want automatic activity detection for the seed?

@vchuravy
Member

vchuravy commented Apr 9, 2025

Sorry, I was on vacation.

> automatic activity detection for the seed?

Yes, I think so. At least it is consistent with the current activity deduction for the return, while still allowing the seed value to be specified.

@gdalle
Contributor Author

gdalle commented Apr 11, 2025

I put the activity detection inside autodiff itself because guess_activity needs the mode object too. And I added a little batch_activity routine because guess_activity never returns a BatchDuplicated. Let me know what you think @vchuravy

@gdalle
Contributor Author

gdalle commented May 24, 2025

Here's what I get with the Refs instead:

julia> autodiff(Reverse, Const(f6), Seed(dz6), Duplicated(x, zero(x)), Active(y))
ERROR: MethodError: no method matching (::Enzyme.Compiler.AdjointThunk{…})(::Const{…}, ::Duplicated{…}, ::Active{…}, ::@NamedTuple{})
This error has been manually thrown, explicitly, so the method may exist but be intentionally marked as unimplemented.

Closest candidates are:
  (::Enzyme.Compiler.AdjointThunk{PT, FA, RT, TT, Width, TapeT})(::FA, ::Any...) where {PT, FA, Width, RT, TT, TapeT}
   @ Enzyme ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4901

Stacktrace:
 [1] macro expansion
   @ ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:5041 [inlined]
 [2] enzyme_call(::Val{…}, ::Ptr{…}, ::Type{…}, ::Val{…}, ::Val{…}, ::Type{…}, ::Type{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Active{…}, ::@NamedTuple{})
   @ Enzyme.Compiler ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4997
 [3] AdjointThunk
   @ ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4901 [inlined]
 [4] autodiff(::ReverseMode{…}, ::Const{…}, ::Seed{…}, ::Duplicated{…}, ::Active{…})
   @ Enzyme ~/Documents/GitHub/Julia/Enzyme.jl/src/sugar.jl:1214
 [5] top-level scope
   @ ~/Documents/GitHub/Julia/Enzyme.jl/test/seeded.jl:165
Some type information was truncated. Use `show(err)` to see complete types.

@gdalle
Contributor Author

gdalle commented May 25, 2025

I think I figured out the method error: it seems that in the MixedDuplicated case the thunk expects the shadow values in addition to the tape.
Now I'm getting test failures, but I'm also getting this error, only in the batched case:

julia> autodiff(Reverse, Const(f6), BatchSeed(dzs6), BatchDuplicated(x, (zero(x), zero(x))), Active(y))
Stored value type does not match pointer operand type!
  store [2 x { double, {} addrspace(10)* }] %102, { double, {} addrspace(10)* }* %103, align 8, !dbg !1119
 [2 x { double, {} addrspace(10)* }]; Function Attrs: mustprogress nofree willreturn
define "enzyme_type"="{[0]:Float@double, [8]:Pointer, [8,0]:Pointer, [8,0,-1]:Float@double, [8,8]:Pointer, [8,8,0]:Integer, [8,8,1]:Integer, [8,8,2]:Integer, [8,8,3]:Integer, [8,8,4]:Integer, [8,8,5]:Integer, [8,8,6]:Integer, [8,8,7]:Integer, [8,8,8]:Pointer, [8,8,8,-1]:Float@double, [8,16]:Integer, [8,17]:Integer, [8,18]:Integer, [8,19]:Integer, [8,20]:Integer, [8,21]:Integer, [8,22]:Integer, [8,23]:Integer}" "enzymejl_parmtype"="13289153232" "enzymejl_parmtype_ref"="1" { double, {} addrspace(10)* } @preprocess_julia_f6_43005_inner.3({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" "enzymejl_parmtype"="5136209872" "enzymejl_parmtype_ref"="2" %0, double "enzyme_type"="{[-1]:Float@double}" "enzymejl_parmtype"="5190083824" "enzymejl_parmtype_ref"="0" %1) local_unnamed_addr #10 !dbg !371 {
entry:
  %pgcstack.i = call {}*** @julia.get_pgcstack() #20, !noalias !372
  %ptls_field.i16 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 2
  %2 = bitcast {}*** %ptls_field.i16 to i64***
  %ptls_load.i1718 = load i64**, i64*** %2, align 8, !tbaa !13, !noalias !372
  %3 = getelementptr inbounds i64*, i64** %ptls_load.i1718, i64 2
  %safepoint.i = load i64*, i64** %3, align 8, !tbaa !17, !noalias !372
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint.i) #20, !dbg !376, !noalias !372
  fence syncscope("singlethread") seq_cst
  %4 = call fastcc double @julia__mapreduce_43055({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %0) #21, !dbg !378, !noalias !372
  %5 = bitcast {} addrspace(10)* %0 to i8 addrspace(10)*, !dbg !387
  %6 = addrspacecast i8 addrspace(10)* %5 to i8 addrspace(11)*, !dbg !387
  %7 = getelementptr inbounds i8, i8 addrspace(11)* %6, i64 16, !dbg !387
  %8 = bitcast i8 addrspace(11)* %7 to i64 addrspace(11)*, !dbg !387
  %9 = load i64, i64 addrspace(11)* %8, align 8, !dbg !387, !tbaa !30, !alias.scope !33, !noalias !401, !enzyme_type !41, !enzymejl_source_type_Int64 !0, !enzymejl_byref_BITS_VALUE !0, !enzyme_inactive !0
  switch i64 %9, label %L34.i [
    i64 0, label %julia_f6_43005_inner.exit
    i64 1, label %L15.i
  ], !dbg !402

L15.i:                                            ; preds = %entry
  %10 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !403
  %11 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %10 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !403
  %12 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !403
  %13 = addrspacecast {} addrspace(10)** addrspace(10)* %12 to {} addrspace(10)** addrspace(11)*, !dbg !403
  %14 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %13, align 8, !dbg !403, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %15 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %11, i64 0, i32 1, !dbg !403
  %16 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %15, align 8, !dbg !403, !tbaa !49, !alias.scope !33, !noalias !401, !dereferenceable_or_null !54, !align !55, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
  %17 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %16, {} addrspace(10)** noundef %14) #20, !dbg !403
  %18 = bitcast {} addrspace(10)* addrspace(13)* %17 to double addrspace(13)*, !dbg !403
  %19 = load double, double addrspace(13)* %18, align 8, !dbg !403, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  br label %julia_f6_43005_inner.exit, !dbg !406

L34.i:                                            ; preds = %entry
  %20 = icmp sgt i64 %9, 15, !dbg !407
  br i1 %20, label %L99.i, label %L36.i, !dbg !409

L36.i:                                            ; preds = %L34.i
  %21 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !410
  %22 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %21 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !410
  %23 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !410
  %24 = addrspacecast {} addrspace(10)** addrspace(10)* %23 to {} addrspace(10)** addrspace(11)*, !dbg !410
  %25 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %24, align 8, !dbg !410, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %26 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %22, i64 0, i32 1, !dbg !410
  %27 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %26, align 8, !dbg !410, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
  %28 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %27, {} addrspace(10)** noundef %25) #20, !dbg !410
  %29 = bitcast {} addrspace(10)* addrspace(13)* %28 to double addrspace(13)*, !dbg !410
  %30 = load double, double addrspace(13)* %29, align 8, !dbg !410, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  %31 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %28, i64 1, !dbg !412
  %32 = bitcast {} addrspace(10)* addrspace(13)* %31 to double addrspace(13)*, !dbg !412
  %33 = load double, double addrspace(13)* %32, align 8, !dbg !412, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  %34 = fadd double %30, %33, !dbg !414
  %.not2223 = icmp sgt i64 %9, 2, !dbg !417
  br i1 %.not2223, label %L77.i.preheader, label %julia_f6_43005_inner.exit, !dbg !419

L77.i.preheader:                                  ; preds = %L36.i
  br label %L77.i, !dbg !419

L77.i:                                            ; preds = %L77.i.preheader, %L77.i
  %iv = phi i64 [ 0, %L77.i.preheader ], [ %iv.next, %L77.i ]
  %value_phi3.i24 = phi double [ %40, %L77.i ], [ %34, %L77.i.preheader ]
  %35 = add nuw nsw i64 %iv, 2, !dbg !420
  %iv.next = add nuw nsw i64 %iv, 1, !dbg !420
  %36 = add nuw nsw i64 %35, 1, !dbg !420
  %37 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %28, i64 %35, !dbg !422
  %38 = bitcast {} addrspace(10)* addrspace(13)* %37 to double addrspace(13)*, !dbg !422
  %39 = load double, double addrspace(13)* %38, align 8, !dbg !422, !tbaa !58, !alias.scope !61, !noalias !405
  %40 = fadd double %value_phi3.i24, %39, !dbg !423
  %exitcond.not = icmp eq i64 %36, %9, !dbg !417
  br i1 %exitcond.not, label %julia_f6_43005_inner.exit.loopexit, label %L77.i, !dbg !419

L99.i:                                            ; preds = %L34.i
  %41 = call fastcc double @julia_mapreduce_impl_43034({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %0, i64 noundef signext 1, i64 noundef signext %9) #21, !dbg !426, !noalias !372
  br label %julia_f6_43005_inner.exit, !dbg !428

julia_f6_43005_inner.exit.loopexit:               ; preds = %L77.i
  br label %julia_f6_43005_inner.exit, !dbg !429

julia_f6_43005_inner.exit:                        ; preds = %julia_f6_43005_inner.exit.loopexit, %L99.i, %L36.i, %L15.i, %entry
  %value_phi.i = phi double [ %19, %L15.i ], [ %41, %L99.i ], [ 0.000000e+00, %entry ], [ %34, %L36.i ], [ %40, %julia_f6_43005_inner.exit.loopexit ]
  %42 = fmul double %4, %1, !dbg !429
  %current_task1.i15 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 -14
  %43 = fmul double %1, %1, !dbg !430
  %44 = fmul double %43, %value_phi.i, !dbg !432
  %45 = call noalias "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Pointer, [-1,8,-1]:Float@double}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 noundef 1) #22, !dbg !433, !noalias !372
  %46 = bitcast {} addrspace(10)* %45 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !436
  %47 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %46 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !436
  %48 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %47, i64 0, i32 1, !dbg !436
  %49 = bitcast {} addrspace(10)** addrspace(11)* %48 to i8* addrspace(11)*, !dbg !436
  %50 = load i8*, i8* addrspace(11)* %49, align 8, !dbg !436, !tbaa !17, !alias.scope !199, !noalias !438, !nonnull !0, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %51 = bitcast {}*** %current_task1.i15 to {}*, !dbg !439
  %52 = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %51, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #23, !dbg !439, !noalias !372
  %53 = bitcast {} addrspace(10)* %52 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !439
  %54 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %53 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !439
  %.repack = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %54, i64 0, i32 0, !dbg !439
  store i8* %50, i8* addrspace(11)* %.repack, align 8, !dbg !439, !tbaa !49, !alias.scope !33, !noalias !440
  %.repack19 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %54, i64 0, i32 1, !dbg !439
  store {} addrspace(10)* %45, {} addrspace(10)* addrspace(11)* %.repack19, align 8, !dbg !439, !tbaa !49, !alias.scope !33, !noalias !440
  %55 = bitcast {} addrspace(10)* %52 to i8 addrspace(10)*, !dbg !439
  %56 = addrspacecast i8 addrspace(10)* %55 to i8 addrspace(11)*, !dbg !439
  %57 = getelementptr inbounds i8, i8 addrspace(11)* %56, i64 16, !dbg !439
  %58 = bitcast i8 addrspace(11)* %57 to i64 addrspace(11)*, !dbg !439
  store i64 1, i64 addrspace(11)* %58, align 8, !dbg !439, !tbaa !30, !alias.scope !33, !noalias !440
  %59 = bitcast i8* %50 to {} addrspace(10)**, !dbg !443
  %60 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %45, {} addrspace(10)** noundef %59) #20, !dbg !446
  %61 = bitcast {} addrspace(10)* addrspace(13)* %60 to double addrspace(13)*, !dbg !446
  store double %44, double addrspace(13)* %61, align 8, !dbg !446, !tbaa !58, !alias.scope !61, !noalias !447
  %.fca.0.insert = insertvalue { double, {} addrspace(10)* } poison, double %42, 0, !dbg !448
  %.fca.1.insert = insertvalue { double, {} addrspace(10)* } %.fca.0.insert, {} addrspace(10)* %52, 1, !dbg !448
  ret { double, {} addrspace(10)* } %.fca.1.insert, !dbg !448
}

; Function Attrs: mustprogress nofree
define internal "enzymejl_parmtype"="13289153232" "enzymejl_parmtype_ref"="1" { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } } @augmented_julia_f6_43005_inner.3({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" "enzymejl_parmtype"="5136209872" "enzymejl_parmtype_ref"="2" %0, [2 x {} addrspace(10)*] %"'", double "enzyme_type"="{[-1]:Float@double}" "enzymejl_parmtype"="5190083824" "enzymejl_parmtype_ref"="0" %1) local_unnamed_addr #19 !dbg !986 {
entry:
  %2 = alloca { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, align 8
  %3 = getelementptr inbounds { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, i32 0, i32 0
  %4 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 2
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %4, align 8
  %5 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 2
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %5, align 8
  %6 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 3, i32 0
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %6, align 8
  %7 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 3, i32 1
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %7, align 8
  %8 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 4, i32 0
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %8, align 8
  %9 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 4, i32 1
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %9, align 8
  %10 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 0, i32 0
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %10, align 8
  %11 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 0, i32 1
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %11, align 8
  %12 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 1, i32 0
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %12, align 8
  %13 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 1, i32 1
  store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %13, align 8
  %"iv'ac" = alloca i64, align 8
  %pgcstack.i = call {}*** @julia.get_pgcstack() #20, !noalias !987
  %ptls_field.i16 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 2
  %14 = bitcast {}*** %ptls_field.i16 to i64***
  %ptls_load.i1718 = load i64**, i64*** %14, align 8, !tbaa !13, !alias.scope !991, !noalias !994
  %15 = getelementptr inbounds i64*, i64** %ptls_load.i1718, i64 2
  %safepoint.i = load i64*, i64** %15, align 8, !tbaa !17, !alias.scope !1000, !noalias !1003
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint.i) #20, !dbg !1006, !noalias !987
  fence syncscope("singlethread") seq_cst
  %_augmented = call fastcc { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } @augmented_julia__mapreduce_43055({} addrspace(10)* nocapture nofree readonly align 8 %0, [2 x {} addrspace(10)*] %"'"), !dbg !1008
  %subcache = extractvalue { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } %_augmented, 0, !dbg !1008
  %16 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 0, !dbg !1008
  store { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* } %subcache, { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }* %16, align 8, !dbg !1008
  %17 = extractvalue { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } %_augmented, 1, !dbg !1008
  %18 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 1, !dbg !1017
  store double %17, double* %18, align 8, !dbg !1017
  %19 = bitcast {} addrspace(10)* %0 to i8 addrspace(10)*, !dbg !1017
  %20 = addrspacecast i8 addrspace(10)* %19 to i8 addrspace(11)*, !dbg !1017
  %21 = getelementptr inbounds i8, i8 addrspace(11)* %20, i64 16, !dbg !1017
  %22 = bitcast i8 addrspace(11)* %21 to i64 addrspace(11)*, !dbg !1017
  %23 = load i64, i64 addrspace(11)* %22, align 8, !dbg !1017, !tbaa !30, !alias.scope !1031, !noalias !1034, !enzyme_type !41, !enzymejl_source_type_Int64 !0, !enzymejl_byref_BITS_VALUE !0, !enzyme_inactive !0
  switch i64 %23, label %L34.i [
    i64 0, label %julia_f6_43005_inner.exit
    i64 1, label %L15.i
  ], !dbg !1037

L15.i:                                            ; preds = %entry
  %24 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1038
  %"'ipc" = bitcast {} addrspace(10)* %24 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
  %25 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1038
  %"'ipc10" = bitcast {} addrspace(10)* %25 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
  %26 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
  %"'ipc11" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
  %"'ipc12" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc10" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
  %27 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %26 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
  %28 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1038
  %"'ipc15" = bitcast {} addrspace(10)* %28 to {} addrspace(10)** addrspace(10)*, !dbg !1038
  %29 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1038
  %"'ipc16" = bitcast {} addrspace(10)* %29 to {} addrspace(10)** addrspace(10)*, !dbg !1038
  %30 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !1038
  %"'ipc17" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc15" to {} addrspace(10)** addrspace(11)*, !dbg !1038
  %"'ipc18" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc16" to {} addrspace(10)** addrspace(11)*, !dbg !1038
  %31 = addrspacecast {} addrspace(10)** addrspace(10)* %30 to {} addrspace(10)** addrspace(11)*, !dbg !1038
  %"'ipl19" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc17", align 8, !dbg !1038, !tbaa !49, !alias.scope !1040, !noalias !1041
  %"'ipl20" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc18", align 8, !dbg !1038, !tbaa !49, !alias.scope !1042, !noalias !1043
  %32 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %31, align 8, !dbg !1038, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %"'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc11", i64 0, i32 1, !dbg !1038
  %"'ipg13" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc12", i64 0, i32 1, !dbg !1038
  %33 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %27, i64 0, i32 1, !dbg !1038
  %"'ipl" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg", align 8, !dbg !1038, !tbaa !49, !alias.scope !1040, !noalias !1041, !dereferenceable_or_null !54
  %"'ipl14" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg13", align 8, !dbg !1038, !tbaa !49, !alias.scope !1042, !noalias !1043, !dereferenceable_or_null !54
  %34 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %33, align 8, !dbg !1038, !tbaa !49, !alias.scope !1031, !noalias !1034, !dereferenceable_or_null !54, !align !55, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
  %35 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl", {} addrspace(10)** %"'ipl19"), !dbg !1038
  %36 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl14", {} addrspace(10)** %"'ipl20"), !dbg !1038
  %37 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %34, {} addrspace(10)** noundef %32) #20, !dbg !1038
  %38 = bitcast {} addrspace(10)* addrspace(13)* %37 to double addrspace(13)*, !dbg !1038
  %39 = load double, double addrspace(13)* %38, align 8, !dbg !1038, !tbaa !58, !alias.scope !1044, !noalias !1047, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  br label %julia_f6_43005_inner.exit, !dbg !1050

L34.i:                                            ; preds = %entry
  %40 = icmp sgt i64 %23, 15, !dbg !1051
  br i1 %40, label %L99.i, label %L36.i, !dbg !1053

L36.i:                                            ; preds = %L34.i
  %41 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1054
  %"'ipc21" = bitcast {} addrspace(10)* %41 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
  %42 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1054
  %"'ipc22" = bitcast {} addrspace(10)* %42 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
  %43 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
  %"'ipc23" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc21" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
  %"'ipc24" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc22" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
  %44 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %43 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
  %45 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1054
  %"'ipc29" = bitcast {} addrspace(10)* %45 to {} addrspace(10)** addrspace(10)*, !dbg !1054
  %46 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1054
  %"'ipc30" = bitcast {} addrspace(10)* %46 to {} addrspace(10)** addrspace(10)*, !dbg !1054
  %47 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !1054
  %"'ipc31" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc29" to {} addrspace(10)** addrspace(11)*, !dbg !1054
  %"'ipc32" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc30" to {} addrspace(10)** addrspace(11)*, !dbg !1054
  %48 = addrspacecast {} addrspace(10)** addrspace(10)* %47 to {} addrspace(10)** addrspace(11)*, !dbg !1054
  %"'ipl33" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc31", align 8, !dbg !1054, !tbaa !49, !alias.scope !1040, !noalias !1041
  %"'ipl34" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc32", align 8, !dbg !1054, !tbaa !49, !alias.scope !1042, !noalias !1043
  %49 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %48, align 8, !dbg !1054, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %"'ipg25" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc23", i64 0, i32 1, !dbg !1054
  %"'ipg26" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc24", i64 0, i32 1, !dbg !1054
  %50 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %44, i64 0, i32 1, !dbg !1054
  %"'ipl27" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg25", align 8, !dbg !1054, !tbaa !49, !alias.scope !1040, !noalias !1041
  %"'ipl28" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg26", align 8, !dbg !1054, !tbaa !49, !alias.scope !1042, !noalias !1043
  %51 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %50, align 8, !dbg !1054, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
  %52 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl27", {} addrspace(10)** %"'ipl33"), !dbg !1054
  %53 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl28", {} addrspace(10)** %"'ipl34"), !dbg !1054
  %54 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %51, {} addrspace(10)** noundef %49) #20, !dbg !1054
  %55 = bitcast {} addrspace(10)* addrspace(13)* %54 to double addrspace(13)*, !dbg !1054
  %56 = load double, double addrspace(13)* %55, align 8, !dbg !1054, !tbaa !58, !alias.scope !1056, !noalias !1059, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  %57 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %54, i64 1, !dbg !1062
  %58 = bitcast {} addrspace(10)* addrspace(13)* %57 to double addrspace(13)*, !dbg !1062
  %59 = load double, double addrspace(13)* %58, align 8, !dbg !1062, !tbaa !58, !alias.scope !1056, !noalias !1059, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
  %60 = fadd double %56, %59, !dbg !1064
  %.not2223 = icmp sgt i64 %23, 2, !dbg !1067
  br i1 %.not2223, label %L77.i.preheader, label %julia_f6_43005_inner.exit, !dbg !1069

L77.i.preheader:                                  ; preds = %L36.i
  %61 = add i64 %23, -3, !dbg !1069
  br label %L77.i, !dbg !1069

L77.i:                                            ; preds = %L77.i, %L77.i.preheader
  %iv = phi i64 [ 0, %L77.i.preheader ], [ %iv.next, %L77.i ]
  %value_phi3.i24 = phi double [ %67, %L77.i ], [ %60, %L77.i.preheader ]
  %iv.next = add nuw nsw i64 %iv, 1, !dbg !1070
  %62 = add nuw nsw i64 %iv, 2, !dbg !1070
  %63 = add nuw nsw i64 %62, 1, !dbg !1070
  %64 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %54, i64 %62, !dbg !1072
  %65 = bitcast {} addrspace(10)* addrspace(13)* %64 to double addrspace(13)*, !dbg !1072
  %66 = load double, double addrspace(13)* %65, align 8, !dbg !1072, !tbaa !58, !alias.scope !1056, !noalias !1059
  %67 = fadd double %value_phi3.i24, %66, !dbg !1073
  %exitcond.not = icmp eq i64 %63, %23, !dbg !1067
  br i1 %exitcond.not, label %julia_f6_43005_inner.exit.loopexit, label %L77.i, !dbg !1069

L99.i:                                            ; preds = %L34.i
  %_augmented35 = call fastcc { {} addrspace(10)*, double } @augmented_julia_mapreduce_impl_43034({} addrspace(10)* nocapture nofree readonly align 8 %0, [2 x {} addrspace(10)*] %"'", i64 signext 1, i64 signext %23), !dbg !1076
  %subcache36 = extractvalue { {} addrspace(10)*, double } %_augmented35, 0, !dbg !1076
  %68 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 2, !dbg !1076
  store {} addrspace(10)* %subcache36, {} addrspace(10)** %68, align 8, !dbg !1076
  %69 = extractvalue { {} addrspace(10)*, double } %_augmented35, 1, !dbg !1076
  br label %julia_f6_43005_inner.exit, !dbg !1078

julia_f6_43005_inner.exit.loopexit:               ; preds = %L77.i
  br label %julia_f6_43005_inner.exit, !dbg !1079

julia_f6_43005_inner.exit:                        ; preds = %julia_f6_43005_inner.exit.loopexit, %L99.i, %L36.i, %L15.i, %entry
  %value_phi.i = phi double [ %39, %L15.i ], [ %69, %L99.i ], [ 0.000000e+00, %entry ], [ %60, %L36.i ], [ %67, %julia_f6_43005_inner.exit.loopexit ]
  %70 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 5
  store double %value_phi.i, double* %70, align 8
  %current_task1.i15 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 -14
  %71 = fmul double %1, %1, !dbg !1080
  %72 = fmul double %71, %value_phi.i, !dbg !1082
  %73 = call {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 1), !dbg !1083
  %74 = bitcast {} addrspace(10)* %73 to <{ i64, i8* }> addrspace(10)*, !dbg !1083
  %75 = getelementptr inbounds <{ i64, i8* }>, <{ i64, i8* }> addrspace(10)* %74, i32 0, i32 1, !dbg !1083
  %76 = load i8*, i8* addrspace(10)* %75, align 8, !dbg !1083
  call void @llvm.memset.p0i8.i64(i8* align 8 %76, i8 0, i64 8, i1 false), !dbg !1083
  %77 = insertvalue [2 x {} addrspace(10)*] undef, {} addrspace(10)* %73, 0, !dbg !1083
  %78 = call {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 1), !dbg !1083
  %79 = bitcast {} addrspace(10)* %78 to <{ i64, i8* }> addrspace(10)*, !dbg !1083
  %80 = getelementptr inbounds <{ i64, i8* }>, <{ i64, i8* }> addrspace(10)* %79, i32 0, i32 1, !dbg !1083
  %81 = load i8*, i8* addrspace(10)* %80, align 8, !dbg !1083
  call void @llvm.memset.p0i8.i64(i8* align 8 %81, i8 0, i64 8, i1 false), !dbg !1083
  %82 = insertvalue [2 x {} addrspace(10)*] %77, {} addrspace(10)* %78, 1, !dbg !1083
  %83 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 4, !dbg !1083
  store [2 x {} addrspace(10)*] %82, [2 x {} addrspace(10)*]* %83, align 8, !dbg !1083
  %84 = call noalias "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Pointer, [-1,8,-1]:Float@double}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 noundef 1) #21, !dbg !1083, !noalias !987
  %"'ipc61" = bitcast {} addrspace(10)* %73 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
  %"'ipc62" = bitcast {} addrspace(10)* %78 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
  %85 = bitcast {} addrspace(10)* %84 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
  %"'ipc63" = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %"'ipc61" to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
  %"'ipc64" = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %"'ipc62" to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
  %86 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %85 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
  %"'ipg65" = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %"'ipc63", i64 0, i32 1, !dbg !1086
  %"'ipg66" = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %"'ipc64", i64 0, i32 1, !dbg !1086
  %87 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %86, i64 0, i32 1, !dbg !1086
  %"'ipc67" = bitcast {} addrspace(10)** addrspace(11)* %"'ipg65" to i8* addrspace(11)*, !dbg !1086
  %"'ipc68" = bitcast {} addrspace(10)** addrspace(11)* %"'ipg66" to i8* addrspace(11)*, !dbg !1086
  %88 = bitcast {} addrspace(10)** addrspace(11)* %87 to i8* addrspace(11)*, !dbg !1086
  %"'ipl69" = load i8*, i8* addrspace(11)* %"'ipc67", align 8, !dbg !1086, !tbaa !17, !alias.scope !1088, !noalias !1091, !nonnull !0
  %"'ipl70" = load i8*, i8* addrspace(11)* %"'ipc68", align 8, !dbg !1086, !tbaa !17, !alias.scope !1094, !noalias !1095, !nonnull !0
  %89 = load i8*, i8* addrspace(11)* %88, align 8, !dbg !1086, !tbaa !17, !alias.scope !1096, !noalias !1097, !nonnull !0, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
  %90 = bitcast {}*** %current_task1.i15 to {}*, !dbg !1098
  %"'mi58" = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %90, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #22, !dbg !1098
  %91 = bitcast {} addrspace(10)* %"'mi58" to i8 addrspace(10)*, !dbg !1098
  call void @llvm.memset.p10i8.i64(i8 addrspace(10)* nonnull dereferenceable(24) dereferenceable_or_null(24) %91, i8 0, i64 24, i1 false), !dbg !1098
  %92 = insertvalue [2 x {} addrspace(10)*] undef, {} addrspace(10)* %"'mi58", 0, !dbg !1098
  %"'mi59" = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %90, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #22, !dbg !1098
  %93 = bitcast {} addrspace(10)* %"'mi59" to i8 addrspace(10)*, !dbg !1098
  call void @llvm.memset.p10i8.i64(i8 addrspace(10)* nonnull dereferenceable(24) dereferenceable_or_null(24) %93, i8 0, i64 24, i1 false), !dbg !1098
  %94 = insertvalue [2 x {} addrspace(10)*] %92, {} addrspace(10)* %"'mi59", 1, !dbg !1098
  %95 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 3, !dbg !1098
  store [2 x {} addrspace(10)*] %94, [2 x {} addrspace(10)*]* %95, align 8, !dbg !1098
  %"'ipc50" = bitcast {} addrspace(10)* %"'mi58" to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1098
  %"'ipc51" = bitcast {} addrspace(10)* %"'mi59" to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1098
  %"'ipc52" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc50" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1098
  %"'ipc53" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc51" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1098
  %".repack'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc52", i64 0, i32 0, !dbg !1098
  %".repack'ipg55" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc53", i64 0, i32 0, !dbg !1098
  store i8* %"'ipl69", i8* addrspace(11)* %".repack'ipg", align 8, !dbg !1098, !tbaa !49, !alias.scope !1099, !noalias !1102
  store i8* %"'ipl70", i8* addrspace(11)* %".repack'ipg55", align 8, !dbg !1098, !tbaa !49, !alias.scope !1107, !noalias !1108
  %".repack19'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc52", i64 0, i32 1, !dbg !1098
  %".repack19'ipg54" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc53", i64 0, i32 1, !dbg !1098
  store {} addrspace(10)* %73, {} addrspace(10)* addrspace(11)* %".repack19'ipg", align 8, !dbg !1098, !tbaa !49, !alias.scope !1099, !noalias !1102
  store {} addrspace(10)* %78, {} addrspace(10)* addrspace(11)* %".repack19'ipg54", align 8, !dbg !1098, !tbaa !49, !alias.scope !1107, !noalias !1108
  %"'ipc39" = bitcast {} addrspace(10)* %"'mi58" to i8 addrspace(10)*, !dbg !1098
  %"'ipc40" = bitcast {} addrspace(10)* %"'mi59" to i8 addrspace(10)*, !dbg !1098
  %"'ipc41" = addrspacecast i8 addrspace(10)* %"'ipc39" to i8 addrspace(11)*, !dbg !1098
  %"'ipc42" = addrspacecast i8 addrspace(10)* %"'ipc40" to i8 addrspace(11)*, !dbg !1098
  %"'ipg43" = getelementptr inbounds i8, i8 addrspace(11)* %"'ipc41", i64 16, !dbg !1098
  %"'ipg44" = getelementptr inbounds i8, i8 addrspace(11)* %"'ipc42", i64 16, !dbg !1098
  %"'ipc45" = bitcast i8 addrspace(11)* %"'ipg43" to i64 addrspace(11)*, !dbg !1098
  %"'ipc46" = bitcast i8 addrspace(11)* %"'ipg44" to i64 addrspace(11)*, !dbg !1098
  store i64 1, i64 addrspace(11)* %"'ipc45", align 8, !dbg !1098, !tbaa !30, !alias.scope !1099, !noalias !1102
  store i64 1, i64 addrspace(11)* %"'ipc46", align 8, !dbg !1098, !tbaa !30, !alias.scope !1107, !noalias !1108
  %"'ipc37" = bitcast i8* %"'ipl69" to {} addrspace(10)**, !dbg !1109
  %"'ipc38" = bitcast i8* %"'ipl70" to {} addrspace(10)**, !dbg !1109
  %96 = bitcast i8* %89 to {} addrspace(10)**, !dbg !1109
  %97 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %73, {} addrspace(10)** %"'ipc37"), !dbg !1112
  %98 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %78, {} addrspace(10)** %"'ipc38"), !dbg !1112
  %99 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %84, {} addrspace(10)** noundef %96) #20, !dbg !1112
  %100 = bitcast {} addrspace(10)* addrspace(13)* %99 to double addrspace(13)*, !dbg !1112
  store double %72, double addrspace(13)* %100, align 8, !dbg !1112, !tbaa !58, !alias.scope !1113, !noalias !1116
  %".fca.1.insert'ipiv" = insertvalue { double, {} addrspace(10)* } zeroinitializer, {} addrspace(10)* %"'mi58", 1, !dbg !1119
  %101 = insertvalue [2 x { double, {} addrspace(10)* }] undef, { double, {} addrspace(10)* } %".fca.1.insert'ipiv", 0, !dbg !1119
  %".fca.1.insert'ipiv72" = insertvalue { double, {} addrspace(10)* } zeroinitializer, {} addrspace(10)* %"'mi59", 1, !dbg !1119
  %102 = insertvalue [2 x { double, {} addrspace(10)* }] %101, { double, {} addrspace(10)* } %".fca.1.insert'ipiv72", 1, !dbg !1119
  %103 = getelementptr inbounds { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, i32 0, i32 1, !dbg !1119
  store [2 x { double, {} addrspace(10)* }] %102, { double, {} addrspace(10)* }* %103, align 8, !dbg !1119
  %104 = load { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, align 8, !dbg !1119
  ret { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } } %104, !dbg !1119
}

ERROR: LLVM error: augmented function failed verification (3)
Stacktrace:
 [1] handle_error(reason::Cstring)
   @ LLVM ~/.julia/packages/LLVM/2JPxT/src/core/context.jl:194

@gdalle
Contributor Author

gdalle commented Jun 5, 2025

Gentle bump here, do you happen to know what this latest error means?

Stored value type does not match pointer operand type!

@gdalle
Contributor Author

gdalle commented Jun 30, 2025

Mini-bump here, happy to take suggestions on test case number 6

@wsmoses
Member

wsmoses commented Aug 13, 2025

@gdalle I think what is needed to move the needle here is opening issues (that don't require this PR) for whatever failures exist, ideally as minimal as possible.

@gdalle
Contributor Author

gdalle commented Aug 16, 2025

Opened the first one in #2514

@wsmoses
Member

wsmoses commented Aug 27, 2025

@gdalle compilation errors are gone, but there are a lot of correctness errors currently in this branch.

Can you try to look at and resolve them?

We're getting closer

@gdalle
Contributor Author

gdalle commented Aug 27, 2025

Thanks for the quick fixes, I'll take a look at correctness. It might be that I defined the reference values incorrectly; I'll double-check with another autodiff backend.

@wsmoses
Member

wsmoses commented Sep 3, 2025

gentle ping @gdalle

@gdalle
Contributor Author

gdalle commented Sep 3, 2025

@wsmoses I think the correctness errors are due to recursive_accumulate behaving in a way that I don't understand. Here's an MWE that mimics the seeding in the following lines:

https://github.com/gdalle/Enzyme.jl/blob/21873f91f01c4e2a05d489575ce567b015fa9169/src/sugar.jl#L1431-L1434

using Enzyme

struct MyMixedStruct
    bar::Float64
    foo::Vector{Float64}
end

shadow_result = Ref(MyMixedStruct(0.0, [0.0]))
dresult_dval = MyMixedStruct(1.0, [2.0])
Enzyme.Compiler.recursive_accumulate(shadow_result, Ref(dresult_dval))

The result is

julia> shadow_result
Base.RefValue{MyMixedStruct}(MyMixedStruct(1.0, [0.0]))

and I don't understand why the second field doesn't get incremented too.

@gdalle
Contributor Author

gdalle commented Sep 5, 2025

Perhaps it has to do with the forcelhs option in recursive_add?

@gdalle
Contributor Author

gdalle commented Sep 8, 2025

@wsmoses small bump, would love some guidance on the incrementation of the shadow with recursive_accumulate before the reverse pass

@wsmoses
Member

wsmoses commented Sep 8, 2025

recursive_accumulate is an internal function that, by design, only adds up values in the top-level pointer data structure.

@wsmoses
Member

wsmoses commented Sep 8, 2025

Here is the only user of the utility (which currently defines its required semantics):

Compiler.recursive_accumulate(k, v, refn_seed)

@gdalle
Contributor Author

gdalle commented Sep 9, 2025

I find it hard to deduce from one use and without documentation what that function is supposed to do. Could you please take a look at https://github.com/gdalle/Enzyme.jl/blob/21873f91f01c4e2a05d489575ce567b015fa9169/src/sugar.jl#L1431-L1434 and help me figure out if I'm using it right?

@wsmoses
Member

wsmoses commented Sep 9, 2025

I mean, it is an internal function, but yes, you're using it for a purpose it was not designed for. You need to accumulate recursively beyond the first pointer depth, so that utility function does not apply. I think you'll need to write a different one.
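
As a rough illustration of what accumulating "beyond the first pointer depth" could look like, here is a minimal sketch in plain Julia. It is not taken from Enzyme: `deep_accumulate` is a hypothetical name, and the sketch assumes that both arguments have the same concrete type, that immutable structs keep their default positional constructor, and that only floats, `Array`s, and immutable structs built from those need handling. Aliasing, cycles, and non-differentiable fields are ignored.

```julia
# Same struct as in the MWE above.
struct MyMixedStruct
    bar::Float64
    foo::Vector{Float64}
end

# Hypothetical helper (NOT an Enzyme function): returns `dst + src`,
# mutating arrays in place and rebuilding immutable structs.
function deep_accumulate(dst::T, src::T) where {T}
    if T <: AbstractFloat
        # Scalars are immutable: the caller must store the returned sum.
        return dst + src
    elseif T <: Array
        # Arrays are mutable: update every element in place, recursing as needed.
        for i in eachindex(dst, src)
            dst[i] = deep_accumulate(dst[i], src[i])
        end
        return dst
    elseif isstructtype(T) && !ismutabletype(T)
        # Immutable structs cannot be updated in place, so rebuild them field by field.
        # Assumes the default positional constructor is available.
        fields = ntuple(i -> deep_accumulate(getfield(dst, i), getfield(src, i)), fieldcount(T))
        return T(fields...)
    else
        throw(ArgumentError("deep_accumulate: unsupported type $T"))
    end
end

shadow_result = Ref(MyMixedStruct(0.0, [0.0]))
dresult_dval = MyMixedStruct(1.0, [2.0])
# The Ref must be reassigned because the scalar field lives in an immutable struct.
shadow_result[] = deep_accumulate(shadow_result[], dresult_dval)
# shadow_result[] is now MyMixedStruct(1.0, [2.0])
```

The array field is mutated in place, so any other reference to that buffer sees the update, while the scalar field only changes because the rebuilt struct is written back into the `Ref`.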

@gdalle
Contributor Author

gdalle commented Sep 9, 2025

I don't know how to do that.
I will need some help to bring this over the finish line. Alternatively, we can throw an error when the return activity is MixedDuplicated.

@wsmoses
Member

wsmoses commented Sep 10, 2025

The problem here is not limited to MixedDuplicated; it applies equally to Duplicated, for example to something that returns Vector{Vector{Float64}}.
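
To make the nested-vector case concrete: in a `Vector{Vector{Float64}}`, the numeric data sits one pointer level below the outer vector, so an accumulation has to walk into each inner vector. The following is a minimal sketch in plain Julia, where `accumulate_nested!` is a hypothetical helper and not an Enzyme API.

```julia
# Hypothetical helper for illustration only; not part of Enzyme.
function accumulate_nested!(dst::Vector{Vector{Float64}}, src::Vector{Vector{Float64}})
    length(dst) == length(src) || throw(DimensionMismatch("outer lengths differ"))
    for (d, s) in zip(dst, src)
        # In-place broadcast: the inner buffers keep their identity,
        # so anything else holding a reference to them sees the update.
        d .+= s
    end
    return dst
end

shadow = [[0.0, 0.0], [0.0]]
inner = shadow[1]                      # alias to the first inner buffer
accumulate_nested!(shadow, [[1.0, 2.0], [3.0]])
# shadow is now [[1.0, 2.0], [3.0]], and `inner` still aliases shadow[1]
```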

@gdalle
Contributor Author

gdalle commented Sep 11, 2025

@vchuravy any idea how to write the right variant of recursive_accumulate?

@vchuravy
Member

I now finally understand @wsmoses's objection to #1852, which was essentially trying to implement deep_recursive_accumulate.

The challenge here is the treatment of immutable values, and it feels like we are re-implementing https://github.com/JuliaObjects/Accessors.jl

@gdalle
Contributor Author

gdalle commented Sep 11, 2025

What does it mean for the current PR? I'd rather get something merged with a clean error message for the cases we don't handle than keep the hacky version inside DI. Since it is a new feature I think it's okay to start small and then improve?

Development

Successfully merging this pull request may close these issues.

Syntactic sugar for vjp

3 participants