Avoid some allocations in mutating methods #347
Conversation
I think it is worth pursuing indeed, since the results are really good! That said, let me point to this line, which defines …
We should indeed keep the methods, e.g. for initialization of the aux quantities created by …
AFAIU, that line eventually calls …
(I guess a better place for those functions would be in …)
I'm also trying to avoid allocations in src/arithmetic.jl (line 763, at commit f5587a7).
I'm thinking that a possible workaround is to add it as an argument to div!, so that the auxiliary can be declared somewhere else, but perhaps that change would be a bit breaking? There are other mutating functions with similar auxiliaries defined in the function body which could benefit from a similar treatment; what are your thoughts on this @lbenet?
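For what it's worth, the buffer-as-argument idea can be sketched with plain vectors (hypothetical names accum_alloc!/accum_buf!; this is not the actual div! signature): moving the auxiliary out of the function body lets the caller allocate it once and reuse it across calls.

```julia
# Hypothetical sketch, not the TaylorSeries API: the only point is
# *where* the auxiliary `tmp` gets allocated.

# Auxiliary allocated inside the body, on every call:
function accum_alloc!(c::Vector{Float64}, a::Vector{Float64})
    tmp = similar(a)       # fresh allocation per call
    @. tmp = 2.0 * a
    @. c += tmp
    return c
end

# Auxiliary passed in by the caller, allocated once outside the hot loop:
function accum_buf!(c::Vector{Float64}, a::Vector{Float64}, tmp::Vector{Float64})
    @. tmp = 2.0 * a       # reuses caller-provided storage
    @. c += tmp
    return c
end

a = [1.0, 2.0, 3.0]
c1, c2 = zeros(3), zeros(3)
tmp = similar(a)           # one allocation, shared by all calls
accum_alloc!(c1, a)
accum_buf!(c2, a, tmp)
c1 == c2                   # -> true; both give [2.0, 4.0, 6.0]
```

The trade-off mentioned above is visible here: the second signature is breaking for existing callers, but moves the allocation out of the hot path.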
In other words, whereas this PR makes mul! allocation-free, div! still allocates:

julia> @btime TaylorSeries.mul!($z1N,$x1N,$y1N,0)
  55.880 ns (0 allocations: 0 bytes)

julia> @btime TaylorSeries.div!($z1N,$x1N,$y1N,0)
  360.201 ns (15 allocations: 1.09 KiB)
I don't quite remember the actual reason why that …
I don't think we should care about how breaking that change would be to the code; we simply release a new patch version 😄. That said, my guess is that such a change impacts the infrastructure needed for …
@PerezHz I've checked that …
Even with the dirty tricks the specialized methods for mixtures allocate a bit less than the generic ones, so it's definitely a step forward I would say!
Agreed!
A bit yes, but not so much, as far as I've managed to test! (more details on this below...)
So far the mutating methods have allowed very good speedups while using …
That's a good point, and finding a good solution can take more than one attempt, yet I think it's worth pursuing. In a nutshell, I guess what I'm trying to say is that we should aim for making …

Now, I found an issue, illustrated by the following:

julia> using TaylorSeries
julia> dq = set_variables("δq", order=1, numvars=2)
2-element Vector{TaylorN{Float64}}:
1.0 δq₁ + 𝒪(‖x‖²)
1.0 δq₂ + 𝒪(‖x‖²)
julia> x = Taylor1([1.0 - 1.8dq[1],-0.007 + 0.0194dq[1], -1e-5 -0.00027dq[2], 3e-7 - 1.8e-5dq[1]-dq[2],dq[1]])
1.0 - 1.8 δq₁ + 𝒪(‖x‖²) + ( - 0.007 + 0.0194 δq₁ + 𝒪(‖x‖²)) t + ( - 1.0e-5 - 0.00027 δq₂ + 𝒪(‖x‖²)) t² + ( 3.0e-7 - 1.8e-5 δq₁ - 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> y = Taylor1([-2.0 + dq[1],-1dq[2],2dq[2], dq[1]+dq[2],dq[2]])
- 2.0 + 1.0 δq₁ + 𝒪(‖x‖²) + ( - 1.0 δq₂ + 𝒪(‖x‖²)) t + ( 2.0 δq₂ + 𝒪(‖x‖²)) t² + ( 1.0 δq₁ + 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₂ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> z = deepcopy(x)
1.0 - 1.8 δq₁ + 𝒪(‖x‖²) + ( - 0.007 + 0.0194 δq₁ + 𝒪(‖x‖²)) t + ( - 1.0e-5 - 0.00027 δq₂ + 𝒪(‖x‖²)) t² + ( 3.0e-7 - 1.8e-5 δq₁ - 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> z.coeffs[1] = x.coeffs[1]^1.5 # this line and the one below are similar to how variables are initialized in parsed jetcoeffs! methods in TaylorIntegration
1.0 - 2.7 δq₁ + 𝒪(‖x‖²)
julia> z.coeffs[2:end] .= zero(z.coeffs[1]) # similar to what happens in parsed jetcoeffs! methods
4-element view(::Vector{TaylorN{Float64}}, 2:5) with eltype TaylorN{Float64}:
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
julia> TaylorSeries.pow!(z, x, 1.5, 1) # compute 1st-order coefficients
julia> z # the 1st-order coefficients are copied into higher-order coefficients!!
1.0 - 2.7 δq₁ + 𝒪(‖x‖²) + ( - 0.0105 + 0.03855000000000001 δq₁ + 𝒪(‖x‖²)) t + ( - 0.0105 + 0.03855000000000001 δq₁ + 𝒪(‖x‖²)) t² + ( - 0.0105 + 0.03855000000000001 δq₁ + 𝒪(‖x‖²)) t³ + ( - 0.0105 + 0.03855000000000001 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)

Any hints about what might be happening here?
A workaround is to simply broadcast zero:

julia> z.coeffs[2:end] .= zero.(z.coeffs[1]) # <-- broadcasted zero
I can't reproduce the problem you report here... I'm using Julia v1.10, and TaylorSeries from master...
Sorry for the confusion, the issue happens with this PR's branch.
... now, if you meant that the issue pops up only when using this PR, then yes, although I would actually say the issue is hidden in TaylorIntegration. I'll explain in the next comment.
Using current TS master:

julia> using TaylorSeries
julia> dq = set_variables("δq", order=1, numvars=2)
2-element Vector{TaylorN{Float64}}:
1.0 δq₁ + 𝒪(‖x‖²)
1.0 δq₂ + 𝒪(‖x‖²)
julia> x = Taylor1([1.0+dq[1],-dq[1], dq[2], dq[1]-dq[2],dq[1]])
1.0 + 1.0 δq₁ + 𝒪(‖x‖²) + ( - 1.0 δq₁ + 𝒪(‖x‖²)) t + ( 1.0 δq₂ + 𝒪(‖x‖²)) t² + ( 1.0 δq₁ - 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> z = deepcopy(x)
1.0 + 1.0 δq₁ + 𝒪(‖x‖²) + ( - 1.0 δq₁ + 𝒪(‖x‖²)) t + ( 1.0 δq₂ + 𝒪(‖x‖²)) t² + ( 1.0 δq₁ - 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> z.coeffs[1] = x.coeffs[1]^1.5
1.0 + 1.5 δq₁ + 𝒪(‖x‖²)
julia> z.coeffs[2:end] .= zero(z.coeffs[1]) # no zero broadcasting here
4-element view(::Vector{TaylorN{Float64}}, 2:5) with eltype TaylorN{Float64}:
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
julia> z[1][0] = HomogeneousPolynomial(-4.0,0) # only one z coefficient is updated
- 4.0
julia> z # but the change is copied everywhere!
1.0 + 1.5 δq₁ + 𝒪(‖x‖²) + ( - 4.0 + 𝒪(‖x‖²)) t + ( - 4.0 + 𝒪(‖x‖²)) t² + ( - 4.0 + 𝒪(‖x‖²)) t³ + ( - 4.0 + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)

But this can be easily solved if we broadcast zero while initially filling the coefficients of z:

julia> z.coeffs[2:end] .= zero.(z.coeffs[1]) # zero is broadcasted here
4-element view(::Vector{TaylorN{Float64}}, 2:5) with eltype TaylorN{Float64}:
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
julia> z[1][0] = HomogeneousPolynomial(-4.0,0)
- 4.0
julia> z
1.0 + 1.5 δq₁ + 𝒪(‖x‖²) + ( - 4.0 + 𝒪(‖x‖²)) t + 𝒪(t⁵)

AFAIU, the issue is that if zero is not broadcasted, all the filled coefficients end up being the very same object in memory:

julia> z.coeffs[2:end] .= zero(z.coeffs[1]) # no zero broadcast
4-element view(::Vector{TaylorN{Float64}}, 2:5) with eltype TaylorN{Float64}:
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
julia> z.coeffs[2] == z.coeffs[3]
true
julia> z.coeffs[2] === z.coeffs[3] # check if memory address is the same
true
julia> z.coeffs[2:end] .= zero.(z.coeffs[1]) # now, broadcast zero
4-element view(::Vector{TaylorN{Float64}}, 2:5) with eltype TaylorN{Float64}:
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
0.0 + 𝒪(‖x‖²)
julia> z.coeffs[2] == z.coeffs[3]
true
julia> z.coeffs[2] === z.coeffs[3] # now each coefficient points to a different place in memory
false
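The same aliasing can be reproduced with plain nested vectors, independently of TaylorSeries (a sketch, assuming TaylorN behaves like any mutable container here; TaylorN apparently broadcasts as a scalar, so the Ref below is only needed for the Vector analogue):

```julia
# Non-broadcasted zero: the RHS is evaluated once, so every slot stores
# the *same* mutable object.
v = [[1.0], [2.0], [3.0]]
v[2:3] .= Ref(zero(v[1]))
v[2] === v[3]              # -> true (one shared object)
v[2][1] = -4.0
v[3]                       # -> [-4.0]: mutated through the alias

# Broadcasted zero: `zero` runs once per destination slot inside the
# fused broadcast, so each slot gets an independent object.
w = [[1.0], [2.0], [3.0]]
w[2:3] .= zero.(Ref(w[1]))
w[2] === w[3]              # -> false (independent objects)
w[2][1] = -4.0
w[3]                       # -> [0.0]: unaffected
```

The key point is broadcast fusion: in `dest .= zero.(x)` the call to `zero` is part of the fused kernel and is re-evaluated per destination element, while in `dest .= zero(x)` it is evaluated once before the assignment.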
Now, this raises the question: why does this appear in this PR and not in current master? Looking at line 279 of src/arithmetic.jl (at commit f5587a7): the way it currently behaves, it allocates a new place in memory for each of the elements of res.coeffs; but these lines are precisely the ones that this PR touches, updating coefficients in place while avoiding allocations. Of course, as you mentioned earlier, there are other allocations that are trickier to remove, but we can go step by step.
If downstream tests pass (TI, PE, NEOs) then I would propose to go ahead with this and make the update in TaylorIntegration, unless there are other considerations to be made?
I think that …
Just as a check of what I say, consider (in your example):

julia> z = deepcopy(x)
1.0 + 1.0 δq₁ + 𝒪(‖x‖²) + ( - 1.0 δq₁ + 𝒪(‖x‖²)) t + ( 1.0 δq₂ + 𝒪(‖x‖²)) t² + ( 1.0 δq₁ - 1.0 δq₂ + 𝒪(‖x‖²)) t³ + ( 1.0 δq₁ + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)
julia> z.coeffs[3:end] .= zero(z.coeffs[1]); # changing from the 2nd order onwards
julia> z[3][0] = HomogeneousPolynomial(6.0,0) # changing order 3!
6.0
julia> z # order 2 is also changed, but not order 1
1.0 + 1.0 δq₁ + 𝒪(‖x‖²) + ( - 1.0 δq₁ + 𝒪(‖x‖²)) t + ( 6.0 + 𝒪(‖x‖²)) t² + ( 6.0 + 𝒪(‖x‖²)) t³ + ( 6.0 + 𝒪(‖x‖²)) t⁴ + 𝒪(t⁵)

So indeed that seems to change all coefficients from order 2 onwards. I think this is because …
That's a good point, it does allocate, and even more than before. I would expect this, since in the zero-broadcasting case it allocates a different place in memory for each coefficient of z:

julia> using BenchmarkTools
julia> function zeronobc(z)
           z.coeffs[2:end] .= zero(z.coeffs[1])
           return nothing
       end
zeronobc (generic function with 1 method)

julia> function zerobc(z)
           z.coeffs[2:end] .= zero.(z.coeffs[1])
           return nothing
       end
zerobc (generic function with 1 method)
julia> @btime zeronobc($z)
219.727 ns (10 allocations: 752 bytes)
julia> @btime zerobc($z)
880.123 ns (40 allocations: 2.94 KiB)
src/functions.jl (outdated)

@@ -655,6 +655,75 @@ end
# Mutating functions for Taylor1{TaylorN{T}}
@inline function zero!(a::Taylor1{T}, k::Int) where {T<:NumberNotSeries}
Sorry for this late suggestion: what about redefining zero!(c, a, k) simply to point to zero!(c, k)? That would avoid lots of changes (which can be kept here), in particular those related to @taylorize (in dictmutfunct.jl).
Thanks for the suggestion! I changed the definition of zero!(c, a, k) accordingly.
One more suggestion on this: can you move the code (of the zero! methods introduced in this PR) closer to the other zero! methods?
src/arithmetic.jl (outdated)

function /(a::NumberNotSeries, b::Taylor1{TaylorN{T}}) where {T<:NumberNotSeries}
    iszero(a) && !iszero(b) && return zero(a)
    # order and coefficient of first factorized term
    # In this case, since a[k]=0 for k>0, we can simplify to:
    # ordfact, cdivfact = 0, a/b[0]
    res = Taylor1(a/b[0], b.order)
    for ordT in eachindex(res)
        div!(res, a, b, ordT)
    end
    return res
end
I propose the following here to ensure type stability (this is untested):

function /(a::NumberNotSeries, b::Taylor1{TaylorN{T}}) where {T<:NumberNotSeries}
    R = promote_type(typeof(a), TaylorN{T})
    iszero(a) && !iszero(b) && return convert(Taylor1{R}, zero(b))
    # order and coefficient of first factorized term
    # In this case, since a[k]=0 for k>0, we can simplify to:
    # ordfact, cdivfact = 0, a/b[0]
    res = Taylor1(a/b[0], b.order)
    for ordT in eachindex(res)
        div!(res, a, b, ordT)
    end
    return res
end
Note that a::NumberNotSeries is perhaps too strong, because it leaves out a::TaylorN{T}, which is a method worth having too.
Thanks for the suggestion! I added the fix for type stability and widened the type for a to allow for TaylorN{T}.
src/arithmetic.jl (outdated)

@inline function div!(c::Taylor1, a::NumberNotSeries,
                      b::Taylor1, k::Int)
Perhaps we should impose c::Taylor1{T} as well as b::Taylor1{T}...
I agree, although I think we can make an extra tweak to avoid some allocations for T<:TaylorN{<:NumberNotSeries}, so I added an extra method for the latter case (will fix it in the next few commits).
Tests in TI.jl are passing now. I'll try to complete my review in the next few days. Thanks a lot!
Sure! There are still a couple of things I'd like to try here to save some further allocs, but indeed this is essentially ready for review.
I think there are indeed many things that can still be done to avoid allocations in mutating functions, but the changes introduced here are, in a sense, self-contained, so I'd rather leave further changes to another PR. Therefore, as it stands now, this PR is ready for review.
(maybe I'm repeating myself, but I just wanted to be clear that, modulo reviews and some fixes, I don't intend a priori to add more things to this PR)
I've been running some benchmarks, and using this PR for JT integrations in NEOs.jl we get 30% less memory used and about 20% shorter runtime in long (>100 timesteps) integrations. Short integrations, where "time to first time-step" is dominant, will not benefit as much from this PR.
I'll have a look later; the results you just quoted are very encouraging, so we should think about merging and tagging a new version!
…mas in IntervalArithmetic compat entry
Alright! Just bumped the minor version; let me know if there is anything else you would like to improve around here.
Thanks a lot @PerezHz, this is a nice improvement! I have left one single comment (with code), which is a suggestion that may improve performance. In any case, I'm approving this PR and merging it, so we can exploit it in TaylorIntegration!
Thanks again a lot!
@inbounds mul!(c[k], c[0], b[k])
@inbounds for i = 1:k-1
    c[k] += c[i] * b[k-i]
This comment is related to the TODO above. We do have methods that indeed accumulate, as mentioned above, for TaylorN. I wonder if replacing this line with those methods can provide further improvements. The suggested code is:

# c[k] += c[i] * b[k-i] # current code
for ord in eachindex(c[k])
    mul!(c[k], c[i], b[k-i], ord)
end
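The gain of such a suggestion comes from avoiding the temporary that an expression like `c[k] += c[i] * b[k-i]` creates; with plain vectors the same pattern can be sketched as follows (hypothetical helper acc_mul!, standing in for the per-order in-place accumulation):

```julia
# `c = c + a .* b` would first materialize the product a .* b as a
# temporary array; the in-place loop accumulates directly into c.
function acc_mul!(c::Vector{Float64}, a::Vector{Float64}, b::Vector{Float64})
    @inbounds for i in eachindex(c, a, b)
        c[i] += a[i] * b[i]    # accumulate in place, no temporary
    end
    return c
end

c = [1.0, 1.0]
acc_mul!(c, [2.0, 3.0], [4.0, 5.0])
c                              # -> [9.0, 16.0]
```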
Thank you for the suggestion, I'll give it a try and see if we can get rid of some more allocations 😄 !
Thanks for merging! I'll go ahead now and update things accordingly in TaylorIntegration.
In current master, some mutating methods allocate, especially for TaylorN and mixtures. This PR is a proof of concept which avoids allocations in key places, such as zero and mul! (similar things can be tried in div!, etc.), just by defining suitable in-place zero! methods. Using these zero! methods we can then do things like the above without a single allocation.
This is still very much WIP, but do you think this is an idea worth pursuing @lbenet?
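As a rough illustration of the in-place zero! idea (a sketch with a made-up Poly type, not the actual TaylorSeries implementation): instead of building a fresh zero object every time, the coefficient storage the caller already owns is overwritten.

```julia
struct Poly
    coeffs::Vector{Float64}
end

# Allocating version: builds a brand-new zero polynomial on every call.
zero_alloc(p::Poly) = Poly(zeros(length(p.coeffs)))

# In-place version: reuses the storage already owned by `p`.
function zero!(p::Poly)
    fill!(p.coeffs, 0.0)
    return p
end

p = Poly([1.0, 2.0, 3.0])
zero!(p)
p.coeffs                   # -> [0.0, 0.0, 0.0]
```

In a tight loop (e.g. the coefficient recursions of a Taylor integration), replacing the allocating form with the mutating one removes one heap allocation per call.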