Conversation

@lkdvos (Member) commented Jul 31, 2025

Given the speed-up of the permutations, the bottleneck has now migrated to the actual computation of the treetransformers (mostly for tensors with many legs, again).
@ogauthe has reported cases where the first run takes ~2000 seconds, while the second run takes ~5 seconds.

While I haven't really benchmarked or profiled this myself yet (shame on me!), I immediately remembered that we are not composing the fusion tree manipulations in a very optimal way: on the one hand we are effectively writing out the matrix multiplications manually, and on the other hand the recursive implementation means that we are recomputing a lot of transformations that we would already have encountered. This never really showed up since these computations are somewhat fast and cached, but for large numbers of fusion trees these loops become prohibitively expensive.

This PR aims to address these issues in a fashion very similar to the previous optimization round: we consider the set of all fusion trees with equal uncoupled charges, and act on these as a block. If my back-of-the-envelope calculations are correct, with N such fusion trees and L substeps in the manipulation, we were previously doing $\sim N^L$ operations due to the recursive nature, while now this is $\sim L N^3$.
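For intuition, the scaling claim is the standard statement about composing linear maps: evaluating the composition of L elementary tree transformations by recursively enumerating all intermediate trees touches on the order of $N^L$ paths, whereas multiplying the L coefficient matrices step by step costs on the order of $L N^3$. A self-contained Python sketch of that equivalence (illustrative only, not TensorKit code):

```python
# Illustrative only: L "substeps", each a linear map on the span of N trees,
# represented here by random N x N coefficient matrices.
import itertools
import random

random.seed(0)
N, L = 4, 5
steps = [[[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]
         for _ in range(L)]

def compose_by_paths(steps, i, j):
    """Entry (i, j) of the composition, summing over all ~N^(L-1) chains
    of intermediate trees (the recursive strategy)."""
    n = len(steps[0])
    total = 0.0
    for mid in itertools.product(range(n), repeat=len(steps) - 1):
        chain = (i, *mid, j)
        coeff = 1.0
        for s, (a, b) in zip(steps, zip(chain, chain[1:])):
            coeff *= s[a][b]
        total += coeff
    return total

def compose_by_matmul(steps):
    """Same composition as a chain of matrix products: ~L * N^3 work."""
    acc = steps[0]
    for s in steps[1:]:
        n = len(acc)
        acc = [[sum(acc[i][k] * s[k][j] for k in range(n)) for j in range(n)]
               for i in range(n)]
    return acc

M = compose_by_matmul(steps)
assert abs(M[0][0] - compose_by_paths(steps, 0, 0)) < 1e-9
```

Both strategies produce the same coefficients; only the amount of work differs, which is why acting on the whole block of fusion trees at once pays off.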

This PR now contains the following changes:

  • Some cleanup of calling signatures (braid had some inconsistencies in its argument order, Index2Tuple, etc.)
  • Use fusiontensor instead of convert(Array, ::FusionTreePair) to avoid type piracy
  • "Vectorized" manipulations for non-UniqueFusion fusion styles, to avoid the exponential scaling in the number of chained manipulations
  • Vectorized index manipulations beyond the TensorMap specializations, approaching the entire "indexmanipulations" kernels in a more unified way
  • Updated the fusion tree tests to reflect these changes

@Jutho (Member) commented Jul 31, 2025

I think a first general design remark is that all the manipulations on the OuterTreeIterator (which is not an iterator 😄) now contain the lines:

    trees_src = fusiontrees(fs_src)
    trees_dst = fusiontrees(fs_dst)
    indexmap = Dict(f => ind for (ind, f) in enumerate(trees_dst))

If you now start composing elementary operations to make more complex ones (e.g. bendleft/bendright to implement repartition), there will be at least two fusiontrees calls on the same OuterTreeIterator object: in every step, the previous destination is the new source for the following operation in the composition list. The fusiontrees method goes through the blocksectors and calls fusiontrees(uncoupled_sectors, blocksector), which should be pretty efficient, but nonetheless has some overhead.

It might thus be a better idea to store the blocksectors and the associated fusion tree pairs (and their index mapping) within the OuterTreeIterator object. The initial data can be extracted from the data the tensor already has, and then, throughout the sequence of operations, the old destination becoming the new source would be able to reuse the same information.

Of course, this way the OuterTreeIterator would become a lightweight version of FusionBlockStructure, so maybe there is some way to recycle some functionality, or even redesign FusionBlockStructure.
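To make that concrete, here is a hypothetical sketch in Python (TreeBlock and compose are invented names for illustration; this is not the actual OuterTreeIterator or FusionBlockStructure API): the tree list and its index map live on the block and are built lazily, so a composed sequence of elementary operations can hand the destination structure of one step over as the source structure of the next instead of recomputing it.

```python
# Hypothetical sketch, not TensorKit code: cache the tree list and index map
# on the block object so composed operations can reuse them.
from dataclasses import dataclass, field

@dataclass
class TreeBlock:                # stand-in for the OuterTreeIterator idea
    trees: tuple                # fusion trees in canonical order
    indexmap: dict = field(default=None)

    def index(self, tree):
        # built once, then reused by every operation touching this block
        if self.indexmap is None:
            self.indexmap = {f: i for i, f in enumerate(self.trees)}
        return self.indexmap[tree]

def compose(block, elementary_ops):
    """Apply a sequence of elementary operations; each op maps a source
    TreeBlock to a destination TreeBlock, which becomes the next source."""
    src = block
    for op in elementary_ops:
        src = op(src)           # old destination becomes the new source
    return src
```

With this layout, the fusiontrees-style enumeration happens once per structure rather than once per elementary operation in the composition.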

Happy to discuss when you have time.

@lkdvos lkdvos force-pushed the ld-outer branch 8 times, most recently from 4e40fc3 to 50db028 on August 1, 2025, 11:47
@codecov codecov bot commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 83.50731% with 79 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.73%. Comparing base (cc18e95) to head (54b7abc).

Files with missing lines Patch % Lines
src/fusiontrees/fusiontreeblocks.jl 84.18% 59 Missing ⚠️
src/tensors/treetransformers.jl 66.66% 12 Missing ⚠️
src/fusiontrees/manipulations.jl 91.52% 5 Missing ⚠️
src/fusiontrees/fusiontrees.jl 33.33% 2 Missing ⚠️
src/tensors/braidingtensor.jl 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #261      +/-   ##
==========================================
- Coverage   82.85%   82.73%   -0.13%     
==========================================
  Files          44       45       +1     
  Lines        5757     6139     +382     
==========================================
+ Hits         4770     5079     +309     
- Misses        987     1060      +73     


@ogauthe (Contributor) commented Aug 1, 2025

So I compared the performance with master. I perform some permutations on a bilayer PEPS tensor of size D^8, with V_D = Rep[SU₂](0 => 2, 1 => 3, 2 => 1). I first compile the code with a fast pass at D = 3, so compilation time is negligible.

TensorKit v0.14.9
1992.983477 seconds (15.41 G allocations: 2.188 TiB, 10.51% gc time, 1.87% compilation time)
ld-outer
46.581346 seconds (207.48 M allocations: 51.391 GiB, 12.97% gc time, 0.93% compilation time)
which is a factor-43 speed-up. As expected, the 2nd run takes the same time in both cases (~3.5 sec). Very impressive, @lkdvos!

@lkdvos lkdvos force-pushed the ld-outer branch 2 times, most recently from 1f4c68f to ecbdaa6 on August 14, 2025, 08:24
@lkdvos (Member, Author) commented Aug 15, 2025

As a small update here: I think I have further optimized the implementations slightly, but there definitely is still some room for improvement. In particular, the insertat functions still use a recursive approach, which I think still scales badly in the number of steps, hurting the performance of foldright by quite a bit, depending on the number of legs.

I'll try to run some profilers next week to see what the actual bottleneck now is, to verify what needs further improvement. In particular, I can try to reduce some allocations by reusing a bunch of the unitary matrices, or we can try to cache the foldright unitaries directly if these turn out to be the limiting factor. Finally, we should probably try to split the computation of the unitaries of product sectors into their parts, which should avoid doubly computing some of that (although this might actually be non-trivial in combination with the caching and the multithreading...).

@kshyatt (Member) commented Aug 21, 2025

BTW, does @ogauthe have an MWE of what's causing the slowdown that could be shared, perhaps to run as part of a perf regression testsuite 👀?

@lkdvos (Member, Author) commented Aug 21, 2025

using TensorKit
using TensorKit: treepermuter
using Profile

Vsrc = (Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (1, 1) => 1)) ← (Rep[ℤ₂ × SU₂]((0, 0) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)')
Vdst = (Rep[ℤ₂ × SU₂]((0, 0) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (1, 1) => 1)') ← (Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)' ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1) ⊗ Rep[ℤ₂ × SU₂]((0, 0) => 1, (0, 1) => 1, (1, 1) => 1)')
p = ((5, 6), (1, 7, 2, 8, 3, 9, 4, 10))

TensorKit.empty_globalcaches!()
@time treepermuter(Vdst, Vsrc, p);
Profile.clear()
TensorKit.empty_globalcaches!()
Profile.init(n = 10^7, delay = 0.01)
@profile treepermuter(Vdst, Vsrc, p);

Here's a relevant case I took from the code at some point. It's really just a matter of taking one of the PEPS contractions and looking for the dominant treepermuter cost, which you can typically identify by setting ENV["JULIA_DEBUG"] = "TensorKit".

@ogauthe (Contributor) commented Sep 2, 2025

@kshyatt here is a slightly larger example; @lkdvos's minimal case should appear at some point. The timing was done using this branch as of August 15th.

using TensorOperations: @tensor
using TensorKit

function proj_braket(rD)
    fs = map(s -> sign(frobeniusschur(s)), sectors(rD))
    !(all(fs .== 1) || all(fs .== -1)) && return error("Cannot handle quaternionic irreps")

    rD2 = fuse(rD, rD')
    isoD2 = isomorphism(rD2, rD ⊗ rD')
    permuted_iso = flip(permute(isoD2', ((3,), (2, 1))), (1,))
    proj2_even_fullspace = isoD2 + permuted_iso
    proj2_odd_fullspace = isoD2 - permuted_iso

    proj2_even = rightnull(proj2_odd_fullspace; alg=SVD(), atol=1e-12)
    proj2_odd = rightnull(proj2_even_fullspace; alg=SVD(), atol=1e-12)
    return proj2_even, proj2_odd
end

function project_Z2(rd, rD)
    tket = randn(rd ⊗ rD ⊗ rD ⊗ rD' ⊗ rD')
    pd_even_oddd = proj_braket(rd)
    projN = map(t -> permute(t, ((2, 3), (1,))), proj_braket(rD))
    projE = map(t -> permute(t, ((2, 3), (1,))), proj_braket(rD))
    projS = map(t -> permute(t, ((2, 3), (1,))), proj_braket(rD'))
    projW = map(t -> permute(t, ((2, 3), (1,))), proj_braket(rD'))

    t_double = permute(tket' ⊗ tket, ((5, 6), (1, 7, 2, 8, 3, 9, 4, 10)))

    for i_d in 1:2
        projectedd = permute(pd_even_oddd[i_d] * t_double, ((1, 4, 5, 6, 7, 8, 9), (2, 3)))
        for iN in 1:2
            @tensor projectedN[d2, D2N, ketW, braW, ketS, braS; ketE, braE] :=
                projectedd[d2, ketE, braE, ketS, braS, ketW, braW; ketN, braN] *
                projN[iN][ketN, braN, D2N]
            for iE in 1:2
                @tensor projectedNE[d2, D2N, D2E, ketW, braW; ketS, braS] :=
                    projectedN[d2, D2N, ketS, braS, ketW, braW; ketE, braE] *
                    projE[iE][ketE, braE, D2E]
                for iS in 1:2
                    iW = mod1(i_d + iN + iE + iS + 1, 2)
                    @tensor projected[d2, D2N, D2E, D2S, D2W] :=
                        projectedNE[d2, D2N, D2E, ketW, braW; ketS, braS] *
                        projS[iS][ketS, braS, D2S] *
                        projW[iW][ketW, braW, D2W]
                end
            end
        end
    end
    return 1
end


rd = Rep[ℤ₂ × SU₂]((0, 0) => 1, (1, 1) => 1)
rD7 = Rep[ℤ₂ × SU₂]((0, 0)=>1, (0, 1)=>1, (1, 1)=>1)
rD11 = Rep[ℤ₂ × SU₂]((0, 0) => 2, (0, 1) => 1, (1, 1) => 2)
rD16 = Rep[ℤ₂ × SU₂]((0, 0) => 2, (0, 1) => 1, (1, 1) => 2, (0, 2) => 1)

@time project_Z2(rd, rD7)
@time project_Z2(rd, rD7)
@time project_Z2(rd, rD11)
@time project_Z2(rd, rD11)


# 1st run D = 7
@time project_Z2(rd, rD7)
# 400.344116 seconds (1.39 G allocations: 465.458 GiB, 7.96% gc time, 24.47% compilation time)

# 2nd run D = 7
@time project_Z2(rd, rD7)
#5.408577 seconds (20.24 M allocations: 2.738 GiB, 5.07% gc time, 5.34% compilation time)

# 1st run D = 11  (*after D = 7*)
@time project_Z2(rd, rD11)
# 319.394331 seconds (1.36 G allocations: 482.086 GiB, 8.63% gc time, 0.00% compilation time)

# 2nd run D = 11
@time project_Z2(rd, rD11)
#  7.989900 seconds (20.06 M allocations: 4.710 GiB, 4.31% gc time)

@ogauthe (Contributor) commented Oct 29, 2025

I have been using this branch for expensive computations with SU2Irrep and Z2Irrep ⊠ SU2Irrep, and I am very happy with the performance improvement. However, I found an issue with SU3Irrep.

using TensorKit
using SUNRepresentations: SU3Irrep

vd = Vect[SU3Irrep]((1, 0, 0) => 1)
h = TensorMap([0.0, 2.0], vd ⊗ vd ← vd ⊗ vd)
permute(h, (1, 2, 3), (4,))
ERROR: LoadError: KeyError: key (FusionTree{Irrep[SU₃]}(((1, 0, 0), (1, 0, 0), (1, 1, 0)), (1, 0, 0), (false, false, true), ((1, 1, 0),), (1, 1)), FusionTree{Irrep[SU₃]}(((1, 0, 0),), (1, 0, 0), (false,), (), ())) not found
Stacktrace:
  [1] getindex(h::Dict{Tuple{Tuple{SU3Irrep, SU3Irrep}, Tuple{Int64, Int64}}, Int64}, key::Tuple{FusionTree{SU3Irrep, 3, 1, 2}, FusionTree{SU3Irrep, 1, 0, 0}})
    @ Base ./dict.jl:498
  [2] bendright(src::TensorKit.FusionTreeBlock{SU3Irrep, 4, 0, Tuple{FusionTree{SU3Irrep, 4, 2, 3}, FusionTree{SU3Irrep, 0, 0, 0}}})
    @ TensorKit ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:138
  [3] macro expansion
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:397 [inlined]
  [4] repartition
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:379 [inlined]
  [5] repartition
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:364 [inlined]
  [6] __fsbraid(key::Tuple{TensorKit.FusionTreeBlock{SU3Irrep, 2, 2, Tuple{FusionTree{SU3Irrep, 2, 0, 1}, FusionTree{SU3Irrep, 2, 0, 1}}}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}}})
    @ TensorKit ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:658
  [7] #54
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/auxiliary/caches.jl:108 [inlined]
  [8] get!(default::TensorKit.var"#54#59"{Tuple{TensorKit.FusionTreeBlock{SU3Irrep, 2, 2, Tuple{FusionTree{SU3Irrep, 2, 0, 1}, FusionTree{SU3Irrep, 2, 0, 1}}}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}}, lru::LRUCache.LRU{Any, Any}, key::Tuple{TensorKit.FusionTreeBlock{SU3Irrep, 2, 2, Tuple{FusionTree{SU3Irrep, 2, 0, 1}, FusionTree{SU3Irrep, 2, 0, 1}}}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}}})
    @ LRUCache ~/.julia/packages/LRUCache/ZH7qB/src/LRUCache.jl:169
  [9] _fsbraid(key::Tuple{TensorKit.FusionTreeBlock{SU3Irrep, 2, 2, Tuple{FusionTree{SU3Irrep, 2, 0, 1}, FusionTree{SU3Irrep, 2, 0, 1}}}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}, ::TensorKit.GlobalLRUCache)
    @ TensorKit ./reduce.jl:0
 [10] _fsbraid
    @ ./none:0 [inlined]
 [11] braid
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:618 [inlined]
 [12] permute
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/fusiontrees/fusiontreeblocks.jl:676 [inlined]
 [13] fusiontreetransform
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/treetransformers.jl:228 [inlined]
 [14] TensorKit.GenericTreeTransformer(transform::TensorKit.var"#fusiontreetransform#259"{Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}}, p::Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Vdst::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 3, 1}, Vsrc::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2})
    @ TensorKit ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/treetransformers.jl:130
 [15] TensorKit.TreeTransformer(transform::Function, p::Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, Vdst::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 3, 1}, Vsrc::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2})
    @ TensorKit ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/treetransformers.jl:199
 [16] _treepermuter
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/treetransformers.jl:229 [inlined]
 [17] #257
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/auxiliary/caches.jl:108 [inlined]
 [18] get!(default::TensorKit.var"#257#262"{TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 3, 1}, TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}}, lru::LRUCache.LRU{Any, Any}, key::Tuple{TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 3, 1}, TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2}, Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}})
    @ LRUCache ~/.julia/packages/LRUCache/ZH7qB/src/LRUCache.jl:169
 [19] treepermuter(Vdst::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 3, 1}, Vsrc::TensorMapSpace{GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2}, p::Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}, ::TensorKit.GlobalLRUCache)
    @ TensorKit ./reduce.jl:0
 [20] treepermuter
    @ ./none:0 [inlined]
 [21] treepermuter
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/treetransformers.jl:224 [inlined]
 [22] add_permute!
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/indexmanipulations.jl:405 [inlined]
 [23] permute!
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/indexmanipulations.jl:38 [inlined]
 [24] permute(t::TensorMap{Float64, GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2, Vector{Float64}}, ::Tuple{Tuple{Int64, Int64, Int64}, Tuple{Int64}}; copy::Bool)
    @ TensorKit ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/indexmanipulations.jl:77
 [25] permute
    @ ~/.julia/packages/TensorKit/Y7XJ1/src/tensors/indexmanipulations.jl:65 [inlined]
 [26] permute(t::TensorMap{Float64, GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2, Vector{Float64}}, p1::Tuple{Int64, Int64, Int64}, p2::Tuple{Int64}; copy::Bool)
    @ TensorKit ./deprecated.jl:105
 [27] permute(t::TensorMap{Float64, GradedSpace{SU3Irrep, TensorKit.SortedVectorDict{SU3Irrep, Int64}}, 2, 2, Vector{Float64}}, p1::Tuple{Int64, Int64, Int64}, p2::Tuple{Int64})
    @ TensorKit ./deprecated.jl:103
 [28] top-level scope
    @ ~/Documents/tensorkit/ThermalPEPS.jl/nogit.draft.jl:9
in expression starting at /home/ogauthe/Documents/tensorkit/ThermalPEPS.jl/nogit.draft.jl:9

There is no error with the same code on TensorKit release v0.14.11. I ran the code on the same machine, just changing the TensorKit version, so this is not a SUNRepresentations / cache issue.

@lkdvos (Member, Author) commented Oct 29, 2025

Ah, these great self-dual sectors... Thanks for the report and the MWE, I'll see if I can figure something out, although it might not be for this week.

@lkdvos lkdvos force-pushed the ld-outer branch 2 times, most recently from 8507405 to fc4deeb on November 12, 2025, 14:41
@lkdvos (Member, Author) commented Nov 12, 2025

I think this should at least be working now, and it is back in line with the latest main branch, so all of the latest and greatest new features should be usable :)
This PR still needs more work to actually get out of the WIP stage, since I really need to update the tests, but I'm somewhat confident that it is working at least.

@codecov codecov bot commented Nov 12, 2025

Codecov Report

❌ Patch coverage is 80.27938% with 240 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/fusiontrees/braiding_manipulations.jl 68.44% 71 Missing ⚠️
src/tensors/indexmanipulations.jl 42.42% 57 Missing ⚠️
src/fusiontrees/duality_manipulations.jl 90.50% 43 Missing ⚠️
src/fusiontrees/basic_manipulations.jl 81.86% 33 Missing ⚠️
src/tensors/treetransformers.jl 58.69% 19 Missing ⚠️
src/TensorKit.jl 14.28% 6 Missing ⚠️
src/tensors/braidingtensor.jl 85.71% 5 Missing ⚠️
src/fusiontrees/fusiontrees.jl 95.00% 4 Missing ⚠️
src/tensors/diagonal.jl 92.85% 2 Missing ⚠️
Files with missing lines Coverage Δ
src/auxiliary/auxiliary.jl 95.45% <100.00%> (+0.81%) ⬆️
src/planar/planaroperations.jl 72.79% <100.00%> (ø)
src/spaces/homspace.jl 94.20% <100.00%> (+0.32%) ⬆️
src/tensors/abstracttensor.jl 59.07% <100.00%> (+0.17%) ⬆️
src/tensors/linalg.jl 74.82% <ø> (-7.52%) ⬇️
src/tensors/tensoroperations.jl 92.72% <100.00%> (-4.77%) ⬇️
src/tensors/diagonal.jl 87.19% <92.85%> (-5.04%) ⬇️
src/fusiontrees/fusiontrees.jl 90.65% <95.00%> (+0.33%) ⬆️
src/tensors/braidingtensor.jl 68.62% <85.71%> (-2.45%) ⬇️
src/TensorKit.jl 13.79% <14.28%> (+0.15%) ⬆️
... and 5 more

... and 16 files with indirect coverage changes


Base.convert(A::Type{<:AbstractArray}, f::FusionTree) = convert(A, fusiontensor(f))
# TODO: is this piracy?
Base.convert(A::Type{<:AbstractArray}, (f₁, f₂)::FusionTreePair) =
    convert(A, fusiontensor((f₁, f₂)))
Review comment (Member):

I would say: yes this is type piracy

Comment on lines +153 to +154
for f₁ in fusiontrees(uncoupled[1], c, isdual[1]),
f₂ in fusiontrees(uncoupled[2], c, isdual[2])
Review comment (Member):

This makes f₂ the fastest-changing "index", if I am correct. Is that the order we want the trees to be in? Does it matter?

Comment on lines +154 to +155
trees_src = fusiontrees(fs_src)
isempty(trees_src) || push!(fblocks, fs_src)
Review comment (Member):

Since isempty on FusionBlock was defined:

Suggested change
trees_src = fusiontrees(fs_src)
isempty(trees_src) || push!(fblocks, fs_src)
isempty(fs_src) || push!(fblocks, fs_src)

r = _braiding_factor(f₁, f₂, b.adjoint)
isnothing(r) || @inbounds for i in axes(data, 1), j in axes(data, 2)
data[i, j, j, i] = r
end
Review comment (Member):

I don't like the || short-circuit construction too much for such a complicated expression

Suggested change
end
if !isnothing(r)
for @inbounds for i in axes(data, 1), j in axes(data, 2)
data[i, j, j, i] = r
end
end

Review comment (Member):

Something went wrong with that suggestion.

throw(ArgumentError("invalid fusion vertex label "))
end
f₀ = FusionTree{I}((f₁.coupled, f₂.coupled), c, (false, false), (), (μ,))
f, coeff = first(insertat(f₀, 1, f₁)) # takes fast path, single output
Review comment (Member):

Suggested change
f, coeff = first(insertat(f₀, 1, f₁)) # takes fast path, single output
f, coeff = only(insertat(f₀, 1, f₁)) # takes fast path, single output

end
f₀ = FusionTree{I}((f₁.coupled, f₂.coupled), c, (false, false), (), (μ,))
f, coeff = first(insertat(f₀, 1, f₁)) # takes fast path, single output
@assert coeff == one(coeff)
Review comment (Member):

Not sure if we need to keep this.

return fusiontreedict(I)(f₁ => Fsymbol(c, c, c, c, c, c)[1, 1, 1, 1])
end

# flip a duality flag of a fusion tree
Review comment (Member):

flip really needs some comments/a docstring to remember the logic. I'll try to come up with something after rereading the PR that introduced it.

Review comment (Member):

Also, I don't think this is a very basic manipulation, as it involves both duality and braiding (twists), so I'm not sure it is really in its place here.

Review comment (Member):

Ok, I am going to leave this as a TODO and think about it later. I forgot what the original constraints and design considerations were for flip.

╭ ⋯ ┴╮ | ╭ ⋯ ╯
╭─┴─╮ | | ╭─┴─╮
╭─┴─╮ | ╰─╯ ╭─┴─╮ |
```
Review comment (Member):

That's a nice drawing.

Comment on lines +61 to +72
uncoupled_dst = (
TupleTools.front(src.uncoupled[1]),
(src.uncoupled[2]..., dual(src.uncoupled[1][end])),
)
isdual_dst = (
TupleTools.front(src.isdual[1]),
(src.isdual[2]..., !(src.isdual[1][end])),
)
I = sectortype(src)
N₁ = numout(src)
N₂ = numin(src)
@assert N₁ > 0
Review comment (Member):

Suggested change
uncoupled_dst = (
TupleTools.front(src.uncoupled[1]),
(src.uncoupled[2]..., dual(src.uncoupled[1][end])),
)
isdual_dst = (
TupleTools.front(src.isdual[1]),
(src.isdual[2]..., !(src.isdual[1][end])),
)
I = sectortype(src)
N₁ = numout(src)
N₂ = numin(src)
@assert N₁ > 0
I = sectortype(src)
N₁ = numout(src)
N₂ = numin(src)
@assert N₁ > 0
uncoupled_dst = (
TupleTools.front(src.uncoupled[1]),
(src.uncoupled[2]..., dual(src.uncoupled[1][N₁])),
)
isdual_dst = (
TupleTools.front(src.isdual[1]),
(src.isdual[2]..., !(src.isdual[1][N₁])),
)

fc = FusionTree((c1, c2), c, (!isduala, false), (), (μ,))
fr_coeffs = insertat(fc, 2, f₂)
for (fl′, coeff1) in insertat(fc, 2, f₁)
N₁ > 1 && !isone(fl′.innerlines[1]) && continue
Review comment (Member):

Suggested change
N₁ > 1 && !isone(fl′.innerlines[1]) && continue
N₁ > 1 && !isunit(fl′.innerlines[1]) && continue

What's the reason why foldleft for the fusion tree block can't use the foldright results?

Review comment (Member, Author):

I don't think there is a fundamental reason why it couldn't; this more or less came about gradually.

I started these changes for performance reasons (when it wasn't vectorized yet), and found some cases where the bookkeeping that is required for mapping one to the other was not completely insignificant (but also not that bad), so I just changed it.

However, looking at it now, doing it on a full block anyway requires some non-trivial code, since you either have to construct all the fusion trees from scratch again, or swap the pairs, make sure they are correctly sorted, and then make sure the permutation is correctly passed on to the coefficient matrix.

TL;DR: this was easier in my head, and I didn't think too hard about trying to reduce "code duplication", because it is also slightly faster.
