-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support union of multiple intervals #156
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #156 +/- ##
==========================================
- Coverage 99.11% 98.62% -0.49%
==========================================
Files 6 7 +1
Lines 225 291 +66
==========================================
+ Hits 223 287 +64
- Misses 2 4 +2 ☔ View full report in Codecov by Sentry. |
Is the restriction to |
The function calls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you keep the original union(d1, d2)
method for performance?
Before this PR
julia> using BenchmarkTools
julia> using IntervalSets
julia> i1 = zero(T) .. one(T)
0.0 .. 1.0
julia> i3 = one(T)/2 .. 2*one(T)
0.5 .. 2.0
julia> @benchmark ∪(i1, i3)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 16.133 ns … 1.314 μs ┊ GC (min … max): 0.00% … 95.32%
Time (median): 17.528 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.641 ns ± 31.241 ns ┊ GC (mean ± σ): 3.63% ± 2.33%
▄▄▂▇█▆▅▄▂▃▅▄▄▂▁ ▁▁▁▁▁ ▂
████████████████▆▆▆▆▄▅▄▄▆▇▇▆▅▆▇▇█████████▇▆▅▄▄▅▅▄▁▁▆▅▆▆▇▆▅▇ █
16.1 ns Histogram: log(frequency) by time 33.5 ns <
Memory estimate: 32 bytes, allocs estimate: 1.
After this PR
julia> using BenchmarkTools
julia> using IntervalSets
julia> i1 = zero(T) .. one(T)
0.0 .. 1.0
julia> i3 = one(T)/2 .. 2*one(T)
0.5 .. 2.0
julia> @benchmark ∪(i1, i3)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 65.323 ns … 1.561 μs ┊ GC (min … max): 0.00% … 93.07%
Time (median): 68.096 ns ┊ GC (median): 0.00%
Time (mean ± σ): 74.969 ns ± 73.735 ns ┊ GC (mean ± σ): 4.94% ± 4.81%
▄██▆▃ ▂▃▃ ▁ ▁
▇███████▇▇▇▇▇████▇▇▆▇▆▆▅▅▅▅▅▄▆▅▅▅▆▇▇▆▆▅▅▆▇█▇▇▇▇▇▇▇▆▆▆▄▅▅▅▄▅ █
65.3 ns Histogram: log(frequency) by time 114 ns <
Memory estimate: 128 bytes, allocs estimate: 2.
Co-authored-by: Yuto Horikawa <[email protected]>
The implementation is now to use an iterator-based method to merge sorted intervals. The bottleneck is sorting.
For the union of 2 intervals, the old method is still 25% faster, so it's reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating this PR!
I have not looked into the details yet, but I commented on some points.
Important changes:
There is one uncovered line. union(d::TypedEndpointsInterval) = d # 1 interval It's to avoid calling union(I::TypedEndpointsInterval...) = iterunion(sort!(collect(I); lt = leftof)) # ≥21 intervals in that case, so I don't think it's worth testing. |
Splatting is reasonably used only for small numbers of arguments in Julia. Then, the most straightforward dependency-free implementation is even more performant than this PR: julia> ints = Tuple(i..(i+1) for i in 1:10)
(1 .. 2, 2 .. 3, 3 .. 4, 4 .. 5, 5 .. 6, 6 .. 7, 7 .. 8, 8 .. 9, 9 .. 10, 10 .. 11)
# this PR:
julia> @btime union($ints...)
255.479 ns (23 allocations: 1.52 KiB)
1 .. 11
# naive implementation:
julia> myunion(ints...) = reduce(union, ints)
julia> @btime myunion($ints...)
18.556 ns (0 allocations: 0 bytes)
1 .. 11 UPD: ah, I see, it does require sorting... I heard Julia is adding tuple sorting soon though? |
Yeah, don't know the current state of the Julia PR JuliaLang/julia#46104, but copying tuple sorting from there function tsort(x::NTuple{N}; lt::Function=isless, by::Function=identity,
rev::Union{Bool,Nothing}=nothing, order::Base.Ordering=Base.Forward) where N
o = Base.ord(lt,by,rev,order)
issorted(x, o) ? x : _sort(x, o)
end
_sort(x::Union{NTuple{0}, NTuple{1}}, o::Base.Ordering) = x
function _sort(x::NTuple, o::Base.Ordering)
a, b = Base.IteratorsMD.split(x, Val(length(x)>>1))
merge(_sort(a, o), _sort(b, o), o)
end
merge(x::NTuple, y::NTuple{0}, o::Base.Ordering) = x
merge(x::NTuple{0}, y::NTuple, o::Base.Ordering) = y
merge(x::NTuple{0}, y::NTuple{0}, o::Base.Ordering) = x # Method ambiguity
merge(x::NTuple, y::NTuple, o::Base.Ordering) =
(Base.lt(o, y[1], x[1]) ? (y[1], merge(x, Base.tail(y), o)...) : (x[1], merge(Base.tail(x), y, o)...)) julia> myunion(ints...) = reduce(union, tsort(ints; by=i -> (leftendpoint(i), isleftopen(i))))
julia> @btime myunion($ints...)
59.723 ns (0 allocations: 0 bytes) |
If I understand correctly, it only supports type-stable tuple sorting, that is, all items must be of the exact same type. Edit: JuliaLang/julia#52010 By the way, that implementation is exactly the same as TupleTools.jl. |
union
operation is not supported #103This also provides a way to simplify
DomainSets.UnionDomain
of intervals.