Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Repair profile after tail-merging #110656

Closed

Conversation

jakobbotsch
Copy link
Member

We can locally repair PGO data after tail-merging, which should keep global flow preservation. Also, we can do a best-effort repair in the throw-merging phase, but this one is not generally possible to repair locally.

Motivated by a case I saw in diffs of #110404.

Before:

*************** Finishing PHASE Head and tail merge
Trees after Head and tail merge

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight        IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                           24533k 24533056 [000..002)-> BB05(1)                 (always)                     i IBC
BB02 [0001]  1       BB05                34933k 34932740 [002..048)-> BB04(0.1),BB03(0.9)     ( cond )                     i IBC bwd bwd-target
BB03 [0002]  1       BB02                31435k 31434760 [048..050)-> BB05(0.889),BB04(0.111) ( cond )                     i IBC bwd
BB04 [0003]  2       BB02,BB03            6994k  6993920 [050..???)-> BB07(1)                 (always)                     i IBC
BB07 [0007]  2       BB04,BB06            6994k  6993920 [058..058)                           (return)                     i IBC
BB05 [0004]  2       BB01,BB03           52472k 52471876 [058..068)-> BB02(0.666),BB06(0.334) ( cond )                     i IBC idxlen bwd bwd-src
BB06 [0005]  1       BB05                17539k 17539136 [068..072)-> BB07(1)                 (always)                     i IBC
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

(return weight does not match entry weight)
After:

*************** Finishing PHASE Head and tail merge
Trees after Head and tail merge

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight        IBC [IL range]   [jump]                            [EH region]        [flags]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                           24533k 24533056 [000..002)-> BB05(1)                 (always)                     i IBC
BB02 [0001]  1       BB05                34933k 34932740 [002..048)-> BB04(0.1),BB03(0.9)     ( cond )                     i IBC bwd bwd-target
BB03 [0002]  1       BB02                31435k 31434760 [048..050)-> BB05(0.889),BB04(0.111) ( cond )                     i IBC bwd
BB04 [0003]  2       BB02,BB03            6994k  6993920 [050..???)-> BB07(1)                 (always)                     i IBC
BB07 [0007]  2       BB04,BB06           24533k 24533056 [058..058)                           (return)                     i IBC
BB05 [0004]  2       BB01,BB03           52472k 52471876 [058..068)-> BB02(0.666),BB06(0.334) ( cond )                     i IBC idxlen bwd bwd-src
BB06 [0005]  1       BB05                17539k 17539136 [068..072)-> BB07(1)                 (always)                     i IBC
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

We can locally repair PGO data after tail-merging, which should keep
global flow preservation. Also, we can do a best-effort repair in the
throw-merging phase, but this one is not generally possible to repair
locally.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 12, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch jakobbotsch marked this pull request as ready for review December 12, 2024 17:36
@jakobbotsch
Copy link
Member Author

cc @amanasifkhalid @AndyAyersMS

Seems to have some pretty large diffs... but feels like a change to take on principle, since it is just correcting PGO data. Thoughts?
I haven't looked closer at the diffs, I can probably spot check a few tomorrow.

@jakobbotsch
Copy link
Member Author

Spot checked a few diffs in aspnet

  • 125889.dasm - System.Net.Http.Headers.HttpHeaders:TryGetHeaderDescriptor(System.String,byref):ubyte:this (Tier1): Corrected block weights mean that weighted ref counts of some locals become larger, making the aggressive CSE weight cutoff larger, making us no longer CSE a candidate
  • 128440.dasm - System.Reflection.CustomAttribute:IsDefined(System.Reflection.RuntimeMethodInfo,System.RuntimeType,ubyte):ubyte (Tier1): Looks like different layout
  • 126812.dasm - System.Reflection.CustomAttribute:IsDefined(System.RuntimeType,System.RuntimeType,ubyte):ubyte (Tier1): Looks like different layout

Copy link
Member

@amanasifkhalid amanasifkhalid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but feels like a change to take on principle, since it is just correcting PGO data. Thoughts?

I agree we ought to do it. I have this change locally somewhere as part of the profile consistency work -- thanks for getting to it first

src/coreclr/jit/fgehopt.cpp Outdated Show resolved Hide resolved
// call means that there is no contribution from nonCanonicalBlock to any of its successors.
// Note that this might not be the case if we have profile data from e.g. synthesis, so this
// repair is best-effort only.
canonicalBlock->bbWeight += removedWeight;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably add a helper for this, but the pattern we've been following for propagating profile data is like so:

if (canonicalBlock->hasProfileWeight())
{
    canonicalBlock->setBBProfileWeight(canonicalBlock->bbWeight + removedWeight);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to follow this pattern. Hopefully at some point hasProfileWeight will be always true and we can get rid of those checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block weight solver @AndyAyersMS implemented propagates the BBF_PROF_WEIGHT flag to all blocks reachable via normal flow here, so if we start using it relatively early on -- such as a replacement for the current fgComputeBlockWeights and optSetBlockWeights phases -- I imagine we can begin requiring blocks to have profile-derived weights at most flow modification sites.

// repair is best-effort only.
if (canonicalBlock->hasProfileWeight())
{
canonicalBlock->setBBProfileWeight(canonicalBlock->bbWeight + removedWeight);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you enable profile consistency checks for throw merging, I suspect this will cause asserts; since we don't convert canonicalBlock into a BBJ_THROW block until morph, the post-phase checks will see new weight into canonicalBlock that isn't being propagated to its successors. If you add some logic to make fgPgoConsistent false here, I think you'll be able to enable profile checks after this phase without issue (though I'm ok with you leaving them off for now, I can enable them in another PR).

Also, sorry for creating merge conflicts here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the comment above explains the situation. In practice this local update keeps global consistency, but in synthetic cases it does not. That's why I kept the profile checking disabled for this phase. Do you prefer to move the disablement further in favor of setting fgPgoConsistent = false here? If so I'll do that.

Copy link
Member

@amanasifkhalid amanasifkhalid Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you -- I have the checks enabled in a draft PR (#111047), so I'll have to rebase that on top of this one either way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just close this one in favor of #111047

@jakobbotsch
Copy link
Member Author

Subsumed by #111047

@jakobbotsch jakobbotsch closed this Jan 6, 2025
@jakobbotsch jakobbotsch deleted the head-tail-merge-profile-consistency branch January 6, 2025 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants