cli/command: ctx cancel should not print or produce a non zero exit code #5666

Open: 2 commits to merge into base branch master

Benehiko
Member

@Benehiko Benehiko commented Dec 3, 2024

The user might kill the CLI through a SIGINT/SIGTERM
which cancels the main context we pass around.

Currently the context cancel error is printed
alongside any other wrapped error with a generic
exit code (125).

This patch improves on this behavior and prevents
any error from being printed when they match
context.Canceled.
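
A minimal sketch of the behavior described above, not the code from this PR. The helper names `exitStatus`, `run`, and `interrupted` are hypothetical, and returning exit code 0 on cancellation is an assumption taken from the PR title. A real CLI would likely derive its context from `signal.NotifyContext`; here the context is canceled by hand so the example is deterministic.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
)

// exitStatus is a hypothetical helper: it decides whether an error should be
// printed and which exit code the process should use.
func exitStatus(err error) (code int, printErr bool) {
	switch {
	case err == nil:
		return 0, false
	case errors.Is(err, context.Canceled):
		// The user interrupted the command: stay quiet and exit cleanly.
		return 0, false
	default:
		// 125 is the generic exit code mentioned in the description.
		return 125, true
	}
}

// run stands in for a CLI command that blocks until the context is done and
// then returns the cancellation error wrapped with %w.
func run(ctx context.Context) error {
	<-ctx.Done()
	return fmt.Errorf("pulling image: %w", ctx.Err())
}

// interrupted simulates Ctrl+C by canceling the context before run is called.
func interrupted() (code int, printErr bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return exitStatus(run(ctx))
}

func main() {
	code, printErr := interrupted()
	fmt.Println(code, printErr) // 0 false

	// Any other error is still printed with the generic exit code.
	err := errors.New("no such image")
	if code, printErr := exitStatus(err); printErr {
		fmt.Fprintln(os.Stderr, "docker:", err)
		fmt.Println(code) // 125
	}
}
```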

The cli.StatusError error would wrap errors but
not provide a way to unwrap them. This would lead
to situations where errors.Is would not match the underlying error.
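
To illustrate why `errors.Is` fails here, the sketch below uses a hypothetical `stringStatusError` that, like the type described above, keeps only the message of the error it wraps. It is a stand-in, not the real `cli.StatusError` definition.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// stringStatusError is a hypothetical stand-in: it stores only the text of
// the error it "wraps" and has no Unwrap method.
type stringStatusError struct {
	Status     string
	StatusCode int
}

func (e stringStatusError) Error() string { return e.Status }

// flatten converts err into a stringStatusError, discarding the error chain.
func flatten(err error) error {
	return stringStatusError{Status: err.Error(), StatusCode: 125}
}

// isCanceled reports whether errors.Is can find context.Canceled in err.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

// canceledErr returns an error that wraps context.Canceled with %w.
func canceledErr() error {
	return fmt.Errorf("pulling image: %w", context.Canceled)
}

func main() {
	wrapped := canceledErr()
	fmt.Println(isCanceled(wrapped))          // true: %w keeps the chain
	fmt.Println(isCanceled(flatten(wrapped))) // false: only the text survived
	fmt.Println(flatten(wrapped))             // pulling image: context canceled
}
```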

Closes #5659

@codecov-commenter

codecov-commenter commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 31.50685% with 50 lines in your changes missing coverage. Please review.

Project coverage is 59.51%. Comparing base (667ece3) to head (bed4155).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5666      +/-   ##
==========================================
+ Coverage   59.13%   59.51%   +0.37%     
==========================================
  Files         343      347       +4     
  Lines       29370    29408      +38     
==========================================
+ Hits        17369    17503     +134     
+ Misses      11029    10934      -95     
+ Partials      972      971       -1     

@Benehiko
Member Author

Benehiko commented Jan 3, 2025

@thaJeztah @vvoland do you have any concerns that would prevent this PR from getting merged?

@krissetto
Contributor

krissetto commented Jan 7, 2025

I'm also not sure if we should be changing the StatusError. Rather than leaving the whole PR sitting while we try to decide, I think we can split that out and at least merge the ctx cancellation bits, which are good to have and improve the UX. WDYT?

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

> I'm also not sure if we should be changing the StatusError. Rather than leaving the whole PR sitting while we try to decide, I think we can split that out and at least merge the ctx cancellation bits, which are good to have and improve the UX. WDYT?

We can separate it (it is already split across 2 commits), but the UI will still print the error "user terminated the process". The StatusError struct only accepts a string, not the actual error and its children, so errors.Is cannot Unwrap it to reach the context cancellation error.

For example:

⋊> ~/G/cli-3 on 5f221783c  ./build/docker run postgres:latest
Unable to find image 'postgres:latest' locally
latest: Pulling from library/postgres
fd674058ff8f: Pulling fs layer
1eab12a50bdf: Pulling fs layer
5a81b4aedb94: Pulling fs layer
502eeeb4a17b: Waiting
e9e19177b318: Waiting
2068838cf5fa: Waiting
45a271dbb114: Waiting
8f9ac4ec849d: Waiting
9d8b60e88ddb: Waiting
3ec4ef471804: Waiting
16d755b48cd4: Waiting
3d5d11fb541c: Waiting
d8ab5fe30360: Waiting
d19370fe7a12: Waiting
^Cdocker: user terminated the process

Run 'docker run --help' for more information
⋊> ~/G/cli-3 on 5f221783c

The only way to leave StatusError unchanged and still avoid printing that string is to not use StatusError at all and return a different error.

You can try it for yourself: git checkout 5f221783c2a5791ec34e4070353a3125fd0847c9, make sure you don't have postgres:latest (or the image you are testing) on your machine, then run docker run postgres:latest. Cancel the pull midway with Ctrl+C and you will see the error message shown above.

@laurazard
Collaborator

laurazard commented Jan 7, 2025

There are other options that aren't perfect, but work in the interim, such as @thaJeztah's solution here. Would require a pass over the codebase to look at all of the other places, but that's probably fine and something that can be done iteratively.

There's always string matching too 😅
instead of

	...
	if err != nil && !errdefs.IsCancelled(err) && !errors.Is(err, errCtxUserTerminated) {
	...

do

	...
	if err != nil {
		// FIXME: replace this with errdefs.IsCancelled after changing StatusErr
		if err.Error() != "context canceled" {
		...

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

> There are other options that aren't perfect, but work in the interim, such as @thaJeztah's solution here. Would require a pass over the codebase to look at all of the other places, but that's probably fine and something that can be done iteratively.
>
> There's always string matching too 😅 instead of
>
> 	...
> 	if err != nil && !errdefs.IsCancelled(err) && !errors.Is(err, errCtxUserTerminated) {
> 	...
>
> do
>
> 	...
> 	if err != nil {
> 		// FIXME: replace this with errdefs.IsCancelled after changing StatusErr
> 		if err.Error() != "context canceled" {
> 		...

I also wrestled with this a bit and I still came to the conclusion that having it match with errors.Is is more robust. We also get this feature throughout the whole CLI instead of iteratively changing things in specific circumstances, which could lead some code paths to output the error while others do not.
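
A small sketch of the robustness argument, using only the standard library. The helper names are hypothetical. It assumes the error reaches the check either bare or wrapped with `%w`; as discussed above, an error flattened into a string-only type would defeat `errors.Is` as well.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// matchByString is the stop-gap check: compare the full message text.
func matchByString(err error) bool {
	return err != nil && err.Error() == context.Canceled.Error()
}

// matchByIs walks the wrap chain looking for context.Canceled.
func matchByIs(err error) bool {
	return errors.Is(err, context.Canceled)
}

// bare returns the cancellation error as-is.
func bare() error { return context.Canceled }

// wrapped returns the cancellation error with extra context added via %w.
func wrapped() error { return fmt.Errorf("docker run: %w", context.Canceled) }

func main() {
	fmt.Println(matchByString(bare()), matchByIs(bare()))       // true true
	fmt.Println(matchByString(wrapped()), matchByIs(wrapped())) // false true
}
```

The string comparison stops matching as soon as any caller adds context to the message, while `errors.Is` keeps matching as long as the chain is preserved.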

The "best" solution I can come up with at this point in time is to replace StatusError in the places we use it with an internal version of it, then we don't break anything to external consumers of the CLI code and the behavior of the CLI improves without the risk of some commands being "left behind".

The user might kill the CLI through a SIGINT/SIGTERM
which cancels the main context we pass around.

Currently the context cancel error is printed
alongside any other wrapped error with a generic
exit code (125).

This patch improves on this behavior and prevents
any error from being printed when they match
`context.Canceled`.

The `cli.StatusError` error would wrap errors but
not provide a way to unwrap them. This would lead
to situations where `errors.Is` would not match the
underlying error.

Signed-off-by: Alano Terblanche <[email protected]>
@Benehiko
Member Author

Benehiko commented Jan 7, 2025

I've updated the code to use a new error called internal.StatusError instead of relying on cli.StatusError. This keeps the original cli.StatusError intact.

Now the question is:

  • If the user is calling code inside the CLI where previously cli.StatusError was returned and then comparing the error using errors.As(err, &cli.StatusError), then it will silently fail (it compiles fine but no longer matches at runtime).
  • If the user is using cli.StatusError inside their own code it will compile and behave the same.
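
The silent failure in the first bullet can be sketched as follows. `publicStatusError` and `internalStatusError` are hypothetical stand-ins for `cli.StatusError` and `internal.StatusError`; their fields and methods are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// publicStatusError stands in for the exported type that external code
// already matches against.
type publicStatusError struct {
	Status     string
	StatusCode int
}

func (e publicStatusError) Error() string { return e.Status }

// internalStatusError stands in for the new internal type.
type internalStatusError struct {
	Cause      error
	StatusCode int
}

func (e internalStatusError) Error() string { return e.Cause.Error() }
func (e internalStatusError) Unwrap() error { return e.Cause }

// runCommand stands in for CLI code that used to return publicStatusError
// and now returns internalStatusError.
func runCommand() error {
	return internalStatusError{Cause: errors.New("exit status 1"), StatusCode: 1}
}

// callerSeesStatus is what an external consumer might have written. It still
// compiles after the switch, but errors.As no longer finds a match.
func callerSeesStatus(err error) (int, bool) {
	var stErr publicStatusError
	if errors.As(err, &stErr) {
		return stErr.StatusCode, true
	}
	return 0, false
}

func main() {
	code, ok := callerSeesStatus(runCommand())
	fmt.Println(code, ok) // 0 false: the status code is silently lost
}
```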

@laurazard
Collaborator

laurazard commented Jan 7, 2025

> If the user is calling code inside the CLI where previously cli.StatusError was returned and then comparing the error using errors.As(err, &cli.StatusError), then it will silently fail (it compiles fine but no longer matches at runtime).

Yeah, I see a few instances (~250, but could be more if importing under other names) – including @ndeloof in swarmctl – of people doing this around GitHub, but that's a lot smaller so I'm less concerned. I'll leave it up to @thaJeztah and folks as to how acceptable that is.

Perfect is the enemy of good, and I think in projects such as these compromises sometimes need to be made to improve the current situation without breaking other things, which is why I don't think string matching as a stop-gap (that can be reverted on the next major when StatusError is changed) is that bad. As it is, internal.StatusError is similarly a stop-gap measure that will get removed when no longer needed, except it introduces some breakage that string matching doesn't.

As long as StatusError gets changed in the next major, no commands can be "left behind", since we'd be changing the StatusError struct itself, which forces any code referencing it to change accordingly in order to compile. In actuality, the current measure also leaves that risk – if a command is forgotten about/not updated to use the internal version of StatusError, it'll get "left behind".

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

Yeah I mean, there are only three options here (I'm okay choosing any of these):

  1. We print the error (no breakage to cli.StatusError), but the error now reads "user terminated the process" instead of "context canceled"
  2. We stick with breaking the cli.StatusError by updating the properties (compile-time error for those setting their own errors inside cli.StatusError{})
  3. We introduce internal.StatusError with the updated changes and the code inside the CLI always references the new error type, but it silently breaks comparisons, since cli.StatusError != internal.StatusError.

Currently the last commit introduces the third option, but we can drop it and just go with option 1.

@laurazard
Collaborator

FYI @Benehiko, that area/context label is usually used for contexts as in docker context, not Go context.Contexts.

@laurazard
Collaborator

> Yeah I mean, there are only three options here (I'm okay choosing any of these):
>
>   1. We print the error (no breakage to cli.StatusError), but the error now reads "user terminated the process" instead of "context canceled"
>   2. We stick with breaking the cli.StatusError by updating the properties (compile-time error for those setting their own errors inside cli.StatusError{})
>   3. We introduce internal.StatusError with the updated changes and the code inside the CLI always references the new error type, but it silently breaks comparisons, since cli.StatusError != internal.StatusError.

To be clear, the best solution is clearly changing cli.StatusError and we want to do that on the next major (v28); all we're discussing is how to make things better until then.

We could simply add a field to the cli.StatusError struct without breaking compatibility, as long as we don't remove the current field. Something like

// StatusError reports an unsuccessful exit by a command.
type StatusError struct {
	Cause      error
	// Deprecated: use Cause instead.
	Status     string
	StatusCode int
}

And use that in the CLI codebase from now on, and then remove the Status field on the next major.
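
A sketch of how the proposed struct could behave. The struct fields follow the proposal above; the `Error` and `Unwrap` methods, the fallback message, and the helper functions are assumptions added for illustration, not the actual `cli.StatusError` implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// StatusError reports an unsuccessful exit by a command.
type StatusError struct {
	Cause error
	// Deprecated: use Cause instead.
	Status     string
	StatusCode int
}

// Error prefers Cause and falls back to the deprecated Status string, so
// existing callers that only set Status keep their message (assumed behavior).
func (e StatusError) Error() string {
	if e.Cause != nil {
		return e.Cause.Error()
	}
	if e.Status != "" {
		return e.Status
	}
	return fmt.Sprintf("exit status %d", e.StatusCode)
}

// Unwrap exposes Cause so errors.Is and errors.As can walk the chain.
func (e StatusError) Unwrap() error { return e.Cause }

// legacy builds the error the old way, with only a string.
func legacy() error {
	return StatusError{Status: "user terminated the process", StatusCode: 125}
}

// withCause builds the error the new way, keeping the underlying error.
func withCause() error {
	cause := fmt.Errorf("user terminated the process: %w", context.Canceled)
	return StatusError{Cause: cause, StatusCode: 125}
}

// isCanceled reports whether errors.Is can find context.Canceled in err.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println(isCanceled(legacy()))    // false: no chain to walk
	fmt.Println(isCanceled(withCause())) // true: Unwrap reaches context.Canceled
	fmt.Println(legacy())                // user terminated the process
}
```

Because the `Status` field stays in place, code that constructs `StatusError{Status: ..., StatusCode: ...}` with keyed fields would keep compiling under this sketch.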

Development

Successfully merging this pull request may close these issues.

don't print "context canceled" errors when canceling an action (CTRL-C)
5 participants