behavior change, now I get timeouts on ping #175
Comments
I put loop() in a try/except block. I don't know whether this is an MQTT problem or a CircuitPython v8 problem, because both differ between my rp running v7 with a v7 minimqtt and my rp running v8 with the latest minimqtt. |
More information: I instrumented the code. The loop is supposed to exit after one second, but when a ping occurs, the loop holds its context until the ping timeout. My application's main loop has time-sensitive actions and shouldn't be delayed for one minute by a bad ping on entering client.loop(). |
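A minimal sketch of the guard and instrumentation described above, written against a stand-in client so it runs without hardware. `FakeClient` and `poll_once` are hypothetical names, and `MMQTTException` here is a local stand-in for the exception shown in the traceback, not the library class itself.

```python
import time


class MMQTTException(Exception):
    """Local stand-in for the library exception seen in the traceback."""


class FakeClient:
    """Hypothetical client whose loop() fails the way the v8 setup does."""

    def loop(self, timeout=1):
        raise MMQTTException("Unable to receive 1 bytes within 60 seconds.")


def poll_once(client, timeout=1):
    """Call client.loop() and report (ok, seconds_spent) instead of raising."""
    start = time.monotonic()
    try:
        client.loop(timeout=timeout)
        ok = True
    except MMQTTException:
        ok = False
    return ok, time.monotonic() - start
```

Logging `seconds_spent` on each pass is one way to spot a call that holds the main loop far longer than the requested timeout.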
Got it. It's not ping. I instrumented the source available after 12/5/2023: in loop(), the call to receive raises an exception if nothing is received, but it's OK for nothing to be received. The path is _wait_for_msg() -> _sock_exact_recv(1). |
In the same function, _sock_exact_recv(), the timeout isn't the timeout but the keep alive: read_timeout = self.keep_alive |
Problem solved with a hack inside _wait_for_msg. Note: first I instrumented the code to discover the path, then I shortened the timeout for the first receive. Note: keep alive is the wrong value for this function; keep_alive is for ping.
|
The above is not optimal because it causes ping to fail. I'm now adding a parameter to _wait_for_msg that indicates whether a response is definitely expected or may or may not arrive, and setting the first wait accordingly. |
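The idea in the comment above can be sketched roughly as follows. This is not the poster's actual change (which was not shared); `first_byte_timeout` and all of its parameter names are hypothetical.

```python
def first_byte_timeout(response_expected, loop_timeout, recv_timeout):
    """Pick how long to wait for the first byte of a packet.

    response_expected: True when a reply (e.g. to a PINGREQ) must arrive.
    loop_timeout: the short timeout the application passed to loop().
    recv_timeout: the longer socket-level receive timeout.
    """
    if response_expected:
        # A reply is owed, so waiting the full receive timeout is reasonable.
        return recv_timeout
    # Nothing is owed; an idle poll should not outlast the caller's budget.
    return min(loop_timeout, recv_timeout)
```

With a 1 second loop timeout and a 60 second receive timeout, an idle poll waits 1 second while a pending ping reply still gets the full 60.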
Solved it. Contact me if you want to know what I did. |
Here's my understanding: the trouble is that once the receive loop (common to all MQTT message processing, not just PING) starts reading data into the buffer and hits the receive timeout (modulo #189), the MQTT session becomes pretty much useless. If a partial read was already done, that portion of the MQTT message is lost along with the buffer (it is a local variable), and if the client tried to recover, it would get the remaining portion of the message, which would be interpreted as garbage w.r.t. its current state; hence the exception. Maybe it could special-case a receive timeout with an empty buffer; however, it should still give an indication (a different exception, perhaps) to the consumer that the receive timeout was exceeded. That is possibly not worth the trouble, as it usually means that the connection at the TCP level is dead anyway. |
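A rough sketch of the special case floated above: an empty-buffer timeout is reported differently from a timeout after a partial read. Everything here (`IdleTimeout`, `PartialReadTimeout`, `FakeSocket`, `recv_exact`) is hypothetical and only illustrates the distinction; it is not the library's code.

```python
class IdleTimeout(Exception):
    """No bytes arrived at all; the stream is still in sync."""


class PartialReadTimeout(Exception):
    """Some bytes arrived, then the timeout hit; the stream is out of sync."""


class FakeSocket:
    """Hypothetical socket that hands out queued chunks, then times out."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv(self, n):
        if not self._chunks:
            raise TimeoutError("no data")
        # For simplicity, chunks are assumed to be no longer than n.
        return self._chunks.pop(0)[:n]


def recv_exact(sock, n):
    """Read exactly n bytes, telling an idle timeout from a partial read."""
    buf = bytearray()
    while len(buf) < n:
        try:
            buf.extend(sock.recv(n - len(buf)))
        except TimeoutError:
            if not buf:
                raise IdleTimeout("nothing received") from None
            raise PartialReadTimeout(f"got {len(buf)} of {n} bytes") from None
    return bytes(buf)
```

A caller could treat `IdleTimeout` as a normal quiet poll and treat `PartialReadTimeout` as a reason to drop the session and reconnect.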
My belief was that the ping timeout was colliding with the "normal" timeout. |
Could you expand on the timeout collision ? Anyhow, with the fix for #198 , the PINGREQ will be sent only if there has been no outgoing message traffic to the broker for the duration of the keep alive timeout or longer (assuming the |
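The rule stated above (send PINGREQ only after a full keep alive interval with no outgoing traffic) can be sketched like this. `KeepAliveTracker` and its method names are hypothetical and are not taken from the fix for #198.

```python
class KeepAliveTracker:
    """Tracks outgoing traffic to decide when a PINGREQ is due."""

    def __init__(self, keep_alive):
        self.keep_alive = keep_alive  # seconds
        self.last_sent = 0.0          # timestamp of the last outgoing packet

    def note_sent(self, now):
        """Record that any packet (PUBLISH, SUBSCRIBE, PINGREQ, ...) was sent."""
        self.last_sent = now

    def ping_due(self, now):
        """True once a whole keep alive interval passed with nothing sent."""
        return now - self.last_sent >= self.keep_alive
```

Calling `note_sent` for every outgoing packet, not only for pings, is what keeps a busy publisher from sending needless PINGREQs.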
The "collision" is an assumption on my part. I did not instrument/diagnose the path that generated the timeout but when I disabled ping the unwanted My assumption is that time passed so that the ping timeout was generated (even though ping may have or may have not been sent). And I'm not sure the code is "smart enough" to not invoke a ping in the middle of a mqtt transaction. |
I agree an error code should ALWAYS be returned so the program can handle the issue and decide what happens next, whether that is a restart or a loop that waits until a connection can be established. Without that, programs are very unreliable. If you ever want CircuitPython to be taken seriously, then escalate issues like this one so they get fixed correctly. This is only one of several error conditions that are not handled properly. |
While important people might be listening on this thread, if you want CircuitPython to be taken seriously, you really need to:
|
Currently the need to send PINGREQ is based on whether more than keep alive time has passed since sending the last PINGREQ (which is wrong but let's continue) and is done as a first thing in the Also, thinking of this, it seems to me that the |
I have it working/hacked well enough to get by until the next official CircuitPython release. I found that, for the RP2040, minimum timeouts of less than 2 seconds led to timeouts on completing a nominally good message/handshake; for my application, 2 seconds is good enough. MQTT is just one step in a long chain of things that have to get done for my project. Too many things to do, too little time. |
Do you explicitly set
I understand. That said, it reminds me of https://xkcd.com/2347/ |
I made sure the timeout parameter at the loop() API made it down the function call chain to where it needed to be applied. I may or may not have used the existing variable names (I likely did not), to prevent timeout dependency side effects similar to what I saw with PING. |
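One way to carry the caller's timeout down a call chain is to convert it to a single absolute deadline at the top and pass that along. The sketch below only mirrors the shape of the chain named in this thread; the function names and the `clock` parameter are hypothetical, and this is not the poster's code.

```python
import time


def sock_recv(deadline, clock=time.monotonic):
    """Innermost call: reports how much of the caller's budget is left."""
    return max(0.0, deadline - clock())


def wait_for_msg(deadline, clock=time.monotonic):
    """Middle layer: forwards the same deadline instead of its own timeout."""
    return sock_recv(deadline, clock)


def loop(timeout, clock=time.monotonic):
    """Entry point: turns the caller's timeout into one absolute deadline."""
    deadline = clock() + timeout
    return wait_for_msg(deadline, clock)
```

Because every layer sees the same deadline, no inner layer can substitute an unrelated value such as the keep alive interval.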
FYI, in get_monotonic_time(), monotonic_ns()/1000000000 doesn't seem to maintain precision (at least on the RP2040, which I tested on) any better than plain monotonic(), so short timeout values (10-100 ms) will start to run into errors within hours or a few days of uptime. |
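The effect can be reproduced on a desktop by rounding through IEEE single precision, which is an assumption standing in for the board's float type (the real type may carry even fewer mantissa bits). The helper names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def elapsed_float32(start_ns, end_ns):
    """Elapsed seconds when each timestamp is first squeezed into a float32."""
    start = to_float32(start_ns / 1_000_000_000)
    end = to_float32(end_ns / 1_000_000_000)
    return end - start


def elapsed_int_ns(start_ns, end_ns):
    """Elapsed seconds when the subtraction is done on integer nanoseconds."""
    return (end_ns - start_ns) / 1_000_000_000


# About 11.6 days of uptime, then a 10 ms interval.
start_ns = 1_000_000 * 1_000_000_000
end_ns = start_ns + 10_000_000
```

At this uptime the spacing between adjacent float32 values is 0.0625 s, so the 10 ms interval vanishes in `elapsed_float32` but survives in `elapsed_int_ns`. Subtracting the integer nanosecond values before converting to seconds avoids the loss.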
Is this because of the limited range that can be represented in the resulting float ? Could you create a new issue (this one is already overloaded), please ? |
Hi,
I've got a v7 running on a red featherwing, loaded with minimqtt from the v7 bundle from around December 2022
and a v8 running on a black featherwing, loaded with minimqtt from the v8 bundle
The v7 with an earlier version of minimqtt runs w/o ping timeouts if I'm waiting and doing nothing
The v8 does this, after a short period of time:
File "code.py", line 1776, in
File "adafruit_minimqtt/adafruit_minimqtt.py", line 1007, in loop
File "adafruit_minimqtt/adafruit_minimqtt.py", line 1030, in _wait_for_msg
File "adafruit_minimqtt/adafruit_minimqtt.py", line 1126, in _sock_exact_recv
MMQTTException: Unable to receive 1 bytes within 60 seconds.
The problem I see is that an exception raised in response to a ping should be handled in the ping code,
which has its own timeout handling. That code is never reached, but even if it were,
it would also raise an unhandled exception.
Alternatively, it should be handled in the loop that contains the ping, which would return an error code for the app to receive and
figure out what to do.
questions:
that is, return an error code that lets the app decide whether it wants to restart or not,
and be accurate that the problem is a ping timeout rather than an "expected 1 byte(s)" timeout