Enhance KissModem frame processing and timeout handling #2490

Open
tuzzmaniandevil wants to merge 2 commits into meshcore-dev:dev from tuzzmaniandevil:dev

@tuzzmaniandevil

Fixes several bugs in the KISS modem TX state machine that could cause the modem to permanently stop sending packets over serial.

Problem

The KISS modem TX state machine had multiple paths that could lock up permanently, requiring a device reboot:

  1. TX_SENDING stuck forever — If isSendComplete() never returns true (missed radio interrupt, SPI glitch), the state machine stays in TX_SENDING indefinitely. No more packets can be sent or queued.
  2. startSendRaw() return value ignored — If the radio fails to start transmitting, the modem enters TX_SENDING waiting for a completion that never started.
  3. TX_WAIT_CLEAR stuck forever — If the radio gets stuck detecting a phantom carrier, isReceiving() returns true indefinitely and the state machine never progresses.
  4. Silent packet drops — When a DATA frame arrives while a TX is already pending, it's silently discarded with no feedback to the host application, which may hang waiting for a TX completion that will never come.
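
To make failure modes 1 and 4 concrete, here is a deliberately simplified model of the behaviour described above. It is not the actual KissModem code: the `UnboundedTx` type, its fields, and the reduced two-state enum are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

enum TxState { TX_IDLE, TX_SENDING };

// Hypothetical model of the pre-fix behaviour described in this PR,
// reduced to two states. Not the real KissModem implementation.
struct UnboundedTx {
  TxState state = TX_IDLE;
  int dropped = 0;  // frames discarded without any response to the host

  // Called when the host delivers a DATA frame.
  // Returns true if the frame was accepted for transmission.
  bool onDataFrame() {
    if (state != TX_IDLE) {
      ++dropped;  // silent drop: the host is never told
      return false;
    }
    state = TX_SENDING;
    return true;
  }

  // One pass of the main loop. send_complete models what
  // isSendComplete() would report on this pass.
  void loop(bool send_complete) {
    if (state == TX_SENDING && send_complete) {
      state = TX_IDLE;
    }
    // No deadline: if send_complete never becomes true,
    // the state never leaves TX_SENDING.
  }
};
```

In this model, once a completion is missed, every later call to `onDataFrame()` is rejected and counted in `dropped`, which mirrors the "requires a device reboot" symptom.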

Changes

  • Dynamic TX timeout — Added a timeout to TX_SENDING using getEstAirtimeFor() * 1.5, matching the approach used by the Dispatcher in companion/repeater firmware. Adapts automatically to radio configuration instead of using a fixed 10s constant.
  • startSendRaw() error handling — Check return value; on failure, drop the packet and return to TX_IDLE instead of entering a dead state.
  • TX_WAIT_CLEAR timeout — If the channel stays busy longer than the worst-case max packet airtime × 1.5, force-proceed to TX_DELAY.
  • TX_SLOT_WAIT timer reset — Reset _tx_timer when cycling back to TX_WAIT_CLEAR so the channel-busy timeout measures time in that state, not cumulative time since the TX was queued.
  • TX-busy rejection — When a DATA frame is received while a TX is already pending, respond with HW_RESP_TX_DONE (result=0x00) so the host knows the packet was rejected.
  • TX failure notification — On TX timeout, notify the host with HW_RESP_TX_DONE (result=0x00) instead of silently dropping.
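
The changes above can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the code in this PR: `FakeRadio` and `TxMachine` are hypothetical, the radio method names are taken from the description but their signatures are guessed, `MAX_PACKET_LEN` and the success value `RESULT_OK` are assumptions, and TX_DELAY and TX_SLOT_WAIT are collapsed into a direct transition to keep the example short.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the radio driver. Method names follow the PR
// description; signatures and fields are assumptions for this sketch.
struct FakeRadio {
  bool start_ok = true;        // value returned by startSendRaw()
  bool send_complete = false;  // value returned by isSendComplete()
  bool receiving = false;      // value returned by isReceiving()
  uint32_t airtime_ms = 100;   // value returned by getEstAirtimeFor()

  bool startSendRaw(const uint8_t*, int) { return start_ok; }
  bool isSendComplete() const { return send_complete; }
  bool isReceiving() const { return receiving; }
  uint32_t getEstAirtimeFor(int) const { return airtime_ms; }
};

enum TxState { TX_IDLE, TX_WAIT_CLEAR, TX_SENDING };

constexpr int MAX_PACKET_LEN = 255;      // assumed worst-case packet size
constexpr uint8_t RESULT_FAILED = 0x00;  // failure value named in the PR
constexpr uint8_t RESULT_OK = 0x01;      // assumed success value

struct TxMachine {
  FakeRadio& radio;
  TxState state = TX_IDLE;
  uint32_t tx_timer = 0;         // time at which the current state was entered
  uint32_t send_timeout_ms = 0;  // deadline length for TX_SENDING
  std::vector<uint8_t> pending;
  std::vector<uint8_t> results;  // stands in for HW_RESP_TX_DONE frames to the host

  explicit TxMachine(FakeRadio& r) : radio(r) {}

  void notifyHost(uint8_t result) { results.push_back(result); }

  void finish() { pending.clear(); state = TX_IDLE; }

  // Host delivers a DATA frame at time now (milliseconds).
  void onDataFrame(const std::vector<uint8_t>& pkt, uint32_t now) {
    if (state != TX_IDLE) {  // TX-busy rejection instead of a silent drop
      notifyHost(RESULT_FAILED);
      return;
    }
    pending = pkt;
    tx_timer = now;
    state = TX_WAIT_CLEAR;
  }

  void loop(uint32_t now) {
    const int len = static_cast<int>(pending.size());
    switch (state) {
      case TX_IDLE:
        break;
      case TX_WAIT_CLEAR: {
        // Channel-busy timeout: 1.5x the airtime of a maximum-size packet.
        const uint32_t busy_limit = radio.getEstAirtimeFor(MAX_PACKET_LEN) * 3 / 2;
        // Unsigned subtraction stays correct across timer wraparound.
        const bool busy_timed_out = (now - tx_timer) > busy_limit;
        if (radio.isReceiving() && !busy_timed_out) break;
        if (!radio.startSendRaw(pending.data(), len)) {
          finish();  // start failed: drop the packet, return to TX_IDLE
          break;
        }
        send_timeout_ms = radio.getEstAirtimeFor(len) * 3 / 2;  // dynamic TX timeout
        tx_timer = now;
        state = TX_SENDING;
        break;
      }
      case TX_SENDING:
        if (radio.isSendComplete()) {
          notifyHost(RESULT_OK);
          finish();
        } else if ((now - tx_timer) > send_timeout_ms) {
          notifyHost(RESULT_FAILED);  // TX timeout: tell the host, do not hang
          finish();
        }
        break;
    }
  }
};
```

Driving `loop()` with an explicit `now` argument rather than reading a clock inside the machine is a choice made for this sketch so that the timeout paths can be exercised deterministically with a fake radio.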

Testing

These are all state machine edge cases triggered by radio hardware faults (missed interrupts, stuck carrier detect). Verified by code review against the Dispatcher's equivalent timeout handling in src/Dispatcher.cpp.

@tuzzmaniandevil force-pushed the dev branch 2 times, most recently from 76be54c to 314d777, on May 7, 2026 21:25
Contributor

@446564 446564 left a comment

🤌

@446564
Contributor

446564 commented May 7, 2026

@recrof covered some edge cases here, much nicer flow now

@recrof
Member

recrof commented May 7, 2026

@ViezeVingertjes can you review please?

Contributor

@ViezeVingertjes ViezeVingertjes left a comment

Looks like a good improvement. Two comments, though: one place that probably should send a done, and another where I'm not sure whether it should. It's quite late here, but if you can clear those up, I'll turn it into an approval first thing when I wake up! 👍🏼

Comment thread examples/kiss_modem/KissModem.cpp
Comment thread examples/kiss_modem/KissModem.cpp Outdated
@tuzzmaniandevil
Author

Thanks @ViezeVingertjes, I've addressed those two comments :-)


4 participants