Skip to content

Conversation

hwilmers
Copy link
Contributor

@hwilmers hwilmers commented May 28, 2020

MOTIVATION:

If a watchdog is used, it may be needed to kick it while the update is
running. A good place to do so is after each successfully received data
block.

IMPLEMENTATION:

If CONFIG_WATCHDOG is set, a function updatehub_watchdog_kick() is called
each time a new block was successfully received and written to flash.

The function updatehub_watchdog_kick() is declared with __weak linkage,
and can be implemented elsewhere in the application to invoke desired
functionality.

Signed-off-by: Hans Wilmers [email protected]

@hwilmers hwilmers requested a review from nandojve as a code owner May 28, 2020 13:12
@nandojve
Copy link
Member

CC @otavio

@nandojve nandojve requested a review from jukkar May 28, 2020 14:29
@nandojve nandojve added bug The issue is a bug, or the PR is fixing a bug Trivial Changes that can be reviewed by anyone, i.e. doc changes, minor build system tweaks, etc. DNM This PR should not be merged (Do Not Merge) and removed Trivial Changes that can be reviewed by anyone, i.e. doc changes, minor build system tweaks, etc. labels May 28, 2020
@nandojve
Copy link
Member

@hwilmers Thank you for this initiative. You need try to follow WDT API, take a look at samples/drivers/watchdog/src/main.c and include/drivers/watchdog.h

@hwilmers
Copy link
Contributor Author

hwilmers commented May 28, 2020

@nandojve Do you think it is safe to use the WDT API directly here? What if the hardware watchdog does not support a sufficiently long timeout?

@jukkar
Copy link
Member

jukkar commented May 28, 2020

What is missing from commit message, is a description why you would need to do this kicking.

@hwilmers hwilmers force-pushed the updatehub_kick_watchdog branch from 8037375 to a26d190 Compare May 28, 2020 22:27
@hwilmers
Copy link
Contributor Author

What is missing from commit message, is a description why you would need to do this kicking.

@jukkar ok, I have rewritten the commit message to clarify.

@hwilmers hwilmers force-pushed the updatehub_kick_watchdog branch from a26d190 to f373460 Compare June 2, 2020 10:52
@zephyrbot
Copy link

zephyrbot commented Jun 2, 2020

Some checks failed. Please fix and resubmit.

checkpatch issues

-:17: ERROR:WEAK_DECLARATION: Using weak declarations can have unintended link defects
#17: FILE: lib/updatehub/include/updatehub.h:80:
+__weak void updatehub_watchdog_kick(void);

- total: 1 errors, 0 warnings, 33 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

Your patch has style problems, please review.

NOTE: Ignored message types: AVOID_EXTERNS BRACES CONFIG_EXPERIMENTAL CONST_STRUCT DATE_TIME FILE_PATH_CHANGES MINMAX NETWORKING_BLOCK_COMMENT_STYLE PRINTK_WITHOUT_KERN_LEVEL SPDX_LICENSE_TAG SPLIT_STRING VOLATILE

NOTE: If any of the errors are false positives, please report
      them to the maintainers.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@hwilmers hwilmers force-pushed the updatehub_kick_watchdog branch from f373460 to 1728153 Compare June 2, 2020 11:34
@hwilmers
Copy link
Contributor Author

hwilmers commented Jun 2, 2020

So how can I avoid checkpatch flagging WEAK_DECLARATION, if weak linkage is exactly what is wanted?

MOTIVATION:

If a watchdog is used, it may be needed to kick it while the update is
running. A good place to do so is after each successfully received data
block.

IMPLEMENTATION:

If CONFIG_WATCHDOG is set, a function updatehub_watchdog_kick() is
called each time a new block was successfully received and written to
flash.

The function updatehub_watchdog_kick() is declared with __weak linkage,
and can thus be overwritten by an implementation in the application to
invoke desired functionality.

Signed-off-by: Hans Wilmers <[email protected]>
@hwilmers hwilmers force-pushed the updatehub_kick_watchdog branch from 1728153 to e492904 Compare June 2, 2020 12:42
@hwilmers
Copy link
Contributor Author

hwilmers commented Jun 2, 2020

Shouldn't this call the standard watchdog_kick?

I think we should allow for some kind of middle layer, since the hardware watchdog possibly does not allow a sufficiently long timeout.

@otavio
Copy link
Contributor

otavio commented Jun 2, 2020

CC @nandojve

@nandojve
Copy link
Member

nandojve commented Jun 2, 2020

Shouldn't this call the standard watchdog_kick?

I think we should allow for some kind of middle layer, since the hardware watchdog possibly does not allow a sufficiently long timeout.

I'm not convinced yet that path is the proper solution. Can we explore max tries attempts and NETWORK_TIMEOUT to have a better whatchdog window, for instance?

@hwilmers
Copy link
Contributor Author

hwilmers commented Jun 3, 2020

Shouldn't this call the standard watchdog_kick?

I think we should allow for some kind of middle layer, since the hardware watchdog possibly does not allow a sufficiently long timeout.

I'm not convinced yet that path is the proper solution. Can we explore max tries attempts and NETWORK_TIMEOUT to have a better whatchdog window, for instance?

I am not sure I understand. Do you suggest to not kick the watchdog while the update is running, but rather set it to a precalculated window before the update? In this case, we need to know the number of download packets also, which is not known to the application before the update. Also, this can potentially result in a watchdog window of gigantic size.

@nandojve
Copy link
Member

nandojve commented Jun 3, 2020

Shouldn't this call the standard watchdog_kick?

I think we should allow for some kind of middle layer, since the hardware watchdog possibly does not allow a sufficiently long timeout.

I'm not convinced yet that path is the proper solution. Can we explore max tries attempts and NETWORK_TIMEOUT to have a better whatchdog window, for instance?

I am not sure I understand. Do you suggest to not kick the watchdog while the update is running, but rather set it to a precalculated window before the update? In this case, we need to know the number of download packets also, which is not known to the application before the update. Also, this can potentially result in a watchdog window of gigantic size.

Not that. I mean, can we configure the FDs timeout, retries and MAX CoAP Block somehow to be inside watchdog window to do the kick? We can configure at build time MAX COAP BLOCK from 16 up to 1024, max retries and NETWORK_TIMEOUT. Please, take a look at https://github.com/UpdateHub/zephyr/commits/topic/uhu-fix-dup-pack, it uses commits from #25683
I hope soon that branch be a PR, need run more tests.
Do you think is possible create a solution changing these parameters without an external call?

@hwilmers
Copy link
Contributor Author

hwilmers commented Jun 3, 2020

Not that. I mean, can we configure the FDs timeout, retries and MAX CoAP Block somehow to be inside watchdog window to do the kick? We can configure at build time MAX COAP BLOCK from 16 up to 1024, max retries and NETWORK_TIMEOUT. Please, take a look at https://github.com/UpdateHub/zephyr/commits/topic/uhu-fix-dup-pack, it uses commits from #25683
I hope soon that branch be a PR, need run more tests.
Do you think is possible create a solution changing these parameters without an external call?

I don't understand how this will help, if we still shall not do external calls. Do you mean to call the WDT API directly, and adjust the parameters of updatehub to it? What if the hardware watchdog only allows windows that are smaller than the NETWORK_TIMEOUT that we need?

@otavio
Copy link
Contributor

otavio commented Jun 4, 2020

@hwilmers we are not seeing the use-case you are trying to cover here.

@nandojve and I were discussing it and, from our perspective, the UpdateHub library would be running in another thread or your watchdog kick would be running on another thread and then it'd not be triggered.

Are we missing something?

@hwilmers
Copy link
Contributor Author

hwilmers commented Jun 5, 2020

@hwilmers we are not seeing the use-case you are trying to cover here.

@nandojve and I were discussing it and, from our perspective, the UpdateHub library would be running in another thread or your watchdog kick would be running on another thread and then it'd not be triggered.

Are we missing something?

Yes, you are missing something. As you are writing, the hardware watchdog is kicked in another thread. But this thread wants to know that the update operation is in a stable state. The most error prone way to ensure this, is to either give the update a max. total time to complete (which can be huge), or to find a place in code where operation is known to be stable, and kick the other thread from there. Thus my proposed code changes.

@nandojve
Copy link
Member

Hi @hwilmers, can you check #26093. We would like address the watchdog after that merge.
We understand that makes things cleaner to address this issue.

@hwilmers hwilmers marked this pull request as draft June 25, 2020 07:22
@hwilmers hwilmers closed this Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue is a bug, or the PR is fixing a bug DNM This PR should not be merged (Do Not Merge)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants