Skip to content

Fix relative jumps #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 27, 2021
Merged

Conversation

wnienhaus
Copy link
Collaborator

This PR is meant to fix #41 .

The PR is more of a prototype for now (not finished). I raised it to be able to:

  1. discuss the implementation of "emit multiple instructions"
  2. get ideas on a behaviour of binutils-esp32ulp that I cannot yet explain.

1. opcode -> Multiple instructions
For now the implementation (see commit fa09eb3) works by the i_jumpr and i_jumpr methods returning a tuple when multiple instructions should be emitted, and the assembler testing for "is this a tuple" and if so, it looping over the elements. Otherwise it does what it did before, i.e. adds 1 instruction to the section. The current structure for this logic feels a bit rough, but for now I intend mostly to get feedback on the approach.

To help this, for pass 1, when we're not yet calling the i_* methods, because we only want to determine section sizes and label offsets, I added a function that can return "how many instructions will this opcode result in" based on the opcode and its arguments. The default return value is 1, but for the JUMPS and JUMPR opcodes and with the specific conditions that will result in 2 instructions, the function returns 2. Simple and effective, but technically a "duplication of logic" (because i_jumps/i_jumpr methods also make this decision). Perhaps there are other approaches for how to make this even simpler or more elegant?


2. Weird binutils-esp32 behaviour with negative immediate offsets
For a reason I cannot yet figure out, when using negative immediate offsets for JUMPS or JUMPR instructions, binutils-esp32 generates instructions differently than when using label references that are before the current PC (back-jumps).

The behaviour is as follows: for SYM references, if the offset is negative, set the sign bit and use the absolute value of the offset. for IMM references however, the sign bit is set the same way, but for the offset the negative (not absolute) value is used.

At the "field" level in the opcode the difference is such that:

  • For a negative symbol reference, e.g. -9, sign bit is 1 and the offset is 9
  • For a negative immediate reference, e.g. -2, the sign bit is 1 and the offset is 126 (which is the 7-bit two's complement of 2)

See commit 3ae1e20 for the implementation that mirrors what binutils-esp32ulp does. So "it works", in that py-esp32-ulp will now generate the same binary output, but I still don't know why binutils-esp32ulp treats these offsets differently (even after a fair bit of scanning the binutils-esp32ulp source code). If we need to jump 4 words back, because there is a label 4 instructions earlier, of whether I give an immediate offset of -4, I would expect the resulting jump instruction to be the same. I have not tested this on the actual device for how the ULP interprets these different "representations".


3. One more thing
One last thing, which is not yet in this PR, is another weird behaviour of binutils-esp32ulp. This relates to using constants defined with .set as arguments to the JUMPS and JUMPR opcodes. I would expect them to behave like immediate offsets, but they are always "interpreted" as 0.

So the following two JUMPS statements result in the exact same instruction (namely 2a00 0084):

  .set const, 42
  jumps 0, 1, lt
  jumps const, 1, lt

I started wondering whether this is a bug in binutils-esp32ulp? Or I have wondered if "resolving the constant" is deferred by binutils-esp32ulp to the linker perhaps, and maybe we're calling the linker wrong in the compat test scripts, but so far I cannot see that this is true. The binutils-esp32ulp test scripts call the linker the same way (then again, sadly they don't have this specific example in their tests).


Happy to get feedback/thought/ideas on the above, and the current implementation.

@wnienhaus
Copy link
Collaborator Author

wnienhaus commented Aug 14, 2021

Nevermind about point 3 above. I just got it! In the binutils-esp32ulp case, it is in fact the linker that fills in the const and therefore the symbol must be exported with .global const. When const is exported, both py-esp32-ulp and binutils produce the same output. I think I will leave this one as-is then, because technically, with binutils-esp32ulp, it's a case of "using it wrong", if one doesn't export the symbol.

@wnienhaus
Copy link
Collaborator Author

I'm now relatively sure, that point 2 is a bug in binutils-esp32ulp. I dug into that code, debugged around a bit and found the place where it is likely wrong.

I compared what binutils-esp32ulp does with the ULP "macro approach" in the ESP-IDF, and there the offset is always converted to absolute and the 8th bit is set based on sign (as we do in py-esp32-ulp). I also read the documentation (ESP32 TRM), which is mentioned in issue #41, and it too states that that bit 0-6 (7bits) are always a positive number and the 8th bit determines how the ULP will interpret the offset (positive or negative). So 2 sources against 1 encode the negative offsets the same way between values from symbols and immediate values.

I have a fix suggestion for binutils-esp32ulp and will send it as PR to that project and see what they say. Then I guess we'll know more.

Until then I think i'll revert the last patch of this PR, and thus simply "leave-it-as-before" (which I currently think is the right way anyway). Then we can hopefully merge this change already before any response from espressif and if we need to change, we can change that later.

@wnienhaus
Copy link
Collaborator Author

I have now made a PR to the binutils-esp32ulp project (espressif/binutils-esp32ulp#18). Let's see what they say.

@wnienhaus
Copy link
Collaborator Author

So I got a positive response to the PR to binutils-esp32ulp. I will wait until it's fully merged, and then I will update these tests to pass against the updated binutils-esp32ulp.

xxd is used by the compat test scripts to output binary dumps
of mismatching outputs for manual inspection later.

(So far the absence of this tool went by unnoticed, because the
tests had always passed)
…nstructions as per latest binutils-esp32ulp naming

Only renames. No change in behaviour/functionality.
JUMPS can only do LT, LE and GE comparisons. GT and EQ can be emulated
by the assembler generating 2 instructions using supported conditions
to achieve the desired effect.

For now only the "natively-supported" comparisons are supported.
Support for the dual-instruction jumps will be added in an upcoming
commit.

In other words this change temporarily removes support for the GT and
EQ conditions, given they resulted in incorrect instructions anyway.
Implementing them correctly requires the upcoming dual-instruction
approach.
@wnienhaus
Copy link
Collaborator Author

Now that my fix to binutils-esp32ulp has been merged, I finally managed to clean up this PR.

Points 2 and 3 from the original description of this PR are resolved. Now we can focus on the implementation of the "multi-instruction" approach.

In summary: opcodes.i_* functions can now optionally return a tuple, in which case each element is appended as an instruction to the TEXT section. A separate opcodes.no_of_instr function can be consulted during pass 1 of assembly to determine how many instructions a given opcode will result in.

Happy to get feedback on the implementation.

Copy link
Collaborator

@ThomasWaldmann ThomasWaldmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nitpicks, but looks pretty good already!

if condition == 'eq':
walk_cond = BRCOND_LT
result_cond = BRCOND_LE
elif condition == 'gt':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...  # gt == ge but not le

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this comment also, please.

JUMPR can only do LT and GE comparisons, so the assembler implements
LE and GT conditions by using the LT and GE conditions respectively
and by adjusting the threshold accordingly.
@wnienhaus
Copy link
Collaborator Author

Fixed all the things you spotted. Thanks for that. Pushed latest code (amended commits).

Copy link
Collaborator

@ThomasWaldmann ThomasWaldmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the updates, 2 comment edits and we're done.

if condition == 'eq':
walk_cond = BRCOND_LT
result_cond = BRCOND_LE
elif condition == 'gt':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this comment also, please.

The JUMPR and JUMPS instructions in assembly are defined to take
an offset in bytes, but the real CPU instruction expects this to
be expressed in 32-bit words. Thus, the assembler needs to
translate offsets into words (32-bits). i.e. 4 bytes is 1 word.

Only immediate values, which are multiples of 4 are allowed. Other
values will result in a ValueError. (binutils-esp32ulp actually
allows other values and rounds the result, but to get the exact same
behaviour as binutils with its peculiarities would require a bigger
change for a case, which might more likely represent a mistake than
a useful offset)

Note, since py-esp32-ulp already tracks label offsets in 32-bit words
internally, this conversion is only necessary for immediate offsets.
…te symbol

Values assigned to symbols using .set should be specified in words (not
bytes). Thus, this test ensures that the value of the const symbol is not
converted to words (divided by 4).

Since py-esp32-ulp internally tracks symbol values as words anyway, this
behaviour has been correct all along. The test aims to show that recent
changes related to immediate value handling did not break this (and to
guard against future changes that could break this).
Copy link
Collaborator

@ThomasWaldmann ThomasWaldmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, let's merge...

@ThomasWaldmann ThomasWaldmann merged commit f4b0720 into micropython:master Sep 27, 2021
@wnienhaus wnienhaus deleted the fix-relative-jumps branch October 7, 2021 06:04
@wnienhaus wnienhaus mentioned this pull request Jul 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JUMPS instruction conditions
2 participants