Skip to content

[OpenMP][test] Define print_possible_return_addresses on SPARC #138523

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

rorth
Copy link
Collaborator

@rorth rorth commented May 5, 2025

Parts of the openmp testsuite currently don't build on SPARC due to the lack of a print_possible_return_addresses definition.

This patch provides one. With it, the vast majority of tests PASS on Solaris/sparcv9 and, with an additional patch, on Linux/sparc64.

The current definition was obtained empirically.

Tested on sparcv9-sun-solaris2.11, sparc64-unknown-linux-gnu, amd64-pc-solaris2.11, and x86_64-pc-linux-gnu.

Parts of the `openmp` testsuite currently don't build on SPARC due to the
lack of a `print_possible_return_addresses` definition.

This patch provides one.  With it, the vast majority of tests `PASS` on
Solaris/sparcv9 and, with an additional patch, on Linux/sparc64.

The current definition was obtained empirically and will need additional
explanation/justification.

Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
@llvmbot llvmbot added the openmp:libomp OpenMP host runtime label May 5, 2025
// FIXME: Need to distinguish between 32 and 64-bit SPARC?
// On SPARC the NOP instruction is 4 bytes long.
// FIXME: Explain. Can use __builtin_frob_return_addr?
#define print_possible_return_addresses(addr) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I don't know much about how OpenMP works so, what is being passed in addr here?
Some sort of register dump struct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed this question: never got an email notification for the update.

You can see this in runtime/test/ompt/callback.h: print_current_address first emits a nop insn followed by a local label, then passes that label's address to print_possible_return_addresses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so this is essencially tries to find where the start of the inserted nop is right?
If so, then I understand the addr-12 case:

  • nop itself is 4 byte, so we need to decrement the label's address by that much: addr-4
  • Depending on optimization level there might be a ba plus the corresponding delay slot inserted, so need to decrement again by zero or two instruction: addr-4-0 -> addr-4 or addr-4-8 -> addr-12

See e.g https://godbolt.org/z/8hYcsWTTd
Though, as far as I can tell there's no difference between 32 and 64 bit SPARC.

But what I still don't understand is, why is addr-20 a possible address?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value printed by this function will be compared to the value of __builtin_return_address(0) executed in the kmpc function generated for the directive before the nop. So, what we try to identify here is not the IP of nop, but the IP of the instruction following the kmpc call. I extended you example by some random kmpc API call (c&p from runtime/tasking/omp51_task_dep_inoutset.c): https://godbolt.org/z/h6KhaYhMW

So, the additional bytes account for the mov following the function call (and the !APP/!NO_APP ?).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so this is essencially tries to find where the start of the inserted nop is right? If so, then I understand the addr-12 case:

* `nop` itself is 4 byte, so we need to decrement the label's address by that much: `addr-4`

* Depending on optimization level there might be a `ba` plus the corresponding delay slot inserted, so need to decrement again by zero or two instruction: `addr-4-0` -> `addr-4` or `addr-4-8` -> `addr-12`

We don't need to worry about optimization, at least not initially: print_possible_return_addresses is only used inside the openmp testsuite, and that is always compiled with just -fopenmp without any -O option. I found that in both Release and Debug builds, where libomp itself is compiled with either optimization or debug options.

See e.g https://godbolt.org/z/8hYcsWTTd Though, as far as I can tell there's no difference between 32 and 64 bit SPARC.

But what I still don't understand is, why is addr-20 a possible address?

I saw the need only in two tests: ompt/synchronization/masked.c and ompt/synchronization/master.c. In both cases, after a compiler-inserted call to __kmpc_end_master due to #pragma omp directives, we got a code sequence like

   0x10000005fcc:	sethi  %hi(0x400), %i1
   0x10000005fd0:	add  %i1, 0x220, %i1	! 0x620
   0x10000005fd4:	call  0x100002008a0 <[email protected]>
   0x10000005fd8:	ldx  [ %i0 + %i1 ], %o0
   0x10000005fdc:	b  0x10000005fe4				<= codeptr_ra
   0x10000005fe0:	nop 
   0x10000005fe4:	nop 						<= current_addr
   0x10000005fe8:	b  0x10000005ff0
   0x10000005fec:	nop 

i.e. 8 more bytes to allow for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think all is okay then.

@rorth rorth changed the title [OpenMP][test] Define print_possible_return_address on SPARC [OpenMP][test] Define print_possible_return_addresses on SPARC May 8, 2025
@jprotze
Copy link
Collaborator

jprotze commented May 13, 2025

If the values work empirically, this is usually good enough. If the matching at some point fails for certain architectures/optimization levels, we can just add some offsets.

@rorth
Copy link
Collaborator Author

rorth commented May 15, 2025

Thanks. I'm going to merge this patch as is for now. I've tried doing 32-bit-default builds Solaris/sparc and Linux/sparc, but both failed horribly for different reasons. This will require more investigation and bug fixes first.

@rorth rorth merged commit df9a90c into llvm:main May 15, 2025
11 checks passed
TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request May 19, 2025
…138523)

Parts of the `openmp` testsuite currently don't build on SPARC due to
the lack of a `print_possible_return_addresses` definition.

This patch provides one. With it, the vast majority of tests `PASS` on
Solaris/sparcv9 and, with an additional patch, on Linux/sparc64.

The current definition was obtained empirically.

Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openmp:libomp OpenMP host runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants