Skip to content

SPC Bin Counters and mmap #35

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

davideberius
Copy link
Collaborator

This is an update to the SPC code to add two major features: bin counters and the mmap interface. Bin counters are designed to allow for more context in the counters so that one counter can have several sub-counters attached to it as bins with breakpoints to denote when each bin gets incremented. These are used extensively for collective algorithm counters which count the number of times each collective algorithm is called and under which circumstances small/large communicator/message. The mmap interface creates a shared file using mmap so tools can attach to this and read SPC values directly using an XML file to point out where the data for each counter resides. This update also includes some bug fixes and quality of life improvements to the code.

@davideberius davideberius requested a review from bosilca December 18, 2019 23:14
Copy link

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! Overall the patch looks good. There are some mishaps that I marked, and few pending questions but we're getting to the end of it. I do however, deplore a certain lack of documentation, and it will not be easy for follow up developers to continue the work here without that.

ompi_spc_value_t *bins;

/* Denoted unlikely because counters will often be turned off. */
if( OPAL_UNLIKELY(IS_SPC_BIT_SET(ompi_spc_attached_event, event_id)) ) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get this check outside the function and into the upper-level macro ? If we can, we can save the function call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've added the check into the macros.

@@ -11,7 +11,7 @@
# Copyright (c) 2004-2005 The Regents of the University of California.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be here ? Did you missed a merge or maybe you rebase according to OMPI master and the ICLDisco master is slightly behind ?

@@ -3,6 +3,9 @@
* Copyright (c) 2011-2015 Los Alamos National Security, LLC. All rights
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be no changes here.

for(i = 0; i < node->num_procs; i++) {
if(NULL != (pptr = (orte_proc_t*)opal_pointer_array_get_item(node->procs, i))) {
if(fd[i] == -1) {
rc = sprintf(sm_file, "%s" OPAL_PATH_SEP "spc_data.%s.%d.%d", shm_dir,
Copy link

@bosilca bosilca Jan 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no sure I understand what this striking similar code is doing in the 2 cases. More precisely, why are you not doing a copy of the file for the first call?

}

int rank = ompi_comm_rank(ompi_spc_comm), rc;
char filename[64];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64 might quickly become too short.

char filename[64];

if(ompi_mpi_spc_xml_string == NULL) {
rc = sprintf(filename, "%s" OPAL_PATH_SEP "spc_data.%s.%d.%d.xml", shm_dir,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you set the length for the filename string please use snprintf instead of sprintf.

index = pvar->pvar_index - mpi_t_offset;
if(index < 0) {
char *shm_dir;
if(0 == access("/dev/shm", W_OK)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use opal_shmem_mmap_backing_file_base_dir instead of the hardcoded /dev/shm ?

for(int i = 1; i < frag->num_segments; i++) {
total_data += frag->segments[i].seg_len;
}
SPC_RECORD(OMPI_SPC_UNEXPECTED_QUEUE_DATA, -total_data);
#endif
SPC_RECORD(OMPI_SPC_UNEXPECTED_IN_QUEUE, -1);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment ?

@@ -2,6 +2,9 @@
* Copyright (c) 2011-2017 Los Alamos National Security, LLC.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this need to be moved somewhere else as there is no ORTE directory anymore.

bin counters and the mmap interface.  Bin counters are designed
to allow for more context in the counters so that one counter
can have several sub-counters attached to it as bins with
breakpoints to denote when each bin gets incremented.
These are used extensively for collective algorithm counters
which count the number of times each collective algorithm is
called and under which circumstances small/large communicator/
message.  The mmap interface creates a shared file using mmap
so tools can attach to this and read SPC values directly using
an XML file to point out where the data for each counter
resides.  This update also includes some bug fixes and quality
of life improvements to the code.

Signed-off-by: David Eberius <[email protected]>
David Eberius added 2 commits May 11, 2020 20:24
are handled, adds new counters for internal queues such
as counters for data usage and timers for out-of-sequence
messages, and adds example code and documentation for
SPCs.  The watermark counter update reduces function call
overhead by better merging the updates to the sentinel
value and the watermark value.  There are now three SPC
example codes, one for accessing SPCs through MPI_T, one
for accessing SPCs through the mmap interface, and one
for parsing SPC snapshot datafiles and producing a
heatmap plot of the counter value differences over time.
This also adds documentation for SPCs in the form of a
markdown file in the same directory as the SPC driver
code.

Signed-off-by: David Eberius <[email protected]>
favor of prrte.  This requires an update to the
SPC driver code and documentation and necessitates
moving the SPC snapshot feature to another code
region which will be in a separate pull request.

Signed-off-by: David Eberius <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants