PRT.DevelopVisualTool History


November 22, 2012, at 07:02 AM by 24.130.186.152 -
Changed line 43 from:
00000000004021dd g F .text 0000000000000046 VMS__throw_exception
to:
00000000004021dd g F .text 0000000000000046 VMS__throw_exception
Changed line 49 from:
gdb probably has functions for resolving addresses, so if one of you can figure out how to use those instead that'd be awesome!
to:
gdb probably has functions for resolving addresses, so if one of you can figure out how to use those instead that'd be awesome!
November 22, 2012, at 07:01 AM by 24.130.186.152 -
Added line 3:
!!! Getting Started
Added line 10:
!!! Collecting Measurements and Generating Graphs
Changed lines 20-22 from:
That should be it for the first steps!

Good luck!
to:
!!! Linking Unit Numbers on Graph to Lines of Code
When looking at the constraint graph, each box has numbers on it, such as "(333, 2)", which identify a particular unit of work. The first number identifies the virtual processor (VP); the second counts the number of times that VP has been assigned to a core. So the instructions executed in VP "333" between assignment "2" and assignment "3" make up the trace of one work-unit. The combination of 333 and 2 is used as the unique identifier of that work-unit.

Now, to figure out what lines of code were executed in that trace, the work-unit identifier has to be connected back to the code. The tool doesn't currently have an automated way to do this. Instead, it has to be done by hand, using a bit of intuition.

The starting point for establishing the linkage is the LoopGraph file (see above). It has a line starting with "unit" for each unit, which contains the identifier and also the address of the instruction where that unit started. So if you're looking for unit (333,2), open the LoopGraph file and search (Ctrl-S) for "unit,333,2" (no spaces). That should find the corresponding line, which will say something like

...
unit,333,2,0x40c5c3,0
...

and then you know that unit (333,2) starts executing at 0x40c5c3. If you also look for "unit,333,3" the start pointer for that is the suspend pointer for the previous unit, so if you find

unit,333,3,0x401562,1

then you know that unit (333,2) executed code from 0x40c5c3 to 0x401562. If those two addresses are not in the same function, you won't necessarily know how execution went from one to the other, but at least you know where it started and where it stopped. Right now there's no easy way to get the file and line of code that an address corresponds to, so you'll have to run objdump --syms on the binary, which will give you something like this:

...
0000000000405585 g F .text 000000000000007d makeHist_helper
0000000000408d22 g F .text 0000000000000127 readCASQ
00000000006122f0 g O .bss 0000000000000008 dot_file
000000000040c587 g F .text 00000000000000b6 readPrivQ
0000000000406051 g F .text 0000000000000401 printHist
00000000004021dd g F .text 0000000000000046 VMS__throw_exception
00000000006121a8 g *ABS* 0000000000000000 _edata
...

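except going on forever. If you start looking for 40c5c3, you'll see that the closest symbol address at or below it is readPrivQ (0x40c587). That function is 0xb6 bytes long, so it covers addresses up to 0x40c63c, which means 0x40c5c3 is inside readPrivQ, at offset 0x3c.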
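
The manual lookup described above can be sketched in Python. This is only an illustration and not part of the tool: the function names are made up here, and it assumes that each "unit" line has the comma-separated form shown above (with the start address as the fourth field) and that the objdump --syms output has the column layout of the sample listing.

```python
def unit_start_address(loop_graph_lines, vp, count):
    """Return the start address recorded for unit (vp, count), or None."""
    # The trailing comma keeps "unit,333,2," from matching "unit,333,20,...".
    prefix = "unit,%d,%d," % (vp, count)
    for line in loop_graph_lines:
        line = line.strip()
        if line.startswith(prefix):
            return int(line.split(",")[3], 16)
    return None


def parse_function_symbols(objdump_lines):
    """Collect (start, size, name) for function symbols in objdump --syms output."""
    symbols = []
    for line in objdump_lines:
        fields = line.split()
        # Assumed layout: address, flags..., section, size, name.
        # Function symbols carry an "F" among the flags.
        if len(fields) < 5 or "F" not in fields[1:-3]:
            continue
        try:
            symbols.append((int(fields[0], 16), int(fields[-2], 16), fields[-1]))
        except ValueError:
            continue  # not a line in the expected layout
    return symbols


def enclosing_function(symbols, address):
    """Return (name, offset) of the function containing address, or None."""
    for start, size, name in symbols:
        if start <= address < start + size:
            return name, address - start
    return None


# Sample data copied from the listings above.
loop_graph = ["unit,333,2,0x40c5c3,0", "unit,333,3,0x401562,1"]
syms = [
    "0000000000408d22 g F .text 0000000000000127 readCASQ",
    "00000000006122f0 g O .bss 0000000000000008 dot_file",
    "000000000040c587 g F .text 00000000000000b6 readPrivQ",
    "00000000006121a8 g *ABS* 0000000000000000 _edata",
]
start = unit_start_address(loop_graph, 333, 2)
print(hex(start), enclosing_function(parse_function_symbols(syms), start))
```

With the sample data, unit (333,2) resolves to readPrivQ at offset 60 (0x3c). In practice you would feed in the lines of the real LoopGraph file and the full objdump output instead of the short lists.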

gdb probably has functions for resolving addresses, so if one of you can figure out how to use those instead that'd be awesome!
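
One possible approach, sketched here and not tested against the VMS binaries: binutils provides addr2line, which maps an address to a function name and file:line when the binary was built with debug info (-g), and gdb's "info symbol" and "info line" commands do similar lookups. The helper names and the binary path below are placeholders chosen for this example.

```python
import subprocess


def addr2line_command(binary, address):
    """Build an addr2line call for one address."""
    # -f also prints the function name, -e selects the executable to inspect.
    return ["addr2line", "-f", "-e", binary, hex(address)]


def gdb_command(binary, address):
    """Build a non-interactive gdb call that resolves one address."""
    # -batch makes gdb exit after running the -ex commands.
    return ["gdb", "-batch",
            "-ex", "info symbol %#x" % address,
            "-ex", "info line *%#x" % address,
            binary]


def resolve(binary, address):
    """Run addr2line and return its output lines (function name, then file:line)."""
    result = subprocess.run(addr2line_command(binary, address),
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Show the commands for the address from the example above;
# "./my_binary" is a placeholder path.
print(" ".join(addr2line_command("./my_binary", 0x40c5c3)))
print(gdb_command("./my_binary", 0x40c5c3))
```

Calling resolve("./my_binary", 0x40c5c3) would run addr2line for real. Without debug info it can only report "??" for the file and line, in which case the objdump approach above is still needed.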
November 08, 2012, at 02:45 AM by 24.130.186.152 -
Changed line 3 from:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it. Before being able to use it, performance counters must be enabled, as per this [[Attach:perfcounters.pdf |attached PDF]]. For more on performance counters, see [[http://opensourceresearchinstitute.org/pmwiki.php/VMS/DevelopPerformanceCounters| using performance counters]]
to:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it. Before being able to use it, performance counters must be enabled, as detailed on the page: [[http://opensourceresearchinstitute.org/pmwiki.php/VMS/DevelopPerformanceCounters| using performance counters]].
November 08, 2012, at 01:03 AM by 24.130.186.152 -
Changed line 5 from:
After performance counters are enabled and working, you'll need a version of a language that is instrumented to collect the measurements. The measurements are enabled or disabled via a compiler switch (see the file "VMS_defs_turn_on_and_off.h" for all such compiler switches). One project with the instrumentation and switches already on is the SSR matrix multiply project: [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/VMS_Projects/VMS_Projects__MC_shared/SSR/SSR__Blocked_Matrix_Mult__MC_shared__Proj/rev/1414b33881aa| SSR matrix multiply project revision with measurements turned on]]
to:
After performance counters are enabled and working, you'll need a version of a language that is instrumented to collect the measurements. The measurements are enabled or disabled via a compiler switch (see the file "VMS_defs__turn_on_and_off.h" for all such compiler switches). One project with the instrumentation and switches already on is the SSR matrix multiply project: [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/VMS_Projects/VMS_Projects__MC_shared/SSR/SSR__Blocked_Matrix_Mult__MC_shared__Proj/rev/1414b33881aa| SSR matrix multiply project revision with measurements turned on]]
November 07, 2012, at 06:42 PM by 24.130.186.152 -
Changed line 3 from:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it. Before being able to use it, performance counters must be enabled, as per this [[Attach:perfcounters.pdf |attached PDF]]. For more on performance counters, see
to:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it. Before being able to use it, performance counters must be enabled, as per this [[Attach:perfcounters.pdf |attached PDF]]. For more on performance counters, see [[http://opensourceresearchinstitute.org/pmwiki.php/VMS/DevelopPerformanceCounters| using performance counters]]
November 07, 2012, at 06:41 PM by 24.130.186.152 -
Changed lines 3-5 from:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it.

First, you'll need a version of a language that is instrumented to collect the measurements. Second, the measurement gathering has to be turned on with a compiler switch. One project that already meets this
is the SSR matrix multiply project: [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/VMS_Projects/VMS_Projects__MC_shared/SSR/SSR__Blocked_Matrix_Mult__MC_shared__Proj/rev/1414b33881aa| SSR matrix multiply project revision with measurements turned on]]
to:
The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it. Before being able to use it, performance counters must be enabled, as per this [[Attach:perfcounters.pdf |attached PDF]]. For more on performance counters, see

After performance counters are enabled and working, you'll need a version of a language that is instrumented to collect the measurements. The measurements are enabled or disabled via a compiler switch (see the file "VMS_defs_turn_on_and_off.h" for all such compiler switches). One project with the instrumentation and switches already on
is the SSR matrix multiply project: [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/VMS_Projects/VMS_Projects__MC_shared/SSR/SSR__Blocked_Matrix_Mult__MC_shared__Proj/rev/1414b33881aa| SSR matrix multiply project revision with measurements turned on]]
September 04, 2012, at 11:50 AM by 24.130.186.152 -
Added lines 1-20:
!!Using the Performance Visualization Tool

The visualization tool started life as a small script that just printed some stats, so it is not production ready. However, it is available for use for those willing to dive into the details of using it.

First, you'll need a version of a language that is instrumented to collect the measurements. Second, the measurement gathering has to be turned on with a compiler switch. One project that already meets this is the SSR matrix multiply project: [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/VMS_Projects/VMS_Projects__MC_shared/SSR/SSR__Blocked_Matrix_Mult__MC_shared__Proj/rev/1414b33881aa| SSR matrix multiply project revision with measurements turned on]]

This project requires you to create a folder called "counters" in the run directory, where it saves three trace files per run. The run fails if it can't find the folder, or if the folder contains more than 255 files (it's not exactly production-quality code).
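
A small pre-flight check along these lines can save a failed run. It is a sketch with a made-up function name, and the exact failure condition (more than 255 entries in the folder) is taken from the description above, not from the project's code.

```python
import os

MAX_COUNTER_FILES = 255  # limit quoted in the text above (assumed to count all entries)


def counters_dir_ready(run_dir="."):
    """Create run_dir/counters if needed; return True if it holds at most 255 entries."""
    path = os.path.join(run_dir, "counters")
    os.makedirs(path, exist_ok=True)
    return len(os.listdir(path)) <= MAX_COUNTER_FILES


if not counters_dir_ready("."):
    print("counters/ is full; move old trace files away before the next run")
```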

After measurements are collected, a post-processing python script is run, which generates a graphical representation, in SVG format. This script is in the [[http://hg.opensourceresearchinstitute.org/cgi-bin/hgwebdir.cgi/VMS/2__runs_and_data/|2__runs_and_data repository]], under scripts/ucc_and_loop_graph_treatment/parse_loop_graph.py. To work with this SSR project, you want the revision (6d03033aca59).

The script calls for two command line arguments, which are the names of the trace files output during the run: the first is the one called "LoopGraph.x" and the second "Counter.x.csv", where x is whatever number was available when the file was created. (The numbers can occasionally get desynchronized if there happens to be a failure somewhere during writing of the trace files, so check the console output of the run; it says which ones were created.)
There's also a "lazy mode": if both files are in the current directory, have those canonical names, and x is the same for both files, you can avoid some typing and just use "parse_loop_graph.py x".
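
To see which runs qualify for lazy mode, a helper like the following can list the values of x for which both trace files exist. It is a sketch, not part of parse_loop_graph.py: the function name is made up, and it assumes x consists of decimal digits only.

```python
import os
import re


def matched_runs(directory="."):
    """Return the x values (as strings) that have both LoopGraph.x and Counter.x.csv."""
    names = set(os.listdir(directory))
    runs = []
    for name in names:
        match = re.fullmatch(r"LoopGraph\.(\d+)", name)
        if match and "Counter.%s.csv" % match.group(1) in names:
            runs.append(match.group(1))
    return sorted(runs, key=int)


for x in matched_runs("."):
    print("parse_loop_graph.py %s" % x)
```

Any LoopGraph.x without a matching Counter.x.csv is skipped, which is the desynchronized case mentioned above; those runs need both file names passed explicitly.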

The script should create a file with the visual representation of the run, called LoopGraph.x.svg, or just x.svg in lazy mode (watch out what you leave lying around in the folder, it overwrites existing files) and prints out some stats. Keep an eye on the line that says something about the difference between expected and actual execution time; for a decent-sized matrix the percentage displayed shouldn't exceed 5%. If you're having issues there, it's usually because some thread(s) got interrupted by the OS; let me know and I'll switch you over to a version that uses timestamps instead.

As of this writing, the measurement of cache behavior isn't very detailed. It only collects L1 data cache read misses during work (i.e. excluding runtime overhead) and then displays the block in a shade between blue and red, depending on the ratio of cache misses per instruction (blue=low, red=high).
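
As an illustration of the blue-to-red shading, here is one way such a mapping could be computed. This is an assumption, not the script's actual code: the linear interpolation, the max_ratio cutoff, and the function name are all invented for this example.

```python
def miss_ratio_color(misses, instructions, max_ratio):
    """Map a misses-per-instruction ratio to an SVG color between blue and red."""
    ratio = misses / instructions if instructions else 0.0
    # Clamp to [0, 1]: 0 is pure blue (low), 1 is pure red (high).
    t = min(max(ratio / max_ratio, 0.0), 1.0)
    red = round(255 * t)
    blue = round(255 * (1 - t))
    return "#%02x00%02x" % (red, blue)


print(miss_ratio_color(0, 1000, 0.05))   # no misses: pure blue
print(miss_ratio_color(50, 1000, 0.05))  # at the cutoff: pure red
```

The max_ratio argument (assumed to be greater than zero) sets the ratio at which a block is drawn fully red; anything above it is clamped.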

That should be it for the first steps!

Good luck!