PCA tells me that a large amount of time is being spent at a CALL instruction. Why? The CALL instruction should consume only a small part of the time spent executing the routine.

First, check page faulting. Sometimes the faulting behavior of a program causes a moderately frequently called routine to be paged out just before it is called.

If that isn't the case, check for JSB linkages to an RTL routine. For performance reasons, some RTL routines use JSB linkages, and this can confuse results gathered with the /MAIN_IMAGE qualifier. The effect is most common with PC sampling data, but it can occur with any kind of data for which stack PC data can be gathered. Because a JSB linkage does not place a call frame on the stack, the return address to the site of the JSB is lost to PCA. Consequently, the first return address found by /MAIN_IMAGE is the site of the call to the routine that called the RTL by means of a JSB linkage.

As an example, suppose routine MAIN called routine FOO, which in turn called the RTL via a JSB linkage, and suppose that a PC sampling hit occurred in the RTL. The PC of the call to FOO and the PC of the call to MAIN are recorded, but the site of the JSB inside FOO is not. Thus, with the /MAIN_IMAGE qualifier, the first PC within the image is the PC of the call to FOO. FOO's call site is therefore inflated by the number of data points that fall in RTL routines with JSB linkages.

Note that this can yield useful information. If you compare the time reported with /MAIN_IMAGE to the time reported without it, you can tell how much time was spent in JSB linkage routines. You cannot, however, separate the various JSB linkage routines from one another. Note further that if the JSB routine is called from the main program, the data points are lost, because there is no caller of the main program.
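The attribution described above can be illustrated with a small toy model. This is only a sketch and not PCA's actual algorithm or data format: the sample layout, the routine names RTL_JSB and RTL_CALL, the hit counts, and the functions attribute and tally are all made up for illustration. Each sample is the routine containing the sampled PC plus the call frames found on the stack, innermost first; a JSB linkage contributes no frame.

```python
from collections import Counter

# Hypothetical toy model, not PCA's real data structures.
# Routines that live in the main image; everything else is "the RTL".
MAIN_IMAGE_ROUTINES = {"MAIN", "FOO"}

def attribute(sample, main_image):
    """Return the bucket a sample is charged to, or None if it is lost."""
    pc_routine, frames = sample
    # Without /MAIN_IMAGE, or when the PC is already in the image,
    # the sample is charged to the routine containing the PC.
    if not main_image or pc_routine in MAIN_IMAGE_ROUTINES:
        return pc_routine
    # With /MAIN_IMAGE, walk the call frames (innermost first) and charge
    # the first call site that lies inside the main image.
    for caller, callee in frames:
        if caller in MAIN_IMAGE_ROUTINES:
            return f"call to {callee} in {caller}"
    return None  # no return address inside the image: data point lost

def tally(samples, main_image):
    return Counter(attribute(s, main_image) for s in samples)

# Each frame is (caller, callee). A JSB pushes no frame, so a hit in
# RTL_JSB reached from FOO shows only the frame for MAIN's call to FOO.
samples = (
    [("FOO", [("MAIN", "FOO")])] * 2                              # hits in FOO itself
    + [("RTL_JSB", [("MAIN", "FOO")])] * 5                        # FOO -> RTL via JSB
    + [("RTL_CALL", [("FOO", "RTL_CALL"), ("MAIN", "FOO")])] * 3  # FOO -> RTL via CALL
    + [("RTL_JSB", [])] * 1                                       # MAIN -> RTL via JSB
)

with_main = tally(samples, main_image=True)
without_main = tally(samples, main_image=False)

site = "call to FOO in MAIN"
# Difference at the call site = time spent in JSB-linkage routines under FOO.
jsb_points = with_main[site] - without_main[site]
print(site, "inflated by", jsb_points, "data points")
print("lost data points:", with_main[None])
```

In this model the call to FOO is charged the five RTL hits reached through the JSB, the hits reached through an ordinary CALL are charged to the call site inside FOO as expected, and the single hit in a JSB routine called directly from MAIN is dropped.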