This screen shows a summary of database users' stall activity. A user stalls whenever Oracle Rdb issues a system call on behalf of the user's process. For example, a stall occurs while a user waits for a lock or for completion of a physical disk read or write. By default, the Stall Messages screen shows all stalls, including those of millisecond duration. When a high-performance, high- volume online transaction processing (OLTP) application issues a large number of I/Os on a high-speed disk device, a DBA may find it impossible to differentiate between the many short millisecond stalls of the OLTP application and the longer, more important stalls that may be encountered by other applications using the database. By typing A to select the Alarm option from the Stall Messages horizontal menu, you can specify a duration, in seconds, that a process must stall before it appears on the Stall Messages screen. For example, if you specify an alarm interval of 5 seconds, then only stalls of 5 seconds or longer duration will appear on the Stall Messages screen. If you specify a value of 0 as the alarm interval, the default, all stalls will appear on the Stall Messages screen. By typing B once to select the Bell option from the horizontal menu, you can activate the alarm bell option and the option will be highlighted. Entering B again will deactivate the alarm bell and the option will not be highlighted. The alarm bell, even if activated, will be rung only if the alarm option has also been activated. When both the alarm and the alarm bell are activated, the alarm bell will be sounded once per screen refresh (specified by the Set_rate option) if there are any displayed stalls. By typing F to select the filter option, you can control the type of stall messages that are displayed. For example, you can display processes that are stalled while writing to a certain storage area. The filter option allows you to enter a search string which is used to filter the stall messages. Only those stall messages that contain the specified search string are displayed. The search string may contain one or both of the wildcard characters. The asterisk wildcard character is mapped to zero or more characters and the percent wildcard character is mapped to exactly one character. Note that the search string is not case sensitive. The filter menu option is highlighted when a search string is actively filtering stall messages. To disable filtering, press the Return key at the search string prompt. Stall messages are not saved in the binary statistics file created by the Output qualifier. Consequently, the Stall Messages screen is not available when you replay a binary file using the Input qualifier. This screen lists database user processes and describes the most recent stalls executed by users on the node from which the Show Statistics command was issued. Because the stall messages are sampled only at the screen update interval, most stalls are missed. If the same stall message for a process persists, it could indicate a problem. The screen also shows when a process is writing a bugcheck dump; a bugcheck dump file name longer than 53 characters is truncated. The Stall Messages screen shows only processes that are actively stalling. Once a process finishes stalling, it disappears from the screen. Processes that are still stalling ripple up to the top of the screen. This means that the longest stalling processes appear at the top of the screen. Newer stalls are added to the bottom of the screen. Therefore, all users on the same node share the same stall screen lines, and only the actively stalling processes show up on the stall screen. This allows you to monitor a relatively large number of stalling processes. If there are more stalled processes than can fit on one page, the notation "Page 1 of n" appears, where n is the total number of pages. You can display successive pages by pressing the right angle bracket (>) key or the Next Screen key. To display a previous stall message page, press the left angle bracket (<) key or the Prev Screen key. A database with no stalls has a blank stall display. You can force frequent screen updates by using a negative number for the Time qualifier in the RMU Show Statistics command. For example, Time=-10 refreshes the screen every 10/100 (1/10) of a second. Note that you use a lot of system resources, particularly on the smaller CPU machines, when you specify this time interval. If more stalls are in progress than can fit on your screen, some active stalls might not be displayed. Oracle Rdb attempts to place as many active stall messages on the screen as possible. You can use the Config menu option to configure the Stall Messages screen. Select this option, by typing the letter C, to display the configuration submenu. The configuration submenu provides the following options: o Display XXX Stall Time Display either the actual stall time or the elapsed stall time. The option description changes depending on which stall time is currently being displayed. o Set alarm interval Specify a duration, in seconds, that a process must stall before it appears on the Stall Messages screen. o XXX alarm bell Activate or deactivate the alarm bell. The option description changes depending on whether the bell is activated. o Filter Stall Messages Display stall messages that contain the specified search string. The information shown in the screen includes: o process The process ID and the Oracle Rdb stream ID of the database user. Normally the stream ID will be one (1). However, if the user is attached to multiple databases or has explicitly detached and attached to different database sessions during the same image activation, the stream ID will uniquely identify the database session. Stream ID values greater than 99 display as "**" to indicate an integer display overflow on the screen; the [Z]oom function can be used to display the full stream ID in this case. The Config menu allows you to select sorting of database users by process ID. Optionally, a single character following the stream ID field indicates additional information about the process: D - Database Recovery (DBR) R - Server for a remote user s - Database server (such as ABS, ALS, LCS, LRS, or RCS) u - Attached for utility access * - User process on another node in the cluster A - Available for per-process monitoring G - Actively being monitored o T The current transaction state. R indicates a read-only transaction and W indicates a read/write transaction. o since By default, the time at which the current stall started. The Config menu option allows you to display the elapsed stall time or the actual stall time. o stall reason The reason for the stall. For example, "waiting for..." messages indicate a stalled lock request (along with the requested lock mode). The following list describes all of the stall reason messages that can appear on the Stall Messages screen, and a brief explanation of what causes each of them. In most cases, the messages are informational and should cause little concern: o Extending .AIJ file This message displays whenever the .aij file is logically or physically extended, which should occur infrequently. o Extending .RUJ file This message displays whenever the .ruj file is physically extended, which should occur infrequently. o Extending storage area !UL This message displays whenever a storage area file (identified by its numeric identifier, which can be determined using RMU Dump) is physically extended. You can determine the numeric identifier for a database's storage areas by using the RMU Dump command. This message should occur infrequently. However, this message may occur more frequently with WORM areas because WORM area pages cannot be reused once they have been written. o Reading .AIJ file This message displays whenever the AIJ lock information needs to be refreshed; this typically only occurs the first time a user attaches to the database. The .aij file is read to determine the AIJ logical EOF (not to be confused with the OpenVMS logical EOF). It is also read by the database recovery (DBR) process. o Reading ROOT file This message displays whenever the in-memory database root information is determined to be out-of-date and must be read again from the disk. This message normally occurs only when a database parameter is modified by a user on line or some information in the database root is modified by the system (such as the AIJ sequence number). o Reading .RUJ file This message displays whenever an undo operation needs to read the next RUJ page to acquire the rollback information necessary to complete the operation. The .ruj file is read one block at a time. Sometimes a process that is not being rolled back receives this message because it was necessary to read the RUJ file in order to refresh cached recovery information. o Reading pages !UL:!UL to !UL:!UL This message displays whenever one or more pages is read into either a user's local buffer or the global buffer. One buffer full of pages is being read. The format string "!UL:!UL" identifies the physical area and the page number. o waiting for !AD (!AC) This message displays whenever a process requests a lock "with wait" and another process is holding the lock in an incompatible mode. This message may indicate a database hot spot and should be investigated using the RMU Show Locks utility. The format string "!AD" identifies the lock type (that is, storage area, page, MEMBIT, etc.) and the string "!AC" identifies the requested lock mode (PR, CR, EX, etc.). The following list contains information on "waiting for" messages: o waiting for record or page The "waiting for record" and "waiting for page" messages display a process ID, the time, and the DBKEY for a record or a page. The DBKEYs in "waiting for record" messages are logical DBKEYs. For example: waiting for record 1:0:-4 (CR) waiting for record 91:155:-1 (CW) In this example of the "waiting for record" message, the first two fields of the "waiting for record" message are not shown. The first field of a "waiting for record" message is the process ID of the stalled process, and the second field is the time the stall began. The third field in the "waiting for record" message (the field with the "XX:YY:ZZ" format) represents the DBKEY, and you can usually interpret it as "logical area number:page number:line number." However, only positive numbers represent the line number. When a negative number appears, it refers to the record ALG (adjustable lock granularity) locking level. By default, there are three page locking levels and the negative numbers are interpreted as follows: o -4 indicates the complete logical area o -3 normally indicates 1000 database pages range o -2 normally indicates 100 database pages range o -1 normally indicates 10 database pages range For example, in the second line of the example, the DBKEY occurs in logical area 91 in a range of 10 database pages, one of which is page 155. When you have a logical area number and want to get the physical area name for that logical area, follow these steps: o Issue the following command: $ RMU/DUMP/LAREAS=RDB$AIP db-name o Search the resulting dump for the logical area with that number. o Note the corresponding physical area number. o Issue the following command: $ RMU/DUMP/HEADER db-name Look up the physical area number from the output of the RMU Dump Header command to find the name of the physical area. You can also look up columns RDB$STORAGE_ID or RDB$INDEX_ID in system relations RDB$RELATIONS, RDB$INDICES and RDBVMS$STORAGE_MAP_AREAS to identify the Oracle Rdb entity (table or index) that the DBKEY represents. For a description of the system relations, see the System_Relations help topic by issuing the following command: $ HELP SQL SYSTEM_RELATIONS The page number field in the DBKEY is the number of the page in the corresponding physical area; the line number is the number of the record on that page. The dbkeys in "waiting for page" messages are physical dbkeys, for example: waiting for page 1:727 (PW) In this example of the "waiting for page" message, the first two fields of the "waiting for page" message are not shown. The first field of a "waiting for page" message is the process ID of the stalled process, and the second field is the time the stall began. The DBKEY format for a "waiting for page" message is interpreted as "physical area number:page number." When you have a physical area number and want to get the physical area name for the area, issue the RMU Dump Header command. Then look up the physical area number in the command output to find the name of the physical area. You can also get a conversion table by issuing the following command: $ RMU/ANALYZE/LAREAS/OPTION=DEBUG/END=1 - _$ /OUTPUT=LAREA.LIS db-name This command produces a printable file containing all logical areas, logical id numbers and physical id numbers for a database. CR, CW, and PW in the previous examples are requested lock modes of Concurrent Read, Concurrent Write, and Protected Write. The following table shows the lock compatibility between a current transaction and access modes other transactions can specify: Mode of Current Lock Mode of ____________________ Requested SR SW PR PW EX Lock _______________________________ SR Y Y Y Y N SW Y Y N N N PR Y N Y N N PW Y N N N N EX N N N N N _______________________________ Key to lock modes: SR - Shared Read SW - Shared Write PR - Protected Read PW - Protected Write EX - Exclusive Y - Locks are compatible N - Locks are not compatible ______________________________ Shared Read (SR) and Shared Write (SW) in the table are equivalent to Concurrent Read (CR) and Concurrent Write (CW). o waiting for DBKEY scope This message displays when a database user who attached using the DBKEY SCOPE IS TRANSACTION clause has a read/write transaction in progress (giving the user the database key scope lock in CW mode), and a second user who specifies the DBKEY SCOPE IS ATTACH clause (which would give the user the database key scope lock in PR mode) tries to attach. In this situation, the second user's process stalls until the first user's transaction completes. You can specify the database key scope at run time using the DBKEY SCOPE IS clause of the SQL ATTACH statement. If the DBKEY SCOPE IS clause is used with the SQL CREATE DATABASE or SQL IMPORT statements, the setting is in effect only for the duration of the session of the user who issued the statement; the setting does not become a database root file parameter. o waiting for snapshot cursor This message displays when a process tries to start a read-only transaction when snapshots are deferred, there is no current read-only transaction, and a read/write transaction is active. Waiting for snapshot cursor is a normal state if snapshots are deferred. The waiting will end when all read/write transactions started before the first read- only transaction have finished. o waiting for MEMBIT lock For each database, a membership data structure is maintained. The membership data structure keeps track of the nodes that are accessing the database at any given time. The membership data structure for a database is updated when the first user process from a node attaches to the database and when the last user process from a node detaches from the database. The "waiting for MEMBIT lock" message means that a process is stalled because the database's membership data structure is in the process of being updated. o waiting for client lock A client lock indicates that an Rdb metadata lock is in use. The term client indicates that Rdb is a client of the Rdb locking services. The metadata locks are used to guarantee memory copies of the metadata (table, index and column definitions) are consistent with the on-disk versions. The "waiting for client lock" message means the database user is requesting an incompatible locking mode. For example, when trying to drop a table which is in use, the drop operation requests a PROTECTED WRITE lock on the metadata object (such as a table) which is incompatible with the existing PROTECTED READ lock currently used by others of the table. These metadata locks consist of three longwords. The lock is displayed in text format first, followed by its hexadecimal representation. The text version masks out non-printable characters with a dot (.). The leftmost value seen in the hexadecimal output contains the id of the object. The id is described below for tables, routines, modules and storage map areas. o For tables and views, the id represents the unique value found in the RDB$RELATION_ID column of the RDB$RELATIONS system table for the given table. o For routines, the id represents the unique value found in the RDB$ROUTINE_ID column of the RDB$ROUTINES system table for the given routine. o For modules, the id represents the unique value found in the RDB$MODULE_ID column of the RDB$MODULES system table for the given module. o For storage map areas, the id presents the physical area id. The "waiting for client lock" message on storage map areas is very rare. This may be raised for databases which have been converted from versions prior to Rdb 5.1. The next value displayed signifies the object type. The following table describes objects and their hexadecimal type values. Object Type Values ------------------------------------- Object Hexadecimal Value ------------------------------------- Tables or views 00000004 Routines 00000006 Modules 00000015 Storage map areas 0000000E ------------------------------------- The last value in the hexadecimal output represents the lock type. The value 55 indicates this is a client lock. NOTE Because the full client lock output is long, it may require more space than is allotted for the Stall.reason column and therefore can be overwritten by the Lock.ID. column output. For more detailed lock information, perform the following steps: 1. Press the L option from the horizontal menu to display a menu of lock IDs. 2. Select the desired lock ID. o Writing .AIJ file This message displays whenever a group commit process writes the commit information to the .aij file. In a high throughput environment, the write buffer length will be as close to 64K as possible. o Writing ROOT file This message displays whenever the in-memory database root information is modified by a user on line or some information in the database root is modified by the system (such as the AIJ sequence number). o Writing .RUJ file This message displays whenever a user process writes data page modification information to the .ruj file. This message always precedes the next message. o Writing pages back to database This message displays whenever one or more data pages is written to the database. This is typically caused by a request to access those pages from another process or by detaching from the database. o lock ID The optional lock ID field is displayed only when the stall is the result of a lock request. When other types of stalls occur, such as stalls due to I/O activity, the lock ID field is cleared from the screen. When displayed, the lock ID field shows the lock identification of the resource that is stalled. You can use the lock identification number as input to the RMU Show Locks command to obtain information about processes that own, are blocking, or are waiting for locks.