The user-defined filter facility provides a mechanism
whereby filters written as DEC Text Processing Utility
(DECTPU) programs are automatically executed when tests
are run. These filters are referred to as user filters.
The implementation lets users solve many filtering
problems easily, often with a single-line program,
while still allowing full access to the facilities of
DECTPU for more complex cases.
To implement a new filter, create a file containing the
required DECTPU commands. A number of predefined
patterns and a global replace procedure are provided
for building the commands. For example, the following
command replaces device names that precede a directory
with the string "DEVICE":
global_replace( identifier + ':[' , 'DEVICE:[' )
Procedure global_replace is similar to the TPU pattern
style feature of the Language-Sensitive Editor, so
filters can be developed using that facility.
To associate a user filter with a test, create a
logical variable whose name starts with the characters
"DTM$UF_". The value of the variable is the file
specification of the file containing the DECTPU
commands.
The variable is then associated with the test in the
usual way.
When the test is run as part of a collection, the
filter is applied.
1 – uf_variables
Variables with names beginning with the string "DTM$UF_" are used to control user filters. These variables must be logical variables but can be global or local.
When tests are run that have associated variables whose names begin with the string "DTM$UF_", DTM will apply the user filters contained in the files referenced by the value of those variables. Only a single file may be referenced by each variable. The specified files are executed by DEC Text Processing Utility (DECTPU). If more than one user filter variable is associated with a test, the files are executed in the lexicographic order of the variable names. The user filters are applied before any built-in filters that are also specified for the test.
User filter files can be located either in OpenVMS directories or in Code Management System (CMS) libraries. Files may be specified using logical names, including logical names that specify search lists. Wildcards cannot be used. For files in CMS libraries, the most recent generation on the main line of descent is used.
Before the first file is executed, the file to be filtered is read into the DECTPU buffer "filter_buffer". Next, the file specified by the logical name DTM$UFDEFINES is executed. The system logical name DTM$UFDEFINES references the file SYS$LIBRARY:DTM$UFDEFINES.TPU, which contains definitions of a global replace procedure and patterns which can be used in building filters. This logical can be redefined to point to a custom file.
Any errors in accessing the user filter files or in executing the DECTPU commands will be reported. However, they will not cause the filter operation to fail, and any remaining user and built-in filters will be applied.
After all the user filters have been applied, the file being filtered will be written out. If any built-in filters are also specified, they are applied to the newly created file, resulting in a second new version.
To disable, for a particular test, a user filter that is defined with a global variable, define the value of the variable for that test as a string containing only spaces.
2 – uf_record
When using the FILTER option to filter the benchmark produced by recording an interactive terminal test, user filters associated with the test will be applied, provided that the VARIABLES option is also used.
3 – uf_global_rep
The supplied file SYS$LIBRARY:DTM$UFDEFINES.TPU
contains a global replace procedure and some predefined
patterns that can be used to build filters. The
specification of procedure global_replace is as
follows:
PROCEDURE global_replace ( pattern_to_replace,
                           replacement_string;
                           search_mode,
                           evaluate_replacement,
                           convert_linefeeds)

DESCRIPTION:

    Replace all occurrences of a given pattern with a
    given string in the buffer "filter_buffer".

PARAMETERS:

    pattern_to_replace     The pattern to be replaced.

    replacement_string     The string to be substituted.

    search_mode            The mode of pattern matching
    (optional)             to be used when searching
                           for the pattern. Should be
                           one of:
                             NO_EXACT (default)
                             EXACT
                             TPU$K_SEARCH_CASE
                             TPU$K_SEARCH_DIACRITICAL

    evaluate_replacement   Specifies whether the replacement
    (optional)             string is to be evaluated.
                           Should be one of:
                             OFF, 0 (default)
                             ON, 1
                           If specified as ON or 1,
                           the replacement string is
                           evaluated before use. This is
                           needed if the replacement
                           string contains any partial
                           pattern variables. In this
                           case, any string literals
                           in the replacement string
                           must be specified as nested
                           strings and partial pattern
                           variables converted to strings
                           using the TPU procedure STR.

    convert_linefeeds      Specifies whether any linefeed
    (optional)             characters in the replacement
                           string are to be converted
                           into line breaks.
                           Should be one of:
                             OFF, 0 (default)
                             ON, 1
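As an illustration of the optional parameters, the following sketch passes EXACT as the search_mode. The text "Node " and the replacement "Node NAME" are hypothetical. Only global_replace, the EXACT keyword and the supplied pattern "identifier" come from the definitions described here.

```tpu
! Hypothetical filter: mask the name following "Node ", but only
! where "Node " appears with exactly this capitalization.
global_replace ('Node ' + identifier, 'Node NAME', EXACT);

! With search_mode omitted, the default NO_EXACT is used, so the
! same pattern would also match "NODE " or "node ".
global_replace ('Node ' + identifier, 'Node NAME');
```

Trailing optional parameters can simply be left off. To skip an earlier one while setting a later one, leave its position empty between the commas.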
4 – uf_examples
The user filter examples are listed below.
4.1 – uf_example1
The following example assumes that the disks are named
UDISK{n} where {n} is a number, for example UDISK1,
UDISK13. This filter replaces such disk names with the
string "DISK_NAME":
global_replace ( 'UDISK' + number, 'DISK_NAME')
The pattern to replace is built from a string literal
('UDISK'), the concatenation operator (+) and the
pattern "number" included in the supplied definitions
file. The pattern "number" matches a sequence of
digits.
The replacement string is the string literal
'DISK_NAME'.
4.2 – uf_example2
This example uses the supplied "null" pattern with
the DECTPU alternation operator to include an optional
element in a pattern.
Suppose that, in the previous example, some of the
disk names do not include the leading "U", for example
DISK7. The following filter replaces disk names with or
without the leading "U":
global_replace ( ("U"|null) + "DISK" + number,
"DISK_NAME")
4.3 – uf_example3
The following example filters dates in the form
DD-MMM-YYYY, for example 11-OCT-1999. Because it filters
only this one form of date, it is quicker than the
built-in date filter, which filters many different date
formats. It is not an exact equivalent of the built-in
date filter in other respects; for example, it treats
37-NOV-0999 as a date. It should nevertheless be
sufficient for most purposes.
day := any(" 123") + digit;
month := "JAN" | "FEB" | "MAR" | "APR" |
"MAY" | "JUN" | "JUL" | "AUG" |
"SEP" | "OCT" | "NOV" | "DEC";
year := any(digits,4);
date := day + "-" + month + "-" + year;
global_replace( date, "dd-mmm-yyyy");
This filter defines the pattern variables "day",
"month" and "year" which are then used to define the
pattern variable "date" used in the call to
global_replace.
The "day" pattern uses the DECTPU function "any" to
match either a space or one of the characters "1", "2"
or "3", followed by a digit.
The "month" pattern uses the DECTPU pattern alternation
operator "|" to specify a list of alternative string
literals.
The "year" pattern uses the DECTPU function "any"
with the supplied pattern "digits". The "4" parameter
indicates that exactly 4 digits are to be matched.
The "date" pattern concatenates these patterns and
linking punctuation.
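The same approach can be applied to other fixed-format fields. The following is a minimal sketch, assuming that times appear in the output in the form HH:MM:SS (for example 09:41:07). The variable names two_digits and clock_time are hypothetical; "digit" is the supplied pattern used in the "day" pattern above.

```tpu
! Sketch: filter times of the form HH:MM:SS.
! Like the date filter, this is deliberately loose: it does not
! check ranges, so it would also treat 99:99:99 as a time.
two_digits := digit + digit;
clock_time := two_digits + ":" + two_digits + ":" + two_digits;
global_replace( clock_time, "hh:mm:ss");
```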
4.4 – uf_example4
This filter removes blank lines using the DECTPU
keywords LINE_BEGIN and LINE_END.
global_replace( LINE_BEGIN + LINE_END, '');
The LINE_END keyword absorbs the new line.
The above filter only replaces lines containing no
characters. The following filter also replaces lines
containing only spaces and tab characters:
global_replace( LINE_BEGIN + (white_space|null) +
LINE_END, '');
4.5 – uf_example5
This example demonstrates how to use surrounding
text to identify a string to be replaced without also
replacing the surrounding text.
The following filter replaces the month part of
a date with the string "mmm". For example, the
string "14-OCT-1999" will be replaced by the string
"14-mmm-1999":
day := any(" 123") + digit;
month := "JAN" | "FEB" | "MAR" | "APR" |
"MAY" | "JUN" | "JUL" | "AUG" |
"SEP" | "OCT" | "NOV" | "DEC";
year := any(digits,4);
date := (day + "-"@day_part) + month + ("-" + year@year_part);
global_replace( date, 'str(day_part) + "mmm" + str(year_part)',,ON);
The day part of the date and the "-" character are
assigned to the partial pattern variable day_part and
the year part of the date and preceding "-" assigned
to year_part. These partial pattern variables are then
included in the replacement string.
When partial pattern variables are used in the
replacement string they must be evaluated for
each replacement. To do this, set the parameter
evaluate_replacement to ON, as shown above.
When the replacement string is to be evaluated, string
literals must be nested inside further quotes. This
is most easily done by using single quotes for the
outer string and double quotes for any nested string
literals, or vice-versa. Also, any partial pattern
variables must be converted to strings using the DECTPU
procedure STR.
Note that including LINE_END in the definition of a
partial pattern variable does not have the effect
of retaining the line break. See example 6 for a
resolution of this problem.
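Partial pattern variables are most useful when the surrounding text varies, so that it cannot simply be repeated in the replacement string. The following is a minimal sketch under that assumption. The labels "Elapsed: " and "CPU: " and the variable names prefix and prefix_part are hypothetical; "number" is the supplied pattern that matches a sequence of digits.

```tpu
! Sketch: keep whichever label was matched, but mask the number
! that follows it, for example "CPU: 1432" becomes "CPU: n".
prefix := ("Elapsed: " | "CPU: ") @ prefix_part;
global_replace( prefix + number, 'str(prefix_part) + "n"',, ON);
```

As in the date example, the replacement string is evaluated for each match, so the nested literal "n" is in double quotes and prefix_part is converted with STR.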
4.6 – uf_example6
If the search pattern contains LINE_END, the matched
line break will be removed, causing the next line to
be appended to the current line. To use LINE_END only
to provide context for the search, the line break
must be reinserted. This is done using the parameter
convert_linefeeds.
If the convert_linefeeds parameter is specified as ON,
any linefeed characters appearing in the replacement
string are removed and the built-in DECTPU procedure
SPLIT_LINE is called at the point of the linefeed
character.
The following filter replaces any numbers that are the
last characters on a line with the string "x":
global_replace (number+LINE_END, "x"+lf,,,ON)
The "lf" pattern is defined as a linefeed character in
the supplied definitions file.
If a LINE_END is included in a partial pattern
variable, the line break can be retained by specifying
the second optional parameter to the DECTPU STR
procedure as a linefeed character, for example:
global_replace (number+(LINE_END@sep),
'"x"+STR(sep,lf)',,ON,ON)
The second parameter to STR specifies the string that
line breaks occurring in the first parameter should be
converted to. Line breaks are retained by specifying
the linefeed character and setting the parameter
convert_linefeeds to ON.
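A common use of this technique is removing trailing white space without joining lines. The following is a sketch only: it assumes that the supplied "white_space" pattern matches a run of spaces and tabs, as its use in example 4 suggests, and that "lf" can be used on its own as the replacement string.

```tpu
! Sketch: strip spaces and tabs at the end of each line.
! LINE_END is matched only for context. The lf in the replacement
! plus convert_linefeeds = ON puts the line break back.
global_replace (white_space + LINE_END, lf,,,ON)
```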
4.7 – uf_example7
The DECTPU keyword UNANCHOR can be used to replace
sections of text delimited by specified strings. The
following replaces all text between the strings "/*"
and "*/" with the string "/* Text deleted */". The text
may run across line boundaries:
global_replace ( "/*" + UNANCHOR + "*/",
"/* Text deleted */")
Note that while a similar effect is possible using the
COMPARE/SENTINEL command, the filter can be applied
to individual tests, whereas the /SENTINEL qualifier
applies only to collections.
4.8 – uf_example8
The global_replace procedure can be used for many
filtering tasks. However, any DECTPU commands can be
used to build filters. The file being filtered is read
into the buffer "filter_buffer" before the user filters
are applied and written out afterwards.
The following filter uses the DECTPU EDIT procedure to
convert all characters to upper case:
EDIT( filter_buffer, UPPER, OFF)
Note that while a similar effect is possible using the
COMPARE/IGNORE=CASE command, the filter can be applied
to individual tests, whereas the /IGNORE qualifier
applies only to collections.
The following filter searches for numbers and replaces
them only if they are in a specified range:
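! Start from the top of the buffer being filtered.
POSITION (BEGINNING_OF (filter_buffer));
LOOP
    found_range := SEARCH_QUIETLY (number, FORWARD);
    EXITIF found_range = 0;
    ! Move just past the number so the next search does not rematch it.
    POSITION (END_OF (found_range));
    MOVE_HORIZONTAL (1);
    value := INT (STR (found_range));
    ! Replace only numbers strictly between 350 and 570.
    IF (value > 350) AND (value < 570)
    THEN
        COPY_TEXT ("XXX");
        ERASE (found_range);
    ENDIF;
ENDLOOP;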
The initial POSITION is required to ensure that the
whole of the filter_buffer is processed, because the
editing point is undefined at the start of each filter.
Then, as each number is processed, the editing point is
moved to the end of the number. The MOVE_HORIZONTAL
procedure call is necessary because the previous
POSITION leaves the editing point at the last character
of the number, which would result in an immediate match
on the next call to SEARCH_QUIETLY.
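The same search loop can drive other editing operations. The following is a minimal sketch, not part of the supplied definitions, that deletes every line containing a marker string. The marker "%DEBUG" is hypothetical. The sketch relies on the DECTPU built-in ERASE_LINE leaving the editing point at the start of the following line, so no MOVE_HORIZONTAL call is needed.

```tpu
! Sketch: remove whole lines that contain a marker string.
POSITION (BEGINNING_OF (filter_buffer));
LOOP
    found_range := SEARCH_QUIETLY ("%DEBUG", FORWARD);
    EXITIF found_range = 0;
    ! Go to the matched text, then delete the line it is on.
    POSITION (BEGINNING_OF (found_range));
    ERASE_LINE;
ENDLOOP;
```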