General-purpose file extraction utility. Selects records from the
beginning, middle, or end of one or more files and optionally performs
various modifications before writing them out.
format:
EXTRACT 'options' file,...
EXTRACT /RECORD=([START=m,END=n,COUNT=k]) file,...
EXTRACT /HEAD=k file,...
EXTRACT /TAIL=k file,...
1 – Parameter
file[,...] input file(s); wildcards are supported. All output is written to a single file (default is SYS$OUTPUT) even if multiple input files are specified.
2 – Qualifiers
2.1 /BLOCKS
/BLOCKS[=(option,...)]
Similar to DUMP/BLOCKS; extracts blocks from the input file(s) without
interpreting their record structure.
Up to two of the following options may be specified; if more than one
is used, separate them with a comma and enclose the list in parentheses.
START=m starting block number; the first block in the file is 1.
A negative value specifies a number of blocks relative
to the end of the file: -1 is the last block, -2 is the
one before that, etc.
END=n ending block number; the last block of the file is
considered to be block number -1.
COUNT=k number of blocks to extract. If START is specified, END
is derived by adding COUNT-1 to it; if END is specified,
START is derived by subtracting COUNT-1 from it; if neither
is specified, START is 1 and END is set to COUNT.
Incompatible with /RECORDS, /HEAD, and /TAIL. Also incompatible
with /COLUMNS, /EXPAND_TABS, /TRANSLATE, and /VFC_HEADER.
2.2 /COLUMNS
/COLUMNS=([-,]column_range,...)
Select or reject certain columns within extracted records before writing them to the output file. 'column_range' is either a single column number or a low and high pair separated by "-" or ":". If more than one range is desired, separate them with commas and enclose the list in parentheses.
If the first element of the list is "-", then the rest of the list represents columns to reject rather than ones to select. If the first range begins with ":" (e.g., ":20") then column 1 is implied; if the last range ends with ":" (e.g., "41:") or ":*" then the end of the record is implied.
Note: the output record consists of the selected columns in their original relative positions, not in the order listed in /COLUMNS. That is, /COLUMNS=(25:30,5:10) produces the same output as /COLUMNS=(5:10,25:30).
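The selection rules above can be modelled in a few lines. The following Python sketch is only an illustration of the documented behaviour, not the utility's own code; the function names `parse_columns` and `select_columns` are hypothetical, and the parser is more permissive than the text (it accepts an open-ended range anywhere in the list).

```python
def parse_columns(spec):
    """Parse e.g. "(1:10,18-19,25,41:*)" into (reject, [(low, high), ...]).

    high is None when the range runs to the end of the record.
    """
    items = [s.strip() for s in spec.strip("()").split(",")]
    reject = items[0] == "-"          # leading "-" switches to reject mode
    if reject:
        items = items[1:]
    ranges = []
    for item in items:
        sep = ":" if ":" in item else "-" if "-" in item else None
        if sep is None:               # single column number
            low = high = int(item)
        else:
            lo_s, hi_s = item.split(sep, 1)
            low = int(lo_s) if lo_s else 1                 # ":20" implies column 1
            high = None if hi_s in ("", "*") else int(hi_s)  # "41:" or "41:*"
        ranges.append((low, high))
    return reject, ranges


def select_columns(record, spec):
    """Keep (or reject) the listed columns, preserving original order."""
    reject, ranges = parse_columns(spec)
    out = []
    for col, ch in enumerate(record, start=1):   # columns are 1-based
        listed = any(low <= col and (high is None or col <= high)
                     for low, high in ranges)
        if listed != reject:
            out.append(ch)
    return "".join(out)
```

Because the record is scanned left to right, the order in which ranges are listed has no effect on the output, which matches the note above.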
2.3 /EDIT
/EDIT=(option,...)
Perform one or more of various modifications to extracted records
before writing them to the output file. The available functions
include all options of the DCL F$EDIT() lexical function plus
several extensions:
COLLAPSE, COMPRESS, LOWERCASE, TRIM, UNCOMMENT, UPCASE,
STRIP_TRAILING, IGNORE_QUOTES, FALLBACK, FORMAT.
If more than one option is specified, separate them with commas and
enclose the list within parentheses.
See "edit_options" for more information.
2.4 /EXPAND_TABS
/[NO]EXPAND_TABS
Convert ASCII tab characters into spaces. Tab stops are considered to be positioned at every 8th column: 9, 17, 25, ...
By default, tabs are expanded if extraction by columns (/COLUMNS=xxx) or translation into EBCDIC (/TRANSLATE=ASCII_TO_EBCDIC) is requested or lines are being numbered (/NUMBERS), and left as tabs otherwise. Tab expansion is never performed if translation from EBCDIC into ASCII (/TRANSLATE=EBCDIC_TO_ASCII) is requested.
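As a rough model of the defaulting rules and the tab-stop positions described above, the Python sketch below may help. The function names and the keyword arguments are hypothetical; only the rules themselves come from the text.

```python
def should_expand_tabs(explicit=None, columns=False, numbers=False,
                       translate=None):
    """Decide whether tabs are expanded.

    explicit: True for /EXPAND_TABS, False for /NOEXPAND_TABS, None if absent.
    translate: "ASCII_TO_EBCDIC", "EBCDIC_TO_ASCII", or None.
    """
    if translate == "EBCDIC_TO_ASCII":
        return False                  # never expanded in this direction
    if explicit is not None:
        return explicit               # an explicit qualifier wins
    return columns or numbers or translate == "ASCII_TO_EBCDIC"


def expand_tabs(record):
    """Expand tabs to stops at columns 9, 17, 25, ... (1-based)."""
    return record.expandtabs(8)
```

For example, `expand_tabs("ab\tc")` places `c` in column 9.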
2.5 /HEAD
/HEAD[=count]
Extract records from the beginning of the file(s). 'count' is the number of records to extract; if not present, the default value is 22. A negative 'count' value designates the number of records at the end of the file to omit. That is, /HEAD=-5 will extract every record in the file except for the last 5.
Incompatible with /BLOCKS, /RECORDS, and /TAIL.
2.6 /IDENTIFY
/[NO]IDENTIFY
Determines whether to display the input file name before performing extraction upon its contents. By default, the file identification is performed if the input file specification is a list of files, contains any wildcards, or includes a search list. /IDENTIFY forces identification; /NOIDENTIFY suppresses identification.
Note: the file identification is written to the same destination as the extracted data, so it should normally be suppressed if the data is being translated into EBCDIC or if VFC headers are kept.
2.7 /INDEX
/INDEX=n Specifies an alternate key of reference for indexed files. Has no effect for sequential or relative files. The default is index 0.
2.8 /NUMBERS
/[NO]NUMBERS
Controls whether output lines are prefixed by the line number of the input record. Each numbered line consists of the formatted number, one space, and the text of the extracted record. The number is formatted using the smallest width of 4, 6, 8, or 11 columns that can hold the value. The default is /NONUMBERS (unnumbered output). Use of /NUMBERS affects the default for /[NO]EXPAND_TABS.
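The width rule can be illustrated with a short Python sketch. The name `number_prefix` is hypothetical, and right-justifying the number within its field is an assumption; the text only states the candidate widths and the single separating space.

```python
def number_prefix(line_no, text):
    """Prefix text with line_no in the smallest of 4, 6, 8, or 11 columns."""
    width = 11
    for candidate in (4, 6, 8, 11):
        if len(str(line_no)) <= candidate:
            width = candidate
            break
    # Assumption: the number is right-justified within the chosen width.
    return f"{line_no:>{width}} {text}"
```

Under these assumptions, record 7 is numbered in a 4-column field and record 12345 in a 6-column field.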
2.9 /OUTPUT
/OUTPUT[=file]
Specifies the output file. Output is written to SYS$OUTPUT by default. File format is variable length records with implied carriage return (i.e., standard text file format), unless whole records are extracted from a file having fixed length records. /VFC_HEADER=KEEP will produce a VFC format output file (but only if the input file has VFC format).
2.10 /RECORDS
/RECORDS[=(option,...)]
Similar to DUMP/RECORDS but with more flexibility in indicating which
records. Extracts specified records from the input file(s).
Up to two of the following options may be specified; if more than one
is used, separate them with a comma and enclose the list in parentheses.
START=m starting record number; the first record in the file is 1.
A negative value specifies a number of records relative
to the end of the file: -1 is the last record, -2 is the
one before that, etc.
END=n ending record number; the last record of the file is
considered to be record number -1.
COUNT=k number of records to extract. If START is specified, END
is derived by adding COUNT-1 to it; if END is specified,
START is derived by subtracting COUNT-1 from it; if neither
is specified, START is 1 and END is set to COUNT.
Incompatible with /BLOCKS, /HEAD, and /TAIL.
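The START, END, and COUNT rules (which read the same for /BLOCKS) can be worked through with a small Python sketch. `resolve_range` is a hypothetical name; treating an omitted START as 1 and an omitted END as the last record, and clamping to the file, are assumptions not stated in the text.

```python
def resolve_range(total, start=None, end=None, count=None):
    """Return the 1-based inclusive (first, last) for a file of `total` records."""
    if sum(v is not None for v in (start, end, count)) > 2:
        raise ValueError("at most two of START, END, COUNT may be given")

    def absolute(n):
        # Negative values count back from the end: -1 is the last record.
        return total + 1 + n if n < 0 else n

    if start is not None:
        start = absolute(start)
    if end is not None:
        end = absolute(end)

    if count is not None:
        if start is not None:
            end = start + count - 1       # END = START + (COUNT-1)
        elif end is not None:
            start = end - (count - 1)     # START = END - (COUNT-1)
        else:
            start, end = 1, count         # neither given
    if start is None:
        start = 1                         # assumption: default START
    if end is None:
        end = total                       # assumption: default END

    # Assumption: out-of-range requests are clamped rather than rejected.
    return max(start, 1), min(end, total)
```

With a 100-record file, `START=11,END=-11` resolves to records 11 through 90, which omits the first 10 and the last 10 records.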
2.11 /TAIL
/TAIL[=count]
Extract records from the end of the file(s). 'count' is the number of records to extract; if not present, the default value is 22. A negative 'count' value designates the number of records at the start of the file to omit. That is, /TAIL=-5 will extract the entire file except for the first 5 records.
Incompatible with /BLOCKS, /RECORDS, and /HEAD.
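The positive and negative 'count' cases for /HEAD and /TAIL map naturally onto list slicing. The Python sketch below is illustrative only; `head` and `tail` are hypothetical names operating on an in-memory list of records, and the handling of a count of 0 is an assumption.

```python
def head(records, count=22):
    """Model of /HEAD: first `count` records, or all but the last |count|."""
    # A negative stop in a slice already means "omit that many from the end".
    return records[:count]


def tail(records, count=22):
    """Model of /TAIL: last `count` records, or all but the first |count|."""
    if count < 0:
        return records[abs(count):]       # omit the first |count| records
    # Assumption: a count of 0 yields nothing; oversized counts yield the file.
    return records[len(records) - min(count, len(records)):]
```

For a 30-record file, `head(recs, -5)` yields records 1 through 25 and `tail(recs, -5)` yields records 6 through 30.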
2.12 /TRANSLATE
/TRANSLATE={ ASCII_TO_EBCDIC | EBCDIC_TO_ASCII }
Translate data (which is assumed to represent simple character text)
from ASCII into EBCDIC or vice versa. One of the two keywords must
be specified. If ASCII_TO_EBCDIC is used then tab expansion is done
unless /NOEXPAND_TABS is specified. If the reverse is used, no tab
expansion will be attempted regardless of /EXPAND_TABS.
Translation from EBCDIC into ASCII is done before any modifications
from /EDIT are performed; translation from ASCII into EBCDIC is done
after any requested edits are performed.
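The ordering matters because the edit functions operate on ASCII text. A minimal Python sketch of that pipeline follows; `process` is a hypothetical name, `edit` stands in for whatever /EDIT options were requested, and the use of code page `cp037` is an assumption, since the text does not say which EBCDIC variant the utility uses.

```python
EBCDIC = "cp037"   # assumption: the actual EBCDIC variant is not documented


def process(raw, translate=None, edit=None):
    """Apply translation and edits to one record in the documented order."""
    if translate == "EBCDIC_TO_ASCII":
        text = raw.decode(EBCDIC)         # translate first, then edit
    else:
        text = raw.decode("latin-1")      # treat input bytes as 8-bit text
    if edit is not None:
        text = edit(text)                 # edits always see ASCII-side text
    if translate == "ASCII_TO_EBCDIC":
        return text.encode(EBCDIC)        # edit first, then translate
    return text.encode("latin-1")
```

In both directions an edit such as upper-casing is applied to the ASCII form of the record, never to EBCDIC bytes.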
2.13 /VFC_HEADER
/VFC_HEADER={ IGNORE | DATA | KEEP }
Specifies how to handle files in variable-with-fixed-control-area
format (such as batch .log files). By default, the control area is
ignored. /VFC_HEADER=DATA causes the control area, which is normally
hidden, to be treated as part of the normal record contents. A value
of KEEP causes the output file to have the same format as the input
file (rather, as the *first* input file if there is more than one).
If the [first] input file is not in VFC format, /VFC_HEADER=KEEP has
no effect.
3 – edit_options
Options available for /EDIT:
COLLAPSE -- remove all spaces and tabs
COMPRESS -- convert multiple spaces and tabs into a single space
LOWERCASE -- convert unquoted letters into lower-case
TRIM -- remove leading and trailing blanks (spaces and tabs)
UNCOMMENT -- remove comments (from "!" to end of line)
UPCASE -- convert unquoted letters into upper-case
STRIP_TRAILING -- remove trailing blanks (spaces and tabs)
IGNORE_QUOTES -- don't check quotes ("xxx") when doing edits
FALLBACK -- strip 8-bit data into 7-bit equivalent
FORMAT -- convert non-visible (control) characters into "."
If both UPCASE and LOWERCASE are specified, UPCASE takes precedence.
COLLAPSE supersedes COMPRESS and TRIM; TRIM supersedes STRIP_TRAILING.
Quoted text is subject to FALLBACK and FORMAT modifications regardless
of whether or not IGNORE_QUOTES is specified.
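The precedence rules above amount to pruning the requested option set before any editing is done. The Python sketch below shows one way to express that; `effective_edits` is a hypothetical name, and this is not a description of how the utility itself resolves options.

```python
def effective_edits(options):
    """Return the set of /EDIT options that remain after precedence rules."""
    opts = {o.upper() for o in options}
    if "UPCASE" in opts:
        opts.discard("LOWERCASE")         # UPCASE takes precedence
    if "TRIM" in opts:
        opts.discard("STRIP_TRAILING")    # TRIM supersedes STRIP_TRAILING
    if "COLLAPSE" in opts:
        opts -= {"COMPRESS", "TRIM"}      # COLLAPSE supersedes both
    return opts
```

TRIM is checked before COLLAPSE so that a request for all three of COLLAPSE, TRIM, and STRIP_TRAILING reduces to COLLAPSE alone.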
4 – examples
!Display the last 10 lines of Login.Com on the terminal
$ EXTRACT/TAIL=10 Login.Com /IDENTIFY

!Look at the first few lines of all Fortran source files
$ EXTRACT/HEAD=5 *.For

!Copy Test.Txt to Test.Dat, converting text into upper case
! and removing excess blanks.
$ EXTRACT/EDIT=(UPCASE,COMPRESS) Test.Txt/OUTPUT=Test.Dat

!Extract all but the first 10 and last 10 records of Test.Dat
$ EXTRACT Test.Dat/RECORDS=(START=11,END=-11) /OUTPUT=Test.Mid

!Get specific columns out of several files
$ EXTRACT/COLUMNS=(1:10,18:19,25,41:*) Test.*,[...]*.Tmp