Threads=number
Specifies the number of reader threads to be used by the copy
process.
RMU creates so-called internal 'threads' of execution, each of
which reads data from one specific storage area. Threads run
quasi-parallel within the process executing the RMU image. Each
thread generates its own I/O load and consumes resources such as
virtual address space and process quotas (for example, FILLM and
BYTLM). The more threads, the more I/Os can be generated at one
point in time and the more resources are needed to accomplish the
same task.
Performance increases with more threads because the parallel
activity keeps the disk drives busier. However, beyond a certain
number of threads, performance suffers because the disk I/O
subsystem is saturated and I/O queues build up for the disk
drives. The extra CPU time spent scheduling the additional
threads also reduces overall performance. Typically, 2 to 5
threads per input disk drive are sufficient to drive the disk I/O
subsystem at its optimum. However, some controllers may be
able to handle the I/O load of more threads, for example disk
controllers with RAID sets and extra cache memory.
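The rule of thumb above can be turned into a starting range for the
Threads value. The following Python sketch is illustrative only: the
function name and its parameters are hypothetical and are not part
of RMU. It simply multiplies the number of input disk drives by the
2 and 5 threads per drive quoted above.

```python
def suggested_thread_range(input_disks, low_per_disk=2, high_per_disk=5):
    """Return a (low, high) starting range for the Threads value.

    Assumption: applies only the "2-5 threads per input disk drive"
    rule of thumb. It does not model controllers, caches, or quotas.
    """
    if input_disks < 1:
        raise ValueError("at least one input disk drive is required")
    return input_disks * low_per_disk, input_disks * high_per_disk


if __name__ == "__main__":
    low, high = suggested_thread_range(3)
    print(f"3 input disks: try between {low} and {high} threads")
```

Treat the result as a starting point for measurement, not as a
tuned value. Controllers with RAID sets or extra cache memory may
tolerate more threads than the upper bound suggests.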
In a copy operation, one thread moves the data of one storage
area at a time. If there are more storage areas to be copied than
there are threads, then the next idle thread takes on the next
storage area. Storage areas are copied in order of area size,
largest areas first. This optimizes the overall elapsed time
by allowing other threads to copy smaller areas while an earlier
thread is still working on a large area. If the Threads qualifier
is not specified, then 10 threads are created by default. The minimum
is 1 thread and the maximum is the number of storage areas to be
copied. If the user specifies a value larger than the number of
storage areas, then RMU silently limits the number of threads to
the number of storage areas.
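The scheduling described above can be sketched in Python. This is a
simplified model, not RMU's implementation: the function names are
hypothetical, and the copy time of an area is assumed to be
proportional to its size. The sketch applies the default of 10
threads, limits the count to the range from 1 to the number of
storage areas, and hands the largest remaining area to the next
idle thread.

```python
import heapq

DEFAULT_THREADS = 10  # default stated in the text above


def effective_threads(requested, area_count):
    """Limit the requested thread count to the range 1..area_count."""
    if requested is None:
        requested = DEFAULT_THREADS
    return max(1, min(requested, area_count))


def simulate_copy(area_sizes, requested=None):
    """Simulate a largest-first copy; return (elapsed, assignment).

    area_sizes maps an area name to its size. Assumption: the time
    to copy an area equals its size in arbitrary units.
    """
    threads = effective_threads(requested, len(area_sizes))
    # Largest areas first; ties broken by name so runs are repeatable.
    queue = sorted(area_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    # Heap of (time at which the thread becomes idle, thread id).
    idle = [(0, t) for t in range(threads)]
    heapq.heapify(idle)
    assignment = {t: [] for t in range(threads)}
    elapsed = 0
    for name, size in queue:
        free_at, thread = heapq.heappop(idle)  # next idle thread
        assignment[thread].append(name)
        done = free_at + size
        elapsed = max(elapsed, done)
        heapq.heappush(idle, (done, thread))
    return elapsed, assignment


if __name__ == "__main__":
    areas = {"A": 100, "B": 40, "C": 30, "D": 20}
    for n in (1, 2):
        elapsed, plan = simulate_copy(areas, requested=n)
        print(f"threads={n} elapsed={elapsed} plan={plan}")
```

In this model, with two threads the second thread copies the three
smaller areas while the first is still busy with the largest one, so
the elapsed time is bounded by the largest area rather than by the
sum of all areas.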
For a copy operation, you can specify a Threads value as low as
1. A value of 1 generates the smallest system load in terms of
working set usage and disk I/O load. However, most disk I/O
subsystems can handle higher I/O loads, so a value slightly
larger than 1 typically results in a shorter execution time.