NameSearch Product Information Batch vs. On-Line
The significant difference between batch and on-line applications
is the amount of automation needed to process a transaction. Batch
processes must rely on mechanical techniques to decide the fate of
a transaction record, whereas on-line systems depend on human
interaction.
The performance criterion for on-line systems is response time.
The limiting factor for response time is the amount of I/O performed
per name inquiry. NameSearch makes it possible to cluster records
on name keys, yielding optimal utilization of I/O resources. In
many cases it is affordable to use large search ranges and eliminate
unlikely candidates using the comparison routine. This method allows
the greatest degree of name variation to be processed while displaying
only the most likely candidates. The extra CPU expense incurred by
comparing the records in a larger set is nominal, since only a single
entry per transaction is being processed.
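The wide-range-then-filter approach described above can be sketched
as follows. This is a minimal illustration, not NameSearch's actual
interface: the sorted list stands in for records clustered on a name
key, candidates_in_range and online_search are hypothetical helpers,
the 0.8 threshold is an arbitrary assumption, and Python's
difflib.SequenceMatcher is used as a stand-in for the comparison
routine.

```python
import bisect
from difflib import SequenceMatcher

# Hypothetical records, kept in key order as a table clustered on a
# name key would be.
NAMES = sorted([
    "JENSEN MARK",
    "JOHNSEN MARIE",
    "JOHNSON MARY",
    "JOHNSTON MARY",
    "JONES MARY",
    "SMITH MARY",
])


def candidates_in_range(names, start, end):
    """Return every key with start <= key <= end as one contiguous slice."""
    lo = bisect.bisect_left(names, start)
    hi = bisect.bisect_right(names, end)
    return names[lo:hi]


def similarity(a, b):
    """Stand-in comparison routine: a ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def online_search(names, query, start, end, threshold=0.8):
    """Read a wide range, then keep only the likely candidates."""
    scored = [(similarity(query, name), name)
              for name in candidates_in_range(names, start, end)]
    # Best matches first; anything under the threshold is not displayed.
    return [name for score, name in sorted(scored, reverse=True)
            if score >= threshold]


# One inquiry: a deliberately wide range ("J" through "K") is read,
# and the comparison routine trims it before display.
results = online_search(NAMES, "JOHNSON MARY", "J", "K")
print(results)
```

Because only one inquiry is processed at a time, the cost of scoring
every record in the wide range is paid once per transaction.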
Like on-line systems, batch utilities that create reports or merge
files need to be miserly with I/O resources. Typically, batch
applications evaluate large input files against the database. The
sequence in which records are processed can greatly affect the amount
of page swapping managed by the database system. Including NameSearch
ranges and sorting the transaction file on the start value minimizes
the number of page swaps and makes the best use of I/O.
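The effect of sorting on page swapping can be illustrated with a
small simulation. Everything here is an assumption made for the
sketch: start values are shown as integers rather than name keys,
each range spans 50 keys, a page holds 100 keys, and the database
buffer holds two pages with least-recently-used replacement. The
function count_page_loads is hypothetical and does not model any
particular database system.

```python
from collections import OrderedDict

PAGE_SIZE = 100     # keys per page (assumed)
BUFFER_PAGES = 2    # pages held in the buffer (assumed)
RANGE_WIDTH = 50    # keys covered by each search range (assumed)


def count_page_loads(range_starts):
    """Count page loads for a sequence of range scans, LRU buffer."""
    buffer = OrderedDict()
    loads = 0
    for start in range_starts:
        first_page = start // PAGE_SIZE
        last_page = (start + RANGE_WIDTH - 1) // PAGE_SIZE
        for page in range(first_page, last_page + 1):
            if page in buffer:
                buffer.move_to_end(page)        # mark as recently used
            else:
                loads += 1                      # page must be read
                buffer[page] = True
                if len(buffer) > BUFFER_PAGES:
                    buffer.popitem(last=False)  # evict least recent
    return loads


# Hypothetical transaction file: one range start value per record.
transactions = [910, 20, 450, 930, 40, 470, 905, 10, 460]

unsorted_loads = count_page_loads(transactions)
sorted_loads = count_page_loads(sorted(transactions))
print(unsorted_loads, sorted_loads)
```

In this toy data the sorted file touches each page once and moves
on, while the unsorted file keeps returning to pages that have
already been evicted.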
Unlike on-line systems, batch processes find large ranges of
candidates very CPU intensive to process. Each record in the
transaction file must be compared with all the records returned for
its set. As the sets get larger, the number of comparisons per
transaction record increases. Multiplied across the entire transaction
file, that increase becomes prohibitive. Batch processes must limit
the size of name search sets in order to avoid becoming CPU bound.
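The arithmetic behind this is simple multiplication: total
comparisons equal the number of transaction records times the
average set size. The figures below are invented purely to show the
scale of the difference and are not measurements of any system.

```python
def total_comparisons(transaction_count, average_set_size):
    """Each transaction is compared with every record in its set."""
    return transaction_count * average_set_size


# Hypothetical figures, chosen only for illustration.
online_inquiry = total_comparisons(1, 500)           # one wide inquiry
batch_small_sets = total_comparisons(1_000_000, 20)  # narrow ranges
batch_large_sets = total_comparisons(1_000_000, 500) # wide ranges

print(online_inquiry, batch_small_sets, batch_large_sets)
```

A set size that is trivial for a single on-line inquiry becomes
hundreds of millions of comparisons when applied to every record of
a large transaction file.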
In both batch and on-line systems, the sizes of ranges, the use of
comparison routines, and the database organization are determined
by the size of the database being searched, performance objectives,
accuracy expectations, and the expense of implementation. As your
database grows, these factors become harder to balance. It is
extremely important to use a realistic test environment to tune your
name search application. An approach that worked well for a small
data sample will perform differently on your production system.