Wednesday, November 28, 2012

MVS Utilities Reviewer Part 2


AN OVERVIEW ON SYNCSORT 

1. OVERVIEW


Syncsort is a product of Syncsort Inc. that can sort data, merge data, selectively process data, reformat data, create summary records and produce extensive reports from input data. It can also perform any combination of these functions. This document briefly explains the main items involved in processing data with Syncsort. It describes only a subset of the Syncsort functions; it is not a replacement for the product manual, and the author does not guarantee the exactness of details or syntax.

2. SYNCSORT COMMANDS

2.1 SORT
The SORT statement can be used to sort a dataset or a concatenation of datasets. It requires the sort sequence for the data, given as a list of key fields and their formats. The output records are written in the specified sequence. If multiple records contain the same sort key, the options specified (see EQUALS and NOEQUALS under the OPTION statement) determine whether the input order is maintained for those records. The format of the SORT statement is as follows.
SORT FIELDS=({begcol},{length},{fieldtype},{D|A}[,{begcol},{length},{fieldtype},{D|A}]...)

or
SORT FIELDS=({begcol},{length},{D|A}[,{begcol},{length},{D|A}]...),FORMAT={fieldtype}
The beginning column is specified in bytes, starting with 1 for the first byte. The length of the field must be specified in bytes, irrespective of the field type. There are many field types. The frequently used ones are CH for character, BI for unsigned binary (COMP fields of COBOL; signed COMP fields usually need FI, fixed-point signed binary, instead), PD for packed decimal (COMP-3 fields of COBOL), ZD for zoned decimal and AQ for alternate collating sequence (refer to the later section on alternate collating sequence). The second form of the SORT statement can be used when all the fields specified for the sort sequence are of the same type.
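As a minimal sketch of the first form, assume a hypothetical fixed-length record with a 10-byte character customer key in columns 1-10 and a 5-byte packed-decimal amount in columns 11-15. This layout is invented for illustration only.

```jcl
//SYSIN    DD *
* Ascending by customer key, then descending by amount within key.
  SORT FIELDS=(1,10,CH,A,11,5,PD,D)
/*
```

Control statements are coded starting in column 2 or later, and a line with an asterisk in column 1 is treated as a comment. If both keys were character fields, the second form could be written along the lines of SORT FIELDS=(1,10,A,21,8,D),FORMAT=CH.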
2.2 MERGE
The MERGE statement can be used to merge two or more pre-sorted datasets. It requires the sort sequence for the data, given as a list of key fields and their formats. The format of the MERGE statement is as follows.
MERGE FIELDS=({begcol},{length},{fieldtype},{D|A}[,{begcol},{length},{fieldtype},{D|A}]...)

or
MERGE FIELDS=({begcol},{length},{D|A}[,{begcol},{length},{D|A}]...),FORMAT={fieldtype}
Please refer to the SORT statement above for a description of the operands.
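A minimal sketch of a merge, assuming two hypothetical datasets (the names are placeholders) that are each already sorted in ascending order on a character key in columns 1-10:

```jcl
//* Each input must already be in the sequence named on MERGE.
//SORTIN01 DD DSN=HLQ.SALES.EAST.SORTED,DISP=SHR
//SORTIN02 DD DSN=HLQ.SALES.WEST.SORTED,DISP=SHR
//SYSIN    DD *
  MERGE FIELDS=(1,10,CH,A)
/*
```

The key description on MERGE has to match the sequence the inputs were actually sorted in; otherwise the step is likely to fail with a sequence error or produce output that is not in order.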
2.3 INCLUDE AND OMIT
The INCLUDE statement can be used to specify the conditions for inclusion of records from the input during processing. The OMIT statement can be used to specify the conditions for exclusion of records from the input during processing. The two statements cannot be used together. The different formats of the INCLUDE and OMIT statements are shown below.
INCLUDE COND=({begcol1},{length1},{fldtype1},{comp.oper},{begcol2},{length2},{fldtype2})
The above statement can be used to compare two fields within the same record. The valid comparison operators are EQ, NE, GT, LT, LE and GE.
INCLUDE COND=({begcol},{length},{fldtype},{comp.oper},{constant})
The above statement can be used to compare a field in the record with a character, decimal or hexadecimal constant.
Using the convention that {cond.stmt} is the part of the statement between the parentheses in either of the statements mentioned above, compound statements can be constructed as follows.
INCLUDE COND=({cond.stmt}[,{OR|AND},{cond.stmt}]...)
Parentheses can be used to group conditional statements to form complex conditions. For OMIT statements, replace the INCLUDE verb with the OMIT verb in the above examples.
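The following sketch shows a compound condition with grouping. It assumes a hypothetical layout with a 2-byte state code in columns 1-2 and a 5-byte packed-decimal amount in columns 11-15; the values NY, NJ and 1000 are arbitrary.

```jcl
//SYSIN    DD *
* Keep records for NY or NJ whose amount is greater than 1000.
  INCLUDE COND=((1,2,CH,EQ,C'NY',OR,1,2,CH,EQ,C'NJ'),AND,
                11,5,PD,GT,1000)
  OPTION COPY
/*
```

A statement is continued by ending the line after a comma and resuming on the next line. Character constants are written as C'...' and hexadecimal constants as X'...'. A comparison between two fields of the same record would look like INCLUDE COND=(21,8,CH,GT,31,8,CH).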
2.4 INREC AND OUTREC
The INREC statement can be specified to reformat the input records before SYNCSORT processes them for SORT or MERGE. The OUTREC statement can be specified to reformat the processed records into the required layout for the output records or the report that is generated. The formats of the INREC and OUTREC statements are similar.
INREC FIELDS=([{outpos}:]{begcol},{length}[,[{outpos}:]{begcol},{length}]...)
The {outpos} value is optional. It specifies the position in the output record where the field must be placed. The default is to place the field at the current position in the output record, with the first field specified placed at column 1.
Individual numeric fields can also be converted from any numeric format to an edited, printable character form. An example syntax is given below.
INREC FIELDS=({begcol},{length},{fldtype},M#)
or
INREC FIELDS=({begcol},{length},{fldtype},EDIT=(SIII,IIT.TT))
Default picture clauses (edit masks) are provided and named M0 to M9; the # above stands for the mask number. Customized picture clauses can be specified with the EDIT parameter. In the example above, the S character stands for the sign position, I is the equivalent of Z in a COBOL PIC clause and T is the equivalent of 9 in a PIC clause. The characters displayed for the sign of numeric fields can be specified after the edit parameter, using the SIGNS parameter.
Constant fields can also be introduced in the record. For example, spaces or zeroes can be placed in the record at specific positions.
NOTE: When using INREC fields, the column positions and lengths in the SORT, MERGE and SUM statements must reflect the output from the INREC processing.
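A sketch of how the pieces fit together, assuming fixed-length input with a hypothetical 10-byte key in columns 21-30 and a 5-byte packed-decimal amount in columns 41-45. The exact EDIT and SIGNS syntax shown is an assumption that should be checked against the manual for the installed release.

```jcl
//SYSIN    DD *
* Shrink the record before sorting: key to cols 1-10, amount to
* cols 11-15, then 5 blanks starting at col 16.
  INREC FIELDS=(1:21,10,11:41,5,16:5X)
* SORT positions refer to the reformatted record, not the input.
  SORT FIELDS=(1,10,CH,A)
* Print the key, one blank, and the amount as an edited number
* with a leading minus sign for negative values.
  OUTREC FIELDS=(1,10,X,11,5,PD,EDIT=(SI,III,IIT.TT),SIGNS=(,-))
/*
```

Here nX inserts n blanks. Other constants, such as zeroes, can be inserted in the same way as literals.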
2.5 SUM
The SUM statement can be used to summarize data based on the keys in the SORT statement. One record is produced for each unique key present in the input. If numeric fields are specified for summation, those fields are added together across the records that share a key. SYNCSORT does not guarantee a single record per key when numeric fields are summed: whenever a summed field would overflow, more than one record may be written for that key. The syntax is as follows.
SUM FIELDS=NONE

or
SUM FIELDS=({begcol},{length},{fldtype}[,{begcol},{length},{fldtype}]...)
The first format is used when records with duplicate keys need to be removed and no numeric summation is required. The second format is used when numeric fields must be summed across records with duplicate keys.
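As an illustration, assume a hypothetical record with a customer key in columns 1-10 and a 5-byte packed-decimal amount in columns 11-15. The statements below would produce one record per customer with the amounts added together, subject to the overflow caveat above.

```jcl
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,5,PD)
/*
```

Replacing the second statement with SUM FIELDS=NONE would instead keep a single record per customer key and sum nothing.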

2.6 ALTERNATE COLLATING SEQUENCE

The alternate collating sequence can be specified using the ALTSEQ statement. It lets the user sort records in a sequence other than the standard EBCDIC collating sequence. This may be required when the fields the user intends to sort on should not be ordered by their EBCDIC character codes.
ALTSEQ CODE=({hexcode}{newhexcode}[,{hexcode}{newhexcode}]...)
During sorting or merging, each {hexcode} is compared as if it were the corresponding {newhexcode}. The substitution applies to comparisons only; the new hex code does not replace the original hex code in the output or the reports.
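A small sketch, assuming a hypothetical character key in columns 1-10 in which a lowercase "a" should sort together with an uppercase "A". In EBCDIC, lowercase a is X'81' and uppercase A is X'C1'.

```jcl
//SYSIN    DD *
* Compare X'81' (a) as if it were X'C1' (A); output is unchanged.
  ALTSEQ CODE=(81C1)
* The key must be given type AQ for the alternate sequence to apply.
  SORT FIELDS=(1,10,AQ,A)
/*
```

Further pairs can be added inside the parentheses, separated by commas, to cover the rest of the alphabet.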

2.7 OPTION STATEMENT

The OPTION statement can be used to control parameters during SORT, MERGE and SUM processing.

2.7.1 EQUALS AND NOEQUALS

The default, NOEQUALS, specifies that SYNCSORT need not retain the input order of records that have duplicate keys. OPTION EQUALS should be used if the input order of such records must be maintained during SORT processing.
This parameter also affects the non-summed data in SUM processing. When EQUALS is used, the data in the non-key fields is taken from the first input record for each key value. When NOEQUALS is in effect, the data in the non-key fields is unpredictable.
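For example, to remove records with duplicate keys while keeping the first record read for each key, a sketch (assuming a hypothetical key in columns 1-10) could look like this:

```jcl
//SYSIN    DD *
  OPTION EQUALS
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*
```

Without OPTION EQUALS, any one of the records sharing a key could be the one that survives.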

2.7.2 RECORD

The RECORD parameter of the OPTION statement can be used to specify whether the input data to be processed consists of variable-length or fixed-length records. This is required when both the input and the output of the SORT or MERGE processing are VSAM files. The valid values are RECORD=V and RECORD=F.

2.7.3 SKIPREC

The SKIPREC parameter of the OPTION statement can be used to specify the number of records of input to skip before any processing should begin. SKIPREC=20 specifies that the first 20 records of the input must be skipped.

2.7.4 STOPAFT

The STOPAFT parameter of the OPTION statement can be used to specify the number of records to be included for processing. STOPAFT=100 specifies that SYNCSORT stops taking any more input once 100 records that match the criteria have been selected.
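The two parameters can be combined on one OPTION statement. A sketch, assuming a hypothetical key in columns 1-10:

```jcl
//SYSIN    DD *
* Skip the first 20 input records, then accept the next 100.
  OPTION SKIPREC=20,STOPAFT=100
  SORT FIELDS=(1,10,CH,A)
/*
```

With no INCLUDE or OMIT statement present, this should sort input records 21 through 120 and ignore the rest, which is handy for testing against a small slice of a large file.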

2.7.5 COPY

The COPY parameter can be used if a simple copy operation is required. It is the ideal choice when neither SORT nor MERGE processing is needed. Combined with SKIPREC and STOPAFT, it copies a selected range of records to the output based on record counts. Combined with INCLUDE or OMIT condition statements, it copies records selected by specific conditions on field values. Combined with INREC or OUTREC processing (INREC is more efficient in this case), it produces reformatted output.
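A sketch of a selective, reformatting copy, assuming a hypothetical fixed-length record with a state code in columns 1-2 and a 5-byte amount in columns 41-45:

```jcl
//SYSIN    DD *
  OPTION COPY
* Copy only the NY records ...
  INCLUDE COND=(1,2,CH,EQ,C'NY')
* ... and keep just cols 1-10 and 41-45 of each one.
  INREC FIELDS=(1,10,41,5)
/*
```

No SORT or MERGE statement is coded, so the selected records are written in their input order.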

2.8 OUTFIL

The OUTFIL statement can be used to produce multiple output datasets. This statement must be used if elaborate formatting is required, such as producing reports. One OUTFIL statement is required for each output dataset. The FILES parameter specifies the DD name suffix to be used for the output dataset.
Each output dataset can have its own INCLUDE or OMIT condition and its own OUTREC reformatting. Further, report formatting is available, including three levels of headers (HEADER1 to HEADER3), three levels of trailers (TRAILER1 to TRAILER3), summation, section processing and section breaks. For example, this statement can help create a separate report for each department in a different dataset or sysout and route each one to the appropriate destination.
The SYNCSORT manual should be consulted if the OUTFIL statement is required.
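As a rough sketch of splitting one input into two outputs, assume a hypothetical 3-byte department code in columns 1-3 and a 10-byte key in columns 4-13. The example also assumes that the output DD names are formed as SORTOF followed by the FILES value, which should be verified against the manual.

```jcl
//SORTOF01 DD SYSOUT=*
//SORTOF02 DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=(4,10,CH,A)
* Each OUTFIL selects its own records from the sorted data.
  OUTFIL FILES=01,INCLUDE=(1,3,CH,EQ,C'D01')
  OUTFIL FILES=02,INCLUDE=(1,3,CH,EQ,C'D02')
/*
```

On OUTFIL the condition is coded as INCLUDE=(...) or OMIT=(...), rather than with the COND= keyword used on the standalone statements.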

3. JCL REQUIREMENTS

The different DD statements required for the SORT step are as follows.

SYSIN It should point to the SYNCSORT control statements mentioned above.
SYSOUT It should point to a dataset or SYSOUT. This is where the SYNCSORT messages are placed.
SORTIN This dataset should point to the input dataset(s) for the sort process.
SORTIN## These statements should refer to the individual datasets to be merged. These individual datasets are required to be in pre-sorted order.
SORTOUT This dataset should point to the dataset where the output must be placed.
SORTOF# These datasets should point to the individual output datasets referred to in the OUTFIL statements. The suffix is the value given in the FILES parameter.
SORTOF## The same as SORTOF#, with a two-character suffix. There must be a one-to-one correspondence between the FILES parameter values in the OUTFIL statements and the DD statements specified.
SORTWK## These statements should refer to temporary volumes with appropriate space parameters depending on the volume of data to be processed.
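Putting the DD statements together, a minimal sort step might look like the sketch below. The dataset names, space values and unit names are placeholders, and the program name (SYNCSORT here; SORT is also commonly used) depends on how the product is installed at a given site.

```jcl
//SORTSTEP EXEC PGM=SYNCSORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=HLQ.INPUT.DATA,DISP=SHR
//SORTOUT  DD DSN=HLQ.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,5),RLSE)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(10,5))
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

For a merge, the SORTIN statement would be replaced by SORTIN01, SORTIN02 and so on, and the SORTWK## statements would not be needed.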

4. PROCESSING ORDER

The processing order of the control statements by SYNCSORT is as follows.
                1. INCLUDE or OMIT condition statement processing.
                2. INREC statement processing.
                3. SORT, MERGE or COPY processing (including alternate collating sequence processing for SORT and MERGE).
                4. SUM statement processing.
                5. OUTREC processing.
The processing order changes significantly if an OUTFIL statement is present in the SYSIN of SYNCSORT. The processing becomes considerably more complex when the OUTFIL statements carry different reformatting parameters and different INCLUDE or OMIT conditions.
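The practical effect of the five-step order (without OUTFIL) is on which column positions each statement must use. The sketch below assumes a hypothetical fixed-length input record with a key in columns 21-30, a state code in columns 41-42 and a 5-byte packed-decimal amount in columns 51-55.

```jcl
//SYSIN    DD *
* 1. INCLUDE sees the input record: state code in cols 41-42.
  INCLUDE COND=(41,2,CH,EQ,C'NY')
* 2. INREC keeps input cols 21-30 and 51-55 as new cols 1-15.
  INREC FIELDS=(21,10,51,5)
* 3-4. SORT and SUM use positions in the INREC output.
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=(11,5,PD)
* 5. OUTREC also reads the INREC layout; it appends 5 blanks.
  OUTREC FIELDS=(1,15,5X)
/*
```

The statements are processed in the order listed above regardless of the order in which they are coded in SYSIN.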
