Matching and merging data is the one of the common tasks for data analysis that most legacy developers have a template Cobol program ready for this type of task. This is a common approach that I bet most developers out there have one tucked in, ready to be used when needed.
I have no problem with a Cobol solution as it does the job. It is just that there are a lot of steps needed (tweaks on the keys, layout and the data to return, program compile) before I could use the program. For me it is too much especially if it is for a one off adhoc task.
Is there a quicker way?
Well, there is. In fact, there are many options you can choose from - EZtrieve, SAS or REXX to name a few. These tools provide a simpler approach but there is that provides the simplest solution and it involves a very unlikely tool.
Trust SORT to sort it out.
Yes, you heard it. SORT utilities, Syncsort in particular. The jobstep below is an example of how a typical match and merge SORT step will look like.
//JSTEPA1 EXEC PGM=SORT,REGION=0M
//SYSOUT DD SYSOUT=*
//SYSOUA DD SYSOUT=*
//SYSDBOUT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(10,05),RLSE)
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(10,05),RLSE)
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(10,05),RLSE)
//SORTJNF1 DD DISP=SHR,DSN=match.file.one
//SORTJNF2 DD DISP=SHR,DSN=match.file.two
//SORTOUT DD DSN=merge.file.result,
// DISP=(NEW,CATLG,DELETE),DCB=(RECFM=FB,BLKSIZE=0),
// UNIT=SYSDA,SPACE=(CYL,(10,2),RLSE)
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(23,9,A,33,4,A,37,2,A,39,2,A)
JOINKEYS FILE=F2, FIELDS=(1,9,A,11,4,A,16,2,A,19,2,A)
REFORMAT FIELDS=(F1:1,68,F2:36,4,41,2,44,2,F1:77,271)
SORT FIELDS=COPY
END
This job has almost the same setup as any SORT step in a JCL. The differences are highlighted below.
- Two input files. Unlike your regular SORT step, this has two input files. SORTJNF1 and SORTJNF2 refers to the files to be matched. The output files remains the same.
- JOINKEYS FILE=F1,FIELDS/JOINKEYS FILE=F2,FIELDS refers to the keys to be used for matching. As shown, we can specify multiple keys and these keys don't even have to be on the same position. We just have to make sure that the corresponding key for F1 is of the same length as that in F2. The FIELDS parameter follow the following syntax
- REFORMAT FIELDS will contain the fields you need to return from both files. This will comprise the new merge record. The parameter after F1:/F2: refers the fields to be returned from the associated file.
Note: The example shown is very simple. There are other parameters available to address many needs (filtering records to match, keeping unmatched record from files, etc) so it is worthwhile to explore this functionality.
