In this first issue, I’ll cover the following:
- Updates to our Web site.
- IBM-MAIN Listserver address.
- CMOS and COBOL microcode changes.
- IBM’s latest CMOS announcement, the 9672-RY4.
- A reminder about the newest IBM Capacity Planning Redbook.
1. UPDATES TO OUR WEB SITE
During this coming week, we’ll be uploading new course descriptions. The updates to the Application Tuning class came as a result of our first public class in November. We found we had enough material for TWO weeks of classes, so we’ve added and pared to provide the most valuable class we can devise. The date for the second 1997 Application Tuning class was changed last month to September 22-26.
I’m getting excited about the new OS/390 Advanced Performance and Capacity Planning class. (First class: May 5.) There are so-o-o-o-o many new indicators for use by performance analysts and capacity planners that it takes a FULL week just to cover them. There’ll be no time for the beach when you come down to this class!
We’ll also be posting an updated version of the Workload Manager Quickstart Policy that I first described in the Sep/Oct 95 TL. The Quickstart Policy has been so popular that we’ve decided to provide it through our Web site to anyone with Internet access. IBM is planning to include it on their WLM site (http://www.s390.ibm.com/products/mvs/wlm), and Steve Samson has included it in the latest revision of his McGraw-Hill book, “MVS Performance – OS/390 Edition” (due February 1997), ISBN 0-07-057700-5. If you haven’t started to think about migrating to goal mode, now is the time. Installations are often finding that WLM can manage their work even better than their finely tuned IPS/ICS. Deb Soricelli from CIGNA mentioned in a recent TalkLink forum that their nightly batch production cycle had a 9% to 17% ITR improvement after moving to goal mode. What are you waiting for?
2. IBM-MAIN LISTSERVER ADDRESS
On page 9 of our 1994 Issue #4, there is a typo in the address of the listserver for IBM-MAIN. The correct address for subscription requests is <email@example.com>, not <ua1vm.au.edu>. Joe Connally from The Baptist Health System in Birmingham, Alabama, pointed out the typo and indicated that it was bad timing on our part, since “ua” is the server at The University of Alabama and “au” is Auburn University. Apparently they are huge rivals and a big game was only days away. Oops!
For those of you who didn’t see the article, we described how to sign on to the listserver forum IBM-MAIN: send an email message to <firstname.lastname@example.org> with a single line in the body of the message that provides your name in the form:
sub IBM-MAIN your name
Once subscribed, you’ll get 20 to 50 emails a day relating to MVS questions. If you prefer instead to have them summarized and sent in a single, daily piece of email, you can subscribe as above and then subscribe to the digest form by sending email to the same address with the one-line message:
set ibm-main digest
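If you'd rather script the two requests than type them, here is a minimal sketch in Python that builds (but does not send) the two one-line messages described above. The address constant and the function names are placeholders of my own for illustration; substitute the real listserver address, and hand the finished message to whatever mail transport you normally use.

```python
from email.message import EmailMessage

# Placeholder only: replace with the real IBM-MAIN listserver address.
LISTSERV_ADDRESS = "listserv@example.org"


def listserv_command(sender: str, command: str) -> EmailMessage:
    """Build a message whose body is a single listserver command line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTSERV_ADDRESS
    # The listserver reads the command from the message body.
    msg.set_content(command)
    return msg


def subscribe(sender: str, full_name: str) -> EmailMessage:
    """Subscription request: 'sub IBM-MAIN your name'."""
    return listserv_command(sender, f"sub IBM-MAIN {full_name}")


def request_digest(sender: str) -> EmailMessage:
    """Switch an existing subscription to the daily digest form."""
    return listserv_command(sender, "set ibm-main digest")


if __name__ == "__main__":
    # Hypothetical subscriber, shown only to display the message bodies.
    print(subscribe("jane@example.com", "Jane Doe").get_content())
    print(request_digest("jane@example.com").get_content())
```

Send the subscription request first; the digest request only makes sense once you are subscribed.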
3. CMOS AND COBOL
About a year after the first IBM 9672 Rx1 CMOS processors came out, some customers found that a few COBOL programs were taking much, much longer than anticipated. The problem was found to be due to microcode differences in the packed decimal instructions (such as CVB and CVD), and it shows up in any highly repetitive calculations with packed decimal fields. One example of highly repetitive calculations occurs when using COBOL non-binary subscripts. When I asked some IBM developers (toward the end of October) whether the CMOS problem with CVB/CVD and COBOL subscripts was fixed in the Rx4 machines, I was told ‘no’. That was the basis for my statement on page 14 in Issue #5 about the problem not being resolved.
Then I heard that IBM reps were saying the problem had been resolved. Since that was a conflict, I checked it out. It’s a matter of what you mean by “resolved.” No, the entire problem has not been resolved, but they have made some enhancements in the Rx4 and microcode changes for Rx2 and Rx3 to reduce the impact. Gary Hall from IBM’s WSC points out that WSC Flash #9608, the original flash describing the overhead, was updated on 28 October, 1996. The flash now indicates that microcode enhancements made to the 9672-Rx4 processors reduce the extent of the original problem. The microcode enhancements are also available for the 9672-Rx2 and Rx3 machines at microcode level 137 or higher.
While the microcode changes have certainly reduced the degradation, you can still expect to see more degradation than you would expect from LSPR numbers. Let me show you what I mean.
The original timings (without microcode):
Table Manipulated with Subscripts
COMPILER    ES/9000-972   ES/9672-E04   ITR          ITR (my estimate
OPTION                                  (my calc.)   w/microcode) **
TRUNC=BIN   6 seconds     45 seconds    7.5          5.3
TRUNC=STD   4 seconds     18 seconds    4.5          3.8
TRUNC=OPT   3 seconds     15 seconds    5.0          4.1

Table Manipulated with Indexes
COMPILER    ES/9000-972   ES/9672-E04   ITR          ITR (my estimate
OPTION                                  (my calc.)   w/microcode) **
TRUNC=BIN   2 seconds     11 seconds    5.5          4.7
TRUNC=STD   2 seconds     11 seconds    5.5          4.7
TRUNC=OPT   2 seconds     11 seconds    5.5          4.7

** ‘ITR (my estimate w/microcode)’ is described later.
I created the column ‘ITR (my calc.)’ by simply calculating the increase in CPU time. (An ITR of 5.0 indicates that the CPU time on the CMOS machine is 5 times, or 500% of, the ES/9000 time.) If you use IBM’s LSPR (Large Systems Performance Reference) ITR ratings for the CB84 and CBW2 workloads, you might expect a COBOL program to take about 4.25 times as much CPU, instead of 5.0 to 7.5. That means a 29% increase (4.25 to 5.5) for COBOL indexes and an increase of 6% to 76% for subscripts. These increases are based on the WSC flash timings. Some customers have told me that they saw timings that indicated over 200% increase from their expectations. This problem is most noticeable in installations that do chargeback to their users, since the increased CPU times affect the users’ bills.

Timings with microcode enhancements (from the updated WSC Flash):
Table Manipulated with Subscripts (on 9672-Rx3)
COMPILER    MCL Level     MCL Level     Reduction
OPTION      124           141           (my calc.)
TRUNC=BIN   40 seconds    28 seconds    30%
TRUNC=STD   13 seconds    11 seconds    15%
TRUNC=OPT   11 seconds    9 seconds     18%

Table Manipulated with Indexes (on 9672-Rx3)
COMPILER    MCL Level     MCL Level     Reduction
OPTION      124           141           (my calc.)
TRUNC=BIN   7 seconds     6 seconds     14%
TRUNC=STD   7 seconds     6 seconds     14%
TRUNC=OPT   7 seconds     6 seconds     14%
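For anyone who wants to check the ‘Reduction (my calc.)’ column, or run the same calculation on their own before-and-after timings, here is a minimal sketch in Python. The timings are copied from the two Rx3 tables above; the dictionary layout and the function name are just my choices for illustration.

```python
# CPU seconds on a 9672-Rx3, from the updated WSC Flash tables:
# (table type, compiler option) -> (MCL level 124, MCL level 141)
RX3_TIMINGS = {
    ("subscripts", "TRUNC=BIN"): (40, 28),
    ("subscripts", "TRUNC=STD"): (13, 11),
    ("subscripts", "TRUNC=OPT"): (11, 9),
    ("indexes", "TRUNC=BIN"): (7, 6),
    ("indexes", "TRUNC=STD"): (7, 6),
    ("indexes", "TRUNC=OPT"): (7, 6),
}


def reduction_pct(before: float, after: float) -> float:
    """Percentage of CPU time saved going from 'before' to 'after'."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for (table, option), (mcl124, mcl141) in RX3_TIMINGS.items():
        pct = reduction_pct(mcl124, mcl141)
        print(f"{table:<10} {option}  {mcl124:>2}s -> {mcl141:>2}s  {pct:.0f}%")
```

The published timings are whole seconds, so the percentages are only as precise as that rounding allows, especially for the 7-second and 6-second index runs.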
Unfortunately, the WSC timings weren’t run on the same machine with the same test code, so we can’t tell exactly how much of the change is due to the microcode. But we can see that, while there is certainly an improvement with the microcode, not all of the overhead has been removed. If I were to apply the savings from the microcode change on the Rx3 to the Rx1 (which I’m not sure is valid), the resulting factors are the ones shown in the first chart under ‘ITR (my estimate w/microcode)’. That chart shows that, if the same savings hold true, some of the workloads could still be taking 25% more CPU than anticipated (5.3 from my estimate versus 4.25 from LSPR).
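The chain of arithmetic behind the estimate can be sketched in a few lines of Python. This is only an illustration of the reasoning above, with function names of my own choosing; it carries the same caveat that applying the Rx3 savings to the earlier timings may not be valid.

```python
# Expected CMOS-to-ES/9000 CPU time ratio from LSPR, as quoted in the text.
LSPR_EXPECTED = 4.25


def itr_ratio(es9000_secs: float, cmos_secs: float) -> float:
    """'ITR (my calc.)': how many times longer the job ran on CMOS."""
    return cmos_secs / es9000_secs


def with_microcode(ratio: float, mcl124_secs: float, mcl141_secs: float) -> float:
    """Scale a ratio by the savings seen on the Rx3 (an assumption)."""
    return ratio * mcl141_secs / mcl124_secs


def excess_over_lspr(ratio: float) -> float:
    """Percent more CPU than the LSPR expectation."""
    return 100.0 * (ratio / LSPR_EXPECTED - 1.0)


if __name__ == "__main__":
    # TRUNC=BIN with subscripts: 6s on the ES/9000, 45s on the 9672,
    # and 40s -> 28s on the Rx3 across the microcode change.
    original = itr_ratio(6, 45)
    estimate = with_microcode(original, 40, 28)
    print(f"original ratio {original:.2f}, "
          f"{excess_over_lspr(original):.0f}% over LSPR")
    print(f"estimate w/microcode {estimate:.2f}, "
          f"{excess_over_lspr(estimate):.1f}% over LSPR")
```

Unrounded, the TRUNC=BIN estimate works out to 5.25, which is about 23.5% over the LSPR expectation; the 25% quoted above comes from using the rounded 5.3.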
My recommendations from this are:
a) If you have a 9672-Rx2 or -Rx3, order the microcode enhancements as soon as possible to obtain CPU reductions for many of your programs.
b) Ensure that COBOL programmers are aware of the benefits of using indexes rather than subscripts.
c) Ensure that COBOL programmers are aware of the performance differences between the compiler options for TRUNC (TRUNC=BIN is the default).
d) Run evaluations on COBOL programs (or other programs that do a lot of data conversion) after any move to CMOS processors.
4. IBM’S LATEST CMOS ANNOUNCEMENT, THE 9672-RY4
On November 26, IBM announced one more CMOS Generation 3 model. This is another 10-way, similar to the previously announced RX4, but with a uni-processor speed that’s about 9.8% faster than the RX4. The specs are listed below in the same format as we distribute with our CPU Chart. We’ll send MIPS (average, minimum, maximum, and MIPS per CPU) to subscribers. Simply send an email to Doni Richardson with either your company and location name or the name of the subscriber.
Model      # CPUs   SU/SEC    Uni SUs   Proc Grp   MSUs   Version
9672-RY4   10       1790.11   2486.79   80         64     5B
5. REMINDER ABOUT NEWEST IBM CAPACITY PLANNING REDBOOK
I mentioned a new redbook called Capacity Planning for Parallel Sysplex, SG24-4680, in our TL #4 on page 3, but I’d like to mention it here again for emphasis. It contains answers to many of the most common questions I get today, such as how to calculate capture ratio and how to determine the effective speed of a new processor. I think IBM must have it wrong, but currently they’re only charging $7.15 for this 180-page manual that will be extremely valuable to any capacity planner.
That’s all for today! Don’t forget to send comments!