> ... and assembly language programming is largely about a culture and
> a mode of expression shared by a group of specialized people.

Well said. Human readability is more important than mere assemblability.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

> That said, what do people think about things like:
>     BCTR R6,R0
> vs.
>     BCTR R6,0
> I prefer, nay insist on, the latter because you're not really using
> register 0 and it should not be counted by the assembler (or the
> editor's FIND command) as a "reference".

There we agree.

> Problems: you can't use them intelligently in instructions like BXLE
> or MR which use register pairs;

You can if the author has done things systematically instead of haphazardly. Like any tool, EQU is a good servant but a poor master. Used properly, it can save you a lot of time and make the code easier to maintain and more readable; used improperly, it can sink you in a morass.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

> Maybe, but those numbers look awfully bare without the preceding "R"
> to me. What difference does one extra character make especially when
> assembly source is typically 80 column "card" image.

If you don't like the constraint of 80-character "card" images, you can use the input exit ASMAXINV shipped (as sample code) with High Level Assembler Release 2. It allows you to create V-format input to the assembler; the exit then takes care of converting this user-friendly input to the traditional fixed format the assembler digests. (See p.324 of the HLASM Programmer's Guide for details.)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

How about clearing registers? Again, the performance difference won't be dramatic in any case, but as obvious choices we have

    LA  R15,0(0,0)
    XR  R15,R15
    SR  R15,R15
    SLR R15,R15

which can be used interchangeably if the value of the CC does not matter.
I dislike the first because I have a hard time doing in 4 bytes what can be done in 2. Of the remaining choices, I prefer XR because it is like XC, which can be similarly used to initialize a field to binary zeros. On the other hand, the IBM code I've read usually chooses SR or occasionally SLR. I recall reading somewhere that XR should be fastest, but I have never tested this and doubt there is much of a difference.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

> matter. I recall reading somewhere that XR should be fastest

Once upon a time, a long long time ago, IBM used to publish a "Functional Characteristics" book on each of its processors, and in this book they published instruction timings. I recall that SLR was the fastest way.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Speaking of style, let me add my own element. I prefer NOT to put labels on instructions. Labels always go on a statement by themselves, along with a DS 0H (e.g. RETURN DS 0H). This ensures that a label doesn't get deleted when the instruction it would otherwise be attached to is deleted, and it also ensures halfword alignment, unlike "RETURN EQU *". It also makes it easier to comment out sections of code, because column 1 is always blank, so you can just put an asterisk there.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

> All-in-all style, like other things is in the eye of the beholder,
>IMHO the idea is to make the program work.

I can't agree with you on that point. The only time that is true is when there is only a single programmer working on the system and the program will be run once and then thrown away.

> BAL is my first love also, but I take the other view. It's lots
> of relative addressing and registers. I guess it's my scientific
> background, lots of table searching does that to you.
> The code is the only true document.
> Back in the 360 days I would LPSW rather than a B.
>

Many of the things mentioned under "style" also have efficiency implications. If you write an operating system exit that is executed hundreds or thousands of times each day, you had better make sure that you have given some thought to efficiency. Code readability is also important: many of the "one time programs" that I have written have survived longer than those written to be part of a major application system. When there is a problem in the program and you need to look at the source, documentation and consistency make the job much, much easier. While using LPSW instead of a branch may seem obvious to us dinosaurs, some less experienced person may have to pick up the code when there is a problem, and they don't know much about the PSW other than that it's where to look in a dump to see where the program blew up.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

I use not only DS 0H, but also (in large programs, when I'm up to the additional effort) DS 0Y. The DS 0Y goes on labels which are only referenced "locally", i.e., within a few lines of them. For example, the oh-so-common construct:

             TM    SOMEFLAG,SOMEBIT
             BO    SKIPONE
    SKIPONE  DS    0Y

The label on the subroutine would be a DS 0H because it's referenced elsewhere. This isn't perfect, of course, but if followed, it helps avoid extra searching for "Who else might get us to this line?"

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

>if the instruction stream is terminated by an unconditional branch, as
>it is almost certain to be. Sure, there may be a level at which bytes
>are blindly fetched ahead of the instruction counter, but it won't last
>long - the I- unit will soon realize that it's a bad place to be fetching
>from.

While the I-unit should be able to detect an unconditional branch, that is only part of the problem. Cache lines are a good deal longer than instructions are, so you end up with cache lines containing both instructions and data.
If the data changes, the entire line is modified. This isn't really much of a problem for a set of instructions that are only executed a few times, but within loops this could be noticeable, especially if the data is modified. An annoying aspect of this problem is that it can get better or worse depending on how the code is aligned relative to cache line boundaries, so making an unrelated change in the program or running on a different processor may unexpectedly degrade performance.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

>From: "Shmuel (Seymour J.) Metz"
> > Once upon a time, a long long time ago IBM used to publish a
> > "Functional Characteristics" book on each of its processors and
> > in this book they published instruction timings. I recall that SLR
> > was the fastest way.
> >
> > Ken (kgunther@delphi.com)
>
>Only on specific models; on some XR was faster and on some LA was
>faster. BTW, they still publish the "funky specs", but they no longer
>contain timings. In the last one (370/168) that had timings, you had
>timing formulae rather than simple numbers, and I'm sure that if they
>published timing information on current models you wouldn't want to
>drop one of the manuals on your foot.
>

Questions of style can't be answered definitively, since style is subjective, but timings can be measured. To see how LA, XR, SR, and SLR compare for clearing a register, I wrote a program to generate and then time the execution of N successive copies of each instruction. Varying N from 50000 to 1000000, I got the following results:

                 9021-580, CMS version    9672, MVS version
            N      LA      others           LA      others
       50 000      27        15             16        14
      100 000      33        15             21        13
      200 000      32        16             22        16
      500 000      32        16             22        17
    1 000 000      32        17             22        17

All times are in ns, and hopefully there are no errors... That is, XR, SR, and SLR are practically the same (the run-to-run variation was more than the time differences), and take about 1/2 to 3/4 of the time of LA. The increased time with longer series may be due to cache delays. (Note that LA is twice as long, 4 bytes versus 2, so it is affected first.)
I was planning to append the program here, but it grew to almost 500 lines, so I thought it might be better not to. However, if anyone would like a copy, send me e-mail, and I'll send one back. The program is conditionally assembled based on &SYSTEM_ID, and should work in CMS or MVS without modification.
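
As a quick check on the "1/2 to 3/4" figure, the ratios can be recomputed from the timing table above. The following is only a minimal Python sketch of that arithmetic, not the assembler timing program described in the post; the names TIMINGS, ratios, and code_bytes are made up for illustration, and the 4-byte and 2-byte instruction lengths are the ones mentioned earlier in the thread.

```python
# Times in ns, copied from the table in the post.
# Layout: machine -> {N: (LA time, time for XR/SR/SLR "others")}
TIMINGS = {
    "9021-580 (CMS)": {
        50_000: (27, 15), 100_000: (33, 15), 200_000: (32, 16),
        500_000: (32, 16), 1_000_000: (32, 17),
    },
    "9672 (MVS)": {
        50_000: (16, 14), 100_000: (21, 13), 200_000: (22, 16),
        500_000: (22, 17), 1_000_000: (22, 17),
    },
}


def ratios(rows):
    """Return {N: others_time / LA_time} for one machine."""
    return {n: others / la for n, (la, others) in rows.items()}


def code_bytes(n, instruction_length):
    """Size in bytes of N back-to-back copies of one instruction."""
    return n * instruction_length


for machine, rows in TIMINGS.items():
    r = ratios(rows)
    print(f"{machine}: others/LA from {min(r.values()):.3f} to {max(r.values()):.3f}")

# Code footprint for the shortest series: LA is 4 bytes, XR/SR/SLR are 2.
print("LA bytes at N=50000:    ", code_bytes(50_000, 4))
print("others bytes at N=50000:", code_bytes(50_000, 2))
```

On these figures the two-byte instructions take between 15/33 and 15/27 of the LA time on the 9021-580, and between 13/21 and 14/16 on the 9672, so "1/2 to 3/4" is a fair summary, with the N = 50 000 row on the 9672 sitting a little above it. The byte counts show why a series of LA instructions outgrows any given cache at half the N of the others.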