-
Notifications
You must be signed in to change notification settings - Fork 1
/
tutor6.txt
1120 lines (801 loc) · 35.3 KB
/
tutor6.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
LET'S BUILD A COMPILER!
By
Jack W. Crenshaw, Ph.D.
31 August 1988
Part VI: BOOLEAN EXPRESSIONS
*****************************************************************
* *
* COPYRIGHT NOTICE *
* *
* Copyright (C) 1988 Jack W. Crenshaw. All rights reserved. *
* *
*****************************************************************
INTRODUCTION
In Part V of this series, we took a look at control constructs,
and developed parsing routines to translate them into object
code. We ended up with a nice, relatively rich set of
constructs.
As we left the parser, though, there was one big hole in our
capabilities: we did not address the issue of the branch
condition. To fill the void, I introduced to you a dummy parse
routine called Condition, which only served as a place-keeper for
the real thing.
One of the things we'll do in this session is to plug that hole
by expanding Condition into a true parser/translator.
THE PLAN
We're going to approach this installment a bit differently than
any of the others. In those other installments, we started out
immediately with experiments using the Pascal compiler, building
up the parsers from very rudimentary beginnings to their final
forms, without spending much time in planning beforehand. That's
called coding without specs, and it's usually frowned upon. We
could get away with it before because the rules of arithmetic are
pretty well established ... we know what a '+' sign is supposed
to mean without having to discuss it at length. The same is true
for branches and loops. But the ways in which programming
languages implement logic vary quite a bit from language to
language. So before we begin serious coding, we'd better first
make up our minds what it is we want. And the way to do that is
at the level of the BNF syntax rules (the GRAMMAR).
THE GRAMMAR
For some time now, we've been implementing BNF syntax equations
for arithmetic expressions, without ever actually writing them
down all in one place. It's time that we did so. They are:
<expression> ::= <unary op> <term> [<addop> <term>]*
<term> ::= <factor> [<mulop> factor]*
<factor> ::= <integer> | <variable> | ( <expression> )
(Remember, the nice thing about this grammar is that it enforces
the operator precedence hierarchy that we normally expect for
algebra.)
Actually, while we're on the subject, I'd like to amend this
grammar a bit right now. The way we've handled the unary minus
is a bit awkward. I've found that it's better to write the
grammar this way:
<expression> ::= <term> [<addop> <term>]*
<term> ::= <signed factor> [<mulop> factor]*
<signed factor> ::= [<addop>] <factor>
<factor> ::= <integer> | <variable> | (<expression>)
This puts the job of handling the unary minus onto Factor, which
is where it really belongs.
This doesn't mean that you have to go back and recode the
programs you've already written, although you're free to do so if
you like. But I will be using the new syntax from now on.
Now, it probably won't come as a shock to you to learn that we
can define an analogous grammar for Boolean algebra. A typical
set or rules is:
<b-expression>::= <b-term> [<orop> <b-term>]*
<b-term> ::= <not-factor> [AND <not-factor>]*
<not-factor> ::= [NOT] <b-factor>
<b-factor> ::= <b-literal> | <b-variable> | (<b-expression>)
Notice that in this grammar, the operator AND is analogous to
'*', and OR (and exclusive OR) to '+'. The NOT operator is
analogous to a unary minus. This hierarchy is not absolutely
standard ... some languages, notably Ada, treat all logical
operators as having the same precedence level ... but it seems
natural.
Notice also the slight difference between the way the NOT and the
unary minus are handled. In algebra, the unary minus is
considered to go with the whole term, and so never appears but
once in a given term. So an expression like
a * -b
or worse yet,
a - -b
is not allowed. In Boolean algebra, though, the expression
a AND NOT b
makes perfect sense, and the syntax shown allows for that.
RELOPS
OK, assuming that you're willing to accept the grammar I've shown
here, we now have syntax rules for both arithmetic and Boolean
algebra. The sticky part comes in when we have to combine the
two. Why do we have to do that? Well, the whole subject came up
because of the need to process the "predicates" (conditions)
associated with control statements such as the IF. The predicate
is required to have a Boolean value; that is, it must evaluate to
either TRUE or FALSE. The branch is then taken or not taken,
depending on that value. What we expect to see going on in
procedure Condition, then, is the evaluation of a Boolean
expression.
But there's more to it than that. A pure Boolean expression can
indeed be the predicate of a control statement ... things like
IF a AND NOT b THEN ....
But more often, we see Boolean algebra show up in such things as
IF (x >= 0) and (x <= 100) THEN ...
Here, the two terms in parens are Boolean expressions, but the
individual terms being compared: x, 0, and 100, are NUMERIC in
nature. The RELATIONAL OPERATORS >= and <= are the catalysts by
which the Boolean and the arithmetic ingredients get merged
together.
Now, in the example above, the terms being compared are just
that: terms. However, in general each side can be a math
expression. So we can define a RELATION to be:
<relation> ::= <expression> <relop> <expression> ,
where the expressions we're talking about here are the old
numeric type, and the relops are any of the usual symbols
=, <> (or !=), <, >, <=, and >=
If you think about it a bit, you'll agree that, since this kind
of predicate has a single Boolean value, TRUE or FALSE, as its
result, it is really just another kind of factor. So we can
expand the definition of a Boolean factor above to read:
<b-factor> ::= <b-literal>
| <b-variable>
| (<b-expression>)
| <relation>
THAT's the connection! The relops and the relation they define
serve to wed the two kinds of algebra. It is worth noting that
this implies a hierarchy where the arithmetic expression has a
HIGHER precedence that a Boolean factor, and therefore than all
the Boolean operators. If you write out the precedence levels
for all the operators, you arrive at the following list:
Level Syntax Element Operator
0 factor literal, variable
1 signed factor unary minus
2 term *, /
3 expression +, -
4 b-factor literal, variable, relop
5 not-factor NOT
6 b-term AND
7 b-expression OR, XOR
If we're willing to accept that many precedence levels, this
grammar seems reasonable. Unfortunately, it won't work! The
grammar may be great in theory, but it's no good at all in the
practice of a top-down parser. To see the problem, consider the
code fragment:
IF ((((((A + B + C) < 0 ) AND ....
When the parser is parsing this code, it knows after it sees the
IF token that a Boolean expression is supposed to be next. So it
can set up to begin evaluating such an expression. But the first
expression in the example is an ARITHMETIC expression, A + B + C.
What's worse, at the point that the parser has read this much of
the input line:
IF ((((((A ,
it still has no way of knowing which kind of expression it's
dealing with. That won't do, because we must have different
recognizers for the two cases. The situation can be handled
without changing any of our definitions, but only if we're
willing to accept an arbitrary amount of backtracking to work our
way out of bad guesses. No compiler writer in his right mind
would agree to that.
What's going on here is that the beauty and elegance of BNF
grammar has met face to face with the realities of compiler
technology.
To deal with this situation, compiler writers have had to make
compromises so that a single parser can handle the grammar
without backtracking.
FIXING THE GRAMMAR
The problem that we've encountered comes up because our
definitions of both arithmetic and Boolean factors permit the use
of parenthesized expressions. Since the definitions are
recursive, we can end up with any number of levels of
parentheses, and the parser can't know which kind of expression
it's dealing with.
The solution is simple, although it ends up causing profound
changes to our grammar. We can only allow parentheses in one
kind of factor. The way to do that varies considerably from
language to language. This is one place where there is NO
agreement or convention to help us.
When Niklaus Wirth designed Pascal, the desire was to limit the
number of levels of precedence (fewer parse routines, after all).
So the OR and exclusive OR operators are treated just like an
Addop and processed at the level of a math expression.
Similarly, the AND is treated like a Mulop and processed with
Term. The precedence levels are
Level Syntax Element Operator
0 factor literal, variable
1 signed factor unary minus, NOT
2 term *, /, AND
3 expression +, -, OR
Notice that there is only ONE set of syntax rules, applying to
both kinds of operators. According to this grammar, then,
expressions like
x + (y AND NOT z) DIV 3
are perfectly legal. And, in fact, they ARE ... as far as the
parser is concerned. Pascal doesn't allow the mixing of
arithmetic and Boolean variables, and things like this are caught
at the SEMANTIC level, when it comes time to generate code for
them, rather than at the syntax level.
The authors of C took a diametrically opposite approach: they
treat the operators as different, and have something much more
akin to our seven levels of precedence. In fact, in C there are
no fewer than 17 levels! That's because C also has the operators
'=', '+=' and its kin, '<<', '>>', '++', '--', etc. Ironically,
although in C the arithmetic and Boolean operators are treated
separately, the variables are NOT ... there are no Boolean or
logical variables in C, so a Boolean test can be made on any
integer value.
We'll do something that's sort of in-between. I'm tempted to
stick mostly with the Pascal approach, since that seems the
simplest from an implementation point of view, but it results in
some funnies that I never liked very much, such as the fact that,
in the expression
IF (c >= 'A') and (c <= 'Z') then ...
the parens above are REQUIRED. I never understood why before,
and neither my compiler nor any human ever explained it very
well, either. But now, we can all see that the 'and' operator,
having the precedence of a multiply, has a higher one than the
relational operators, so without the parens the expression is
equivalent to
IF c >= ('A' and c) <= 'Z' then
which doesn't make sense.
In any case, I've elected to separate the operators into
different levels, although not as many as in C.
<b-expression> ::= <b-term> [<orop> <b-term>]*
<b-term> ::= <not-factor> [AND <not-factor>]*
<not-factor> ::= [NOT] <b-factor>
<b-factor> ::= <b-literal> | <b-variable> | <relation>
<relation> ::= | <expression> [<relop> <expression]
<expression> ::= <term> [<addop> <term>]*
<term> ::= <signed factor> [<mulop> factor]*
<signed factor>::= [<addop>] <factor>
<factor> ::= <integer> | <variable> | (<b-expression>)
This grammar results in the same set of seven levels that I
showed earlier. Really, it's almost the same grammar ... I just
removed the option of parenthesized b-expressions as a possible
b-factor, and added the relation as a legal form of b-factor.
There is one subtle but crucial difference, which is what makes
the whole thing work. Notice the square brackets in the
definition of a relation. This means that the relop and the
second expression are OPTIONAL.
A strange consequence of this grammar (and one shared by C) is
that EVERY expression is potentially a Boolean expression. The
parser will always be looking for a Boolean expression, but will
"settle" for an arithmetic one. To be honest, that's going to
slow down the parser, because it has to wade through more layers
of procedure calls. That's one reason why Pascal compilers tend
to compile faster than C compilers. If it's raw speed you want,
stick with the Pascal syntax.
THE PARSER
Now that we've gotten through the decision-making process, we can
press on with development of a parser. You've done this with me
several times now, so you know the drill: we begin with a fresh
copy of the cradle, and begin adding procedures one by one. So
let's do it.
We begin, as we did in the arithmetic case, by dealing only with
Boolean literals rather than variables. This gives us a new kind
of input token, so we're also going to need a new recognizer, and
a new procedure to read instances of that token type. Let's
start by defining the two new procedures:
{--------------------------------------------------------------}
{ Recognize a Boolean Literal }
function IsBoolean(c: char): Boolean;
begin
IsBoolean := UpCase(c) in ['T', 'F'];
end;
{--------------------------------------------------------------}
{ Get a Boolean Literal }
function GetBoolean: Boolean;
var c: char;
begin
if not IsBoolean(Look) then Expected('Boolean Literal');
GetBoolean := UpCase(Look) = 'T';
GetChar;
end;
{--------------------------------------------------------------}
Type these routines into your program. You can test them by
adding into the main program the print statement
WriteLn(GetBoolean);
OK, compile the program and test it. As usual, it's not very
impressive so far, but it soon will be.
Now, when we were dealing with numeric data we had to arrange to
generate code to load the values into D0. We need to do the same
for Boolean data. The usual way to encode Boolean variables is
to let 0 stand for FALSE, and some other value for TRUE. Many
languages, such as C, use an integer 1 to represent it. But I
prefer FFFF hex (or -1), because a bitwise NOT also becomes a
Boolean NOT. So now we need to emit the right assembler code to
load those values. The first cut at the Boolean expression
parser (BoolExpression, of course) is:
{---------------------------------------------------------------}
{ Parse and Translate a Boolean Expression }
procedure BoolExpression;
begin
if not IsBoolean(Look) then Expected('Boolean Literal');
if GetBoolean then
EmitLn('MOVE #-1,D0')
else
EmitLn('CLR D0');
end;
{---------------------------------------------------------------}
Add this procedure to your parser, and call it from the main
program (replacing the print statement you had just put there).
As you can see, we still don't have much of a parser, but the
output code is starting to look more realistic.
Next, of course, we have to expand the definition of a Boolean
expression. We already have the BNF rule:
<b-expression> ::= <b-term> [<orop> <b-term>]*
I prefer the Pascal versions of the "orops", OR and XOR. But
since we are keeping to single-character tokens here, I'll encode
those with '|' and '~'. The next version of BoolExpression is
almost a direct copy of the arithmetic procedure Expression:
{--------------------------------------------------------------}
{ Recognize and Translate a Boolean OR }
procedure BoolOr;
begin
Match('|');
BoolTerm;
EmitLn('OR (SP)+,D0');
end;
{--------------------------------------------------------------}
{ Recognize and Translate an Exclusive Or }
procedure BoolXor;
begin
Match('~');
BoolTerm;
EmitLn('EOR (SP)+,D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Boolean Expression }
procedure BoolExpression;
begin
BoolTerm;
while IsOrOp(Look) do begin
EmitLn('MOVE D0,-(SP)');
case Look of
'|': BoolOr;
'~': BoolXor;
end;
end;
end;
{---------------------------------------------------------------}
Note the new recognizer IsOrOp, which is also a copy, this time
of IsAddOp:
{--------------------------------------------------------------}
{ Recognize a Boolean Orop }
function IsOrop(c: char): Boolean;
begin
IsOrop := c in ['|', '~'];
end;
{--------------------------------------------------------------}
OK, rename the old version of BoolExpression to BoolTerm, then
enter the code above. Compile and test this version. At this
point, the output code is starting to look pretty good. Of
course, it doesn't make much sense to do a lot of Boolean algebra
on constant values, but we'll soon be expanding the types of
Booleans we deal with.
You've probably already guessed what the next step is: The
Boolean version of Term.
Rename the current procedure BoolTerm to NotFactor, and enter the
following new version of BoolTerm. Note that is is much simpler
than the numeric version, since there is no equivalent of
division.
{---------------------------------------------------------------}
{ Parse and Translate a Boolean Term }
procedure BoolTerm;
begin
NotFactor;
while Look = '&' do begin
EmitLn('MOVE D0,-(SP)');
Match('&');
NotFactor;
EmitLn('AND (SP)+,D0');
end;
end;
{--------------------------------------------------------------}
Now, we're almost home. We are translating complex Boolean
expressions, although only for constant values. The next step is
to allow for the NOT. Write the following procedure:
{--------------------------------------------------------------}
{ Parse and Translate a Boolean Factor with NOT }
procedure NotFactor;
begin
if Look = '!' then begin
Match('!');
BoolFactor;
EmitLn('EOR #-1,D0');
end
else
BoolFactor;
end;
{--------------------------------------------------------------}
And rename the earlier procedure to BoolFactor. Now try that.
At this point the parser should be able to handle any Boolean
expression you care to throw at it. Does it? Does it trap badly
formed expressions?
If you've been following what we did in the parser for math
expressions, you know that what we did next was to expand the
definition of a factor to include variables and parens. We don't
have to do that for the Boolean factor, because those little
items get taken care of by the next step. It takes just a one
line addition to BoolFactor to take care of relations:
{--------------------------------------------------------------}
{ Parse and Translate a Boolean Factor }
procedure BoolFactor;
begin
if IsBoolean(Look) then
if GetBoolean then
EmitLn('MOVE #-1,D0')
else
EmitLn('CLR D0')
else Relation;
end;
{--------------------------------------------------------------}
You might be wondering when I'm going to provide for Boolean
variables and parenthesized Boolean expressions. The answer is,
I'm NOT! Remember, we took those out of the grammar earlier.
Right now all I'm doing is encoding the grammar we've already
agreed upon. The compiler itself can't tell the difference
between a Boolean variable or expression and an arithmetic one
... all of those will be handled by Relation, either way.
Of course, it would help to have some code for Relation. I don't
feel comfortable, though, adding any more code without first
checking out what we already have. So for now let's just write a
dummy version of Relation that does nothing except eat the
current character, and write a little message:
{---------------------------------------------------------------}
{ Parse and Translate a Relation }
procedure Relation;
begin
WriteLn('<Relation>');
GetChar;
end;
{--------------------------------------------------------------}
OK, key in this code and give it a try. All the old things
should still work ... you should be able to generate the code for
ANDs, ORs, and NOTs. In addition, if you type any alphabetic
character you should get a little <Relation> place-holder, where
a Boolean factor should be. Did you get that? Fine, then let's
move on to the full-blown version of Relation.
To get that, though, there is a bit of groundwork that we must
lay first. Recall that a relation has the form
<relation> ::= | <expression> [<relop> <expression]
Since we have a new kind of operator, we're also going to need a
new Boolean function to recognize it. That function is shown
below. Because of the single-character limitation, I'm sticking
to the four operators that can be encoded with such a character
(the "not equals" is encoded by '#').
{--------------------------------------------------------------}
{ Recognize a Relop }
function IsRelop(c: char): Boolean;
begin
IsRelop := c in ['=', '#', '<', '>'];
end;
{--------------------------------------------------------------}
Now, recall that we're using a zero or a -1 in register D0 to
represent a Boolean value, and also that the loop constructs
expect the flags to be set to correspond. In implementing all
this on the 68000, things get a a little bit tricky.
Since the loop constructs operate only on the flags, it would be
nice (and also quite efficient) just to set up those flags, and
not load anything into D0 at all. This would be fine for the
loops and branches, but remember that the relation can be used
ANYWHERE a Boolean factor could be used. We may be storing its
result to a Boolean variable. Since we can't know at this point
how the result is going to be used, we must allow for BOTH cases.
Comparing numeric data is easy enough ... the 68000 has an
operation for that ... but it sets the flags, not a value.
What's more, the flags will always be set the same (zero if
equal, etc.), while we need the zero flag set differently for the
each of the different relops.
The solution is found in the 68000 instruction Scc, which sets a
byte value to 0000 or FFFF (funny how that works!) depending upon
the result of the specified condition. If we make the
destination byte to be D0, we get the Boolean value needed.
Unfortunately, there's one final complication: unlike almost
every other instruction in the 68000 set, Scc does NOT reset the
condition flags to match the data being stored. So we have to do
one last step, which is to test D0 and set the flags to match it.
It must seem to be a trip around the moon to get what we want: we
first perform the test, then test the flags to set data into D0,
then test D0 to set the flags again. It is sort of roundabout,
but it's the most straightforward way to get the flags right, and
after all it's only a couple of instructions.
I might mention here that this area is, in my opinion, the one
that represents the biggest difference between the efficiency of
hand-coded assembler language and compiler-generated code. We
have seen already that we lose efficiency in arithmetic
operations, although later I plan to show you how to improve that
a bit. We've also seen that the control constructs themselves
can be done quite efficiently ... it's usually very difficult to
improve on the code generated for an IF or a WHILE. But
virtually every compiler I've ever seen generates terrible code,
compared to assembler, for the computation of a Boolean function,
and particularly for relations. The reason is just what I've
hinted at above. When I'm writing code in assembler, I go ahead
and perform the test the most convenient way I can, and then set
up the branch so that it goes the way it should. In effect, I
"tailor" every branch to the situation. The compiler can't do
that (practically), and it also can't know that we don't want to
store the result of the test as a Boolean variable. So it must
generate the code in a very strict order, and it often ends up
loading the result as a Boolean that never gets used for
anything.
In any case, we're now ready to look at the code for Relation.
It's shown below with its companion procedures:
{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Equals" }
procedure Equals;
begin
Match('=');
Expression;
EmitLn('CMP (SP)+,D0');
EmitLn('SEQ D0');
end;
{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Not Equals" }
procedure NotEquals;
begin
Match('#');
Expression;
EmitLn('CMP (SP)+,D0');
EmitLn('SNE D0');
end;
{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Less Than" }
procedure Less;
begin
Match('<');
Expression;
EmitLn('CMP (SP)+,D0');
EmitLn('SGE D0');
end;
{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Greater Than" }
procedure Greater;
begin
Match('>');
Expression;
EmitLn('CMP (SP)+,D0');
EmitLn('SLE D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Relation }
procedure Relation;
begin
Expression;
if IsRelop(Look) then begin
EmitLn('MOVE D0,-(SP)');
case Look of
'=': Equals;
'#': NotEquals;
'<': Less;
'>': Greater;
end;
EmitLn('TST D0');
end;
end;
{---------------------------------------------------------------}
Now, that call to Expression looks familiar! Here is where the
editor of your system comes in handy. We have already generated
code for Expression and its buddies in previous sessions. You
can copy them into your file now. Remember to use the single-
character versions. Just to be certain, I've duplicated the
arithmetic procedures below. If you're observant, you'll also
see that I've changed them a little to make them correspond to
the latest version of the syntax. This change is NOT necessary,
so you may prefer to hold off on that until you're sure
everything is working.
{---------------------------------------------------------------}
{ Parse and Translate an Identifier }
procedure Ident;
var Name: char;
begin
Name:= GetName;
if Look = '(' then begin
Match('(');
Match(')');
EmitLn('BSR ' + Name);
end
else
EmitLn('MOVE ' + Name + '(PC),D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Math Factor }
procedure Expression; Forward;
procedure Factor;
begin
if Look = '(' then begin
Match('(');
Expression;
Match(')');
end
else if IsAlpha(Look) then
Ident
else
EmitLn('MOVE #' + GetNum + ',D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate the First Math Factor }
procedure SignedFactor;
begin
if Look = '+' then
GetChar;
if Look = '-' then begin
GetChar;
if IsDigit(Look) then
EmitLn('MOVE #-' + GetNum + ',D0')
else begin
Factor;
EmitLn('NEG D0');
end;
end
else Factor;
end;
{--------------------------------------------------------------}
{ Recognize and Translate a Multiply }
procedure Multiply;
begin
Match('*');
Factor;
EmitLn('MULS (SP)+,D0');
end;
{-------------------------------------------------------------}
{ Recognize and Translate a Divide }
procedure Divide;
begin
Match('/');
Factor;
EmitLn('MOVE (SP)+,D1');
EmitLn('EXS.L D0');
EmitLn('DIVS D1,D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Math Term }
procedure Term;
begin
SignedFactor;
while Look in ['*', '/'] do begin
EmitLn('MOVE D0,-(SP)');
case Look of
'*': Multiply;
'/': Divide;
end;
end;
end;
{---------------------------------------------------------------}
{ Recognize and Translate an Add }
procedure Add;
begin
Match('+');
Term;
EmitLn('ADD (SP)+,D0');
end;
{---------------------------------------------------------------}
{ Recognize and Translate a Subtract }
procedure Subtract;
begin
Match('-');
Term;
EmitLn('SUB (SP)+,D0');
EmitLn('NEG D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate an Expression }
procedure Expression;
begin
Term;
while IsAddop(Look) do begin
EmitLn('MOVE D0,-(SP)');
case Look of
'+': Add;
'-': Subtract;
end;
end;
end;
{---------------------------------------------------------------}
There you have it ... a parser that can handle both arithmetic
AND Boolean algebra, and things that combine the two through the
use of relops. I suggest you file away a copy of this parser in
a safe place for future reference, because in our next step we're
going to be chopping it up.
MERGING WITH CONTROL CONSTRUCTS
At this point, let's go back to the file we had previously built
that parses control constructs. Remember those little dummy
procedures called Condition and Expression? Now you know what
goes in their places!
I warn you, you're going to have to do some creative editing
here, so take your time and get it right. What you need to do is
to copy all of the procedures from the logic parser, from Ident
through BoolExpression, into the parser for control constructs.
Insert them at the current location of Condition. Then delete
that procedure, as well as the dummy Expression. Next, change
every call to Condition to refer to BoolExpression instead.
Finally, copy the procedures IsMulop, IsOrOp, IsRelop, IsBoolean,
and GetBoolean into place. That should do it.
Compile the resulting program and give it a try. Since we
haven't used this program in awhile, don't forget that we used
single-character tokens for IF, WHILE, etc. Also don't forget
that any letter not a keyword just gets echoed as a block.
Try
ia=bxlye
which stands for "IF a=b X ELSE Y ENDIF".
What do you think? Did it work? Try some others.
ADDING ASSIGNMENTS
As long as we're this far, and we already have the routines for
expressions in place, we might as well replace the "blocks" with
real assignment statements. We've already done that before, so
it won't be too hard. Before taking that step, though, we need
to fix something else.
We're soon going to find that the one-line "programs" that we're
having to write here will really cramp our style. At the moment