% !TeX root = forth.tex
% !TeX spellcheck = en_US
% !TeX program = pdflatex
\chapter{Usage requirements}
\label{usage}
A system shall provide all of the words defined in
\xref[Core Words]{wordlist:core} and \xref{wordlist:exception}.
It may also provide any
words defined in the optional word sets and extensions word
sets. No standard word provided by a system shall alter the
system state in a way that changes the effect of execution of
any other standard word except as provided in this standard.
A system may contain non-standard extensions, provided that
they are consistent with the requirements of this standard.
The implementation of a system may use words and techniques
outside the scope of this standard.
A system need not provide all words in executable form. The
implementation may provide definitions, including definitions
of words in the Core word set, in source form only. If so,
the mechanism for adding the definitions to the dictionary
is implementation defined.
A program that requires a system to provide words or techniques
not defined in this standard has an environmental dependency.
\section{Data types} % 3.1
\label{usage:data}
A data type identifies the set of permissible values for a
data object. It is not a property of a particular storage
location or position on a stack. Moving a data object shall
not affect its type.
No data-type checking is required of a system. An ambiguous
condition exists if an incorrectly typed data object is
encountered.
Table \ref{table:datatypes} summarizes the data types used
throughout this standard. Multiple instances of the same
type in the description of a definition are suffixed with
a sequence digit subscript to distinguish them.
\begin{table}[!ht]
\begin{center}
\caption{Data types}
\label{table:datatypes}
\begin{tabular}{llr}
\hline\hline
\emph{Symbol} & \emph{Data type} & \emph{Size on stack} \\
\hline
\param{flag} & flag & 1 cell \\
\param{true} & true flag & 1 cell \\
\param{false} & false flag & 1 cell \\
\param{char} & character & 1 cell \\
\param{n} & signed number & 1 cell \\
\param{+n} & non-negative number & 1 cell \\
\param{u} & unsigned number & 1 cell \\
\param{u|n}\footnotemark[1]
& number & 1 cell \\
\param{x} & unspecified cell & 1 cell \\
\param{xt} & execution token & 1 cell \\
\param{addr} & address & 1 cell \\
\param{a-addr} & aligned address & 1 cell \\
\param{c-addr} & character-aligned address & 1 cell \\
\param{ior} & error result & 1 cell \\
\param{d} & double-cell signed number & 2 cells \\
\param{+d} & double-cell non-negative number & 2 cells \\
\param{ud} & double-cell unsigned number & 2 cells \\
\param{d|ud}\footnotemark[2]
& double-cell number & 2 cells \\
\param{xd} & unspecified cell pair & 2 cells \\
\param{colon-sys} & definition compilation & implementation dependent \\
\param{do-sys} & do-loop structures & implementation dependent \\
\param{case-sys} & \word{CASE} structures & implementation dependent \\
\param{of-sys} & \word{OF} structures & implementation dependent \\
\param{orig} & control-flow origins & implementation dependent \\
\param{dest} & control-flow destinations & implementation dependent \\
\param{loop-sys} & loop-control parameters & implementation dependent \\
\param{nest-sys} & definition cells & implementation dependent \\
\param{i*x, j*x, k*x}\footnotemark[3]
& any data type & 0 or more cells \\
\hline\hline
\end{tabular}
\par
\begin{tabular}{lp{0.8\textwidth}}
\footnotemark[1] &
May be either a signed number or an unsigned number
depending on context. \\
\footnotemark[2] &
May be either a double-cell signed number or a double-cell
unsigned number depending on context.\\
\footnotemark[3] &
May be an undetermined number of stack entries of
unspecified type. For examples of use, see
\wref{core:EXECUTE}{EXECUTE}, \wref{core:QUIT}{QUIT}.
\end{tabular}
\end{center}
\end{table}
\subsection{Data-type relationships} % 3.1.1
\label{usage:type}
Some of the data types are subtypes of other data types. A data
type \param{i} is a subtype of type \param{j} if and only if the
members of \param{i} are a subset of the members of \param{j}. The
following list represents the subtype relationships using the
phrase ``\param{i} $\Rightarrow$ \param{j}'' to denote ``\param{i}
is a subtype of \param{j}''. The subtype relationship is transitive;
if \param{i} $\Rightarrow$ \param{j} and \param{j} $\Rightarrow$
\param{k} then \param{i} $\Rightarrow$ \param{k}:
\begin{quote}
\param{+n} $\Rightarrow$ \param{u} $\Rightarrow$ \param{x}; \\
\param{+n} $\Rightarrow$ \param{n} $\Rightarrow$ \param{x}; \\
\param{char} $\Rightarrow$ \param{+n}; \\
\param{a-addr} $\Rightarrow$ \param{c-addr}
$\Rightarrow$ \param{addr}
$\Rightarrow$ \param{u}; \\
\param{flag} $\Rightarrow$ \param{x}; \\
\param{xt} $\Rightarrow$ \param{x}; \\
\param{ior} $\Rightarrow$ \param{n} $\Rightarrow$ \param{x}; \\
\param{+d} $\Rightarrow$ \param{d} $\Rightarrow$ \param{xd}; \\
\param{+d} $\Rightarrow$ \param{ud} $\Rightarrow$ \param{xd}.
\end{quote}
Any Forth definition that accepts an argument of type \param{i}
shall also accept an argument that is a subtype of \param{i}.
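
As an informal illustration of transitivity (not an additional
requirement), two entries of the list above can be chained:
\begin{quote}
\param{char} $\Rightarrow$ \param{+n} $\Rightarrow$ \param{n}
	$\Rightarrow$ \param{x}
\end{quote}
Hence a definition whose stack comment asks for an \param{n} or an
\param{x} also accepts a \param{char}. The converse does not follow:
an arbitrary \param{x} need not be a valid \param{char}.
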
\pagebreak
\subsection{Character types} % 3.1.2
\label{usage:char}
Characters shall have the following properties:
\begin{itemize}
\item be exactly one address unit wide; \\[-4ex]
\item contain at least eight bits; \\[-4ex]
\item be of fixed width; \\[-4ex]
\item have a size less than or equal to cell size; \\[-4ex]
\item be unsigned.
\end{itemize}
The characters provided by a system shall include the graphic
characters \{32 {\ldots} 126\}, which represent graphic forms
as shown in table \ref{table:ASCII}.
\subsubsection{Graphic characters} % 3.1.2.1
\label{usage:ASCII}
A graphic character is one that is normally displayed (e.g.,
A, \#, \&, 6). These values and graphics, shown in table
\ref{table:ASCII}, are taken directly from ANS X3.4-1974 (ASCII)
and ISO 646-1983, International Reference Version (IRV). The
graphic forms of characters outside the hex range \{20 {\ldots}
7E\} are implementation defined. Programs that use the graphic hex
24 (the currency sign) have an environmental dependency.
The graphic representation of characters is not restricted to
particular type fonts or styles. The graphics here are examples.
\begin{table}[ht]
\begin{center}
\caption{Standard graphic characters}
\label{table:ASCII}
\small
\begin{tabular}{c@{~}c@{~}c|c@{~}c@{~}c|c@{~}c@{~}c|c@{~}c@{~}c|c@{~}c@{~}c|c@{~}c@{~}c}
\hline\hline
Hex & IRV & ASCII &
Hex & IRV & ASCII &
Hex & IRV & ASCII &
Hex & IRV & ASCII &
Hex & IRV & ASCII &
Hex & IRV & ASCII \\
\hline
20& & &30& 0 & 0 &40& @ & @ &50& P & P &60& \verb|`| & \verb|`|
&70& p & p \\
21& ! & ! &31& 1 & 1 &41& A & A &51& Q & Q &61& a & a &71& q & q \\
22& \verb|"| & \verb|"|
&32& 2 & 2 &42& B & B &52& R & R &62& b & b &72& r & r \\
23& \# & \# &33& 3 & 3 &43& C & C &53& S & S &63& c & c &73& s & s \\
24& \textcurrency & \$ &34& 4 & 4 &44& D & D &54& T & T &64& d & d &74& t & t \\
25& \% & \% &35& 5 & 5 &45& E & E &55& U & U &65& e & e &75& u & u \\
26& \& & \& &36& 6 & 6 &46& F & F &56& V & V &66& f & f &76& v & v \\
27& ' & ' &37& 7 & 7 &47& G & G &57& W & W &67& g & g &77& w & w \\
28& ( & ( &38& 8 & 8 &48& H & H &58& X & X &68& h & h &78& x & x \\
29& ) & ) &39& 9 & 9 &49& I & I &59& Y & Y &69& i & i &79& y & y \\
2A& * & * &3A& : & : &4A& J & J &5A& Z & Z &6A& j & j &7A& z & z \\
2B& + & + &3B& ; & ; &4B& K & K &5B& [ & [ &6B& k & k &7B& \{&\{ \\
2C& , & , &3C&$<$&$<$ &4C& L & L &5C& \verb|\| & \verb|\|
&6C& l & l &7C& \verb"|" & \verb"|" \\
2D& - & - &3D& = & = &4D& M & M &5D& ] & ] &6D& m & m &7D& \}&\} \\
2E& . & . &3E&$>$&$>$ &4E& N & N &5E& \verb|^| & \verb|^|
&6E& n & n &7E& \verb|~| & \verb|~| \\
2F& / & / &3F& ? & ? &4F& O & O &5F& \_ & \_&6F& o & o \\
\hline\hline
\end{tabular}
\end{center}
\end{table}
\subsubsection{Control characters} % 3.1.2.2
\label{usage:control}
All non-graphic characters included in the implementation-defined
character set are defined in this standard as control characters.
In particular, the characters \{0 {\ldots} 31\}, which could be
included in the im\-ple\-ment\-ation-de\-fined character set, are control
characters.
Programs that require the ability to send or receive control
characters have an environmental dependency.
\subsubsection{Primitive character} % 3.1.2.3
\label{usage:pchar}
A primitive character (pchar) is a character with no restrictions on
its contents. Unless otherwise stated, a ``character'' refers to a
primitive character.
\subsection{Single-cell types} % 3.1.3
\label{usage:cell}
The implementation-defined fixed size of a cell is specified in
address units and the corresponding number of bits.
See \xref[E.2 Hardware peculiarities]{port:hardware}.
Cells shall be at least one address unit wide and contain at least
sixteen bits. The size of a cell shall be an integral multiple of
the size of a character. Data-stack elements, return-stack elements,
addresses, execution tokens, flags, and integers are one cell wide.
\subsubsection{Flags} % 3.1.3.1
\label{usage:flags}
Flags may have one of two logical states, \emph{true} or \emph{false}.
A true flag returned by a standard word shall be a
single-cell value with all bits set. A false flag returned by a
standard word shall be a single-cell value with all bits clear.
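
The following interpreted lines are a minimal sketch of how these
values appear in practice. They assume only the Core words
\texttt{0=}, \texttt{INVERT} and \texttt{.}; the displayed results
follow from the two's-complement representation described in
\ref{usage:number}.
\begin{verbatim}
0 0=  .          \ displays -1 : true, all bits set
5 0=  .          \ displays 0  : false, all bits clear
0 0= INVERT  .   \ displays 0  : inverting true gives false
\end{verbatim}
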
\subsubsection{Integers} % 3.1.3.2
\label{usage:int}
The implementation-defined range of signed integers shall include
\{-32768 {\ldots} +32767\}. The im\-ple\-ment\-ation-de\-fined range of
non-negative integers shall include \{0 {\ldots} 32767\}. The
implementation-defined range of unsigned integers shall include
\{0 {\ldots} 65535\}.
\subsubsection{Addresses} % 3.1.3.3
\label{usage:addr}
An address identifies a location in data space with a size of one
address unit, which a program may fetch from or store into except
for the restrictions established in this standard. The size of an
address unit is specified in bits. Each distinct address value
identifies exactly one such storage element.
See \xref[3.3.3 Data space]{usage:dataspace}.
The set of character-aligned addresses, addresses at which a
character can be accessed, is an im\-ple\-ment\-ation-de\-fined subset of
all addresses. Adding the size of a character to a character-aligned
address shall produce another character-aligned address.
The set of aligned addresses is an implementation-defined subset
of character-aligned addresses. Adding the size of a cell to an
aligned address shall produce another aligned address.
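
As an informal sketch of the last rule, the lines below reserve three
cells and access the first two. The name \texttt{triple} is
hypothetical; every other word is taken from the Core word set.
\begin{verbatim}
CREATE triple  3 CELLS ALLOT   \ triple returns an aligned address
10 triple !                    \ store into the first cell
20 triple CELL+ !              \ a-addr plus one cell is aligned
triple @  triple CELL+ @  + .  \ displays 30
\end{verbatim}
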
\subsubsection{Counted strings} % 3.1.3.4
\label{usage:cstring}
A counted string in memory is identified by the address
(\emph{c-addr}) of its length character.
The length character of a counted string shall contain a binary
representation of the number of data characters, between zero and
the implementation-defined maximum length for a counted string.
The maximum length of a counted string shall be at least 255.
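
A counted string can be laid out by hand to make the role of the
length character visible. The sketch below uses only Core words; the
name \texttt{abc-string} is hypothetical.
\begin{verbatim}
CREATE abc-string  3 C,  CHAR a C,  CHAR b C,  CHAR c C,
abc-string C@ .          \ displays 3   : the length character
abc-string COUNT TYPE    \ displays abc : COUNT gives c-addr+1, 3
\end{verbatim}
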
\subsubsection{Execution tokens} % 3.1.3.5
Different definitions may have the same execution token if the
definitions are equivalent.
\subsubsection{Error results} % 3.1.3.6
\label{usage:ior}
A value of zero indicates that the operation completed successfully;
other values are in the range \{-4095 {\ldots} -1\} and represent a
valid \word[exception]{THROW} code.
The meanings of values in the range \{-255 {\ldots} -1\} are defined
by table \xref[9.1 THROW code assignments]{table:throw}. Values in
the range \{-4095 {\ldots} -256\} and their meanings are implementation
defined.
A word that returns an \param{ior} will not \word[exception]{THROW}
that \param{ior} as an exception, but indicates the exception through
the \param{ior}.
This allows a program to take appropriate actions, which may include
throwing the exception.
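
One way a program might turn an \param{ior} into an exception is
sketched below. It assumes that the system provides \texttt{ALLOCATE}
from the optional Memory-Allocation word set; the name
\texttt{allocate-or-throw} is hypothetical.
\begin{verbatim}
: allocate-or-throw ( u -- a-addr )
   ALLOCATE      \ -- a-addr ior
   THROW ;       \ zero ior: discarded; non-zero ior: thrown
\end{verbatim}
A program that prefers to recover locally can instead test the
\param{ior} with \texttt{IF} and never throw.
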
\subsection{Cell-pair types} % 3.1.4
\label{usage:2cell}
A cell pair in memory consists of a sequence of two contiguous
cells. The cell at the lower address is the first cell, and its
address is used to identify the cell pair. Unless otherwise
specified, a cell pair on a stack consists of the first cell
immediately above the second cell.
\subsubsection{Double-cell integers} % 3.1.4.1
On the stack, the cell containing the most significant part of a
double-cell integer shall be above the cell containing the least
significant part.
The implementation-defined range of double-cell signed integers
shall include \{-2147483647 {\ldots} \linebreak +2147483647\}.
The implementation-defined range of double-cell non-negative
integers shall include \{0 {\ldots} 2147483647\}.
The implementation-defined range of double-cell unsigned integers
shall include \{0 {\ldots} 4294967295\}. Placing the single-cell
integer zero on the stack above a single-cell unsigned integer
produces a double-cell unsigned integer with the same value.
See \xref[Internal number representation]{usage:number}.
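
The cell order can be observed with Core words alone. The lines below
are an informal sketch, assuming the two's-complement representation
described in \ref{usage:number}.
\begin{verbatim}
5 0  <# #S #> TYPE   \ displays 5     : zero above unsigned 5
-1 S>D  . .          \ displays -1 -1 : both cells all bits set
\end{verbatim}
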
\subsubsection{Character strings} % 3.1.4.2
A string is specified by a cell pair (\emph{c-addr u}) representing
its starting address and length in characters.
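
As a small sketch, the definitions below use the Core compilation
behaviour of \texttt{S"}, which leaves such a cell pair at run time.
The names \texttt{.hello} and \texttt{hello-length} are hypothetical.
\begin{verbatim}
: .hello ( -- )          S" Hello" TYPE ;
: hello-length ( -- u )  S" Hello" NIP ;
.hello            \ displays Hello
hello-length .    \ displays 5
\end{verbatim}
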
\subsection{System types} % 3.1.5
The system data types specify permitted word combinations during
compilation and execution.
\subsubsection{System-compilation types} % 3.1.5.1
These data types denote zero or more items on the control-flow stack
(see \ref{usage:controlstack}). The possible presence of such items
on the data stack means that any items already there shall be
unavailable to a program until the control-flow-stack items are
consumed.
The implementation-dependent data generated upon beginning to compile
a definition and consumed at its close is represented by the symbol
\emph{colon-sys} throughout this standard.
The implementation-dependent data generated upon beginning to
compile a do-loop structure such as \word{DO} {\ldots} \word{LOOP}
and consumed at its close is represented by the symbol \emph{do-sys}
throughout this standard.
The implementation-dependent data generated upon beginning to
compile a \word{CASE} {\ldots} \word{ENDCASE} structure and consumed
at its close is represented by the symbol \emph{case-sys} throughout
this standard.
The implementation-dependent data generated upon beginning to
compile an \word{OF} {\ldots} \word{ENDOF} structure and consumed
at its close is represented by the symbol \emph{of-sys} throughout
this standard.
The implementation-dependent data generated and consumed by executing
the other standard control-flow words is represented by the symbols
\emph{orig} and \emph{dest} throughout this standard.
\subsubsection{System-execution types} % 3.1.5.2
These data types denote zero or more items on the return stack.
Their possible presence means that any items already on the return
stack shall be unavailable to a program until the system-execution
items are consumed.
The implementation-dependent data generated upon beginning to
execute a definition and consumed upon exiting it is represented
by the symbol \emph{nest-sys} throughout this standard.
The implementation-dependent loop-control parameters used to
control the execution of do-loops are represented by the symbol
\emph{loop-sys} throughout this standard. Loop-control parameters
shall be available inside the do-loop for words that use or change
these parameters, words such as \word{I}, \word{J}, \word{LEAVE}
and \word{UNLOOP}.
\section{The implementation environment} % 3.2 =======================
\subsection{Numbers} % 3.2.1
\subsubsection{Internal number representation} % 3.2.1.1
\label{usage:number}
This standard requires two's-complement number representation and arithmetic.
Arithmetic zero is represented as the value of a single cell with all bits
clear.
The representation of a number as a compiled literal or in memory
is implementation dependent.
\subsubsection{Digit conversion} % 3.2.1.2
\label{usage:digits}
Numbers shall be represented externally by using characters from
the standard character set. Conversion between the internal and
external forms of a digit shall behave as follows:
The value in \word{BASE} is the radix for number conversion. A
digit has a value ranging from zero to one less than the contents
of \word{BASE}. The digit with the value zero corresponds to the
character ``0''. This representation of digits proceeds through
the character set to the decimal value nine corresponding to the
character ``9''. For digits beginning with the decimal value ten
the graphic characters beginning with the character ``A'' are used.
This correspondence continues up to and including the digit with
the decimal value thirty-five which is represented by the character
``Z''. The characters ``a'' through ``z'' should be treated the
same as ``A'' through ``Z'', with ``a'' having the value ten and
``z'' the value thirty-five. The conversion of digits outside this
range is implementation defined.
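
The correspondence can be checked interactively. The lines below are
an informal sketch, assuming that no definitions named \texttt{FF},
\texttt{1010} or \texttt{ZZ} exist, so that the text interpreter
converts these strings as numbers in the current radix.
\begin{verbatim}
DECIMAL
HEX       FF    DECIMAL .   \ displays 255  : 15*16 + 15
2 BASE !  1010  DECIMAL .   \ displays 10   : 1*8 + 0*4 + 1*2 + 0
36 BASE ! ZZ    DECIMAL .   \ displays 1295 : 35*36 + 35
\end{verbatim}
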
\subsubsection{Free-field number display} % 3.2.1.3
\label{usage:dot}
Free-field number display uses the characters described in digit
conversion, without leading zeros, in a field the exact size of
the converted string plus a trailing space. If a number is zero,
the least significant digit is not considered a leading zero. If
the number is negative, a leading minus sign is displayed.
Number display may use the pictured numeric output string buffer
to hold partially converted strings (see \xref[Other transient
regions]{usage:transient}).
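
For instance, with the radix set to ten, the Core word \texttt{.}
behaves as sketched below.
\begin{verbatim}
DECIMAL
0 .  42 .  -7 .     \ displays 0 42 -7
                    \ each number is followed by one space
\end{verbatim}
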
\subsection{Arithmetic} % 3.2.2
\subsubsection{Integer division} % 3.2.2.1
\label{usage:div}
Division produces a quotient \emph{q} and a remainder \emph{r}
by dividing operand \emph{a} by operand \emph{b}. Division
operations return \emph{q, r}, or both. The identity
$b \times q + r = a$ shall hold for all \emph{a} and \emph{b}.
When unsigned integers are divided and the remainder is not zero,
\emph{q} is the largest integer less than the true quotient.
When signed integers are divided, the remainder is not zero, and
\emph{a} and \emph{b} have the same sign, \emph{q} is the largest
integer less than the true quotient. If only one operand is
negative, whether \emph{q} is rounded toward negative infinity
(floored division) or rounded towards zero (symmetric division) is
implementation defined.
Floored division is integer division in which the remainder carries
the sign of the divisor or is zero, and the quotient is rounded to
its arithmetic floor. Symmetric division is integer division in
which the remainder carries the sign of the dividend or is zero and
the quotient is the mathematical quotient ``rounded towards zero''
or ``truncated''. Examples of each are shown in tables
\ref{table:floor} and \ref{table:round}.
In cases where the operands differ in sign and the rounding
direction matters, a program shall either include code generating
the desired form of division, not relying on the
implementation-defined default result, or have an environmental
dependency on the desired rounding direction.
\begin{table}[ht]
\begin{center}
\begin{minipage}{0.48\textwidth}
\begin{center}
\caption{Floored Division Example}
\label{table:floor}
\begin{tabular}{lrllrllrllrl}
\hline\hline
\multicolumn{3}{c}{Dividend} &
\multicolumn{3}{c}{Divisor} &
\multicolumn{3}{c}{Remainder} &
\multicolumn{3}{c}{Quotient} \\
\hline
& 10 &&& 7 &&& 3 &&& 1 \\
& -10 &&& 7 &&& 4 &&& -2 \\
& 10 &&& -7 &&& -4 &&& -2 \\
& -10 &&& -7 &&& -3 &&& 1 \\
\hline\hline
\end{tabular}
\end{center}
\end{minipage}
\begin{minipage}{0.48\textwidth}
\begin{center}
\caption{Symmetric Division Example}
\label{table:round}
\begin{tabular}{lrllrllrllrl}
\hline\hline
\multicolumn{3}{c}{Dividend} &
\multicolumn{3}{c}{Divisor} &
\multicolumn{3}{c}{Remainder} &
\multicolumn{3}{c}{Quotient} \\
\hline
& 10 &&& 7 &&& 3 &&& 1 \\
& -10 &&& 7 &&& -3 &&& -1 \\
& 10 &&& -7 &&& 3 &&& -1 \\
& -10 &&& -7 &&& -3 &&& 1 \\
\hline\hline
\end{tabular}
\end{center}
\end{minipage}
\html{<br class="clear" />}
\end{center}
\end{table}
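
A program that needs a particular rounding direction can request it
explicitly with the Core words \texttt{FM/MOD} (floored) and
\texttt{SM/REM} (symmetric). The sketch below reproduces the
mixed-sign rows of tables \ref{table:floor} and \ref{table:round}
and checks the identity $b \times q + r = a$ in the comments. Both
words leave the quotient above the remainder, so \texttt{. .}
displays the quotient first.
\begin{verbatim}
-10 S>D   7 FM/MOD  . .   \ displays -2 4  :  7 * -2 +  4 = -10
-10 S>D   7 SM/REM  . .   \ displays -1 -3 :  7 * -1 + -3 = -10
 10 S>D  -7 FM/MOD  . .   \ displays -2 -4 : -7 * -2 + -4 =  10
 10 S>D  -7 SM/REM  . .   \ displays -1 3  : -7 * -1 +  3 =  10
\end{verbatim}
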
\subsubsection{Other integer operations} % 3.2.2.2
\label{usage:intops}
In all integer arithmetic operations except division, both overflow
and underflow shall be ignored. The value returned when either
overflow or underflow occurs is:
\begin{itemize}
\item for unsigned results, the exact result modulo $2^n$;
\item for signed results, with the exact result being $r$, the
number $x$ in the range $-2^{n-1} \leq x < 2^{n-1}$ that
satisfies $x \equiv r$ (mod $2^n$),
\end{itemize}
where $n$ is the number of bits in the result.
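
As a worked illustration, assume a system with $n = 16$, the minimum
cell width; actual systems commonly use wider cells. Adding one to
the largest unsigned integer gives the exact result
$65535 + 1 = 65536$, and
\[ 65536 \bmod 2^{16} = 0 , \]
so the unsigned result is $0$. Adding one to the largest signed
integer gives the exact result $r = 32767 + 1 = 32768$. The only
$x$ with $-2^{15} \leq x < 2^{15}$ and $x \equiv 32768$
(mod $2^{16}$) is
\[ x = 32768 - 65536 = -32768 . \]
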
\subsection{Stacks} % 3.2.3
\subsubsection{Data stack} % 3.2.3.1
\label{usage:datastack}
Objects on the data stack shall be one cell wide.
\subsubsection{Control-flow stack} % 3.2.3.2
\label{usage:controlstack}
The control-flow stack is a last-in, first-out list whose elements
define the permissible matchings of control-flow words and the
restrictions imposed on data-stack usage during the compilation of
control structures.
The elements of the control-flow stack are system-compilation data
types.
The control-flow stack may, but need not, physically exist in an
implementation. If it does exist, it may be, but need not be,
implemented using the data stack. The format of the control-flow
stack is implementation defined.
\subsubsection{Return stack} % 3.2.3.3
\label{usage:returnstack}
Items on the return stack shall consist of one or more cells. A
system may use the return stack in an implementation-dependent
manner during the compilation of definitions, during the execution
of do-loops, and for storing run-time nesting information.
A program may use the return stack for temporary storage during the
execution of a definition subject to the following restrictions:
\begin{itemize}
\item A program shall not access values on the return stack
(using \word{R@}, \word{Rfrom}, \word{2R@}, \word{2Rfrom}
or \word[tools]{NRfrom}) that it did not place there using
\word{toR}, \word{2toR} or \word[tools]{NtoR};
\item A program shall not access from within a do-loop values
placed on the return stack before the loop was entered;
\item All values placed on the return stack within a do-loop
shall be removed before \word{I}, \word{J}, \word{LOOP},
\word{+LOOP}, \word{UNLOOP}, or \word{LEAVE} is executed;
\item All values placed on the return stack within a definition
shall be removed before the definition is terminated or
before \word{EXIT} is executed.
\end{itemize}
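
The sketch below shows one use of the return stack that stays within
these restrictions, followed (as a comment only) by a definition that
violates the second one. The names \texttt{third} and \texttt{bad}
are hypothetical.
\begin{verbatim}
: third ( x1 x2 x3 -- x1 x2 x3 x1 )
   >R OVER R> SWAP ;    \ x3 is parked on the return stack and
                        \ removed before the definition ends
1 2 3 third  . . . .    \ displays 1 3 2 1

\ Non-conforming: R@ inside the loop reads a value that was
\ placed on the return stack before the loop was entered.
\ : bad ( x -- )  >R  3 0 DO  R@ .  LOOP  R> DROP ;
\end{verbatim}
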
\subsection{Operator terminal} % 3.2.4
See \xref[Exclusions]{intro:exclusions}.
\subsubsection{User input device} % 3.2.4.1
\label{usage:input}
The method of selecting the user input device is implementation
defined.
The method of indicating the end of an input line of text is
implementation defined.
\subsubsection{User output device} % 3.2.4.2
\label{usage:output}
The method of selecting the user output device is implementation
defined.
\subsection{Mass storage} % 3.2.5
\label{usage:mass}
A system need not provide any standard words for accessing mass
storage.
\subsection{Environmental queries} % 3.2.6
\label{usage:env}
The name spaces for \word{ENVIRONMENTq} and definitions are
disjoint. Names of definitions that are the same as
\word{ENVIRONMENTq} strings shall not impair the operation of
\word{ENVIRONMENTq}. Table \ref{table:env} contains
the valid input strings and corresponding returned value for
inquiring about the programming environment with
\word{ENVIRONMENTq}.
\begin{table}[ht]
\begin{center}
\caption{Environmental Query Strings}
\label{table:env}
\begin{tabular}{p{11em}rcp{0.42\textwidth}}
\hline\hline
\multicolumn{2}{l}{String \hfill Value data type} & Constant? & Meaning \\
\hline
\texttt{/COUNTED-STRING} & \emph{n} & yes
& maximum size of a counted string, in characters \\
\texttt{/HOLD} & \emph{n} & yes
& size of the pictured numeric output string buffer,
in characters \\
\texttt{/PAD} & \emph{n} & yes
& size of the scratch area pointed to by \word{PAD},
in characters \\
\texttt{ADDRESS-UNIT-BITS} & \emph{n} & yes
& size of one address unit, in bits \\
\texttt{FLOORED} & \emph{flag} & yes
& true if floored division is the default \\
\texttt{MAX-CHAR} & \emph{u} & yes
& maximum value of any character in the
implementation-defined character set \\
\texttt{MAX-D} & \emph{d} & yes
& largest usable signed double number \\
\texttt{MAX-N} & \emph{n} & yes
& largest usable signed integer \\
\texttt{MAX-U} & \emph{u} & yes
& largest usable unsigned integer \\
\texttt{MAX-UD} & \emph{ud} & yes
& largest usable unsigned double number \\
\texttt{RETURN-STACK-CELLS} & \emph{n} & yes
& maximum size of the return stack, in cells \\
\texttt{STACK-CELLS} & \emph{n} & yes
& maximum size of the data stack, in cells \\
\hline\hline
\end{tabular}
\end{center}
\end{table}
If an environmental query (using \word{ENVIRONMENTq}) returns
\emph{false} (i.e., unknown) in response to a string, subsequent
queries using the same string may return \emph{true}. If a query
returns \emph{true} (i.e., known) in response to a string,
subsequent queries with the same string shall also return
\emph{true}. If a query designated as constant in the above table
returns \emph{true} and a value in response to a string,
subsequent queries with the same string shall return \emph{true}
and the same value.
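
A query has to allow for both outcomes, since an unknown string
returns only a \emph{false} flag. The definition below is a minimal
sketch using Core words; the name \texttt{.max-n} is hypothetical.
\begin{verbatim}
: .max-n ( -- )
   S" MAX-N" ENVIRONMENT?      \ -- false | n true
   IF  .  ELSE  ." MAX-N is unknown"  THEN ;
\end{verbatim}
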
\subsection{Obsolescent environmental queries} % 3.2.7
\label{usage:obsolete}
\proposal{X:wordset-query}
This standard designates the practice of using \word{ENVIRONMENTq}
to inquire whether a given word set is present as obsolescent. If
such a query, as listed in table \ref{table:obsolete}, returns
\param{true}, the word set is present in the form defined by Forth 94.
As these queries will be withdrawn from future revisions of the
standard, their use in new programs is discouraged.
See \xref{rat:obsolete}.
\newcommand{\query}[2]{% <wordset><wordset name>
\texttt{#1} & \emph{flag} & no & Forth 94 #2 word set present. \\
\texttt{#1-EXT} & \emph{flag} & no & Forth 94 #2 extensions word set present. \\
}
\begin{table}[ht]
\begin{center}
\caption{Obsolescent Environmental Query Strings}
\label{table:obsolete}
\begin{tabular}{p{10em}r@{~~}c@{~~}p{0.55\textwidth}}
\hline\hline
\multicolumn{2}{l}{String \hfill Value data type} & Constant? & Meaning \\
\hline
\texttt{CORE} & \emph{flag} & no
& true if complete core word set of Forth 94 is present \\
& & & (i.e., not a subset as defined in \ref{label:system}) \\
\texttt{CORE-EXT} & \emph{flag} & no
& true if the core extensions word set of Forth 94 is present \\
\query{BLOCK}{block}
\query{DOUBLE}{double number}
\query{EXCEPTION}{exception}
\query{FACILITY}{facility}
\query{FILE}{file}
\query{FLOATING}{floating-point}
\query{LOCALS}{locals}
\query{MEMORY-ALLOC}{memory-allocation}
\query{TOOLS}{programming-tools}
\query{SEARCH-ORDER}{search-order}
\query{STRING}{string}
\hline\hline
\end{tabular}
\end{center}
\end{table}
\ifrelease\else
%\stepcounter{table}
\newcommand{\extension}[3][\empty]{% [+/-]{name}{description}
\let\mod=\relax%
\ifx+#1\let\mod=\uline\fi%
\ifx-#1\let\mod=\sout\fi%
\ifx\mod\relax\else\cbstart\fi%
\parbox[t]{0.3\textwidth}{\mod{\texttt{#2}}}
\quad
\parbox[t]{0.6\textwidth}{\mod{#3}}
\ifx\mod\relax\else\cbend\fi%
\\
}
\newenvironment{extensions}{%
\smallskip
\begin{center}
\rule[6pt]{0.95\textwidth}{0.4pt}\\[-2ex]
\rule[8pt]{0.95\textwidth}{0.4pt}\\[-2ex]
\parbox[t]{0.3\textwidth}{String} \quad
\parbox[t]{0.6\textwidth}{Meaning}
\rule[8pt]{0.95\textwidth}{0.3pt}\\[-2ex]
}{
\rule[6pt]{0.95\textwidth}{0.4pt}\\[-2ex]
\rule[8pt]{0.95\textwidth}{0.4pt}\\[-2ex]
\end{center}
}
%\begin{subtable}
% \caption{Forth 200\emph{x} Extensions}
% \label{ext:2x}
% \begin{extensions}
%\extension[+]{x:2-complement}{two's complement mandated}
%\extension[+]{x:1 chars = 1}{1 address unit is 1 character wide}
%\extension[+]{x:to-f-round}{\word[floating]{StoF} and \word[floating]{DtoF} round to nearest}
% \end{extensions}
%\end{subtable}
\fi
\section{The Forth dictionary} % 3.3 ================================
\label{usage:dict}
Forth words are organized into a structure called the dictionary.
While the form of this structure is not specified by the standard,
it can be described as consisting of three logical parts:
a name space, a code space, and a data space. The logical separation
of these parts does not require their physical separation.
A program shall not fetch from or store into locations outside data
space. An ambiguous condition exists if a program addresses name
space or code space.
\subsection{Name space} % 3.3.1
The relationship between name space and data space is implementation
dependent.
\subsubsection{Word lists} % 3.3.1.1
The structure of a word list is implementation dependent. When
duplicate names exist in a word list, the latest-defined duplicate
shall be the one found during a search for the name.
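As an illustrative sketch of this rule, assume both definitions below are compiled into the same word list. The name \texttt{GREET} is hypothetical, and a system may issue a redefinition warning for the second definition:
\begin{verbatim}
: GREET ( -- ) ." first" ;
: GREET ( -- ) ." second" ;
GREET    \ displays: second
\end{verbatim}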
\subsubsection{Definition names} % 3.3.1.2
\label{usage:names}
Definition names shall contain \{1 {\ldots} 31\} characters.
A system may allow or prohibit the creation of definition names
containing non-standard characters. A system may allow the creation
of definition names longer than 31 characters. Programs with
definition names longer than 31 characters have an environmental
dependency.
\place{ed19}{Defining a name longer than the implementation-defined
limit will throw a -19 (definition name too long) exception.}
Programs that use lower case for standard definition names or depend
on the case-sensitivity properties of a system have an environmental
dependency.
A program shall not create definition names containing non-graphic
characters.
\subsection{Code space} % 3.3.2
The relationship between code space and data space is implementation
dependent.
\subsection{Data space} % 3.3.3
\label{usage:dataspace}
Data space is the only logical area of the dictionary for which
standard words are provided to allocate and access regions of
memory. These regions are: contiguous regions, variables,
text-literal regions, input buffers, and other transient regions,
each of which is described in the following sections. A program may
read from or write into these regions unless otherwise specified.
\subsubsection{Address alignment} % 3.3.3.1
\label{usage:aaddr}
Most addresses are cell aligned (indicated by \emph{a-addr}) or character
aligned (\emph{c-addr}).
\word{ALIGNED}, \word{CHAR+}, and arithmetic operations can alter
the alignment state of an address on the stack. \word{CHAR+} applied
to an aligned address returns a character-aligned address that can
only be used to access characters. Applying \word{CHAR+} to a
character-aligned address produces the succeeding character-aligned
address. Adding or subtracting an arbitrary number to an address can
produce an unaligned address that shall not be used to fetch or
store anything. The only way to find the next aligned address is
with \word{ALIGNED}.
An ambiguous condition exists when memory is accessed using an
address that is not aligned according to the requirements for
the accessed type.
The definitions of \wref{core:CREATE}{CREATE} and
\wref{core:VARIABLE}{VARIABLE} require that the definitions created
by them return aligned addresses.
After definitions are compiled or the word \word{ALIGN} is executed,
the data-space pointer is guaranteed to be aligned.
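The alignment rules above can be sketched as follows. The name \texttt{BUF} is hypothetical, and the last line is left as a comment because it shows the access that the rules make ambiguous:
\begin{verbatim}
CREATE BUF  2 CELLS ALLOT   \ BUF returns an aligned address
42 BUF !                    \ cell access at an aligned address
BUF CHAR+ C@ DROP           \ character access at a character-aligned address
7 BUF CHAR+ ALIGNED !       \ ALIGNED yields an address fit for a cell store
\ BUF CHAR+ @               \ ambiguous: the address may not be cell aligned
\end{verbatim}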
\subsubsection{Contiguous regions} % 3.3.3.2
\label{usage:contiguous}
A system guarantees that a region of data space allocated using
\word{ALLOT}, \word{,} (comma), \word{C,} (c-comma), or
\word{ALIGN} shall be contiguous with the last region allocated
with one of the above words, unless the restrictions in the
following paragraphs apply. The data-space pointer \word{HERE}
always identifies the beginning of the next data-space region to be
allocated. As successive allocations are made, the data-space
pointer increases. A program may perform address arithmetic within
contiguously allocated regions. The last region of data space
allocated using the above operators may be released by allocating a
corresponding negatively-sized region using \word{ALLOT}, subject
to the restrictions of the following paragraphs.
\word{CREATE} establishes the beginning of a contiguous region of
data space, whose starting address is returned by the \word{CREATE}d
definition. This region is terminated by compiling the next
definition.
Since an implementation is free to allocate data space for use by
code, the above operators need not produce contiguous regions of
data space if definitions are added to or removed from the
dictionary between allocations. An ambiguous condition exists if
deallocated memory contains definitions.
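A minimal sketch of contiguous allocation, assuming the current number base is decimal and that no definitions are added between the allocations. The name \texttt{TABLE} is hypothetical:
\begin{verbatim}
CREATE TABLE  10 , 20 , 30 ,   \ three cells, contiguous with one another
TABLE 2 CELLS + @ .            \ address arithmetic in the region; displays: 30
HERE TABLE - 3 CELLS = .       \ displays a true flag
-1 CELLS ALLOT                 \ release the last cell (the one holding 30)
\end{verbatim}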
\subsubsection{Variables} % 3.3.3.3
\label{usage:var}
The region allocated for a variable may be non-contiguous with
regions subsequently allocated with \linebreak \word{,} (comma) or
\word{ALLOT}. For example, in:
\begin{quote}
\word{VARIABLE} X ~ 1 \word{CELLS} \word{ALLOT}
\end{quote}
the region \texttt{X} and the region \word{ALLOT}ted could be
non-contiguous.
Some system-provided variables, such as \word{STATE}, are
restricted to read-only access.
\subsubsection{Text-literal regions} % 3.3.3.4
\label{usage:"literal}
The text-literal regions, specified by strings compiled with
\word{Sq}, \word{Seq} and \word{Cq}, may be read-only.
A program shall not store into the text-literal regions created
by \word{Sq}, \word{Seq} and \word{Cq} nor into any read-only
system variable or read-only transient regions.
A system shall provide at least two transient buffers for use with
\word{Cq}, \word{Sq} and \word{Seq} strings. These buffers shall
be no less than 80 characters in length.
The system should be able to store two strings defined by sequential
use of these words.
RAM-limited systems may have environmental restrictions on the number
of buffers and their lifetimes.
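As a sketch of sequential use, assume a system that provides interpretation semantics for \word{Sq} and at least the two transient buffers described above. Both strings are then still available when they are displayed:
\begin{verbatim}
S" first" S" second" TYPE SPACE TYPE   \ displays: second first
\end{verbatim}
The strings are only read here; storing into either region would violate the restriction above.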
\subsubsection{Input buffers} % 3.3.3.5
\label{usage:inbuf}
The address, length, and content of the input buffer may be
transient. A program shall not write into the input buffer. In the
absence of any optional word sets providing alternative input
sources, the input buffer is either the terminal-input buffer, used
by \word{QUIT} to hold one line from the user input device, or a
buffer specified by \word{EVALUATE}. In all cases, \word{SOURCE}
returns the beginning address and length in characters of the
current input buffer.
The minimum size of the terminal-input buffer shall be 80
characters.
The address and length returned by \word{SOURCE}, the string
returned by \word{PARSE}, and directly computed input-buffer
addresses are valid only until the text interpreter does I/O to
refill the input buffer or the input source is changed.
A program may modify the size of the parse area by changing the
contents of \word{toIN} within the limits imposed by this standard.
For example, if the contents of \word{toIN} are saved before a
parsing operation and restored afterwards, the text that was parsed
will be available again for subsequent parsing operations. The
extent of permissible repositioning using this method depends on the
input source (see \xref[(7.3.3) Block buffer regions]{block:buffers}
and \xref[(11.3.4) Input source]{file:source}).
A program may directly examine the input buffer using its address
and length as returned by \word{SOURCE}; the beginning of the parse
area within the input buffer is indexed by the number in \word{toIN}.
These values remain valid only until the input buffer is refilled or
the input source is changed. An ambiguous condition exists if a
program modifies the contents of the input buffer.
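Saving and restoring the contents of \word{toIN} can be sketched as follows, assuming the helper and the text it parses are on the same line so that the input buffer is not refilled in between. The name \texttt{PEEK-WORD} is hypothetical:
\begin{verbatim}
: PEEK-WORD ( "name" -- )   \ hypothetical helper
   >IN @ >R                 \ save the current parse position
   BL WORD COUNT TYPE       \ parse the next name and display it
   R> >IN ! ;               \ restore it, so the name is parsed again
PEEK-WORD CR                \ displays CR, then CR itself is executed
\end{verbatim}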
\subsubsection{Other transient regions} % 3.3.3.6
\label{usage:transient}
The data space regions identified by \word{PAD}, \word{WORD}, and
\word{num-end} (the pictured numeric output string buffer) may be
transient. Their addresses and contents may become invalid after:
\begin{itemize}
\item a definition is created via a defining word;
\item definitions are compiled with \word{:} or \word{:NONAME};
\item data space is allocated using \word{ALLOT}, \word{,} (comma),
\word{C,} (c-comma), or \word{ALIGN}.
\end{itemize}
The previous contents of the regions identified by \word{WORD} and
\word{num-end} may be invalid after each use of these words. Further,
the regions returned by \word{WORD} and \word{num-end} may overlap in
memory. Consequently, use of one of these words can corrupt a region
returned earlier by a different word. The other words that construct
pictured numeric output strings (\word{num-start}, \word{num}, \word{numS},
\word{HOLD}, \word{HOLDS}, \word[xchar]{XHOLD}) may also modify
the contents of these regions. Words that display numbers may be
implemented using pictured numeric output words. Consequently, \word{d}
(dot), \word{.R}, \word[tools]{.S}, \word[tools]{q}, \word[double]{Dd},
\word[double]{D.R}, \word{Ud}, and \word{U.R} could also corrupt the
regions.
The size of the scratch area whose address is returned by \word{PAD}
shall be at least 84 characters. The contents of the region
addressed by \word{PAD} are intended to be under the complete
control of the user: no words defined in this standard place
anything in the region, although changing data-space allocations as
described in \xref[Contiguous regions]{usage:contiguous} may change
the address returned by \word{PAD}. Non-standard words provided by
an implementation may use \word{PAD}, but such use shall be
documented.
The size of the region identified by \word{WORD} shall be at least
33 characters.
The size of the pictured numeric output string buffer shall be at
least $(2 \times n) + 2 $ characters, where $n$ is the number of
bits in a cell. Programs that consider it a fixed area with
unchanging access parameters have an environmental dependency.
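Because the pictured numeric output string buffer is transient, a program that needs to keep a converted number can copy it elsewhere first. The following sketch copies it to \word{PAD}, assuming the current number base is decimal. The name \texttt{U>PAD} is hypothetical:
\begin{verbatim}
: U>PAD ( u -- c-addr len )   \ hypothetical helper
   0 <# #S #>                 \ convert u in the transient numeric buffer
   >R PAD R@ CHARS MOVE       \ copy the string to PAD
   PAD R> ;                   \ return the copy
1234 U>PAD  5678 .  TYPE      \ displays: 5678 1234
\end{verbatim}
The intervening number display may overwrite the numeric buffer, but no standard word places anything in the region addressed by \word{PAD}, so the copy survives.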
\section{The Forth text interpreter} % 3.4 ==========================
\label{usage:command}
Upon start-up, a system shall be able to interpret, as described
by \wref{core:QUIT}{QUIT}, Forth source code received interactively
from a user input device.
Such interactive systems usually furnish a ``prompt'' indicating
that they have accepted a user request and acted on it. The
implementation-defined Forth prompt should contain the word ``OK''
in some combination of upper or lower case.
Text interpretation (see \wref{core:EVALUATE}{EVALUATE} and
\wref{core:QUIT}{QUIT}) shall repeat the following steps until
either the parse area is empty or an ambiguous condition exists:
\begin{enumerate}