BBC BASIC programs are tokenised: BASIC keywords are stored as one- or two-byte values. This results in programs which execute faster and are more compact.
A tokenised line can easily be detokenised, or expanded, as there is a one-to-one mapping between token values and the expanded string. For example, code similar to the following would expand a tokenised line:
    quote%=FALSE
    REPEAT
      IF ?addr%<128 OR quote% THEN VDU ?addr% ELSE P.token$(?addr%);
      IF ?addr%=34 quote%=NOT quote%
      addr%=addr%+1
    UNTIL ?addr%=13
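The same loop can be sketched outside BASIC. The Python below is only an illustration of the expansion rule described above: bytes below 128, and everything inside quotes, are copied through, and anything else is looked up in a token table. The table here is a tiny subset and its values are assumptions to be checked against a full token list. Line-number tokens and two-byte tokens are not handled.

```python
# Illustrative subset of a token table (assumed values; check a full token list).
TOKENS = {0xF1: "PRINT", 0xEE: "ON", 0xE5: "GOTO", 0xF4: "REM"}

def detokenise(line: bytes, tokens=TOKENS) -> str:
    """Expand one tokenised line, stopping at the <cr> (13) terminator."""
    out = []
    in_quotes = False
    for b in line:
        if b == 13:                      # end of line
            break
        if b < 128 or in_quotes:
            out.append(chr(b))           # plain character, or inside a string
        else:
            # Unknown tokens are shown as a hex placeholder instead of failing.
            out.append(tokens.get(b, "<&%02X>" % b))
        if b == 34:                      # a double quote toggles string state
            in_quotes = not in_quotes
    return "".join(out)
```

For example, `detokenise(b'\xf1"A\xf1"\r')` expands the first &F1 to PRINT but leaves the second one, inside the string, as a literal character.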
Tokenising, however, is more fiddly. Tokens can be abbreviated on entry and characters are only tokenised at certain parts of the line. For instance, in the following line:
ON NOON GOTO 1,2
the first 'ON' is the token ON, but the second 'ON' is part of the variable 'NOON'. The second 'ON' must be left untokenised.
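The 'NOON' case can be illustrated with a deliberately simplified tokeniser, sketched here in Python. It assumes one rule only: a keyword is recognised where a word starts, and if no keyword matches at that point the whole identifier is copied untouched. The keyword table is a small subset with assumed token values. Abbreviations, REM and DATA handling, and the special encoding of line numbers after GOTO are all left out, so this is not a substitute for BASIC's own tokeniser.

```python
# Illustrative subset of keywords (assumed token values; check a full token list).
KEYWORDS = {"ON": 0xEE, "GOTO": 0xE5, "PRINT": 0xF1}

def tokenise(text: str, keywords=KEYWORDS) -> bytes:
    """Simplified tokeniser: keywords match only where an identifier could start."""
    out = bytearray()
    i = 0
    in_quotes = False
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
        if in_quotes or not ch.isalpha():
            out.append(ord(ch))          # strings, digits, punctuation: copy as-is
            i += 1
            continue
        # Try the longest keyword that starts at this position.
        match = max((k for k in keywords if text.startswith(k, i)),
                    key=len, default=None)
        if match:
            out.append(keywords[match])
            i += len(match)
        else:
            # No keyword here, so this is a variable name: copy it whole,
            # which is why the 'ON' inside 'NOON' is never examined.
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] in "_%$"):
                j += 1
            out += text[i:j].encode("latin-1")
            i = j
    return bytes(out)
```

With this sketch, `tokenise("ON NOON GOTO 1,2")` turns the leading ON and the GOTO into single bytes while 'NOON' passes through unchanged.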
The EVAL function tokenises the supplied string and evaluates it as an
expression. Usefully, the tokenised string can be retrieved from where BASIC
has stored it.
In 6502 BASIC:
    A%=EVAL("0:"+A$)
    token$=$((!4 AND &FFFF)-LENA$-1)
In Z80 BASIC:
In 32000 BASIC:
In PDP-11 BASIC:
In ARM BASIC:
    SYS "XOS_GenerateError",0,STRING$(255,"*") TO ,A%
    B%=EVAL("0:"+A$)
    token$=$(A%-14)
(There is an official but unwieldy tokenising routine named MATCH
available from the call table provided by the
In DOS BASIC:
In Windows BASIC:
By preceding the code you want to tokenise with "0:" you can safely pass it
to EVAL without provoking a Syntax error. You can then extract the tokenised
code from memory, so long as you do it immediately after calling EVAL.
In later versions of ARM BASIC the stack has an extra word on it and the string is stored lower in memory, as it is in later versions of 6502 BASIC. The following functions are written to take this into account.
In Z80 BASIC the string buffer is in a different location in different
versions. When machine code is entered with USR, IX is set pointing to the
string buffer, and this can be used to find it.
These can be written as functions as follows:
    DEFFNTokenise_65(A$):LOCAL A%,B%
    A%=(!4AND&FFFF)-LENA$-1
    B%=EVAL("0:"+A$):=$A%
    :
    DEFFNTokenise_Z80(A$):LOCAL A%,P%:Tokenise_Z80%=Tokenise_Z80%
    IF Tokenise_Z80%=0:DIM A% 4:!A%=&D9E1E5DD:A%?4=&C9:Tokenise_Z80%=USRA%
    A%=EVAL("0:"+A$):=$(Tokenise_Z80%-254)
    :
    DEFFNTokenise_32(A$):LOCAL A%
    A%=EVAL("0:"+A$):=$(!&1B2+2)
    :
    DEFFNTokenise_PDP(A$):LOCAL A%
    A%=EVAL("0:"+A$):=$(^@%-254)
    :
    DEFFNTokenise_ARM(A$):LOCAL A%,B%
    SYS "XOS_GenerateError",0,STRING$(255,"*") TO ,A%
    A%!-36=0:B%=EVAL("0:"+A$):=$(A%-14+4*(A%!-36<>0))
    :
    DEFFNTokenise_DOS(A$):LOCAL A%
    A%=EVAL("0:"+A$):=$&102
    :
    DEFFNTokenise_Win(A$):LOCAL A%,B%
    WHILELEFT$(A$,1)=" ":A$=MID$(A$,2):ENDWHILE
    B%=EVAL("0:"+A$):=$(!332+2)
    :
These functions are used in full in the 'Tokenise' BASIC library at mdfs.net.
A text file can then be tokenised using the following code, which uses the
'FileIO' library functions:
    in%=OPENIN(text$)
    out%=OPENOUT(basic$)
    line%=10                                 :REM Start from an arbitrary line number
    REPEAT
      line$=FNTokenise_65(FNrd(in%))         :REM Read line and tokenise it
      BPUT#out%,13                           :REM Output <cr>
      BPUT#out%,line%DIV256:BPUT#out%,line%  :REM Output line number
      BPUT#out%,LENline$+4                   :REM Output line length
      PROCwr(out%,line$)                     :REM Output line
      line%=line%+10                         :REM Increment line number
    UNTIL EOF#in%
    BPUT#out%,13:BPUT#out%,&FF               :REM Output program terminator
    CLOSE#out%:out%=0
    CLOSE#in%:in%=0
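The byte layout that this loop writes can be seen more directly in a short Python sketch. It is only an illustration of the format produced by the code above: each line is stored as <cr>, line number high byte, line number low byte, a length byte equal to the body length plus 4, then the tokenised body, and the program ends with <cr> followed by &FF. The function name and its parameters are hypothetical, and the line bodies are assumed to be already tokenised.

```python
def build_program(lines, first=10, step=10):
    """Wrap already-tokenised line bodies in the program file layout.

    lines: iterable of bytes objects, one tokenised body per line.
    first, step: line numbering, matching the arbitrary 10, 20, 30... above.
    """
    out = bytearray()
    number = first
    for body in lines:
        if len(body) + 4 > 255:
            raise ValueError("tokenised line too long for a one-byte length")
        out.append(13)                   # <cr> starts every line
        out.append(number // 256)        # line number, high byte first
        out.append(number % 256)         # line number, low byte
        out.append(len(body) + 4)        # length includes the 4 header bytes
        out += body                      # the tokenised text itself
        number += step
    out += bytes([13, 0xFF])             # program terminator
    return bytes(out)
```

Calling `build_program([b"\xf1"])` yields seven bytes: the four-byte header for line 10, the single body byte, and the two-byte terminator.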