Difference between revisions of "Crunching BASIC programs"
Revision as of 23:58, 28 August 2013
To fit more code into the BBC Micro's limited memory, BASIC programs can be crunched. Crunching reduces the size of a program without changing its meaning, but in the process makes it almost unreadable -- it is a form of obfuscation.
Crunching can be done by a utility ROM that compacts a program in memory, or more recently by sending a text listing to be compacted on another computer and reading it back in. Automated crunchers tend to concentrate on the following areas that yield big and easy savings:
- Removing redundant characters
- Joining lines together (a multi-byte line header is replaced by a colon)
- Shortening variable names.
Utility ROMs are able to remove spaces that were needed during keying, but are redundant now that the code has been tokenised. So they may produce a program that can't be typed back in from a listing.
As BASIC is an interpreted language, crunching delivers an overall time saving as well. The suggestions below are targeted at saving space more than time. See Chapter 32 of the B+ User Guide for tips on increasing the speed of a program.
Suggestions
The following does not apply to the characters within a string constant or a *command. Correctness is not guaranteed!
Trivial
- Delete empty lines.
- Delete leading and trailing spaces. A space between the line number and the code is stored in memory, but is optional. E.g. 1030 ENDPROC could be replaced by 1030ENDPROC. LISTO 1 makes LIST reinsert the space on the screen.
- Replace multiple spaces with a single space.
- Delete leading and trailing colons.
- Replace multiple colons with a single colon.
- Delete comments.
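The trivial rules above could be applied to a text listing along the following lines. This is only a sketch, not how any particular cruncher works: the function name `tidy_line` is made up for illustration, and it assumes each line holds a line number followed by code with no REM, DATA or *command and no unterminated string. Those cases need separate handling. String constants are left untouched.

```python
import re

def tidy_line(line):
    """Apply the 'trivial' rules to one text-listing line.

    Returns None when no code is left (the line can be deleted),
    otherwise the tidied line. Hypothetical helper: it assumes the
    line contains no REM, DATA, *command or unterminated string.
    """
    m = re.match(r"\s*(\d+)(.*)$", line)
    if m is None:
        return None
    number, code = m.group(1), m.group(2)
    # Split into alternating chunks: even indexes are code,
    # odd indexes are complete string constants.
    chunks = re.split(r'("[^"]*")', code)
    for i in range(0, len(chunks), 2):
        chunks[i] = re.sub(r" {2,}", " ", chunks[i])  # runs of spaces -> one space
        chunks[i] = re.sub(r":{2,}", ":", chunks[i])  # runs of colons -> one colon
    # Leading and trailing spaces/colons, including the space after the number.
    chunks[0] = chunks[0].lstrip(" :")
    chunks[-1] = chunks[-1].rstrip(" :")
    code = "".join(chunks)
    return number + code if code else None

print(tidy_line("1030 ENDPROC"))  # 1030ENDPROC
```

Colons separated by spaces (`: :`) are not merged by this sketch; a fuller version would need to treat them as empty statements.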
Easy
- Eliminate the keyword LET.
- Delete the keyword THEN except before a system variable assignment, unary operator, function return statement, *command or implied-GOTO line number.
- NEXT statements don't have to name the control variables. One NEXT statement can terminate several FOR loops, using commas. NEXT X%:NEXT Y% can be replaced with NEXT, (including the trailing comma). If the program breaks when the control variable is removed, the FOR...NEXT loops are mis-nested!
- Functions with a single argument, except RND, don't need brackets around the argument. E.g. PRINT CHR$letter%, INKEY100, STR$~code%
- The result of a numeric function can be discarded with IF rather than assigning to a dummy variable, e.g. IFGET (this only saves space at the end of a line).
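As an illustration of the NEXT rule above, the rewrite could be automated roughly as follows. This is a sketch under stated assumptions: `crunch_next` is a hypothetical name, the input is a fragment of a text listing that lies outside any string constant, and the control variables are simple names (optionally ending in %).

```python
import re

# Assumed shape of a simple control-variable name, e.g. X%, count, I_2%
_VAR = r"[A-Za-z_][A-Za-z0-9_]*%?"

def crunch_next(code):
    """Drop control-variable names from NEXT and merge adjacent NEXTs."""
    def drop_names(match):
        # NEXT X%,Y% names two loops, so keep one comma per extra loop.
        return "NEXT" + "," * match.group(0).count(",")

    code = re.sub(rf"NEXT *{_VAR}(?: *, *{_VAR})*", drop_names, code)
    # NEXT:NEXT -> NEXT, ; repeat so that three or more loops merge too.
    while True:
        merged = re.sub(r"NEXT(,*) *: *NEXT", r"NEXT,\1", code)
        if merged == code:
            return code
        code = merged

print(crunch_next("NEXT X%:NEXT Y%"))  # NEXT,
```

As the list notes, this is only safe when the loops are properly nested; a program that relies on a named NEXT to unwind a mis-nested loop will break.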
Moderate
- Replace Teletext CHR$ functions with inline characters in strings, using SHIFT/CTRL and the function keys. The listing cannot be printed and re-typed after this.
- Express very large or very small real constants in scientific format: G=6.673E-11
- A zero before a decimal point can be eliminated.
- Express integer constants ≥ +1,000,000 as hexadecimal and the rest as decimal.
- VDU sequences may be shorter with some byte constants combined into word constants using semicolons. In particular 0; replaces 0,0,
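The constant rules above lend themselves to small helper functions. The following is a sketch only; the names `int_constant`, `strip_leading_zero` and `vdu_args` are invented for illustration. `vdu_args` assumes its input is a list of byte values (0 to 255) and that a VDU item followed by a semicolon is sent as a 16-bit word, low byte first, so that two adjacent bytes lo,hi can be written as the single word lo+256*hi.

```python
import re

def int_constant(n):
    """Hexadecimal for integers >= 1,000,000, decimal otherwise."""
    return f"&{n:X}" if n >= 1_000_000 else str(n)

def strip_leading_zero(text):
    """Turn '0.5' into '.5' and '-0.25' into '-.25'."""
    return re.sub(r"^(-?)0+(?=\.\d)", r"\1", text)

def vdu_args(values):
    """Shortest VDU argument text for a list of byte values.

    best[i] holds the shortest text found for values[i:]. At each
    position the byte is either written on its own, or paired with
    the next byte as one word followed by ';'.
    """
    n = len(values)
    best = [""] * (n + 2)
    for i in range(n - 1, -1, -1):
        sep = "" if i == n - 1 else ","       # no comma after the last item
        options = [f"{values[i]}{sep}{best[i + 1]}"]
        if i + 1 < n:
            word = values[i] + 256 * values[i + 1]
            options.append(f"{word};{best[i + 2]}")
        best[i] = min(options, key=len)
    return best[0]

print(int_constant(1000000))   # &F4240
print(vdu_args([0, 0]))        # 0;
```

Pairing is not always a gain: the bytes 22,7 would become the word 1814; which is longer, so `vdu_args` leaves them as separate bytes.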
Tedious
- Delete spaces:
  - After characters "#$%'()*+,-./:;<=>[\]^{|}~
    - Preserve spaces between string constants so that they do not merge.
  - Before characters !"#&'()*+,-/:;<=>?@[\]^{|}~
    - Again preserve spaces between string constants.
  - Between numbers and other code, but not between two numbers.
  - After keywords, but not after real variable names.
    - Preserve the space in END ELSE, ERR OR, GET $, INKEY $, MOD E, OPT <keyword>, <assembler mnemonic> TO P if the listing is to be typed in.
- Replace long variable, procedure and function names with shorter ones.
  - Preserve the variable type; don't replace integers with reals or vice versa as rounding errors may result.
  - Use the resident integers A% to Z% for speed, but otherwise it is best to use names with a lowercase character to avoid collisions with keywords.
  - Automated crunchers should use all one-character names first, then two-character names with the first character 'varying fastest', and so on.
  - @% is reserved if the program PRINTs variables.
  - A%, C%, X% and Y% are reserved if CALL or USR appear in the program (6502 BASIC).
  - O% and P% are reserved if the program contains assembly language.
  - A, X, Y, a, x and y are reserved inside assembly language segments as they are register indicators, not variable names (6502 BASIC).
- Replace multiple lines with fewer multi-statement lines.
  - The longest line that can be typed in is 240 characters including the line number.
  - Remember DATA and DEF must be at the beginning of a line.
  - DATA 1 (newline) DATA 2 becomes DATA 1,2.
  - Don't add to the end of a line containing ELSE, IF, ON, REM or a *command.
  - Keywords [, ELSE, REPEAT and THEN, and DEF... statements don't need a colon between them and the next statement.
- DATA strings don't need double quotes unless they contain double quotes, commas or leading spaces.
- When crunching a text listing, keywords can be abbreviated to their minimum forms. See Chapter 48 of the B+ User Guide.
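The naming order suggested above for automated crunchers (all one-character names first, then two-character names with the first character varying fastest) can be sketched as a generator. The function `short_names` and its parameters are hypothetical, and the sketch only draws names from the alphabet it is given; it does not use digits or underscores in later positions, and it is up to the caller to pass the reserved names that apply to the program.

```python
from itertools import count, islice, product
import string

def short_names(alphabet=string.ascii_lowercase, suffix="", reserved=()):
    """Yield candidate names: shortest first, first character varying fastest.

    suffix is appended to every name ('%' for integers, '$' for strings)
    so that the variable type is preserved; names in reserved are skipped.
    """
    for length in count(1):
        # product() varies its LAST element fastest, so reverse each
        # combination to make the FIRST character vary fastest.
        for combo in product(alphabet, repeat=length):
            name = "".join(reversed(combo)) + suffix
            if name not in reserved:
                yield name

# Resident integers, skipping those reserved when CALL or USR is used.
print(list(islice(short_names(string.ascii_uppercase, "%",
                              {"A%", "C%", "X%", "Y%"}), 3)))
# ['B%', 'D%', 'E%']
```

The default lowercase alphabet follows the advice above about avoiding collisions with keywords; multi-character names built from uppercase letters would need an extra check against the keyword list.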
Difficult
- Use operator precedence rules to find redundant brackets in expressions and remove them.
- Use intermediate variables to reduce the number of repeated sub-expressions.
- Refactor repeated segments of code into a function, procedure or subroutine.
- Find other ways of storing data in the program besides DATA; see Data without DATA.
References
Based on crunch.pl, packaged with EDOSPAT 4.40.
-- beardo 19:12, 11 October 2007 (BST)