Crunching BASIC programs
To fit more code into the BBC Micro's limited memory, BASIC programs can be crunched (in modern terms, minified). Crunching reduces the size of a program without changing its meaning, but in the process makes it almost unreadable; it is a form of obfuscation.
Crunching can be done by a utility program that compacts a program in memory, or more recently by sending a text listing to be compacted on another computer and reading it back in. Automated crunchers tend to concentrate on the following areas that yield big and easy savings:
- Removing redundant characters
- Joining lines together (a multi-byte line header is replaced by a colon)
- Shortening variable names.
Utilities are able to remove spaces that were needed during keying but are redundant once the code has been tokenised. As a result, they may produce a program that cannot be typed back in from a listing.
As BASIC is an interpreted language, crunching delivers an overall time saving as well. The suggestions below are targeted at saving space more than time. See Chapter 32 of the B+ User Guide for tips on increasing the speed of a program.
The suggestions in the list below do not apply to the characters within a string constant or a *command. Correctness is not guaranteed!
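The exemption for string constants can be sketched in Python, as it might appear in a cruncher that works on a text listing on another computer. This is only a minimal sketch: the names `split_protected` and `squeeze` are hypothetical, the line is assumed to have had its line number removed already, and *commands, REM and DATA are not handled.

```python
import re

# A BBC BASIC string constant; "" inside a string stands for one quote.
# An unterminated string is taken to run to the end of the line.
STRING_RE = re.compile(r'"(?:[^"]|"")*(?:"|$)')

def split_protected(code):
    """Split code into (text, is_string) pieces, in their original order."""
    pieces = []
    pos = 0
    for m in STRING_RE.finditer(code):
        if m.start() > pos:
            pieces.append((code[pos:m.start()], False))
        pieces.append((m.group(), True))
        pos = m.end()
    if pos < len(code):
        pieces.append((code[pos:], False))
    return pieces

def squeeze(code):
    """Collapse runs of spaces and runs of colons outside string constants."""
    out = []
    for text, is_string in split_protected(code):
        if not is_string:
            text = re.sub(r' {2,}', ' ', text)
            text = re.sub(r':{2,}', ':', text)
        out.append(text)
    return ''.join(out)
```

For example, `squeeze('PRINT  "a  b::"  ::  GOTO  10')` leaves the string constant untouched and only squeezes the code around it.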
- Delete empty lines.
- Delete leading and trailing spaces. A space between the line number and the code is stored in memory but is optional: the space in 1030 ENDPROC could be removed. LIST can reinsert the space when listing.
- Replace multiple spaces with a single space.
- Delete leading and trailing colons.
- Replace multiple colons with a single colon.
- Delete REM program comments and \ assembler comments.
- Eliminate the keyword
- Delete the keyword THEN except before a system variable assignment, unary operator, function return statement, *command or implied-
- NEXT statements don't have to name the control variables. One NEXT statement can terminate several FOR loops, using commas. NEXT X%:NEXT Y% can be replaced with NEXT, where the trailing comma is part of the statement. If the program breaks when the control variable is removed, the FOR...NEXT loops are mis-nested!
- Functions with a single argument, except RND, don't need brackets around the argument. E.g. PRINT CHR$letter%, INKEY100, STR$~code%
- The result of a numeric function can be discarded with IF rather than assigning to a dummy variable: IFGET. This only saves space at the end of a line.
- Replace Teletext CHR$ functions with inline characters in strings, using SHIFT/CTRL and the function keys. The listing cannot be printed and re-typed after this.
- Express very large or very small real constants in scientific format:
- A zero before a decimal point can be eliminated.
- Express integer constants ≥ +1,000,000 as hexadecimal and the rest as decimal.
- VDU sequences may be shorter with some byte constants combined into word constants using semicolons. In particular
- Delete spaces:
- After characters
- Preserve spaces between string constants so that they do not merge.
- Before characters
- Again preserve spaces between string constants.
- Between numbers and other code, but not between two numbers.
- After keywords, but not after real variable names.
- Preserve the space in OPT <keyword>, <assembler mnemonic> TO P if the listing is to be typed in.
- Preserve the space in
- Automated crunchers can remove all spaces immediately before and after tokenised keywords
- After characters
- Replace long variable, procedure and function names with shorter ones.
- Preserve the variable type; don't replace integers with reals or vice versa as rounding errors may result.
- Use the resident integers
Z% for speed, but otherwise it is best to use names with a lowercase character to avoid collisions with keywords.
- Automated crunchers should use all one-character names first, then two character names with the first character 'varying fastest', and so on.
@% is reserved if the program
Y% are reserved if
USR appear in the program (6502 BASIC).
P% are reserved if the program contains assembly language.
y are reserved inside assembly language segments as they are register indicators, not variable names (6502 BASIC).
E should be reserved, or used carefully together with whitespace removal as it may form the exponent of a preceding <num-const>.
- Replace multiple lines with fewer multi-statement lines.
- The longest line that can be typed in is 240 characters including the line number.
- DEF must be at the beginning of a line.
- Don't add to the end of a line containing REM or a *command.
- DEF... statements don't need a colon between them and the next statement.
- DATA strings don't need double quotes unless they contain double quotes, commas or leading spaces.
- When crunching a text listing, keywords can be abbreviated to their minimum forms. See Chapter 48 of the B+ User Guide.
- Use operator precedence rules to find redundant brackets in expressions and remove them.
- Use intermediate variables to reduce the number of repeated sub-expressions.
- Refactor repeated segments of code into a function, procedure or subroutine.
- Find other ways of storing data in the program besides DATA; see Data without DATA.
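The naming order suggested above for automated crunchers (all one-character names first, then two-character names with the first character varying fastest) can be sketched in Python. This is an illustration under stated assumptions, not how any particular cruncher does it: `short_names` and `rename_map` are hypothetical names, only lowercase letters are used, and the `reserved` set has to be supplied by the caller according to the reservation rules in the list.

```python
from itertools import count, product
import string

def short_names(alphabet=string.ascii_lowercase):
    """Yield names shortest first, with the first character varying fastest."""
    for length in count(1):
        # product() varies its LAST element fastest, so reverse each tuple.
        for combo in product(alphabet, repeat=length):
            yield ''.join(reversed(combo))

def rename_map(old_names, reserved=frozenset()):
    """Map each old name to a short one, keeping its % or $ type suffix."""
    generators = {}  # one name sequence per variable type
    mapping = {}
    for old in old_names:
        suffix = old[-1] if old[-1] in '%$' else ''
        gen = generators.setdefault(suffix, short_names())
        new = next(gen) + suffix
        while new in reserved:
            new = next(gen) + suffix
        mapping[old] = new
    return mapping
```

Keeping the suffix preserves the variable type, so an integer is never renamed to a real or the reverse. A name such as `count%` would be mapped to `a%`, or to the next free name if `a%` is in the reserved set.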
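The line-joining rules in the list can be sketched in Python for a cruncher working on a text listing. This is a rough sketch, not a definitive implementation. The names `join_lines`, `is_terminal` and `targets` are hypothetical. Refusing to append after IF, ON or DATA, and keeping referenced line numbers on their own lines, are conservative assumptions added here rather than rules from the list. The keyword test is deliberately naive, so a variable name that happens to contain a keyword only prevents a join.

```python
import re

MAX_LEN = 240  # longest line that can be typed in, including the line number

STRING_RE = re.compile(r'"(?:[^"]|"")*(?:"|$)')
# Nothing is appended after these. REM and *commands come from the list;
# IF, ON and DATA are extra, conservative assumptions of this sketch.
TERMINAL_RE = re.compile(r'REM|IF|ON|DATA|(?:^|:) *\*')

def is_terminal(code):
    """True if further statements should not be appended to this line."""
    return bool(TERMINAL_RE.search(STRING_RE.sub('""', code)))

def join_lines(lines, targets=frozenset()):
    """Join (number, code) pairs into fewer multi-statement lines.

    targets holds line numbers referenced elsewhere (e.g. by GOTO or
    GOSUB); those lines keep their own line number.
    """
    out = []
    for number, code in lines:
        code = code.strip()
        if out:
            prev_number, prev_code = out[-1]
            joined = prev_code + ':' + code
            can_join = (
                number not in targets
                and not code.startswith('DEF')
                and not is_terminal(prev_code)
                and len(str(prev_number)) + len(joined) <= MAX_LEN
            )
            if can_join:
                out[-1] = (prev_number, joined)
                continue
        out.append((number, code))
    return out
```

The length check only matters if the result must be typed back in; a line beginning with DEF always starts a new output line.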
Based on crunch.pl, packaged with EDOSPAT.
-- beardo 19:12, 11 October 2007 (BST)