Program format

From BeebWiki
Latest revision as of 15:24, 4 December 2017

BBC BASIC programs are usually stored in a tokenised format. Most tokens are the same across the different versions of BBC BASIC, but the extended tokens differ, and different versions use different line headers. The exceptions are Brandy BASIC (http://jaguar.orpheusweb.co.uk/branpage.html) and the emulator ElectrEm (https://github.com/TomHarte/ElectrEm), which both store BASIC as plain text.

Acorn/Wilson format BASIC is stored as:

{<cr> <linehi> <linelo> <len> <text>} <cr> <ff>
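
The layout above can be walked line by line. The following is a minimal Python sketch, not part of the original article. The function name wilson_lines is hypothetical, and the sketch assumes that <len> counts the whole line including the four header bytes, which the layout above does not state explicitly.

```python
def wilson_lines(data: bytes):
    """Yield (line_number, body) pairs from an Acorn/Wilson format program.

    Assumption: <len> is the distance from one <cr> to the next, so it
    counts the four header bytes as well as the tokenised text.
    """
    pos = 0
    while True:
        if pos + 1 >= len(data) or data[pos] != 0x0D:
            raise ValueError(f"expected <cr> at offset {pos}")
        if data[pos + 1] == 0xFF:
            return  # <cr><ff> marks the end of the program
        if pos + 4 > len(data):
            raise ValueError(f"truncated line header at offset {pos}")
        line_number = (data[pos + 1] << 8) | data[pos + 2]  # <linehi> <linelo>
        length = data[pos + 3]
        if length < 4 or pos + length > len(data):
            raise ValueError(f"bad <len> at offset {pos + 3}")
        yield line_number, data[pos + 4 : pos + length]
        pos += length


# Two hand-built lines followed by the <cr><ff> end marker.
sample = b"\r\x00\x0a\x05\xf1" + b"\r\x00\x14\x08\xf1\"A\"" + b"\r\xff"
for number, body in wilson_lines(sample):
    print(number, body.hex())
```

The body is returned still tokenised; expanding tokens back to keywords is a separate step that depends on the BASIC version.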

Russell format BASIC is stored as:

{<len> <linelo> <linehi> <text> <cr>} <00> <ff> <ff>
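
A corresponding sketch for the Russell layout, again in Python and not from the original article. The name russell_lines is hypothetical, and the sketch assumes that <len> counts the whole line: the length byte itself, the two line-number bytes, the text and the closing <cr>.

```python
def russell_lines(data: bytes):
    """Yield (line_number, body) pairs from a Russell format program.

    Assumption: <len> covers the length byte, both line-number bytes,
    the text and the terminating <cr>, so an empty line has <len> = 4.
    """
    pos = 0
    while pos < len(data):
        length = data[pos]
        if length == 0:
            # A zero length must be followed by <ff><ff> to end the program.
            if data[pos + 1 : pos + 3] != b"\xff\xff":
                raise ValueError(f"bad end marker at offset {pos}")
            return
        if length < 4 or pos + length > len(data):
            raise ValueError(f"bad <len> at offset {pos}")
        if data[pos + length - 1] != 0x0D:
            raise ValueError(f"line at offset {pos} does not end in <cr>")
        line_number = data[pos + 1] | (data[pos + 2] << 8)  # <linelo> <linehi>
        yield line_number, data[pos + 3 : pos + length - 1]
        pos += length
    raise ValueError("missing <00><ff><ff> end marker")


sample = b"\x05\x0a\x00\xf1\r" + b"\x00\xff\xff"
print(list(russell_lines(sample)))
```

Note that the line number is stored low byte first here, the opposite order to the Acorn/Wilson layout.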

BASIC can also be stored as text, as:

{[<text>] [<cr>|<lf>|<cr><lf>|<lf><cr>]}
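
Because any of the four line endings may appear, a reader of the text form has to accept all of them. A minimal Python sketch follows; the name text_lines is hypothetical. It tries the two-byte endings first, so a sequence such as <lf><cr><lf> is read as <lf><cr> followed by <lf>; that choice is an assumption of this sketch, since the sequence is ambiguous.

```python
import re

# Two-byte endings are listed first so they win over the single-byte ones.
_LINE_END = re.compile(rb"\r\n|\n\r|\r|\n")


def text_lines(data: bytes) -> list[bytes]:
    """Split plain-text BASIC into lines, accepting CR, LF, CRLF or LFCR."""
    lines = _LINE_END.split(data)
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after a final line ending
    return lines


print(text_lines(b"10 PRINT \"HELLO\"\r\n20 END\r\n"))
```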

It can be useful to be able to determine what format a BASIC file is stored in. Different file formats have defined RISC OS filetypes and DOS extensions:

Format                  RISC OS filetype                  File extension
Acorn/Wilson            &FFB "BASIC"                      "." or ",ffb"
Russell                 &1C7 "Basic8"                     ".bbc"
Text/Brandy/ElectrEm    &FFF "Text" or &FD1 "BASICtxt"    ".bas"

However, you should never assume a file's format from its filetype or extension. Assuming a BASIC program has been saved normally, and there is no extra data appended to the end of the file, the format can be easily determined by looking at the final few bytes of the file:

<cr><00><ff><ff> - Russell format (Z80, 80x86)
<xx><xx><cr><ff> - Wilson/Acorn format (6502, 32000, ARM, PDP11)
<xx><xx><cr><lf> - text CR/LF terminated
<xx><xx><lf><cr> - text LF/CR terminated
<xx><xx><xx><lf> - text LF terminated (Brandy)
<xx><xx><xx><cr> - text CR terminated
<xx><xx><xx><xx> - unrecognised

The following code will examine the last four bytes of an open file and determine what format it is:

   PTR#in%=EXT#in%-4:FOR A%=0 TO 3:buffer%?A%=BGET#in%:NEXT
   type%=0                                       :REM unknown
   IFbuffer%?3=&0D                      :type%=7 :REM text cr
   IFbuffer%?3=&0A                      :type%=6 :REM text lf
   IF(!buffer% AND &FFFF0000)=&0D0A0000 :type%=5 :REM text lfcr
   IF(!buffer% AND &FFFF0000)=&0A0D0000 :type%=4 :REM text crlf
   IF(!buffer% AND &FFFF0000)=&FF0D0000 :type%=2 :REM 6502 format
   IF!buffer%=&FFFF000D                 :type%=1 :REM 80/86 format

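Here in% is the handle of the open file, buffer% points to a four-byte block of memory reserved with DIM, and A% is a temporary variable. On exit type% holds the detected format code, or 0 if the format was not recognised; this code is not the RISC OS filetype. The file must be at least four bytes long.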
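
For checking files on a machine without BBC BASIC, the same test can be sketched in Python. This is an illustrative port rather than part of the original routine: the names detect_format and detect_file are hypothetical, and the return values simply reuse the type% codes from the BASIC code above. The checks run from most to least specific, which gives the same result as the BASIC version, where later tests override earlier ones.

```python
import os


def detect_format(tail: bytes) -> int:
    """Classify a BASIC file from its final four bytes.

    Returns the same codes as type% in the BASIC routine:
    1 Russell, 2 Acorn/Wilson, 4 text CRLF, 5 text LFCR,
    6 text LF, 7 text CR, 0 unrecognised.
    """
    if len(tail) < 4:
        raise ValueError("need at least four bytes")
    last = tail[-4:]
    if last == b"\r\x00\xff\xff":
        return 1
    if last[2:] == b"\r\xff":
        return 2
    if last[2:] == b"\r\n":
        return 4
    if last[2:] == b"\n\r":
        return 5
    if last[3:] == b"\n":
        return 6
    if last[3:] == b"\r":
        return 7
    return 0


def detect_file(path: str) -> int:
    """Read the last four bytes of the file at path and classify them."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        size = handle.tell()
        handle.seek(max(size - 4, 0))
        return detect_format(handle.read())
```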

Note that in some circumstances it is legitimate to append data to the end of a BBC BASIC program file. In that case this method of determining the format will not work, since the last four bytes of the file will not be the last four bytes of the program.
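
One possible way round this is to follow the line structure from the start of the file instead of inspecting its end. The Python sketch below is an illustration under stated assumptions, not a method given in the original article: the function names are hypothetical, <len> is assumed to count the whole line in both tokenised layouts, and a file is taken to be in the first layout whose structure parses cleanly, which could in principle misjudge an unusual file.

```python
def wilson_end(data: bytes):
    """Offset just past <cr><ff>, or None if data is not Acorn/Wilson format."""
    pos = 0
    while pos + 1 < len(data) and data[pos] == 0x0D:
        if data[pos + 1] == 0xFF:
            return pos + 2
        if pos + 3 >= len(data) or data[pos + 3] < 4:
            return None
        pos += data[pos + 3]  # assumed: <len> spans the whole line
    return None


def russell_end(data: bytes):
    """Offset just past <00><ff><ff>, or None if data is not Russell format."""
    pos = 0
    while pos < len(data):
        length = data[pos]
        if length == 0:
            return pos + 3 if data[pos + 1 : pos + 3] == b"\xff\xff" else None
        if length < 4 or pos + length > len(data) or data[pos + length - 1] != 0x0D:
            return None
        pos += length  # assumed: <len> spans the whole line
    return None


def split_program(data: bytes):
    """Return (format_name, program_bytes, appended_bytes)."""
    for name, find_end in (("Wilson", wilson_end), ("Russell", russell_end)):
        end = find_end(data)
        if end is not None:
            return name, data[:end], data[end:]
    return "unknown", data, b""
```

Anything returned as appended_bytes lies after the program's end marker, so the format is identified even when the final bytes of the file belong to appended data.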

by JGH, May 2006, originally for BBFW Wiki