Filename character mapping

From BeebWiki
Revision as of 05:30, 3 April 2018 by Jgharston (talk | contribs) (See also)

When dealing with "foreign" media formats, such as zip files, DOS or Unix disks and images, etc., filenames can be encountered that contain characters that are not allowed on BBC filing systems, or that have different meanings there. The most common case is that Unix, URLs and Zip files use / as a directory separator whereas the BBC uses ., and conversely use . as an extension separator whereas the BBC uses /.

Apart from DOS/Windows using \ as the directory separator where Unix/URL/ZIP use /, all non-BBC filename characters have the same meanings. The following tables show how different special characters are mapped between BBC and non-BBC platforms.

Non-BBC to BBC

The following characters are legal in DOS/Windows or Unix/URL/ZIP filenames, but are illegal or have special meanings in BBC filenames:

 Non-BBC    Non-BBC          BBC       Replace   Non-BBC      BBC
Character   Meaning        Meaning      with     Example    Example     Notes
---------------------------------------------------------------------------------
    #          -           wildcard       ?      part#3     part?3
    $          -           root dir       <      TEMP.$$$   TEMP/<<<
    %          -           lib dir        ;      TEMP.%%%   TEMP/;;;    see notes
    &          -           user dir       +      one&two    one+two     see notes
    *      wildcard        wildcard       *      info*      info*
    .      extension       directory      /      help.txt   help/txt
    .      current dir     directory      @      ./file     @.file      .\file on DOS
    ..     parent dir      directory      ^      ../menu    ^.menu      ..\file on DOS
    :      drive           drive          :      A:file     :A.file
    ?      wildcard            -          #      page??     page##
    @          -           current dir    =      TEMP.@@@   TEMP/===    see notes
    ^          -           parent dir     >      TEMP.^^^   TEMP/>>>
    
    /      Unix directory  extension   . or $    /dir/file  $.dir.file
    
    \      DOS directory       -       . or $    \dir\file  $.dir.file
    /      DOS -           extension      \      one/two    one\two
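
The path-level rows of the table above can be sketched in Python as follows. This is an illustrative sketch, not the code of any existing tool: the names unix_path_to_bbc and UNIX_TO_BBC_CHARS are hypothetical, drive specifiers are not handled, and characters with no entry in the table are passed through unchanged.

```python
# Per-character replacements from the table above (Unix name -> BBC name).
# The table name is hypothetical; '*' is omitted because it maps to itself.
UNIX_TO_BBC_CHARS = str.maketrans({
    "#": "?", "$": "<", "%": ";", "&": "+",
    ".": "/", "?": "#", "@": "=", "^": ">",
})


def unix_path_to_bbc(path):
    """Convert a Unix-style path to a BBC-style path (illustrative only)."""
    # A leading '/' is the Unix root, which becomes the BBC root '$'.
    parts = ["$"] if path.startswith("/") else []
    for component in path.split("/"):
        if component == "":
            continue                      # leading, trailing or doubled '/'
        if component == ".":
            parts.append("@")             # current directory
        elif component == "..":
            parts.append("^")             # parent directory
        else:
            parts.append(component.translate(UNIX_TO_BBC_CHARS))
    return ".".join(parts)


print(unix_path_to_bbc("/dir/file"))   # $.dir.file
print(unix_path_to_bbc("../menu"))     # ^.menu
print(unix_path_to_bbc("help.txt"))    # help/txt
```

Splitting on the directory separator first keeps the two meanings of . apart: a component that is exactly . or .. is a directory reference, while a . inside a component is an extension separator.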

The simplest bidirectional mapping is:

 Non-BBC   BBC     DOS    BBC     Unix    BBC
-----------------------------------------------
    #  <->  ?       .  ->  /       .  <->  /
    $  <->  <       /  ->  \
    ^  <->  >       \  ->  .
    &  <->  +
    @  <->  =
    %  <->  ;
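
As a minimal sketch, the pairs above (with the Unix column for . and /) can be held in a single translation table. The assumption made here, which the table does not state for every pair, is that each pair is a straight swap, so the same table serves both directions. The names PAIRS, SWAP_TABLE and swap_name are hypothetical.

```python
# The character pairs from the table above as (non-BBC, BBC), using the
# Unix column for '.' and '/'.
PAIRS = [("#", "?"), ("$", "<"), ("^", ">"), ("&", "+"),
         ("@", "="), ("%", ";"), (".", "/")]

# Assumption: each pair is a swap, so one table serves both directions.
SWAP = {}
for foreign, bbc in PAIRS:
    SWAP[foreign] = bbc
    SWAP[bbc] = foreign
SWAP_TABLE = str.maketrans(SWAP)


def swap_name(name):
    """Map a leafname in either direction (the mapping is its own inverse)."""
    return name.translate(SWAP_TABLE)


bbc_name = swap_name("TEMP.$$$")
print(bbc_name)               # TEMP/<<<
print(swap_name(bbc_name))    # TEMP.$$$
```

Because every character is translated in one pass, swapped pairs such as # and ? do not overwrite each other, which chained replace() calls would do.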

BBC to Non-BBC

The following characters are legal in BBC filenames, but are illegal or have special meanings in DOS/Windows or Unix/URL/Zip filenames:

   BBC        BBC       Non-BBC       Replace    BBC      Non-BBC
Character   Meaning     Meaning        with    Example    Example          Notes
--------------------------------------------------------------------------------------
    #       wildcard        -            ?      part#3     part?3
    $       root dir        -            \      $.user     \user or /user
    %       lib dir         -                   %.blib                     see notes
    &       user dir        -                   &.mymail                   see notes
    *       wildcard    wildcard         *      info*      info*
    :       drive       drive            :      :A.file    A:file
    <           -       redirection      $      TEMP/<<<   TEMP.$$$
    >           -       redirection      ^      TEMP/>>>   TEMP.^^^
    ?           -       wildcard         #      why?       why#
    @       current dir     -            .      @.file     .\file  or ./file
    ^       parent dir      -            ..     ^.file     ..\file or ../file
    
    /       extension   Unix directory   .      help/txt   help.txt
    .       directory   Unix extension   /      dir.file   dir/file
    
    .       directory   DOS extension    \      dir.file   dir\file
    /       extension   DOS -            .      help/txt   help.txt
    \           -       DOS directory    /      one\two    one/two

The simplest bidirectional mapping is:

   BBC   Non-BBC   BBC    DOS     BBC    Unix
-----------------------------------------------
    ?  <->  #       .  ->  \       .  <->  /
    <  <->  $       \  ->  /
    >  <->  ^       /  ->  .
    +  <->  &
    =  <->  @
    ;  <->  %
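
The reverse direction can be sketched in the same way. This is an illustration under stated assumptions rather than a definitive implementation: the names bbc_path_to_foreign and BBC_TO_FOREIGN_CHARS are hypothetical, drive specifiers are not handled, and % and & are left unchanged because the table gives no replacement for them.

```python
# Per-character replacements from the table above (BBC name -> foreign name).
# '%' and '&' have no listed replacement, so they are passed through unchanged.
BBC_TO_FOREIGN_CHARS = {"#": "?", "<": "$", ">": "^", "?": "#", "/": "."}


def bbc_path_to_foreign(path, sep="/"):
    """Convert a BBC path to a Unix or DOS path; sep is the target separator."""
    if path == "$":
        return sep                         # the root directory on its own
    chars = dict(BBC_TO_FOREIGN_CHARS)
    if sep == "\\":
        chars["\\"] = "/"                  # DOS rows: BBC '\' becomes '/'
    table = str.maketrans(chars)
    parts = []
    for index, component in enumerate(path.split(".")):
        if component == "$" and index == 0:
            parts.append("")               # root becomes a leading separator
        elif component == "@":
            parts.append(".")              # current directory
        elif component == "^":
            parts.append("..")             # parent directory
        else:
            parts.append(component.translate(table))
    return sep.join(parts)


print(bbc_path_to_foreign("$.user"))           # /user
print(bbc_path_to_foreign("^.file", "\\"))     # ..\file
print(bbc_path_to_foreign("help/txt"))         # help.txt
```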

DOS to/from UNIX

Included for completeness is the mapping between DOS/Windows and Unix/URL/ZIP filenames. Both use the same characters, except that DOS/Windows uses \ for directories and Unix/URL/ZIP uses /.

   DOS       Unix         DOS       Replace    DOS           Unix
Character   Meaning     Meaning      with    Example        Example
--------------------------------------------------------------------------------------
   /       directory       -          \      \dir\file.txt  /dir/file.txt
   \           -       directory      /      one/two.txt    one\two.txt

The bidirectional mapping is:

   DOS    Unix
---------------------
    /  <->  \
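
A minimal Python sketch of this swap, with the hypothetical names SLASH_SWAP and swap_slashes, translates both characters in a single pass so that neither replacement overwrites the other:

```python
# Swap '/' and '\' in one pass; the same table works in both directions.
SLASH_SWAP = str.maketrans("/\\", "\\/")


def swap_slashes(name):
    """Convert a DOS name to a Unix name or vice versa (illustrative only)."""
    return name.translate(SLASH_SWAP)


print(swap_slashes("\\dir\\file.txt"))   # /dir/file.txt
print(swap_slashes("one/two.txt"))       # one\two.txt
```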

Notes

  • BBC & and % would be mapped to something like $HOMEDIR$ and $PATH$.

Implementations

  • RISC OS DOSFS translates bidirectionally between DOS %, &, @ and BBC ;=, ;~, ;=. However, BBC %, &, ; are also mapped to DOS %, &, ;, creating a non-reversible two-to-one mapping.
  • SparkFS Zip only maps bidirectionally between . and /. All other mapping is Zip to BBC only, e.g. a Zip # is extracted as a ?, but a ? is stored back in a Zip unchanged as a ?.
  • BBCZip performs seven bidirectional character mappings: # <-> ?, $ <-> <, ^ <-> >, . <-> /, & <-> +, @ <-> =, % <-> ;.
  • BBC Sprow and Petrov DOSFS have not yet been tested.
  • Other Zip implementations have not yet been tested.
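
The difference between a one-way and a bidirectional mapping can be illustrated with a small Python model of the SparkFS and BBCZip behaviour described above. This is a sketch built only from those descriptions, not from either program: the SparkFS tables include only the # to ? example given above, the BBCZip pairs are assumed to be straight swaps, and all names in the code are hypothetical.

```python
# Model of the behaviour described above; not taken from either program.
DOT_SLASH = {".": "/", "/": "."}

# SparkFS-style: '#' becomes '?' on extraction only (the example given above).
SPARKFS_EXTRACT = str.maketrans({**DOT_SLASH, "#": "?"})
SPARKFS_STORE = str.maketrans(DOT_SLASH)

# BBCZip-style: seven pairs, each assumed to be a swap in both directions.
PAIRS = [("#", "?"), ("$", "<"), ("^", ">"), (".", "/"),
         ("&", "+"), ("@", "="), ("%", ";")]
BBCZIP_MAP = {}
for zip_char, bbc_char in PAIRS:
    BBCZIP_MAP[zip_char] = bbc_char
    BBCZIP_MAP[bbc_char] = zip_char
BBCZIP = str.maketrans(BBCZIP_MAP)


def round_trip(zip_name, extract, store):
    """Extract a Zip member name to a BBC name, then store it back."""
    bbc_name = zip_name.translate(extract)
    return bbc_name, bbc_name.translate(store)


print(round_trip("part#3.txt", SPARKFS_EXTRACT, SPARKFS_STORE))
# ('part?3/txt', 'part?3.txt')   the '#' is lost
print(round_trip("part#3.txt", BBCZIP, BBCZIP))
# ('part?3/txt', 'part#3.txt')   the original name is restored
```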

See also