Difference between revisions of "CSV file reading and writing"

From BeebWiki
Jump to: navigation, search
m (Tweeked semantics of imax%, rmax%.)
m (Tweeked semantics of rmax%, fmax%, corrected error.)
Line 55: Line 55:
 
records containing ''"'' (quote) characters.
 
records containing ''"'' (quote) characters.
  
     REM Read CSV file from in% to array$()
+
     REM Read CSV file from in% to array$(rmax%,fmax%)
     imax%=0:REPEAT
+
     rmax%=0:REPEAT
 
       s$=FNrd(in%)
 
       s$=FNrd(in%)
       FOR record%=0 TO rmax%-1
+
       FOR field%=0 TO fmax%
 
         i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
 
         i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
         IFLEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
+
         IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
 
         i%=0
 
         i%=0
 
         REPEAT
 
         REPEAT
           i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1)
+
           i%=INSTR(r$,"""""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1)
 
         UNTIL i%=0
 
         UNTIL i%=0
         array$(imax%,record%)=r$
+
         array$(rmax%,field%)=r$
       NEXT record%
+
       NEXT field%
       imax%=imax%+1
+
       rmax%=rmax%+1
 
     UNTIL EOF#in%
 
     UNTIL EOF#in%
 +
    rmax%=rmax%-1
 
      
 
      
     REM Write CSV file from array$() to out%
+
     REM Write CSV file from array$(rmax%,fmax%) to out%
     FOR item%=0 TO imax%-1
+
     FOR record%=0 TO rmax%
       FOR record%=0 TO rmax%-1
+
       FOR field%=0 TO fmax%
         r$=array$(item%,record%)
+
         r$=array$(record%,field%)
 
         i%=0
 
         i%=0
 
         REPEAT
 
         REPEAT
           i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,2,i%):i%=i%+1
+
           i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%):i%=i%+1
 
         UNTIL i%=0
 
         UNTIL i%=0
 
         IF INSTR(r$,",")<>0 THEN r$=""""+r$+""""
 
         IF INSTR(r$,",")<>0 THEN r$=""""+r$+""""
         PROCwr(out%,r$+",")
+
         PROCwr(out%,r$):IF field%<>fmax%:BPUT#out%,ASC","
      NEXT record%
+
      NEXT field%
      PTR#out%=PTR#out%-1:BPUT#out%,13
+
     NEXT record%
     NEXT item%
 
  
 
Note that this code is vunerable to ''String too long'' errors if escaping
 
Note that this code is vunerable to ''String too long'' errors if escaping

Revision as of 02:03, 5 April 2017

Comma Separated Value or CSV files are a common format typically used to represent the contents of a spreadsheet or database (or any other 2D matrix of data cells); they can by read and written by programs such as Microsoft Excel and Microsoft Access. The main properties of CSV files are as follows:

  • Data rows are separated by end-of-line sequences.
  • Data columns are separated by commas.
  • Quotation marks must be escaped using the convention "".
  • Data cells containing commas or quotes must be enclosed in quotes.
  • Any data cell may optionally be enclosed in quotes.

Here is an example of the contents of a CSV file:

   1997,Ford,E350,"AC, ABS, moon roof",3000.00
   1999,Chevy,"Venture ""Extended Edition""",,4900.00

For more details see this Wikipedia article.

Records with no quote characters

If you know that no records contain " (quote) characters, or are willing to ignore them, the following code will read and write a CSV file.

   REM Read CSV file from in% to array$(imax%,rmax%)
   imax%=0:REPEAT
     s$=FNrd(in%)
     FOR record%=0 TO rmax%
       i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
       IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
       array$(imax%,record%)=r$
     NEXT record%
     imax%=imax%+1
   UNTIL EOF#in%
   imax%=imax%-1
   
   REM Write CSV file from array$(imax%,rmax%) to out%
   FOR item%=0 TO imax%
     FOR record%=0 TO rmax%
       s$=array$(item%,record%)
       IF INSTR(s$,",")<>0 THEN s$=""""+s$+""""
       PROCwr(out%,s$):IF record%<>rmax%:BPUT#out%,ASC","
     NEXT record%
   NEXT item%

This code uses imax% to hold the maximum item number and rmax% to hold the maximum record number, and PROCwr() to write a string to a file and FNrd() to read a terminated line from a file. PROCwr() and FNrd() are in the StringIO library.

Records containing quote characters

The following code will read and write CSV files if you need to deal with records containing " (quote) characters.

   REM Read CSV file from in% to array$(rmax%,fmax%)
   rmax%=0:REPEAT
     s$=FNrd(in%)
     FOR field%=0 TO fmax%
       i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
       IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
       i%=0
       REPEAT
         i%=INSTR(r$,"""""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1)
       UNTIL i%=0
       array$(rmax%,field%)=r$
     NEXT field%
     rmax%=rmax%+1
   UNTIL EOF#in%
   rmax%=rmax%-1
   
   REM Write CSV file from array$(rmax%,fmax%) to out%
   FOR record%=0 TO rmax%
     FOR field%=0 TO fmax%
       r$=array$(record%,field%)
       i%=0
       REPEAT
         i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%):i%=i%+1
       UNTIL i%=0
       IF INSTR(r$,",")<>0 THEN r$=""""+r$+""""
       PROCwr(out%,r$):IF field%<>fmax%:BPUT#out%,ASC","
     NEXT field%
   NEXT record%

Note that this code is vunerable to String too long errors if escaping embedded characters result in the output string becoming too long.

Jgharston 20:20, 29 August 2008 (BST)