Difference between revisions of "CSV file reading and writing"
m (Tweeked semantics of imax%, rmax%.) |
m (Tweeked semantics of rmax%, fmax%, corrected error.) |
||
Line 55: | Line 55: | ||
records containing ''"'' (quote) characters. | records containing ''"'' (quote) characters. | ||
− | REM Read CSV file from in% to array$() | + | REM Read CSV file from in% to array$(rmax%,fmax%) |
− | + | rmax%=0:REPEAT | |
s$=FNrd(in%) | s$=FNrd(in%) | ||
− | FOR | + | FOR field%=0 TO fmax% |
i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1) | i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1) | ||
− | + | IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2) | |
i%=0 | i%=0 | ||
REPEAT | REPEAT | ||
− | i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1) | + | i%=INSTR(r$,"""""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1) |
UNTIL i%=0 | UNTIL i%=0 | ||
− | array$( | + | array$(rmax%,field%)=r$ |
− | NEXT | + | NEXT field% |
− | + | rmax%=rmax%+1 | |
UNTIL EOF#in% | UNTIL EOF#in% | ||
+ | rmax%=rmax%-1 | ||
− | REM Write CSV file from array$() to out% | + | REM Write CSV file from array$(rmax%,fmax%) to out% |
− | FOR | + | FOR record%=0 TO rmax% |
− | FOR | + | FOR field%=0 TO fmax% |
− | r$=array$( | + | r$=array$(record%,field%) |
i%=0 | i%=0 | ||
REPEAT | REPEAT | ||
− | i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$ | + | i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%):i%=i%+1 |
UNTIL i%=0 | UNTIL i%=0 | ||
IF INSTR(r$,",")<>0 THEN r$=""""+r$+"""" | IF INSTR(r$,",")<>0 THEN r$=""""+r$+"""" | ||
− | PROCwr(out%,r$ | + | PROCwr(out%,r$):IF field%<>fmax%:BPUT#out%,ASC"," |
− | + | NEXT field% | |
− | + | NEXT record% | |
− | NEXT | ||
Note that this code is vunerable to ''String too long'' errors if escaping | Note that this code is vunerable to ''String too long'' errors if escaping |
Revision as of 01:03, 5 April 2017
Comma Separated Value or CSV files are a common format typically used to represent the contents of a spreadsheet or database (or any other 2D matrix of data cells); they can by read and written by programs such as Microsoft Excel and Microsoft Access. The main properties of CSV files are as follows:
- Data rows are separated by end-of-line sequences.
- Data columns are separated by commas.
- Quotation marks must be escaped using the convention "".
- Data cells containing commas or quotes must be enclosed in quotes.
- Any data cell may optionally be enclosed in quotes.
Here is an example of the contents of a CSV file:
1997,Ford,E350,"AC, ABS, moon roof",3000.00 1999,Chevy,"Venture ""Extended Edition""",,4900.00
For more details see this Wikipedia article.
Records with no quote characters
If you know that no records contain " (quote) characters, or are willing to ignore them, the following code will read and write a CSV file.
REM Read CSV file from in% to array$(imax%,rmax%) imax%=0:REPEAT s$=FNrd(in%) FOR record%=0 TO rmax% i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1) IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2) array$(imax%,record%)=r$ NEXT record% imax%=imax%+1 UNTIL EOF#in% imax%=imax%-1 REM Write CSV file from array$(imax%,rmax%) to out% FOR item%=0 TO imax% FOR record%=0 TO rmax% s$=array$(item%,record%) IF INSTR(s$,",")<>0 THEN s$=""""+s$+"""" PROCwr(out%,s$):IF record%<>rmax%:BPUT#out%,ASC"," NEXT record% NEXT item%
This code uses imax%
to hold the maximum item number
and rmax%
to hold the maximum record number, and
PROCwr()
to write a string to a file and
FNrd()
to read a terminated line from a file.
PROCwr()
and FNrd()
are in the
StringIO library.
Records containing quote characters
The following code will read and write CSV files if you need to deal with records containing " (quote) characters.
REM Read CSV file from in% to array$(rmax%,fmax%) rmax%=0:REPEAT s$=FNrd(in%) FOR field%=0 TO fmax% i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1) IF LEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2) i%=0 REPEAT i%=INSTR(r$,"""""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1) UNTIL i%=0 array$(rmax%,field%)=r$ NEXT field% rmax%=rmax%+1 UNTIL EOF#in% rmax%=rmax%-1 REM Write CSV file from array$(rmax%,fmax%) to out% FOR record%=0 TO rmax% FOR field%=0 TO fmax% r$=array$(record%,field%) i%=0 REPEAT i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%):i%=i%+1 UNTIL i%=0 IF INSTR(r$,",")<>0 THEN r$=""""+r$+"""" PROCwr(out%,r$):IF field%<>fmax%:BPUT#out%,ASC"," NEXT field% NEXT record%
Note that this code is vunerable to String too long errors if escaping embedded characters result in the output string becoming too long.
Jgharston 20:20, 29 August 2008 (BST)