Discussion:
[gentoo-user] OT: how does excel find commas within fields of a csv file?
(too old to reply)
Adam Carter
2024-02-28 02:00:01 UTC
Permalink
To clean up csv files I use excel's find/replace to swap the commas
occurring within fields for something benign. How does this magic work?
Different character sets within the same file?

Is it possible to do this with shell scripting?
Jack
2024-02-28 02:20:01 UTC
Permalink
Post by Adam Carter
To clean up csv files I use excel's find/replace to swap the commas
occurring within fields for something benign. How does this magic
work? Different character sets within the same file?
Is it possible to do this with shell scripting?
Once Excel (or LibreOffice) reads in a csv file, the commas are no
longer present, and it just searches within the cells.  It might be
possible for a shell script to do it, but you need to parse the file to
distinguish any commas separating the fields from commas within the
fields.  I'm sure there are plenty of utilities to do this, but it's
certainly not trivial.
Wols Lists
2024-02-28 08:40:02 UTC
Permalink
Post by Jack
Post by Adam Carter
To clean up csv files I use excel's find/replace to swap the commas
occurring within fields for something benign. How does this magic
work? Different character sets within the same file?
Is it possible to do this with shell scripting?
Once Excel (or LibreOffice) reads in a csv file, the commas are no
longer present, and it just searches within the cells.  It might be
possible for a shell script to do it, but you need to parse the file to
distinguish any commas separating the fields from commas within the
fields.  I'm sure there are plenty of utilities to do this, but it's
certainly not trivial.
The other thing is, look up the definition (such as there is) of CSVs.
Special characters (such as commas) can be quoted. Standard practice as
far as I can tell, is that any cell containing a comma will be
double-quoted, and the quotes are stripped on import.

The other trick I learnt is that to prevent Excel mangling text, you
precede it with a single quote - for example I want eg "+7" in a cell,
so I have to enter '+7.

Cheers,
Wol
Adam Carter
2024-02-29 01:30:01 UTC
Permalink
Post by Wols Lists
The other thing is, look up the definition (such as there is) of CSVs.
Special characters (such as commas) can be quoted. Standard practice as
far as I can tell, is that any cell containing a comma will be
double-quoted, and the quotes are stripped on import.
Thanks - looks like quoting is the answer.
Mark Knecht
2024-02-29 03:00:02 UTC
Permalink
Post by Adam Carter
Post by Wols Lists
The other thing is, look up the definition (such as there is) of CSVs.
Special characters (such as commas) can be quoted. Standard practice as
far as I can tell, is that any cell containing a comma will be
double-quoted, and the quotes are stripped on import.
Thanks - looks like quoting is the answer.
It might not be something you want to deal with but pretty much
every Python data analysis and machine learning package has
functions for reading and writing CSV files.

- Mark

Loading...