Convert the character encoding of a CSV file to UTF-8

Question

$ iconv -f windows-1252 -t utf-8 linkedin_contacts.csv
.
.
.
"","Ahmet XXXXX","","??
iconv: linkedin_contacts.csv:665:23: cannot convert
$ cat linkedin_contacts.csv|grep Ahmet|hexdump -C| sed -n '1,2p'
00000000  22 22 2c 22 41 68 6d 65  74 20 53 61 6c 69 68 22  |"","Ahmet XXXXX"|
00000010  2c 22 22 2c 22 3f 3f 8d  65 6e 22 2c 22 22 2c 22  |,"","??.en","","|
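In the dump above, the byte 8d sits at offset 0x17 (decimal 23) of the matching line, which agrees with the 23 in iconv's message. The sketch below therefore assumes that the message means line 665, 0-based byte offset 23; that reading is inferred from this one dump. `byte_at` is a hypothetical helper name.

```shell
# Print the hex value of a single byte in a file.
#   $1 = file, $2 = 1-based line number, $3 = 0-based byte offset in that line
# Assumption: iconv's "LINE:COLUMN" pair uses this numbering.
byte_at() {
    # LC_ALL=C makes sed treat the input as raw bytes, not multibyte text.
    # od skips $3 bytes (-j), dumps exactly one byte (-N 1) in hex (-tx1),
    # and -An suppresses the address column.
    hex=$(LC_ALL=C sed -n "${2}p" "$1" | od -An -tx1 -j "$3" -N 1 | tr -d ' \n')
    printf '%s\n' "$hex"
}
```

With the file from the question, `byte_at linkedin_contacts.csv 665 23` should print `8d` if that assumption holds.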

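I looked up the value 8d in an ASCII table, and it appears to come from ISO 8859-1. The byte 8d is unassigned in windows-1252, which would explain why the first conversion stops there, whereas iconv's ISO-8859-1 maps every byte value. Running iconv --list | grep 8859-1 confirms that iconv can handle that encoding.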
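Rather than looking bytes up by hand, one could let iconv try several candidate encodings and report which ones convert without error. This is only a sketch: `try_encodings` is a hypothetical helper, and the candidate names must be ones that `iconv --list` shows on your system.

```shell
# List the candidate source encodings that iconv converts without error.
# Usage: try_encodings FILE ENCODING...
try_encodings() {
    file=$1
    shift
    for enc in "$@"; do
        # Discard the converted text; only iconv's exit status matters here.
        if iconv -f "$enc" -t UTF-8 "$file" >/dev/null 2>&1; then
            printf '%s\n' "$enc"
        fi
    done
}
```

For example, `try_encodings linkedin_contacts.csv UTF-8 windows-1252 ISO-8859-1`. A clean conversion does not prove the guess is right: ISO-8859-1 accepts any byte sequence, so the result still needs a visual check.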

$ iconv -f ISO-8859-1 -t UTF-8 linkedin_contacts.csv > foo.rb
$ file foo.rb
foo.rb: UTF-8 Unicode text, with very long lines, with CRLF, LF line terminators

Having both line terminators in the file is still a problem for Ruby, but once we chop off the last line, all is well :)

$ sed '$ d' foo.rb > bar.csv
$ file bar.csv
bar.csv: UTF-8 Unicode text, with very long lines, with CRLF line terminators
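To see which lines cause file to report both terminators, the following sketch prints the numbers of the lines that end in a bare LF. `bare_lf_lines` is a hypothetical helper name.

```shell
# Print the numbers of lines that end in a bare LF (no CR before it).
bare_lf_lines() {
    # awk splits records on LF, so a CRLF line still carries its trailing CR.
    # LC_ALL=C keeps awk from choking on bytes that are not valid UTF-8.
    LC_ALL=C awk '!/\r$/ { print NR }' "$1"
}
```

If `bare_lf_lines foo.rb` prints only the number of the last line, that is consistent with the fix above, where deleting that line leaves a file with CRLF terminators only.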
