Suppose I have a CSV file:
"col1","col2","col3"
"col4","col5,subtext","col6"
The problem I'm running into is the following:
cut -d, -f1,2 test.txt
"col1","col2"
"col4","col5
The desired output is:
"col1","col2"
"col4","col5,subtext"
Answer 1
The Text::ParseWords module that ships with Perl covers this quite elegantly. Example below.
$ perl -MText::ParseWords -nE '@a=quotewords ",",1,$_;say $a[0],",",$a[1]' <test.txt
"col1","col2"
"col4","col5,subtext"
$
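To make the role of the arguments more concrete: quotewords takes a delimiter pattern, a keep flag, and the lines to split. With the flag set to 1, as above, the quotes stay in the output; with 0 they are stripped. A minimal sketch of the second case, assuming perl is on the PATH; the helper name first_two_unquoted and the | output separator are only illustrative:

```shell
# first_two_unquoted: hypothetical helper, reads CSV lines on stdin.
# keep flag 0 tells quotewords to drop the surrounding double quotes.
first_two_unquoted() {
    perl -MText::ParseWords -lne '@a = quotewords(",", 0, $_); print join("|", @a[0,1])'
}

printf '%s\n' '"col4","col5,subtext","col6"' | first_two_unquoted
# col4|col5,subtext
```

The -l switch chomps each input line, so a trailing newline never ends up in the last field.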
Answer 2
If you have gawk v4 available, there is a nice solution for parsing CSV with awk while ignoring commas inside fields.
Example:
gawk -vFPAT='[^,]*|"[^"]*"' '{print $1 "," $2}' test.txt
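FPAT describes what a field looks like rather than what separates fields, so the same pattern should also cope with lines that mix quoted and unquoted fields. A rough sketch of that case, assuming gawk 4.0 or later is installed under the name gawk; first_two_fpat is a made-up helper name:

```shell
# first_two_fpat: hypothetical helper, reads CSV lines on stdin.
# A field is either a run of non-comma characters or a double-quoted string;
# for a quoted field the quoted alternative is the longer match and is chosen.
first_two_fpat() {
    gawk -v FPAT='[^,]*|"[^"]*"' '{ print $1 "," $2 }'
}

# Usage: printf '%s\n' 'a,"b,c",d' | first_two_fpat
```

This still assumes that quoted fields contain no embedded double quotes.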
Answer 3
Another perl solution, assuming all fields are quoted:
$ perl -F'/"\K,(?=")/' -lane 'print "$F[0],$F[1]"' test.txt
"col1","col2"
"col4","col5,subtext"
-F'/"\K,(?=")/'
the field separator is a comma, but only when that comma is preceded by " and followed by "
print "$F[0],$F[1]"
prints the first two fields, separated by ,
grep can also be used:
$ grep -oE '^"[^"]*","[^"]*"' test.txt
"col1","col2"
"col4","col5,subtext"
If N fields are needed, use grep -oE '^("[^"]*",){1}"[^"]*"' with the number inside {} set to N-1.
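The N-1 rule can be wrapped in a small function so the repetition count is computed rather than typed. A minimal sketch, assuming every field is double-quoted and contains no embedded double quotes; first_n_fields is a hypothetical name:

```shell
# first_n_fields N: print the first N quoted fields of each line on stdin.
first_n_fields() {
    n=$1
    # N fields = (N-1) repetitions of '"field",' followed by one final '"field"'
    grep -oE "^(\"[^\"]*\",){$((n - 1))}\"[^\"]*\""
}

printf '%s\n' '"col1","col2","col3"' '"col4","col5,subtext","col6"' | first_n_fields 2
# "col1","col2"
# "col4","col5,subtext"
```

Lines with fewer than N quoted fields produce no output, because the anchored pattern fails to match them.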
Answer 4
You can also try it with awk, as shown below:
awk -F'","' '{printf "%s\",\"%s\"\n", $1, $2 }' test.txt
For example:
user@host$ awk -F'","' '{printf "%s\",\"%s\"\n", $1, $2 }' test.txt
"col1","col2"
"col4","col5,subtext"
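It may help to see what awk actually stores in each field when the separator is the three-character string ",": the quotes between fields are consumed, while the first field keeps its opening quote and the last field keeps its closing quote. That is why the printf above puts \",\" back between $1 and $2 and adds a closing quote. A small sketch; show_fields is a made-up helper name:

```shell
# show_fields: hypothetical helper that numbers each field awk sees
# when splitting stdin on the literal string ","
show_fields() {
    awk -F'","' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

printf '%s\n' '"col4","col5,subtext","col6"' | show_fields
# 1: "col4
# 2: col5,subtext
# 3: col6"
```

Like the grep and perl -F variants, this relies on every field being quoted.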