Is there a command-line incantation to delete a column from a CSV file?

Answer 1

I believe this is specific to GNU coreutils:

$ cut --complement -f 3 -d, inputfile
1111,2222,4444
aaaa,bbbb,dddd

Normally you specify the fields you want with -f, but adding --complement reverses the meaning, so the listed fields are the ones removed. From "man cut":

--complement
    complement the set of selected bytes, characters or fields

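One caveat: if any field contains a comma, cut will be thrown off, because cut is not a CSV parser the way a spreadsheet is. Parsers also differ in how they handle escaped commas in CSV. For simple CSV on the command line, cut is still the way to go.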
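To make the caveat concrete, here is a minimal sketch of cut being thrown off by a quoted field. The helper name `drop_third` and the sample row are made up for illustration, and the command assumes GNU cut, since --complement is not available everywhere.

```shell
# Hypothetical helper: drop the third comma-separated field (GNU cut only).
drop_third() {
  cut --complement -f 3 -d,
}

# "Smith, John" is a single CSV field, but cut splits on the comma inside the quotes,
# so the "third field" it removes is the second half of the name.
printf '%s\n' 'id,"Smith, John",dept,city' | drop_third
# prints: id,"Smith,dept,city
```

The dept column survives and the quoted name is cut in half, which is the failure described above.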

Answer 2

If the data consists only of comma-separated fields:

cut -d , -f 1-2,4-

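You can also use awk, but it is a bit awkward: clearing a field is easy, but removing the leftover separator takes some work. If you have no empty fields, it is not too bad: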

awk -F , 'BEGIN {OFS=FS}  {$3=""; sub(",,", ","); print}'
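The sub() trick above leaves a trailing comma when the removed column is the last one, and for a column further right it can remove the wrong comma if an earlier field is empty. One possible way around that, sketched here with a hypothetical helper named `drop_col`, is to rebuild each line from the fields you want to keep. It still assumes plain comma-separated data with no quoted commas.

```shell
# Hypothetical helper: remove one column from comma-separated input on stdin.
# $1 = 1-based number of the column to remove.
drop_col() {
  awk -F, -v col="$1" 'BEGIN { OFS = FS }
  {
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      if (i == col) continue      # skip the unwanted column entirely
      out = out sep $i            # append the field, preceded by a separator if needed
      sep = OFS
    }
    print out
  }'
}

printf 'a,,c,d\n' | drop_col 3
# prints: a,,d
```

Because the line is reassembled rather than patched, empty fields are preserved and no stray separator is left behind.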

If you have real CSV, where commas can appear inside properly quoted fields, you need a real CSV library.

Answer 3

Using a CSV-aware tool to remove the first two columns from a header-less CSV input file:

$ cat file
1111,2222,3333,4444
aaaa,bbbb,cccc,dddd

$ mlr --csv -N cut -x -f 1,2 file
3333,4444
cccc,dddd

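The -x option to the cut operation of Miller (mlr) makes the operation exclude the named fields (in this case, fields number 1 and 2). If the CSV data had headers, we could pass field names to -f instead (and we would then also need to drop the -N option).

Since Miller is CSV-aware, it copes with properly quoted fields containing embedded commas, quotes, and newlines.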
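For the header case mentioned above, a minimal sketch might look like the following. The helper name `drop_named_column` and the column names used in the example are hypothetical; the sketch assumes Miller is installed and that the file starts with a header line.

```shell
# Hypothetical helper: drop a column by its header name; requires Miller (mlr).
drop_named_column() {
  # $1 = header name of the column to remove
  # $2 = CSV file whose first line is a header
  mlr --csv cut -x -f "$1" "$2"
}
```

For example, `drop_named_column city people.csv` would remove a column headed city from a file named people.csv, and a quoted field such as "Smith, John" in another column should pass through as a single field.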

Answer 4

Try the following command to delete a column by its index.

dropColumnCSV --index=0 --file=file.csv

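This works if the columns are separated by commas. As shown below, the function uses sed commands internally to remove the text, editing the file in place.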

dropColumnCSV() {
  # argument check
  while [ $# -gt 0 ]; do
    case "$1" in
      --index=*)
        index="${1#*=}"
        ;;
      --file=*)
        file="${1#*=}"
        ;;
      *)
        printf "* Error: Invalid argument. *\n"
        return 1
    esac
    shift
  done

  # file check (quoted so that paths containing spaces work)
  if [ ! -f "$file" ]; then
        printf '* Error: %s not found.*\n' "$file"
        return 1
  fi

  # sed remove command index zero
  if [[ $index == 0 ]]; then
    sed -i 's/\([^,]*\),\(.*\)/\2/' "$file"

  # sed remove command index greater than zero (numeric comparison)
  elif [[ $index -gt 0 ]]; then
    # build "[^,]*,[^,]*,..." matching the $index columns before the target
    pos_str=$(for i in $(seq "$index"); do printf '%s' '[^,]*,'; done | sed 's/,$//')
    sed -i 's/^\('"$pos_str"'\),[^,]*/\1/' "$file"
  fi
}
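If editing the file in place is not wanted, the same zero-based index interface can be sketched on top of cut. This is an alternative to the sed approach above, not a rewrite of it: the helper name `drop_column_by_index` is hypothetical, it assumes GNU cut for --complement, and it writes the result to standard output instead of modifying the file.

```shell
# Hypothetical helper: print a CSV file without one column.
# $1 = 0-based column index, $2 = path to a simple comma-separated file.
drop_column_by_index() {
  # cut numbers fields from 1, so shift the 0-based index by one
  cut -d, --complement -f "$(( $1 + 1 ))" -- "$2"
}
```

A call such as `drop_column_by_index 0 file.csv > trimmed.csv` leaves file.csv untouched. Like the sed version, it only suits data without quoted commas.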
