Merging the following parts of lines into the current line in a 3-column file

Answer 1

Assuming your input is sorted on the word and type fields, as in the sample input you posted:

$ cat tst.awk
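BEGIN { FS=" @@@ "; ORS="" }        # ORS starts empty so no newline precedes the first record
{ curr = $1 FS $2 }                 # key for this line: word plus type
curr != prev {                      # first line of a new word/type group
    printf "%s%s", ORS, $0          # end the previous output line, then start a new one
    prev = curr
    ORS = RS                        # from now on, ORS is a newline
    next
}
{ printf " ;;; %s", $NF }           # same group: append only the sentence
END { print "" }                    # terminate the last output line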

$ awk -f tst.awk file
word0 @@@ type2 @@@ sentence0
word1 @@@ type1 @@@ sentence1 ;;; sentence2 ;;; sentence3
word1 @@@ type2 @@@ sentence4
word2 @@@ type1 @@@ sentence5

The above works with any awk in any shell on every UNIX box, keeps only one line in memory at a time, and produces output in the same order as the input.

Answer 2

Here is one way to do it in awk:

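$ awk -F' @@@ ' '{ if (($1 in a) && ($2 in a[$1])) a[$1][$2] = a[$1][$2] " ;;; " $3; else a[$1][$2] = $3 } END { PROCINFO["sorted_in"] = "@ind_str_asc"; for (word in a) for (type in a[word]) print word FS type FS a[word][type] }' file
word0 @@@ type2 @@@ sentence0
word1 @@@ type1 @@@ sentence1 ;;; sentence2 ;;; sentence3
word1 @@@ type2 @@@ sentence4
word2 @@@ type1 @@@ sentence5

Or, a little more readably:

awk -F' @@@ ' '{
                # Append only when this word/type pair has been seen before
                if (($1 in a) && ($2 in a[$1])) {
                    a[$1][$2] = a[$1][$2] " ;;; " $3
                }
                else {
                    a[$1][$2] = $3
                }
             }
             END {
                 # gawk only: visit indices in ascending string order
                 PROCINFO["sorted_in"] = "@ind_str_asc"
                 for (word in a) {
                     for (type in a[word]) {
                         print word FS type FS a[word][type]
                     }
                 }
             }' file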

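Note that this requires an awk implementation that understands true multidimensional arrays (arrays of arrays), such as GNU awk (gawk), which is the default awk on many Linux systems.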
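For an awk without arrays of arrays, the same grouping can be sketched with a single flat array keyed on word plus type. This is a minimal sketch, not part of either answer above. The function name merge_sentences and the array names merged and order are illustrative, and the sample input is reconstructed from the output shown earlier, so treat it as an assumption. It does not need sorted input, but unlike tst.awk it holds every merged line in memory until the end.

```shell
# merge_sentences is a hypothetical helper name used only for this sketch.
# It reads "word @@@ type @@@ sentence" lines from files or stdin.
merge_sentences() {
    awk -F' @@@ ' '
    {
        key = $1 FS $2                  # word and type joined into one plain string key
        if (key in merged) {
            merged[key] = merged[key] " ;;; " $3
        } else {
            order[++n] = key            # remember first-seen order of each key
            merged[key] = $3
        }
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i] FS merged[order[i]]
    }' "$@"
}

# Sample input reconstructed from the output shown earlier (an assumption).
printf '%s\n' \
    'word0 @@@ type2 @@@ sentence0' \
    'word1 @@@ type1 @@@ sentence1' \
    'word1 @@@ type1 @@@ sentence2' \
    'word1 @@@ type1 @@@ sentence3' \
    'word1 @@@ type2 @@@ sentence4' \
    'word2 @@@ type1 @@@ sentence5' | merge_sentences
```

On that input it prints the same four lines as the tst.awk run, in first-seen order. Only plain one-dimensional arrays are used, so it should run under any POSIX awk.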
