How do I merge two files line by line on matching records with awk/shell?

Answer 1

Using awk:

$ awk 'NR==FNR {a[$1] = $2; next} $1 in a {print $1, $2, a[$1]}' file2.txt file1.txt 
Mary 68 74
Tom 50 26
Jason 45 37

No sorting is needed, and the output follows the order of the second file named on the command line (here, file1.txt).

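Explanation:

NR==FNR  the standard idiom for selecting records from the first file named on the command line
{a[$1] = $2; next}  store the second field in an array, keyed by the first field
$1 in a  if the first field was already seen in the first file,
{print $1, $2, a[$1]}  print the key and value from the second file, followed by the value from the first file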
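
As a rough illustration, the same one-liner can be wrapped in a shell function and spread over several lines with comments. This is only a sketch: the function name merge_by_key and the temporary directory are made up for this example, the sample data is copied from this thread, and mktemp is assumed to be available (it is common but not required by POSIX).

```shell
#!/bin/sh
# merge_by_key LOOKUP MAIN   (hypothetical helper name, not from the thread)
# Prints each line of MAIN whose first field also appears in LOOKUP,
# with the second field of LOOKUP appended. Output follows the order of MAIN.
merge_by_key() {
    awk '
        # First file only: NR (global count) equals FNR (per-file count).
        NR == FNR { a[$1] = $2; next }
        # Second file: keep only keys that were stored above.
        $1 in a   { print $1, $2, a[$1] }
    ' "$1" "$2"
}

# Sample data copied from this thread, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Mary 68' 'Tom 50' 'Jason 45' 'Lu 66'   > "$tmp/file1.txt"
printf '%s\n' 'Jason 37' 'Tom 26' 'Mary 74' 'Tina 80' > "$tmp/file2.txt"

# Same argument order as the one-liner: lookup file first, main file second.
merge_by_key "$tmp/file2.txt" "$tmp/file1.txt"
rm -r "$tmp"
```

Lu and Tina are dropped because each appears in only one of the two files. Note that the NR==FNR test assumes the first file is not empty; with an empty first file the condition would also hold for the second file.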

Answer 2

This is what join, the relational database operator, is for:

join <(sort file1.txt) <(sort file2.txt)

Test:

$ cat file1.txt
Mary 68
Tom 50
Jason 45
Lu 66

$ cat file2.txt
Jason 37
Tom 26
Mary 74
Tina 80

$ join <(sort file1.txt) <(sort file2.txt)
Jason 45 37
Mary 68 74
Tom 50 26

join is a standard tool specified by POSIX.

The man page for join states:

The files file1 and file2 shall be ordered in the collating sequence of sort -b on the 
fields on which they shall be joined, by default the first in each line. All selected 
output shall be written in the same collating sequence.
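
Because join expects both inputs in the same collating sequence that sort produces, it can help to pin the locale for both commands. The sketch below also avoids the <(...) process substitution, which bash supports but plain sh does not, by writing the sorted copies to temporary files. The function name join_sorted is hypothetical, and mktemp is assumed to be available.

```shell
#!/bin/sh
# join_sorted FILE1 FILE2   (hypothetical helper name, not from the thread)
# Sorts both files on the first field and joins them on that field.
join_sorted() {
    dir=$(mktemp -d) || return 1
    # Use one locale for sort and join so both agree on the collating order.
    LC_ALL=C sort -b -k 1,1 "$1" > "$dir/a"
    LC_ALL=C sort -b -k 1,1 "$2" > "$dir/b"
    LC_ALL=C join "$dir/a" "$dir/b"
    rm -r "$dir"
}

# Sample data copied from this thread.
tmp=$(mktemp -d)
printf '%s\n' 'Mary 68' 'Tom 50' 'Jason 45' 'Lu 66'   > "$tmp/file1.txt"
printf '%s\n' 'Jason 37' 'Tom 26' 'Mary 74' 'Tina 80' > "$tmp/file2.txt"

join_sorted "$tmp/file1.txt" "$tmp/file2.txt"
rm -r "$tmp"
```

Unlike the awk approach, the output comes back in sorted key order rather than in the original order of either file.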
