我有兩個以逗號分隔的 csv 檔案(section_ne_lookup.csv 和section_fru_out.csv):
-sh-4.1$ cat section_ne_lookup.csv
CIBI_NMS_NE,,,,,,,,,,,,,,,,,,,,,,,
D,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w
X,NE1,1.1.1.1,,833-006355,00,Windows,,,9.4.04,301035,,,,,,,,,,,,,
X,NE2,2.2.2.2,,833-006352,00,XS-A,,,5.1,51.573,,,,,,,,,,,,,
X,80,3.3.3.3,,833-006366,00,XS-F,,,5.2,52.604,,,,,,,,,,,,,
X,83,4.4.4.4,,833-006366,00,XS-F,,,5.3,53.575,,,,,,,,,,,,,
X,85,5.5.5.5,,833-006352,00,XS-A,,,5.3,53.605,,,,,,,,,,,,,
X,87,6.6.6.6,,Dummy_EPLC60,Dummy_EPLR60,XS-A,,,5.3,53.543,,,,,,,,,,,,,
-sh-4.1$ cat section_fru_out.csv
CIBI_NMS_FRU,,,,,,,,,,,,,,,,,,,,,,,
D,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w,x,y
X,UNKNOWN,0,NE2,,,,NE2:klm11,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm12,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm13,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm14,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm15,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm16,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm2,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm3,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm4,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm5,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm6,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm7,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm8,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm9,,,,UNKNOWN,,,,,,,0,,,,,,
X,abc,0,83,,,,83:klm1,,,,rty,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm2,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm3,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm4,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm5,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm6,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm7,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm8,,,,UNKNOWN,,,,,,,0,,,,,,
X,TBD,1,NE2,,,,NE2:klm4.asd1,,,,bnm,,,,,,,0,,,,,,0000
X,TBD,1,80,,,,80:.klm1.asd1,,,,bnm,,,,,,,0,,,,,,3132
X,abc,1,87,,,,87:ab1,,,,,,,,,,,0,,,,,,xyz-C
X,abc,0,87,,,,87:abA,,,,bnm,,,,,,,0,,,,,,xyz-A
我想要做的是:如果section_ne_lookup.csv的第五或第六列以“Dummy_”短語開頭,則獲取匹配行第二列的值。並檢查此取得的值(section_ne_lookup.csv 中符合行的第二列)是否存在於section_fru_out.csv 的第四列中。如果是,請從section_fru_out.csv 中刪除符合行。如果不是,請將符合行保留在section_fru_out.csv 中。根據此規則,應刪除以下行:
X,abc,1,87,,,,87:ab1,,,,,,,,,,,0,,,,,,xyz-C
X,abc,0,87,,,,87:abA,,,,bnm,,,,,,,0,,,,,,xyz-A
因為,section_ne_lookup.csv 中最後 2 行的第 5 和第 6 列以 Dummy_ 片語開頭。因此,我們應該取得符合行的第二列(在本例中為 87),並在檔案的第四列中搜尋該值 (87):section_fru_out.csv。由於該檔案的最後 2 行與此規則匹配,因此應刪除這些行。
我怎樣才能做到這一點?
答案1
這應該可以滿足您的需求:
awk -F, '{
if(NR==FNR){a[$2]=$5; b[$2]=$6;}
else if(a[$4] !~ /^Dummy/ && b[$4] !~ /^Dummy/){print}
}' section_ne_lookup.csv section_fru_out.csv
if(NR==FNR){a[$2]=$5; b[$2]=$6;}
:FNR
是目前輸入檔的行號,NR
是所有輸入的行號。當它們相等時,我們正在讀取第一個檔案。然後,我們建立a
關聯b
數組,其鍵為第二個字段,其值分別為第 5 個和第 6 個字段。else if (a[$4] !~ /^Dummy/ && b[$4] !~ /^Dummy/){print}
section_ne_lookup.csv
:如果這不是第一個文件,並且與start with中的第二個欄位關聯的行的第四個或第五個欄位都不是Dummy
,則列印該行。