How can I use grep with a date format and unique values?


I have a large list of data.

My data looks like this:

"[01/Dec/2011:20:53:04 +0900] ","COMZ","90.663.65.61","21.123.31.100","250","CONNECT","t.ierz.er:443","13127","836"
"[01/Dec/2011:22:20:01 +0900] ","COMZ","90.663.65.61","21.123.31.100","250","CONNECT","t.ierz.er:443","13127","836"
"[02/Dec/2011:24:33:04 +0900] ","COMZ","20.663.65.61","2.123.91.100","220","CONNECT","t.ierz.er:443","13127","836"

How can I get the unique values, with the date and IP addresses in a format like this?

01/DEC/2011 90.663.65.61 21.123.31.100

I cannot get unique values, because what I get back is the same values repeated:

[01/Dec/2011:20:53:04 +0900] 90.663.65.61 21.123.31.100
[01/Dec/2011:20:53:04 +0900] 90.663.65.61 21.123.31.100

Code:

cat file.csv | awk -F\" '{print $2,$6,$8}' | sort | uniq -c | sort -n

Answer 1

You can use sed to do what you are asking.

Here is a command that fits your case:

 cat file.csv | awk -F\" '{print $2,$6,$8}' | sed 's#\(:[[:digit:]]\{2\}\)\{3\} +0900##' | sort | uniq -c | sort -n

It strips the time and the time zone offset from the timestamp, so that only the date remains, in this format: [01/Dec/2011] 90.663.65.61 21.123.31.100
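The sed command keeps the brackets and the original capitalization of the month, whereas the question asks for 01/DEC/2011. As a minimal sketch of one way to get that exact format, the trimming can be done inside awk itself with the standard substr and toupper functions. The variable names sample and result are illustrative, and the sketch assumes the timestamp always starts at the second character of the first quoted field and that the date is always 11 characters long, as in the sample rows.

```shell
# Sample rows copied from the question (the variable name is illustrative).
sample='"[01/Dec/2011:20:53:04 +0900] ","COMZ","90.663.65.61","21.123.31.100","250","CONNECT","t.ierz.er:443","13127","836"
"[01/Dec/2011:22:20:01 +0900] ","COMZ","90.663.65.61","21.123.31.100","250","CONNECT","t.ierz.er:443","13127","836"
"[02/Dec/2011:24:33:04 +0900] ","COMZ","20.663.65.61","2.123.91.100","220","CONNECT","t.ierz.er:443","13127","836"'

# Split on double quotes, keep the 11 characters after "[" (the date),
# upper-case them, then print the date and the two IP fields.
# sort -u collapses rows that are identical once the time is gone.
result=$(printf '%s\n' "$sample" |
  awk -F'"' '{ print toupper(substr($2, 2, 11)), $6, $8 }' |
  sort -u)

printf '%s\n' "$result"
```

Replacing sort -u with sort | uniq -c | sort -n would give per-day counts, as in the original pipeline.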

Answer 2

Try this:

 awk -F '[:"[]' '{print $3" "$10" "$12}' file.csv | sort | uniq 
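To see why fields 3, 10 and 12 are the right ones, it may help to print how awk splits one row when any of the characters :, " or [ acts as a separator. The following is only an illustrative sketch; the variable names line and fields and the N=<value> output format are made up for the demonstration.

```shell
# First sample row from the question (the variable name is illustrative).
line='"[01/Dec/2011:20:53:04 +0900] ","COMZ","90.663.65.61","21.123.31.100","250","CONNECT","t.ierz.er:443","13127","836"'

# Print the first 12 fields with their index, using the same separator set.
# The leading quote and bracket produce two empty fields, so the date is $3;
# the commas between quoted values become fields of their own ($7, $9, $11).
fields=$(printf '%s\n' "$line" |
  awk -F '[:"[]' '{ for (i = 1; i <= 12; i++) printf "%d=<%s>\n", i, $i }')

printf '%s\n' "$fields"
```

The colon in t.ierz.er:443 also splits the row, but only after field 12, so it does not disturb the three fields that are printed.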

Answer 3

Since your data appears to be in CSV format, you could use csvsql from csvkit; see https://csvkit.readthedocs.io/en/1.0.3/scripts/csvsql.html#

Assuming your file is named data.csv:

csvsql -H --query 'SELECT a,c,d FROM data GROUP BY c,d' data.csv

prints

a,c,d
[02/Dec/2011:24:33:04 +0900] ,20.663.65.61,2.123.91.100
[01/Dec/2011:22:20:01 +0900] ,90.663.65.61,21.123.31.100

See also https://unix.stackexchange.com/a/495010/330217

Answer 4

I always recommend using a CSV parser for CSV data. Here it is in Ruby:

ruby -rcsv -ne 'CSV.parse($_) do |row|
  puts [row[0][1..11].upcase, row[2], row[3]].join(" ")
end' file.csv | sort -u
01/DEC/2011 90.663.65.61 21.123.31.100
02/DEC/2011 20.663.65.61 2.123.91.100
