
My log file looks like the following sample:
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 14/Aug/2020:23:33:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
I want to search the entries above by specifying a date range, like this:
./Logsearch.sh 10/Aug/2020 13/Aug/2020
Expected result:
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
How can I do this?
Any idea how to write a script for my query?
Answer 1
This looks like a standard HTTP access log, so why not use grep to match the date pattern you want?
$ grep '1[0-3]/Aug/2020' access_log
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
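The grep pattern '1[0-3]/Aug/2020' uses the bracket expression [0-3], which matches a single character that is 0, 1, 2, or 3. Combined with the rest of the pattern, it matches 10/Aug/2020, 11/Aug/2020, 12/Aug/2020, and 13/Aug/2020. grep prints every line of the log that contains one of these dates.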
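A bracket expression only helps when the range happens to fit in one digit position. As a rough sketch of how the two-date invocation from the question could be supported, the function below turns each date into a sortable number with awk and keeps the lines in between. The name logsearch, the month lookup string, and the assumption that the timestamp is always the fourth whitespace-separated field are illustrative choices, not part of the answer above.

```shell
#!/bin/sh
# logsearch START END [FILE...]   (hypothetical helper)
# Prints the lines whose date lies between START and END, inclusive.
# START and END use the log's own format, e.g. 10/Aug/2020.
# Assumption: the timestamp is field 4, as in the sample log.
logsearch() {
    start=$1
    end=$2
    shift 2
    awk -v start="$start" -v end="$end" '
        # Turn "10/Aug/2020" into the sortable number 20200810.
        function key(d,    p) {
            split(d, p, "/")
            return p[3] * 10000 \
                 + (index("JanFebMarAprMayJunJulAugSepOctNovDec", p[2]) + 2) / 3 * 100 \
                 + p[1]
        }
        BEGIN { lo = key(start); hi = key(end) }
        {
            # The first 11 characters of field 4 are the dd/Mon/yyyy part.
            k = key(substr($4, 1, 11))
            if (k >= lo && k <= hi) print
        }
    ' "$@"
}
```

Saved in Logsearch.sh with a final line such as logsearch "$@", this could be run as ./Logsearch.sh 10/Aug/2020 13/Aug/2020 access_log. Without a file argument it reads standard input.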
Answer 2
You can use a dedicated structured-text tool such as Miller (https://github.com/johnkerl/miller) and run
mlr --nidx then filter 'strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") >="2020-08-11" && strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") <="2020-08-13"' input.txt
to get
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
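I applied a filter to get everything between 2020-08-11 and 2020-08-13.
Some notes:
--nidx sets the input and output format (https://bit.ly/3h3UvN3);
filter applies a filter;
strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") >="2020-08-11" is one of the filter conditions. With strptime I set the input date format (%d/%b/%Y:%H:%M:%S) of the fourth field ($4); with strftime I change the date format to %Y-%m-%d.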
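The reformatting step matters because dates written as %Y-%m-%d sort correctly when compared as plain strings, which dd/Mon/yyyy dates do not. The sketch below illustrates the same idea without Miller, using awk. The helper names iso_date and in_range are hypothetical, and the month lookup assumes English three-letter month abbreviations.

```shell
#!/bin/sh
# iso_date FIELD   (hypothetical helper)
# Converts "11/Aug/2020:23:34:45" to "2020-08-11".
iso_date() {
    printf '%s\n' "$1" | awk -F'[/:]' '{
        # Position of the month name in the lookup string gives its number.
        m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $2) + 2) / 3
        printf "%04d-%02d-%02d\n", $3, m, $1
    }'
}

# in_range FIELD LO HI   (hypothetical helper)
# Succeeds when the date of FIELD lies between the ISO dates LO and HI, inclusive.
in_range() {
    d=$(iso_date "$1")
    # Appending "" forces awk to compare the values as strings.
    awk -v d="$d" -v lo="$2" -v hi="$3" \
        'BEGIN { exit !((d "") >= (lo "") && (d "") <= (hi "")) }'
}
```

For example, in_range '12/Aug/2020:23:45:43' 2020-08-11 2020-08-13 succeeds, while the same call with the 14/Aug timestamp fails.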
Answer 3
Using Raku (formerly known as Perl_6)
~$ raku -e 'my $start_date = DateTime.new("2020-08-11").in-timezone(28800); \
my $stop_date = DateTime.new("2020-08-13").in-timezone(28800); \
my @a = lines.map: *.words; my @b = do for @a { \
.[0..2], \
.[3..4].join.subst(/^ (\d**2) \/ (Aug) \/ (\d**4) \: /, {"$2-08-$0T"} ) \
.subst(/ (\+\d**2) (\d**2) $/, {"$0:$1"} ).DateTime, \
.[5..*] }; \
.put if .[1] ~~ $start_date .. $stop_date for @b;' file
# Or:
~$ raku -e 'my $start_date = DateTime.new("2020-08-11").in-timezone(28800); \
my $stop_date = DateTime.new("2020-08-13").in-timezone(28800); \
my @a = lines.map(*.words).map({ \
.[0..2], ( \
.[3].subst(/^ (\d**2) \/ (Aug) \/ (\d**4) \: /, {"$2-08-$0T"} ), \
.[4].subst(/ (\+\d**2) (\d**2) $ /, {"$0:$1"} )).join.DateTime, \
.[5..*] }); \
.put if .[1] ~~ $start_date .. $stop_date for @a;' file
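The above are answers written in Raku, a member of the Perl family of programming languages. Raku has ISO 8601 DateTime objects built in.
Briefly, the date/time on each line is converted to an ISO 8601 DateTime object. When each of the lines is split into whitespace-separated words, the date/time elements are found in the zero-indexed columns .[3] and .[4]. These are transformed with the subst (substitute) command and joined to create an ISO 8601 DateTime object. Each line is then tested to see whether its DateTime object falls within the desired $start_date .. $stop_date range.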
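The text rewrite performed by the two subst calls can be tried on its own. The following is a minimal sketch with sed rather than Raku, covering only the string conversion and not the range test. The function name to_iso8601 is hypothetical, and like the Raku code it hard-codes Aug as month 08.

```shell
#!/bin/sh
# to_iso8601   (hypothetical helper)
# Reads log lines on standard input and rewrites
#   11/Aug/2020:23:34:45 +0800
# as
#   2020-08-11T23:34:45+08:00
# Only August is handled, mirroring the hard-coded month in the Raku code.
to_iso8601() {
    sed -E 's#([0-9]{2})/Aug/([0-9]{4}):([0-9]{2}:[0-9]{2}:[0-9]{2}) ([+-][0-9]{2})([0-9]{2})#\2-08-\1T\3\4:\5#'
}
```

It can be used as a filter, for example to_iso8601 < file. Lines without an August timestamp pass through unchanged.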
Sample input:
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 14/Aug/2020:23:33:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
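Sample output (both code examples; the 13 Aug entry is absent because its timestamp, 23:43:45 that evening, is later than $stop_date, which falls early on 13 Aug):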
10.434.22.334 - unauthenticated 2020-08-11T23:34:45+08:00 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 2020-08-12T23:45:43+08:00 "GET /eai/random.jsp HTTP/1.1" 200 74
https://www.iso.org/iso-8601-date-and-time-format.html
https://docs.raku.org
https://raku.org