I'm working with chat logs and want to reformat them.
They look like this, including the # symbols:
Tuesday, February 24, 2015
##Person1 (21:22:01): hello
##Person2 (21:22:37): hi
Wednesday, February 25, 2015
##Person1 (13:12:43): hey
##Person2 (13:13:04): hey
The date appears only once, at the start of each new day. I'd like the log formatted as follows so that it can be used in a spreadsheet:
Tuesday, February 24, 2015
Tuesday, February 24, 2015##Person1 (21:22:01): hello
Tuesday, February 24, 2015##Person2 (21:22:37): hi
Wednesday, February 25, 2015
Wednesday, February 25, 2015##Person1 (13:12:43): hey
Wednesday, February 25, 2015##Person2 (13:13:04): hey
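Afterwards, I can easily delete the lines that don't contain the ## string to get rid of the date-only lines.
Is there a way to have Notepad++ take the entire most recent line containing a date string (e.g. one matching \d{1,2}, 201\d{1}$) and prepend it to every line below it, until the next such date line?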
Answer 1
I'm afraid this can't be done in Notepad++.
Here is a Perl one-liner that does the job:
perl -ane '$date = $1 if /^(\w+,\h+\w+\h+\d\d?,\h+20\d\d)/;s/^(?=##)/$date/ && print;' file.txt
If you want to modify the file in place, add the -i switch:
perl -i -ane '$date = $1 if /^(\w+,\h+\w+\h+\d\d?,\h+20\d\d)/;s/^(?=##)/$date/ && print;' file.txt
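Output (the date-only lines are dropped, because a line is printed only when the substitution succeeds):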
Tuesday, February 24, 2015##Person1 (21:22:01): hello
Tuesday, February 24, 2015##Person2 (21:22:37): hi
Wednesday, February 25, 2015##Person1 (13:12:43): hey
Wednesday, February 25, 2015##Person2 (13:13:04): hey
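For readers without Perl at hand, the same carry-the-date-forward logic can be sketched in Python. This is only an illustration under a few assumptions: the function name prefix_dates is made up for this example, [ \t] stands in for Perl's \h (which Python's re module does not support), and date lines are assumed to look exactly like the ones in the question.

```python
import re

# Same shape as the Perl pattern; [ \t] replaces \h, which Python's re lacks.
DATE_RE = re.compile(r"^(\w+,[ \t]+\w+[ \t]+\d\d?,[ \t]+20\d\d)")


def prefix_dates(lines):
    """Return only the ## lines, each prefixed with the most recent date line."""
    date = ""  # stays empty until the first date line is seen
    out = []
    for line in lines:
        match = DATE_RE.match(line)
        if match:
            date = match.group(1)  # remember the date; the date line itself is dropped
        if line.startswith("##"):
            out.append(date + line)
    return out


# Sample input taken from the question.
log = [
    "Tuesday, February 24, 2015",
    "##Person1 (21:22:01): hello",
    "##Person2 (21:22:37): hi",
    "Wednesday, February 25, 2015",
    "##Person1 (13:12:43): hey",
]
for row in prefix_dates(log):
    print(row)
```

Like the one-liner, this keeps a single piece of state (the last date seen) and emits only the lines that start with ##.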
Regex explanation:
/ # delimiter
^ # beginning of line
( # start group 1
\w+ # 1 or more word characters
, # a comma
\h+ # 1 or more horizontal whitespace characters
\w+ # 1 or more word characters
\h+ # 1 or more horizontal whitespace characters
\d\d? # 1 or 2 digits
, # a comma
\h+ # 1 or more horizontal whitespace characters
20\d\d # the literal 20 followed by 2 digits
) # end group 1
/ # delimiter
s/ # substitute, delimiter
^ # beginning of line
(?=##) # positive lookahead, a zero-length assertion that makes sure the line begins with ##
/ # delimiter
$date # the date captured by the preceding regex
/ # delimiter
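To experiment with the two pieces explained above, here is a rough Python transcription. It is a sketch, not the answer's own code: Python's re module has no \h, so [ \t] is used instead, and the names DATE_RE, date, line and count are illustrative only.

```python
import re

# The date pattern from the breakdown above, written with re.VERBOSE so the
# comments can live inside the pattern. [ \t] replaces Perl's \h.
DATE_RE = re.compile(
    r"""
    ^            # beginning of line
    (            # start group 1
      \w+        # 1 or more word characters
      ,          # a comma
      [ \t]+     # 1 or more spaces or tabs
      \w+        # 1 or more word characters
      [ \t]+     # 1 or more spaces or tabs
      \d\d?      # 1 or 2 digits
      ,          # a comma
      [ \t]+     # 1 or more spaces or tabs
      20\d\d     # the literal 20 followed by 2 digits
    )            # end group 1
    """,
    re.VERBOSE,
)

date = DATE_RE.match("Tuesday, February 24, 2015").group(1)

# ^(?=##) matches the empty string at the start of a line that begins with ##,
# so the replacement is inserted in front of ## without consuming it.
# A function is used as the replacement so the date is inserted literally.
line, count = re.subn(r"^(?=##)", lambda m: date, "##Person1 (21:22:01): hello")
print(line)
print(count)  # number of substitutions made; 0 would mean the line had no leading ##
```

The substitution count plays the role of the `&& print` in the one-liner: a line is worth printing only when the count is non-zero.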