Remove everything except URLs in Notepad++

Question 1

Ctrl+H
Find what: ^.*?(\bhttps://twitter\.com/\w+)?.*$
Replace with: (?1$1:)
Check Wrap around
Check Regular expression
Do NOT check . matches newline
Replace all

Explanation:

^                           # beginning of line
  .*?                       # 0 or more any character but newline, not greedy
  (                         # start group 1
    \b                      # word boundary
    https://twitter\.com/   # literally
    \w+                     # 1 or more word character
  )?                        # end group, optional
  .*                        # 0 or more any character but newline
$                           # end of line

Replacement:

(?1$1:)         # if group 1 exists, then use it as replacement, else replace with nothing

Result for the given example:

https://twitter.com/thtjournal


https://twitter.com/jcarrollhistory
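
For readers who want to check the pattern outside Notepad++, here is a minimal Python sketch of the same replacement. Python's re module has no (?1$1:) conditional, so a small callback stands in for it. The sample input is hypothetical (only the two URLs are taken from the result shown here), and the second pattern, ANYWHERE, is an assumed variant that is not part of the answer.

```python
import re

# The pattern from the "Find what" field. re.MULTILINE makes ^ and $ work per
# line, and "." already excludes newlines in Python (like leaving
# ". matches newline" unchecked).
ORIGINAL = re.compile(r"^.*?(\bhttps://twitter\.com/\w+)?.*$", re.MULTILINE)

# Assumed variant (not from the answer): the lazy prefix is moved inside an
# optional non-capturing group, so the engine has to search the whole line for
# the URL before it is allowed to skip it.
ANYWHERE = re.compile(r"^(?:.*?(\bhttps://twitter\.com/\w+))?.*$", re.MULTILINE)


def keep_url(pattern: re.Pattern, text: str) -> str:
    """Emulate the (?1$1:) replacement: group 1 if it matched, else nothing."""
    return pattern.sub(lambda m: m.group(1) or "", text)


# Hypothetical input; the trailing text on each line is made up.
sample = (
    "https://twitter.com/thtjournal some trailing text\n"
    "a line without any link\n"
    "https://twitter.com/jcarrollhistory more text\n"
)

print(keep_url(ORIGINAL, sample))
```

One thing this sketch makes visible: in Python, ORIGINAL keeps a URL only when it sits at the very start of the line. With the URL further in, for example keep_url(ORIGINAL, "see https://twitter.com/thtjournal"), the lazy .*? matches nothing, the optional group is skipped, and the whole line is replaced with an empty string. ANYWHERE returns the URL in that case. Notepad++ uses a backtracking engine as well, so it is worth testing on a copy of your data whether the same happens there.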

Question 2

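Assuming you have a regular expression that defines a URL, let's call it REGEX.

In Notepad++'s Find dialog, on the Replace tab, with Search Mode set to Regular expression, use Replace All to replace REGEX with \n$1\n. Since $1 refers to the first capture group, REGEX has to be wrapped in parentheses, i.e. search for (REGEX). This puts every URL on a line of its own, interspersed with lines of leftover junk.

Back in the Find dialog, on the Mark tab, mark all lines containing REGEX: enable the Bookmark line option and use the Mark All operation.

Finally, in the Search => Bookmark menu, choose the option Remove Unmarked Lines, which deletes every line that has no bookmark.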
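
The three steps above can also be sketched in Python, which may help when the file is too large to handle comfortably in an editor. URL_REGEX below is a hypothetical, deliberately crude stand-in for REGEX (it is not a proper URL validator), and extract_url_lines is a name invented for this example.

```python
import re

# Hypothetical stand-in for REGEX: "http://" or "https://" followed by
# non-whitespace characters. Swap in a stricter pattern as needed.
URL_REGEX = r"https?://\S+"


def extract_url_lines(text: str) -> str:
    # Step 1: wrap REGEX in a capture group and replace each match with
    # newline + match + newline, so every URL ends up on its own line.
    split = re.sub(f"({URL_REGEX})", r"\n\1\n", text)
    # Steps 2 and 3: keep only the lines that contain a match, which mirrors
    # bookmarking the matching lines and removing the unbookmarked ones.
    kept = [line for line in split.splitlines() if re.search(URL_REGEX, line)]
    return "\n".join(kept)


print(extract_url_lines("foo https://a.example/x bar https://b.example/y\nno url here"))
```

Because the junk lines never match URL_REGEX, the final filter drops them, including the empty lines that the first step introduces.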

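For a good regular expression for URLs, see this post:
What is the best regular expression to check if a string is a valid URL?

For more information and screenshots, see the similar case in this article:
Notepad++ how to extract email address from document.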

Answer