Remove everything except URLs in Notepad++

Question 1

Ctrl+H
Find what: ^.*?(\bhttps://twitter\.com/\w+)?.*$
Replace with: (?1$1:)
Check Wrap around
Check Regular expression
Uncheck ". matches newline"
Replace all

Explanation:

^                           # beginning of line
  .*?                       # 0 or more of any character except newline, non-greedy
  (                         # start group 1
    \b                      # word boundary
    https://twitter\.com/   # literally
    \w+                     # 1 or more word characters
  )?                        # end group, optional
  .*                        # 0 or more of any character except newline
$                           # end of line

Replacement:

(?1$1:)         # if group 1 exists, then use it as replacement, else replace with nothing

Result for the given example:

https://twitter.com/thtjournal


https://twitter.com/jcarrollhistory
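
For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch of the replacement above. Python's `re` has no `(?1$1:)` conditional, so a small function stands in for it; the function name `keep_twitter_url` and the sample lines are assumptions made for illustration, not part of the original question. One caveat, at least in Python's `re`: because group 1 is optional, the lazy `.*?` never has to expand, so the URL is only captured when it starts the line. The sample lines are written accordingly.

```python
import re

# Same pattern as in the "Find what" field; re.MULTILINE makes ^ and $ work per line.
PATTERN = re.compile(r"^.*?(\bhttps://twitter\.com/\w+)?.*$", re.MULTILINE)


def keep_twitter_url(text: str) -> str:
    """Emulate (?1$1:): keep group 1 if it matched, otherwise leave the line empty."""
    return PATTERN.sub(lambda m: m.group(1) or "", text)


# Hypothetical sample input: each URL starts its line and is followed by extra text.
sample = (
    "https://twitter.com/thtjournal?lang=en\n"
    "some unrelated line\n"
    "https://twitter.com/jcarrollhistory/status/1"
)

print(keep_twitter_url(sample))
```

Lines without a matching URL become empty rather than disappearing, which mirrors the blank lines in the result shown above.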

Question 2

Suppose you have a regular expression that defines a URL, and call it regex.

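Using the Replace tab of the Notepad++ Find dialog, do a Replace All of (regex) with \n$1\n. The parentheses make the URL capture group 1, which is what $1 refers to. This splits the text so that every URL sits on a line of its own, with garbage lines interspersed between them.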

Next, in the Mark tab of the Find dialog, mark all lines that contain regex by using Mark All with the Bookmark line option enabled.

Finally, in the Search => Bookmark menu, remove the lines that are not bookmarked.
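
The three steps above can be sketched in Python as a rough illustration of the split-then-filter idea. The `URL` pattern below is a deliberately simple stand-in for regex and is an assumption, as are the function name `extract_urls` and the sample text; a stricter URL pattern can be dropped in without changing the rest.

```python
import re

# Hypothetical stand-in for "regex": a simple, permissive URL pattern.
URL = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text: str) -> list[str]:
    # Step 1: surround every URL with newlines, like replacing (regex) with \n$1\n.
    split_text = URL.sub(r"\n\g<0>\n", text)
    # Steps 2 and 3: keep only the lines that contain a URL,
    # which corresponds to bookmarking them and removing all other lines.
    return [line for line in split_text.splitlines() if URL.search(line)]


# Hypothetical sample input with URLs buried in surrounding text.
sample = (
    "Follow https://twitter.com/thtjournal and https://example.com/page today\n"
    "no link here"
)

for url in extract_urls(sample):
    print(url)
```

Because step 1 isolates each URL on its own line, checking whether a line contains a URL is enough in the filter step; no line can hold a URL plus other text at that point.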

For a suitable regex for URLs, see the post
What is the best regular expression to check if a string is a valid URL?

For more information and screenshots, see this article about a similar case:
Notepad++: How to extract email addresses from a file.

Answer