Extracting text between keywords in a file using Perl

Answer 1

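I think you need to change your regex. "\ability" and "\skill" are probably not what you want: in a pattern, "\a" is the "bell" (alarm) character and "\s" matches a whitespace character, so neither pattern matches the literal words.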
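To make the escape problem concrete, here is a small sketch. The test strings are made up for illustration; only the two patterns come from the question.

```perl
use strict;
use warnings;

# "\a" is the alarm (BEL, 0x07) character, so /\ability/ looks for BEL followed by "bility".
my $word_ability = ("ability"      =~ /\ability/) ? 1 : 0;
my $bell_bility  = ("\x07bility"   =~ /\ability/) ? 1 : 0;

# "\s" is any whitespace character, so /\skill/ looks for whitespace followed by "kill".
my $word_skill   = ("skill"        =~ /\skill/)   ? 1 : 0;
my $space_kill   = (" kill"        =~ /\skill/)   ? 1 : 0;

printf "ability: %d, BEL+bility: %d\n", $word_ability, $bell_bility;
printf "skill: %d, space+kill: %d\n",   $word_skill,   $space_kill;
```

Neither plain word matches, which is why dropping the backslashes (or using \b for word boundaries) is the likely fix.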

To capture parts of the text, wrap the corresponding parts of the regex in parentheses. Once the whole regex has matched, you can access the captured parts with $1, $2, and so on. For example: '(\w+)\s+(ability|skill)\s+(\w+)'
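A minimal sketch of that capture pattern in use. The input line and the variable names are hypothetical, not taken from the original file.

```perl
use strict;
use warnings;

# Hypothetical input line, for illustration only.
my $line = 'proven ability to manage issues';

my ($before, $keyword, $after);
if ($line =~ /(\w+)\s+(ability|skill)\s+(\w+)/) {
    # $1, $2 and $3 hold the three parenthesized groups, in order.
    ($before, $keyword, $after) = ($1, $2, $3);
    print "before=$before keyword=$keyword after=$after\n";
}
```

Copying $1, $2, $3 into named variables right away is a defensive habit, since the next successful match overwrites them.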

Answer 2

There are a lot of errors in your script. I have rewritten and simplified it.

#!/usr/bin/perl 
use strict;
use warnings;
use Data::Dumper;

# file to search
my $file = 'C:\Users\Acer Nitro\Desktop\perl\sim.txt';
open my $fh, '<', $file or die "unable to open '$file' for reading: $!";
# read the whole file into a single string;
# local $/ enables slurp mode only inside the do block
my $full = do { local $/; <$fh> };
close $fh;
# search text between keywords
my @found = $full =~ /\b(?:ability|skills|experience)\b\R?\K(.+?)(?=\b(?:ability|skills|experience)\b)/gsi;
# dump the result
print Dumper\@found;

Output for the given example:

$VAR1 = [
          ' to manage issues, communications and influencing ',
          ',Passion for great technology and user ',
          'Exceptional organizational '
        ];

Regex explanation:

/                       # regex delimiter
    \b                  # word boundary
    (?:                 # non capture group
        ability         # literally
      |                 # OR
        skills          # literally
      |                 # OR
        experience      # literally
    )                   # end group
    \b                  # word boundary
    \R?                 # optional linebreak
    \K                  # forget all we have seen until this position
    (.+?)               # group 1, the text we want
    (?=                 # positive lookahead
        \b              # word boundary
        (?:             # non capture group
            ability     # literally
          |             # OR
            skills      # literally
          |             # OR
            experience  # literally
        )               # end group
        \b              # word boundary
    )                   # end lookahead
/gsi                    # delimiter, global; dot matches newline; case insensitive
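
If you want to try the pattern without the original file, here is a self-contained sketch. The sample text, the in-memory filehandle, and the names $data, $kw, @between and @with_tail are my own assumptions for illustration. It also shows one thing to be aware of: the text after the last keyword is not returned, because the lookahead needs another keyword to follow. Adding \z as an alternative in the lookahead is one possible way to keep that tail, if you need it.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the contents of the file.
my $data = "Ability to manage issues.\nSkills in Perl and shell.\nExperience with regexes.\n";

# Open a read-only filehandle on the in-memory string, then slurp it.
open my $fh, '<', \$data or die "unable to open in-memory file: $!";
my $full = do { local $/; <$fh> };
close $fh;

# The keyword alternation, compiled once; /i travels with the qr// object.
my $kw = qr/\b(?:ability|skills|experience)\b/i;

# Same idea as the regex above: text between one keyword and the next.
my @between = $full =~ /$kw\R?\K(.+?)(?=$kw)/gs;

# Variant (an assumption, not part of the original answer):
# also accept end of string, so the text after the last keyword is kept.
my @with_tail = $full =~ /$kw\R?\K(.+?)(?=$kw|\z)/gs;

printf "between keywords: %d segment(s)\n", scalar @between;
printf "including tail:   %d segment(s)\n", scalar @with_tail;
```

With this sample, @between holds the two segments that sit between keywords, and @with_tail additionally holds the text after "Experience".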
