How do I append an incrementing count to each occurrence of a predefined word in a text file?

Answer 1

I would prefer perl for this:

$ cat ip.txt 
He drove his car to the cinema. He then went inside the cinema to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema.

$ # forward counting is easy
$ perl -pe 's/\bcinema\b/$&.++$i/ge' ip.txt 
He drove his car to the cinema1. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema3.

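\bcinema\b is the word to search for, wrapped in word boundaries so that it does not match as part of another word. For example, \bpar\b will not match apart, park or spar.
ge: the g flag makes the replacement global; e allows Perl code in the replacement section.
$&.++$i is the concatenation of the matched word and the pre-incremented value of $i, which starts out as 0 by default.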

For reverse numbering, we need to get the count first:

$ c=$(grep -ow 'cinema' ip.txt | wc -l) perl -pe 's/\bcinema\b/$&.$ENV{c}--/ge' ip.txt 
He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema1.

c becomes an environment variable, accessible in perl through the %ENV hash.

Or, with perl alone, by slurping the entire file:

perl -0777 -pe '$c=()=/\bcinema\b/g; s//$&.$c--/ge' ip.txt

Answer 2

Using GNU awk for multi-char RS, case-insensitive matching and word boundaries:

$ awk -v RS='^$' -v ORS= -v word='cinema' '
    BEGIN { IGNORECASE=1 }
    { cnt=gsub("\\<"word"\\>","&"); while (sub("\\<"word"\\>","&"cnt--)); print }
' file
He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and afterwards discovered that it was more then two years since he last visited the cinema1.
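
To see the case-insensitive matching at work, here is a sketch that wraps the same program in a function and feeds it mixed-case input. The function name number_desc_gawk and the sample line are invented for illustration, and it assumes GNU awk is installed under the name gawk, since RS='^$', IGNORECASE and \< \> are GNU extensions.

```shell
# number_desc_gawk: hypothetical wrapper around the gawk program above.
# $1 is the word; the text is read from stdin as a single record.
number_desc_gawk() {
    gawk -v RS='^$' -v ORS= -v word="$1" '
        BEGIN { IGNORECASE = 1 }
        {
            # "&" puts every match back unchanged, so this gsub only counts
            cnt = gsub("\\<" word "\\>", "&")
            # each sub numbers the first match that has no digit after it yet
            while (sub("\\<" word "\\>", "&" cnt--)) ;
            print
        }'
}

printf '%s\n' 'Cinema one, cinema two, CINEMAS no.' | number_desc_gawk cinema
# prints: Cinema2 one, cinema1 two, CINEMAS no.
```

The loop ends on its own because a numbered word such as cinema2 no longer satisfies \>, so every sub call moves on to the next unnumbered occurrence.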

Answer 3

This takes punctuation after the word into account.
Forward numbering:

word="cinema"
awk -v word="$word" '
    { 
      for (i = 1; i <= NF; i++) 
        if ($i ~ word "([,.;:)]|$)") { 
          gsub(word, word "" ++count,$i) 
        }
      print 
    }' input-file

Backward numbering:

word="cinema"
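count="$(awk -v word="$word" '
    # count only the fields that the numbering pass below will label
    { for (i = 1; i <= NF; i++) if ($i ~ word "([,.;:)]|$)") count++ }
    END { print count + 0 }' input-file)"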
awk -v word="$word" -v count="$count" '
    { 
      for (i = 1; i <= NF; i++) 
        if ($i ~ word "([,.;:)]|$)") { 
          gsub(word, word "" count--, $i) 
        }
      print 
    }' input-file
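
To make it clearer which fields the condition in these scripts selects, here is a small sketch that only prints the matching fields. The function name suffix_hits and the sample words are made up for illustration. It suggests that the test is a suffix check: the word must be followed by one of the listed punctuation characters or by the end of the field, but nothing restricts what comes before it.

```shell
# suffix_hits: print each whitespace-separated field that the condition would select.
# $1 is the word; text is read from stdin. Hypothetical helper for illustration.
suffix_hits() {
    awk -v word="$1" '
        { for (i = 1; i <= NF; i++) if ($i ~ word "([,.;:)]|$)") print $i }'
}

printf '%s\n' 'cinema cinema, (cinema) cinemas multicinema.' | suffix_hits cinema
# prints:
# cinema
# cinema,
# (cinema)
# multicinema.
```

Note also that assigning to a field, as the gsub calls on $i do, makes awk rebuild the line with single spaces between fields, so runs of spaces or tabs in the input are not preserved.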

Answer 4

To label the words in descending order, we reverse the regex and reverse the data, then reverse the data once more after the substitution:

perl -l -0777pe '$_ = reverse reverse =~ s/(?=\bamenic\b)/++$a/gre' input.data

Result

He drove his car to the cinema3. He then went inside the cinema2 to purchase tickets, and
afterwards discovered that it was more then two years since he last visited the cinema1.

To label the words in ascending order, we use \K, which acts like a lookbehind on the word:

perl -lpe 's/\bcinema\b\K/++$a/eg' input.data

Result

He drove his car to the cinema1. He then went inside the cinema2 to purchase tickets, and
afterwards discovered that it was more then two years since he last visited the cinema3.
