How to find words that follow a specific order

Answer 1

#!/bin/sh
pttrn="^$(printf '%s' "$1" | sed -e 's/\(.\)/\1*/g' -e 's/\*/\\+/' -e 's/\*$/\\+/')"'$'
grep "$pttrn" /usr/share/dict/words

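The pattern is built from the first argument by inserting * after every character. The first * is then changed to \+, and so is the last *. Finally, ^ and $ are added as anchors. Your example input produces the following pattern: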

^q\+w*e*r*t*y*u*y*t*r*e*s*d*f*t*y*u*i*o*k*n\+$

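This is the right pattern for grep. The leading q must occur at least once, and the trailing n must occur at least once. Each letter in between may occur zero or more times, and the order is preserved.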

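Note that the script is naive. If the input contains characters such as ., [ or ], the resulting regular expression will no longer match the specification. Provide sane input, or extend the script to validate it.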
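
As a rough illustration of what "extending the script to validate its input" could look like, here is a minimal Python sketch of the same pattern construction. The names build_pattern and matching_words are mine, not part of the script above, and Python's re module writes the "one or more" quantifier as + rather than the \+ that grep uses in basic regular expressions. Escaping each character with re.escape is one possible way to keep ., [ and ] from changing the meaning of the pattern.

```python
import re


def build_pattern(jumble):
    """Build a regex: first and last letter 1+ times, middle letters 0+ times."""
    if not jumble:
        raise ValueError("jumble must not be empty")
    # re.escape neutralizes metacharacters such as '.', '[' and ']'.
    letters = [re.escape(c) for c in jumble]
    if len(letters) == 1:
        return letters[0] + "+"
    middle = "".join(c + "*" for c in letters[1:-1])
    return letters[0] + "+" + middle + letters[-1] + "+"


def matching_words(jumble, words):
    """Return the words that the pattern matches in full (like ^...$)."""
    regex = re.compile(build_pattern(jumble))
    return [w for w in words if regex.fullmatch(w)]
```

For example, build_pattern("te") gives t+e+, and matching_words can be fed the lines of /usr/share/dict/words in place of the grep call.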

Examples:

$ ./script1.sh qwertyuytresdftyuiokn
queen
question
$ ./script1.sh te
tee
$ ./script1.sh superuser
seer
serer
spur
super
supper
surer
$

Answer 2

Here is one way to do it.

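First, filter the word list, keeping only the words that begin and end with the same letters as the jumble. For example, if the jumble is passed as the positional parameter $1 (and assuming a recent bash shell):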

grep -x "${1:0:1}.*${1:(-1):1}" /usr/share/dict/words

Then turn each of these words into a regular expression. I can't think of a "nice" way to do this, but with GNU sed you could do:

$ sed -E 's/(.)\1*/+.*\1/2g' <<< "queen"
q+.*u+.*e+.*n

Now test the jumble against each of the generated patterns.

Putting it all together:

$ cat script1 
#!/bin/bash

wordlist=/usr/share/dict/words

while IFS= read -r word; do 
  grep -qEx "$(sed -E 's/(.)\1*/+.*\1/2g' <<< "$word")" <<< "$1" && printf '%s\n' "$word"
done < <(grep -x "${1:0:1}.*${1:(-1):1}" "$wordlist")

Then:

$ ./script1 qwertyuytresdftyuiokn
queen
question
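
For readers without GNU sed, the word-to-pattern step and the final test can be sketched in Python as well. This is only an approximation of the script above under my own naming (word_to_pattern and candidates are hypothetical helpers); it collapses each run of a repeated letter with itertools.groupby and joins the letters with +.*, which should give the same pattern as the sed command for plain alphabetic words.

```python
import re
from itertools import groupby


def word_to_pattern(word):
    """Turn 'queen' into 'q+.*u+.*e+.*n' (a run of one letter collapses)."""
    letters = [re.escape(key) for key, _ in groupby(word)]
    return "+.*".join(letters)


def candidates(jumble, words):
    """Words sharing the jumble's first and last letter whose pattern matches it."""
    first, last = jumble[0], jumble[-1]
    return [
        w for w in words
        # Same prefilter as the grep -x call: at least two characters,
        # matching first and last letter.
        if len(w) >= 2 and w[0] == first and w[-1] == last
        and re.fullmatch(word_to_pattern(w), jumble)
    ]
```

Calling candidates("qwertyuytresdftyuiokn", words) with the dictionary lines as words plays the role of the while loop in script1.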

Answer 3

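Here is another approach (run in bash). The python code generates a regular expression and feeds it to grep. grep then works on the output of the venerable look utility, which performs a binary search to pull back all the words in /usr/share/dict/words that begin with q in the example. grep therefore has a much smaller set of words to search.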

python3 -c 'import sys
arr = list(sys.argv[1])
print(*arr, sep="*")
' $1 | grep -x -f - <(look ${1:0:1})
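
To see what pattern the one-liner hands to grep, the same construction can be tried in isolation. The sketch below is an assumption-laden stand-in, not the pipeline itself: star_joined_pattern and grep_x are names I made up, re.fullmatch only approximates grep -x, and the input is assumed to contain plain letters only (nothing is escaped, as in the original).

```python
import re


def star_joined_pattern(jumble):
    """Mirror print(*arr, sep="*"): 'abc' -> 'a*b*c'."""
    return "*".join(jumble)


def grep_x(pattern, words):
    """Rough stand-in for grep -x: keep words the pattern matches in full."""
    regex = re.compile(pattern)
    return [w for w in words if regex.fullmatch(w)]
```

Note that the last letter carries no *, so it must occur exactly once at the end of the word, and the first letter is only optional as far as the pattern is concerned; it is the look prefilter that restricts the words to those starting with it.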

Alternatively, a look + python3 solution that avoids regular expressions:

look q | ./finder.py "qwertyuytresdftyuiokn"

where finder.py is as follows:

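#!/usr/bin/env python3
import sys
from itertools import groupby

seek_word = sys.argv[1]  # the jumbled letter sequence
for word in sys.stdin:
    orig_word = word.strip()
    # Collapse runs of repeated letters, e.g. "queen" -> "quen".
    word = ''.join(k for k, g in groupby(orig_word))
    s_w = iter(seek_word)
    i_word = iter(word)
    # "c in s_w" consumes s_w up to the first match, so letters must appear in order;
    # next(s_w, None) is None only if the last letter matched at the very end.
    if all(c in s_w for c in i_word) and not next(s_w, None):
        print(orig_word)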
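
The core of finder.py is the "c in iterator" idiom, which may be easier to follow in isolation. The sketch below pulls it out into small functions; the names are mine, and swipe_matches is a hypothetical variant rather than the author's code. As far as I can tell, the original scan takes the first occurrence of each letter, so for a jumble such as abab and the word ab it stops before the end and rejects the word; the variant checks the first and last letters explicitly instead, which may or may not be the behaviour you want.

```python
from itertools import groupby


def collapse_runs(word):
    """Collapse runs of repeated letters: 'queen' -> 'quen'."""
    return "".join(key for key, _ in groupby(word))


def is_subsequence(needle, haystack):
    """True if the letters of needle occur in haystack in order."""
    remaining = iter(haystack)
    # 'c in remaining' advances the iterator past the first match, so each
    # later letter is only searched for in the unread part of haystack.
    return all(c in remaining for c in needle)


def swipe_matches(word, jumble):
    """Hypothetical variant: anchor the first and last letters explicitly."""
    if not word or not jumble:
        return False
    if word[0] != jumble[0] or word[-1] != jumble[-1]:
        return False
    return is_subsequence(collapse_runs(word), jumble)
```

With this variant the look prefilter is no longer needed for correctness, although it still cuts down the number of words to test.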
