Search for a pattern and append the matching line from another file

Answer 1

You can work with the code you already have. Store the line in an array and match on the fifth element.

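while read -r line; do
    # Skip empty lines.
    [ -z "$line" ] && continue
    # Split the line into an array on whitespace (no glob expansion).
    read -r -a patlist <<< "$line"
    # The fifth field (index 4) is the ID to search for.
    pat=${patlist[4]}
    # -H with --label prefixes each match with the allKO.txt line.
    grep -H --label="$line" -- "$pat" < KEGG.annotations
done < allKO.txt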

This gives:

Metabolism Carbohydrate metabolism Glycolisis K07448:>aai:AARI_33320  mrr; restriction system protein Mrr; K07448 restriction system protein
Metabolism Protein metabolism protesome K02217:>aai:AARI_26600  ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
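
If bash arrays are not available, the same idea can probably be written in plain sh by splitting the line into positional parameters. This is only a sketch: label_matches is a hypothetical helper name, and --label is a GNU grep option, so it assumes GNU grep just like the loop above.

```shell
# label_matches KO_FILE ANNOTATION_FILE
# Hypothetical helper; not part of the original answer.
label_matches() {
    ko_file=$1
    annotations=$2
    while read -r line; do
        # Skip empty lines.
        [ -z "$line" ] && continue
        # Word-split the line into $1..$n with globbing disabled.
        set -f; set -- $line; set +f
        # Skip lines that have fewer than five fields.
        [ "$#" -ge 5 ] || continue
        # $5 is the fifth field; prefix each match with the whole line.
        grep -H --label="$line" -e "$5" < "$annotations"
    done < "$ko_file"
}
```

It would be called as label_matches allKO.txt KEGG.annotations, and should print the same "line:match" pairs as the loop above.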

Answer 2

This seems to be what you are asking for:

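# Read each line of allKO.txt into five whitespace-separated fields.
while read -r w1 w2 w3 w4 ID
do
    # Print the fields followed by a space, with no trailing newline.
    printf "%s " "$w1 $w2 $w3 $w4 $ID"
    # grep prints every matching annotation line; if there is none,
    # end the output line with a bare newline instead.
    if ! grep "$ID" KEGG.annotations
    then
        echo
    fi
done < allKO.txt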

This writes the output to the screen. Add an output redirection (>), for example > test1, to the last line to capture the output in a file.

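Based on your example, the key/ID field (the "pattern") is the fifth of five fields in allKO.txt, so the loop reads the fields into w1 w2 w3 w4 ID. You said the file is tab-delimited; I assume that none of the fields contain spaces.
printf writes the line (i.e., the fields) from allKO.txt, followed by a space but without a terminating newline.
grep searches KEGG.annotations for the ID (the fifth field of the line from allKO.txt) and prints each matching line in full, including its newline.
If grep fails (finds no match), write a newline, since printf did not.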
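
If the fields can in fact contain spaces, a variant that splits on tabs only may be closer to what you need. This is a sketch under assumptions: annotate_tabbed is a hypothetical helper name, and the -F and -w options (literal match, whole words only) are my own addition to avoid partial matches such as K0221 inside K02217. -w is widely supported but not required by POSIX.

```shell
# annotate_tabbed KO_FILE ANNOTATION_FILE
# Hypothetical helper; splits allKO.txt lines on tabs only.
annotate_tabbed() {
    tab=$(printf '\t')
    while IFS=$tab read -r w1 w2 w3 w4 ID; do
        # Skip blank or short lines; an empty pattern would match everything.
        [ -n "$ID" ] || continue
        # Print the fields, space-separated, without a trailing newline.
        printf '%s ' "$w1 $w2 $w3 $w4 $ID"
        # -F: treat the ID as a literal string; -w: match whole words only.
        if ! grep -F -w -e "$ID" "$2"; then
            # No match: finish the line ourselves.
            echo
        fi
    done < "$1"
}
```

It would be called as annotate_tabbed allKO.txt KEGG.annotations. Note that consecutive tabs are treated as a single separator here, so this assumes no field is empty.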

As a result, the loop in this answer writes a line whose ID is not present in KEGG.annotations to the output unchanged:

Metabolism Protein metabolism proteasome K02217  >aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
This ID doesn’t exist: K99999

An ID that is present more than once produces additional output lines (without repeating the data from allKO.txt):

Metabolism Protein metabolism proteasome K02217  >aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
This is a hypothetical additional line from KEGG.annotations that mentions “K02217”.
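
To try both cases without touching real data, something like the following can be run in a scratch directory. The two input files are tiny stand-ins built from the sample lines above (with plain ASCII quotes), not real KEGG data, and test1 is the output file name from the redirection example.

```shell
# Scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Stand-in for allKO.txt: one ID that exists, one that does not.
cat > "$workdir/allKO.txt" <<'EOF'
Metabolism Protein metabolism proteasome K02217
This ID doesn't exist: K99999
EOF

# Stand-in for KEGG.annotations: K02217 appears on two lines.
cat > "$workdir/KEGG.annotations" <<'EOF'
>aai:AARI_26600 ferritin-like protein; K02217 ferritin [EC:1.16.3.1]
This is a hypothetical additional line from KEGG.annotations that mentions "K02217".
EOF

# Same loop as in the answer, with the output captured in test1.
while read -r w1 w2 w3 w4 ID
do
    printf "%s " "$w1 $w2 $w3 $w4 $ID"
    if ! grep "$ID" "$workdir/KEGG.annotations"
    then
        echo
    fi
done < "$workdir/allKO.txt" > "$workdir/test1"

cat "$workdir/test1"
```

The result should be three lines: the K02217 line with its first match appended, the second match on a line of its own, and the K99999 line followed only by a trailing space.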
