I want to extract information from my text file and save it to a separate file

2024-6-11

I want to extract information from my file and save it to a separate file.

The input file looks like this:

>Feature NODE_18_1000_cov_1.366138
1 888 CDS
   db_xref COG:COG3385
   inference ab initio prediction:Prodigal:2.6
   inference similar to AA sequence:ISfinder:ISPg4
   locus_tag LFHBJEGH_33517
   product IS4 family transposase ISPg4
>Feature NODE_18222_1000_cov_1.310053
665 330 CDS
   inference ab initio prediction:Prodigal:2.6
   locus_tag LFHBJEGH_33518
   product hypothetical protein
962 675 CDS
   inference ab initio prediction:Prodigal:2.6
   locus_tag LFHBJEGH_335
   product hypothetical protein
>Feature NODE_18194_1000_cov_2.550265
939 187 CDS
   EC_number 1.14.99.46
   db_xref COG:COG2141
   gene rutA_3
   inference ab initio prediction:Prodigal:2.6
   inference similar to AA sequence:UniProtKB:P75898
   locus_tag LFHBJEGH_33480
   product Pyrimidine monooxygenase RutA

I want to pair the NODE ID (the name after ">") with each "locus_tag" that follows it.
The desired output looks like this:

Feature NODE_18_1000_cov_1.366138 LFHBJEGH_33517
Feature NODE_18222_1000_cov_1.310053 LFHBJEGH_33518
Feature NODE_18222_1000_cov_1.310053 LFHBJEGH_335
Feature NODE_18194_1000_cov_2.550265 LFHBJEGH_33480

Which command would work for this?

Answer 1

With awk:

awk '/^>Feature NODE_/{ nodeId=substr($0, 2); next } /locus_tag/{ print nodeId, $2 }' infile > outfile
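
The idea is that awk remembers the most recent ">Feature" header and prints it next to every locus_tag line that follows, so a node with several CDS entries yields several output lines. Below is a minimal sketch of that approach on a shortened copy of the input from the question. The file names sample.tbl and node_locus.txt are placeholders, not names from the question. The sketch also assumes the desired output should not contain the leading ">" (it is removed with substr) and that locus_tag is always the first field of its line, so a product description that happens to contain the word would not match.

```shell
# Shortened copy of the input shown in the question (placeholder file name).
cat > sample.tbl <<'EOF'
>Feature NODE_18_1000_cov_1.366138
1 888 CDS
   db_xref COG:COG3385
   locus_tag LFHBJEGH_33517
   product IS4 family transposase ISPg4
>Feature NODE_18222_1000_cov_1.310053
665 330 CDS
   locus_tag LFHBJEGH_33518
   product hypothetical protein
962 675 CDS
   locus_tag LFHBJEGH_335
   product hypothetical protein
EOF

awk '
  # Header line: remember it without the leading ">" and move on.
  /^>Feature / { node = substr($0, 2); next }
  # locus_tag line: print the remembered header and the tag value.
  $1 == "locus_tag" { print node, $2 }
' sample.tbl > node_locus.txt

cat node_locus.txt
```

On this sample, node_locus.txt should contain one line per locus_tag, with NODE_18222_1000_cov_1.310053 appearing twice because that node carries two CDS entries.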
