I have the following input:
##gff-version 3
chr1 TAIR10 mRNA 3631 5899 . + . ID AT1G01010.1 ;geneID AT1G01010 ;gene_name AT1G01010
chr1 TAIR10 exon 3631 3913 . + . Parent AT1G01010.1
chr1 TAIR10 exon 3996 4276 . + . Parent AT1G01010.1
chr1 TAIR10 exon 4486 4605 . + . Parent AT1G01010.1
chr1 TAIR10 exon 4706 5095 . + . Parent AT1G01010.1
chr1 TAIR10 exon 5174 5326 . + . Parent AT1G01010.1
chr1 TAIR10 exon 5439 5899 . + . Parent AT1G01010.1
I want the values of ID, geneID, and gene_name to be wrapped in double quotes, as in the following output:
##gff-version 3
chr1 TAIR10 mRNA 3631 5899 . + . ID "AT1G01010.1" ;geneID "AT1G01010" ;gene_name "AT1G01010"
chr1 TAIR10 exon 3631 3913 . + . Parent "AT1G01010.1"
chr1 TAIR10 exon 3996 4276 . + . Parent "AT1G01010.1"
chr1 TAIR10 exon 4486 4605 . + . Parent "AT1G01010.1"
chr1 TAIR10 exon 4706 5095 . + . Parent "AT1G01010.1"
chr1 TAIR10 exon 5174 5326 . + . Parent "AT1G01010.1"
chr1 TAIR10 exon 5439 5899 . + . Parent "AT1G01010.1"
This is what I have tried:
awk '{sub($10, "\"&\""); print}' file.gtf
Thank you for reading my question.
Answer 1
Quick and dirty:
sed -E 's#(ID|Parent|gene_name) ([0-9A-Za-z.]+)#\1 "\2"#g' file.gtf
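If you would rather stay with awk, here is a rough sketch of an alternative. The `sub($10, ...)` attempt in the question only replaces the first match on each line, so on the mRNA line only the ID value gets quoted. The sketch below instead walks over every second field starting at field 10 and quotes it. `quote_attrs` is a made-up helper name, and the code assumes the layout shown in the question: single-space-separated columns, with the attributes starting at field 9 as alternating key/value tokens. On tab-separated input awk would rejoin the modified lines with single spaces, so treat this as a starting point rather than a general GFF parser.

```shell
# Hypothetical helper (not from the original answer): quote every attribute value.
# Assumption: columns are separated by single spaces and attributes start at
# field 9 as alternating "key value" tokens, exactly as in the sample input.
quote_attrs() {
  awk '
    /^#/ { print; next }          # pass header/comment lines through unchanged
    {
      # fields 9, 11, 13, ... are keys; fields 10, 12, 14, ... are values
      for (i = 10; i <= NF; i += 2)
        if ($i !~ /^"/)           # skip values that are already quoted
          $i = "\"" $i "\""
      print
    }
  ' "$@"
}
```

Usage would be something like `quote_attrs file.gtf > quoted.gtf`. With no file argument it reads from standard input.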