I have the following string
>gi|374638939|gb|AEZ55452.1| myosin light chain 2, partial [Batrachoseps major]
AAMGR
repeating sporadically throughout my document, and I want to remove everything from >gi|37463 through the sequence AAMGR.
But I want to keep the blocks where JQ250 appears:
>gi|374638936|gb|**JQ250**332.1| Batrachoseps major isolate b voucher DBW5974 myosin light chain 2 gene, partial cds
GCNGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCATGCAATGGGGGCGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACCTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
TCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAGAGACCCCAGTATGACGTCGTCATTGCTCC
CAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
and remove only the lines that contain AEZ554:
>gi|374638939|gb|**AEZ554**52.1| myosin light chain 2, partial [Batrachoseps major]
AAMGR
So, ideally, the following block:
>gi|374638934|gb|JQ250331.1| Batrachoseps major isolate a voucher DBW5974 myosin light chain 2 gene, partial cds
GCNGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCATGCAATGGGGGCGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACCTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
TCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAGAGACCCCAGTATGACGTCGTCATTGCTCC
CAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
>gi|374638935|gb|AEZ55450.1| myosin light chain 2, partial [Batrachoseps major]
AAMGR
>gi|374638936|gb|JQ250332.1| Batrachoseps major isolate b voucher DBW5974 myosin light chain 2 gene, partial cds
GCNGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCATGCAATGGGGGCGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACCTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
TCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAGAGACCCCAGTATGACGTCGTCATTGCTCC
CAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
>gi|374638937|gb|AEZ55451.1| myosin light chain 2, partial [Batrachoseps major]
AAMGR
>gi|374638938|gb|JQ250333.1| Batrachoseps major isolate a voucher MVZ:Herp:249023 myosin light chain 2 gene, partial cds
GCCGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCCTGCAATGGGGGTGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACTTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
CCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAAGAGACCCCAGTATGACGTCGTCATTGCTC
CCAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
>gi|374638939|gb|AEZ55452.1| myosin light chain 2, partial [Batrachoseps major]
AAMGR
would be left as just:
>gi|374638934|gb|JQ250331.1| Batrachoseps major isolate a voucher DBW5974 myosin light chain 2 gene, partial cds
GCNGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCATGCAATGGGGGCGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACCTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
TCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAGAGACCCCAGTATGACGTCGTCATTGCTCC
CAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
>gi|374638936|gb|JQ250332.1| Batrachoseps major isolate b voucher DBW5974 myosin light chain 2 gene, partial cds
GCNGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCATGCAATGGGGGCGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACCTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
TCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAGAGACCCCAGTATGACGTCGTCATTGCTCC
CAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
>gi|374638938|gb|JQ250333.1| Batrachoseps major isolate a voucher MVZ:Herp:249023 myosin light chain 2 gene, partial cds
GCCGCCATGGGTAAGTGAACGCGCCGGACCAGACCATTCACTGCCTGCAATGGGGGTGTTTGTGGGTTGG
AAGGTGTGCCAAAGATCTAGGGAACCCCAACTCCTCAGGATACGGGTGGGAGCCCTAAAATATGTCCAGC
TATAAGGAGATGACCAATGGAAAAGGGGGTATCAGCAGTACTTTACTTGCTACTATAAGAGAATTGCATC
CTGGGAATAGCCTCTGAAAGGTCCCATTTTAGCGACACTGGTAGATGGACACTGGCCTTTGGACAGCACC
AGTAAGTAGAGCATTGCATCTTGGGATTCCTTTGCTGTTCACATGCCACTGAAAGCTCTCACCATAGCAG
ATTCAAAATGCCTACCCGGCAGGTTGCCAGAAAAGCACTGCATCATGGGAGAACCACTTTTAGTGACAAT
CCTAAGAGATGGGTGTCTCTCTGCCAGGCGCTATTATCCAAGAGACCCCAGTATGACGTCGTCATTGCTC
CCAGGTAACCATGTTCTCACCCCCTCTCCCACAGGCCGC
Answer 1
First step: make sure you are running the latest version of Notepad++ (this should work in version 6 or later; tested on 6.1.8), thanks to Prumo for this. You can use the "Regular expression" mode of Notepad++'s Find and Replace dialog to remove the text between two markers.
To match every line that starts with >gi|37463 and ends with AAMGR, put this in the "Find what:" box:
>gi\|37463.*AAMGR(\r\n)?
Leave the "Replace with:" box empty, set the search mode at the bottom to "Regular expression", and make sure ". matches newline" is unchecked.
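As a rough cross-check outside Notepad++, the same idea can be sketched with Python's `re` module. This assumes, as in the sample in the question, that AAMGR sits on the line directly after its header, so the pattern spells out the line break with `\r?\n` instead of depending on a dot-matches-newline setting. `SAMPLE` (with shortened headers and sequences) and `strip_aamgr_records` are names made up for this illustration.

```python
import re

# Hypothetical miniature of the file: two nucleotide records and two protein stubs.
SAMPLE = (
    ">gi|374638934|gb|JQ250331.1| Batrachoseps major isolate a\n"
    "GCNGCCATGGGTAAGTGAACG\n"
    ">gi|374638935|gb|AEZ55450.1| myosin light chain 2, partial\n"
    "AAMGR\n"
    ">gi|374638936|gb|JQ250332.1| Batrachoseps major isolate b\n"
    "GCNGCCATGGGTAAGTGAACG\n"
    ">gi|374638937|gb|AEZ55451.1| myosin light chain 2, partial\n"
    "AAMGR\n"
)

# A header starting with >gi|37463, the rest of that line, a line break,
# then AAMGR and (optionally) the line break that follows it.
# [^\r\n]* cannot cross a line break, so each match stays within one record.
PATTERN = re.compile(r">gi\|37463[^\r\n]*\r?\nAAMGR(?:\r?\n)?")


def strip_aamgr_records(text: str) -> str:
    """Remove every header + AAMGR pair and return the remaining text."""
    return PATTERN.sub("", text)


cleaned = strip_aamgr_records(SAMPLE)
print(cleaned)
```

Only the two JQ250 records remain in `cleaned`, because their headers are followed by a nucleotide line rather than by AAMGR.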
To match only the lines that contain AEZ554, use this search string:
>gi\|37463.*AEZ554.*AAMGR(\r\n)?
To match all of those that do not contain JQ250, use this search string:
>gi\|37463(?!.*JQ250).*AAMGR(\r\n)?
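The difference between "must contain AEZ554" and "must not contain JQ250" is easiest to see on input where the two disagree. The sketch below uses Python's `re` module and deliberately invented records: the JQ250 and XYZ headers followed by AAMGR do not occur in the sample in the question and are assumptions made only for the comparison. As before, the line break between header and AAMGR is written explicitly.

```python
import re

# Hypothetical input; only the first record mirrors the real sample.
TEXT = (
    ">gi|374638935|gb|AEZ55450.1| myosin light chain 2, partial\nAAMGR\n"
    ">gi|374638936|gb|JQ250332.1| made-up header\nAAMGR\n"
    ">gi|374638999|gb|XYZ00001.1| made-up header\nAAMGR\n"
)

HEADER_REST = r"[^\r\n]*"            # rest of the header line, never past a line break
TAIL = r"\r?\nAAMGR(?:\r?\n)?"       # line break, AAMGR, optional trailing line break

# Header must contain AEZ554 somewhere.
only_aez = re.compile(r">gi\|37463" + HEADER_REST + "AEZ554" + HEADER_REST + TAIL)

# Negative lookahead: reject the header if JQ250 appears anywhere later on the line.
not_jq = re.compile(r">gi\|37463(?!" + HEADER_REST + "JQ250)" + HEADER_REST + TAIL)

without_aez = only_aez.sub("", TEXT)
without_non_jq = not_jq.sub("", TEXT)

print(without_aez)
print(without_non_jq)
```

The first pattern removes only the AEZ554 record, while the lookahead version also removes the XYZ record and leaves just the one whose header mentions JQ250.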
Note: you may need to use just \n instead of \r\n if the file was created on Unix/Linux.
Note 2: if you want to leave a blank line in the file instead of removing the line completely, remove the (\r\n)? from the search term.
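Both notes can be tried quickly in Python. This is a small sketch, not the Notepad++ behaviour itself: `\r?\n` accepts either line ending, and dropping the optional trailing group leaves an empty line where the record used to be. The `NEXT` line is a hypothetical placeholder for whatever follows the removed record.

```python
import re

# The same hypothetical record with Unix and with Windows line endings.
UNIX = ">gi|374638935|gb|AEZ55450.1| partial\nAAMGR\nNEXT\n"
WINDOWS = UNIX.replace("\n", "\r\n")

# \r?\n matches both \n and \r\n; the final optional group also eats the
# line break after AAMGR, so no blank line is left behind.
remove_line = re.compile(r">gi\|37463[^\r\n]*\r?\nAAMGR(?:\r?\n)?")

# Without the final group, the line break after AAMGR survives as a blank line.
leave_blank = re.compile(r">gi\|37463[^\r\n]*\r?\nAAMGR")

unix_removed = remove_line.sub("", UNIX)
windows_removed = remove_line.sub("", WINDOWS)
unix_blank = leave_blank.sub("", UNIX)

print(repr(unix_removed), repr(windows_removed), repr(unix_blank))
```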
Note 3: If the question were "How do I remove lines containing AEZ554 from a text file?", the following shell commands would work (and would be faster):
On Windows XP: type oldfile.txt | find /I /V "AEZ55" > newfile.txt
On Linux/Windows 7: grep -v "AEZ55" oldfile.txt > newfile.txt
Similarly, for "How do I remove lines that do not contain JQ250 from a text file?":
type oldfile.txt | find /I "JQ250" > newfile.txt
grep "JQ250" oldfile.txt > newfile.txt
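These commands treat every line on its own, so a sequence line under a JQ250 header, which does not contain JQ250 itself, is dropped by the second pair. If whole records should be kept or dropped together, a record-aware filter is one option. The following is a minimal Python sketch under the assumption that every record starts with a line beginning with ">"; `keep_records` is a hypothetical helper name, and the file names in the closing comment are the ones used in the commands above.

```python
from typing import Iterable, Iterator, List


def keep_records(lines: Iterable[str], marker: str) -> Iterator[str]:
    """Yield the lines of every record whose '>' header contains marker."""
    keeping = False
    for line in lines:
        if line.startswith(">"):
            # A new record begins: decide once, based on its header.
            keeping = marker in line
        if keeping:
            yield line


# Hypothetical miniature input: one nucleotide record, one protein stub.
sample: List[str] = [
    ">gi|374638934|gb|JQ250331.1| Batrachoseps major isolate a\n",
    "GCNGCCATGGGTAAGTGAACG\n",
    "AAGGTGTGCCAAAGATCTAGG\n",
    ">gi|374638935|gb|AEZ55450.1| myosin light chain 2, partial\n",
    "AAMGR\n",
]

kept = list(keep_records(sample, "JQ250"))
print("".join(kept), end="")

# For a real file, something along these lines should work:
# with open("oldfile.txt") as src, open("newfile.txt", "w") as dst:
#     dst.writelines(keep_records(src, "JQ250"))
```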