Script to extract text using grep

Answer 1

The idea is to process the results from grep and append them to the output file explicitly. That way the console stays free for debug messages.

#!/bin/bash

# Save output to this file
outputFile='./xmldocs/1.txt'
rm -f "$outputFile"

# Iterate over the *.xml files in the current directory
for i in *.xml
do
    # Echo which file is being processed (only printed to the console)
    echo "Processing: $i"
    # Grep, remove newlines and append to $outputFile
    grep -s "Document ID:" "$i" | tr -d '\n' >> "$outputFile"
    # Add a character to separate the two results
    printf "~" >> "$outputFile"
    # Grep, remove newlines and append to $outputFile
    grep -s 'CI[^"]' "$i" | tr -d '\n' >> "$outputFile"
    # Print a newline to separate results
    printf "\n" >> "$outputFile"
done

echo '!! done'

If this does not work, please post a few more of the lines you are grepping so it can be tested.
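
To try the loop without real data, here is a minimal sketch that builds two throwaway XML files in a temporary directory and runs the same grep, tr and printf steps over them. The helper name extract_ids, the file names and the sample lines are all made up for illustration; the real documents will look different.

```shell
#!/bin/bash
# Sketch only: extract_ids is a hypothetical helper, and the sample data is invented.
# For every *.xml file in a directory, write "<Document ID line>~<CI line>" to an output file.
extract_ids() {
    local dir=$1 out=$2 f
    : > "$out"                          # create or truncate the output file
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue         # skip the literal pattern if nothing matched
        echo "Processing: $f"           # debug message, goes to the console only
        grep -s 'Document ID:' "$f" | tr -d '\n' >> "$out"
        printf '~' >> "$out"
        grep -s 'CI[^"]' "$f" | tr -d '\n' >> "$out"
        printf '\n' >> "$out"
    done
}

# Invented sample input: one "Document ID:" line and one "CI" line per file.
tmp=$(mktemp -d)
printf 'Document ID: 100\nCI=alpha\n' > "$tmp/a.xml"
printf 'Document ID: 200\nCI=beta\n'  > "$tmp/b.xml"

extract_ids "$tmp" "$tmp/out.txt"
cat "$tmp/out.txt"
```

With this sample input the output file holds one line per XML file, with the two matches joined by "~". If a file contains several matching lines, tr -d '\n' runs them together on the same output line.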

Answer 2

What you want is paste:

#!/bin/bash
for f in *.xml
do
    paste -d '~' <(grep 'Document ID:' "$f") <(grep 'CI[\^"]' "$f")
done > /xmldocs/1.txt
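
As a rough illustration of what paste does here, the sketch below joins line N of one grep result with line N of the other, using "~" as the delimiter. The helper name pair_lines, the 'CI=' pattern and the sample file are assumptions made for the demo, not taken from the real data.

```shell
#!/bin/bash
# Sketch only: pair_lines is a hypothetical helper, and the sample file is invented.
# Usage: pair_lines FILE PATTERN1 PATTERN2
pair_lines() {
    # Process substitution <(...) requires bash; it will not work in plain sh.
    paste -d '~' <(grep -- "$2" "$1") <(grep -- "$3" "$1")
}

sample=$(mktemp)
printf 'Document ID: 100\nCI=alpha\nDocument ID: 200\nCI=beta\n' > "$sample"

pair_lines "$sample" 'Document ID:' 'CI='
```

Unlike the tr-based loop, this keeps one output line per pair of matches, so a file with several documents yields several lines. If the two greps return different numbers of lines, paste fills the missing side with an empty field.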

Answer 3

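As for why the script hangs when using grep 'CI[^"]': the ^ needs to be escaped. Using grep 'CI[\^"]' solved the problem. This is because a caret placed first inside a bracket expression is interpreted as negation, so CI[^"] matches CI followed by any character other than a double quote. Note that a backslash is an ordinary character inside a POSIX bracket expression, so [\^"] also matches a literal backslash; putting the caret anywhere but first, as in CI["^], matches only a caret or a double quote.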

Edit: corrected per Steeldriver.
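
A quick way to see how grep reads these bracket expressions is to test each pattern against a few short strings. This is only a sketch: the helper name matches and the test strings are invented for the demo.

```shell
#!/bin/bash
# Sketch only: matches is a hypothetical helper that prints "yes" when
# PATTERN (first argument) matches TEXT (second argument), otherwise "no".
matches() {
    if printf '%s\n' "$2" | grep -q -- "$1"; then
        echo yes
    else
        echo no
    fi
}

matches 'CI[^"]'  'CI=alpha'    # yes: [^"] is any character except a double quote
matches 'CI[^"]'  'CI"alpha"'   # no:  the quote is excluded by the negated set
matches 'CI[\^"]' 'CI^alpha'    # yes: the set holds backslash, caret and quote
matches 'CI[\^"]' 'CI\alpha'    # yes: the backslash is a literal member of the set
matches 'CI["^]'  'CI^alpha'    # yes: a caret that is not first is literal
matches 'CI["^]'  'CI\alpha'    # no:  backslash is not in this set
```

Placing the caret anywhere except first is the usual way to get a literal caret in a bracket expression without also admitting the backslash.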
