
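I have a large file that contains many MQ messages with RFH2 headers. The messages in the file are separated by blank lines. I need to split this large file into small files, each containing a single message together with its RFH2 header.
I tried the awk command below: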
awk '{RS=""} {print $0}' inputfile
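This prints only the first line, without the control characters, which is useless. The first line of each MQ RFH header starts with something like RFH ^B^C^X^A^Q^C3MQSTR ^D¸ followed by the message data, but the awk output contains only the text RFH. If the input file has 50 messages, after running this command I get 50 files that contain only the text RFH. I expect 50 files, each with the RFH2 header and its data.
I cannot provide the real input file because it contains sensitive data. The file starts like this: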
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines.....
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
........some text of many lines.....
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
...
Each output file should contain:
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines
Answer 1
Here you go. Input (test file):
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
Code:
awk '{print > ("file" NR)}' RS='\n\n' testfile
Replace "file" with the name you want for the output files. With this example, you get:
$ cat file1
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
$ cat file2
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
$ cat file3
RFH ^B^C^X^A^Q^C3MQSTR ^D¸X<jms>
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
.........some text of many lines.....
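A slightly more defensive variant of the same idea, offered as a sketch rather than a drop-in solution: it uses POSIX paragraph mode (`RS=""`) instead of the multi-character `RS='\n\n'`, which only some awk implementations (gawk and mawk among them) treat as a regular expression, and it closes each output file so that a large input does not hit the open-file limit. The directory `split_demo`, the `msg_` prefix, and the sample data are assumptions made for illustration. If the real headers contain NUL bytes, some awk implementations may still truncate records, which this sketch does not address.

```shell
# Build a small sample input: three messages separated by blank lines.
# (Stand-in data; the real file would carry binary RFH2 headers.)
mkdir -p split_demo
printf 'RFH \002\003 <jms>\nline a\nline b\n\nRFH \002\003 <jms>\nline c\n\nRFH \002\003 <jms>\nline d\nline e\n' > split_demo/input

# Paragraph mode: RS="" makes each blank-line-separated block one record.
# Each record goes to its own zero-padded file, which is closed right away.
awk -v RS= -v prefix=split_demo/msg_ '{
    out = sprintf("%s%03d", prefix, NR)
    print > out
    close(out)
}' split_demo/input

ls split_demo
```

The zero-padded names (`msg_001`, `msg_002`, ...) keep the files in message order when listed; widen the `%03d` format if the input holds more than 999 messages.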
Answer 2
This is close:
awk '{RS=""} {print $0}' inputfile
However, you need to set the RS variable before awk starts reading the file. Choose one of the following:
awk 'BEGIN {RS=""} {print}' inputfile
awk -v RS="" '{print}' inputfile
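To see why the placement matters, here is a small sketch on made-up data (two blank-line-separated blocks; the file name `rs_demo.txt` is an assumption). With the assignment in the main rule, awk has already read the first record using the default newline separator by the time `RS=""` runs, so the first record is only the first line. With `BEGIN`, paragraph mode applies from the start.

```shell
printf 'head1\nbody1\n\nhead2\nbody2\n' > rs_demo.txt

# RS set too late: record 1 was split on newline, so it is just "head1".
awk '{RS=""} NR==1 {print "late:  [" $0 "]"}' rs_demo.txt

# RS set in BEGIN: record 1 is the whole first block, and there are 2 records.
awk 'BEGIN {RS=""} END {print "records: " NR}' rs_demo.txt
```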
To see the control characters, pipe the awk output through cat -v:
awk -v RS="" 1 inputfile | cat -v
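As a quick illustration of what `cat -v` shows, the sketch below feeds awk a made-up record whose first line contains a few control bytes written as octal escapes. The sample bytes and the file name `ctl_demo.txt` are assumptions, not the real RFH2 layout.

```shell
# \002 \003 \030 \001 are Ctrl-B, Ctrl-C, Ctrl-X and Ctrl-A.
printf 'RFH \002\003\030\001 <jms>\nbody\n\nRFH \002 <jms>\nbody2\n' > ctl_demo.txt

# Print only the first record and make the control bytes visible.
# First output line: RFH ^B^C^X^A <jms>
awk -v RS="" 'NR == 1' ctl_demo.txt | cat -v
```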