How do I grep text up to the next space?

17/04/27 10:50:17 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
17/04/27 10:50:17 INFO Master: Launching driver driver-20170427105017-0000 on worker worker-20170427103840-192.168.5.242-7078
17/04/27 10:50:22 INFO Master: 192.168.5.5:53156 got disassociated, removing it.
17/04/27 10:50:22 INFO Master: 192.168.5.5:37668 got disassociated, removing it.
17/04/27 10:50:22 INFO Master: 192.168.5.5:53154 got disassociated, removing it.
17/04/27 10:55:27 INFO Master: Registering app ETL DataPipeline App
17/04/27 10:55:27 INFO Master: Registered app ETL DataPipeline App with ID app-20170427105527-0000
17/04/27 10:55:27 INFO Master: Launching executor app-20170427105527-0000/0 on worker worker-20170427103842-192.168.5.175-7078
17/04/27 10:55:27 INFO Master: Launching executor app-20170427105527-0000/1 on worker worker-20170427103838-192.168.5.37-7078
17/04/27 11:08:25 INFO Master: Asked to kill driver driver-20170427105017-0000
17/04/27 11:08:25 INFO Master: Kill request for driver-20170427105017-0000 submitted
17/04/27 11:08:26 INFO Master: Received unregister request from application app-20170427105527-0000

How can I extract driver-20170427105017-0000 together with its corresponding 192.168.5.242 and, similarly, grep app-20170427105527-0000/0 together with its corresponding 192.168.5.175?

Answer 1

Using sed to get all "Launching"-related messages for both drivers and executors:

$ sed -n -E 's/^.*Launching (driver|executor) ([^ ]*).*worker-[0-9]*-([^-]*).*$/\2 \3/p' file.in
driver-20170427105017-0000 192.168.5.242
app-20170427105527-0000/0 192.168.5.175
app-20170427105527-0000/1 192.168.5.37
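  • [^ ]* matches any number of characters other than a space.
  • \2 and \3 are back-references to what the second and third parenthesized groups matched, respectively. The second group contains [^ ]* and matches the text after "Launching driver" or "Launching executor"; the third group contains [^-]* and matches the IP address (up to the terminating -).
  • The ^ and $ in s/^...$/.../p anchor the regular expression at the start and end of the line, so the whole line is replaced. Because -n suppresses sed's default output, the trailing p tells sed to print the result of the substitution (only if a substitution was made).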
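
To see the points above in action, here is a minimal sketch that wraps the same sed command in a shell function and feeds it two lines from the question's log. The function name launch_pairs is hypothetical and used only for illustration; the -E option is assumed to be available, as it is in GNU and BSD sed.

```shell
# Hypothetical helper: read log lines on stdin and print "<id> <worker-ip>"
# for every "Launching driver" / "Launching executor" line.
launch_pairs() {
    sed -n -E 's/^.*Launching (driver|executor) ([^ ]*).*worker-[0-9]*-([^-]*).*$/\2 \3/p'
}

# Two sample lines from the question. Only the first contains "Launching",
# so only that line produces output; -n hides the non-matching one.
launch_pairs <<'EOF'
17/04/27 10:50:17 INFO Master: Launching driver driver-20170427105017-0000 on worker worker-20170427103840-192.168.5.242-7078
17/04/27 10:50:22 INFO Master: 192.168.5.5:53156 got disassociated, removing it.
EOF
# prints: driver-20170427105017-0000 192.168.5.242
```

Dropping the -n option would make sed echo the unmatched "got disassociated" line as well, which is why -n and the p flag are used together.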

Alternatively, and possibly more robustly because it relies on less regular-expression magic, use awk:

$ awk '/Launching/ { split($NF, a, "-"); print $7, a[3] }' file.in
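
The awk command relies on field positions: with the log format shown in the question, the ID is field 7 and the last field ($NF) is the worker name, which split() breaks on "-" so that a[3] holds the IP address. As a small variation, the sketch below tests field 5 instead of searching the whole line, so that a line merely containing the word "Launching" elsewhere (for example in an application name) would not match. The function name launch_pairs_awk is hypothetical, and the field numbers assume the exact log layout from the question.

```shell
# Hypothetical helper: print "<id> <worker-ip>" for lines whose 5th
# whitespace-separated field is exactly "Launching".
# Reads the files given as arguments, or stdin when none are given.
launch_pairs_awk() {
    awk '$5 == "Launching" {
        split($NF, a, "-")   # worker-<timestamp>-<ip>-<port> -> a[3] is the IP
        print $7, a[3]       # field 7 is the driver or executor ID
    }' "$@"
}

launch_pairs_awk <<'EOF'
17/04/27 10:55:27 INFO Master: Registered app ETL DataPipeline App with ID app-20170427105527-0000
17/04/27 10:55:27 INFO Master: Launching executor app-20170427105527-0000/1 on worker worker-20170427103838-192.168.5.37-7078
EOF
# prints: app-20170427105527-0000/1 192.168.5.37
```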
