將大檔案拆分為唯一的檔案

將大檔案拆分為唯一的檔案

我有 100M 的巨大文件,其中有文字(下面幾行,)堆疊行是唯一的scaffold1_scaffold2_等等。

我想要一個特定於每個腳手架的新sed檔案grep

scaffold1_2,C,C,C,C,N,C,N,C,G,G,C,N,C,C,G,N,N,C,G,N,C,C,C,G,C,N,N,C,C
scaffold1_113,T,T,T,T,T,T,T,T,T,C,T,T,T,T,T,T,C,T,T,T,T,T,C,T,T,N,T,T,T
scaffold1_149,G,G,G,G,C,G,C,G,C,G,G,C,C,G,C,C,G,C,G,G,C,G,G,G,G,C,G,G,G
scaffold1_160,G,G,G,T,G,T,T,T,N,T,T,T,G,T,G,G,T,T,T,T,T,N,T,T,G,G,T,T,G
scaffold2_315,C,C,C,G,C,C,C,C,C,C,C,G,C,C,G,G,G,C,C,C,C,C,C,G,C,C,C,C,G
scaffold2_318,G,A,A,A,A,A,A,G,A,A,A,A,A,A,A,A,A,A,G,A,A,A,A,A,A,A,A,A,A
scaffold2_323,T,T,T,T,T,C,C,T,T,T,T,T,T,T,T,T,T,T,T,C,T,T,T,T,T,T,T,T,T
scaffold2_397,A,A,A,A,A,A,C,A,A,A,A,A,A,A,A,A,C,A,A,A,A,A,A,A,A,A,A,A,A
scaffold3_402,C,C,C,C,C,T,C,C,C,C,C,C,C,T,C,C,C,T,C,C,C,C,C,C,C,C,C,C,C
scaffold3_465,G,G,G,G,G,G,G,G,G,C,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,C,C,G
scaffold3_502,C,C,C,C,C,C,C,C,C,G,C,C,C,C,C,C,C,G,C,G,C,G,C,C,C,C,C,C,C
scaffold3_508,G,G,G,C,G,G,G,G,G,C,G,C,C,C,C,C,G,C,G,C,G,C,C,C,C,C,C,C,C
scaffold3_533,G,G,G,G,A,A,A,G,A,G,A,G,G,A,G,G,A,A,G,A,A,G,G,G,G,G,G,G,G
scaffold4_555,T,T,T,T,T,T,T,T,T,T,T,T,T,T,A,A,T,T,T,T,T,T,T,T,T,T,T,T,T
scaffold4_586,T,T,T,T,T,T,T,T,T,T,T,C,C,T,T,T,T,T,T,T,T,T,C,T,C,T,T,T,T
scaffold4_593,A,G,G,G,A,A,A,A,G,G,A,A,A,A,G,G,A,A,G,A,G,G,A,G,A,G,A,A,G
scaffold4_598,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,A,A,C
scaffold4_664,G,A,A,G,G,G,A,G,G,G,G,G,G,G,G,G,A,G,G,G,G,G,G,G,G,G,G,G,G
scaffold5_667,C,C,C,C,C,C,C,C,T,C,C,C,C,C,C,C,C,C,T,C,T,T,C,C,C,C,T,T,C
scaffold5_670,A,A,A,A,A,A,A,A,G,A,A,A,A,A,A,A,A,A,G,A,G,G,A,A,A,A,G,G,A
scaffold5_679,T,C,C,C,C,C,C,T,C,T,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C

生成的文件應該類似於

貓腳手架1.txt
scaffold1_2,C,C,C,C,N,C,N,C,G,G,C,N,C,C,G,N,N,C,G,N,C,C,C,G,C,N,N,C,C
scaffold1_113,T,T,T,T,T,T,T,T,T,C,T,T,T,T,T,T,C,T,T,T,T,T,C,T,T,N,T,T,T
scaffold1_149,G,G,G,G,C,G,C,G,C,G,G,C,C,G,C,C,G,C,G,G,C,G,G,G,G,C,G,G,G
scaffold1_160,G,G,G,T,G,T,T,T,N,T,T,T,G,T,G,G,T,T,T,T,T,N,T,T,G,G,T,T,G
貓腳手架2.txt
scaffold2_315,C,C,C,G,C,C,C,C,C,C,C,G,C,C,G,G,G,C,C,C,C,C,C,G,C,C,C,C,G
scaffold2_318,G,A,A,A,A,A,A,G,A,A,A,A,A,A,A,A,A,A,G,A,A,A,A,A,A,A,A,A,A
scaffold2_323,T,T,T,T,T,C,C,T,T,T,T,T,T,T,T,T,T,T,T,C,T,T,T,T,T,T,T,T,T
scaffold2_397,A,A,A,A,A,A,C,A,A,A,A,A,A,A,A,A,C,A,A,A,A,A,A,A,A,A,A,A,A

答案1

for i in $(cat myfile.txt|cut -d"_" -f 1 | sort | uniq)
do
  grep ${i} myfile.txt > ${i}.txt
done

這應該有效

相關內容