Extracting text from a document using Notepad++

Answer 1

You can't do this in a single pass in Notepad++. It takes two steps.

First step:

Ctrl+H
Find what: (?:^|\G).+?NM1\*71\*1\*(.+?)\*{4}XX\*(\d+)
Replace with: $1 $2\n
check Wrap around
check Regular expression
DO NOT CHECK . matches newline
Replace all

Explanation:

(?:             : non-capturing group
  ^             : beginning of line
 |              : OR
  \G            : position where the previous match ended
)               : end group
.+?             : 1 or more of any character, not greedy
NM1\*71\*1\*    : literally "NM1*71*1*", the asterisks have to be escaped
(.+?)           : group 1, 1 or more of any character, not greedy
\*{4}XX\*       : 4 asterisks, XX, then 1 asterisk
(\d+)           : group 2, 1 or more digits
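
To see what the two groups capture, here is a minimal Python sketch. It is only an illustration, not part of the Notepad++ procedure: the sample segment is an assumption reconstructed from the pattern and the result shown in this answer, and the (?:^|\G) prefix is left out because Python's built-in re module has no \G anchor.

```python
import re

# Core of the pattern above, without the (?:^|\G).+? prefix:
# Python's built-in re module does not support \G, and re.search
# does not need it to locate a single segment.
PATTERN = re.compile(r"NM1\*71\*1\*(.+?)\*{4}XX\*(\d+)")

# Hypothetical segment, reconstructed from the pattern and the result.
segment = "NM1*71*1*Darbinian*Sevak****XX*1306859178~"

match = PATTERN.search(segment)
name, number = match.group(1), match.group(2)

# Group 1 stops at the first run of four asterisks followed by XX*.
print(name, number)
```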

Replacement:

$1      : content of group 1
        : a space
$2      : content of group 2
\n      : line feed; replace it with the line break you need (for example \r\n)

Result for the given example:

Darbinian*Sevak 1306859178
Boonyaputthikul*Robert 1700198801
LX*1~SV2*0551*HC>G0154*250*UN*4~DTP*472*D8*20180125~REF*6R*74990810~

Second step: delete the last line, which holds the leftover text after the final match.
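
If the same extraction has to be scripted outside Notepad++, both steps can be sketched in Python. This is a rough equivalent under stated assumptions: the input is a single line, SAMPLE is made-up data shaped like the result above, and extract is a hypothetical helper name. The (?:^|\G) prefix is dropped because Python's built-in re module does not support \G; re.sub already resumes each search where the previous match ended.

```python
import re

# The leading .+? consumes everything before each NM1*71*1* segment,
# as in the Notepad++ pattern.
PATTERN = re.compile(r".+?NM1\*71\*1\*(.+?)\*{4}XX\*(\d+)")

# Made-up single-line input, assumed to resemble the original document.
SAMPLE = (
    "HL*1**20*1~NM1*71*1*Darbinian*Sevak****XX*1306859178~"
    "CLM*A*100~NM1*71*1*Boonyaputthikul*Robert****XX*1700198801~"
    "LX*1~SV2*0551*HC>G0154*250*UN*4~"
)

def extract(text: str) -> list[str]:
    """Return one 'name number' string per NM1*71*1* segment."""
    # Step one: the same replacement as in Notepad++ ($1 $2\n becomes \1 \2\n).
    replaced = PATTERN.sub(r"\1 \2\n", text)
    # Step two: drop the last line, which is the unmatched tail.
    return replaced.split("\n")[:-1]

for line in extract(SAMPLE):
    print(line)
```

Dropping the last element covers both cases: a leftover tail after the final match, or an empty string when the text ends exactly at a match.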
