我對 Notepad++ 相當陌生,並嘗試使用正規表示式來搜尋欄位中的特定值並刪除其父標籤(以及包括該欄位的所有內容)。
本質上,我正在嘗試刪除具有特定商店 ID 的交易。這些文件很大,有數千個條目我需要刪除,範例如下!
樣本
<Transaction>
<TxnHeader>
<StoreId>6705</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>343243</TxnNumber>
<StartDate>2019-02-02T07:42:45</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>4.470000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>8351</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>327527</TxnNumber>
<StartDate>2019-02-02T08:02:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>7.310000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>7837</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>164728</TxnNumber>
<StartDate>2019-02-02T08:19:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>1902</ItemNumber>
<DeptNumber>154</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>10.000000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
期望的
<Transaction>
<TxnHeader>
<StoreId>6705</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>343243</TxnNumber>
<StartDate>2019-02-02T07:42:45</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>6304</ItemNumber>
<DeptNumber>168</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>4.470000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
<Transaction>
<TxnHeader>
<StoreId>7837</StoreId>
<TillNumber>1</TillNumber>
<TxnNumber>164728</TxnNumber>
<StartDate>2019-02-02T08:19:47</StartDate>
<TxnType>1</TxnType>
</TxnHeader>
<TxnItemLines>
<TxnItemLine>
<DetailSequence>1</DetailSequence>
<ItemNumber>1902</ItemNumber>
<DeptNumber>154</DeptNumber>
<Quantity>1.000000</Quantity>
<LineValue>10.000000</LineValue>
</TxnItemLine>
</TxnItemLines>
</Transaction>
上面所需的文字已完全刪除包含 8351 的交易標籤
我嘗試使用以下查詢進行正規表示式查找和替換(沒有任何內容):
<Transaction>.*?<StoreID>8351</StoreID>.*?</Transaction>
它最終將文件的一大塊從頂部一直向下包裝到包含 8351 的第一個事務的末尾
任何幫助將不勝感激!
答案1
- Ctrl+H
- 找什麼:
<Transaction>(?:(?!</Transaction>).)+<StoreId>8351</StoreId>(?:(?!<Transaction>).)+</Transaction>\R
- 用。
LEAVE EMPTY
- 檢查匹配大小寫
- 檢查環繞
- 檢查正規表示式
- 查看
. matches newline
- Replace all
解釋:
<Transaction> # opening tag
(?:(?!</Transaction>).)+ # tempered greedy token, make sure we haven't </Transaction> before the following
<StoreId>8351</StoreId> # literally
(?:(?!<Transaction>).)+ # tempered greedy token, make sure we haven't <Transaction> before the following
</Transaction> # literally, closing tag
\R? # optional any kind of linebreak
螢幕截圖:
更多關於淬煉的貪婪令牌