Sed (or other) script to replace characters in a capture group

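I am trying to convert Pandoc markup to Confluence wiki markup, and I am using markdown2confluence to do most of the work. This works well except where I talk about CSS and FreeMarker, which use { and } in code, whereas Confluence uses {{ and }} to mark the start and end of a code span. So I need to match a pattern enclosed in {{...}}.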

If I knew (more) Ruby I could probably fix it there, but I am an old-school Unix guy, so I thought of awk or sed.

So far I have got to:

   sed 's/{{\([^}}]*\)}}/{{"\1"}}/g' tmp.wkd

which takes:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{*}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{{}} and {{}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{[...]}} instead of {{*}}.

and produces:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{"*"}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{"{"}} and {{""}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{"[...]"}} instead of {{"*"}}.

But what I need is:

First we need a way to select a state (or group of states) CSS uses what
is called a selector to choose which elements to apply to, we have been
using one up until now without noticing, it is the {{*}} at the beginning
of our CSS. This is a special selector that means select everything. So
the rule that follows it (the bit between {{\{}} and {{\}}} apply to every
polygon on the map. But CSS allows us to insert a filter instead by
using {{[...]}} instead of {{*}}.

I also need to handle {{${type.name}}}, which should become {{$\{type.name\}}}.

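There are two problems:

  1. I need to replace { with \{ rather than wrapping the match in quotes, so I need to modify \1 somehow.
  2. The nasty-looking {{}}} (which should come out as {{\}}}) does not come out right no matter how I try to end the pattern match.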

Answer 1

The following sed command seems to work:

   sed 's/{{\([^*[a-z][^}]*\)}}/{{\\\1}}/g;s/{{\\${\([^}]*\)}}}/{{$\\{\1\\}}}/g'

Explanation:

  1. {{\([^*[a-z][^}]*\)}} finds {{stuff}}, unless stuff starts with *, [ or a lowercase letter.
  2. It is replaced with {{\stuff}}.
  3. Then {{\\${\([^}]*\)}}} finds {{\${junk}}}.
  4. It is replaced with {{$\{junk\}}}.
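
To check the steps above against the spans from the question, the command can be wrapped in a small shell function and fed one span per line. This is only a sketch: the function name escape_code_braces is a hypothetical wrapper invented here, and the sed script inside it is the one quoted in this answer, unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper name; the sed script itself is the one from this answer.
escape_code_braces() {
    sed 's/{{\([^*[a-z][^}]*\)}}/{{\\\1}}/g;s/{{\\${\([^}]*\)}}}/{{$\\{\1\\}}}/g'
}

# Feed the code spans from the question, one per line.
printf '%s\n' '{{*}}' '{{[...]}}' '{{{}}' '{{}}}' '{{${type.name}}}' | escape_code_braces
# Prints:
# {{*}}
# {{[...]}}
# {{\{}}
# {{\}}}
# {{$\{type.name\}}}
```

Because the first expression skips any span that starts with a lowercase letter, a span such as {{a{b}} passes through untouched, which is one reason to treat this version as specific to the sample text.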

Edit: after the OP's clarification, an alternative solution could be this:

   sed 's/\({{[^}]*\){\([^}]*}}\)/\1\\{\2/g;s/\({{[^}]*\)}}}/\1\\}}}/g'

As is well known, sed cannot do recursive parsing, but this should work for most simple cases.

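Explanation:

  1. \({{[^}]*\){\([^}]*}}\) finds {{foo{bar}}, where foo and bar do not contain }.
  2. It is replaced with {{foo\{bar}}. (Note that {{xxx{yyy}}} is handled correctly.)
  3. Then \({{[^}]*\)}}} finds {{baz}}}, where baz does not contain }.
  4. It is replaced with {{baz\}}}.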

foo, bar and baz can be empty, so for example {{}}} is converted to {{\}}}, as required.
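
As a quick check of the alternative command, here is a sketch that runs it over fragments of the sample text from the question. The function name escape_braces_simple is a hypothetical wrapper invented for illustration, and the sed script is the one from the edit above, unchanged.

```shell
#!/bin/sh
# Hypothetical wrapper name; the sed script is the alternative from the edit above.
escape_braces_simple() {
    sed 's/\({{[^}]*\){\([^}]*}}\)/\1\\{\2/g;s/\({{[^}]*\)}}}/\1\\}}}/g'
}

# Fragments of the sample text from the question, one per line.
printf '%s\n' \
    'the bit between {{{}} and {{}}} apply' \
    'handle {{${type.name}}} too' \
    'using {{[...]}} instead of {{*}}.' | escape_braces_simple
# Prints:
# the bit between {{\{}} and {{\}}} apply
# handle {{$\{type.name\}}} too
# using {{[...]}} instead of {{*}}.
```

The "simple cases" caveat shows up as soon as a span holds more than one opening brace: {{a{b{c}} comes out as {{a{b\{c}}, because the greedy [^}]* in the first group swallows the earlier { and only the last one gets escaped.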
