An expandable, standard-TeX-only way to compare balanced token lists


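Is there a (reasonably) efficient macro that does something like \long\def\comparets#1#2{\def\aa{#1}\def\bb{#2}\ifx\aa\bb true\else false\fi} but expandably (i.e. \newcomparets{<tokens1>}{<tokens2>} expands to "true" or "false", including inside \edef)? I am looking for a "pure" TeX solution (i.e. one without extensions such as e-TeX). I have looked at the l3tl macros, but they seem to use e-TeX. The solution should work for arbitrary token sequences (including ones that contain all kinds of "funny spaces", braces, and arbitrary control sequences). I cannot seem to find a way to do this without making several passes.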
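To illustrate the difficulty, here is a minimal sketch (not part of the original question) of why the \def-based version cannot be used inside \edef. It assumes plain TeX; the scratch name \result is hypothetical.

```tex
\long\def\comparets#1#2{\def\aa{#1}\def\bb{#2}\ifx\aa\bb true\else false\fi}

% In ordinary context this works, because the two \def assignments are
% executed before \ifx is reached:
\comparets{a {b} c}{a {b} c}% typesets: true

% Inside \edef nothing is executed, only expanded. \def is unexpandable, so it
% is copied as is, and TeX then tries to expand \aa, which at that point is
% either undefined (an error) or still holds the value from an earlier call:
% \edef\result{\comparets{a}{b}}% does not reduce to just "true" or "false"
\bye
```

The commented-out line is the case the question is about: an expandable replacement has to reach its verdict without any assignments.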

Answer 1

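I am not sure it is reasonable to post this as an answer to my own question, since it does not really answer it, but I will not mark it as such (even assuming I could), so if anyone has the inspiration to solve the original problem, I will gladly mark that as the real answer.

Now about the macros. I apologise in advance for the shape they are in. They were extracted (and renamed) from various pieces of code I have written over the years, so the style is a bit ... eclectic, shall we say. Many things below can be optimised, but the multiple-pass problem remains, as I explain later, so if anyone has a clever trick to solve it, please let me know.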

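A few caveats:

1) The actual comparison macro is missing. Only the analysis part is present, which delivers, in a "prefix expandable" way (e.g. when used with the \romannumeral-1 trick), a string of category 11 and 12 tokens that carries enough information to identify the sequence: categories, character codes (if any), whether a token is a brace, and so on.

2) Well, 1) is a white lie, for two reasons:

a) Any token that can be grabbed as an argument (i.e. non-space, non-brace) is (grabbed and) replaced by its \meaning string enclosed in t ... e (both t and e are category 11); note that category 10 tokens whose character code is not 32 fall into this category (pun intended). \yygrabtokenraw can be adjusted to provide a better analysis (this must be done if the goal is to compare arbitrary balanced token lists, but it just comes down to a few carefully written conditionals). Note that merely using \string is not enough, since \escapechar can be -1.

b) The "top-level" recursion step is missing. The main problem here is braces with character code 32. They are handled at the final stage, when the length of the sequence is known and one can \string each of them or find out their \meaning. Well, not so fast: if they have character code 32, both \meaning and \string turn them into an ordinary space (\meaning will end in two spaces, which does not help either), a problem that \detokenize was invented to fix. So we need to decide how to grab them. One guarantee the code makes is that every left brace will be correctly identified as having either character code 32 (o1e, c1e) or a character code other than 32 (o2e, c2e). The code that does this messes up some of the closing braces that follow (their character codes) in order to use the braces safely, so a c2e "token" after the first brace is unreliable (however, if another o1e, c1e or o2e is found, it can be trusted). The next iteration can grab the "cracked" brace without messing up the one after it. After more passes (unfortunately, as many as there are closing braces in the worst case), everything can be resolved. If anyone is interested, I can finish the macros to do this. If only Knuth had made \meaning end with a dot ...

3) The code spends a lot of time "propagating expansion". The typical situation is \somemacro{<long list of benign tokens>}{\string}, where the \string has to be expanded before anything else can happen, so \somemacro spends a lot of time inserting \expandafters into <long list ...>. Note that \romannumeral will fail if <long list ...> is long, so encoding everything as digits will not help. Using \csname <long ...>\endcsname is possible (with an \expandafter to follow), but I am uneasy about polluting TeX's hash table in this way.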
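As a rough illustration of the two expansion idioms mentioned in the caveats above, here is a minimal plain TeX sketch. The names \late, \outer and \inner are hypothetical and do not appear in the macros below.

```tex
\def\late{Z}

% To expand \late before the token register assignment absorbs the list, an
% \expandafter is needed in front of every token that precedes it:
\toks0=\expandafter{\expandafter a\expandafter b\expandafter c\late}
\message{[\the\toks0]}% should show [abcZ]

% The \romannumeral-1 trick: while looking for more digits, TeX keeps
% expanding what follows until it meets an unexpandable token; the negative
% number itself produces nothing.
\def\outer{\inner}
\def\inner{done}
\toks2=\expandafter{\romannumeral-1\outer}
\message{[\the\toks2]}% should show [done]
\bye
```

The first idiom costs one \expandafter per preceding token, which is the overhead that caveat 3) complains about.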

The macros try to identify "funny spaces" in the first pass; this is the only use of \meaning, in \yymatchblankspace below. Otherwise one could make do with \string alone.
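The following sketch shows what \meaning reports for a funny space, using the same \uccode and \uppercase construction as the listing below. The macro names \funny and \normal are hypothetical, and the group is there only to keep the \uccode change local.

```tex
\begingroup
\uccode`\ =`\-
% \funny holds a single token of category 10 and character code `\-
\uppercase{\def\funny{ }}
\def\normal{ }% an ordinary space: category 10, character code 32
\message{[\expandafter\meaning\funny]}%  should show [blank space -]
\message{[\expandafter\meaning\normal]}% should show [blank space  ]
\endgroup
\bye
```

The character printed after "blank space " is what lets the first pass tell a funny space from an ordinary one.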

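Test cases for the macros are included at the end. If I have overlooked something silly, I apologise (when Joseph Wright and others have doubts, I tend to have doubts too).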

EDIT: apart from whatever other problems these may have, I have omitted \long in front of every definition for clarity, so a \par will break them.
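A minimal sketch of the \par problem, assuming plain TeX; \eatonelong is a hypothetical \long variant that is not part of the listing below.

```tex
\def\eatone#1{}%          as in the listing: the argument may not contain \par
\long\def\eatonelong#1{}% hypothetical \long variant: \par is accepted

\eatonelong{first\par second}% fine, the argument is simply discarded
% \eatone{first\par second}% would stop with
%                            "! Paragraph ended before \eatone was complete."
\bye
```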

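Expanding on "provide a better analysis" above: to handle pathological cases (e.g. \escapechar=-1 \let\#=#), one can prepare either a bunch of macros (one (or even two) per character, e.g. \expandafter\def\csname match#\endcsname #1\##{...}% last '#' is \catcode 13) or a few macros, one of which, \defed as \def\maintest #1<a list of all active characters and single letter cs's>{...}, does all the heavy lifting (by recursively inserting the grabbed "token" among the potential "delimiters"). A choice somewhere between the two (trading time for space) is also possible. As for "having a lot of macros", that is certainly an issue. My (imperfect) take on it is: "if one can afford that many \catcode registers, one can afford those special 'conditionals' as well".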
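To see why this case is pathological, here is a small sketch (the names \viacs and \viachar are hypothetical) in which neither \meaning nor \string can tell the control sequence \# from the character token #:

```tex
\begingroup
\escapechar=-1
\let\#=#% \# now has the same meaning as the character token #
\edef\viacs{\meaning\#|\string\#}%
\edef\viachar{\meaning#|\string#}%
\ifx\viacs\viachar
  \message{[indistinguishable: \viacs]}%
\else
  \message{[different]}%
\fi
\endgroup
\bye
```

Under these settings I would expect the first branch to be taken, which is why an analysis based only on \meaning and \string needs the extra delimiter-matching macros described above.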

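I am afraid the expansion propagation problem mentioned above is just the price of doing recursion in TeX. It can be alleviated somewhat by encoding each token (during the first pass) as \yysx ?, where \def\yysx#1#2{\expandafter\space\expandafter\yysx\expandafter#1\romannumeral-1#2}. This way a \romannumeral-1 in front of a list of \yysx ? entries will "pass" the expansion to the end of the list while leaving the list intact.

The "brace post-processing" feels like something that should be avoidable.

Finally, I have been asked several times "why no e-TeX?". I am not sure this is the right place to discuss it, but I have (possibly subjective) reasons for avoiding it. If anyone can suggest a better place to discuss such preferences, I would be grateful.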

% helper macros (to build test cases, etc); @ is a letter

\catcode`\@=11 % make @ a letter, as every definition below assumes

\def\yyreplacestring#1\in#2\with#3{%
      \expandafter\def\expandafter\r@placestring\expandafter##\expandafter1\the#1##2\end{%
          \def\r@placestring{##2}% is this the string at the very end?
          \ifx\r@placestring\empty % then it is the one we inserted, report
              \errmessage{string <\the#1> not present in \the#2}% do not change the register if the string is not there
          \else % remove the extra copy of #1\end at the end
              \expandafter#2\expandafter\expandafter\expandafter
                  {\expandafter\r@plac@string\expandafter{\the#3}{##1}##2\end}%
      \fi}% end of \r@placestring definition
      \expandafter\def\expandafter\r@plac@string
          \expandafter##\expandafter1%
          \expandafter##\expandafter2%
          \expandafter##\expandafter3%
          \the#1\end{##2##1##3}%
      \expandafter\expandafter\expandafter\r@placestring\expandafter\the\expandafter#2\the#1\end
}

\newtoks\toksa
\newtoks\toksb
\newtoks\toksc
\newtoks\toksd

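% leave a conditional early: drop everything up to \yycontinue (which stands
% in for the closing \fi), close the conditional, then continue with #1
\def\yybreak#1#2\yycontinue{\fi#1}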

\def\eatone#1{}
\def\eatonespace#1 {}
\def\identity#1{#1}
\def\yyfirstoftwo#1#2{#1}
\def\yysecondoftwo#1#2{#2}
\def\yysecondofthree#1#2#3{#2}
\def\yythirdofthree#1#2#3{#3}

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokensequenc@#1#2#3{%
    \yystringempty{#2}{#1{#3}}{\yypreparsetokensequen@@{#1}{#2}{#3}}%
}

\def\yypreparsetokensequen@@#1#2#3{% remaining sequence is nonempty
    \yystartsinbrace{#2}{\yydealwithbracedgroup{#1}{#2}{#3}}{\yypreparsetokensequ@n@@{#1}{#2}{#3}}%
}

\def\yydealwithbracedgroup#1#2#3{% the first token of the remaining sequence is a brace
    \iffalse{\fi\yydealwithbracedgro@p#2}{#1}{#3}%
}

\def\yydealwithbracedgro@p#1{%
    \yypreparsetokensequenc@{\yyrepackagesequence}{#1}{}%
}

% #1 -- parsed sequence
% this is a sequence to `propagate expansion' into the next parameter.
% the same can be achieved by packaging the whole sequence with a 
% \csname ... \endcsname pair and using a simple \expandafter
% maybe that would be a better idea ...

\def\yyrepackagesequence#1{%
    \yyrepackagesequenc@{}#1\end
}

% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% #2 -- the next category 12 character or \end

\def\yyrepackagesequenc@#1#2{%
    \ifx#2\end
        \yybreak{\yyrepackagesequ@nc@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yyrepackagesequenc@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% this macro is followed by the remainder of the original sequence with a so far
% unmatched right brace, the `call stack' and the parsed sequence.

\def\yyrepackagesequ@nc@#1{%
    \expandafter\expandafter\expandafter\yyrepackagesequ@nc@swap#1{\expandafter\eatone\string}%
}

% #1 -- parsed sequence without packaging

\def\yyrepackagesequ@nc@swap#1#{%
    \yyrepackagesequ@nc@sw@p{#1}%
}

% #1 -- parsed `inner' sequence
% #2 -- remainder of the original sequence
% #3 -- `call stack'
% #4 -- parsed sequence so far

\def\yyrepackagesequ@nc@sw@p#1#2#3#4{%
    \yypreparsetokensequenc@{#3}{#2}{#4[#1]}%
}

% `braced group' thread ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokensequ@n@@#1#2#3{% the remaining group in #2 is nonempty and does not start with a brace
    \yystartsinspace{#2}{\yyconsumetruespace{#1}{#2}{#3}}{\yypreparsetokenseq@@n@@{#1}{#2}{#3}}%
}

\def\yyconsumetruespace#1#2#3{%
    \expandafter\yyconsumetruespac@swap\expandafter{\eatonespace#2}{#1}{#3.}%
}

\def\yyconsumetruespac@swap#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% `group starting with a true (character code 32, category code 10) space' thread ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokenseq@@n@@#1#2#3{% a nonempty group, that does not start with a brace or a true space
    \yymatchblankspace{#2}{\yyrescanblankspace{#2}{#1}{#3}}{\yypreparsetokens@q@@n@@{#1}{#2}{#3}}%
}

% #1 -- remaining sequence
% #2 -- `call stack'
% #3 -- `parsed' sequence

\def\yyrescanblankspace#1#2#3{%
    \expandafter\expandafter\expandafter
        \yyrescanblankspac@swap
    \expandafter\expandafter\expandafter{\expandafter\yynormalizeblankspac@\meaning#1}{#2}{#3*}%
}

\def\yyrescanblankspac@swap#1#2#3{%
    \yystartsinspace{#1}{%
        \expandafter\yyrescanblankspac@sw@p\expandafter{\eatonespace#1}{#2}{#3}%
    }{%
        \expandafter\yyrescanblankspac@sw@p\expandafter{\eatone#1}{#2}{#3}%
    }%
}

\def\yyrescanblankspac@sw@p#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% `group starting with a blank space' ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokens@q@@n@@#1#2#3{% nonempty group starting with a non blank, non brace token
    \expandafter\yypreparsetokens@q@@n@@swap\expandafter{\eatone#2}{#1}{#30}%
}

\def\yypreparsetokens@q@@n@@swap#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% #1 -- string of category code 12 or 10 characters
% #2 -- string of category code 12 or 10 characters

\def\yycomparesimplestrings#1#2{%
    \yystringempty{#1}{%
        \yystringempty{#2}{\yyfirstoftwo}{\yysecondoftwo}%
    }{\yycomparesimplestrings@{#1}{#2}}%
}

\def\yycomparesimplestrings@#1#2{% the first string is nonempty
    \yystringempty{#2}{\yysecondoftwo}{\yycomparesimplestrings@@{#1}{#2}}%
}

\def\yycomparesimplestrings@@#1#2{% both strings are nonempty
    \yystartsinspace{#1}{%
        \yystartsinspace{#2}{\yyabsorbfirstspace{#1}{#2}}{\yysecondoftwo}%
    }{%
        \yystartsinspace{#2}{\yysecondoftwo}{\yyabsorbfirstnonspace{#1}{#2}}%
    }%
}

\def\yyabsorbfirstspace#1#2{%
    \expandafter\yyabsorbfirstspac@swap\expandafter{\eatonespace#1}{#2}%
}

\def\yyabsorbfirstspac@swap#1#2{%
     \expandafter\yyabsorbfirst@swap\expandafter{\eatonespace#2}{#1}%
}

\def\yyabsorbfirstnonspace#1#2{%
    \expandafter\yyabsorbfirstnonspac@swap\expandafter{\eatone#1}{#2}%
}

\def\yyabsorbfirstnonspac@swap#1#2{%
     \expandafter\yyabsorbfirst@swap\expandafter{\eatone#2}{#1}%
}

\def\yyabsorbfirst@swap#1#2{%
     \yycomparesimplestrings{#2}{#1}%
}

% `compare strings of category code 12' thread ends here

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yyanalysetokens@#1#2{%
    \yystringempty{#1}{{#2}}%
        {\yyanalysetok@ns@#1\end{#2}}%
}

\def\yyanalysetok@ns@#1#2\end{%
    \ifx#1.%
        \expandafter\yyfirstoftwo
    \else
        \expandafter\yysecondoftwo
    \fi
    {\yygrabablank{#2}}%
    {%
        \ifx#1[% not a space, an opening brace
            \expandafter\yyfirstoftwo
        \else
            \expandafter\yysecondoftwo
        \fi
        {%
            \yydisableobrace{#2}%
        }{% 
            \ifx#1]% not a space, a closing brace
                \expandafter\yyfirstoftwo
            \else
                \expandafter\yysecondoftwo
            \fi
            {%
                \yydisablecbrace{#2}%
            }{% neither space nor brace
                \yygrabtokenraw{#2}%
            }%
        }%
    }%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence
% #3 -- next token

\def\yygrabtokenraw#1#2#3{%
    \expandafter\yyanalysetokens@swap\expandafter{\meaning#3}{#1}{#2}%
}

\def\yyanalysetokens@swap#1#2#3{%
    \yyanalysetokens@{#2}{#3t#1e}%
}

\def\yygrabablank#1#2 {%
    \yyanalysetokens@{#1}{#2s0e}%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yydisablecbrace#1#2{%
    \yydisablecbrac@{}#1\relax#2\end
}


\def\yydisablecbrac@#1#2{%
    \ifx#2\end
        \yybreak{\yydisablecbrac@@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yydisablecbrac@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

\def\yydisablecbrac@@#1{%
    \expandafter\expandafter\expandafter
        \yydisablecbrace@@@#1\end
    \expandafter\expandafter\expandafter
        {\iffalse}\fi\string
}

\def\yydisablecbrace@@@#1\relax#2\end#3{%
    \yystartsinspace{#3}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2c1e}}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2c2e}}%
}

\def\yyanalysetok@nsswap#1#2#3{%
    \iffalse{\fi\yyanalysetokens@{#2}{#3}#1}%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yydisableobrace#1#2{%
    \yydisableobrac@{}#1\relax#2\end
}


\def\yydisableobrac@#1#2{%
    \ifx#2\end
        \yybreak{\yydisableobrac@@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yydisableobrac@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

\def\yydisableobrac@@#1{%
    \expandafter\expandafter\expandafter
        \yydisableobrace@@@#1\end
    \expandafter\expandafter\expandafter
        {\iffalse}\fi\string
}

\def\yydisableobrace@@@#1\relax#2\end#3{%
    \yystartsinspace{#3}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2o1e}}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2o2e}}%
}

\uccode`\ =`\-

% \dotspace expands into a character code `\-, category code 10 token (funny space)

\uppercase{\def\dotspace{ }}

\toksa\expandafter\expandafter\expandafter{\expandafter\meaning\dotspace}

\toksb{-}

\toksc{#2}

\toksd\toksa

\yyreplacestring\toksb\in\toksa\with\toksc

\toksc{}
\yyreplacestring\toksb\in\toksd\with\toksc

\expandafter\def\expandafter\yymatchblankspac@\expandafter#\expandafter1\the\toksd{%
    \yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}%
        {\expandafter\yythirdofthree\expandafter{\string}}%
}

\edef\yymatchblankspace#1{% is it \catcode 10 token?
    \noexpand\iffalse{\noexpand\fi
    \noexpand\expandafter
    \noexpand\yymatchblankspac@
    \noexpand\meaning#1\the\toksd}%
}

% the idea behind the sequence below is that a leading character of category code 10
% is replaced either by a character of category code 10 and character code 32 or a character
% of category code 12 and character code other than 32
% note that while it is tempting to replace the definition below by something that ends in
% ... blank space #2{ ... with the hope of absorbing the result of \meaning in one step,
% this will not give the desired result in case of an active character,
% say, `~' that had been \let to the normal blank space

\expandafter\def\expandafter\yynormalizeblankspac@\expandafter#\expandafter1\the\toksd{}

\def\yystartsinspace#1{% is it \charcode 32, \catcode 10 token?
    \iffalse{\fi\yystartsinspac@#1 }%
}

\def\yystartsinspac@#1 {%
    \yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}{\expandafter\yythirdofthree\expandafter{\string}}%
}

\def\yystartsinbrace#1{%
  \iffalse{{\fi
  \if!\yytoks@mpty#1}}!%
    \expandafter\yysecondoftwo
  \else
    \expandafter\yyfirstoftwo
  \fi
}

\def\yystringempty#1{%
  \iffalse{{{\fi
  \ifcase\yytoks@mpty#1}}\@ne}\z@
    \expandafter\yyfirstoftwo
  \else
    \expandafter\yysecondoftwo
  \fi
}

\def\yytoks@mpty{%
    \expandafter\eatone\expandafter{\expandafter{%
        \ifcase\expandafter1\expandafter}\expandafter}\expandafter\fi\string
}

%% test code begins here

%\tracingmacros=3
%\tracingonline=3

\catcode`\ =13\relax%
\def\actspace{ }%
\catcode`\ =10\relax%

\catcode`\.=13\relax%
\def\actdotspace{.}%
\catcode`\.=12\relax%

\edef\makefunkydotspace{\let\expandafter\noexpand\actdotspace= \dotspace}
\edef\makefunkyspace{\let\expandafter\noexpand\actspace= \space}

\makefunkyspace
\makefunkydotspace

\catcode`\<=1
\catcode`\>=2
\uccode`\<=32
\uccode`\>=32

% inside the following sequence, < and > will become braces with character code 32 (space),
% \actspace will expand into an active character with character code 32, that has been \let to a
% character code 32, category code 10 token (space)

\uppercase{\edef\temptest{{ } \space\space\dotspace\expandafter\noexpand\actspace\expandafter\noexpand\actdotspace{<> {{}{{ u o l k kk
    \end\noexpand\fi\noexpand\else\noexpand\iffalse{}} }}}}}

%\uppercase{\edef\temptest{\dotspace E <>}}

\show\temptest

\def\displaypreparse#1{%
    \expandafter\errmessage\expandafter{\romannumeral-1\yypreparsetokensequenc@{\yyanalysetokens@}{#1}{}{}#1}%
}

\expandafter\displaypreparse\expandafter{\temptest}

\end
