拡張可能で、バランスの取れたトークンリストを比較する標準的な TeX のみの方法

Question

これを自分の質問への回答として投稿することが適切かどうかはわかりません。実際には質問に回答していないからです。しかし、私はそれをそのようにフラグ付けするつもりはありません (フラグ付けできると仮定しても)。そのため、元の問題を解決するひらめきが誰かにある場合は、喜んでそれを真の回答としてラベル付けします。

さて、マクロについて。マクロがこのような形になっていることをあらかじめお詫びします。マクロは、私が長年にわたり記述してきたさまざまなコードから抜粋 (および再再名前変更) したものなので、スタイルは少々... 折衷的とでも言うべきでしょうか。以下では多くの部分を最適化できますが、後で説明するように、複数パスの問題は残っています。そのため、これを解決する巧妙なトリックをお持ちの方がいらっしゃいましたら、ぜひお知らせください。

注意点がいくつかあります:

1) 実際の比較マクロは存在せず、分析部分のみが存在し、これにより、\romannumeral-1カテゴリ 11 および 12 のトークンの文字列が「プレフィックス拡張可能」な方法で (つまり、トリックで機能する) 提供され、この文字列には、カテゴリ、文字コード (存在する場合)、中括弧であるかどうか、文字コードなど、シーケンス内のすべてのトークンを識別するのに十分な情報が含まれています。このような文字列は、必要に応じて直接比較できます。

2) そうですね、1) は次の 2 つの点で嘘です。

a) パラメータとして取得できるトークン (つまり、スペースや中括弧ではないもの) は、取得されて、\meaning t ... e で囲まれた文字列に置き換えられます (t と e は両方ともカテゴリ 11)。カテゴリ 10 のトークンで文字コードが 32 以外のものもこのカテゴリに該当することに注意してください (しゃれです)。を\yygrabtokenraw調整して、より適切な分析を行うことができます (目的がトークンの任意のバランスの取れたリストを比較することである場合は必須ですが、慎重に記述されたいくつかの条件に要約されます)。は-1 になる可能性がある\stringため、 just だけでは十分ではないことに注意してください。\escapechar

b) 「トップレベル」の再帰ステップが欠落しています。ここでの主な問題は、文字コード 32 の中括弧です。これらは、シーケンスの長さがわかっていて、\stringそれらをすべて取得するか、そのを見つけることができる最後の段階で処理されます\meaning。まあ、そんなに早くはありません。なぜなら、カテゴリコード 32 の場合、と\meaningはどちらも\stringそれらを通常のスペースに変換します ( は\meaning2 つのスペースで終了し、どちらも役に立ちません)。これは、\detokenize 修正するために考案された 1 つの問題です。したがって、それらをどのように取得するかを決定する必要があります。コードによって保証される唯一のことは、すべての開き括弧が文字コード 32 (o1eまたはc1e) または 32 以外の文字コード ( o2e、c2e) として正しく識別されることです。これを行うコードは、中括弧を安全に使用するために、後続の閉じ括弧の一部 (文字コード) を台無しにするため、c2e最初の中括弧に続く「マーカー」は信頼できません (ただし、別のo1e、o1eまたはo2e が見つかった場合は、文字コード 32 の中括弧です)。次の反復では、次の括弧を台無しにすることなく、「解読された」括弧を取得できます。さらに何回もパスすると (残念ながら閉じ括弧の数だけ)、すべてが解決されます。興味がある方がいらっしゃれば、これを行うためのマクロを完成させることができます。ただし、Knuth がそれぞれを\meaningドットで終了した場合のみです...

3) コードは「拡張の伝播」に多くの時間を費やしています。典型的な状況は、 \somemacro{<long list of benign tokens>}{\string};\stringが他の何かが起こる前にここで拡張される必要があるため、にを\somemacro挿入するのに多くの時間を費やしていることです。が非常に長い場合はが失敗するため、すべてを数字としてコーディングしても役に立たないことに注意してください。を使用することは可能ですが (フォローアップ付き)、この場合、TeX のハッシュテーブルを汚染することに不安を感じます。\expandafter<long list ...>\romannumeral<long list ...>\csname <long ...>\endcsname\expandafter

マクロは最初のパスで「奇妙なスペース」を識別しようとしますが、これは\meaningおよび\yymatchblankspace 以下のの唯一の用途です。のみで実行できます\string。

マクロのテストケースは最後に含まれています。私が何か愚かなことを見落としていたら、お詫びします (Joseph Wright 氏や他の人たちが疑わしいときは、私も疑う傾向があります)。

編集：これらに何か問題があるかもしれないことに加えて、\long明確にするためにすべての定義の前にを省略したので、が\parそれを台無しにします。

さらに詳しくより良い分析を提供する上記: 異常なケース ( など\escapechar=-1 \let\#=#) を解決するには、多数のマクロ ( のように、1 文字につき 1 つ (または 2 つ)) を用意するか、1 つのed がすべての面倒な処理を実行する (「取得された」トークンを潜在的な「区切り文字」に再帰的に挿入する)\expandafter\def\csname match#\endcsname #1\##{...}% last '#' is \catcode 13マクロをいくつか用意することができます。中間のオプション (時間とスペースの交換) も可能です。「マクロが多すぎる」という点については、もちろん懸念事項です。これに関する私の (不完全な) 見解は、「それだけ多くのレジスタを用意できるのであれば、特別な「条件」も用意できる」というものです。\def\def\maintest #1<a list of all active characters and single letter cs's>{...}\catcode

私は恐れている拡大伝播上記の問題は、単に TeX で再帰を実行することの代償です。この問題は、トークンを (最初のパスで) \yysx ?whereでエンコードすることで、ある程度軽減できます\def\yysx#1#2{\expandafter\space\expandafter\yysx\expandafter#1\romannumeral-1#2}。このように、エントリ\romannumeral-1のリストの前にあるa は\yysx ?、リストの末尾に展開を「渡します」が、そのままの状態になります。

「ブレース後処理」はこんな感じすべき避けられる。

最後に、私は何度も「なぜ e-TeX がないのですか?」と尋ねられました。ここがそれについて議論するのに適切な場所かどうかはわかりませんが、私には (おそらく主観的な) 理由があり、それを避ける必要があります。このような好みについて議論するのにもっと良い場所を提案できる方がいらっしゃいましたら、ぜひ教えてください。

% helper macros (to build test cases, etc); @ is a letter

\def\yyreplacestring#1\in#2\with#3{%
      \expandafter\def\expandafter\r@placestring\expandafter##\expandafter1\the#1##2\end{%
          \def\r@placestring{##2}% is this the string at the very end?
          \ifx\r@placestring\empty % then it is the one we inserted, report
              \errmessage{string <\the#1> not present in \the#2}% do not change the register if the string is not there
          \else % remove the extra copy of #1\end at the end
              \expandafter#2\expandafter\expandafter\expandafter
                  {\expandafter\r@plac@string\expandafter{\the#3}{##1}##2\end}%
      \fi}% end of \r@placestring definition
      \expandafter\def\expandafter\r@plac@string
          \expandafter##\expandafter1%
          \expandafter##\expandafter2%
          \expandafter##\expandafter3%
          \the#1\end{##2##1##3}%
      \expandafter\expandafter\expandafter\r@placestring\expandafter\the\expandafter#2\the#1\end
}

\newtoks\toksa
\newtoks\toksb
\newtoks\toksc
\newtoks\toksd

\def\yybreak#1#2\yycontinue{\fi#1}

\def\eatone#1{}
\def\eatonespace#1 {}
\def\identity#1{#1}
\def\yyfirstoftwo#1#2{#1}
\def\yysecondoftwo#1#2{#2}
\def\yysecondofthree#1#2#3{#2}
\def\yythirdofthree#1#2#3{#3}

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokensequenc@#1#2#3{%
    \yystringempty{#2}{#1{#3}}{\yypreparsetokensequen@@{#1}{#2}{#3}}%
}

\def\yypreparsetokensequen@@#1#2#3{% remaining sequence is nonempty
    \yystartsinbrace{#2}{\yydealwithbracedgroup{#1}{#2}{#3}}{\yypreparsetokensequ@n@@{#1}{#2}{#3}}%
}

\def\yydealwithbracedgroup#1#2#3{% the first token of the remaining sequence is a brace
    \iffalse{\fi\yydealwithbracedgro@p#2}{#1}{#3}%
}

\def\yydealwithbracedgro@p#1{%
    \yypreparsetokensequenc@{\yyrepackagesequence}{#1}{}%
}

% #1 -- parsed sequence
% this is a sequence to `propagate expansion' into the next parameter.
% the same can be achieved by packaging the whole sequence with a 
% \csname ... \endcsname pair and using a simple \expandafter
% maybe that would be a better idea ...

\def\yyrepackagesequence#1{%
    \yyrepackagesequenc@{}#1\end
}

% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% #2 -- the next category 12 character or \end

\def\yyrepackagesequenc@#1#2{%
    \ifx#2\end
        \yybreak{\yyrepackagesequ@nc@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yyrepackagesequenc@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

% #1 -- `packaged' sequence (\expandafter\expandafter\expandafter ? ...)
% this macro is followed by the remainder of the original sequence with a so far
% unmatched right brace, the `call stack' and the parsed sequence.

\def\yyrepackagesequ@nc@#1{%
    \expandafter\expandafter\expandafter\yyrepackagesequ@nc@swap#1{\expandafter\eatone\string}%
}

% #1 -- parsed sequence without packaging

\def\yyrepackagesequ@nc@swap#1#{%
    \yyrepackagesequ@nc@sw@p{#1}%
}

% #1 -- parsed `inner' sequence
% #2 -- remainder of the original sequence
% #3 -- `call stack'
% #4 -- parsed sequence so far

\def\yyrepackagesequ@nc@sw@p#1#2#3#4{%
    \yypreparsetokensequenc@{#3}{#2}{#4[#1]}%
}

% `braced group' thread ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokensequ@n@@#1#2#3{% the remaining group in #2 is nonempty and does not start with a brace
    \yystartsinspace{#2}{\yyconsumetruespace{#1}{#2}{#3}}{\yypreparsetokenseq@@n@@{#1}{#2}{#3}}%
}

\def\yyconsumetruespace#1#2#3{%
    \expandafter\yyconsumetruespac@swap\expandafter{\eatonespace#2}{#1}{#3.}%
}

\def\yyconsumetruespac@swap#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% `group starting with a true (character code 32, category code 10) space' thread ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokenseq@@n@@#1#2#3{% a nonempty group, that does not start with a brace or a true space
    \yymatchblankspace{#2}{\yyrescanblankspace{#2}{#1}{#3}}{\yypreparsetokens@q@@n@@{#1}{#2}{#3}}%
}

% #1 -- remaining sequence
% #2 -- `call stack'
% #3 -- `parsed' sequence

\def\yyrescanblankspace#1#2#3{%
    \expandafter\expandafter\expandafter
        \yyrescanblankspac@swap
    \expandafter\expandafter\expandafter{\expandafter\yynormalizeblankspac@\meaning#1}{#2}{#3*}%
}

\def\yyrescanblankspac@swap#1#2#3{%
    \yystartsinspace{#1}{%
        \expandafter\yyrescanblankspac@sw@p\expandafter{\eatonespace#1}{#2}{#3}%
    }{%
        \expandafter\yyrescanblankspac@sw@p\expandafter{\eatone#1}{#2}{#3}%
    }%
}

\def\yyrescanblankspac@sw@p#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% `group starting with a blank space' ends here

% #1 -- `call stack'
% #2 -- remaining sequence
% #3 -- `parsed' sequence

\def\yypreparsetokens@q@@n@@#1#2#3{% nonempty group starting with a non blank, non brace token
    \expandafter\yypreparsetokens@q@@n@@swap\expandafter{\eatone#2}{#1}{#30}%
}

\def\yypreparsetokens@q@@n@@swap#1#2#3{%
    \yypreparsetokensequenc@{#2}{#1}{#3}%
}

% #1 -- string of category code 12 or 10 characters
% #2 -- string of category code 12 or 10 characters

\def\yycomparesimplestrings#1#2{%
    \yystringempty{#1}{%
        \yystringempty{#2}{\yyfirstoftwo}{\yysecondoftwo}%
    }{\yycomparesimplestrings@{#1}{#2}}%
}

\def\yycomparesimplestrings@#1#2{% the first string is nonempty
    \yystringempty{#2}{\yysecondoftwo}{\yycomparesimplestrings@@{#1}{#2}}%
}

\def\yycomparesimplestrings@@#1#2{% both strings are nonempty
    \yystartsinspace{#1}{%
        \yystartsinspace{#2}{\yyabsorbfirstspace{#1}{#2}}{\yysecondoftwo}%
    }{%
        \yystartsinspace{#2}{\yysecondoftwo}{\yyabsorbfirstnonspace{#1}{#2}}%
    }    
}

\def\yyabsorbfirstspace#1#2{%
    \expandafter\yyabsorbfirstspac@swap\expandafter{\eatonespace#1}{#2}%
}

\def\yyabsorbfirstspac@swap#1#2{%
     \expandafter\yyabsorbfirst@swap\expandafter{\eatonespace#2}{#1}%
}

\def\yyabsorbfirstnonspace#1#2{%
    \expandafter\yyabsorbfirstnonspac@swap\expandafter{\eatone#1}{#2}%
}

\def\yyabsorbfirstnonspac@swap#1#2{%
     \expandafter\yyabsorbfirst@swap\expandafter{\eatone#2}{#1}%
}

\def\yyabsorbfirst@swap#1#2{%
     \yycomparesimplestrings{#2}{#1}%
}

% `compare strings of category code 12' thread ends here

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yyanalysetokens@#1#2{%
    \yystringempty{#1}{{#2}}%
        {\yyanalysetok@ns@#1\end{#2}}%
}

\def\yyanalysetok@ns@#1#2\end{%
    \ifx#1.%
        \expandafter\yyfirstoftwo
    \else
        \expandafter\yysecondoftwo
    \fi
    {\yygrabablank{#2}}%
    {%
        \ifx#1[% not a space, an opening brace
            \expandafter\yyfirstoftwo
        \else
            \expandafter\yysecondoftwo
        \fi
        {%
            \yydisableobrace{#2}%
        }{% 
            \ifx#1]% not a space, a closing brace
                \expandafter\yyfirstoftwo
            \else
                \expandafter\yysecondoftwo
            \fi
            {%
                \yydisablecbrace{#2}%
            }{% neither space nor brace
                \yygrabtokenraw{#2}%
            }%
        }%
    }%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence
% #3 -- next token

\def\yygrabtokenraw#1#2#3{%
    \expandafter\yyanalysetokens@swap\expandafter{\meaning#3}{#1}{#2}%
}

\def\yyanalysetokens@swap#1#2#3{%
    \yyanalysetokens@{#2}{#3t#1e}%
}

\def\yygrabablank#1#2 {%
    \yyanalysetokens@{#1}{#2s0e}%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yydisablecbrace#1#2{%
    \yydisablecbrac@{}#1\relax#2\end
}


\def\yydisablecbrac@#1#2{%
    \ifx#2\end
        \yybreak{\yydisablecbrac@@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yydisablecbrac@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

\def\yydisablecbrac@@#1{%
    \expandafter\expandafter\expandafter
        \yydisablecbrace@@@#1\end
    \expandafter\expandafter\expandafter
        {\iffalse}\fi\string
}

\def\yydisablecbrace@@@#1\relax#2\end#3{%
    \yystartsinspace{#3}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2c1e}}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2c2e}}%
}

\def\yyanalysetok@nsswap#1#2#3{%
    \iffalse{\fi\yyanalysetokens@{#2}{#3}#1}%
}

% #1 -- remaining parsed sequence
% #2 -- analysed sequence

\def\yydisableobrace#1#2{%
    \yydisableobrac@{}#1\relax#2\end
}


\def\yydisableobrac@#1#2{%
    \ifx#2\end
        \yybreak{\yydisableobrac@@{#1\expandafter\expandafter\expandafter}}%
    \else
        \yybreak{\yydisableobrac@{#1\expandafter\expandafter\expandafter#2}}%
    \yycontinue
}

\def\yydisableobrac@@#1{%
    \expandafter\expandafter\expandafter
        \yydisableobrace@@@#1\end
    \expandafter\expandafter\expandafter
        {\iffalse}\fi\string
}

\def\yydisableobrace@@@#1\relax#2\end#3{%
    \yystartsinspace{#3}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatonespace#3}{#1}{#2o1e}}%
        {\expandafter\yyanalysetok@nsswap\expandafter{\eatone#3}{#1}{#2o2e}}%
}

\uccode`\ =`\-

% \dotspace expands into a character code `\-, category code 10 token (funny space)

\uppercase{\def\dotspace{ }}

\toksa\expandafter\expandafter\expandafter{\expandafter\meaning\dotspace}

\toksb{-}

\toksc{#2}

\toksd\toksa

\yyreplacestring\toksb\in\toksa\with\toksc

\toksc{}
\yyreplacestring\toksb\in\toksd\with\toksc

\expandafter\def\expandafter\yymatchblankspac@\expandafter#\expandafter1\the\toksd{%
    \yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}%
        {\expandafter\yythirdofthree\expandafter{\string}}%
}

\edef\yymatchblankspace#1{% is it \catcode 10 token?
    \noexpand\iffalse{\noexpand\fi
    \noexpand\expandafter
    \noexpand\yymatchblankspac@
    \noexpand\meaning#1\the\toksd}%
}

% the idea behind the sequence below is that a leading character of category code 10
% is replaced either by a character of category code 10 and charachter code 32 or a character
% of category code 12 and character code other than 32
% note that while it is tempting to replace the definition below by something that ends in
% ... blank space #2{ ... with the hope of absorbing the result of \meaning in one step,
% this will not give the desired result in case of an active character,
% say, `~' that had been \let to the normal blank space

\expandafter\def\expandafter\yynormalizeblankspac@\expandafter#\expandafter1\the\toksd{}

\def\yystartsinspace#1{% is it \charcode 32, \catcode 10 token?
    \iffalse{\fi\yystartsinspac@#1 }%
}

\def\yystartsinspac@#1 {%
    \yystringempty{#1}{\expandafter\yysecondofthree\expandafter{\string}}{\expandafter\yythirdofthree\expandafter{\string}}%
}

\def\yystartsinbrace#1{%
  \iffalse{{\fi
  \if!\yytoks@mpty#1}}!%
    \expandafter\yysecondoftwo
  \else
    \expandafter\yyfirstoftwo
  \fi
}

\def\yystringempty#1{%
  \iffalse{{{\fi
  \ifcase\yytoks@mpty#1}}\@ne}\z@
    \expandafter\yyfirstoftwo
  \else
    \expandafter\yysecondoftwo
  \fi
}

\def\yytoks@mpty{%
    \expandafter\eatone\expandafter{\expandafter{%
        \ifcase\expandafter1\expandafter}\expandafter}\expandafter\fi\string
}

%% test code begins here

%\tracingmacros=3
%\tracingonline=3

\catcode`\ =13\relax%
\def\actspace{ }%
\catcode`\ =10\relax%

\catcode`\.=13\relax%
\def\actdotspace{.}%
\catcode`\.=12\relax%

\edef\makefunkydotspace{\let\expandafter\noexpand\actdotspace= \dotspace}
\edef\makefunkyspace{\let\expandafter\noexpand\actspace= \space}

\makefunkyspace
\makefunkydotspace

\catcode`\<=1
\catcode`\>=2
\uccode`\<=32
\uccode`\>=32

% inside the following sequence, < and > will become braces with character code 32 (space),
% \actspace will expand into an active character with character code 32, that has been \let to a
% character code 32, category code 10 token (space)

\uppercase{\edef\temptest{{ } \space\space\dotspace\expandafter\noexpand\actspace\expandafter\noexpand\actdotspace{<> {{}{{ u o l k kk
    \end\noexpand\fi\noexpand\else\noexpand\iffalse{}} }}}}}

%\uppercase{\edef\temptest{\dotspace E <>}}

\show\temptest

\def\displaypreparse#1{%
    \expandafter\errmessage\expandafter{\romannumeral-1\yypreparsetokensequenc@{\yyanalysetokens@}{#1}{}{}#1}%
}

\expandafter\displaypreparse\expandafter{\temptest}

\end

Answer 1