LuaLaTeX, pdfpages: distinguish between with-font and no-font?

I use only LuaLaTeX, so a compiler-specific answer is OK.

The pdfpages package permits me to include one or more PDF pages into a TeX document. Let us focus on the situation where exactly one PDF page will be included (per command), and there is no problem with the included page being too large for the document dimensions. Keep it simple.

The included PDF page is supposed to be an image. It was not necessarily generated by TeX.

I would like to distinguish a PDF page that has only image content (that is, no fonts), from a page that does not have only image content (may have some text, using embedded font subsets).

If the included PDF is image-only, OK. But if the included PDF has embedded font content of any kind, it is to be rejected with an error message.

The pdfpages package itself says nothing about this. It relies on graphicx, which also says nothing about this.

So my question is: Is there any simple LuaTeX code that would inspect a PDF page, and distinguish between a page that has no fonts and a page that does?

Why I want to know: For licensing reasons (beyond the scope of TeX) I can include font-glyph-images but not font-glyph-vectors. Although I personally can inspect things myself, I have other users who might not be so thorough, and I want to set up a block.

There are three possible answers, as I see it: (1) Unrealistic request, so forget it. (2) LuaTeX can do it, and the answer is obvious to Lua gurus, so here's the answer. (3) LuaTeX can do it, but it is very complicated.

If (3) just let me know; I don't expect others to do the work for me.

Note: We can disregard the possibility of text without the font subset embedded.

EDIT: Thanks to DG and DC for suggesting the pdffonts command-line program. It is part of Xpdf, which works on both Linux and Windows. I already have it on both platforms, but had never used it (or even known about it). It is a reasonably simple matter to include pdffonts in the Bash/Batch script that I am already using as part of a larger workflow.

Although pdffonts could not be called from within TeX without adding it to the list of approved shell-escape commands, that is not necessary for my purpose. Instead of TeX calling programs, I have a script that calls various programs, then finishes by calling lualatex on the pre-processed results.
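For readers who want to script the same check, here is a minimal sketch in Python rather than Bash/Batch, so that one file can serve both platforms. It is not the script I actually use. It assumes that pdffonts is on the PATH and that its output consists of a header line, a line of dashes, and then one line per font; the function names font_rows and pdf_has_fonts are made up for this illustration.

```python
import subprocess


def font_rows(pdffonts_output):
    """Return the per-font lines of pdffonts output, skipping the header.

    Assumption: the header ends with a line that starts with dashes, and
    every non-empty line after it describes one font.
    """
    rows = []
    past_header = False
    for line in pdffonts_output.splitlines():
        if not past_header:
            if line.startswith("---"):
                past_header = True
            continue
        if line.strip():
            rows.append(line)
    return rows


def pdf_has_fonts(pdf_path):
    """Run pdffonts on pdf_path and report whether it lists any font."""
    result = subprocess.run(
        ["pdffonts", pdf_path],
        capture_output=True,
        text=True,
        check=True,  # raise if pdffonts itself fails (missing file, bad PDF)
    )
    return bool(font_rows(result.stdout))
```

A wrapper script could call pdf_has_fonts("some_file.pdf") and stop with an error message before it ever reaches lualatex.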

So, I consider this to be answered.

FURTHER INFO: Here is why I asked. As we know, LaTeX cannot include a tiff image. But in some cases, an end-user requires tiff rather than jpeg. For example: Add tif image to LaTeX

A PDF does not directly store the "image format." That is, there is no tiff or jpeg inside the PDF. Instead there is an XObject of Image type, with a compression method. So what is actually required is either an uncompressed stream or Flate decode, rather than JPEG compression; and, the image may need to be CMYK, which excludes png. Finally, the PDF may need to be PDF/X-1a, for commercial printing.

It is possible to do this, using ImageMagick and LuaLaTeX together. First, ImageMagick is used to convert an RGB image into a CMYK tiff image, according to a color profile (which may have an ink limit). Then, the profile is stripped, and ImageMagick converts the tiff to PDF. Then, that PDF is included in a suitable document class, using pdfpages. I have the suitable class (custom), and the output PDF meets PDF/X-1a:2001, as verified by Adobe Acrobat Pro.
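The two ImageMagick steps could be sketched roughly as below. This is an illustration under stated assumptions, not my actual script: the executable is assumed to be the ImageMagick 7 `magick` command (older installations use `convert`), the profile file names are placeholders, and the helper names build_commands and run_pipeline are hypothetical. The first `-profile` names the source RGB profile and the second the CMYK target; `-strip` then removes the profile, and `-compress Zip` asks for Flate rather than JPEG compression in the PDF.

```python
import subprocess


def build_commands(src, tiff, pdf, rgb_profile, cmyk_profile, magick="magick"):
    """Build the two ImageMagick command lines without running them.

    All file names are supplied by the caller; nothing here is specific
    to a particular printer or profile.
    """
    # Step 1: RGB image -> CMYK tiff, converting through the two ICC profiles.
    to_cmyk_tiff = [magick, src, "-profile", rgb_profile,
                    "-profile", cmyk_profile, tiff]
    # Step 2: strip the profile and write a PDF with Zip (Flate) compression.
    to_flate_pdf = [magick, tiff, "-strip", "-compress", "Zip", pdf]
    return [to_cmyk_tiff, to_flate_pdf]


def run_pipeline(src, tiff, pdf, rgb_profile, cmyk_profile, magick="magick"):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(src, tiff, pdf, rgb_profile, cmyk_profile, magick):
        subprocess.run(cmd, check=True)
```

Whether the resulting PDF then passes PDF/X-1a still depends on the document class that includes it, so the output should be verified separately.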

Why not simply use jpeg? That does indeed work, with less effort. But as I said, some end-users insist on Flate decode rather than jpeg, for their own reasons.

Now, why I asked about fonts: If the user attempts to include a PDF with fonts, rather than just an image, there is no objection from TeX, and the PDF looks good. BUT it will fail the PDF/X-1a test, even though it claims to be PDF/X-1a. There is no free software (to my knowledge) that can reveal the problem.

I have all of this working. Looks good. But I wanted to add an automated test so that other users (who often do not read instructions) will be informed, if the included PDF is incorrect.

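Answer 1

A solution without command-line tools, using the LuaTeX epdf library (available in older LuaTeX releases; newer releases replaced it with the pdfe library, so the calls below would need porting there):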

\documentclass{scrartcl}
\usepackage{luacode,pdfpages}
\begin{luacode*}
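function check_for_fonts(name)
  local latex_cat = luatexbase.catcodetables['latex-package']
  local doc = epdf.open(name)
  if doc == nil then
    -- The file could not be opened: raise a TeX error and drop both branches.
    tex.sprint(latex_cat,
        "\\errmessage{Could not open " .. name .. "}{}{}\\@gobbletwo")
    return
  end
  for pageno = 1, doc:getNumPages() do
    -- Look up the /Font entry in this page's resource dictionary.
    -- Only page-level resources are checked, not nested form XObjects.
    local resources = doc:getCatalog():getPage(pageno):getResourceDict()
    local fonts = resources and resources:lookup("Font")
    if fonts and not fonts:isNull() and fonts:dictGetLength() ~= 0 then
      -- At least one font resource: select the first (true) branch.
      tex.sprint(latex_cat, '\\@firstoftwo')
      return
    end
  end
  -- No page declares a font resource: select the second (false) branch.
  tex.sprint(latex_cat, '\\@secondoftwo')
end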
\end{luacode*}
\newcommand\PDFHasFontTF[1]{\directlua{check_for_fonts("\luaescapestring{#1}")}}
\begin{document}
\PDFHasFontTF{some_file.pdf}{%
  \errmessage{some_file.pdf contains fonts!}%
}{%
  \includepdf[pages=-]{some_file.pdf}%
}
\end{document}