How to utilize all cores on high-end CPUs using ffmpeg

I have a latest-generation HPE ProLiant DL325 G10 server with a 24-core/48-thread AMD EPYC 7401P CPU, 128 GB of DDR4 RAM, and an Intel P4800x PCIe NVMe drive. I tried running ffmpeg to convert a video (MKV) to MP4 for online streaming, and it does not use all 24 cores. During encoding it uses about 10% of the CPU according to top. A sample of the top output is below.

I have read similar questions on Stack Exchange sites, but they all went unanswered and are 6 to 8 years old. I tried adding the -threads parameter both before and after -i, with the values 0, 24, 48 and several others, but ffmpeg seems to ignore it. I am also not scaling the video.

I am also encoding to H.264. Below are some of the commands I used. I can't figure out what I am doing wrong or what exactly the bottleneck is.

Any suggestions on how I can do this?

Command used:

ffmpeg -threads 24 -i input.mkv -c:v libx264 -preset medium -c:a copy -vf subtitles=input.mkv output.mp4

I also tried -sws_flags fast_bilinear and -x264-params sliced-threads=1, but neither changes much. I noticed that -tune zerolatency slightly increased CPU usage on some cores, but overall CPU usage stayed below 15%.

top output:

top - 16:14:13 up 57 min,  2 users,  load average: 2.65, 0.59, 0.50
Tasks: 509 total,   1 running, 280 sleeping,   1 stopped,   0 zombie
%Cpu0  :  5.7 us,  0.7 sy, 12.2 ni, 80.4 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu1  :  2.6 us,  0.3 sy,  4.6 ni, 92.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  6.2 us,  0.3 sy,  6.9 ni, 86.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  4.0 us,  0.0 sy,  1.7 ni, 94.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  2.0 us,  0.0 sy, 13.9 ni, 84.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  2.3 us,  0.0 sy,  3.3 ni, 94.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  6.6 us,  0.3 sy,  8.2 ni, 84.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  8.1 us,  0.0 sy, 12.8 ni, 79.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :  1.7 us,  0.3 sy, 31.5 ni, 66.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  : 21.6 us,  0.0 sy,  2.6 ni, 75.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 : 15.3 us,  0.7 sy,  9.6 ni, 74.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 : 12.3 us,  0.0 sy, 10.3 ni, 77.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  1.3 us,  0.3 sy, 26.4 ni, 71.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :  3.3 us,  0.0 sy, 12.0 ni, 84.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 :  4.0 us,  0.0 sy,  7.7 ni, 88.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  2.0 us,  0.3 sy, 20.3 ni, 77.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :  4.0 us,  0.3 sy, 29.7 ni, 66.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :  2.3 us,  0.3 sy, 21.7 ni, 75.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 : 11.6 us,  0.0 sy, 22.8 ni, 65.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  6.0 us,  0.0 sy, 20.4 ni, 73.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  6.0 us,  0.0 sy, 26.3 ni, 67.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  0.3 us,  0.3 sy, 44.5 ni, 54.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  0.0 us,  0.3 sy, 26.1 ni, 73.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 : 20.1 us,  0.0 sy, 15.7 ni, 64.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu24 :  7.6 us,  0.7 sy,  5.0 ni, 86.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu25 :  3.0 us,  0.3 sy, 14.9 ni, 81.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu26 :  0.3 us,  0.0 sy, 16.6 ni, 83.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu27 :  3.0 us,  0.7 sy,  0.0 ni, 96.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu28 :  0.7 us,  0.0 sy,  0.3 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu29 :  8.3 us,  0.0 sy,  0.7 ni, 91.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu30 :  6.7 us,  0.3 sy,  9.7 ni, 83.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu31 :  4.0 us,  0.3 sy, 22.3 ni, 73.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu32 :  1.0 us,  0.0 sy, 24.6 ni, 74.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu33 : 10.6 us,  0.3 sy,  9.9 ni, 79.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu34 :  5.4 us,  0.0 sy,  2.7 ni, 91.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu35 :  6.6 us,  0.0 sy,  4.6 ni, 88.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu36 :  4.3 us,  0.0 sy, 17.1 ni, 78.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu37 :  5.6 us,  0.3 sy,  3.3 ni, 90.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu38 :  2.3 us,  0.0 sy,  8.6 ni, 89.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu39 :  2.3 us,  0.0 sy, 22.0 ni, 75.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu40 :  1.7 us,  0.0 sy, 17.4 ni, 80.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu41 :  6.0 us,  0.3 sy,  3.3 ni, 90.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu42 :  4.7 us,  0.3 sy, 15.7 ni, 79.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu43 :  7.9 us,  0.7 sy, 24.2 ni, 67.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu44 :  7.0 us,  0.3 sy, 28.3 ni, 64.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu45 : 29.1 us,  1.0 sy, 13.9 ni, 56.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu46 :  0.0 us,  0.0 sy, 45.3 ni, 54.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu47 :  0.0 us,  0.0 sy, 48.5 ni, 51.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
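
To put a number on the overall utilization, the per-CPU lines above can be summarized with a short script. This is only a minimal sketch: it assumes the top layout shown here (one `%CpuN` line per logical CPU with an `id` idle field) and treats "busy" as simply 100 minus idle. The names `busy_per_cpu` and `average_busy` are hypothetical helpers, not part of top or ffmpeg.

```python
import re

# Matches per-CPU lines in the layout shown above, for example:
# %Cpu0  :  5.7 us,  0.7 sy, 12.2 ni, 80.4 id,  0.0 wa, ...
# Group 1 is the CPU index, group 2 is the idle percentage.
CPU_LINE = re.compile(r"^%Cpu(\d+)\s*:.*?([\d.]+)\s+id", re.MULTILINE)


def busy_per_cpu(top_text):
    """Return {cpu_index: busy_percent}, where busy = 100 - idle."""
    return {
        int(match.group(1)): 100.0 - float(match.group(2))
        for match in CPU_LINE.finditer(top_text)
    }


def average_busy(top_text):
    """Return the mean busy percentage across all CPUs found in the text."""
    busy = busy_per_cpu(top_text)
    if not busy:
        raise ValueError("no %CpuN lines found")
    return sum(busy.values()) / len(busy)


# Two lines copied from the output above; paste the full listing to
# summarize all 48 logical CPUs.
sample = """\
%Cpu0  :  5.7 us,  0.7 sy, 12.2 ni, 80.4 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu1  :  2.6 us,  0.3 sy,  4.6 ni, 92.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
"""

print(f"average busy: {average_busy(sample):.1f}%")
```

Counting everything that is not idle (including the `ni` column) gives a slightly different figure than looking at `us` alone, so the result depends on which columns are considered "usage".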

ffmpeg info:

ffmpeg version 3.4.4-0ubuntu0.18.04.1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
  configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100

Ubuntu version:

Linux ubuntu 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

And lastly, here is a snippet of the output during the encoding process:

frame=34048 fps=316 q=-1.0 Lsize=  119840kB time=00:23:40.03 bitrate= 691.3kbits/s dup=2 drop=0 speed=13.2x    
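
The numbers in that status line can be cross-checked against each other. The sketch below assumes that `fps` is the average encoding rate and that `speed` is media time processed per unit of wall-clock time; under those assumptions, dividing `fps` by the source frame rate (frames divided by media time) should land close to the reported speed. `parse_progress` is a hypothetical helper written for this one line format.

```python
import re


def parse_progress(line):
    """Extract (frame_count, encode_fps, media_seconds) from a status line."""
    frame = int(re.search(r"frame=\s*(\d+)", line).group(1))
    fps = float(re.search(r"fps=\s*([\d.]+)", line).group(1))
    hours, minutes, seconds = re.search(
        r"time=(\d+):(\d+):([\d.]+)", line
    ).groups()
    media_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return frame, fps, media_seconds


line = (
    "frame=34048 fps=316 q=-1.0 Lsize=  119840kB time=00:23:40.03 "
    "bitrate= 691.3kbits/s dup=2 drop=0 speed=13.2x"
)

frame, fps, media_seconds = parse_progress(line)
source_fps = frame / media_seconds   # frame rate of the video itself
speed = fps / source_fps             # how many times faster than real time
wall_seconds = frame / fps           # rough wall-clock time for the encode

print(f"source frame rate: {source_fps:.2f} fps")
print(f"derived speed:     {speed:.1f}x")
print(f"approx. wall time: {wall_seconds:.0f} s")
```

If the derived speed agrees with the `speed=` field, the status line is internally consistent, which helps separate "the encode is slow" from "the CPU is not fully loaded" when comparing runs with different options.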
