Problemas al unir vídeos en FFmpeg con efecto de fundido cruzado (desbordamiento de la cola del búfer, caída de fotogramas)

Problemas al unir vídeos en FFmpeg con efecto de fundido cruzado (desbordamiento de la cola del búfer, caída de fotogramas)

Puedo unir exitosamente un par de videos en uno (usandoestemétodo) con un agradable efecto de fundido cruzado en el medio, pero el vídeo de salida no contiene audio. Estoy usando el -filter_complexcomando.

Logré modificar el comando para que el audio también estuviera allí, pero sigo recibiendo el mensaje.Desbordamiento de cola de búfer, caídaerror al decodificar. Por extraño que parezca, las transmisiones de audio se fusionan correctamente, con el efecto de fundido cruzado en el medio, mientras que la transmisión de video contiene solo un pequeño porcentaje de la entrada original, debido a los fotogramas eliminados. Aquí está el comando que estoy usando:

ffmpeg -i 0.mp4 -i 1.mp4 -i 2.mp4 -i 3.mp4 -f lavfi -i color=white:s=1920x1080
-filter_complex
"
[0:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=28:d=1:alpha=1,setpts=PTS-STARTPTS[v0];
[1:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=98:d=1:alpha=1,setpts=PTS-STARTPTS+28/TB[v1];
[2:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=101:d=1:alpha=1,setpts=PTS-STARTPTS+126/TB[v2];
[3:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=37:d=1:alpha=1,setpts=PTS-STARTPTS+227/TB[v3];
[4:v]trim=duration=265[over0];
[0:a]afade=in:st=0:d=1,afade=out:st=28:d=1[a0];
[1:a]afade=in:st=0:d=1,afade=out:st=98:d=1,adelay=28000[a1];
[2:a]afade=in:st=0:d=1,afade=out:st=101:d=1,adelay=126000[a2];
[3:a]afade=in:st=0:d=1,afade=out:st=37:d=1,adelay=227000[a3];
[a0][a1][a2][a3]amix=inputs=4,volume=2;
[over0][v0]overlay[over1];
[over1][v1]overlay[over2];
[over2][v2]overlay[over3];
[over3][v3]overlay=format=yuv420[output]
"
-vcodec libx264 -map [output] output.mp4

Este comando produce una salida inutilizable, con más del 90% de los fotogramas perdidos. Los comandos para el audio parecen ser el problema, porque cuando los elimino, el vídeo se fusiona correctamente, sin perder fotogramas:

ffmpeg -i 0.mp4 -i 1.mp4 -i 2.mp4 -i 3.mp4 -f lavfi -i color=white:s=1920x1080
-filter_complex
"
[0:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=28:d=1:alpha=1,setpts=PTS-STARTPTS[v0];
[1:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=98:d=1:alpha=1,setpts=PTS-STARTPTS+28/TB[v1];
[2:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=101:d=1:alpha=1,setpts=PTS-STARTPTS+126/TB[v2];
[3:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=37:d=1:alpha=1,setpts=PTS-STARTPTS+227/TB[v3];
[4:v]trim=duration=265[over0];
[over0][v0]overlay[over1];
[over1][v1]overlay[over2];
[over2][v2]overlay[over3];
[over3][v3]overlay=format=yuv420[output]
"
-vcodec libx264 -map [output] output.mp4

El comando anterior proporciona una salida que no contiene audio, pero con todos los fotogramas de vídeo en su lugar.

Intenté usar otros comandos para unirme a las transmisiones de audio, pero amixparece ser el único que realmente funciona aquí. Llevo horas intentando solucionar esto y no consigo solucionarlo. ¿Me estoy perdiendo algo o estoy haciendo todo mal? Cualquier sugerencia será bienvenida...

Además, aquí está el resultado completo de la consola (eliminé solo algunos de los metadatos). Dejé la codificación después de recibir un montón de errores de "desbordamiento de cola de búfer"; de lo contrario, el resultado sería mucho, mucho mayor que esto:

>ffmpeg -i 0.mp4 -i 1.mp4 -i 2.mp4 -i 3.mp4 -f lavfi -i color=white:s=1920x1080 -filter_complex "[0:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=28:d=1:alpha
=1,setpts=PTS-STARTPTS[v0];[1:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=98:d=1:alpha=1,setpts=PTS-STARTPTS+28/TB[v1];[2:v]format=pix_fmts=yuva420p,fade=t=
in:st=0:d=1:alpha=1,fade=t=out:st=101:d=1:alpha=1,setpts=PTS-STARTPTS+126/TB[v2];[3:v]format=pix_fmts=yuva420p,fade=t=in:st=0:d=1:alpha=1,fade=t=out:st=37:d=1:alpha=1,setpts=PTS-ST
ARTPTS+227/TB[v3];[4:v]trim=duration=265[over0];[0:a]afade=in:st=0:d=1,afade=out:st=28:d=1[a0];[1:a]afade=in:st=0:d=1,afade=out:st=98:d=1,adelay=28000[a1];[2:a]afade=in:st=0:d=1,af
ade=out:st=101:d=1,adelay=126000[a2];[3:a]afade=in:st=0:d=1,afade=out:st=37:d=1,adelay=227000[a3];[a0][a1][a2][a3]amix=inputs=4,volume=2;[over0][v0]overlay[over1];[over1][v1]overla
y[over2];[over2][v2]overlay[over3];[over3][v3]overlay=format=yuv420[output]" -vcodec libx264 -map [output] output.mp4
ffmpeg version N-71959-g9253cc4 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 4.9.2 (GCC)
  configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-li
bass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libm
p3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --en
able-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enabl
e-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-lzma --enable-decklink --enable-zlib
  libavutil      54. 23.101 / 54. 23.101
  libavcodec     56. 37.102 / 56. 37.102
  libavformat    56. 32.100 / 56. 32.100
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 16.101 /  5. 16.101
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '0.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
  Duration: 00:00:29.31, start: 0.000000, bitrate: 19944 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 19831 kb/s, 29.81 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : AVC Coding
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 97 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    creation_time   : 2015-03-24 19:06:46
    date            : 2015-03-24T19:08:28Z
    date-eng        : 2015-03-24T19:08:28Z
    location        : █████████████████
    location-eng    : █████████████████
  Duration: 00:01:39.75, start: 0.000000, bitrate: 20017 kb/s
    Stream #1:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 19953 kb/s, 30 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      creation_time   : 2015-03-24 19:06:46
      handler_name    : VideoHandler
      encoder         : AVC Coding
    Stream #1:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 98 kb/s (default)
    Metadata:
      creation_time   : 2015-03-24 19:06:46
      handler_name    : SoundHandler
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '2.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    creation_time   : 2015-04-21 15:56:27
    date            : 2015-04-21T15:58:12Z
    date-eng        : 2015-04-21T15:58:12Z
    location        : █████████████████
    location-eng    : █████████████████
  Duration: 00:01:42.35, start: 0.000000, bitrate: 20020 kb/s
    Stream #2:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 19948 kb/s, 29.99 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      creation_time   : 2015-04-21 15:56:27
      handler_name    : VideoHandler
      encoder         : AVC Coding
    Stream #2:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 97 kb/s (default)
    Metadata:
      creation_time   : 2015-04-21 15:56:27
      handler_name    : SoundHandler
Input #3, mov,mp4,m4a,3gp,3g2,mj2, from '3.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    creation_time   : 2015-05-02 12:20:47
    date            : 2015-05-02T12:21:28Z
    date-eng        : 2015-05-02T12:21:28Z
    location        : █████████████████
    location-eng    : █████████████████
  Duration: 00:00:38.89, start: 0.000000, bitrate: 19937 kb/s
    Stream #3:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 19934 kb/s, 29.98 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      creation_time   : 2015-05-02 12:20:47
      handler_name    : VideoHandler
      encoder         : AVC Coding
    Stream #3:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 97 kb/s (default)
    Metadata:
      creation_time   : 2015-05-02 12:20:47
      handler_name    : SoundHandler
Input #4, lavfi, from 'color=white:s=1920x1080':
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #4:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
[libx264 @ 0000000004fa0020] using SAR=1/1
[libx264 @ 0000000004fa0020] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.1 Cache64
[libx264 @ 0000000004fa0020] profile High, level 4.0
[libx264 @ 0000000004fa0020] 264 - core 146 r2538 121396c - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 a
nalyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookah
ead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 key
int=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    encoder         : Lavf56.32.100
    Stream #0:0: Audio: aac (libvo_aacenc) ([64][0][0][0] / 0x0040), 44100 Hz, mono, s16, 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.37.102 libvo_aacenc
    Stream #0:1: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc56.37.102 libx264
Stream mapping:
  Stream #0:0 (h264) -> format
  Stream #0:1 (aac) -> afade
  Stream #1:0 (h264) -> format
  Stream #1:1 (aac) -> afade
  Stream #2:0 (h264) -> format
  Stream #2:1 (aac) -> afade
  Stream #3:0 (h264) -> format
  Stream #3:1 (aac) -> afade
  Stream #4:0 (rawvideo) -> trim
  volume -> Stream #0:0 (libvo_aacenc)
  overlay -> Stream #0:1 (libx264)
Press [q] to stop, [?] for help
[Parsed_overlay_31 @ 0000000004fc3bc0] [framesync @ 0000000005173a68] Buffer queue overflow, dropping.
    Last message repeated 3 times
[Parsed_overlay_31 @ 0000000004fc3bc0] [framesync @ 0000000005173a68] Buffer queue overflow, dropping.
    Last message repeated 2 times
[Parsed_overlay_32 @ 0000000004fc47c0] [framesync @ 000000000517d388] Buffer queue overflow, dropping.
    Last message repeated 7 times
[Parsed_overlay_33 @ 0000000004fc3680] [framesync @ 000000000517d988] Buffer queue overflow, dropping.
    Last message repeated 6 times
[Parsed_overlay_31 @ 0000000004fc3bc0] [framesync @ 0000000005173a68] Buffer queue overflow, dropping.
    Last message repeated 9 times
[Parsed_overlay_33 @ 0000000004fc3680] [framesync @ 000000000517d988] Buffer queue overflow, dropping.
    Last message repeated 9 times
[Parsed_overlay_32 @ 0000000004fc47c0] [framesync @ 000000000517d388] Buffer queue overflow, dropping.
    Last message repeated 9 times
[Parsed_overlay_31 @ 0000000004fc3bc0] [framesync @ 0000000005173a68] Buffer queue overflow, dropping.
[Parsed_overlay_32 @ 0000000004fc47c0] [framesync @ 000000000517d388] Buffer queue overflow, dropping.
[Parsed_overlay_33 @ 0000000004fc3680] [framesync @ 000000000517d988] Buffer queue overflow, dropping.
frame=   75 fps=3.6 q=-1.0 Lsize=    2425kB time=00:00:02.97 bitrate=6681.4kbits/s
video:2374kB audio:46kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.183280%
[libx264 @ 0000000004fa0020] frame I:5     Avg QP:18.75  size: 45143
[libx264 @ 0000000004fa0020] frame P:50    Avg QP:24.25  size: 36116
[libx264 @ 0000000004fa0020] frame B:20    Avg QP:23.56  size: 19940
[libx264 @ 0000000004fa0020] consecutive B-frames: 49.3% 45.3%  0.0%  5.3%
[libx264 @ 0000000004fa0020] mb I  I16..4: 27.3% 65.2%  7.6%
[libx264 @ 0000000004fa0020] mb P  I16..4:  2.8% 18.9%  1.1%  P16..4: 50.6% 12.1%  3.7%  0.0%  0.0%    skip:10.8%
[libx264 @ 0000000004fa0020] mb B  I16..4:  0.5%  2.5%  0.1%  B16..8: 42.4%  7.1%  0.6%  direct: 7.0%  skip:39.6%  L0:55.9% L1:42.0% BI: 2.1%
[libx264 @ 0000000004fa0020] 8x8 transform intra:77.5% inter:85.8%
[libx264 @ 0000000004fa0020] coded y,uvDC,uvAC intra: 56.3% 52.7% 2.3% inter: 30.5% 46.8% 0.2%
[libx264 @ 0000000004fa0020] i16 v,h,dc,p: 46% 10%  5% 39%
[libx264 @ 0000000004fa0020] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 17% 16%  5% 10%  8% 12%  6%  7%
[libx264 @ 0000000004fa0020] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 22% 12%  5% 12%  8% 10%  4%  4%
[libx264 @ 0000000004fa0020] i8c dc,h,v,p: 61% 18% 17%  4%
[libx264 @ 0000000004fa0020] Weighted P-Frames: Y:20.0% UV:20.0%
[libx264 @ 0000000004fa0020] ref P L0: 71.4% 15.5%  9.5%  3.5%  0.0%
[libx264 @ 0000000004fa0020] ref B L0: 94.6%  5.4%
[libx264 @ 0000000004fa0020] kb/s:6480.80

Respuesta1

Esto se debe a la superposición de audio y el motivo es que no ha especificado asetptsexplícitamente el audio. Puedes usar algo como,

asetpts=PTS-STARTPTS[a0]

Árbitro:Mira aquí

información relacionada