Concatenating many videos: wrong FPS and audio pitch

I have a process like this:

  1. I have N input videos; I cut and transcode each of them with the following command:

    ffmpeg.exe -c:v h264_cuvid -resize 1920x1080 -noaccurate_seek -ss 00:00:04.000 -to 00:00:44.240 -i "input_N.MP4" -v error -filter:v fps=fps=25 -c:v h264_nvenc -preset medium -profile:v high -b:v 8000k -bufsize 8000k -maxrate 10000k -qmin 0 -g 250 -bf 2 -i_qfactor 0.75 -b_qfactor 1.1 "output_N.mp4"

  2. I create a videoArrayFile.txt in the following format:

    file 'myFilePath\output_N.mp4'
    file 'myFilePath\output_N.mp4'
    file 'myFilePath\output_N.mp4'
    ...

  3. I concatenate the files listed in videoArrayFile.txt with the following command:

    ffmpeg.exe -f concat -v error -safe 0 -i "myPath\videoArrayFile.txt" -filter:a "volume=1" -c:v copy -c:a aac "myPath\outputConcat.mp4"
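The list file from step 2 could be generated by a small script rather than by hand. This is a minimal sketch, assuming the clip paths are already known; `concat_list_text` is a hypothetical helper name, not part of ffmpeg. It writes one `file` directive per line and escapes single quotes in the way ffmpeg's quoting rules describe.

```python
def concat_list_text(paths):
    """Return concat-demuxer list text: one file '...' directive per line."""
    lines = []
    for p in paths:
        # Inside single quotes a backslash is literal, so Windows paths pass
        # through unchanged. A literal ' is written as '\'' (close the quote,
        # add an escaped quote, reopen the quote).
        escaped = str(p).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


# Hypothetical paths, following the placeholder naming used in step 2.
clips = [r"myFilePath\output_1.mp4", r"myFilePath\output_2.mp4"]
print(concat_list_text(clips), end="")
```

The returned text can then be written to videoArrayFile.txt with an ordinary `open(..., "w")` call before running step 3.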

This is the ffmpeg log when I run the command above:

[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (50) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input
[aac @ 000001fa4273e000] Number of scalefactor bands in group (51) exceeds limit (49).
Error while decoding stream #0:1: Invalid data found when processing input

The procedure works, but with some input files I run into FPS and audio pitch problems: the output file has a variable frame rate and the audio speed is wrong.

Running ffprobe on the "output_N" files gives the following:

file 'D:\SharedExp_SSD\Outs\0.51.8\2020-10-30_11-45_LancioTandem-30FPS_1604054749\video_HL\tmp\introEncoded.mp4'

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'introEncoded.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.42.100
  Duration: 00:00:04.60, start: 0.000000, bitrate: 2707 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 2575 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : Core Media Video
    Stream #0:1(ita): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 131 kb/s (default)
    Metadata:
      handler_name    : Core Media Audio

file 'D:\SharedExp_SSD\Outs\0.51.8\2020-10-30_11-45_LancioTandem-30FPS_1604054749\video_HL\tmp\Exp_1488_clip_0.mp4'

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Exp_1488_clip_0.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.42.100
  Duration: 00:00:40.27, start: 0.000000, bitrate: 8142 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 8221 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 12:52:03:24
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 69 kb/s (default)
    Metadata:
      handler_name    :  GoPro AAC
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 12:52:03:24
Unsupported codec with id 0 for input stream 2

file 'D:\SharedExp_SSD\Outs\0.51.8\2020-10-30_11-45_LancioTandem-30FPS_1604054749\video_HL\tmp\Exp_1488_clip_1.mp4'

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Exp_1488_clip_1.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.42.100
  Duration: 00:00:05.03, start: 0.000000, bitrate: 8460 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 10239 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:22:01:01
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 69 kb/s (default)
    Metadata:
      handler_name    :  GoPro AAC
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74), 0 kb/s
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:22:01:01
Unsupported codec with id 0 for input stream 2

file 'D:\SharedExp_SSD\Outs\0.51.8\2020-10-30_11-45_LancioTandem-30FPS_1604054749\video_HL\tmp\Exp_1488_clip_2.mp4'

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Exp_1488_clip_2.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.42.100
  Duration: 00:00:08.03, start: 0.000000, bitrate: 8926 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 9303 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:22:01:01
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 69 kb/s (default)
    Metadata:
      handler_name    :  GoPro AAC
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74), 0 kb/s
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:22:01:01
Unsupported codec with id 0 for input stream 2

Meanwhile, ffprobe on outputConcat.mp4 gives:

outputConcat.mp4:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'outputConcat.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.42.100
  Duration: 00:00:57.54, start: 0.000000, bitrate: 7900 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 7820 kb/s, 24.28 fps, 25 tbr, 15360 tbn, 50 tbc (default)
    Metadata:
      handler_name    : Core Media Video
    Stream #0:1(ita): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 111 kb/s (default)
    Metadata:
      handler_name    : Core Media Audio

The time base (tbn) of outputConcat.mp4 differs from that of the inputs (15360 versus 12800): is this behaviour related to the different sample rates of the audio streams (the only difference I can see among the input files)?
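The concat demuxer expects every segment to carry the same stream parameters, so it can help to tabulate what ffprobe reports before concatenating. The following is only a sketch: `mismatched_params` is a hypothetical helper, and the values are typed in by hand from the ffprobe output above rather than read from the files. It lists every parameter that is not identical across the inputs.

```python
def mismatched_params(streams):
    """Return {param: {value: [files]}} for each parameter that differs between files."""
    by_param = {}
    for name, params in streams.items():
        for key, value in params.items():
            # Group file names by the value they report for this parameter.
            by_param.setdefault(key, {}).setdefault(value, []).append(name)
    # A parameter is consistent when every file falls into a single group.
    return {k: v for k, v in by_param.items() if len(v) > 1}


# Audio stream values copied from the ffprobe output above.
audio = {
    "introEncoded.mp4": {"sample_rate": 48000, "channels": 2},
    "Exp_1488_clip_0.mp4": {"sample_rate": 32000, "channels": 1},
    "Exp_1488_clip_1.mp4": {"sample_rate": 32000, "channels": 1},
    "Exp_1488_clip_2.mp4": {"sample_rate": 32000, "channels": 1},
}

report = mismatched_params(audio)
for param, groups in report.items():
    print(param, groups)
```

With these values both the sample rate and the channel count are flagged, because introEncoded.mp4 is 48000 Hz stereo while the three GoPro clips are 32000 Hz mono. The video streams all report 25 fps and 12800 tbn, so they would not be flagged by the same check.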

As you can see in step 1, the frame rate of the inputs is set to 25 fps, and all the input videos have exactly the same codec and resolution.

How can I concatenate the videos so that the output has the same FPS as the inputs and there are no audio pitch problems? Could the time base be the problem?

Thanks, Marco

Answer 1

I found a partial solution: the audio sample rate of the inputs was not consistent, and I was not making it consistent during the first conversion. Adding -ar 32000 to the whole initial conversion chain solved the audio pitch problem. I am still trying to understand why the FPS changes.
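As a rough illustration of that change, the step-1 command could be assembled in a script with the sample rate made explicit. This is a sketch only: `transcode_command` is a hypothetical helper, and placing `-ar` just before the output file is an assumption, since the answer does not show the final command line.

```python
def transcode_command(src, dst, start="00:00:04.000", end="00:00:44.240",
                      sample_rate=32000):
    """Build the step-1 ffmpeg argument list with an explicit audio sample rate."""
    return [
        "ffmpeg.exe",
        # Input options: NVIDIA hardware decode with resize, then the cut points.
        "-c:v", "h264_cuvid", "-resize", "1920x1080",
        "-noaccurate_seek", "-ss", start, "-to", end,
        "-i", src,
        "-v", "error",
        # Output options: constant 25 fps and the NVENC settings from step 1.
        "-filter:v", "fps=fps=25",
        "-c:v", "h264_nvenc", "-preset", "medium", "-profile:v", "high",
        "-b:v", "8000k", "-bufsize", "8000k", "-maxrate", "10000k",
        "-qmin", "0", "-g", "250", "-bf", "2",
        "-i_qfactor", "0.75", "-b_qfactor", "1.1",
        # The fix described in the answer: same audio sample rate for every clip.
        "-ar", str(sample_rate),
        dst,
    ]


cmd = transcode_command("input_N.MP4", "output_N.mp4")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` for each input, including the intro clip. The intro is also stereo while the GoPro clips are mono; whether the channel count needs the same treatment (for example with `-ac`) is not covered by the answer.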
