Background:
- I'm trying to take an input video (which includes audio), split it into frames using cv2 (a Python library), apply some changes to the frames, and then use ffmpeg to stitch them back together and re-add the original audio.
My problem
- When I follow these steps, the audio doesn't line up with the video. The video seems to lag behind the audio. I suspect this happens because the original video reports a frame rate of 29.8218 fps, and ffmpeg may be treating it as 29 fps.
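To put a number on that suspicion, here is a rough sanity check in Python. It assumes a constant frame rate and uses the frame count of 1785 that ffmpeg reports in the logs further down; the function name is just illustrative.

```python
# Frame count reported by ffmpeg in the logs below ("frame= 1785").
FRAME_COUNT = 1785


def duration_seconds(frame_count: int, fps: float) -> float:
    """Playback length of frame_count frames shown at a constant fps."""
    return frame_count / fps


truncated = duration_seconds(FRAME_COUNT, 29)    # rate ffmpeg reports
exact = duration_seconds(FRAME_COUNT, 29.8218)   # rate of the source video

print(f"at 29 fps:      {truncated:.2f} s")
print(f"at 29.8218 fps: {exact:.2f} s")
print(f"extra length:   {truncated - exact:.2f} s")
```

At 29 fps the frames play for about 61.55 s, which matches the Duration line ffmpeg prints for the image sequence, while the audio file is reported as 00:01:00.03. A video stretched like this would fall progressively further behind the audio, which is consistent with what I'm seeing.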
My question
- How can I make ffmpeg assemble the frames at a frame rate of 29.8218 fps?
More information
This is the command I use to stitch the images together:
ffmpeg -framerate X -i <path-to-input-images> -y <output>
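For reference, a minimal sketch of how the value for X could be built in Python without losing the fractional part. It assumes the rate arrives as a float (for example from cv2's CAP_PROP_FPS) and that the 29 fps in the logs comes from X being truncated to an integer somewhere, which is a guess on my part. The helper names are hypothetical; the ffmpeg arguments are the same ones as in the command above, with the rate written as a fraction.

```python
from fractions import Fraction


def framerate_arg(fps: float) -> str:
    """Format a float frame rate as an exact 'num/den' string for -framerate."""
    # limit_denominator snaps the float to the nearest simple fraction,
    # so 29.8218 becomes 149109/5000 instead of a long binary expansion.
    rate = Fraction(fps).limit_denominator(100000)
    return f"{rate.numerator}/{rate.denominator}"


def build_stitch_command(fps: float, image_pattern: str, output: str) -> list:
    """Argument list for the image-to-video step, ready for subprocess.run."""
    return ["ffmpeg", "-framerate", framerate_arg(fps),
            "-i", image_pattern, "-y", output]


# Placeholder paths, not the real ones from the logs.
cmd = build_stitch_command(29.8218, "frames/%09d.png", "movie.mp4")
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True). Passing the decimal string "29.8218" directly should also work; the fraction form just avoids any rounding on the Python side.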
This is the command I use to combine the silent video with the audio to produce the final video:
ffmpeg -i <video-path> -i <audio-path> -codec copy <output-path>
These are the logs produced by the two commands above:
ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 7.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
Input #0, image2, from 'C:\Users\Nathan\Documents\rhymecraft\server\services\generate_video_for_lyrics\frames\192/%09d.png':
Duration: 00:01:01.55, start: 0.000000, bitrate: N/A
Stream #0:0: Video: png, rgba(pc), 406x720, 29 fps, 29 tbr, 29 tbn, 29 tbc
Stream mapping:
Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 00000227a18c10a0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 00000227a18c10a0] profile High 4:4:4 Predictive, level 3.0, 4:4:4 8-bit
[libx264 @ 00000227a18c10a0] 264 - core 152 r2851 ba24899 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x1:0x111 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=4 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'C:\Users\Nathan\Documents\rhymecraft\server\services\generate_video_for_lyrics\soundless_videos\192\movie.mp4':
Metadata:
encoder : Lavf57.83.100
Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv444p, 406x720, q=-1--1, 29 fps, 14848 tbn, 29 tbc
Metadata:
encoder : Lavc57.107.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
frame= 1785 fps= 53 q=-1.0 Lsize= 4759kB time=00:01:01.44 bitrate= 634.4kbits/s speed=1.83x
video:4737kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.453360%
[libx264 @ 00000227a18c10a0] frame I:9 Avg QP:19.92 size: 12889
[libx264 @ 00000227a18c10a0] frame P:493 Avg QP:22.73 size: 5537
[libx264 @ 00000227a18c10a0] frame B:1283 Avg QP:26.34 size: 1562
[libx264 @ 00000227a18c10a0] consecutive B-frames: 3.2% 1.6% 3.5% 91.7%
[libx264 @ 00000227a18c10a0] mb I I16..4: 45.9% 0.0% 54.1%
[libx264 @ 00000227a18c10a0] mb P I16..4: 14.6% 0.0% 7.8% P16..4: 37.5% 14.2% 4.2% 0.0% 0.0% skip:21.8%
[libx264 @ 00000227a18c10a0] mb B I16..4: 1.4% 0.0% 0.8% B16..8: 43.3% 5.3% 0.6% direct: 1.0% skip:47.6% L0:48.9% L1:47.5% BI: 3.6%
[libx264 @ 00000227a18c10a0] coded y,u,v intra: 26.9% 10.0% 12.4% inter: 6.1% 0.9% 1.8%
[libx264 @ 00000227a18c10a0] i16 v,h,dc,p: 35% 23% 16% 26%
[libx264 @ 00000227a18c10a0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 17% 22% 5% 7% 7% 6% 5% 3%
[libx264 @ 00000227a18c10a0] Weighted P-Frames: Y:1.0% UV:0.4%
[libx264 @ 00000227a18c10a0] ref P L0: 58.3% 11.2% 20.1% 10.4% 0.1%
[libx264 @ 00000227a18c10a0] ref B L0: 90.1% 7.8% 2.1%
[libx264 @ 00000227a18c10a0] ref B L1: 95.8% 4.2%
[libx264 @ 00000227a18c10a0] kb/s:630.39
ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 7.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\Users\Nathan\Documents\rhymecraft\server\services\generate_video_for_lyrics\soundless_videos\192/movie.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.83.100
Duration: 00:01:01.55, start: 0.000000, bitrate: 633 kb/s
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv444p, 406x720, 630 kb/s, 29 fps, 29 tbr, 14848 tbn, 58 tbc (default)
Metadata:
handler_name : VideoHandler
Input #1, mp3, from 'C:\Users\Nathan\Documents\rhymecraft\server\services\generate_video_for_lyrics\audio_files\64e5ed6ec1556b3cfb2fe1f39aa670cd.mp3':
Metadata:
major_brand : mp42
minor_version : 1
compatible_brands: mp41mp42isom
encoder : Lavf58.27.103
Duration: 00:01:00.03, start: 0.025057, bitrate: 192 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, s16p, 192 kb/s
Metadata:
encoder : Lavc58.52
Output #0, avi, to 'C:\Users\Nathan\Documents\rhymecraft\server\services\generate_video_for_lyrics\finished_videos\192\output.avi':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
ISFT : Lavf57.83.100
Stream #0:0(und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv444p, 406x720, q=2-31, 630 kb/s, 29 fps, 29 tbr, 58 tbn, 58 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 44100 Hz, stereo, s16p, 192 kb/s
Metadata:
encoder : Lavc58.52
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #1:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame= 1785 fps=0.0 q=-1.0 Lsize= 6294kB time=00:01:01.46 bitrate= 838.9kbits/s speed=3.42e+003x
video:4737kB audio:1407kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.446166%