The source code:
The idea behind depth encoding:
The technical side of decoding:
Detailed description of the latency measurement:
The final solution above builds on my other libraries (included as git submodules), grouped here:
A word of caution:
The encoder uses a lossy video codec for depth map encoding, so expect some error in the decoded depth.
Note for users:
The code currently works only on Unix-like operating systems (e.g. Linux) and requires Intel hardware on the encoding side (HEVC Main10 encoding through VAAPI).
See also:
Meta information (on making the video)
Creative Commons music used (thank you!):
After the rain by Rexlambo
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream:
Music promoted by Audio Library
Polaroid by extenz
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream:
Music promoted by Audio Library
Level Plane - Riot
Creative Commons — Attribution 4.0 International — CC BY 4.0
Birds by Scandinavianz
Music promoted by
Creative Commons Attribution 3.0 Unported License
Takayama by Niwel
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream:
Music promoted by Audio Library
Public Domain sound effects used (thank you!):
The Slow Mo Guys slow motion sound effect
Hardware:
Video recording/photographs: Nikon D5300 (DSLR)
Samsung S9+ (slow motion in "The Measurement" scene)
Voice recording ("The Overview" scene): Behringer C2 condenser microphones and PreSonus AudioBox iTwo hardware interface
Software:
Screen recording: FFmpeg, with OpenGL flipping disabled in X Server Settings:
ffmpeg -video_size 1920x1080 -framerate 30 -f x11grab -i :1 -draw_mouse 0 -c:v libx264 -crf 0 -preset ultrafast ~/Videos/$1
Voice recording: PreSonus Studio One Artist DAW
Video editing: VSDC Free Video Editor
Meta meta information (on making the materials for the video):
All depth maps were captured with a RealSense D435 at 848x480 @ 30 FPS, without any post-processing.
Some were streamed remotely between:
- LattePanda Alpha
- Laptop
Some were streamed locally (encoded and decoded on Laptop).
Various RealSense depth units were used (including values only available with the 5.12.1.0 development firmware).
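To illustrate why the depth unit matters, here is a minimal sketch in Python. It assumes (this is not stated in these notes) that each 16-bit depth value is carried in a 10-bit video sample by keeping only its 10 most significant bits, as a P010-style layout would. The function name and the example depth units are hypothetical.

```python
def depth_tradeoff(depth_unit_m, sensor_bits=16, codec_bits=10):
    """Return (max_range_m, step_m) for a given depth unit in meters.

    Assumption: only the top `codec_bits` of each `sensor_bits` depth
    value survive encoding, so the lowest bits are dropped.
    """
    dropped_bits = sensor_bits - codec_bits
    # Largest distance a sensor_bits-wide integer can represent.
    max_range_m = depth_unit_m * (2 ** sensor_bits - 1)
    # Smallest depth difference that still changes the encoded sample.
    step_m = depth_unit_m * (2 ** dropped_bits)
    return max_range_m, step_m


# Hypothetical depth units, chosen only to show the trend.
for unit in (0.001, 0.0001, 0.00005):
    max_range_m, step_m = depth_tradeoff(unit)
    print(f"unit {unit} m: range {max_range_m:.2f} m, step {step_m * 1000:.1f} mm")
```

Under that assumption, a smaller depth unit gives finer depth steps at the cost of a shorter maximum range.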
Various bitrates were used for depth encoding, mainly 2, 4, and 8 Mbit/s.
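For a sense of scale, the sketch below compares those bitrates with the uncompressed depth stream. It is only back-of-the-envelope arithmetic, assuming raw 16-bit depth samples at the 848x480 @ 30 FPS capture mode; the function names are hypothetical.

```python
WIDTH, HEIGHT, FPS = 848, 480, 30   # capture mode used for the depth maps
BITS_PER_PIXEL = 16                 # assumption: raw 16-bit depth samples


def raw_bitrate_mbit():
    """Uncompressed depth stream bitrate in Mbit/s."""
    return WIDTH * HEIGHT * BITS_PER_PIXEL * FPS / 1e6


def compression_ratio(target_mbit):
    """How many times smaller the encoded stream is than the raw one."""
    return raw_bitrate_mbit() / target_mbit


print(f"raw: {raw_bitrate_mbit():.1f} Mbit/s")
for target in (2, 4, 8):
    print(f"{target} Mbit/s -> about {compression_ratio(target):.0f}:1")
```

With these numbers the raw stream is about 195 Mbit/s, so even the 8 Mbit/s setting corresponds to roughly 24:1 compression.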
B-frames were always set to 0 for low latency. If you prefer quality or smaller size over latency, use a positive number of B-frames.
The voice quality in the DSLR-recorded videos was obtained by first recording with the DSLR to get good voice timing, then overdubbing the voice with the condenser microphones. Finally, the original sound layer was deleted, leaving only the clapping sound ("The Overview" scene).