boppo1 2 hours ago

I really want to get into splatting and I have the tools: good camera, v comfy in blender, comfy with graphics programming ideas, 4080. But I haven't found a good 'all in one intro' to it yet. Possibly because I'm foss-biased and have dismissed proprietary options. But does anyone know of a good 'vertical tutorial' on this stuff?

dimitri-vs an hour ago | parent | next [-]

Maybe not exactly the kind of tutorial you're looking for, but very enjoyable nonetheless: https://youtu.be/eekCQQYwlgA

Yen 32 minutes ago | parent | prev [-]

I recently got into splatting. I looked for some good all-in-one tutorials, but didn't find any, and mostly muddled through by trial and error and LLM assistance. I present this workflow as a straight-line pipeline, though in practice it took a lot of iteration, backtracking, and rework to get the final result. Here's what worked for me:

I captured a video on a smartphone camera, using the OpenCamera app. Specifically, this video was captured with exposure locked, framerate locked, focus locked, fairly high framerate and resolution. I walked slowly and carefully around an outdoor scene, trying to get fairly good coverage from multiple angles. I took roughly 20 minutes of video, weighing 19GB.

This video was sampled into individual image frames at about 5fps using ffmpeg. There's room for experimentation and improvement here: an adaptive, coverage-aware sampling strategy would be better. But fixed 5fps was Good Enough (tm). This resulted in roughly 8,000 images at 4k. This was a pretty hefty dataset for my limited 1080, but I made it work.
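For anyone following along, the fixed-rate sampling step could be sketched roughly like this. It only wraps ffmpeg's standard `fps` video filter; the file names, the output naming pattern, and the choice of PNG output are illustrative assumptions, not the exact settings used above.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video: str, out_dir: str, fps: float = 5.0, ext: str = "png") -> list[str]:
    """Build an ffmpeg command that samples `video` at a fixed frame rate.

    The "frame_%06d" naming pattern and the default extension are
    assumptions made for this sketch.
    """
    pattern = str(Path(out_dir) / f"frame_%06d.{ext}")
    return [
        "ffmpeg",
        "-i", video,             # input video
        "-vf", f"fps={fps:g}",   # fixed-rate sampling via the fps filter
        pattern,                 # numbered output frames
    ]


def extract_frames(video: str, out_dir: str, fps: float = 5.0) -> None:
    """Create the output directory and run ffmpeg (must be on PATH)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(video, out_dir, fps), check=True)


# Example (hypothetical file names):
# extract_frames("walkthrough.mp4", "frames", fps=5)
```

An adaptive strategy would replace the fixed `fps` filter with some per-frame selection logic, which this sketch does not attempt.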

I then generated masks for these images, to ignore transient objects during the splat training (i.e. to cut out people who walked through the scene). For this I used Cutie (https://github.com/hkchengrex/Cutie). For outdoor scenes, it can also make sense to mask out low-parallax areas like faraway mountains or especially the sky, as these are difficult to train correctly. If masks are generated for some images, you'll need at least placeholder masks for all of them. In the end I've got about 8,000 PNGs that are monochrome black/white masks.

Then the images are handed to COLMAP (https://github.com/colmap/colmap), using the 'global mapper' option. This registers the camera positions in 3D space, and generates a crude point cloud that's good for sanity-checking. This step required a fair bit of iteration to get right. The full reconstructed output from COLMAP is not necessary; only the pose-estimate .bin files are needed. The output directory for this step came to about 500MB for me.
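One cheap sanity check between iterations is to count how many frames actually got registered. The sketch below assumes the usual COLMAP binary model layout, in which `images.bin` and `points3D.bin` each begin with a little-endian unsigned 64-bit element count (this is how COLMAP's own `read_write_model.py` helper reads them). The directory path and function names are hypothetical.

```python
import struct
from pathlib import Path


def read_count(bin_path) -> int:
    """Read the leading little-endian uint64 element count of a COLMAP .bin file."""
    with open(bin_path, "rb") as f:
        header = f.read(8)
    if len(header) != 8:
        raise ValueError(f"{bin_path} is too short to contain a count header")
    return struct.unpack("<Q", header)[0]


def registration_report(model_dir, n_input_images: int) -> dict:
    """Compare registered images against the number of frames handed to COLMAP."""
    model_dir = Path(model_dir)
    registered = read_count(model_dir / "images.bin")
    points = read_count(model_dir / "points3D.bin")
    fraction = registered / n_input_images if n_input_images else 0.0
    return {
        "registered": registered,
        "input": n_input_images,
        "fraction": fraction,
        "points": points,
    }


# Example (hypothetical model directory):
# print(registration_report("colmap_out/sparse/0", 8000))
```

A low registered fraction usually means it's worth another pass on matching or capture coverage before spending GPU time on training.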

With COLMAP registration done, the next step is the actual training. I found two useful pieces of software for this, with different tradeoffs.

Brush (https://github.com/ArthurBrussee/brush) was very straightforward to install and use, requiring very little in external dependencies and setup. It was also pretty speedy on training, and gave good results. Minor modifications to the training process were possible by editing source, though I didn't get too wild here. Brush takes the *.bin files from COLMAP, plus the original images directory, and the masks directory if it exists. Run on its own, this could produce gaussian splat .ply files, 500-800MB in size, containing 1-10M splats. More than that and my poor little 8GB of VRAM OOM'd.

nerfstudio (https://github.com/nerfstudio-project/nerfstudio) was also useful, as many research papers get implemented in its framework. In particular, for this outdoor scene, I used wild-gaussians (https://github.com/jkulhanek/wild-gaussians/) to generate just a sky sphere (to help seed low-parallax areas in my particular dataset), stopped training, and used this as an init.ply to pass to Brush.

I then set up a very simple viewer website, using SuperSplat (https://github.com/playcanvas/supersplat). I used supersplat's editor to align the splat's coordinate system with the rotation and scaling that I wanted, and then exported an optimized .sog file, roughly 1/10th the size. .sog is nominally open-standards, though I'm not aware of any other projects using the format. This gave fairly good framerates and adequate controls across a variety of platforms.

As a little bit extra, supersplat's splat-transform CLI tool was used to generate a crude collision mesh for the scene, enabling a walking mode that respected object boundaries.

If there's interest I can post my results. I got a bit sidetracked with other projects and other splats, and I'm still fiddling with some more cleanup on this particular one, but I can get it up with a few more hours of work. Hopefully that's a good start: all of these tools are fully FOSS, and they resulted in a good-looking splat.

ireadmevs 19 minutes ago | parent [-]

Thank you for sharing!