pleurotus 8 hours ago
Super cool to read, but can someone ELI5 what Gaussian splatting is (and/or radiance fields?), specifically regarding how the article talks about it finally being "mature enough"? What's changed that this is now possible?
meindnoch 6 hours ago
1. Create a point cloud from a scene (either via lidar, or via photogrammetry from multiple images).

2. Replace each point of the point cloud with a fuzzy ellipsoid that has a bunch of parameters for its position + size + orientation + view-dependent color (via spherical harmonics up to some low order).

3. If you render these ellipsoids using a differentiable renderer, you can subtract the resulting image from the ground truth (i.e. your original photos) and calculate the partial derivatives of the error with respect to each of the millions of ellipsoid parameters that you fed into the renderer.

4. Now you can run gradient descent using the differentiable renderer, which makes your fuzzy ellipsoids converge to something that closely reproduces the ground truth images (from multiple angles).

5. Since the ellipsoids started at the 3D point cloud's positions, the 3D structure of the scene will likely be preserved during gradient descent, so the resulting scene will support novel camera angles with plausible-looking results.
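Steps 3–4 above can be sketched in miniature. This is a toy 1-D analogue, not the real renderer: one "fuzzy blob" (a Gaussian with position `mu`, size `sigma`, brightness `a`) is fit to a ground-truth signal by gradient descent on the pixel-wise squared error. All names here are made up for illustration, and finite differences stand in for a real autodiff/differentiable renderer.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)                         # "pixel" coordinates
target = 0.8 * np.exp(-0.5 * ((x - 0.6) / 0.05) ** 2)  # ground-truth "photo"

def render(mu, sigma, a):
    # One fuzzy blob: the 1-D analogue of a fuzzy ellipsoid.
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def loss(params):
    mu, sigma, a = params
    return np.mean((render(mu, sigma, a) - target) ** 2)

params = np.array([0.4, 0.1, 0.5])   # deliberately wrong initial guess
lr, eps = 0.05, 1e-6
for _ in range(5000):
    # Central finite differences stand in for the differentiable renderer.
    grad = np.zeros(3)
    for i in range(3):
        d = np.zeros(3); d[i] = eps
        grad[i] = (loss(params + d) - loss(params - d)) / (2 * eps)
    params -= lr * grad              # plain gradient descent

mu, sigma, a = params                # converges near (0.6, 0.05, 0.8)
```

The real thing does exactly this, but for millions of 3-D ellipsoids, many camera views, and analytic gradients on a GPU.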
tel 8 hours ago
Gaussian splatting is a way to record 3-dimensional video. You capture a scene from many angles simultaneously and then combine all of those into a single representation. Ideally, that representation is good enough that you can then, in post-production, simulate camera angles you didn't originally record.

For example, the camera orbits around the performers in this music video are difficult to imagine in real space. Even if you could pull it off using robotic motion-control arms, it would require that the entire choreography be fixed in place before filming. This video clearly takes advantage of being able to direct whatever camera motion the artist wanted in the 3D virtual space of the final composed scene.

To do this, the representation needs to estimate the radiance field, i.e. the amount and color of light visible at every point in your 3D volume, viewed from every angle. It's not possible to do this at high resolution by breaking that space up into voxels, since voxels scale badly: O(n^3). You could attempt to guess at some mesh geometry and paint textures onto it compatible with the camera views, but that's difficult to automate.

Gaussian splatting estimates these radiance fields by assuming that the radiance is built from millions of fuzzy, colored balls positioned, stretched, and rotated in space. These are the Gaussian splats. Once you have that representation, constructing a novel camera angle is as simple as positioning and angling your virtual camera and then recording the colors and positions of all the splats that are visible.

It turns out that this approach is pretty amenable to techniques similar to modern deep learning: you basically train the positions/shapes/rotations of the splats via gradient descent. It's mostly been explored in research labs, but lately production-oriented tools have been built for popular 3D motion-graphics packages like Houdini, making it more available.
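The "recording the colors of all the splats that are visible" step is, at its core, depth-sorted alpha blending. Here's a minimal single-pixel sketch of that compositing (the standard front-to-back "over" operator); the `Splat` class and its fields are illustrative, not from any real library.

```python
from dataclasses import dataclass

@dataclass
class Splat:
    depth: float   # distance from the virtual camera
    alpha: float   # opacity of this fuzzy blob at the pixel (0..1)
    color: float   # grayscale color, for simplicity

def composite(splats):
    """Blend splats covering one pixel, front-to-back by depth."""
    pixel, transmittance = 0.0, 1.0
    for s in sorted(splats, key=lambda s: s.depth):
        pixel += transmittance * s.alpha * s.color
        transmittance *= (1.0 - s.alpha)   # light blocked by nearer splats
    return pixel

# A near, semi-transparent dark splat in front of a far, opaque bright one:
c = composite([Splat(2.0, 1.0, 1.0), Splat(1.0, 0.5, 0.2)])
# near contributes 0.5 * 0.2 = 0.1; far contributes 0.5 * 1.0 * 1.0 = 0.5
```

Real renderers do this for every pixel at once, with each splat's footprint and alpha coming from projecting its 3-D Gaussian onto the screen.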
dmarcos 7 hours ago
It’s a point cloud where each point is a semitransparent blob that can have a view-dependent color: the color changes depending on the direction you look at it, allowing it to capture reflections, iridescence… You generate the point clouds from multiple images of a scene or an object and some machine learning magic.
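The usual way this view dependence is stored is a handful of spherical-harmonic coefficients per point, as the comment above mentions. A tiny sketch, assuming bands 0 and 1 only and a single color channel (the basis constants are the standard real SH values; the coefficient values are made up):

```python
import numpy as np

def sh_basis(d):
    # Real spherical harmonics, bands 0 and 1, for a view direction d.
    x, y, z = d / np.linalg.norm(d)
    return np.array([0.2820948,       # band 0: view-independent base color
                     0.4886025 * y,   # band 1: varies linearly with view
                     0.4886025 * z,
                     0.4886025 * x])

# Hypothetical coefficients for one point (one channel):
coeffs = np.array([1.0, 0.0, 0.0, 0.8])

front = sh_basis(np.array([ 1.0, 0.0, 0.0])) @ coeffs
back  = sh_basis(np.array([-1.0, 0.0, 0.0])) @ coeffs
# same point, different color when viewed from opposite directions
```

Higher SH bands add more angular detail, which is how sharper reflections get approximated.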
KerrickStaley 2 hours ago
This 2-minute video is a great intro to the topic: https://www.youtube.com/watch?v=HVv_IQKlafQ

I think this tech has become "production-ready" recently due to a combination of research progress (the seminal paper was published in 2023: https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/), improvements to differentiable programming libraries (e.g. PyTorch), and better GPU hardware.
ravedave5 2 hours ago
This is a REALLY good video explaining it: https://www.youtube.com/watch?v=eekCQQYwlgA
krackers 4 hours ago
https://aras-p.info/blog/2023/09/05/Gaussian-Splatting-is-pr... and, for a visual demo of the result, https://antimatter15.com/splat/
djeastm 8 hours ago
For the ELI5: Gaussian splatting represents the scene as millions of tiny, blurry colored blobs in 3D space and renders by quickly "splatting" them onto the screen, making it much faster than computing an image by querying a neural net model, as earlier radiance-field methods did. I'm not up on how things have changed recently.
rkuykendall-com 8 hours ago
I found this VFX breakdown of the recent Superman movie to have a great explanation of what it is and what it makes possible: https://youtu.be/eyAVWH61R8E?t=232

tl;dr ELI5: Instead of capturing spots of color as they would appear to a camera, they capture spots of color and where they exist in the world. By combining multiple cameras doing this, you can build a 3D world from footage that you can then move a virtual camera around.