jiggawatts a day ago

I've been studying machine learning during the xmas break, and as an exercise I started tinkering around with the raw Bayer data from my Nikon camera, throwing it at various architectures to see what I can squeeze out of the sensor.

Something that surprised me is that very little of the computational photography magic that has been developed for mobile phones has been applied to larger DSLRs. Perhaps it's because it's not as desperately needed, or because prior to the current AI madness nobody had sufficient GPU power lying around for such a purpose.

For example, it's a relatively straightforward exercise to feed in "dark" and "flat" frames as extra per-pixel embeddings, which lets the model learn about the specifics of each individual sensor and its associated amplifier. In principle, this could allow not only better denoising, but also stretch the dynamic range a tiny bit by leveraging the less sensitive photosites in highlights and the more sensitive ones in the dark areas.
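A rough sketch of the idea, not the actual setup described above: average several calibration exposures into "master" frames, then hand the model the raw value, the dark level, and the flat-field gain for every photosite. The function names are made up for illustration, frames are plain nested lists to keep it dependency-free, and the flat is assumed to be already dark-subtracted. The classical astrophotography correction is included only as a baseline to compare a learned model against.

```python
def mean_frame(frames):
    """Average equally sized 2D frames (lists of rows) pixel by pixel."""
    n = len(frames)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*frames)]


def calibration_channels(raw, master_dark, master_flat):
    """Attach (raw, dark, flat) to every photosite as per-pixel model input."""
    return [
        list(zip(raw_row, dark_row, flat_row))
        for raw_row, dark_row, flat_row in zip(raw, master_dark, master_flat)
    ]


def classical_correction(raw, master_dark, master_flat):
    """Baseline: subtract the dark level, then divide by the normalised flat.

    Assumes master_flat is a dark-subtracted relative gain map with no zeros.
    """
    mean_gain = sum(sum(row) for row in master_flat) / (
        len(master_flat) * len(master_flat[0])
    )
    return [
        [(r - d) * mean_gain / g for r, d, g in zip(raw_row, dark_row, flat_row)]
        for raw_row, dark_row, flat_row in zip(raw, master_dark, master_flat)
    ]
```

In a real pipeline these would be array operations on full-resolution Bayer data; the point is only that the calibration frames line up one-to-one with the photosites, so they can be stacked as extra input channels.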

Similarly, few if any photo editing products do simultaneous debayering and denoising; most do the latter as a separate step in normal RGB space.
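For anyone wondering what "simultaneous" looks like at the input end: a common way to feed a mosaic to a joint model is to split it into four half-resolution colour planes, so the network sees the raw samples before any interpolation has mixed the noise between channels. A minimal sketch, assuming an RGGB layout (other sensors order the 2x2 tile differently) and a hypothetical function name:

```python
def pack_rggb(mosaic):
    """Split an H x W RGGB Bayer mosaic into four (H/2) x (W/2) planes.

    Returns (R, G1, G2, B), where G1 shares rows with R and G2 with B.
    """
    height, width = len(mosaic), len(mosaic[0])
    if height % 2 or width % 2:
        raise ValueError("mosaic height and width must both be even")
    even_rows, odd_rows = mosaic[0::2], mosaic[1::2]
    r = [row[0::2] for row in even_rows]
    g1 = [row[1::2] for row in even_rows]
    g2 = [row[0::2] for row in odd_rows]
    b = [row[1::2] for row in odd_rows]
    return r, g1, g2, b
```

The model then outputs full-resolution RGB directly, which is where the demosaicing and the denoising get learned together rather than in sequence.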

Not to mention multi-frame stacking that compensates for camera motion, etc...
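The simplest possible version of motion-compensated stacking, as a sketch only: estimate a whole-pixel translation for each frame against the first one by brute-force search, then average the aligned samples. Real burst pipelines align per tile and at sub-pixel precision; the function names, the `max_shift` parameter, and the pure-translation assumption here are all illustrative.

```python
def shift_error(ref, frame, dy, dx):
    """Mean squared difference between ref[y][x] and frame[y+dy][x+dx]."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                total += (ref[y][x] - frame[sy][sx]) ** 2
                count += 1
    return total / count if count else float("inf")


def estimate_shift(ref, frame, max_shift=2):
    """Find the integer (dy, dx) that best maps ref onto frame."""
    span = range(-max_shift, max_shift + 1)
    # Try small shifts first so ties are resolved towards "no motion".
    candidates = sorted(
        ((dy, dx) for dy in span for dx in span),
        key=lambda s: abs(s[0]) + abs(s[1]),
    )
    return min(candidates, key=lambda s: shift_error(ref, frame, *s))


def align_and_stack(frames, max_shift=2):
    """Average frames after aligning each one to frames[0]."""
    ref = frames[0]
    h, w = len(ref), len(ref[0])
    sums = [[float(v) for v in row] for row in ref]
    counts = [[1] * w for _ in range(h)]
    for frame in frames[1:]:
        dy, dx = estimate_shift(ref, frame, max_shift)
        for y in range(h):
            for x in range(w):
                sy, sx = y + dy, x + dx
                if 0 <= sy < h and 0 <= sx < w:
                    sums[y][x] += frame[sy][sx]
                    counts[y][x] += 1
    return [[sums[y][x] / counts[y][x] for x in range(w)] for y in range(h)]
```

Averaging N aligned frames reduces random noise by roughly the square root of N, which is the basic win the phone pipelines are built on.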

The whole area is "untapped" for full-frame cameras; someone just needs to throw a few server-grade GPUs at the problem for a while!

AlotOfReading 20 hours ago | parent | next [-]

This stuff exists and is fairly well studied. It's surprisingly hard to find unless you come across it in the literature, though; the universe of image processing is huge. Joint demosaicing, for example, is a decades-old technique [0] that is fairly common in astrophotography. Commercial photographers simply never cared or asked for it, so the tools intended for them didn't bother either. You'd find more of it in things like scientific ISP and robotics.

[0] https://doi.org/10.1145/2980179.2982399

jiggawatts 20 hours ago | parent [-]

I trawled through much of the research, but as you’ve mentioned, it seems to be known only in astrophotography and on mobile devices or other similarly constrained hardware.

pbalau 12 hours ago | parent | prev [-]

> Something that surprised me is that very little of the computational photography magic that has been developed for mobile phones has been applied to larger DSLRs. Perhaps it's because it's not as desperately needed, or because prior to the current AI madness nobody had sufficient GPU power lying around for such a purpose.

Sony Alpha 6000 had face detection in 2014.

jiggawatts 12 hours ago | parent [-]

Sure, and my camera can do bird eye detection and whatnot too, but that's a very lightweight model running in-body. Probably just a fine-tuned variant of something like YOLO.

I've seen only a couple of papers from Google talking about stacking multiple frames from a DSLR, but that was only research for improving mobile phone cameras.

Ironically, some mobile phones now have more megapixels than my flagship full-frame camera, yet they manage to stack and digitally process multiple frames using battery power!

This whole thing reminds me of the Silicon Graphics era, when the salesperson would tell you with a straight face that it was worth spending $60K on a workstation and GPU combo that couldn't even texture map, when I had just bought a Radeon for $250 that ran circles around it.

One industry's "impossible" is a long-since overcome minor hurdle for another.

trashb 9 hours ago | parent [-]

A DSLR and mobile phone camera optimize for different things and can't really be compared.

Mobile phone cameras are severely handicapped by their optics and sensor size. Therefore, to create an acceptable picture (to share on social media), they need to do a lot of processing.

DSLRs and professional cameras feature much better hardware. Here the optics and sensor size/type are what matter, because they optimize the actual light being captured. Additionally, in a professional setting the image is usually captured in a raw format and adjusted/balanced afterwards to allow for certain artistic styles.

Ultimately, the quality of a picture is not bound to its resolution but to the amount and quality of light captured.

jiggawatts 2 hours ago | parent [-]

> A DSLR and mobile phone camera optimize for different things and can't really be compared.

You sound exactly like the sales guy trying to explain why that Indigo workstation is “different” even though it was performing the exact same vector and matrix algebra as my gaming GPU. The. Exact. Same. Thing.

Everything else you’ve said is irrelevant to computational photography. If anything, it helps matters because there’s better raw data to work with.

The real reason is that one group had to solve these problems, while the other could keep making excuses for why they were “impossible” when they clearly weren’t.

And anyway, what I’m after isn’t even in-body processing! I’m happy to take the RAW images and grind them through an AI that barely fits into a 5090 and warms my room appreciably for each photo processed.