I’d like to keep this discussion relatively open-ended, but I want it centered on this subject:
(tl;dr at bottom)
(edit: extremely open-ended, because I don’t think it’s actually possible to conclude this discussion, or to ever garner enough input on the topic)
What I am looking for (or at least hoping for in the future) is discussion of a hypothetical system that can not only reconstruct a scene comparably to volumetric video capture, but also break the scene down into objects using object recognition, for the purpose of statistically predicting what will happen in the scene (perhaps using deep learning) before it happens, with as few limitations as possible.
Initial Comments on Discussion:
The benefits of such a system could be extremely expansive. My initial thoughts were of the video gaming and VR industries, as well as transit tracking and cartography applications such as Google Maps, or (even more insidiously) a global tracking system for everything that occurs on planet Earth (a very NSA/CIA surveillance vibe intended here).
Some Technical Comments on Discussion:
I’d like to point out that perhaps the biggest problem with building such an interesting system is the wide range of expertise and developer input it would require.
I’ve looked at several approaches and differing perspectives on such an expansive system over the last few months in my free time. While doing so, I’ve garnered a relatively introductory-level knowledge of the following topics:
- Electromagnetic physics as it relates to light, lenses, and lasers
- Photogrammetry and 3D scanning (I have a question about this further down)
- Deep learning
- 3D modelling and rendering (a question about this further down too)
- Camera distortion, and the computer-science side of making sense of distorted images using distortion-reversal algorithms and deep learning
- Big data
(I’ve reviewed other auxiliary topics such as image post-processing, topography, stereo vision, and more, but I am interested in expanding my horizons.)
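To make the distortion-reversal point concrete, here is a minimal sketch (my own illustration, not from any particular library) assuming the simple Brown–Conrady radial model with two coefficients. Reversing the distortion has no closed form, so a common trick is fixed-point iteration:

```python
# Sketch of reversing radial lens distortion, assuming the
# Brown-Conrady radial model with two coefficients (k1, k2).
# Coordinates are normalized (image center at origin).

def distort(x, y, k1, k2):
    """Apply radial distortion to an undistorted point."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

def undistort(xd, yd, k1, k2, iterations=20):
    """Invert the distortion by fixed-point iteration: repeatedly
    divide the distorted point by the radial factor evaluated at
    the current estimate of the undistorted point."""
    x, y = xd, yd  # initial guess: the distorted point itself
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y
```

For typical (small) distortion coefficients the iteration converges quickly, which is why round-tripping a point through `distort` then `undistort` gets you back where you started.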
Can you "3D scan" lighting? Say, "reverse-render" a scene to determine where the light sources are? I see a lot of 3D scans UV-mapped with textures that "bake in" the lighting, so instead of a light source being determined, all shadows, highlights, glows, etc. are stored as part of the texture, which doesn’t seem all that intuitive.
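One simple version of this "reverse-rendering" idea, sketched under strong assumptions (a Lambertian surface with normals already known from the scan, and a single distant light): the intensity of each lit point is the dot product of its normal with the light vector, so the light direction falls out of a least-squares fit. The function name below is hypothetical:

```python
import numpy as np

# Sketch: recovering a directional light from shading, assuming a
# Lambertian surface whose normals are known (e.g. from a 3D scan).
# For lit points, intensity = n . l (albedo folded into l), so the
# samples form a linear system N @ l = I solvable by least squares.

def estimate_light_direction(normals, intensities):
    """Fit the light vector l from N @ l = I and return it as a
    unit direction."""
    l, *_ = np.linalg.lstsq(normals, intensities, rcond=None)
    return l / np.linalg.norm(l)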
From what I understand, our (human) memory of vision is only a fraction of what we actually see and have seen in our lifetime, and that fraction is microscopic by comparison. A contributor to this is that while "recording" what we see to temporal memory, we only "record" what we are "focusing" on. I was thinking: can photogrammetry work the same way? Instead of scanning several images for all points of interest (POIs), identify only a small, limited number of POIs at the center of an image, allot progressively fewer POIs the further they are from the center, and then reconstruct the rest of the surrounding scene from POIs stored in memory (POIs not in the image currently being processed).
tl;dr: what technologies, techniques, and systems are available to intelligently reproduce a scene that is itself intelligent, using 3D scanning, photogrammetry, and a little touch of big data?
from Artificial Intelligence http://ift.tt/2v3PNYq