July
@july
For fun, I've been thinking a lot about 3D Gaussian Splatting & NeRFs - mostly out of my own curiosity. Specifically about how (in the future) one could take a series of photos or videos, run SfM (structure from motion, e.g. with COLMAP) to get camera poses and a point cloud, and then use 3D Gaussian Splatting to make it super easy to view (in some sort of web viewer?) - rough sketch of that pipeline below. Perhaps in the long run we can have Roc Camera not only take verifiably real photos but also capture verifiably real worlds. gsplat seems like a pretty cool tool - here's a real-time large-scale rendering of over 30 million Gaussians in total using gsplat. Also, this is not a feature on our roadmap, just thinking out loud about how it would be cool https://docs.gsplat.studio/main/examples/large_scale.html
5 replies
10 recasts
112 reactions
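An illustrative sketch of the photos → COLMAP SfM → 3DGS pipeline described above, assuming the COLMAP CLI is on PATH and leaning on gsplat's example trainer for the splatting step (the my_scene folder layout, trainer script name, and flags are assumptions - check the gsplat docs for the current interface):

```python
# Sketch: photos -> COLMAP SfM -> 3D Gaussian Splatting (paths/flags are assumptions)
import subprocess
from pathlib import Path

scene = Path("my_scene")            # folder containing an images/ subdirectory
db = scene / "database.db"
sparse = scene / "sparse"
sparse.mkdir(parents=True, exist_ok=True)

# 1. Structure from motion with the COLMAP CLI: features, matching, mapping.
subprocess.run(["colmap", "feature_extractor",
                "--database_path", str(db),
                "--image_path", str(scene / "images")], check=True)
subprocess.run(["colmap", "exhaustive_matcher",
                "--database_path", str(db)], check=True)
subprocess.run(["colmap", "mapper",
                "--database_path", str(db),
                "--image_path", str(scene / "images"),
                "--output_path", str(sparse)], check=True)

# 2. Train a 3DGS model on the COLMAP output. gsplat ships an example trainer;
#    the script name and flags below are assumptions - see docs.gsplat.studio.
subprocess.run(["python", "examples/simple_trainer.py", "default",
                "--data_dir", str(scene),
                "--result_dir", str(scene / "results")], check=True)
```

The result directory can then be served to any web-based splat viewer for the "easy to view" part.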
Angel - Not A Bot
@sayangel
I’m really excited for this type of capture to become easier, and consumption too. Meta’s Hyperscape app and AVP spatial memories are a great peek into what’s coming
1 reply
1 recast
3 reactions
July
@july
it's crazy how fast this is getting - NeRF was so slow, and now this is much faster. I've been thinking about how much of it, with embedded devices like AVP etc., is going to be done on the client side and how much on the server side. And what sensors are going to be used to do it - doesn't seem like there's a consensus at the moment - and how you format the data and get it to the point where you can train a 3DGS model (rough sketch below of seeding a model from a point cloud). Stuff like this (scalable training for creating billion-parameter 3DGS models) becomes really interesting when places like Tokyo release their entire point cloud (https://info.tokyo-digitaltwin.metro.tokyo.lg.jp/3dmodel/)
2 replies
0 recast
5 reactions
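One concrete piece of "getting the data to where you can train": 3DGS models are typically initialized by turning each point of an SfM or lidar point cloud into one Gaussian. A minimal PyTorch sketch, assuming the points are already loaded as an Nx3 numpy array (file parsing and downsampling of a city-scale cloud are left out, and init_gaussians_from_points is just an illustrative helper, not a gsplat API):

```python
# Sketch: initialize 3D Gaussian Splatting parameters from a point cloud
import numpy as np
import torch
from scipy.spatial import cKDTree

def init_gaussians_from_points(points: np.ndarray, colors: np.ndarray | None = None):
    """points: [N, 3] positions (e.g. from COLMAP or a city lidar scan)."""
    n = points.shape[0]

    # Scale each Gaussian to roughly the distance to its nearest neighbors,
    # a common 3DGS initialization heuristic.
    dists, _ = cKDTree(points).query(points, k=4)       # k=4: self + 3 neighbors
    mean_dist = dists[:, 1:].mean(axis=1).clip(min=1e-4)

    means = torch.tensor(points, dtype=torch.float32)
    log_scales = torch.log(torch.tensor(mean_dist, dtype=torch.float32))[:, None].repeat(1, 3)
    quats = torch.zeros(n, 4); quats[:, 0] = 1.0        # identity rotations
    opacities = torch.logit(torch.full((n,), 0.1))      # start mostly transparent
    if colors is None:
        colors = np.full((n, 3), 0.5)                   # neutral gray if no color channel
    colors = torch.tensor(colors, dtype=torch.float32)

    # These become the learnable parameters of the 3DGS model.
    return torch.nn.ParameterDict({
        "means": torch.nn.Parameter(means),
        "log_scales": torch.nn.Parameter(log_scales),
        "quats": torch.nn.Parameter(quats),
        "logit_opacities": torch.nn.Parameter(opacities),
        "colors": torch.nn.Parameter(colors),
    })
```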
Les Greys
@les
this may be a dumb question, but with all the powerful cellphones/compute in people's pockets that barely gets used, how does all the stuff you're saying factor into some Batman-like mapping?
1 reply
0 recast
2 reactions
July
@july
In the longer run, I can see devices / robots doing something like this - start with SLAM (simultaneous localization and mapping), which many vehicles and robots already do constantly to localize themselves in their reference frame - SLAM provides the poses, and you feed image frames / point clouds / camera video into something like 3D Gaussian Splatting to update the 3D representation pretty fast (much faster than NeRF) - and there are papers out there doing dynamic 3D Gaussian Splatting (instead of a still scene, think dynamically changing video) - then one can move through or see that 3D representation as it changes in real time - so you can do cool things like put virtual objects in there (AR/VR), get reference frames / better navigation, odometry etc. (robots / vehicles), or just do real-time visualization / interaction (rough sketch of that SLAM + splatting loop below)
1 reply
0 recast
1 reaction
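A rough sketch of that SLAM-fed loop, assuming gsplat's rasterization() API (gsplat >= 1.0, CUDA tensors) and some SLAM system that hands back 4x4 world-to-camera poses per keyframe; update_map and the keyframe format are illustrative, not any particular paper's method:

```python
# Sketch: SLAM supplies poses, the Gaussian map gets refined online per keyframe
import torch
from gsplat import rasterization  # assumes gsplat >= 1.0

def update_map(params, keyframes, K, width, height, iters=50, lr=1e-2):
    """params: dict of learnable Gaussian tensors (means, quats, log_scales, logit_opacities, colors).
    keyframes: list of (image [H,W,3] in [0,1], world-to-camera 4x4 pose) pairs from SLAM."""
    opt = torch.optim.Adam(params.values(), lr=lr)
    for _ in range(iters):
        image, viewmat = keyframes[torch.randint(len(keyframes), (1,)).item()]
        # Differentiable splatting of the current map from the SLAM-estimated pose.
        renders, alphas, _ = rasterization(
            means=params["means"],
            quats=torch.nn.functional.normalize(params["quats"], dim=-1),
            scales=torch.exp(params["log_scales"]),
            opacities=torch.sigmoid(params["logit_opacities"]),
            colors=params["colors"],
            viewmats=viewmat[None],      # [1, 4, 4]
            Ks=K[None],                  # [1, 3, 3] camera intrinsics
            width=width,
            height=height,
        )
        # Photometric loss against the real frame; backprop updates the Gaussians.
        loss = torch.nn.functional.l1_loss(renders[0], image)
        opt.zero_grad(); loss.backward(); opt.step()
    return params
```

Dynamic-scene variants add a time dimension to the Gaussians, but the pose-in, gradient-step-out structure stays the same.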