Comments (6)
In a chat, NingXin Hu suggested that mediacapture-worker (https://w3c.github.io/mediacapture-worker/) might be a good starting point for how to structure a vision worker.
That is essentially what I have imagined: a worker-like setup where WebXR can execute custom CV code (perhaps in WebAssembly, JavaScript, or even on the GPU) for each video frame. We also have the opportunity to provide whatever data we consider necessary (e.g., camera intrinsics, pose of the camera relative to some frame of reference, other time-synchronized sensor data such as accelerometers/gyros, etc.).
Some of this (sensor data) might be best provided via a separate sensor API (assuming we can leverage shared memory to share it between workers). Looking at modern camera APIs, I think we should consider assuming that intrinsics are available for each camera, or at least make them an optional field.
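To make the shared-memory idea concrete, here is a minimal sketch of how a sensor sample could be published to other workers through a `SharedArrayBuffer`. Everything in it is an assumption for illustration: the buffer layout, and the names `createSensorBuffer`, `writeSample`, and `readSample`, do not come from any existing sensor or WebXR API. It uses a simple sequence counter so a reader can detect that it read while a write was in progress.

```typescript
// Hypothetical layout: one Int32 sequence counter, then three Float64 accelerometer axes.
const HEADER_BYTES = 8; // padding keeps the Float64 view 8-byte aligned

function createSensorBuffer(): SharedArrayBuffer {
  return new SharedArrayBuffer(HEADER_BYTES + 3 * Float64Array.BYTES_PER_ELEMENT);
}

// Writer side (e.g. whichever thread receives sensor callbacks).
function writeSample(sab: SharedArrayBuffer, x: number, y: number, z: number): void {
  const seq = new Int32Array(sab, 0, 1);
  const data = new Float64Array(sab, HEADER_BYTES, 3);
  Atomics.add(seq, 0, 1); // counter becomes odd: write in progress
  data[0] = x;
  data[1] = y;
  data[2] = z;
  Atomics.add(seq, 0, 1); // counter becomes even: sample complete
}

// Reader side (e.g. a vision worker). Returns null if nothing has been
// written yet or no consistent sample was seen within the retry budget.
function readSample(sab: SharedArrayBuffer): [number, number, number] | null {
  const seq = new Int32Array(sab, 0, 1);
  const data = new Float64Array(sab, HEADER_BYTES, 3);
  for (let attempt = 0; attempt < 100; attempt++) {
    const before = Atomics.load(seq, 0);
    if (before === 0) return null; // no sample published yet
    if (before % 2 !== 0) continue; // writer is mid-update, retry
    const sample: [number, number, number] = [data[0], data[1], data[2]];
    if (Atomics.load(seq, 0) === before) return sample; // counter unchanged: consistent read
  }
  return null;
}
```

In a real setup the same `SharedArrayBuffer` would be handed to each worker via `postMessage`, and a timestamp would likely sit next to the sample so it can be matched to a video frame.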
Having the camera video not just be assumed to be “the video we are overlaying AR onto” is essential, I think: we want to support see-through devices with cameras (like Hololens), multi-camera devices, and devices (like Vive) that have cameras that don’t align/cover the camera view.
We should assume that we can provide the camera pose relative to some “display” frame of reference. CV for AR (in general) has been greatly hampered by not knowing the calibrated structure of the display and sensor package, but when you look at real devices (e.g., Hololens, Vive, etc.) that have cameras, the relationships between the device coordinate system and camera, along with the camera intrinsics, are pre-calibrated. ARKit and ARCore will also provide this information on mobile, and I assume any custom HMD will be able to provide it for any attached devices.
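As a rough illustration of what per-frame data along these lines could look like, here is a minimal sketch. The `VisionFrame` and `CameraIntrinsics` shapes and both helper functions are hypothetical, not part of WebXR or any shipped API. The sketch assumes a pinhole camera model with the camera looking down +z (the usual computer vision convention, unlike WebGL) and row-major 4x4 matrices.

```typescript
// Hypothetical shapes for illustration only.
interface CameraIntrinsics {
  fx: number; // focal length in pixels, x axis
  fy: number; // focal length in pixels, y axis
  cx: number; // principal point, x (pixels)
  cy: number; // principal point, y (pixels)
}

type Vec3 = [number, number, number];

interface VisionFrame {
  timestamp: number;
  width: number;
  height: number;
  intrinsics?: CameraIntrinsics; // optional, as suggested above
  // Row-major 4x4 rigid transform taking camera-space points
  // into the "display" frame of reference.
  cameraToDisplay: number[];
}

// Apply a row-major 4x4 rigid transform to a 3D point.
function transformPoint(m: number[], p: Vec3): Vec3 {
  return [
    m[0] * p[0] + m[1] * p[1] + m[2] * p[2] + m[3],
    m[4] * p[0] + m[5] * p[1] + m[6] * p[2] + m[7],
    m[8] * p[0] + m[9] * p[1] + m[10] * p[2] + m[11],
  ];
}

// Pinhole projection of a camera-space point to pixel coordinates.
// Returns null for points at or behind the camera plane.
function projectToPixel(k: CameraIntrinsics, p: Vec3): [number, number] | null {
  const [x, y, z] = p;
  if (z <= 0) return null;
  return [(k.fx * x) / z + k.cx, (k.fy * y) / z + k.cy];
}
```

With both pieces available, CV code can relate a detected image feature to the display's coordinate system without having to run its own calibration step.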
from webxr-polyfill.
The MediaCapture Worker doc is marked as inactive, so probably not going to help on the implementation side, but I agree that the pattern is one that could work for this.
Yes, we need to handle camera data of varying types and FOV coverage with intrinsics to inform the CV algorithms.
Yes, I don't mean "use it": we don't want to use WebRTC at all, directly. What I envision, eventually, might be a way to "add in" WebRTC sources to the worker structure, but for now, I think the video sources would be accessed and configured via WebXR, because we only want ones that really have the information we need, and that we can access efficiently.
I was thinking of the patterns, yes.
Agreed. I don't suggest taking the MediaCapture Worker spec as is.
We (with Mozilla folks) previously tried to bring CV to the web. We made some progress on MediaCapture Worker for off-main-thread processing, the ImageBitmap extension for efficient access to captured image data, the MediaCapture depth extension for depth camera access, and OpenCV.js for CV algorithms on the web (asm.js at that time, now supports wasm).
I think we can leverage the experience gained from that previous work to benefit the CV use cases in WebXR.
I am thinking of two usage scenarios for camera data:
- upload camera data to WebGL for rendering, e.g. for HoloLens or Vive
- send camera data to a worker for marker detection, for ARKit/ARCore, HoloLens
The first case can be handled by the main thread. The second case needs to be handled by a worker thread.
This requires representing the camera data by an opaque handle. The handle supports uploading image data to the GPU if the data is in CPU memory, and skips the upload if the data is already in GPU memory. The handle also supports copying the camera data to the WebAssembly heap for the CPU processing case. It should avoid the unnecessary color conversion and memory copies of the current mediastream -> video -> canvas pipeline. The ImageBitmap extension is a good fit here.
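A minimal sketch of what such a handle could look like, under stated assumptions: `CameraFrameHandle`, `TextureUploader`, and their methods are hypothetical names for illustration, not the ImageBitmap extension or any real API. The GPU side is replaced by a stand-in interface so the sketch runs without WebGL, and a plain `Uint8Array` stands in for the WebAssembly heap.

```typescript
type Residency = "cpu" | "gpu";

// Stand-in for a real GPU upload path (e.g. a texture upload in WebGL).
interface TextureUploader {
  upload(pixels: Uint8Array, width: number, height: number): void;
}

class CameraFrameHandle {
  readonly width: number;
  readonly height: number;
  private readonly residency: Residency;
  private readonly pixels: Uint8Array | null; // null when the frame lives only on the GPU

  constructor(width: number, height: number, residency: Residency, pixels: Uint8Array | null) {
    this.width = width;
    this.height = height;
    this.residency = residency;
    this.pixels = pixels;
  }

  // Rendering case: upload only when the data is CPU-resident.
  // Returns true if an upload actually happened.
  bindAsTexture(uploader: TextureUploader): boolean {
    if (this.residency === "gpu" || this.pixels === null) return false;
    uploader.upload(this.pixels, this.width, this.height);
    return true;
  }

  // CPU processing case: copy the raw bytes into the heap at byteOffset,
  // with no color conversion. Returns the number of bytes written.
  copyToHeap(heap: Uint8Array, byteOffset: number): number {
    if (this.pixels === null) {
      throw new Error("GPU-resident frame: readback is not sketched here");
    }
    if (byteOffset + this.pixels.length > heap.length) {
      throw new RangeError("heap region too small for frame");
    }
    heap.set(this.pixels, byteOffset);
    return this.pixels.length;
  }
}
```

In practice the `heap` argument would be a `Uint8Array` view over the module's `WebAssembly.Memory` buffer, so the CV code sees the frame with a single copy.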
Initiated an API sketch, mozilla/webxr-api#18, for discussion.