immersive-web / dom-overlays

A feature incubation repo for layering DOM content on/in WebXR content. Feature lead: Piotr Bialecki

Home Page: https://immersive-web.github.io/dom-overlays/

License: Other

Makefile 0.76% Bikeshed 99.24%

webxr

dom-overlays's People

klausw manishearth himorin lp249839965 dontcallmedom rv12r isabella232 seanpm2001

dom-overlays's Issues

XRDOMOverlayState should be an interface

Per the spec, XRDOMOverlayState is a dictionary, but in Chromium it's an interface.

Because XRDOMOverlayState is the type of an attribute on an interface, it doesn't really work to use a dictionary, since dictionaries are passed by value, and one would end up returning a new object each time, such that xrSession.domOverlay == xrSession.domOverlay would not hold.

Web IDL says simply:

Dictionaries must not be used as the type of an attribute or constant.

This doesn't make much of a difference, but will mean that there will be a window.XRDOMOverlayState interface object. (There is in Chromium, but shouldn't be per spec, which is how I noticed.)
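To make the identity problem concrete, here is a minimal sketch in plain JavaScript. Both classes are hypothetical stand-ins for a session object, not real WebXR types; they only imitate how a dictionary-typed getter (a fresh object per read) differs from an interface-typed one (the same object per read). The getter is named domOverlayState here.

```javascript
// Hypothetical stand-ins for an XR session; neither class is a real WebXR type.
class DictionaryStyleSession {
  // A dictionary is converted to a brand-new JS object on every read.
  get domOverlayState() {
    return { type: 'screen' };
  }
}

class InterfaceStyleSession {
  #state = Object.freeze({ type: 'screen' });
  // An interface-typed attribute hands back the same object each time.
  get domOverlayState() {
    return this.#state;
  }
}

const dictSession = new DictionaryStyleSession();
const ifaceSession = new InterfaceStyleSession();

console.log(dictSession.domOverlayState === dictSession.domOverlayState);   // false
console.log(ifaceSession.domOverlayState === ifaceSession.domOverlayState); // true
```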

[comment] iframes and aligning CSS transforms to AR planes

For real-world use reference:

I was today looking into how to achieve this in the current WebXR Device APIs in Chrome for a PoC I'm building and didn't find a way to do it.

For reference, my use case goes beyond 2D HUD behaviours: I was meaning to render iframes with external web content onto walls (using the transform matrices obtained from plane detection as CSS transforms on the iframes).
My goal would be to provide an interactive lobby where users can spawn and interact with different web pages (e.g. HTML5 games) in, for example, their living room.

I understand there would probably be a factor of performance hit to achieve this - but the potential of what could be achieved by having the web as we know it as another element in WebAR would be immense.

Keep up the good work - I've started following immersive-web/dom-overlays and really hope for it to start kicking off soon.

Feature 'dom-overlay' is not supported for mode: immersive-vr

Why are overlays not supported on immersive-vr sessions?

They are super useful when you want to have something displayed in the view all the time, no matter what the user is doing (HUD, time or weather information are good examples).

Most of the time it is overkill, or too difficult, to display or merge this information at the GL level. This is especially an issue on Chrome for Android, because it launches the VR app entirely instead of staying in Chrome itself in fullscreen.
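For anyone hitting this error, one way to avoid a hard failure is to ask for the overlay as an optional feature and check afterwards whether the UA enabled it. This is only a sketch: requestSessionWithOptionalOverlay is a hypothetical helper name, `xr` is assumed to behave like navigator.xr, and whether a given UA grants 'dom-overlay' for immersive-vr is entirely up to that UA.

```javascript
// Hypothetical helper. `xr` is assumed to behave like navigator.xr and
// `root` is the element to use as the overlay root.
async function requestSessionWithOptionalOverlay(xr, mode, root) {
  const session = await xr.requestSession(mode, {
    // Optional, so the request does not fail when the mode lacks overlay support.
    optionalFeatures: ['dom-overlay'],
    domOverlay: { root },
  });
  // domOverlayState is null (or absent) when the overlay was not enabled.
  const state = session.domOverlayState;
  return { session, overlayType: state ? state.type : null };
}
```

In a page this would be called from a user gesture, for example `requestSessionWithOptionalOverlay(navigator.xr, 'immersive-vr', document.body)`, and the app would fall back to drawing its HUD in WebGL when `overlayType` is null.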

Accessibility out of scope?

Coming here from the TAG review thread, as we are looking at this at our virtual face-to-face.

I was a little taken aback at the comment that the overlaid interface would help with internationalisation and accessibility, but that accessibility is out of scope.

Could you elaborate on why accessibility is out of scope, if it is worth mentioning as a benefit?
How do you envisage accessibility benefiting from this API?

Interoperability and compatibility - browser vendors & web developers support for DOM Overlay

I'm currently preparing to send out an Intent to Ship for WebXR DOM Overlays Module so I'd like to poll for other vendors' support for the module. @raviramachandra, @Manishearth, @thetuvix, @grorg, @kearwood and others - can you share your opinion please?

There's currently an open issue (#15) about integration with the Fullscreen API, and I'm working on a PR to address it. (Unless I'm misunderstanding it, I think this is just a matter of clarifications and I'm not expecting API-level changes as a result of this.)

Web developers' opinion about this module is also welcome!

Rethink non-goals: placing DOM elements in the 3D scene

The explainer clearly states that:

This API is not intended to support placing DOM elements in the 3D scene. It does not address use cases such as placing labels directly on 3D objects or world features.

But as I'm trying to develop a proof-of-concept application, I'm now facing a myriad of layout & 2D UI design related issues (flexbox layouts, typography, text alignment, CSS animations etc.) that are already solved on the web. While the DOM overlay could fulfil the use cases of designing simple HUDs and such, the ability to add something like styled DOM elements to surfaces of 3D objects (maybe with a fixed depth per z-index layer) is exactly what would be needed down the line.

While clearly a major headache, I think this should be given another consideration and a graduated approach (perhaps dom-layers instead of dom-overlays), otherwise we'll have to reinvent the wheel...

Cross origin content and input

The spec has the following to say about cross origin content and input:

For a DOM overlay, XR input event data is treated as similar to mouse movement data. Poses remain available to the outer page if there is no cross-origin content, or if the cross-origin content is not receiving input, but are limited (blocked) when interacting with cross-origin content.

Does this mean the conforming UA must determine whether cross origin content is drawn anywhere inside of the DOM overlay element and, if so, stop returning all poses to the web developer? If so, that seems like a tough thing to determine for any random element. No?

In the context of VR DOM overlays, what does it mean for the cross-origin content to not receive input? Can the cross origin content only receive select-type input?

Can you please elaborate on the phrase "but are limited (blocked) when interacting with cross-origin content"?

Spell out how optional Fullscreen API integration works

https://immersive-web.github.io/dom-overlays/#fullscreen-api-integration says this:

The UA MAY implement DOM Overlay as a special case of the [FULLSCREEN] API. In this case, the UA MUST prevent changes to the active fullscreen element, rejecting requestFullscreen requests for the duration of the immersive session.

From the discussion in #14 (comment) it sounds like the "MAY" here is to allow for the integration to work differently depending on what the hardware is. However, my reading of it was that integrating with the Fullscreen API at all is optional.

If the flexibility here is around the hardware and not differences between implementations, then it would be clearer if the integration were spelled out in detail in terms of which steps in which algorithms do what. Then, make certain steps conditional on a concept that explicitly depends on what hardware is available, with a non-normative note saying in what kinds of scenarios it could make sense to return true or false from this check.

https://fullscreen.spec.whatwg.org/#fullscreen-is-supported is a limited example of this, saying "Fullscreen is supported if there is no previously-established user preference, security risk, or platform limitation."

https://html.spec.whatwg.org/multipage/media.html#concept-media-load-resource is another example, where one step says "Optionally, run the following substeps."

Configuring the active DOM element for the overlay

The current sample implementation uses a feature descriptor named "dom-overlay-for-handheld-ar" for enabling DOM overlay mode, but does not specify a DOM element at session creation time. By default, the DOM overlay shows the <body> element, and applications can use the Fullscreen API to change the element while the session is active.

Going forward, would it be preferable to explicitly declare at session creation time which DOM element is to be used for the overlay?

This would potentially be a use case for a feature descriptor with attributes, i.e. something like this:

let elem = document.getElementById('content');
navigator.xr.requestSession('immersive-ar',
   {optionalFeatures: [
     new XRDOMOverlayForHandheldAR({overlayElement: elem})
   ]});

However, as far as I know there isn't yet a clear consensus how non-enum feature attributes should be declared, see for example ongoing discussions in immersive-web/webxr#860, and previous discussions in immersive-web/webxr#791 that mention a XRFeatureInit class. This comment by toji@ uses new XRMinimumBoundsRequirement(3, 2) as an example.

Alternatively, should there be a separate API, such as an XRSession setDOMOverlayElement(elem) method, for configuring this? That seems more heavyweight and invasive though.

Is there a need to be able to change the fullscreened element at runtime, or is it good enough to specify an element at session start? The application would always be able to use DOM manipulation to move nodes in and out of that element as needed, or use CSS visibility rules.
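As a rough illustration of the "specify one element at session start" option, the application could keep several panels inside the single overlay element and switch between them itself. The helper name showOnly and the panel ids are assumptions made up for this sketch.

```javascript
// Hypothetical helper. `overlayRoot` is the single element used as the overlay;
// its direct children are treated as switchable panels.
function showOnly(overlayRoot, panelId) {
  for (const panel of overlayRoot.children) {
    // `hidden` hides a panel without removing it from the overlay element.
    panel.hidden = panel.id !== panelId;
  }
}
```

Calling `showOnly(overlayRoot, 'settings')` would then leave only the child with id 'settings' visible, without touching which element is the overlay.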

Note that if there is a DOM overlay specific way to specify the overlay element, there's still some overlap with the Fullscreen API. Some implementations may be using similar mechanisms for DOM overlay and fullscreen mode and may not be able to change them separately. To avoid incompatibility, it may be worth specifying that using the fullscreen API while in DOM overlay mode is either prohibited or results in undefined behavior. Alternatively, if the requirement is that both work independently, this could be an additional burden on implementations.

Decide on the feature name

Currently the feature name is dom-overlay-for-handheld-ar. Would it be preferable to shorten this to just dom-overlay since this feature could also be implemented for AR headsets or VR devices?

Override CSS for overlaid element

Hi there !
I'm playing with WebXR spec and dom-overlays elements :)

I would like to know whether it is possible to override the default user agent style for the pseudo-class.
The pseudo-class is xr-overlay, as mentioned in the spec: https://immersive-web.github.io/dom-overlays/#css-pseudo-class

Thanks :)

Clarification on domOverlay element and sizing

The spec has the following to say about domOverlay element sizing:

While active, the DOM overlay element is automatically resized to fill the dimensions of the UA-provided DOM overlay rectangle. Its background color is automatically styled as transparent for the duration of the session.

In "Example 2", the code fetches the uiElement directly from the DOM and passes it to requestSession:

let uiElement = document.getElementById('ui');
navigator.xr.requestSession('immersive-ar', {
    optionalFeatures: ['dom-overlay'],
    domOverlay: { root: uiElement } }).then((session) => {
    ...
});

For scenarios where the 2D web page is rendered on a separate monitor, how does sizing and transparency affect how the element appears in the 2D web page?

Automatically making element backgrounds be transparent could have security implications for domOverlay elements on the 2D page that are in front of other elements in different security domains.

DOM Overlays (for Phone AR support and more)

As described in immersive-web/webxr#394, when we move away from allowing AR content inline there's a desire to still enable 2D UI to be built using DOM for cases like phone AR, where support for displaying DOM elements and the AR stream together should be pretty trivial. (Previously the same effect would have been supported by using the fullscreen API on a parent element that contained both the AR canvas and some overlaid DOM elements.) Given that we want to utilize the advantages of the web as our platform whenever possible, it would be incredibly unfortunate to lose this ability in pursuit of an explicit AR mode.

For reference, consider this image of Pokemon Go:

If created with WebXR, ideally the animal and likely the ball at the bottom of the screen (because it has an associated throwing animation) would be rendered with WebGL as part of the core session rAF loop, but the other UI elements would ideally be handled as standard DOM elements that are simply composited over the AR content by the UA. (The name of the animal floating over its head is a special case that I'll address more in a second.)

However, while the core need is to retain DOM support for phone AR there's also a desire to potentially enable that DOM content to be surfaced on headsets, with the idea being that the AR content would be fully immersive while the DOM portion is composited in by the UA somehow to ensure that the user can still access it. Exactly how that would appear is an open question, and one that we probably would want to leave up to the UA to avoid prescribing unproven UX patterns. Some possibilities I could see are:

Having the overlay DOM be a floating, moveable window in space
Attaching it to your wrist
Pinning it to a wall
Showing/hiding it with the push of a button

Doing so would likely be considered a "compatibility" mode, and would definitely not be a path we'd encourage for developers explicitly targeting headset AR. The big benefit being that it would enable AR content built for the more common devices (phone AR) to still be accessible on more advanced devices, thus immediately increasing the content that's accessible to them.

That said, there's also been some concerns voiced that supporting DOM like this in headsets could be difficult for some platforms, or would be hard to make a good user experience. As such, I'm reluctant to say that supporting a DOM overlay should be required for all devices that support AR. And certainly I believe that we should offer the right signals and tools in all cases to allow developers to explicitly create experiences optimized for any given devices they choose to support.

So this issue is simply to talk about how we should go about supporting those overlays and what guarantees of availability that mode should have.

Some other considerations:

In my mind this is explicitly different than a DOM-to-texture or DOM layer solution, which would primarily be aimed at enabling DOM content to be shown in a developer-controlled way in 3D space. While you certainly could use such a mechanism to achieve the same effect it has a lot more technical, security, and ergonomics issues involved, doesn't feel like the right fit for a simple "I just want a couple of DOM buttons" UI cases, and the complexity would likely prevent us from shipping any time soon.
As pointed out by @blairmacintyre at the AR F2F, it would be ideal if we ensured that cases where DOM UI isn't desired could be optimized by the UA, which would no longer have to do some processing on the DOM tree/compositor.
We expect that users will attempt to do alignment of DOM elements and 3D rendered elements, probably involving gratuitous amounts of matrix math and CSS 3D transforms. See, for example, the name floating above the animal's head in the image above. Problematically, these types of alignment would probably function OK for phone AR but be next-to-useless in headset AR, especially if the developer has no sense of where the DOM content is relative to the AR content. It's an open question for me if we want to accept this as inevitable and encourage this type of use by providing spatial mapping functions, or if we should discourage it for the sake of headset AR by explicitly making it difficult to reverse-engineer the spatial mapping between DOM and AR content.
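The "matrix math" alignment mentioned in the last point tends to look something like the following sketch. It assumes the handheld case where the overlay rectangle exactly covers the rendered viewport, which is precisely the assumption that breaks on headsets. The helper names are hypothetical; the matrices are column-major 4x4 arrays of the kind WebXR exposes as view.transform.inverse.matrix and view.projectionMatrix.

```javascript
// Multiply a column-major 4x4 matrix by the point [x, y, z, 1].
function transformPoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
    m[3] * x + m[7] * y + m[11] * z + m[15],
  ];
}

// Hypothetical helper: map a world-space point to CSS pixel offsets inside an
// overlay of the given size, assuming the overlay coincides with the viewport.
function worldToOverlayPixels(point, viewMatrix, projectionMatrix, width, height) {
  // The view matrix is assumed to be a rigid transform, so w stays 1 here.
  const eye = transformPoint(viewMatrix, point);
  const [clipX, clipY, , clipW] = transformPoint(projectionMatrix, eye);
  if (clipW <= 0) return null; // point is behind the viewer
  const ndcX = clipX / clipW;
  const ndcY = clipY / clipW;
  return {
    left: (ndcX * 0.5 + 0.5) * width,
    // CSS y grows downwards while NDC y grows upwards.
    top: (1 - (ndcY * 0.5 + 0.5)) * height,
  };
}
```

The result could be applied to a label with something like `label.style.transform = 'translate(' + left + 'px, ' + top + 'px)'` once per frame.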

Request for Examples

Can we have a hello world DOM Overlay example ?
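A minimal "hello world" could look roughly like the sketch below. It is not an official sample: startHelloOverlay is a hypothetical helper name, `xr` is assumed to behave like navigator.xr, `root` is assumed to be an element already in the page (for example a div with id "ui"), and the WebGL render loop that a real AR page needs is left out.

```javascript
// Hypothetical helper. Must be called from a user gesture such as a button click.
async function startHelloOverlay(xr, root) {
  const session = await xr.requestSession('immersive-ar', {
    // Required, so the request rejects if the overlay cannot be provided.
    requiredFeatures: ['dom-overlay'],
    domOverlay: { root },
  });
  // type describes how the UA is presenting the overlay.
  root.textContent = 'Hello, DOM overlay (' + session.domOverlayState.type + ')';
  session.addEventListener('end', () => {
    root.textContent = 'Session ended';
  });
  return session;
}
```

In a page, this might be wired up as `button.onclick = () => startHelloOverlay(navigator.xr, document.getElementById('ui'))`.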

Use cases for DOM Overlays in VR

It would be useful to collect some use cases for how DOM Overlays are being used in VR for potential implementations.

This example by @mrdoob is a great use case: https://twitter.com/mrdoob/status/1385184290867187715

DOM overlay not well-defined for non-roots

The current API allows any element to become the "Root" of a WebXR DOM overlay.

As specified, this is not well-defined. For example, what happens if there is a filter, scrolling, or transform on an ancestor element of that root? Are these properties somehow just ignored?

The fullscreen API also has these issues, but is already out in the wild and we have a limited ability to do anything about it. We shouldn't make that mistake again.

At what distance should this overlay be placed w/ Head Mounted MR devices

It is ambiguous at what distance the overlay should be placed. Head locked is fine. For placing the DOM overlay, can each UA make the best-fit (best experience) decision? On Magic Leap, let's say we find that the overlay has the best visual experience at distance 'x' and we place it there. Is 'x' something the UA/platform decides?

Add DOM Overlay test to WPT

There should be a test for this feature at https://github.com/web-platform-tests/wpt .

The current test at https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/web_tests/wpt_internal/webxr/ar/ar_dom_overlay.https.html uses some extensions to the test API that would need to be moved to the shared WPT repository to make this work.

[Comment] Best practices for overlays

When designing our WebXR UIs, our design team likes to consider whether UI elements are diegetic (characters / objects in the scene are aware of their presence) or non-diegetic (visible only to the user) when we consider whether they should be 2D DOM elements or 3D elements embedded in the scene.

From this point of view, it might be useful to think of DOM overlays that are attached to the document body as "outside" of the immersive scene, and shown through a platform-appropriate HUD-type experience, and it's up to the platform to figure out what this means.

Consider calling out XFO and `frame-ancestors` in addition to `frame-src`.

https://immersive-web.github.io/dom-overlays/#security reasonably calls out frame-src as applying to overlay content. It would be reasonable to note that the content itself might opt out of such embedding via x-frame-options and/or frame-ancestors. It's likely the case that this is implicitly covered, but it's worth making it explicit that the overlay doesn't create a new top-level browsing context.

Spec forces a choice of pointing ray

https://immersive-web.github.io/dom-overlays/#onbeforexrselect

An XRSessionEvent of type beforexrselect is dispatched on the DOM overlay element before generating a WebXR selectstart input event if the input source’s targetRaySpace intersects the DOM overlay element at the time the input device’s primary action is triggered.

This will have the same problem as immersive-web/layers#21: it enforces a particular choice of pointing ray (the -Z axis of the targetRaySpace). In general, while targetRaySpace is the preferred pointing ray, applications are free to offset it and render whatever they like.

As in immersive-web/layers#21 it might be worth having an XRRay field on XRInputSource that defaults to a ray at the origin along -Z (i.e. the default XRRay) that defines what the pointing ray is.

I feel like overall this doesn't impact the transient screen input case because the orientation doesn't matter when your ray is directly on the UI element, but if anyone implements floating DOM overlays for HMDs applications may want control over this. This can be added backwards compatibly, so it might be worth waiting for layers to figure this out and piggyback on the solution. Filing an issue here as well to just keep track.

(There's also a slight issue here in that we should define this as intersecting with the -Z axis of the target ray space, there is no such thing as "space intersects element", but that's a minor change: #30)
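To illustrate what "intersecting with the -Z axis of the target ray space" amounts to, here is a small geometric sketch. Everything in it is an assumption for illustration: the helper names are hypothetical, the pose is a column-major rigid transform expressed in the same space as the overlay, and the overlay is simplified to an axis-aligned rectangle in the plane z = rect.z, as a floating overlay on an HMD might be. An application preferring an offset ray could pass a different ray object to the same test.

```javascript
// Derive the default pointing ray from a column-major 4x4 pose matrix:
// origin is the translation, direction is the negated local Z axis.
function pointingRayFromPose(m) {
  return {
    origin: [m[12], m[13], m[14]],
    direction: [-m[8], -m[9], -m[10]],
  };
}

// Hypothetical hit test against a rectangle lying in the plane z = rect.z.
function rayHitsRect(ray, rect) {
  const [ox, oy, oz] = ray.origin;
  const [dx, dy, dz] = ray.direction;
  if (dz === 0) return false; // ray is parallel to the rectangle's plane
  const t = (rect.z - oz) / dz;
  if (t < 0) return false; // the plane is behind the ray origin
  const x = ox + t * dx;
  const y = oy + t * dy;
  return Math.abs(x - rect.centerX) <= rect.width / 2 &&
         Math.abs(y - rect.centerY) <= rect.height / 2;
}
```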

CSS Backdrop Filter Behaviour

CSS has a cool property called backdrop-filter.

It defines how content underneath a semi-transparent object gets filtered before being displayed.

Do we have defined behaviour for how that should work with dom-overlays?

It would be neat if it did work to have beautiful frosted glass HTML interfaces especially in head mounted use cases.

`beforexrselect` confusion

Coming here from the TAG review thread, as we are looking at this at our virtual face-to-face.

The explainer mentions

"WebXR's input events ("selectstart", "selectend", and "select") potentially duplicate DOM events"

I wasn't clear from the explainer on why this is the case. Could you give a fuller example, demonstrating when beforexrselect would solve such a problem?

Also, the previous paragraph reads,

If a WebXR application uses a DOM overlay in conjunction with XR input, it is possible that a user action could be interpreted as both an interaction with a DOM element and a 3D input to the XR application, for example if the user touches an onscreen button, or uses a controller's primary button while pointing at the DOM overlay.

Does this imply that the same is true for, e.g., click? How would I specify where the click event should be handled?
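For context, the pattern the spec describes for avoiding duplicate handling can be sketched as follows. suppressXRSelectOn is a hypothetical helper name. The idea, per the spec text, is that cancelling beforexrselect on part of the overlay stops the WebXR selectstart, select and selectend events for that interaction, while ordinary DOM events such as click are still delivered to the element.

```javascript
// Hypothetical helper: mark an overlay element (for example a button bar) as
// "DOM only", so that interactions on it do not also trigger XR select events.
function suppressXRSelectOn(element) {
  element.addEventListener('beforexrselect', (event) => {
    // Cancelling the event asks the UA to skip the XR select events
    // for this interaction; DOM events such as click are unaffected.
    event.preventDefault();
  });
}
```

A page would typically call this once for each interactive region of the overlay and leave the rest of the overlay uncancelled, so that taps elsewhere still reach the XR application as select events.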