Let me start with a simple example. Say we are creating visualization software for city traffic and we want to render the city and all the cars in it. Each car is composed of a chassis and four wheels. The chassis is defined in the local coordinates of the car; the wheels are represented by one model instanced four times relative to the car's local coordinate system. And we want to instantiate the car 1'000 times in 10 colors, so 10'000 instances in total.
In Anari, we would create two geometries - one for the chassis, another for the wheel. This is perfect and without issues.
Then we would create 10 surfaces, give each a different color and set the chassis as their geometry. For the wheels, we would create one surface with the wheel as its geometry and a black material. I am not entirely fond of objects as lightweight as surfaces, which merely associate a geometry with a material, but let's see later whether there are any alternatives. Anyway, so far we are OK.
Now, to instantiate one car, we need to create an Anari::Instance, which carries a single transformation. We create it and assign the car transformation. We cannot associate it with the chassis surface directly; we have to create a new Group and an Array of Surfaces and set the array's single element to the chassis surface. So, to instantiate the chassis, we need four objects - Instance, Group, Array and Surface. Luckily, the surface is shared among 1000 objects, so in our example only three objects are needed per car. Having unique textures for each car would bring it back to four.
But what about the wheels? They cannot be placed below the chassis instance. For that, we would need some kind of hierarchy of instances, or multi-level instancing as you call it in #2. Currently, we would need to create an Anari::Instance for each wheel and assign it the composition of the car and wheel transformations. I guess your team is already aware of the problem.
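To make the "composition of car and wheel transformation" concrete, here is a minimal sketch of what the application has to compute for every wheel under flat instancing. `Mat4`, `translation`, `multiply` and `flatWheelTransforms` are hypothetical helpers written for this post, not Anari types or calls, and I assume column vectors (world = car * wheel).

```cpp
#include <array>
#include <vector>

// Hypothetical minimal 4x4 row-major matrix; not an Anari type.
struct Mat4 {
    std::array<float, 16> m{};  // zero-initialized
};

Mat4 identity() {
    Mat4 r;
    for (int i = 0; i < 4; ++i) r.m[i * 4 + i] = 1.0f;
    return r;
}

// Translation stored in the last column (column-vector convention).
Mat4 translation(float x, float y, float z) {
    Mat4 r = identity();
    r.m[3] = x;
    r.m[7] = y;
    r.m[11] = z;
    return r;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i * 4 + j] += a.m[i * 4 + k] * b.m[k * 4 + j];
    return r;
}

// Flat instancing: every wheel instance receives the combined matrix
// carToWorld * wheelToCar, so all of them must be recomputed whenever
// the car moves, even if wheelToCar never changes.
std::vector<Mat4> flatWheelTransforms(const Mat4& carToWorld,
                                      const std::vector<Mat4>& wheelToCar) {
    std::vector<Mat4> out;
    out.reserve(wheelToCar.size());
    for (const Mat4& w : wheelToCar) out.push_back(multiply(carToWorld, w));
    return out;
}
```

Each returned matrix would then be assigned to one wheel instance, which is four matrix updates per car per movement on top of the chassis update.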
However, multi-level instancing would, unfortunately, not save the day. If all four wheels had a constant transformation relative to the car's local coordinate system, it would work. But if we wanted to change the front wheel transformation when turning left or right, rotating the front wheels slightly to follow the path of the car, it would not work. Some of the 1000 cars would be turning left, some right and some going straight, all with different transformations of their wheels.
OpenGL Performer (the famous real-time rendering library from Silicon Graphics, now somewhat historical) tackled the problem by introducing shared instancing and cloned instancing. With shared instancing, the rear wheel transformations can be shared among all cars because these wheels are fixed (for simplicity, let's not spin the wheels with the speed of the car). The front wheels, however, need cloned instancing, so that each car has its own front wheel transformations and can turn left and right.
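A minimal sketch of how I understand the distinction, using plain C++ ownership instead of any Performer or Anari API. `Transform`, `Car` and `makeCars` are hypothetical names, and the transformation is reduced to a single steering angle to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a transformation node.
struct Transform {
    float steerAngle = 0.0f;
};

struct Car {
    std::shared_ptr<Transform> rearWheels;   // shared instancing: one node for all cars
    std::shared_ptr<Transform> frontWheels;  // cloned instancing: a private copy per car
};

std::vector<Car> makeCars(std::size_t count) {
    auto sharedRear = std::make_shared<Transform>();
    const Transform frontPrototype{};
    std::vector<Car> cars;
    cars.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        // The rear node is referenced; the front node is copied from the prototype.
        cars.push_back(Car{sharedRear, std::make_shared<Transform>(frontPrototype)});
    }
    return cars;
}
```

Steering one car then touches only that car's `frontWheels`, while the memory cost of the rear wheel transformation stays constant regardless of the number of cars.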
Now, how to apply the principle to Anari? (This is not to say that the current Anari design is not good. Rather, it is an opportunity to give arguments for why the current design is the best one and, at the same time, an opportunity to look for other, possibly even better, ways that nobody has thought of before.)
- Keep the current design - Each car instance will result in five Anari instances - one for the chassis and four for the wheels. The main disadvantages are: (a) The front wheels need to receive their transformation as a combined matrix of the particular car transformation and the front wheel transformation. Many users might prefer to have two separate nodes, one for the car transformation and a second for the front wheel transformation. (b) The rear wheel transformations need to be updated each time the car moves. Their local transformation is static, but the car's movement still forces an update.
- Allow for nested transformations - This would probably require only a simple change to the current Anari design: an Instance might have Instances as children. This would remove both disadvantages (a) and (b) from the previous point. The car would be composed of a single first-level instance containing the chassis and four second-level instances, one for each wheel. The four second-level instances would carry transformations in the local coordinate system of the car, which removes the problem of the combined matrix for the front wheels and the problem of updating the rear wheels whenever the car position changes.
- Introduce a transformation graph - The previous point gives us a scene graph based on coordinate transformations, where the parent-child relation of Instances defines the relation between the parent's and the child's local coordinate systems. However, coupling the parent-child relation with transformations might sometimes not be desirable. The parent-child relation in the Anari graph gives ownership (thus existence in memory) and visibility (thus existence in the rendered scene). A user might want to create a model of a building with ten transformations, that is, ten coordinate systems, one for each floor. But they might want to organize the scene by object category rather than by transformation, that is, rather than by the floor on which the objects are placed. Each child would then be a group holding all the objects of a particular kind, so removing one child would hide all the walls or all the furniture, and appending it again would show them again. This would decouple the graph structure from the transformations. But where to put the transformations in such a case? They would have their own graph. In our example, it would have a root transformation for the building and ten child transformations, one for each floor. Each scene object (each Instance) would reference one of these transformations. An additional benefit is the speed of transformation updates. Without a transformation graph, moving all the objects on a particular floor would mean changing the transformation of every one of them (potentially thousands). With a transformation graph, we just update one transformation node. Another advantage is that a transformation graph lets us accelerate the transformation computations (matrix multiplications) on the GPU, removing the burden from the CPU - and we might have millions of matrices in some large scenes.
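The third option can be sketched as follows. This is only an illustration under my own assumptions: `TransformGraph`, `TransformNode` and `SceneObject` are hypothetical names, not part of Anari, and transformations are reduced to translations so that composing them is a simple sum.

```cpp
#include <cassert>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Hypothetical transformation node: a translation relative to its parent.
struct TransformNode {
    int parent;  // index of the parent node, -1 for the root
    Vec3 local;  // offset in the parent's coordinate system
};

struct TransformGraph {
    std::vector<TransformNode> nodes;

    // Parents must be added before their children.
    int add(int parent, Vec3 local) {
        assert(parent >= -1 && parent < static_cast<int>(nodes.size()));
        nodes.push_back({parent, local});
        return static_cast<int>(nodes.size()) - 1;
    }

    // Walk up to the root, accumulating offsets.
    Vec3 world(int node) const {
        Vec3 sum{0.0f, 0.0f, 0.0f};
        for (int n = node; n != -1; n = nodes[n].parent) {
            sum.x += nodes[n].local.x;
            sum.y += nodes[n].local.y;
            sum.z += nodes[n].local.z;
        }
        return sum;
    }
};

// A scene object only references a transformation node, so the scene
// can be grouped by category (walls, furniture, ...) independently.
struct SceneObject {
    int transform;
};
```

Moving a whole floor is then a single write to that floor's node; every `SceneObject` referencing it picks up the new position the next time `world()` is evaluated, whichever category group it lives in.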
Another thought was why we need all of Instance+Group+Array+Surface just to instantiate one Geometry. I asked myself: is an instance of a Geometry not just Geometry+Material+Transformation? Why not have a single object holding all three components? It could be named Drawable, Instance or Geode (GEOmetry noDE) and contain references to Geometry+Material+Transformation. You might argue that we need to instantiate Volumes and Lights too, but the Instance class might be made to handle all three mentioned types. You might also argue that the Instance+Group+Array+Surface solution provides the ability to instantiate many Surfaces, Volumes and Lights using just one Instance. But this lacks the flexibility to instance our four wheels within one instance of a car. It also requires all Geometries to share the same local coordinate system - basically, they must fit together. All they can differ in is material. If we would like this new Instance (Geometry+Material+Transformation) to handle more Geometries with different Materials, we might allow it to reference an Array of Geometries and Materials or, even better, an array of Geometry+Material+Transformation. Optionally, we might call it MultiInstance. Surely, we can analyze many alternatives and look for the best fit. Or we can analyze why the current Anari design is the best one.
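A rough sketch of what such an object might look like, purely as a discussion aid. None of these types exist in Anari; `Geometry`, `Material`, `Offset`, `Drawable`, `MultiInstance` and `makeCar` are hypothetical, and the transformation is again reduced to a translation.

```cpp
#include <memory>
#include <vector>

// Hypothetical placeholder types, not Anari objects.
struct Geometry {};
struct Material {
    float r, g, b;
};
struct Offset {
    float x, y, z;  // translation-only stand-in for a full transformation
};

// One object holding all three components.
struct Drawable {
    std::shared_ptr<const Geometry> geometry;
    std::shared_ptr<const Material> material;
    Offset transform;
};

// An array of Geometry+Material+Transformation triples.
struct MultiInstance {
    std::vector<Drawable> parts;
};

// Builds one car: a chassis plus four wheels that share one geometry
// and one material but each have their own transformation.
MultiInstance makeCar(std::shared_ptr<const Geometry> chassis,
                      std::shared_ptr<const Material> paint,
                      std::shared_ptr<const Geometry> wheel,
                      std::shared_ptr<const Material> rubber) {
    MultiInstance car;
    car.parts.push_back({chassis, paint, Offset{0.0f, 0.0f, 0.0f}});
    const Offset wheelOffsets[4] = {{1.5f, 0.0f, 0.8f},
                                    {1.5f, 0.0f, -0.8f},
                                    {-1.5f, 0.0f, 0.8f},
                                    {-1.5f, 0.0f, -0.8f}};
    for (const Offset& o : wheelOffsets) car.parts.push_back({wheel, rubber, o});
    return car;
}
```

Compared with the Instance+Group+Array+Surface chain, each part here carries its own transformation, so the four wheels fit into one car object without sharing a coordinate system with the chassis.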