The main purpose of this example workspace is to get you started with behavior trees, which you use to generate animated sequences of characters in an automotive in-cabin environment. Behavior trees in Anyverse are key to generating the dynamic datasets needed to train machine learning models that use time as a variable in their predictions. Depending on its implementation, a single behavior tree can define a reusable behavior that you can apply to different characters in the workspace. You can define and save behavior trees in your own local library for reuse.
The workspace is ready to generate a 3-second sequence at 24 frames per second. It is also designed to help you understand other basic concepts of the Anyverse platform, such as cameras, sensors, ISPs, locators and more.
The car cabin
To generate data for an in-cabin monitoring system, we need to place cameras inside a car cabin. We are not using a base scene; instead, we use 3D assets that are detailed car cabin interiors of actual car brands and models. In the workspace Simulation node you will see an entity called The_Car, which has an associated 3D asset: Audi_Q5_v1. That is the cabin where we have placed cameras and people to generate a sequence.
As you can see, The_Car entity has other child entities. These are mostly “Locators”: dimensionless entities that represent a 3D position (x, y, z) and a rotation in 3D space around the 3 axes.
Locators are a very important concept in Anyverse. They are used to position other elements relative to them: any entity placed in a locator has a (0, 0, 0) position and rotation (transform for short) with respect to it. That means the “effective” transform of the placed element is the same as the locator’s. Changing the transform of the element with respect to the locator changes its “effective” transform, which is calculated by adding the locator’s transform to it.
In fact, this nesting of transforms applies to all elements in the simulation. The “effective” transform of an element is the sum of the transforms of all its ancestors plus its own transform relative to its immediate parent.
This “effective” transform is what gets annotated as the position and rotation of an element in the generated data.
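The nesting of transforms can be sketched in a few lines of Python. This is only an illustration, not Anyverse code: the `Transform` class and the `compose` and `effective_transform` helpers are hypothetical names, the values are made up, and rotation is limited to yaw (rotation around the vertical axis) to keep the example short. It assumes transforms combine as in a usual scene graph, so once a parent is rotated, “adding” a child transform means rotating the child’s position by the parent’s rotation before summing. For pure translations this reduces to a plain sum.

```python
import math
from dataclasses import dataclass


@dataclass
class Transform:
    """Position (x, y, z) and yaw in degrees. Hypothetical type for illustration only."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    yaw: float = 0.0


def compose(parent: Transform, child: Transform) -> Transform:
    """Express a child transform (given relative to its parent) in the parent's own frame."""
    a = math.radians(parent.yaw)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return Transform(
        x=parent.x + cos_a * child.x - sin_a * child.y,
        y=parent.y + sin_a * child.x + cos_a * child.y,
        z=parent.z + child.z,
        yaw=parent.yaw + child.yaw,
    )


def effective_transform(chain: list[Transform]) -> Transform:
    """Fold a root-to-leaf chain of relative transforms into one effective transform."""
    result = Transform()
    for relative in chain:
        result = compose(result, relative)
    return result


# Made-up values: a car, a seat locator inside it, and a character placed on the locator.
car = Transform(x=10.0, y=2.0, z=0.0, yaw=90.0)
seat_locator = Transform(x=0.5, y=-0.4, z=0.3)
driver = Transform()  # (0, 0, 0) position and rotation relative to the locator

locator_effective = effective_transform([car, seat_locator])
driver_effective = effective_transform([car, seat_locator, driver])
print(driver_effective)
```

Because the character sits at (0, 0, 0) relative to the locator, `driver_effective` and `locator_effective` come out identical, which is the behavior described for elements placed in a locator.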
You can explore the different locators in The_Car, and you will see nested locators and other elements, such as the Driver.
You will notice that in the workspace Simulation there is an entity named Ego. This is a special entity we use to rig the cameras, and you can think of it as a special locator. In our workspace we have defined 2 different cameras and placed them inside The_Car. These are the RVM_Camara, placed in the interior rear-view mirror, and the CC_Camera, placed in the central console.
You can change the 3D viewer’s point of view by selecting a camera in the top-left pulldown or by pressing P to toggle between cameras.
If you select a camera in the workspace, you can see its properties in the properties panel and understand how to define a camera in Anyverse. The most important part is the camera parameters block, where you configure:
- Resolution: width and height of your camera sensor in pixels.
- Film size: actual width and height of the sensor in meters.
- Focal length of the camera lens. Disabled for fisheye lens types
- Target distance: maximum target distance
- Fisheye FOV: The field of view in degrees for fisheye lenses. Disabled for other types of lenses
- F-Stop: camera aperture
- Lens type. Anyverse provides different analytical lens models:
  - PIN_HOLE. Perfect lens.
  - ANYVERSE_FISHEYE_CIRCULAR. Circular equidistant fisheye distortion model.
  - ANYVERSE_FISHEYE_DIAGONAL. Diagonal equidistant fisheye distortion model.
  - ANYVERSE_SPHERICAL. Spherical lens analytical model.
  - ANYVERSE_THIN_LENS. Thin lens model.
  - OPENCV. For OpenCV calibration intrinsics.
  - OPENCV_FISHEYE. For OpenCV fisheye calibration intrinsics.
When selecting OPENCV or OPENCV_FISHEYE, you need to fill in the intrinsics block with the center of the lens in pixels and the coefficients for the barrel (K1 to K6) and tangential (P0 and P1) distortions. These values, along with the focal length in pixels, are the result of the manual calibration process you can carry out with a real camera. After converting the focal length to meters (using the camera pixel size), you will have the characterization of the distortion of any real camera.
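The conversion from a calibrated focal length in pixels to meters is a single multiplication by the pixel size, which in turn follows from the film size and the resolution. A minimal sketch, assuming square pixels and made-up example values (the function names are illustrative and not part of Anyverse):

```python
def pixel_size_m(film_width_m: float, resolution_width_px: int) -> float:
    """Physical width of one pixel in meters, assuming square pixels."""
    return film_width_m / resolution_width_px


def focal_length_m(focal_length_px: float, pixel_size: float) -> float:
    """Convert a focal length from pixels (calibration output) to meters."""
    return focal_length_px * pixel_size


# Example values only; they are not the settings of the cameras in this workspace.
size = pixel_size_m(film_width_m=0.00384, resolution_width_px=640)  # about 6 micrometers
focal = focal_length_m(focal_length_px=500.0, pixel_size=size)      # about 3 millimeters
print(f"pixel size: {size:.2e} m, focal length: {focal:.4f} m")
```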
Additionally, for every camera you can select a specific sensor and ISP to simulate. To do this, you need to have at least one sensor and one ISP defined in the workspace. Other characteristics of the sensor are the shutter type and the exposure time. With Anyverse you can simulate both global and rolling shutters. The latter produces characteristic “bending” artifacts in images and videos when the camera or objects in the scene move at high speed.
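To get a feel for where the “bending” comes from, here is a simplified model of a rolling shutter. It is not how Anyverse simulates it. It assumes each sensor row is captured one fixed line time after the previous one, so an object moving horizontally at constant speed appears shifted a little more on every row. The function names and numbers are made up for illustration.

```python
def row_capture_time(row: int, line_time_s: float, start_time_s: float = 0.0) -> float:
    """Time at which a given row is captured, with rows read out one after another."""
    return start_time_s + row * line_time_s


def horizontal_skew_px(row: int, line_time_s: float, speed_px_per_s: float) -> float:
    """Apparent horizontal shift of a moving object at a given row, relative to row 0."""
    return speed_px_per_s * row_capture_time(row, line_time_s)


# Made-up example: 480 rows, 20 microseconds per row, object moving at 2000 px/s.
skew_top = horizontal_skew_px(row=0, line_time_s=20e-6, speed_px_per_s=2000.0)
skew_bottom = horizontal_skew_px(row=479, line_time_s=20e-6, speed_px_per_s=2000.0)
print(f"skew at top row: {skew_top:.2f} px, at bottom row: {skew_bottom:.2f} px")
```

With a global shutter all rows share the same capture time, so the skew is zero on every row and vertical edges stay straight.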
Both cameras in the workspace are fisheye diagonal cameras with a 640×480 resolution. The CC_Camera has a NIR sensor and ISP defined, and the RVM_Camera has a Sony IMX265 sensor and an RGB ISP defined. Feel free to play around with the camera, sensor and ISP configurations to match your particular cameras.
Another great feature of Anyverse is that you can define your own artificial light sources. In the in-cabin use case, it is very common to have a light source illuminating the scene. This is always the case when using a NIR or RGB-IR camera, which has an active light illuminating the scene at 940 nm (invisible to the human eye).
You can define many light source characteristics, such as the type of emission, the geometry and size of the emitter, the power in different units, and even a specific emission profile depending on the wavelength.
In this workspace we have placed a spot emitter between both cameras. You can turn it on or off (like any other element in the workspace) by clicking the icon right next to its name in the workspace.
In the workspace we have placed 2 character assets, in the driver and copilot seats respectively. You can change the characters by adding new characters to the workspace from the resources view and changing the asset references in the Driver and Copilot entities in the simulation.
You can change the initial pose of the characters by applying different animations to them from the animations available in the workspace. Bear in mind that you have to select the right animation for each part of the body you want to pose. The animations for the limbs (arms and legs) and the head have a weight, from 0 to 1, that indicates the degree to which the pose is applied.
Other relevant properties of the characters are hands attachment and gaze control: in other words, where the characters’ hands are positioned and where the characters are looking (we use locators for this, see below).
You need to specify the locators in the workspace that the hands are attached to. You don’t have to attach both hands. We have attached only the one that we are going to animate in the sequence. If you don’t attach a hand, it won’t be animated, and its pose will be controlled by the animations defined for the character.
IK stands for Inverse Kinematics. To attach the hands and move the gaze to the corresponding locators, we apply inverse kinematics involving adjacent joints of the body so that the movement looks more natural. The number of adjacent joints involved in the inverse kinematics calculations is specified by the IK Chain Length parameter. For the hands attachment this is automatic.
The offset in the hand attachment is the gap you want to leave between the hand and the locator, so that the hand does not penetrate other geometry when the locator is close to the surface of another asset.
In this workspace there are some specific locators that we use to animate the sequence through behavior trees and the characters’ hands attachment and gaze control. Some were created specifically for this purpose and others are fixed locators that belong to the car cabin.
Some specify the final position the characters are going to look at and reach to like:
And others are attached to hands and gaze, like:
We use all the above in the behavior trees to define the animation of characters in the sequence. Basically the characters will follow these locators around as they move in the scene in the way specified in the behavior trees.
Behavior trees are a computational model used in the field of artificial intelligence and game development to represent and control the behavior of autonomous agents or characters. They provide a hierarchical structure for defining the decision-making and action selection process.
A behavior tree consists of nodes that are organized in a tree-like structure. Each node represents a specific behavior or decision-making operation. The nodes can be categorized into three main types:
- Composite Nodes: These nodes define the overall structure of the behavior tree. They control the flow of execution and can contain child nodes. Examples of composite nodes include sequence nodes, where child nodes are executed in order, and selector nodes, where child nodes are executed until one succeeds.
- Decorator Nodes: These nodes modify the behavior of their child nodes. They can add conditions or constraints to the execution of their child nodes. Examples of decorator nodes include condition nodes that check a specific condition before executing their child node, and inverter nodes that invert the result of their child node.
- Action Nodes: These nodes represent specific actions or behaviors that an agent can perform. They are usually the leaf nodes of the behavior tree and do not have child nodes. Examples of action nodes include movement actions, attacking actions, or any other specific behavior that the agent can perform.
Behavior trees provide a flexible and modular way to design complex behaviors by combining and organizing simple behavior nodes. They allow for easy modification and expansion of behavior without having to modify the entire system. This makes them widely used in areas such as video games, robotics, and AI-driven applications.
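The three node types can be illustrated with a very small, generic behavior tree in Python. This is a sketch of the general technique only, not the Anyverse implementation or its node library. The class names follow the common terminology used above, and the action names (`hand_is_busy`, `reach_console`, and so on) are hypothetical.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Action:
    """Leaf node: runs a callable that returns True (success) or False (failure)."""
    def __init__(self, name: str, fn: Callable[[], bool]):
        self.name, self.fn = name, fn

    def tick(self) -> Status:
        return Status.SUCCESS if self.fn() else Status.FAILURE


class Sequence:
    """Composite node: runs children in order and fails at the first failure."""
    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Selector:
    """Composite node: runs children in order and succeeds at the first success."""
    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


class Inverter:
    """Decorator node: inverts the result of its single child."""
    def __init__(self, child):
        self.child = child

    def tick(self) -> Status:
        inverted = self.child.tick() is Status.FAILURE
        return Status.SUCCESS if inverted else Status.FAILURE


log: List[str] = []  # records which actions were executed, in order


def do(name: str, ok: bool = True) -> Action:
    """Build an action that logs its name and reports a fixed outcome."""
    def run() -> bool:
        log.append(name)
        return ok
    return Action(name, run)


tree = Selector([
    Sequence([Inverter(do("hand_is_busy", ok=False)),
              do("reach_console"),
              do("look_at_console")]),
    do("stay_idle"),  # fallback, only reached if the sequence fails
])
result = tree.tick()
print(result, log)
```

In this run the hand is not busy, so the inverter succeeds, the sequence runs both remaining actions, and the selector never reaches the `stay_idle` fallback.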
In Anyverse you can assign a behavior tree to any entity. Select an entity such as the Driver and open the Behavior trees View from the menu.
In this workspace, to help you get started with behaviors, we have two predefined behavior trees: one for the Driver entity and another for the Copilot entity in the Simulation.
Once you open the behavior view, if you click on a different entity in the workspace you will see the corresponding behavior tree assigned to that entity.
There are 3 different tabs in the behavior view:
- Nodes: With the different types of nodes to use in your trees. Just drag and drop them in the editing area to use them
- Variables: To define internal and public variables your behavior will use. You can think of the public ones as input parameters for your behavior tree.
- Node Properties: With descriptive properties of the node and the input parameters to specify
Once you open the behavior view, you can edit the behavior. All behaviors are saved as part of the workspace. Additionally, you can always save any behavior tree to your local library, to reuse it as an action node, by clicking the save button at the top. The public variables in your behavior tree will be exposed as input parameters for the node.
Variables can have different types. When a variable is an entity reference, you can make it refer to the owner of the behavior tree or to any other entity in the workspace. To refer to the owner, click the icon to the left of the variable definition. To select a specific entity from the workspace, click the ‘No entity’ button.
You can use variables as input for node parameters. You just have to link them using the link icon to the left of the parameter definition.
Run the simulation
To define the sequence duration, set the Simulation Duration property to the desired length and the Generator Capture frequency to the number of captures per second that you want. Bear in mind that every capture will be a render. For example, as it is, the workspace is ready to generate a 3-second sequence at 24 captures/second, with the driver reaching for and looking at the central console, and the copilot reaching for the driver’s headrest and looking at the back of the car. That is 3 × 24 = 72 captures, plus 1 for the key frame at time 0: 73 renders in total.
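The render count is easy to check for other durations and capture frequencies. A small sketch of the arithmetic (the function names are ours, not Anyverse properties):

```python
from typing import List


def render_count(duration_s: float, captures_per_second: float) -> int:
    """One render per capture interval, plus one for the key frame at time 0."""
    return round(duration_s * captures_per_second) + 1


def capture_times(duration_s: float, captures_per_second: float) -> List[float]:
    """Timestamp in seconds of every capture, from 0 to the end of the sequence."""
    count = render_count(duration_s, captures_per_second)
    return [i / captures_per_second for i in range(count)]


renders = render_count(duration_s=3, captures_per_second=24)
times = capture_times(duration_s=3, captures_per_second=24)
print(renders, times[0], times[-1])
```

For the workspace as delivered this gives 73 renders, with the first capture at 0 seconds and the last at 3 seconds.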
To see how the behavior trees run, and to see the resulting animation in the 3D viewer, you can run the simulation by clicking on the rocket icon in the bottom-left corner of the 3D viewer.
If you have the behavior view visible and an entity with a behavior tree selected in the workspace, you can follow the execution of the behavior tree and its status at every step.
Generate a sequence
To generate a sequence from your workspace, you first have to create a dataset with the output channels you require. Go to the generator inspector space (1), click the + icon to the right of the Dataset node of the tree (2), give the dataset a name and select the channels you need (you won’t be able to change them after creation) (3), and click the Create button (4).
Now you are ready to generate a sample sequence. Right-click on the Generate node in the workspace and select a dataset (the one you just created, for example). A pop-up will give you the details of the generation you are going to run. After you give the result a meaningful name, another pop-up will show the progress of the generation. After closing it, you can go to the dataset in the generation inspector space to see the execution progress in the cloud. When all the renders finish, you will see the results.
Clicking the different frames will show you the results, and for each frame you can click on the different channels to the right to see them in more detail and zoom in on them with the mouse. Additionally, you can browse the JSON ground truth metadata all the way to the right in this space.
Finally, you can download all the frames with channels and ground truth to your local machine. Right-click on the generated batch name and select the Download to local option in the contextual menu. Then you just stitch all the frames together to create a video in any format you want.
Here you have a couple of final sequences from both cameras in the workspace, with different behavior obtained just by changing one of the input variables. Would you be able to replicate these 2 videos?
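One possible way to stitch the downloaded frames is with the ffmpeg command-line tool, driven from Python. The sketch below only builds the command. The folder name and the `frame_%04d.png` file pattern are assumptions for illustration, so adapt them to how the downloaded frames are actually named on your machine. ffmpeg must be installed separately.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(frames_dir: str, pattern: str, fps: int, output: str) -> List[str]:
    """Build an ffmpeg command that encodes a numbered image sequence as H.264 video."""
    return [
        "ffmpeg",
        "-y",                       # overwrite the output file if it exists
        "-framerate", str(fps),     # input frame rate, matching the capture frequency
        "-i", str(Path(frames_dir) / pattern),
        "-c:v", "libx264",          # H.264 encoder
        "-pix_fmt", "yuv420p",      # pixel format most players can decode
        output,
    ]


def encode(command: List[str]) -> None:
    """Run the command; raises CalledProcessError if ffmpeg reports an error."""
    subprocess.run(command, check=True)


# Hypothetical folder and file pattern; 24 fps matches the workspace capture frequency.
command = build_ffmpeg_command("downloads/my_batch", "frame_%04d.png", 24, "sequence.mp4")
print(" ".join(command))
# encode(command)  # uncomment once the paths match your downloaded frames
```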