Self-Adaptive Layered Software Style

We describe an architecture-centric design and implementation approach for building self-adapting and self-managing systems. The basis of this approach is the concept of meta-level components, which facilitate the adaptation and management of application-level components. Our approach applies two key enhancements to the traditional use of meta-level components: (1) we use three distinct, specialized meta-level components for the three fundamental activities of a system: sensing, computation, and control, and (2) we allow meta-level components to themselves be monitored, managed, and adapted by other (higher-layer) meta-level components. In this way, our approach flexibly supports adaptive layered architectures of arbitrary depth, the specification of arbitrary system adaptation policies, and the provision of intelligent facilities for constructing adaptation plans on the fly.
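The two enhancements described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the class names (Component, MetaComponent, Sensor, Planner, Controller), the boolean health flag, and the "restart" action are all hypothetical and chosen only to show the sensing, computation, and control roles and how a meta-level component can itself be the target of a higher-layer one.

```python
from typing import Optional


class Component:
    """Any component; layer 0 is the application layer (hypothetical base class)."""

    def __init__(self, name: str, layer: int = 0) -> None:
        self.name = name
        self.layer = layer
        self.healthy = True  # assumed, simplified notion of component state


class MetaComponent(Component):
    """A meta-level component sits one layer above the component it targets."""

    def __init__(self, name: str, target: Component) -> None:
        super().__init__(name, target.layer + 1)
        self.target = target


class Sensor(MetaComponent):
    """Sensing: observes the state of the target component."""

    def sense(self) -> bool:
        return self.target.healthy


class Planner(MetaComponent):
    """Computation: turns an observation into an action, or None if no action is needed."""

    def decide(self, target_healthy: bool) -> Optional[str]:
        return None if target_healthy else "restart"


class Controller(MetaComponent):
    """Control: applies an action to the target component."""

    def apply(self, action: Optional[str]) -> bool:
        if action == "restart":
            self.target.healthy = True  # stand-in for a real restart
            return True
        return False


def adapt_once(sensor: Sensor, planner: Planner, controller: Controller) -> bool:
    """Run one sense -> compute -> control cycle; return True if an action was applied."""
    return controller.apply(planner.decide(sensor.sense()))
```

Because a Sensor is itself a Component, another Sensor can target it, for example Sensor("SensorWatcher", camera_sensor), which places the watcher at layer 2. Under these assumptions, that is how layering of arbitrary depth falls out of the design.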

The figure above shows a conceptual depiction of a system implemented using our approach, which demonstrates the power of multi-layer adaptive capabilities. At the robotics layer, application components implement a basic robot behavior: object following. The Camera Driver receives raw streaming video from a camera hardware device. The Object Follower interprets the video to locate the relative position of the object being followed. The Motor Actuator directs the robot motor to move in the required direction.

At the meta layer, meta-level components implement a basic adaptive behavior: reactive fault recovery. Reactive fault recovery detects when faults occur and then takes a mitigating action. The Fault Detector monitors the camera and reports failures to a Replacement Selector. The Replacement Selector determines the best replacement component for the camera based on adaptation policies. The Replacement Selector notifies the Replacement Deployer of the new component that is needed, and the Replacement Deployer instantiates the component.
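The reactive fault recovery chain described above can be sketched as follows. This is an illustrative sketch only: the method names, the representation of an adaptation policy as an ordered list of candidate replacements, and the component names used in the usage note are assumptions, not the system's actual interfaces.

```python
from typing import Dict, List, Optional, Set


class ReplacementDeployer:
    """Control: swaps a failed component for its replacement."""

    def __init__(self, running: Set[str]) -> None:
        self.running = running  # names of currently instantiated components

    def deploy(self, failed: str, replacement: str) -> None:
        self.running.discard(failed)
        self.running.add(replacement)


class ReplacementSelector:
    """Computation: picks the first policy-listed candidate that is available."""

    def __init__(self, policies: Dict[str, List[str]],
                 deployer: ReplacementDeployer) -> None:
        self.policies = policies  # assumed form: failed name -> ordered candidates
        self.deployer = deployer

    def on_failure(self, failed: str, available: Set[str]) -> Optional[str]:
        for candidate in self.policies.get(failed, []):
            if candidate in available:
                self.deployer.deploy(failed, candidate)
                return candidate
        return None  # no usable replacement; the fault stays unmitigated


class FaultDetector:
    """Sensing: reports a failure when the monitored component is not alive."""

    def __init__(self, monitored: str, selector: ReplacementSelector) -> None:
        self.monitored = monitored
        self.selector = selector

    def check(self, is_alive: bool, available: Set[str]) -> Optional[str]:
        if is_alive:
            return None
        return self.selector.on_failure(self.monitored, available)
```

For example, with a policy such as {"CameraDriver": ["BackupCameraDriver", "IRReceiverDriver"]} and only "IRReceiverDriver" available, a failed check on "CameraDriver" would deploy "IRReceiverDriver" in its place. Keeping the policy as data rather than code is one way to let the policy itself be changed at run time.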

At the meta2 layer, meta-level components implement an advanced adaptive behavior: fault tolerance strategy selection. Fault tolerance strategy selection permits the use of different fault tolerance mechanisms in different circumstances. For example, if reactive fault recovery results in unacceptable down-time during replacement selection and deployment, the active replication fault tolerance strategy can be used instead. Active replication ensures that multiple synchronized copies of a component are always available in case one fails. The Fault Recovery Monitor records the amount of system down-time after each fault. The Fault Recovery Evaluator determines whether the system availability is acceptable. If reactive fault recovery is not meeting availability requirements, the Fault Tolerance Deployer instantiates a new set of meta-layer components that implement active replication.
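A minimal sketch of this strategy-selection loop is given below. The text does not say how availability is computed, so the sketch assumes availability = 1 - (total down-time / observation window). The required-availability threshold, the strategy labels, and all method names are likewise hypothetical.

```python
from typing import List


class FaultRecoveryMonitor:
    """Sensing: records the down-time observed after each fault."""

    def __init__(self) -> None:
        self.downtimes: List[float] = []  # seconds of down-time per fault

    def record(self, seconds: float) -> None:
        self.downtimes.append(seconds)

    def total_downtime(self) -> float:
        return sum(self.downtimes)


class FaultRecoveryEvaluator:
    """Computation: compares measured availability with the requirement."""

    def __init__(self, required_availability: float) -> None:
        self.required_availability = required_availability

    def availability(self, monitor: FaultRecoveryMonitor,
                     window_seconds: float) -> float:
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        # Assumed definition: fraction of the window in which the system was up.
        return 1.0 - monitor.total_downtime() / window_seconds

    def is_acceptable(self, monitor: FaultRecoveryMonitor,
                      window_seconds: float) -> bool:
        return self.availability(monitor, window_seconds) >= self.required_availability


class FaultToleranceDeployer:
    """Control: replaces the meta-layer strategy when asked to."""

    def __init__(self) -> None:
        self.strategy = "reactive_fault_recovery"

    def deploy_active_replication(self) -> None:
        # Stand-in for instantiating a new set of meta-layer components.
        self.strategy = "active_replication"


def evaluate_and_adapt(monitor: FaultRecoveryMonitor,
                       evaluator: FaultRecoveryEvaluator,
                       deployer: FaultToleranceDeployer,
                       window_seconds: float) -> str:
    """Switch strategies if availability is unacceptable; return the active strategy."""
    if not evaluator.is_acceptable(monitor, window_seconds):
        deployer.deploy_active_replication()
    return deployer.strategy
```

With a required availability of 0.99 over a 1000-second window, 5 seconds of accumulated down-time would leave reactive fault recovery in place, while 15 seconds would trigger the switch to active replication.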

We showcase our approach using a team of autonomous mobile robots that engage in a leader-follower scenario and experience a wide variety of failures, activating distinct recovery mechanisms. 

The three robots in the video above travel in a convoy. The video begins when the tan robot is already dead, so the orange robot is waiting for the rest of the convoy to rejoin it. The last robot is initially following the tan robot using its webcam. When the last robot bumps into the tan robot, it steers around it, because it knows that the tan robot is dead. Once the last robot recognizes the orange robot, it starts following the orange robot. At this point the orange robot resumes following the line and leads the robot behind it. Later, the last robot's webcam is disconnected. As a result, the robot's software architecture is adapted, and it starts following the orange robot using its IR receiver. Notice that the orange robot has an IR signal emitter on its back.