Module2 - mtrl - Verification

Temporal Logic

It is good to have some idea of what temporal logic is and to be able to read and understand text containing linear temporal logic LTL. Therefore, take a look at

LTL Teaching slides of professor Alessandro Artale at the Free University of Bozen-Bolzano
Example on state-transition diagram Links to an external site.

Students with a good background on control should also have time to do the following (not needed to do the quiz on Verification)

install PRISM, a tool for probabilistic model checking, described below
do PRISM Example 1 (6-sided dice) described below

With spare time and interest you might also want to have a quick look at the paper

"Synthesis of Control Protocols for Autonomous Download Synthesis of Control Protocols for Autonomous
Systems Download Systems
" that surveys LTL, model verification and describes the tool TULIP, with some application to autonomous driving.

Model checking

Model checking is one of the branches of formal verification. The term refers to the act of checking (exhaustively and automatically) whether a property (specified as a formula) holds or not for a given model.

From Wikipedia: In order to solve such a problem algorithmically, both the model of the system and the specification are formulated in some precise mathematical language. To this end, the problem is formulated as a task in logic, namely to check whether a given structure satisfies a given logical formula. This general concept applies to many kinds of logics and suitable structures. A simple model checking problem is verifying whether a given formula in the propositional logic is satisfied by a given structure.

Model Checking Intro Links to an external site. (3:14min)
[Additional material] Slides from a Probabilistic Model Checker course (19 lectures, many hours of fun!) Links to an external site.

PRISM model checker

There are many model checkers. Among these, we have selected PRISM, a probabilistic model checker. PRISM allows modelling and analysis of systems that exhibit probabilistic behavior. It has been used to analyze systems from many different application domains Links to an external site., including communication and multimedia protocols, randomized distributed algorithms, security protocols, biological systems and many others.

Introduction to PRISM Links to an external site. (3:40min)

Additional materialModel Checking and Strategy Synthesis for Stochastic Games: From Theory to Practice Links to an external site. (42:39min)

PRISM Tutorial at the Marktoberdorf summer school, 2011 Links to an external site.

PRISM can be used to analyze several different types of probabilistic models, including discrete-time Markov chains (DTMC), continuous-time Markov chains (CTMC), Markov decision processes (MDP) and probabilistic extensions of the timed automata formalism (PTA). Properties to be verified against these models are expressed in probabilistic extensions of temporal logic.

Installation

Download the latest version of PRISM from the official website Links to an external site. (available for all major OS. But only installation on Ubuntu 16.04 has been verified by us.). When you unpack the downloaded file you will see a file structure that resembles the following. The following instructions assume that BASEDIR is the folder where you have unpacked your PRISM files.

|- bin
|- doc
|- etc
|- examples
|- lib
CHANGELOG.txt
COPYING.txt
install.sh
README.txt

Execute the install script with the following command. The install script only changes the environment variable to locate the PRISM directory.

cd BASEDIR /* directory where you have unpacked the files */
./install.sh

The bin folder contains the binary files for PRISM (prism and xprism). To execute prism move to the folder and execute the command.

cd BASEDIR /* directory where you have unpacked the files */
./bin/xprism

This will open the graphical interface.

Now you can write your first PRISM model. Open the tutorial Links to an external site. page and follow the examples that are shown on the website. In particular, study the following example:

Example 1 Links to an external site.: This uses a simple discrete-time Markov chain (DTMC) example - a randomised algorithm for modelling a 6-sided die with a fair coin. It introduces the basics of the PRISM modelling language and the PRISM tool.

Additional:

Example 2 Links to an external site.: This uses another simple randomised algorithm - Herman's self-stabilisation algorithm, which is also modelled as a DTMC. It introduces some additional features of the PRISM modelling and property languages.
Example 3 Links to an external site.: This introduces a continuous-time Markov chain (CTMC) example and is based on an analysis of dynamic power management systems in PRISM.

Guided larger example

We will now create one additional example. We want to model potential sensor failures in a control system. The control system is composed of n sensors, belonging to the set $S = {s 1, s 2, \dots s n}$ $\mathcal{S} = \lbrace s1, s2, \dots sn \rbrace$ and a controller. The controller is invoked at every time instant k and receives as input the vector $y (k)$ $\mathbf{y}(k)$ that contains the values read by the sensors , $y (k) = [y_{1} (k); y_{2} (k); \dots; y_{n} (k)]$ $\mathbf{y}(k) = [ y_1(k); y_2(k); \dots; y_n(k) ]$ . We assume that all the sensors are equal and that they have a failure probability $p_{f}$ $p_f$ . Moreover, we assume that at a given time instant, only one sensor can fail. We denote with $S_{s}$ $\mathcal{S}_s$ the set of sensors that are successfully producing values, and with $S_{f}$ $\mathcal{S}_f$ the set of sensors that are currently experiencing failures.

If the set of failing sensors is not empty, i.e., if $S_{f} \neq \emptyset$ $\mathcal{S}_f \neq \emptyset$ , at each time instant one sensor can be repaired. We denote with $p_{r}$ $p_r$ the probability that the chosen repair operation is successful. Finally, we assume that the controller can operate if and only if at least a fraction $p_{n}$ $p_n$ of the sensors are successfully producing data.

We want to model this situation with PRISM and take advantage of the probabilistic model checking functionalities to verify given properties on quantities like the percentage of correct invocations of the controller, given some parameter set ${n, p_{f}, p_{r}, p_{n}}$ $\lbrace n, p_f, p_r, p_n\rbrace$ . First we need to select the type of model. For our purposes, an MDP model is enough. We will then model separately the set of sensors and the controller. For these two entities we will use modules. The percentage of correct invocations of the controller is going to be computed in a reward structure. The code skeleton is as follows

mdp

// definition of general parameters
const int NUM_SENSORS = 15;
const double PRO_SENSOR_FAIL = 0.8;
const double PRO_SENSOR_REPAIR = 0.8;
const double PERCENTAGE_NEEDED_FOR_CORRECT_OPERATION = 0.5;

module sensors
endmodule

module controller
endmodule

rewards "percentage_correct"
endrewards

Now we have to fill in the description of the behavior of the two modules and of the reward function. We want to enforce a precedence between the action of the modules. First, we want the sensors module to determine how many sensors are currently working, based on the previous state of the system and on a probabilistic transition determining if one sensor is failing. Second, we want to use this information in the controller module to determine if the operation is carried out successfully or not.

To implement this logic we define a turn variable, that can have values in the interval [0..1] and with initial assigned value zero. The fact that turn is a global variable ensures that all the modules can modify it. We will encode that leaving the module sensors, the turn variable will be set to one, and leaving the module controller it will be set to zero. A global variable cannot be modified in labelled transitions.

// The global variable turn cycles between
//   - 0 (collect data from the sensors)
//   - 1 (invoke the controller execution)
global TURN: [0..1] init 0;

Now we have all we need to define the module sensors. We include a local variable s that contains the number of currently working sensors. This variable can assume values in the interval [0..NUM_SENSORS]. The value current_fail determines if there is a current failure or not. The value current_repair determines if one sensor is being repaired. The module will have three different transitions, executed one after the other. In the entry point, we determine (based on $p_{f}$ $p_f$ ) if there is a failing sensor. When this check is executed, we determine (based on $p_{r}$ $p_r$ ) if there is a repaired sensor. Finally, we update s and pass the turn to the controller.

module sensors

  // s contains the number of working sensors
  s : [0..NUM_SENSORS] init NUM_SENSORS;
  current_fail : [0..1] init 0;
  current_repair : [0..1] init 0;
  fail_check : bool init false; // done fail check?
  repair_check : bool init false; // done repair check?

  // check if a sensor is failing (probabilistic transition)
  // this does not check if all are sensors are in fail
  // state already (it could be easily done)
  // we implement the check using the max operator
  // in the last transition
  // HERE: it is my turn and
  //       I have not done fail and repair checks
  [] (TURN=0 & !fail_check & !repair_check) ->
         PRO_SENSOR_FAIL  : (current_fail' = 1) & (fail_check' = true) +
      (1-PRO_SENSOR_FAIL) : (current_fail' = 0) & (fail_check' = true);

  // check if a sensor is being repaired at this point in time
  // --> if possible to repair some (there is at least one that is in fail state)
  //     probabilistic transition
    // HERE: it is my turn, I have done the fail check, and
  //       I have not done the repair check
  [] (TURN=0 & fail_check & !repair_check & s<NUM_SENSORS) ->
         PRO_SENSOR_REPAIR  : (current_repair' = 1) & (repair_check' = true) +
      (1-PRO_SENSOR_REPAIR) : (current_repair' = 0) & (repair_check' = true);
  // --> if not possible to repair some (none is in fail state)
  [] (TURN=0 & fail_check & !repair_check & s=NUM_SENSORS) ->
         (current_repair' = 0) & (repair_check' = true);

  // update the number of working sensors and then pass the turn to the controller, notice the max operator (a "min" is
  // not needed, because the repair check is based on the
  // number of sensors in fail state)
  [] (TURN=0 & fail_check & repair_check) ->
         (s' = max(0, s - current_fail + current_repair)) & // set working sensors
         (fail_check' = false) & (repair_check' = false) & // reinitialize
         (TURN' = 1); // pass the turn to the controller

endmodule

Finally we can write the controller module. We want to label the transition with the correct operation and the transition with the incorrect operation to facilitate writing reward structures based on which transition is taken. Once we determine if the controller executed correctly, we can pass the turn to the sensors again.

module controller
  correct_invocations: int init 0;
  total_invocations: int init 0;
  decided_if_correct: bool init false;

  [correct] (TURN=1 & !decided_if_correct & s >= PERCENTAGE_NEEDED_FOR_CORRECT_OPERATION * NUM_SENSORS) ->
     (correct_invocations' = correct_invocations + 1) &
     (total_invocations' = total_invocations + 1) &
     (decided_if_correct' = true);

  [incorrect] (TURN=1 & !decided_if_correct & s < PERCENTAGE_NEEDED_FOR_CORRECT_OPERATION * NUM_SENSORS) ->
     (correct_invocations' = correct_invocations) &
     (total_invocations' = total_invocations + 1) &
     (decided_if_correct' = true);

  [] (TURN=1 & decided_if_correct) ->
     (decided_if_correct' = false) &
     (TURN'=0);

endmodule

Finally, we can write the reward structure for the percentage of correct invocations.

rewards "percentage_correct"
  true : correct_invocations/total_invocations;
endrewards

We could also write rewards structures using labelled transitions, like the following two. In this case we are counting the amount of correct invocations based on the transition that is taken, and the amount of total invocations rewarding both the correct and the incorrect transitions.

rewards "total_correct"
  [correct] true : 1;
endrewards

rewards "total_invocations"
  [correct] true : 1;
  [incorrect] true : 1;
endrewards

We can simulate the system now (the simulation determines one path and shows it to the user, it does not take advantage of the full model checking). The simulation that you see in the following pictures is obtained with n=15, $p_{f}$ $p_f$ =0.8, $p_{r}$ $p_r$ =0.7, and $p_{n}$ $p_n$ =0.5. The length of the simulation has been extended compared to the standard one.

We can then define properties to verify. One example is what is the minimum percentage of correct invocations that we can obtain in 100 invocations? We can then simulate or verify the property. If we simulate it, it means that we are running Monte Carlo simulations and determining bounds based on the results of the simulations. If we verify the property, we are using model checking to check exhaustively all the states (don't do it with this property, it will not terminate).

Property specification Download Property specification

Different computation engines can be chosen for verification purposes, see the PRISM manual for details. Depending on the type of model and on the amount of branches that the solver should traverse, some solvers are more efficient than others. Also, not all the solvers support all the types of properties.