Modeling Pitfalls

A successful verification is a prerequisite for having a correct model of state machines. But, is it also sufficient? The answer is: certainly not! The quote of Edsger W. Dijkstra: "Program testing can be used to show the presence of bugs, but never to show their absence!" applies to state machine verification as well. It is better to be honest about this fact. A vendor of a state machine verification tool should not mislead customers by stating that defect-free source code is generated after successful verification. The only thing that can be guaranteed about the generated source code refers to verified features such as correct triggers, visited states, deadlock, etc. However, still many flaws can be hidden in the model and hence in the code, like forgotten calls, wrong guards, swapped states. The examples below illustrate possible pitfalls when practicing state machine verification. A paper about this subject is available at Bits&Chips (in Dutch).

1. Door protocol

This example deals with a door (or a tray of a DVD player), which can be opened or closed prematurely. Figure 1.1 shows the Protocol State Machine (PSM) for the server role, and figure 1.2 the PSM for the client role.

ClientPsm.png ServerPsm.png
Figure 1.1 DoorProtocol PSM client role Figure 1.2 DoorProtocol PSM server role

The client may interrupt an activity in state Opening by calling Close() for closing the door, or in state Closing by calling Open() for opening the door. As a consequence the client PSM looks cluttered by the pending callbacks that have to be ignored. A verification tool such as TismTool helps you to find all these callbacks that must be modeled for having a successful verification. The model and the verification output can be downloaded from DoorProtocol.zip to convince you about the successful verification. But, do you think that the protocol is correct? After reading the above, I don't expect your answer to be affirmative. Just figure out what happens if the client invokes the 3 subsequent calls Open(), Close(), Open(). Receiving callback CbOpened() does not warrant that the door is open, since this callback can be the response from the first Open() call and not from the last Open() call. If the client receives another CbOpened() callback in state Opened, then it is pretty certain that a transition to this state happened too early. The same holds for state Closed. Relying on these states for subsequent actions could be disastrous when the state has wrongly been entered too early. A verification tool could help by raising a warning for this situation. It is relatively easy to recognize the pattern where a state is entered by a particular callback and such a callback is ignored in that state. Let’s consider now a solution for the design problem. Three possible solutions pop up:

  1. Ask confirmation from the server after an Open() / Close() call, and wait asynchronously for this confirmation. Obsolete callbacks can then be ignored in the waiting state.
  2. Cancel a request in state Opening / Closing when making the “opposite” request. Wait asynchronously for the cancel confirmation so that obsolete callbacks can be ignored.
  3. Use session management, which can also be applied for other situations such as cancelling a timer. This choice will be elaborated, refer to figures 1.3 and 1.4 showing the improved pair PSMs.
ClientCheckPsm.png ServerCheckPsm.png
Figure 1.3 Improved DoorProtocol PSM client role Figure 1.4 Improved DoorProtocol PSM server role

The client passes at each Open() / Close() call a request identification. The server returns this identification as a response identification in the corresponding callback function after finishing the involved activity. The client checks the received response identification, and only executes a transition if its last request identification matches the response identification. Now, the state Opened has no self-transitions of ignored callbacks any more. The CbClosed() callbacks are ignored in state Opening. And also the obsolete CbOpened() callbacks are ignored in state Opening after the identification comparison. Analogous remarks hold for state Closed. The improved protocol and its output can be donwloaded from DoorCheckProtocol.zip.

Conclusions:

2. Deadlock

This example addresses the mostly feared occurrence of a deadlock situation. Figure 2.1 shows the context diagram of a system consisting of 2 components, called Master and Controller. That system is constructed for being subject to deadlock with a minimum number of elements.

Deadlock Context Diagram
Figure 2.1 Context Diagram

The Master and the Controller communicate along 2 protocols, named FunctionProtocol and CallbackProtocol, via their respective client and server ports. The FunctionProtocol contains a single function Fn(), refer to figure 2.2 for the pair of protocol state machines (PSMs). Similarly, the CallbackProtocol contains a single callback function Cb(), refer to figure 2.3 for the pair of PSMs.

FunctionProtocol PSM client role CallbackProtocol PSM client role
Figure 2.2a FunctionProtocol PSM client role Figure 2.3a CallbackProtocol PSM client role
FunctionProtocol PSM server role CallbackProtocol PSM server role
Figure 2.2b FunctionProtocol PSM server role Figure 2.3b CallbackProtocol PSM server role

The Master and the Controller have to obey these protocols in their design state machines. The Master intents to invoke the call Fn() via its client port CpFunction, so that the function Fn() triggers the Controller at its server port SpFunction. Analogously, the Controller will invoke the callback function Cb() via its server port SpCallback, so that this callback triggers the Master at its client port CpCallback.

The design state machines of Master and Controller are kept as small as possible. They contain only one (real) state, refer to figure 2.4 and figure 2.5. It is obvious that the system with these state machines starts in a deadlock situation. Both components wait for each other's trigger.

Master state machine Controller state machine
Figure 2.4 Master state machine Figure 2.5 Controller state machine

What is so remarkable about this tiny example? The answer will become clear after verifying the 2 individual components of the system. Consider the verification environment for the Controller component in figure 2.6.

Controller verification context
Figure 2.6 Controller verification context

It consists of an instance of the Controller component, its server ports and the 2 mocked client ports. All communication passes the 2 client ports, and the verification algorithm has full responsibility over these mocked ports. A NativeTrigger (see Figure 2.2a) takes care that function Fn() is activated at port CpFunction. This results in a call of Fn() at server port SpFunction. According to the design state machine the callback function Cb() will be invoked at server port SpCallback, and hence at client port CpCallback. In fact, that is all possible interaction. Verification of this environment will be successful due to the independence of the 2 mocked client ports. A bounded callback queue will prevent an unlimited number of Fn() calls. In any case there will be no deadlock. In the same way the Master component can be verified. Also now there will be no deadlock thanks to the independent mocked server ports.

So, this example reveals that the system can suffer from deadlock even when its 2 components have been verified successfully and declared to be free from deadlock.

A simple system like the one above can be easily analysed manually. However, tool support is really necessary for investigating a set of components with concurrent (complex) state machines. Of course, the applied verification tool must then be able to perform integral verification of multiple concurrent state machines. The above modeled system and its verification output can be downloaded from Deadlock.zip.

Conclusions: