Tips on FSA in Prolog
In the first part of the lab, we will see how to implement FSAs in Prolog. This is actually a misleading way to describe what we are going to do. For, although we have been talking about FSAs as machines, we are going to treat them as passive data structures that are manipulated by other programs.
We will use three predicates to represent FSAs:
* initial/1
* final/1
* arc/3
initial(1), for instance, says that 1 is an initial state, final(4) says that 4 is a final state, and arc(1,2,h) says that there is a h transition from state 1 to state 2. We will, furthermore, use the atom '#' to mark jump arcs.
Now, that we know how to represent FSAs, we would of course like to do something with them; i.e. we want to use them to generate and recognize strings. That is, we need programs to manipulate those FSA representations. Those programs should be general enough, so that we don't have to know anything about the structure of a certain FSA before using it - Prolog helps us a lot with this point, because it's built-in backtracking mechanism provides us with the search tool that we need. Furthermore, Prolog is so declarative, that one and the same program can (up to a point) work as both a recognizer and a generator.
We will define the predicate recognize1/2 which takes the number of the node you want to start from as first argument and a list of symbols representing the string that you want to recognize as second argument. The query recognize1(Node,SymbolList) should succeed, if the list of symbols SymbolList can be recognized by the FSA in Prolog's database starting from node Node and ending in a final state.
We will define recognize1/2 as a recursive predicate, which first tries to find a transition from state Node to some other state reading the first symbol of the list SymbolList and then calls itself with the node it can reach through this transition and the tail of SymbolList.
In the base case, recognize1/2 is called with an empty list, i.e. the whole input has been read. In this case, it succeeds if the Node is a final state:
recognize1(Node,[]) :-
final(Node).
In the case where SymbolsList is not empty, we first retrieve an arc starting from Node from the database. Then we take this transition, thereby reading a symbol of the input, and recursively call recognize1/2 again.
recognize1(Node1,String) :-
arc(Node1,Node2,Label),
traverse1(Label,String,NewString),
recognize1(Node2,NewString).
The predicate traverse1/3 checks that we can indeed take this transition, i.e. that the label of the arc is the same as the first symbol of the input list, and returns the input without the symbol that we have just read.
traverse1(Label,[Label|Symbols],Symbols).
Here is the whole program:
recognize1(Node,[]) :-
final(Node).
recognize1(Node1,String) :-
arc(Node1,Node2,Label),
traverse1(Label,String,NewString),
recognize1(Node2,NewString).
traverse1(Label,[Label|Symbols],Symbols).
Now, if Prolog should ever retrieve an arc from the database that later turns out to be a bad choice because it doesn't lead to success, it will (automatically) backtrack and look for alternatives.
As promised, we can use recognize1/2 in two different modes. In the recognition mode, we want to give a list of symbols SymbolList and want to find out whether there is an initial node Node such that the query recognize1(Node,SymbolList) returns yes. Here is a driver predicate for the recognition mode:
test1(Symbols) :-
initial(Node),
recognize1(Node,Symbols).
In the generation mode, we want to get all lists of symbols which recognize1/2 can generate starting from some initial node. For this, we can just call test1/1 with an uninstantiated variable. test1/1 then selects an initial node and calls recognize1/2 with this node as first argument and an uninstantiated second argument.
generate1(X) :-
test1(X).