#+TITLE: Support for the Hanoi Omega Automata (HOA) Format #+SETUPFILE: setup.org #+HTML_LINK_UP: tools.html The [[http://adl.github.io/hoaf/][Hanoi Omega-Automa format]] is a textual representation of ω-automata labeled by Boolean formulas over a set of atomic propositions, and using an arbitrary acceptance condition. The typical acceptances conditions like Büchi, generalized-Büchi, co-Büchi, Rabin, Streett, parity, ... are all supported, but the main advantage of this format is that any arbitrary acceptance condition can be defined. The HOA format has support for many features such as non-determinism, alternation, multiple initial states, transition or state-based acceptance, named states, and a range of property flags making it possible to store additional information about the automaton. The HOA format is already supported in [[http://adl.github.io/hoaf/support.html][several tools]]. The goal of this page is to detail the support of this format in Spot. It contains some information that are useful to better understand the behavior of the tools distributed by Spot, and it also look at some lower-level, discussing details that are interesting when programming with Spot. * Format, files, and TωA Some note about the abbreviation first. We usually write "HOA format" or occasionally HOAF to denote the format (as a specification), and HOA or "HOA file" to denote an automaton in that format. In most examples involving HOA files, we use =*.hoa= as a filename extension (even if the actual extension does not matter). When an HOA file is loaded by Spot, it is stored into the data-structure used by Spot to represent ω-Automata. This structure is called Transition-based ω-Automaton, henceforth abbreviated TωA. Such a TωA can be saved back as an HOA file. If you run a command such as =autfilt -H input.hoa >output.hoa= this is exactly what happens: the file =input.hoa= is parsed to create a TωA, and this TωA is then printed in the HOA format into =output.hoa=. Since the TωA structure is not a perfect one-to-one representation of the HOA format, the output may not be exactly the same as the input. * Features of the HOA format with no or limited support in Spot - Alternating automata are not supported. Only nondeterministic (in the broad sense, meaning "not deterministic" or "deterministic") automata are currently supported by Spot. - The maximum number of acceptance sets used is (currently) limited to 32. This limit is not very hard to increase in the source code, however we want to keep it until it becomes an actual problem. So please report to us if you suffer from it. In the past, this limitation has forced us to improve some of our algorithms to be less wasteful and not introduce useless acceptance sets. - Multiple (or missing) initial states are emulated. The internal TωA representation used by Spot supports only a single initial state. When an HOA with multiple initial states is read, it is transformed into an equivalent TωA by merging the initial states into a single one. The merged state can either be one of the original initial states (if one of those has no incoming edge) or a new state introduced for that purpose. Similarly, when an automaton with no initial state is loaded (this includes the case where the automaton has no state), a disconnected initial state is added. As a consequence, Spot's HOA output always contains at least one state, even when the input had no state. - =Fin(!x)= and =Inf(!x)= are rewritten away. Internally Spot only deals with acceptance conditions involving the primitives =Fin(x)= or =Inf(x)=. When the parser encounters the variants =Fin(!x)= or =Inf(!x)=, it automatically complements the set =x= so that the resulting acceptance uses only =Fin(x)= and =Inf(x)=. For instance =Fin(0)&Inf(!1)= gets rewritten into =Fin(0)&Inf(1)= and the membership of all transitions to the set =1= is reversed. If =x= was already used without complementation in another primitive, then a new set has to be created. For instance the acceptance =Inf(0)&Inf(!0)= can only be fixed by adding a new set, =1=, that stores the complement of set =0=, and using =Inf(0)&Inf(1)=. * Internal representations of some features In this section we discuss features of the format that are fully supported, but in a way that so people could find unexpected. These design choices do not affect the semantics of the HOA format in any way. ** State-based vs. transition-based acceptance A Transition-based ω-Automaton (TωA), as its name implies, uses transition-based acceptance sets. Each edge is stored as a quadruplet $(s,d,\ell,F)$ where $s$ and $d$ are the source and destination state numbers, $\ell$ is a Binary Decision Diagram (BDD) representing the Boolean function labeling the edge, and $F$ is a bit-vector representing the membership of the transition to each declared acceptance set. States are just numbers, and may not belong to any accepting set. When reading a HOA file that use state-based acceptance (or even a mix of state-based and transitions-based acceptance), all the acceptance are pushed onto the outgoing transitions. So an automaton represented as an HOA file with this transition structure: #+BEGIN_SRC sh :results silent :exports results cat >stvstracc.hoa <sba.hoa <1$) or =Buchi= (corresponding to $n=1$) or =all= (corresponding to $n=0$). The use of =Buchi= or =all= instead of =generalized-Buchi n= follow the same idea as our use of state-based acceptance whenever possible. By using the name of these inferior acceptance conditions, we hope that the resulting automaton can be easier to use with tools that only deal with such inferior acceptance conditions. However, unlike for state vs. transition-based acceptance, there is currently no means to request another acceptance name to be used. The [[http://adl.github.io/hoaf/#canonical-acceptance-specifications-for-classical-conditions][canonical encodings for acceptance conditions]] are specified quite strictly in the HOA format. For instance =generalized-Buchi 2= corresponds to =Inf(0)&Inf(1)=, not to =Inf(1)&Inf(0)=, even though the two formulas are equivalent. Spot's HOA output routine contains some limited form of equivalence check (based mostly on associativity and commutativity of the Boolean operators), so that if it detects such a simple inversion, it will output it in the order required to be allowed to name the acceptance condition. In the following example, you can see =autfilt= removing the duplicate Rabin pair, and reordering the remaining pair to fit the syntax corresponding to =Rabin 1=. #+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa autfilt -H <stvstrlab.hoa <prop_deterministic()= because that only checks the property bit, and it might be false even if the =aut= is deterministic. Instead, call the function =is_deterministic(aut)=. This function will first test the property bit, and do the actual check if it has to. Algorithms that update a TωA should call the method =prop_keep()= and use the argument to specify which of the properties they preserve. Algorithms that input a TωA and output a new one may call the method =prop_copy()= to copy over the subset of properties they preserve. Using these two functions ensure that in the future, whenever a new property is added to the TωA class, we cannot forget to update all the calls =prop_copy()= or =prop_keep()= (because these functions will take a new argument). The =HOA= printer also tries to not bloat the output with many redundant and useless properties. For instance =deterministic= automata are necessarily =unambiguous=, and people interested in unambiguous automata know that, so Spot only output the =unambiguous= property if an unambiguous automaton is non-deterministic. Similarly, while Spot only outputs non-alternating automata, it does not output the =no-univ-branch= property because we cannot think of a situation where this would be useful. This decision can be overridden by passing the =-Hv= (or =--hoa=v=) option to the command-line tools: this requests "verbose" properties. The following table summarizes how supported properties are handled. In particular: - for the parser =checked= means that the property is always inferred and checked against any declaration (if present), =trusted= means that the property will be stored without being checked (unless =--trust-hoa=no= is specified). - Stored properties are those represented as bits in the automaton. - the printer will sometime check some properties when it can do it as part of its initial "survey scan" of the automaton; in that case the stored property is not used. This makes it possible to detect deterministic automata that have been output by algorithms that do not try to output deterministic automata. | property | parser | stored | printer | notes | |---------------------+---------+--------+---------------------------------------------+------------------------------------------------------------------| | =state-labels= | checked | no | checked if =-Hk= | state labels are converted to transition labels when reading TωA | | =trans-labels= | checked | no | always, unless =-Hi= or =-Hk= | | | =implicit-labels= | checked | no | if =-Hi= | =-Hi= only works for deterministic automata | | =explicit-labels= | checked | no | always, unless =-Hi= | | | =state-acc= | checked | yes | checked, unless =-Ht= or =-Hm= | | | =trans-acc= | checked | no | if not =state-acc= and not =-Hm= | | | =no-univ-branch= | ignored | no | only if =-Hv= | | | =deterministic= | checked | yes | checked | | | =complete= | checked | no | checked | | | =unambiguous= | trusted | yes | as stored if (=-Hv= or not =deterministic=) | can be checked with =--check=unambiguous= | | =stutter-invariant= | trusted | yes | as stored | can be checked with =--check=stuttering= | | =stutter-sensitive= | trusted | yes | as stored | can be checked with =--check=stuttering= | | =terminal= | trusted | yes | as stored | can be checked with =--check=strength= | | =weak= | trusted | yes | as stored if (=-Hv= or not =terminal=) | can be checked with =--check=strength= | | =inherently-weak= | trusted | yes | as stored if (=-Hv= or not =weak=) | can be checked with =--check=strength= | | =colored= | ignored | no | checked | | ** Named properties In addition to the bit properties discussed above, a TωA can carry named properties of any type. When attaching a property to a TωA, you only supply a name for the property, a pointer, and an optional destructor function. They are currently two named properties related to the HOA format. - =name= :: Is a string that stores the name of the automaton (the one given after =name:= in the HOA format) - =state-names= :: is a vector of strings that stores the name of the states (in case states are named in the HOA format) You can see these properties being preserved when an automaton is read and then immediately output: #+NAME: hello-world #+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa cat >hw.hoa <