#+TITLE: Support for the Hanoi Omega Automata (HOA) Format
#+SETUPFILE: setup.org
#+HTML_LINK_UP: tools.html


The [[http://adl.github.io/hoaf/][Hanoi Omega-Automa format]] is a textual representation of
ω-automata labeled by Boolean formulas over a set of atomic
propositions, and using an arbitrary acceptance condition. The
typical acceptances conditions like Büchi, generalized-Büchi,
co-Büchi, Rabin, Streett, parity, ... are all supported, but the main
advantage of this format is that any arbitrary acceptance condition
can be defined.  The HOA format has support for many features such as
non-determinism, alternation, multiple initial states, transition or
state-based acceptance, named states, and a range of property flags
making it possible to store additional information about the
automaton.

The HOA format is already supported in [[http://adl.github.io/hoaf/support.html][several tools]].  The goal of
this page is to detail the support of this format in Spot.  It contains
some information that are useful to better understand the behavior
of the tools distributed by Spot, and it also look at some lower-level,
discussing details that are interesting when programming with Spot.

* Format, files, and TωA

Some note about the abbreviation first.  We usually write "HOA format"
or occasionally HOAF to denote the format (as a specification), and
HOA or "HOA file" to denote an automaton in that format.  In most
examples involving HOA files, we use =*.hoa= as a filename extension
(even if the actual extension does not matter).

When an HOA file is loaded by Spot, it is stored into the
data-structure used by Spot to represent ω-Automata.  This structure
is called Transition-based ω-Automaton, henceforth abbreviated TωA.
Such a TωA can be saved back as an HOA file.  If you run a command
such as =autfilt -H input.hoa >output.hoa= this is exactly what
happens: the file =input.hoa= is parsed to create a TωA, and this TωA
is then printed in the HOA format into =output.hoa=.

Since the TωA structure is not a perfect one-to-one representation of
the HOA format, the output may not be exactly the same as the input.

* Features of the HOA format with no or limited support in Spot

- Alternating automata are not supported.

  Only nondeterministic (in the broad sense, meaning "not
  deterministic" or "deterministic") automata are currently supported
  by Spot.

- The maximum number of acceptance sets used is (currently) limited
  to 32.

  This limit is not very hard to increase in the source code,
  however we want to keep it until it becomes an actual problem.  So
  please report to us if you suffer from it.  In the past, this
  limitation has forced us to improve some of our algorithms to be
  less wasteful and not introduce useless acceptance sets.

- Multiple (or missing) initial states are emulated.

  The internal TωA representation used by Spot supports only a single
  initial state.  When an HOA with multiple initial states is read, it
  is transformed into an equivalent TωA by merging the initial states
  into a single one.  The merged state can either be one of the
  original initial states (if one of those has no incoming edge) or a
  new state introduced for that purpose.

  Similarly, when an automaton with no initial state is loaded (this
  includes the case where the automaton has no state), a disconnected
  initial state is added.  As a consequence, Spot's HOA output always
  contains at least one state, even when the input had no state.

- =Fin(!x)= and =Inf(!x)= are rewritten away.

  Internally Spot only deals with acceptance conditions involving the
  primitives =Fin(x)= or =Inf(x)=.  When the parser encounters the
  variants =Fin(!x)= or =Inf(!x)=, it automatically complements the
  set =x= so that the resulting acceptance uses only =Fin(x)= and
  =Inf(x)=.  For instance =Fin(0)&Inf(!1)= gets rewritten into
  =Fin(0)&Inf(1)= and the membership of all transitions to the set =1=
  is reversed.

  If =x= was already used without complementation in another
  primitive, then a new set has to be created.  For instance the
  acceptance =Inf(0)&Inf(!0)= can only be fixed by adding a new set,
  =1=, that stores the complement of set =0=, and using
  =Inf(0)&Inf(1)=.

* Internal representations of some features

In this section we discuss features of the format that are fully
supported, but in a way that so people could find unexpected.  These
design choices do not affect the semantics of the HOA format in any
way.

** State-based vs. transition-based acceptance

A Transition-based ω-Automaton (TωA), as its name implies, uses
transition-based acceptance sets.  Each edge is stored as a
quadruplet $(s,d,\ell,F)$ where $s$ and $d$ are the source and
destination state numbers, $\ell$ is a Binary Decision Diagram (BDD)
representing the Boolean function labeling the edge, and $F$ is a
bit-vector representing the membership of the transition to each
declared acceptance set.

States are just numbers, and may not belong to any accepting set.
When reading a HOA file that use state-based acceptance (or even
a mix of state-based and transitions-based acceptance), all the
acceptance are pushed onto the outgoing transitions.

So an automaton represented as an HOA file with this transition
structure:

#+BEGIN_SRC sh :results silent :exports results
cat >stvstracc.hoa <<EOF
HOA: v1
States: 3
Start: 0
AP: 2 "p0" "p1"
Acceptance: 2 Inf(0)&Inf(1)
--BODY--
/* state-based acceptance */
State: 0 {0 1}
[0&!1] 0
[0&1] 1
[!0] 2
/* mixed state- and transition-based acceptance */
State: 1 {0}
[0] 1 {1}
[0&1] 2
/* transition-based acceptance */
State: 2
[!0] 1 {0}
[0]  2 {1}
--END--
EOF
#+END_SRC

#+RESULTS:

#+BEGIN_SRC sh :results verbatim :exports results :wrap SRC hoa
sed -n '/--BODY/,/--END/p' stvstracc.hoa | grep -v -- --
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
/* state-based acceptance */
State: 0 {0 1}
[0&!1] 0
[0&1] 1
[!0] 2
/* mixed state- and transition-based acceptance */
State: 1 {0}
[0] 1 {1}
[0&1] 2
/* transition-based acceptance */
State: 2
[!0] 1 {0}
[0]  2 {1}
#+END_SRC

will always be stored as a TωA with this transition structure:

#+BEGIN_SRC sh :results verbatim :exports results :wrap SRC hoa
autfilt -Ht stvstracc.hoa | sed -n '/--BODY/,/--END/p' | grep -v -- --
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
State: 0
[0&!1] 0 {0 1}
[0&1] 1 {0 1}
[!0] 2 {0 1}
State: 1
[0] 1 {0 1}
[0&1] 2 {0}
State: 2
[!0] 1 {0}
[0] 2 {1}
#+END_SRC

#+BEGIN_SRC sh :results silent :exports results
rm -f stvstracc.hoa
#+END_SRC

Even if an input HOA file uses only state-based acceptance, Spot
internally stores it using transition-based acceptance.  However in
that case the TωA will have a property bit indicating that it actually
represents an automaton with the "state-based acceptance" property:
this implies that transitions leaving one state all belong to the same
acceptance sets.  A couple of algorithms in Spot checks for this
property, and enable specialized treatments of state-based automata.

Furthermore, even if an automaton does not have the "state-based
acceptance" property bit set, the HOA output routine may detect that
the automaton satisfies this property.  In that case, it output the
automaton with state-based acceptance.

For instance in the following automaton, the outgoing transitions of
each states belong to the same sets:

#+NAME: state-based-example
#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
cat >sba.hoa <<EOF_HOA
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
Acceptance: 2 Inf(0)&Inf(1)
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1
[0] 1 {1}
[0&1] 2 {1}
State: 2
[!0] 1 {0 1}
[0]  2 {0 1}
--END--
EOF_HOA
autfilt -H sba.hoa
#+END_SRC

so the HOA output of =autfilt= automatically uses state-based acceptance:

#+RESULTS: state-based-example
#+BEGIN_SRC hoa
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: generalized-Buchi 2
Acceptance: 2 Inf(0)&Inf(1)
properties: trans-labels explicit-labels state-acc
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1 {1}
[0] 1
[0&1] 2
State: 2 {0 1}
[!0] 1
[0] 2
--END--
#+END_SRC

The rational for this automatic switch to state-based acceptance is as follows:
- Tools that support transition-based acceptance can easily see
  state-based acceptance as syntactic sugar, so they should be
  able to process state-based or transition-based acceptance
  indifferently.
- Tools that support only state-based acceptance, cannot easily
  process automata with transition-based acceptance.  So by using
  state-based acceptance whenever possible, we are making these
  automata compatible with a larger number of tools.
- Using state-based acceptance is slightly more space efficient,
  because there is less redundancy in the output file.

Nevertheless, should you really insist on having an output with
transition-based acceptance, you can do so by passing the option =t=
to the HOA printer:

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
autfilt -Ht sba.hoa
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: generalized-Buchi 2
Acceptance: 2 Inf(0)&Inf(1)
properties: trans-labels explicit-labels trans-acc
--BODY--
State: 0
[0&!1] 0 {0}
[0&1] 1 {0}
[!0] 2 {0}
State: 1
[0] 1 {1}
[0&1] 2 {1}
State: 2
[!0] 1 {0 1}
[0] 2 {0 1}
--END--
#+END_SRC

#+BEGIN_SRC sh :results silent :exports results
rm sba.hoa
#+END_SRC

By default, the output uses either state-based acceptance, or
transition-based acceptance.  However there is no restriction in the
format to prevents mixing the two: if you use =-Hm=, the decision of
using state or transition-based acceptance will be made for each state
separately.  For instance:

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
ltl2tgba -Hm 'GFa | Fb'
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
name: "F(b | GFa)"
States: 3
Start: 1
AP: 2 "b" "a"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels complete stutter-invariant
--BODY--
State: 0 {0}
[t] 0
State: 1
[0] 0
[!0] 1
[!0&1] 2
State: 2
[1] 2 {0}
[!1] 2
--END--
#+END_SRC


So far we have discussed transforming state-based acceptance into
transition-based acceptance (this can be seen as removing syntactic
sugar), and representing transition-based acceptance into state-based
acceptance when this is possible (adding syntactic sugar) to do so
without adding states.

It is also possible to transform automata with transition-based
acceptance into automata with state-based acceptance, adding states
when necessary.  Most tools have a =-S= option (or
=--state-based-acceptance=) for this purpose.  Compare the following
output with the previous one.

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
ltl2tgba -S -Hm 'GFa | Fb'
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
name: "F(b | GFa)"
States: 4
Start: 0
AP: 2 "b" "a"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels complete stutter-invariant
--BODY--
State: 0
[0] 1
[!0] 0
[!0&1] 2
State: 1 {0}
[t] 1
State: 2 {0}
[1] 2
[!1] 3
State: 3
[1] 2
[!1] 3
--END--
#+END_SRC

** Generic acceptance

Currently, Spot's parser for HOA ignores the optional =acc-name:=
line, and only uses the mandatory =Acceptance:= line.  As explained
earlier, if this line contains primitives of the form =Inf(!x)= or
=Fin(!x)= these are rewritten away, because internally Spot only works
with primitives of the form =Inf(x)= or =Fin(x)=.  This also means
that Spot will never produce an acceptance condition containing
=Fin(!x)= or =Inf(!x)=.

Whenever an HOA file is output, Spot attempts to recognize the
acceptance condition to give it a suitable =acc-name:= (even if Spot
does not use this line, it is useful to tools only deal with one
specific acceptance condition and that do not want to parse the
=Acceptance:= line).  However the HOA output routine has no idea of
what type of automata you are trying to output: it is only looking at
the acceptance condition and trying to name it as precisely as
possible.  This could be a problem when a given condition accepts
multiple names.

For instance according to the [[http://adl.github.io/hoaf/#canonical-acceptance-specifications-for-classical-conditions][canonical encodings specified by the HOA
format]] the condition =Inf(0)= could be called =Buchi=, or
=generalized-Buchi 1=, or (why not?) =parity min even 1= or =parity
max even 1=.  Spot will always call this acceptance condition =Buchi=.

Similarly the acceptance condition =t= is always called =all= (not
=generalized-Buchi 0= or =Rabin 0=, etc.), and while =f= is always
named =none=.


One of the consequence is that when you run =ltl2tgba= with its
default settings (which are to produce automata with transition-based
generalized Büchi acceptance) you actually obtain an output that:
- has an =Acceptance:= line that is a conjunction of =Inf(x)= primitives (or =t=), because
  that is what generalized Büchi is;
- has an =acc-name:= line that can be either =generalized-Buchi n=
  (for $n>1$) or =Buchi= (corresponding to $n=1$) or =all= (corresponding to $n=0$).

The use of =Buchi= or =all= instead of =generalized-Buchi n= follow
the same idea as our use of state-based acceptance whenever possible.
By using the name of these inferior acceptance conditions, we hope
that the resulting automaton can be easier to use with tools that only
deal with such inferior acceptance conditions.  However, unlike for
state vs. transition-based acceptance, there is currently no means to
request another acceptance name to be used.

The [[http://adl.github.io/hoaf/#canonical-acceptance-specifications-for-classical-conditions][canonical encodings for acceptance conditions]] are specified quite
strictly in the HOA format.  For instance =generalized-Buchi 2=
corresponds to =Inf(0)&Inf(1)=, not to =Inf(1)&Inf(0)=, even though
the two formulas are equivalent.  Spot's HOA output routine contains
some limited form of equivalence check (based mostly on associativity
and commutativity of the Boolean operators), so that if it detects
such a simple inversion, it will output it in the order required to be
allowed to name the acceptance condition.

In the following example, you can see =autfilt= removing the duplicate
Rabin pair, and reordering the remaining pair to fit the syntax
corresponding to =Rabin 1=.

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
autfilt -H <<EOF
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
Acceptance: 2 (Inf(1) & Fin(0)) | (Inf(1) & Fin(0))
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1
[0] 1 {1}
[0&1] 2 {1}
State: 2
[!0] 1 {0 1}
[0]  2 {0 1}
--END--
EOF
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: Rabin 1
Acceptance: 2 Fin(0) & Inf(1)
properties: trans-labels explicit-labels state-acc
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1 {1}
[0] 1
[0&1] 2
State: 2 {0 1}
[!0] 1
[0] 2
--END--
#+END_SRC

Internally, the acceptance condition is stored as an array in reverse
polish notation, and the primitives =Inf= and =Fin= are actually
parametered by bitsets representing multiple sets numbers.  For
instance the generalized Büchi acceptance
=Inf(0)&Inf(1)&Inf(2)&Inf(3)= is actually stored as a single term
=Inf({0,1,2,3})=.  Similarly, =Fin({1,3,5})= is our internal encoding
for =Fin(1)|Fin(3)|Fin(5)=.

A more complex acceptance condition, such as
=(Fin(0)&Inf(1))|(Fin(2)&Inf(3)&Inf(4))|Fin(5)= (a generalized-Rabin
acceptance), would be encoded as the following 8-element array.

: Fin({5}) Inf({3,4}) Fin({2}) 2& Inf({1}) Fin(0) 2& 3|

This has to be read as a [[http://en.wikipedia.org/wiki/Reverse_Polish_notation][reverse Polish notation]] where the numbers in
front of the operators =&= and =|= indicate the number of arguments
they consume (these operators are n-ary).

When you look at an acceptance condition output by Spot, you can
actually spot the terms that have been grouped together internally by
looking at the spacing around operators =&= and =|=.  For instance:

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
randaut -A"Fin(0)|Fin(1)|Fin(2)&Fin(3)&Inf(4)&Inf(5)" -H 0 | grep Acceptance:
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
Acceptance: 6 (Fin(0)|Fin(1)) | (Fin(2) & Fin(3) & (Inf(4)&Inf(5)))
#+END_SRC

Here =Fin(0)|Fin(1)= is actually a single internal term =Fin({0,1})=,
and likewise for =Inf(4)&Inf(5)=.

** State-based vs. transition-based labels

State labels are handled in the same way as state-based acceptance:
Spot store labels on transitions internally, so if an input automaton
has state labels, those are pushed to all outgoing transitions.

For instance an automaton declared in some HOA file with this body:

#+BEGIN_SRC sh :results silent :exports results
cat >stvstrlab.hoa <<EOF
HOA: v1
States: 3
Start: 0
AP: 2 "p0" "p1"
Acceptance: 1 Inf(1)
--BODY--
State: [0&1] 0
0 1 2
State: [!0&1] 1 {0}
0 1
State: [!1] 2
2 1
--END--
EOF
#+END_SRC

#+RESULTS:

#+BEGIN_SRC sh :results verbatim :exports results :wrap SRC hoa
sed -n '/--BODY/,/--END/p' stvstrlab.hoa | grep -v -- --
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
State: [0&1] 0
0 1 2
State: [!0&1] 1 {0}
0 1
State: [!1] 2
2 1
#+END_SRC

will always be stored as an automaton with the following transition
structure


#+BEGIN_SRC sh :results verbatim :exports results :wrap SRC hoa
autfilt -Ht stvstrlab.hoa | sed -n '/--BODY/,/--END/p' | grep -v -- --
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
State: 0
[0&1] 0
[0&1] 1
[0&1] 2
State: 1
[!0&1] 0 {0}
[!0&1] 1 {0}
State: 2
[!1] 2
[!1] 1
#+END_SRC

#+BEGIN_SRC sh :results silent :exports results
rm -f stvstrlab.hoa
#+END_SRC

The HOA printer has an option to output automata using state-based
labels whenever that is possible.  The option is named =k= (i.e., use
=-Hk= with command-line tools) because it is useful when the HOA file
is used to describe a Kripke structure.

** Property bits

The =HOA= format supports a number of optional =property:= tokens.
These properties can be useful to speedup certain algorithms: for
instance it is easier to complement a deterministic automaton that is
known to be inherently weak.

Spot stores this properties that matters to its algorithms as
additional bits attached to each automaton.  Currently the HOA parser
ignores all the properties that are unused by Spot.

Some of the supported properties are double-checked when the automaton
is parsed; this is for instance the case of =deterministic=,
=state-based=.  The parser will in fact infer these properties from the
body of the file, and then return and error if what has been declared
does not correspond to the reality.

Some supported properties (like =weak=, =unambiguous=, or
=stutter-invariant=) are not double-checked, because that would
require too much additional operations.  Command-line tools that read
HOA files all take a =--trust-hoa=no= option to ignore properties that
are not double-checked by the parser.

It should be noted that these bits are only implications.  When such a
bit is false, it only means that it is unknown whether the property is
true.  For instance if in some algorithm you want to know whether an
automaton is deterministic (the equivalent of calling =autfilt -q
--is-deterministic aut.hoa= from the command-line), you should not
call the method =aut->prop_deterministic()= because that only checks
the property bit, and it might be false even if the =aut= is
deterministic.  Instead, call the function =is_deterministic(aut)=.
This function will first test the property bit, and do the actual
check if it has to.

Algorithms that update a TωA should call the method =prop_keep()= and
use the argument to specify which of the properties they preserve.
Algorithms that input a TωA and output a new one may call the method
=prop_copy()= to copy over the subset of properties they preserve.
Using these two functions ensure that in the future, whenever a new
property is added to the TωA class, we cannot forget to update all the
calls =prop_copy()= or =prop_keep()= (because these functions will
take a new argument).

The =HOA= printer also tries to not bloat the output with many
redundant and useless properties.  For instance =deterministic=
automata are necessarily =unambiguous=, and people interested in
unambiguous automata know that, so Spot only output the =unambiguous=
property if an unambiguous automaton is non-deterministic.  Similarly,
while Spot only outputs non-alternating automata, it does not output
the =no-univ-branch= property because we cannot think of a situation
where this would be useful.  This decision can be overridden by
passing the =-Hv= (or =--hoa=v=) option to the command-line tools:
this requests "verbose" properties.

The following table summarizes how supported properties are handled.  In
particular:
- for the parser =checked= means that the property is always inferred
  and checked against any declaration (if present), =trusted= means
  that the property will be stored without being checked (unless
  =--trust-hoa=no= is specified).
- Stored properties are those represented as bits in the automaton.
- the printer will sometime check some properties when it can do
  it as part of its initial "survey scan" of the automaton; in that
  case the stored property is not used.  This makes it possible
  to detect deterministic automata that have been output by algorithms
  that do not try to output deterministic automata.

| property            | parser  | stored | printer                                     | notes                                                            |
|---------------------+---------+--------+---------------------------------------------+------------------------------------------------------------------|
| =state-labels=      | checked | no     | checked if =-Hk=                            | state labels are converted to transition labels when reading TωA |
| =trans-labels=      | checked | no     | always, unless =-Hi= or =-Hk=               |                                                                  |
| =implicit-labels=   | checked | no     | if =-Hi=                                    | =-Hi= only works for deterministic automata                      |
| =explicit-labels=   | checked | no     | always, unless =-Hi=                        |                                                                  |
| =state-acc=         | checked | yes    | checked, unless =-Ht= or =-Hm=              |                                                                  |
| =trans-acc=         | checked | no     | if not =state-acc= and not =-Hm=            |                                                                  |
| =no-univ-branch=    | ignored | no     | only if =-Hv=                               |                                                                  |
| =deterministic=     | checked | yes    | checked                                     |                                                                  |
| =complete=          | checked | no     | checked                                     |                                                                  |
| =unambiguous=       | trusted | yes    | as stored if (=-Hv= or not =deterministic=) | can be checked with =--check=unambiguous=                        |
| =stutter-invariant= | trusted | yes    | as stored                                   | can be checked with =--check=stuttering=                         |
| =stutter-sensitive= | trusted | yes    | as stored                                   | can be checked with =--check=stuttering=                         |
| =terminal=          | trusted | yes    | as stored                                   | can be checked with =--check=strength=                           |
| =weak=              | trusted | yes    | as stored if (=-Hv= or not =terminal=)      | can be checked with =--check=strength=                           |
| =inherently-weak=   | trusted | yes    | as stored if (=-Hv= or not =weak=)          | can be checked with =--check=strength=                           |
| =colored=           | ignored | no     | checked                                     |                                                                  |

** Named properties

In addition to the bit properties discussed above, a TωA can carry
named properties of any type.  When attaching a property to a TωA, you
only supply a name for the property, a pointer, and an optional
destructor function.

They are currently two named properties related to the HOA format.

- =name= :: Is a string that stores the name of the automaton (the one given after =name:= in the HOA format)
- =state-names= :: is a vector of strings that stores the name of the states (in case states are named in the HOA format)

You can see these properties being preserved when an automaton is read and then immediately output:

#+NAME: hello-world
#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
cat >hw.hoa <<EOF
HOA: v1
name: "hello world!"
States: 3
Start: 0
AP: 2 "a" "b"
Acceptance: 2 Inf(0)&Inf(1)
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1 "I am a state"
[0] 1 {1}
[0&1] 2 {1}
State: 2 "so am I"
[!0] 1 {0 1}
[0]  2 {0 1}
--END--
EOF
autfilt -H hw.hoa
#+END_SRC

#+RESULTS: hello-world
#+BEGIN_SRC hoa
HOA: v1
name: "hello world!"
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: generalized-Buchi 2
Acceptance: 2 Inf(0)&Inf(1)
properties: trans-labels explicit-labels state-acc
--BODY--
State: 0 {0}
[0&!1] 0
[0&1] 1
[!0] 2
State: 1 "I am a state" {1}
[0] 1
[0&1] 2
State: 2 "so am I" {0 1}
[!0] 1
[0] 2
--END--
#+END_SRC

However if =autfilt= performs some transformation, and actually has to
construct a new automaton, those properties will not be quarried over
to the new automaton.  First because it is not obvious that the new
automaton should have the same name, and second because if a new
automaton is created, there might not be clear correspondence between
the old states and the new ones.


Here is for instance the result when =autfilt= is instructed to
simplify the automaton:

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
autfilt -H --small hw.hoa
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels state-acc deterministic
--BODY--
State: 0
[0&!1] 0
[0&1] 1
[!0] 2
State: 1
[0&!1] 1
[0&1] 2
State: 2 {0}
[!0] 1
[0] 2
--END--
#+END_SRC


Note that if the name of the automaton is important to you, it can be
fixed via the =--name= option.  For instance =--name=%M= will
construct the new name by simply copying the one of the original
automaton.

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
autfilt -H --small hw.hoa --name=%M
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
name: "hello world!"
States: 3
Start: 0
AP: 2 "a" "b"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels state-acc deterministic
--BODY--
State: 0
[0&!1] 0
[0&1] 1
[!0] 2
State: 1
[0&!1] 1
[0&1] 2
State: 2 {0}
[!0] 1
[0] 2
--END--
#+END_SRC

The page about [[file:oaut.org][common output option for automata]] has a section showing
how =--name= can be used to construct complex pipelines with automata that
preserve their equivalent LTL formula in the =name:= field.

* Streaming support

The HOA format has been designed to easily allow multiple automata to
be concatenated together (in the same file, or in a pipe) and
processed in batch.  Spot's parser supports this scenario and can be
called repeatedly to read the next automaton from the input stream.

For instance the following creates 3 formulas of the form $\bigwedge_i
\mathsf{G}\mathsf{F} p_i$, translates those into Büchi automata output
in the HOA format, and then read those automata with =autfilt= to
randomize the order of their transitions and states before printing
them in HOA format.

#+BEGIN_SRC sh :results verbatim :exports both :wrap SRC hoa
genltl --and-gf=1..3 | ltl2tgba -B -F- -H | autfilt --randomize -H
#+END_SRC

#+RESULTS:
#+BEGIN_SRC hoa
HOA: v1
name: "GFp1"
States: 2
Start: 1
AP: 1 "p1"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels state-acc complete
properties: deterministic stutter-invariant
--BODY--
State: 0
[0] 1
[!0] 0
State: 1 {0}
[!0] 0
[0] 1
--END--
HOA: v1
name: "G(Fp1 & Fp2)"
States: 3
Start: 1
AP: 2 "p1" "p2"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels state-acc complete
properties: deterministic stutter-invariant
--BODY--
State: 0
[!0&1] 2
[0&1] 1
[!1] 0
State: 1 {0}
[0&1] 1
[!0&1] 2
[!1] 0
State: 2
[!0] 2
[0] 1
--END--
HOA: v1
name: "G(Fp1 & Fp2 & Fp3)"
States: 4
Start: 1
AP: 3 "p1" "p2" "p3"
acc-name: Buchi
Acceptance: 1 Inf(0)
properties: trans-labels explicit-labels state-acc complete
properties: deterministic stutter-invariant
--BODY--
State: 0
[!2] 0
[!1&2] 2
[0&1&2] 1
[!0&1&2] 3
State: 1 {0}
[!0&1&2] 3
[!1&2] 2
[0&1&2] 1
[!2] 0
State: 2
[0&1] 1
[!0&1] 3
[!1] 2
State: 3
[0] 1
[!0] 3
--END--
#+END_SRC


It should be noted that the HOA parser is less efficient when it reads
from a pipe than when it reads from a file.  The reason is that if two
commands =A | B= exchange automata via a pipe, the command =A= may
require some time between the output of two automata, so if =B= does a
block read on its input to fill the parser's buffer, it might not be
able to process any automaton before =A= has produced enough automata
to fill the buffer.  To avoid this delay, whenever =B= detects that
the input is a pipe (or a terminal), it switches to an interactive
mode, where the input is read one character at a time.  This way an
automaton can be processed by =B= as soon as its =--END--= has been
received.

The HOA format has support for a =--ABORT--= token, that can be used
by tools that produce automata in a stream to cancel the current one.
This makes sense for instance when the automaton is constructed
on-the-fly, while it is being output.  This scenario does not occur in
Spot (automata are constructed before they are output), so it does not
emit =--ABORT--=.  However the input parser is fully aware of this
token.  Tools like =autfilt= will diagnose aborted automata in the
input, and continue processing with the next automaton.  The Python
bindings for the HOA parser can be configured in two modes: skip
aborted automata, or flag them as errors.

* Error recovery

The HOA parser does a fair amount of error recovery.  It is important
that when parsing a stream of automata, a syntax error in one
automaton does not invalidate the following automata (the parser
should at least be able to ignore everything up to =--END--= if it
cannot recover before).

Another scenario where syntax errors are more frequent is when an HOA
file is hand-edited.  For instance one could edit an HOA file to add a
few states and transitions, and forget to update the total number of
states in the format.  In that case the parser will diagnose the problem,
and fix the number of states.

* Checked properties

When an automaton is output in HOA format, the =property:= lines
includes property registered into the automaton (see the Property
bits section above), and property that are trivial to compute.

Command-line tools with a HOA output all have a =--check= option that
can be used to request additional checks such as testing whether the
automaton is stutter-invariant, unambiguous, (inherently) weak, and
terminal.