ltl2tgba: Add a --csv-escape option and document CSV I/O.

* src/bin/common_output.cc, src/bin/common_output.hh:
(output_formula_checked, aut_stat_printer): New.
* src/bin/genltl.cc, src/bin/randltl.cc, src/bin/ltlfilt.cc: Call
output_formula_checked() instead of output_formula().
* src/bin/ltl2tgba.cc: Use aut_stat_printer and add option --csv-escape.
* doc/org/csv.org: New file to document CSV I/O.
* doc/Makefile.am: Add it.
* doc/org/ioltl.org, doc/org/ltlfilt.org, doc/org/ltl2tgba.org,
doc/org/tools.org: Link to csv.org
This commit is contained in:
Alexandre Duret-Lutz 2013-11-29 18:46:02 +01:00
parent 0faea814da
commit 846e33b9e5
13 changed files with 399 additions and 39 deletions

View file

@ -60,6 +60,7 @@ ORG_FILES = \
org/.dir-locals.el \
org/init.el.in \
org/syntax.css \
org/csv.org \
org/dstar2tgba.org \
org/genltl.org \
org/ioltl.org \

238
doc/org/csv.org Normal file
View file

@ -0,0 +1,238 @@
#+TITLE: Reading and writing CSV files
#+EMAIL spot@lrde.epita.fr
#+OPTIONS: H:2 num:nil toc:t
#+LINK_UP: file:tools.html
This page discusses features available in Spot's command-line
tools to produce an consume CSV files.
* Producing CSV files
All the tools that normally produce formulas (like [[file:genltl.org][=genltl=]], [[file:randltl.org][=randltl=]],
and [[file:ltlfilt.org][=ltlfilt=]]) have a [[file:ioltl.org][=--format= option]]. That can be used to
customize the way output is formatted.
For instance here is how we could use =genltl= to generate a CSV file
with three columns: the family name of the formula, its parameter, and
the formula itself.
#+BEGIN_SRC sh :results verbatim :exports both
genltl --and-gf=1..5 --u-left=1..5 --format='%F,%L,%f' > gen.csv
cat gen.csv
#+END_SRC
#+RESULTS:
#+begin_example
and-gf,1,GFp1
and-gf,2,GFp1 & GFp2
and-gf,3,GFp1 & GFp2 & GFp3
and-gf,4,GFp1 & GFp2 & GFp3 & GFp4
and-gf,5,GFp1 & GFp2 & GFp3 & GFp4 & GFp5
u-left,1,p1
u-left,2,p1 U p2
u-left,3,(p1 U p2) U p3
u-left,4,((p1 U p2) U p3) U p4
u-left,5,(((p1 U p2) U p3) U p4) U p5
#+end_example
Tools that produce automata (like [[file:ltl2tgba.org][=ltl2tgba=]], or [[file:dstar2tgba.org][=dstar2tgba=]]) have a
=--stats= option that can be used to output various statistics about
the constructed automaton (these statistics are shown *instead* of
printing the automaton).
For instance, the following command will translate all the previous
formulas, and show the resulting number of states (=%s=) and edges
(=%e=) of the automaton constructed for each formula.
#+BEGIN_SRC sh :results verbatim :exports both
genltl --and-gf=1..5 --u-left=1..5 | ltl2tgba -F- --stats '%f,%s,%e'
#+END_SRC
#+RESULTS:
#+begin_example
GFp1,1,2
G(Fp1 & Fp2),1,4
G(Fp1 & Fp2 & Fp3),1,8
G(Fp1 & Fp2 & Fp3 & Fp4),1,16
G(Fp1 & Fp2 & Fp3 & Fp4 & Fp5),1,32
p1,2,2
p1 U p2,2,3
(p1 U p2) U p3,4,10
((p1 U p2) U p3) U p4,8,34
(((p1 U p2) U p3) U p4) U p5,16,116
#+end_example
If the translated formulas may contain commas, or double-quotes, this
simple output may prove difficult to process by other tools. For
instance consider the translation of the following two formulas:
#+BEGIN_SRC sh :results verbatim :exports both
ltl2tgba -f Xa -f 'G("switch == on" -> F"tab[3,5] < 12")' --stats '%f,%s,%e'
#+END_SRC
#+RESULTS:
: Xa,3,3
: G(!"switch == on" | F"tab[3,5] < 12"),2,4
The second line of this input does no conform to [[https://www.rfc-editor.org/rfc/rfc4180.txt][RFC 4180]] because
non-escaped fields are not allowed to contain comma or double-quotes.
To fix this, use =ltl2tgba='s =--csv-escape= option: this causes
"=%f=" to produce a double-quoted string properly escaped.
#+BEGIN_SRC sh :results verbatim :exports both
ltl2tgba -f Xa -f 'G("switch == on" -> F"tab[3,5] < 12")' --stats '%f,%s,%e' --csv-escape
#+END_SRC
#+RESULTS:
: "Xa",3,3
: "G(!""switch == on"" | F""tab[3,5] < 12"")",2,4
The tool [[file:ltlcross.org][=ltlcross=]] has its own =--csv=FILENAME= option to format the
statistics it gathers in a CSV file, but you have very little control
hover how this CSV file is formatted (it can only be changed
via option such as =--products= or =--omit-missing=).
* Reading CSV files
All the tools that read formulas from files extend the filename syntax
to support the specification of a CSV column. The notation
=filename/COL= denotes the column =COL= of that file.
For instance let's consider the file =gen.csv= built with the first command of
this page. It contains:
#+BEGIN_SRC sh :results verbatim :exports results
cat gen.csv
#+END_SRC
#+RESULTS:
#+begin_example
and-gf,1,GFp1
and-gf,2,GFp1 & GFp2
and-gf,3,GFp1 & GFp2 & GFp3
and-gf,4,GFp1 & GFp2 & GFp3 & GFp4
and-gf,5,GFp1 & GFp2 & GFp3 & GFp4 & GFp5
u-left,1,p1
u-left,2,p1 U p2
u-left,3,(p1 U p2) U p3
u-left,4,((p1 U p2) U p3) U p4
u-left,5,(((p1 U p2) U p3) U p4) U p5
#+end_example
We can run =ltl2tgba= on the third column to produce
the same output as in a previous example:
#+BEGIN_SRC sh :results verbatim :exports both
ltl2tgba -F gen.csv/3 --stats '%f,%s,%e'
#+END_SRC
#+RESULTS:
#+begin_example
GFp1,1,2
G(Fp1 & Fp2),1,4
G(Fp1 & Fp2 & Fp3),1,8
G(Fp1 & Fp2 & Fp3 & Fp4),1,16
G(Fp1 & Fp2 & Fp3 & Fp4 & Fp5),1,32
p1,2,2
p1 U p2,2,3
(p1 U p2) U p3,4,10
((p1 U p2) U p3) U p4,8,34
(((p1 U p2) U p3) U p4) U p5,16,116
#+end_example
When =ltlfilt= is used on a CSV file, it will preserve the
text before and after the matched formula in the CSV file.
For instance:
#+BEGIN_SRC sh :results verbatim :exports both
ltlfilt -F gen.csv/3 --size-min=8 --relabel=abc
#+END_SRC
#+RESULTS:
: and-gf,3,GFa & GFb & GFc
: and-gf,4,GFa & GFb & GFc & GFd
: and-gf,5,GFa & GFb & GFc & GFd & GFe
: u-left,5,(((a U b) U c) U d) U e
For security, in case a formula may contain double-quotes or
commas, you should use the =--csv-escape= option:
#+BEGIN_SRC sh :results verbatim :exports both
ltlfilt -F gen.csv/3 --size-min=8 --relabel=abc --csv-escape
#+END_SRC
#+RESULTS:
: and-gf,3,"GFa & GFb & GFc"
: and-gf,4,"GFa & GFb & GFc & GFd"
: and-gf,5,"GFa & GFb & GFc & GFd & GFe"
: u-left,5,"(((a U b) U c) U d) U e"
The preservation in the output of the text before and after the
selected column can be altered using the =--format= option. The =%<=
escape sequence represent the (comma-separated) data of all the
columns before the selected column, and =%>= is the same for the
trailing data. Note that the comma that separate formulas' column
from the other column are excluded and should be added in the format
string.
For instance this moves the first two columns after the formulas.
#+BEGIN_SRC sh :results verbatim :exports both
ltlfilt -F gen.csv/3 --size-min=8 --csv-escape --format='%f,%<'
#+END_SRC
#+RESULTS:
: "GFp1 & GFp2 & GFp3",and-gf,3
: "GFp1 & GFp2 & GFp3 & GFp4",and-gf,4
: "GFp1 & GFp2 & GFp3 & GFp4 & GFp5",and-gf,5
: "(((p1 U p2) U p3) U p4) U p5",u-left,5
Typical uses of =ltlfilt= on CSV file include:
- Filtering lines based on an LTL criterion, as above.
- Changing the syntax of LTL formulas. For instance =ltl2tgba='s
=--stats= option, and =ltlcross='s =--csv= option always output
formulas in Spot's format. If that is inappropriate, simply
use =ltlfilt= to rewrite the relevant column in your prefered
syntax.
* Dealing with header lines
Some CSV contain a header lines that should not be processed.
The CSV file produced by =ltlcross= have such a line:
#+BEGIN_SRC sh :results verbatim :exports both
randltl -n 2 a b | ltlfilt --remove-wm |
ltlcross --csv=results.csv 'ltl2tgba -s %f >%N' 'ltl3ba -f %s >%N'
cat results.csv
#+END_SRC
#+RESULTS:
: "formula","tool","exit_status","exit_code","time","states","edges","transitions","acc","scc","nonacc_scc","terminal_scc","weak_scc","strong_scc","nondet_states","nondet_aut","terminal_aut","weak_aut","strong_aut","product_states","product_transitions","product_scc"
: "(1)","ltl2tgba -s %f >%N","ok",0,0.0247303,1,1,1,1,1,0,1,0,0,0,0,1,0,0,200,3994,1
: "(1)","ltl3ba -f %s >%N","ok",0,0.00314673,1,1,1,1,1,0,1,0,0,0,0,1,0,0,200,3994,1
: "(0)","ltl2tgba -s %f >%N","ok",0,0.0246916,1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1
: "(0)","ltl3ba -f %s >%N","ok",0,0.00343519,1,0,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1
: "(!(G((F(b)) | (F(!((b) | (G(b))))))))","ltl2tgba -s %f >%N","ok",0,0.0233752,1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1
: "(!(G((F(b)) | (F(!((b) | (G(b))))))))","ltl3ba -f %s >%N","ok",0,0.00316933,1,0,0,1,1,1,0,0,0,0,0,1,0,0,1,0,1
: "(G((F(b)) | (F(!((b) | (G(b)))))))","ltl2tgba -s %f >%N","ok",0,0.0238983,1,1,1,1,1,0,1,0,0,0,0,1,0,0,200,4083,1
: "(G((F(b)) | (F(!((b) | (G(b)))))))","ltl3ba -f %s >%N","ok",0,0.00315896,1,1,1,1,1,0,1,0,0,0,0,1,0,0,200,4083,1
If we run =ltlfilt= on the first column, it will process the =formula=
header as if it was an LTL formula.
#+BEGIN_SRC sh :results verbatim :exports both
ltlfilt -F results.csv/1 --format='%f' --unique
#+END_SRC
#+RESULTS:
: formula
: 1
: 0
: !G(Fb | F!(b | Gb))
: G(Fb | F!(b | Gb))
In such case, the syntax =FILENAME/-COL= (with a minus sign before the
column number) can be used to discard the first line of a CSV file.
#+BEGIN_SRC sh :results verbatim :exports both
ltlfilt -F results.csv/-1 --format='%f' --unique
#+END_SRC
#+RESULTS:
: 1
: 0
: !G(Fb | F!(b | Gb))
: G(Fb | F!(b | Gb))

View file

@ -16,7 +16,10 @@ ltl2tgba --help | sed -n '/Input options:/,/^$/p' | sed '1d;$d'
#+END_SRC
#+RESULTS:
: -f, --formula=STRING process the formula STRING
: -F, --file=FILENAME process each line of FILENAME as a formula
: -F, --file=FILENAME[/COL] process each line of FILENAME as a formula; if COL
: is a positive integer, assume a CSV file and read
: column COL; usea negative COL to drop the first
: line of the CSV file
: --lbt-input read all formulas using LBT's prefix syntax
: --lenient parenthesized blocks that cannot be parsed as
: subformulas are considered as atomic properties
@ -27,7 +30,12 @@ be repeated to pass multiple formulas.
=-F= is used to read formulas from a file (one formula per line).
This option can also be repeated to pass multiple files. If the
filename specified is =-= (as in =-F-=), then formulas are read from
standard input.
standard input. If a filename is suffixed with =/COL=, where =COL= is
a positive integer, then the file is assumed to be a CSV file, and
formulas are read from its =COL=-th column. Use =/-COL= to read from
column =COL= and ignore the first line of the CSV file (which often
contains column headers). We have [[file:csv.org][examples of reading or writing CSV
files on a separate page]].
** Default parser
@ -132,6 +140,7 @@ used by [[http://www.ltl2dstar.de][ltl2dstar]].
=--lbt-input= is a global option that affects *all* formulas that are read.
* Common output options
All tools that output LTL/PSL formulas implement the following options:
@ -140,15 +149,18 @@ All tools that output LTL/PSL formulas implement the following options:
ltlfilt --help | sed -n '/Output options:/,/^$/p' | sed '1d;$d'
#+END_SRC
#+RESULTS:
: -8, --utf8 output using UTF-8 characters
: --format=FORMAT specify how each line should be output (default:
: "%f")
: -l, --lbt output in LBT's syntax
: --latex output using LaTeX macros
: -p, --full-parentheses output fully-parenthesized formulas
: -s, --spin output in Spin's syntax
: --spot output in Spot's syntax (default)
: --wring output in Wring's syntax
#+begin_example
-8, --utf8 output using UTF-8 characters
--csv quote the formula for use in a CSV file
--format=FORMAT specify how each line should be output (default:
"%f")
-l, --lbt output in LBT's syntax
--latex output using LaTeX macros
-p, --full-parentheses output fully-parenthesized formulas
-s, --spin output in Spin's syntax
--spot output in Spot's syntax (default)
--wring output in Wring's syntax
#+end_example
# LocalWords: syntaxes LTL PSL num toc SRC ltl tgba sed FILENAME
@ -171,16 +183,23 @@ with read it as =(a U b) U c=.
The =--lbt= option requests an output in LBT's prefix format, and in
that case discussing associativity and parentheses makes no sense.
The =--csv= causes the formulas to be double-quoted (with inner
double-quotes doubled, as per RFC 4180), regardless of the selected
format. This is needed if the formula should appear in a CSV file,
and you want to be robust to formulas that contains commas or
double-quotes. We have [[file:csv.org][examples of reading or writing CSV files on a
separate page]].
The =--format= option can be used to fine-tune the way the formula is
output. Not using the =--format= option is equivalent to using
=--format=%f=. The semantic of the available =%=-sequences differ
from tool to tool:
| | =%f= | =%F= | =%L= |
|-----------+----------------+----------------+-------------------|
| [[file:ltlfilt.org][=ltlfilt=]] | output formula | input filename | input line |
| [[file:genltl.org][=genltl=]] | output formula | pattern name | pattern parameter |
| [[file:randltl.org][=randltl=]] | output formula | (empty) | formula number |
| | =%f= | =%F= | =%L= | =%<= | =%>= |
|-----------+----------------+----------------+-------------------+--------------+---------------|
| [[file:ltlfilt.org][=ltlfilt=]] | output formula | input filename | input line | leading text | trailing text |
| [[file:genltl.org][=genltl=]] | output formula | pattern name | pattern parameter | (empty) | (empty) |
| [[file:randltl.org][=randltl=]] | output formula | (empty) | formula number | (empty) | (empty) |
# LocalWords: lbt LBT's filename UTF gfa GFa ltlfilt LBTT scheck
# LocalWords: utf associativity

View file

@ -551,15 +551,19 @@ following sequence of characters (other characters are output as-is):
ltl2tgba --help | sed -n '/^ *%/p'
#+END_SRC
#+RESULTS:
: %% a single %
: %a number of acceptance sets
: %c number of SCCs
: %d 1 if the automaton is deterministic, 0 otherwise
: %e number of edges
: %f the formula, in Spot's syntax
: %n number of nondeterministic states
: %s number of states
: %t number of transitions
#+begin_example
%% a single %
%a number of acceptance sets
%c number of SCCs
%d 1 if the automaton is deterministic, 0 otherwise
%e number of edges
%f the formula, in Spot's syntax
%n number of nondeterministic states
%p 1 if the automaton is complete, 0 otherwise
%r translation time (including pre- and
%s number of states
%t number of transitions
#+end_example
For instance we can study the size of the automata generated for the
right-nested =U= formulas as follows:
@ -592,6 +596,9 @@ Two automata with the same structures (states and edges) but differing
labels, may have a different count of transitions, e.g., if one has
more restricted labels.
[[file:csv.org][More examples of how to use =--stats= to create CSV
files are on a separate page]].
* Building Monitors
In addition to TGBA and BA, =ltl2tgba= can output /monitor/ using the

View file

@ -351,6 +351,9 @@ ltlfilt --lbt-input -F scheck.ltl --guarantee --format=%L
: 3
: 4
[[file:csv.org][More examples of how to use =--format= to create CSV files are on a
separate page]]
# LocalWords: ltlfilt num toc LTL PSL syntaxes LBT's SRC GFp scheck
# LocalWords: ltl EOF lbt Gp Fp Xp XFp XXp randltl ary nnf wm abc
# LocalWords: pnn Xb Fc XFb XXd sed boolean bsize nox Gb Fb Xa XGb

View file

@ -46,6 +46,7 @@ corresponding commands are hidden.
* Advanced uses
- [[file:csv.org][Reading and writing CSV files]]
- [[file:satmin.org][SAT-based minimization of Deterministic (Generalized) Büchi automata]]
* Citing