diff --git a/NEWS b/NEWS index bb01adf24..8d84215c2 100644 --- a/NEWS +++ b/NEWS @@ -19,6 +19,13 @@ New in spot 2.2.2.dev (Not yet released) * autfilt has a new option '--highlight-languages' that helps to see graphically which states of an automaton recognize the same languages. + * autfilt '--sat-minimize' option and common '-x sat-minimize' option have + undergone some changes because some new algorithms have been implemented + and improvements have been made. After benchmarks (that you can reproduce, + take a look at bench/dtgbasat), the dichotomy proves to be more effective. + It is now used by default. Please, read the man page of spot-x for further + details. + Library: * A twa is required to have at least one state, the initial state. @@ -80,6 +87,29 @@ New in spot 2.2.2.dev (Not yet released) * language_map() and highlight_languages() are new functions used to implement the --highlight-languages command line option described above. + * dt*a_sat_minimize_dichotomy() now uses language_map() algo to find the + lower bound of the binary search. + + * Memory usage of SAT-based minimization is now fully reduced. + Those optimizations relies on the fact that almost everything about + the candidate automaton is known in advance and all litterals used + to encode the SAT problem are continuous. + + * There is two new algorithms of SAT-based minimization of ω-automata: + dt*a_sat_minimize_incr() and dt*a_sat_minimize_assume(). They are + implemented to use everything they learn when they get the N-1 size + automaton - i.e. all the encoding is preserved. Some clauses are just + added gradually to delete more states. For more details, especially the + difference between them, read spot/twaalgos/dt*asat.hh headers. + + * Spot introduces a new environment variable "SPOT_XCNF". Incremental + SAT competitors, this is for you. During any SAT-based minimization, SPOT + can generate the XCNF file corresponding to the incremental resolution from + the starting automaton to the minimal one. The file will be outputed as + 'incr.xcnf' in the folder path given throught "SPOT_XCNF". However, this + feature requires the use of an external SAT solver throught + "SPOT_SATSOLVER". See man page of spot-x for details. + Build: * The configure script has a new option --enable-c++14, to compile in @@ -87,6 +117,11 @@ New in spot 2.2.2.dev (Not yet released) to check that nothing breaks when we will switch to C++14. This option is also available in the configure script of buddy. + * Spot is now distributed with a SAT-solver, PicoSAT 965. This integration + takes place to optimize SAT-based minimization of ω-automata. However, it + is still possible to use another SAT-solver already installed thanks to + the "SPOT_SATSOLVER" environment variable. + New in spot 2.2.2 (2016-12-16) Build: diff --git a/bin/autfilt.cc b/bin/autfilt.cc index 638d05136..f0aa9bd48 100644 --- a/bin/autfilt.cc +++ b/bin/autfilt.cc @@ -143,6 +143,8 @@ enum { OPT_WEAK_SCCS, }; +#define DOC(NAME, TXT) NAME, 0, nullptr, OPTION_DOC | OPTION_NO_USAGE, TXT, 0 + static const argp_option options[] = { /**************************************************/ @@ -317,9 +319,13 @@ static const argp_option options[] = "Fin(x) by a new Fin(y) and adjust the automaton", 0 }, { "sat-minimize", OPT_SAT_MINIMIZE, "options", OPTION_ARG_OPTIONAL, "minimize the automaton using a SAT solver (only works for deterministic" - " automata)", 0 }, - /**************************************************/ - { nullptr, 0, nullptr, 0, "Decorations (for -d and -H1.1 output):", 8 }, + " automata). Supported options are acc=STRING, states=N, max-states=N, " + "sat-incr=N, sat-incr-steps=N, sat-langmap, sat-naive, colored, preproc=N" + ". Spot uses by default its PicoSAT distribution but an external SAT" + "solver can be set thanks to the SPOT_SATSOLVER environment variable" + "(see spot-x)." + , 0 }, + { nullptr, 0, nullptr, 0, "Decorations (for -d and -H1.1 output):", 9 }, { "highlight-nondet-states", OPT_HIGHLIGHT_NONDET_STATES, "NUM", OPTION_ARG_OPTIONAL, "highlight nondeterministic states with color NUM", 0 }, diff --git a/bin/man/autfilt.x b/bin/man/autfilt.x index f5f572981..40e32b77a 100644 --- a/bin/man/autfilt.x +++ b/bin/man/autfilt.x @@ -3,6 +3,70 @@ autfilt \- filter, convert, and transform omega-automata [DESCRIPTION] .\" Add any additional description here +[OPTIONS FOR SAT\-MINIMIZATION] +.TP +\fB\fP +By default, SAT\-based minimization executes a binary search, checking N/2 etc. +The upper bound being N (the size of the starting automaton), the lower bound +is always 1 except when \fBsat-langmap\fR option is used. + +.TP +\fBacc=DOUBLEQUOTEDSTRING\fP +DOUBLEQUOTEDSTRING is an acceptance formula in the HOA syntax, or a +parametrized acceptance name (the different acc\-name: options from HOA). + +.TP +\fBcolored\fP +force all transitions (or all states if \fB\-S\fR is used) to belong to exactly +one acceptance condition. + +.TP +\fBmax\-states=M\fP +M is an upper-bound on the maximum number of states of the constructed +automaton. + +.TP +\fBsat\-incr=M\fP +use an incremental approach for SAT-based minimization algorithm. M can be +either \fB1\fR or \fB2\fR. They correspond respectively to +\fB\-x sat\-minimize=2\fR and \fB\-x sat\-minimize=3\fR options. They restart +the encoding only after (N\-1)\-\fBsat\-incr\-steps\fR states have been won. +Each iterations of both starts by encoding the research of an N\-1 automaton, +N being the size of the starting automaton. \fB1\fR uses Picosat assumptions. +It additionally assumes that the last \fBsat-incr-steps\fR states are +unnecessary. On failure, it relax the assumptions to do a binary search +between N\-1 and (N\-1)\-\fBsat-incr-steps\fR. \fBsat-incr-steps\fR defaults +to 6. \fB2\fR, as for it, after an N-1 state automaton has been found, uses +incremental solving for the next \fBsat\-incr\-steps\fR iterations by forbidding +the usage of an additional state without reencoding the problem again. A full +encoding occurs after \fBsat\-incr\-steps\fR iterations unless +\fBsat\-incr\-steps=\-1\fR (see SPOT_XCNF environment variable described in +spot\-x). It defaults to 2. + +.TP +\fBsat\-incr\-steps=M\fP +set the value of \fBsat\-incr\-steps\fR to M. This is used by \fBsat\-incr\fR +option. + +.TP +\fBsat-naive\fP +use the naive algorithm to find a smaller automaton. It starts from N (N being +the size of the starting automaton) and then checks N\-1, N\-2, etc. until the +last successful check. + +.TP +\fBsat-langmap\fP +Find the lower bound of default sat\-minimize procedure (1). This relies on the +fact that the size of the minimal automaton is at least equal to the total +number of different languages recognized by the automaton's states. + +.TP +\fBstates=M\fP +M is a fixed number of states to use in the result (all the states needs +not be accessible in the result. Therefore, the output might be smaller +nonetheless). The SAT\-based procedure is just used once to synthesize +one automaton, and no further minimization is attempted. + [BIBLIOGRAPHY] The following papers are related to some of the transformations implemented in autfilt. diff --git a/bin/man/spot-x.x b/bin/man/spot-x.x index cf1f984e2..68f11087d 100644 --- a/bin/man/spot-x.x +++ b/bin/man/spot-x.x @@ -9,8 +9,36 @@ spot-x \- Common fine-tuning options and environment variables. [DESCRIPTION] .\" Add any additional description here -[ENVIRONMENT VARIABLES] +[SAT\-MINIMIZE VALUES] +.TP +\fB1\fR +Used by default, \fB1\fR performs a binary search, checking N/2, etc. +The upper bound being N (the size of the starting automaton), the lower bound +is always 1 except when \fBsat-langmap\fR option is used. +.TP +\fB2\fR +Use PicoSAT assumptions. Each iteration encodes the search of an (N\-1) state +equivalent automaton, and additionally assumes that the last +\fBsat\-incr\-steps\fR states are unnecessary. On failure, relax the assumptions +to do a binary search between N\-1 and N\-1\-\fBsat\-incr\-steps\fR. +\fBsat\-incr\-steps\fR defaults to 6. + +.TP +\fB3\fR +After an (N\-1) state automaton has been found, use incremental solving for +the next \fBsat\-incr\-steps\fR iterations by forbidding the usage of an +additional state without reencoding the problem again. A full encoding will +occur after \fBsat\-incr\-steps\fR iterations unless \fBsat\-incr\-steps=-1\fR +(see \fBSPOT_XCNF\fR environment variable). \fBsat\-incr\-steps\fR defaults to +2. + +.TP +\fB4\fR +This naive method tries to reduce the size of the automaton one state at a +time. Note that it restarts all the encoding each time. + +[ENVIRONMENT VARIABLES] .TP \fBSPOT_DEFAULT_FORMAT\fR Set to a value of \fBdot\fR or \fBhoa\fR to override the default @@ -67,14 +95,14 @@ problem. .TP \fBSPOT_SATSOLVER\fR -If set, this variable should indicate how to call a SAT\-solver. This -is used by the sat\-minimize option described above. The default -value is \f(CW"glucose -verb=0 -model %I >%O"\fR, it is correct for -glucose version 3.0 (for older versions, remove the \fCW(-model\fR -option). The escape sequences \f(CW%I\fR and \f(CW%O\fR respectively +If set, this variable should indicate how to call an external +SAT\-solver \- by default, Spot uses PicoSAT, which is distributed +with. This is used by the sat\-minimize option described above. +The format to follow is the following: \f(CW" [options] %I >%O"\fR. +The escape sequences \f(CW%I\fR and \f(CW%O\fR respectively denote the names of the input and output files. These temporary files are created in the directory specified by \fBSPOT_TMPDIR\fR or -\fBTMPDIR\fR (see below). The SAT-solver should follow the convention +\fBTMPDIR\fR (see below). The SAT\-solver should follow the convention of the SAT Competition for its input and output format. .TP @@ -115,6 +143,17 @@ current directory. When this variable is defined, temporary files are not removed. This is mostly useful for debugging. +.TP +\fBSPOT_XCNF\fR +Assign a folder path to this variable to generate XCNF files whenever +SAT\-based minimization is used \- the file is outputed as "incr.xcnf" +in the specified directory. This feature works only with an external +SAT\-solver. See \fBSPOT_SATSOLVER\fR to know how to provide one. Also note +that this needs an incremental approach without restarting the encoding i.e +"sat\-minimize=3,param=-1" for ltl2tgba and ltl2tgta or "incr,param=-1" for +autfilt (see sat\-minimize options described above or autfilt man page). +The XCNF format is the one used by the SAT incremental competition. + [BIBLIOGRAPHY] .TP 1. diff --git a/bin/spot-x.cc b/bin/spot-x.cc index 7d9cd893d..d191d9a19 100644 --- a/bin/spot-x.cc +++ b/bin/spot-x.cc @@ -113,15 +113,20 @@ Enabled by default.") }, if the TGBA is not already deterministic. Doing so will degeneralize \ the automaton. This is disabled by default, unless sat-minimize is set.") }, { DOC("sat-minimize", - "Set to 1 to enable SAT-based minimization of deterministic \ -TGBA: it starts with the number of states of the input, and iteratively \ -tries to find a deterministic TGBA with one less state. Set to 2 to perform \ -a binary search instead. Disabled (0) by default. The sat solver to use \ -can be set with the SPOT_SATSOLVER environment variable (see below). By \ -default the procedure looks for a TGBA with the same number of acceptance \ -set; this can be changed with the sat-acc option, or of course by using -B \ -to construct a Büchi automaton. Enabling SAT-based minimization will \ -also enable tba-det.") }, + "Set it to enable SAT-based minimization of deterministic \ +TGBA. Depending on its value (from 1 to 4) it changes the algorithm \ +to perform. The default value is (1) and it proves to be the most effective \ +method. SAT-based minimization uses PicoSAT (distributed with Spot), but \ +another installed SAT-solver can be set thanks to the SPOT_SATSOLVER \ +environment variable. Enabling SAT-based minimization will also enable \ +tba-det.") }, + { DOC("sat-incr-steps", "Set the value of sat-incr-steps. This variable \ +is used by two SAT-based minimization algorithms: (2) and (3). They are both \ +described below.") }, + { DOC("sat-langmap", "Find the lower bound of default sat-minimize \ +procedure (1). This relies on the fact that the size of the minimal automaton \ +is at least equal to the total number of different languages recognized by \ +the automaton's states.") }, { DOC("sat-states", "When this is set to some positive integer, the SAT-based \ minimization will attempt to construct a TGBA with the given number of \ diff --git a/doc/org/satmin.org b/doc/org/satmin.org index c81a46981..f1a7b73fa 100644 --- a/doc/org/satmin.org +++ b/doc/org/satmin.org @@ -4,6 +4,11 @@ #+SETUPFILE: setup.org #+HTML_LINK_UP: tools.html +#+NAME: version +#+BEGIN_SRC sh :exports none +head -n 1 ../../picosat/VERSION | tr -d '\n' +#+END_SRC + This page explains how to use [[file:ltl2tgba.org][=ltl2tgba=]], [[file:dstar2tgba.org][=dstar2tgba=]], or [[file:autfilt.org][=autfilt=]] to minimize deterministic automata using a SAT solver. @@ -23,8 +28,9 @@ Let us first state a few facts about this minimization procedure. 3) These two procedures can optionally constrain their output to use state-based acceptance. (They simply restrict all the outgoing transitions of a state to belong to the same acceptance sets.) -4) A SAT solver should be installed for this to work. (Spot does not - distribute any SAT solver.) +4) Spot distributes a SAT solver, PicoSAT call_version()[:results raw]. This solver was chosen for its performances, simplicity of integration and licence compatible with Spot's one. + However, it is still possible to use an external SAT solver (as described + below). 5) [[file:ltl2tgba.org][=ltl2tgba=]] and [[file:dstar2tgba.org][=dstar2tgba=]] will always try to output an automaton. If they fail to determinize the property, they will simply output a nondeterministic automaton, if they managed to obtain a @@ -44,15 +50,11 @@ Let us first state a few facts about this minimization procedure. * How to change the SAT solver used The environment variable =SPOT_SATSOLVER= can be used to change the -SAT solver used by Spot. The default is "=glucose -verb=0 -model %I ->%O=", therefore if you have installed [[http://www.labri.fr/perso/lsimon/glucose/][=glucose= 3.0]] in your =$PATH=, -it should work right away. Otherwise you may redefine this variable -to point the correct location or to another SAT solver (for older -versions of glucose, remove the =-model= option). The =%I= and =%O= -sequences will be replaced by the names of temporary files containing -the input for the SAT solver and receiving its output. We assume that -the SAT solver should follow the conventions of the [[http://www.satcompetition.org/][SAT competition]] -for input and output. +SAT solver used by Spot. By default it uses the one distributed with (PicoSAT +call_version()[:results raw]). +Here is the expected format of =SPOT_SATSOLVER= : "= [options] %I >%O=". The =%I= and =%O= sequences will be replaced by the names of temporary files containing the input for the SAT solver and receiving its output. +If you have installed the corresponding binary in your =$PATH=, it should work right away. Otherwise you may redefine this variable to point the correct location of the SAT solver. +We assume that the SAT solver should follow the conventions of the [[http://www.satcompetition.org/][SAT competition]] for input and output. * Enabling SAT-based minimization in =ltl2tgba= or =dstar2tgba= @@ -337,15 +339,35 @@ situations where the tools will produce non-deterministic automata. The following options can be used to fine-tune this procedure: - =-x tba-det= :: attempt a powerset construction and check if - there exists a acceptance set such that the - resulting DTBA is equivalent to the input -- =-x sat-minimize= :: enable SAT-based minimization. By default it - tries to reduce the size of the automaton one state at a time. - This option implies =-x tba-det=. -- =-x sat-minimize=2= :: enabled SAT-based minimization, but perform a - dichotomy to locate the correct automaton size. Use this only if - you suspect that the optimal size is far away from the input - size. This option implies =-x tba-det=. + there exists an acceptance set such that the + resulting DTBA is equivalent to the input. +- =-x sat-minimize= :: enable SAT-based minimization. It is the same as + =-x sat-minimize=1= (which is the default value). It performs a dichotomy + to find the correct automaton size.This option implies =-x tba-det=. +- =-x sat-minimize=[2|3]= :: enable SAT-based + minimization. Let us consider each intermediate automaton as a =step= + towards the minimal automaton and assume =N= as the size of the starting + automaton. =2= and =3= have been implemented with the aim of not + restarting the encoding from scratch at each step. To do so, both restart + the encoding after =N-1-(sat-incr-steps)= states have been won. Now, + where is the difference? They both start by encoding the research of the + =N-1= step and then: + - =2= uses PicoSAT assumptions. It adds =sat-incr-steps= assumptions + (each of them removing one more state) and then checks direclty the + =N-1-(sat-incr-steps)= step. If such automaton is found, the process is + restarted. Otherwise, a binary search begins between =N-1= and + =N-1-sat-incr-steps=. If not provided, =sat-incr-steps= default value + is 6. + - =3= checks incrementally each =N-(2+i)= step, =i= ranging from =0= to + =sat-incr-steps=. This process is fully repeated until the minimal + automaton is found. The last SAT problem solved correspond to the + minimal automaton. =sat-incr-steps= defaults to 2. + Both implies =-x tba-det=. +- =-x sat-minimize=4= :: enable SAT-based minimization. It tries to reduce the + size of the automaton one state at a time. This option implies + =-x tba-det=. +- =-x sat-incr-steps=N= :: set the value of =sat-incr-steps= to N. It doest not + make sense to use it without =-x sat-minimize=2= or =-x sat-minimize=3=. - =-x sat-acc=$m$= :: attempt to build a minimal DTGBA with $m$ acceptance sets. This options implies =-x sat-minimize=. - =-x sat-states=$n$= :: attempt to build an equivalent DTGBA with $n$ @@ -666,6 +688,13 @@ $txt [[file:autfiltsm8.png]] +By default, the SAT-based minimization tries to find a smaller automaton by +performing a binary search starting from =N/2= (N being the size of the +starting automaton). After various benchmarks, this algorithm proves to be the +best. However, in some cases, other rather similar methods might be better. The +algorithm to execute and some other parameters can be set thanks to the +=--sat-minimize= option. + The =--sat-minimize= option takes a comma separated list of arguments that can be any of the following: @@ -679,9 +708,33 @@ that can be any of the following: so the output might be smaller nonetheless). If this option is used the SAT-based procedure is just used once to synthesize one automaton, and no further minimization is attempted. -- =dichotomy= :: instead of looking for a smaller automaton starting - from =N=, and then checking =N-1=, =N-2=, etc., do a binary - search starting from =N/2=. +- =sat-incr=[1|2]= :: =1= and =2= correspond respectively to + =-x sat-minimize=2= and =-x sat-minimize=3= options. + They have been implemented with the aim of not restarting the + encoding from scratch at each step - a step is a minimized intermediate + automaton. To do so, both restart the encoding after + =N-1-(sat-incr-steps)= states have been won - =N= being the size of the + starting automaton. Now, where is the difference? They both start by + encoding the research of the =N-1= step and then: + - =1= uses PicoSAT assumptions. It adds =steps= assumptions (each of + them removing one more state) and then checks direclty the + =N-1-(sat-incr-steps)= step. If such automaton is found, the process is + restarted. Otherwise, a binary search begins between =N-1= and + =N-1-sat-incr-steps=. If not provided, =sat-incr-steps= defaults to 6. + - =2= checks incrementally each =N-(2+i)= step, =i= ranging from =0= to + =sat-incr-steps=. This process is fully repeated until the minimal + automaton is found. The last SAT problem solved correspond to the + minimal automaton. =sat-incr-steps= defaults to 2. + Both implies =-x tba-det=. +- =sat-incr-steps=N= :: set the value of =sat-incr-steps= to =N=. + This is used by =sat-incr= option. +- =sat-naive= :: use the =naive= algorithm to find a smaller automaton. It starts + from =N= and then checks =N-1=, =N-2=, etc. until the last successful + check. +- =sat-langmap= :: Find the lower bound of default sat-minimize procedure. This + relies on the fact that the size of the minimal automaton is at least equal + to the total number of different languages recognized by the automaton's + states. - =colored= :: force all transitions (or all states if =-S= is used) to belong to exactly one acceptance condition. @@ -802,6 +855,7 @@ The generated CSV file use the following columns: - system time for encoding the SAT problem - user time for solving the SAT problem - system time for solving the SAT problem +- automaton produced Times are measured with the times() function, and expressed in ticks (usually: 1/100 of seconds).