simulation: heuristically use a separated-label approach to rebuild

Closes issue #568.

* spot/twaalgos/simulation.cc (direct_simulation::build_result):
Implement an alternate loop based on edge_separator::basis to iterate
over a signature to build results.
* tests/core/568.test: New file.
* tests/Makefile.am: Add it.
* NEWS: Mention the optimization.
This commit is contained in:
Alexandre Duret-Lutz 2024-03-22 23:01:50 +01:00
parent 7ee2d9995f
commit bda40a5f19
4 changed files with 199 additions and 55 deletions

12
NEWS
View file

@ -165,12 +165,12 @@ New in spot 2.11.6.dev (not yet released)
- spot::dualize() learned a trick to be faster on states that have - spot::dualize() learned a trick to be faster on states that have
less outgoing edges than atomic proposition declared on the less outgoing edges than atomic proposition declared on the
automaton. spot::remove_alternation() and automaton. spot::remove_alternation(), spot::tgba_powerset(), and
spot::tgba_determinize() learned a similar trick, except it isn't simulation-based reductions learned a similar trick, except it
applied at the state level but of the entire alternating use few isn't applied at the state level but if the entire automaton use
distinct labels. These changes may speed up the complementation few distinct labels. These changes may speed up the processing of
of some very weak automata, and the minimization of some automata with many atomic propositions but few distinct labels.
WDBA. (Issue #566.) (Issue #566 and issue #568.)
- [Potential backward incompatibility] spot::dualize() does not call - [Potential backward incompatibility] spot::dualize() does not call
cleanup_acceptance() anymore. This change ensures that the dual cleanup_acceptance() anymore. This change ensures that the dual

View file

@ -29,6 +29,8 @@
#include <spot/twaalgos/isdet.hh> #include <spot/twaalgos/isdet.hh>
#include <spot/misc/bddlt.hh> #include <spot/misc/bddlt.hh>
#include <spot/twaalgos/cleanacc.hh> #include <spot/twaalgos/cleanacc.hh>
#include <spot/twaalgos/split.hh>
#include <spot/priv/robin_hood.hh>
// Work around GCC bug 80947 (dominates_edge is causing spurious // Work around GCC bug 80947 (dominates_edge is causing spurious
// visibility warnings) // visibility warnings)
@ -507,6 +509,35 @@ namespace spot
res->copy_ap_of(a_); res->copy_ap_of(a_);
res->copy_acceptance_of(a_); res->copy_acceptance_of(a_);
// We have two ways of "spliting" a signature to create the
// outgoing edges. One is to iterate over 2^AP, then collect
// the destinations. The second is to first create a coarser
// basis for the original set of labels, and then iterate on
// this basis. The latter is good when we have few distinct
// labels. With too many different labels that may have
// nonempty intersections, the basis approach can consume a
// lot of memory. We have to heuristically select between
// those two.
unsigned nap = res->ap().size();
bool will_use_basis = nap > 5;
edge_separator es;
if (will_use_basis)
// Gather all labels, but stop if we see too many. The
// threshold below is arbitrary: adjust if you know better.
will_use_basis = es.add_to_basis(a_, 256 * nap);
// We use a cache to avoid the costly loop over the basis.
//
// Cache entries have the form (bdd, [begin, end]) where bdd
// what should be split using the basis, and begin/end denotes
// a range of existing transition numbers that cover the
// split.
//
// std::pair causes some noexcept warnings when used in
// robin_hood::unordered_map with GCC 9.4. Use robin_hood::pair
// instead.
typedef robin_hood::pair<unsigned, unsigned> cached_t;
robin_hood::unordered_map<bdd, cached_t, bdd_hash> split_cond;
auto state_mapping = new std::vector<unsigned>(); auto state_mapping = new std::vector<unsigned>();
state_mapping->resize(a_->num_states()); state_mapping->resize(a_->num_states());
res->set_named_prop("simulated-states", state_mapping); res->set_named_prop("simulated-states", state_mapping);
@ -549,55 +580,23 @@ namespace spot
auto all_inf = all_inf_; auto all_inf = all_inf_;
unsigned srcst = 0; unsigned srcst = 0;
// For each class, we will create
// all the edges between the states.
for (auto& p: sorted_classes_)
{
// All states in p.second have the same class, so just
// pick the class of the first one first one.
bdd src = previous_class_[p->second.front()];
assert(gb->get_state(src.id()) == srcst);
// Get the signature to derive successors. auto create_edges = [&](int srcid, bdd one, bdd dest) {
bdd sig = compute_sig(p->second.front()); // Iterate over all possible destination classes. We
assert(signatures.size() == srcst); // use minato_isop here, because if the same valuation
signatures.push_back(sig); // of atomic properties can go to two different
// classes C1 and C2, iterating on C1 + C2 with other
if (Cosimulation) // means would see C1 then (!C1)C2, instead of C1 then
sig = bdd_compose(sig, bddfalse, bdd_var(bdd_initial)); // C2. With minatop_isop, we ensure that no negative
// class variable will be seen (likewise for
// promises).
// Get all the variables in the signature. minato_isop isop(dest);
bdd sup_sig = bdd_support(sig);
// Get the variable in the signature which represents the
// conditions.
bdd sup_all_atomic_prop = bdd_exist(sup_sig, nonapvars);
// Get the part of the signature composed only with the atomic
// proposition.
bdd all_atomic_prop = bdd_exist(sig, nonapvars);
// First loop over all possible valuations atomic properties.
for (bdd one: minterms_of(all_atomic_prop, sup_all_atomic_prop))
{
// For each possible valuation, iterate over all possible
// destination classes. We use minato_isop here, because
// if the same valuation of atomic properties can go
// to two different classes C1 and C2, iterating on
// C1 + C2 with the above minters_of loop will see
// C1 then (!C1)C2, instead of C1 then C2.
// With minatop_isop, we ensure that the no negative
// class variable will be seen (likewise for promises).
minato_isop isop(bdd_restrict(sig, one));
++nb_minterms; ++nb_minterms;
bdd cond_acc_dest; bdd cond_acc_dest;
while ((cond_acc_dest = isop.next()) != bddfalse) while ((cond_acc_dest = isop.next()) != bddfalse)
{ {
++stat.edges; ++stat.edges;
++nb_minato; ++nb_minato;
// Take the edge, and keep only the variable which // Take the edge, and keep only the variable which
@ -624,14 +623,81 @@ namespace spot
accst[srcst] = acc; accst[srcst] = acc;
acc = {}; acc = {};
} }
gb->new_edge(dst.id(), src.id(), one, acc); gb->new_edge(dst.id(), srcid, one, acc);
} }
else else
{ {
gb->new_edge(src.id(), dst.id(), one, acc); gb->new_edge(srcid, dst.id(), one, acc);
}
}
};
// For each class, we will create
// all the edges between the states.
for (auto& p: sorted_classes_)
{
// All states in p.second have the same class, so just
// pick the class of the first one first one.
bdd src = previous_class_[p->second.front()];
assert(gb->get_state(src.id()) == srcst);
// Get the signature to derive successors.
bdd sig = compute_sig(p->second.front());
assert(signatures.size() == srcst);
signatures.push_back(sig);
if (Cosimulation)
sig = bdd_compose(sig, bddfalse, bdd_var(bdd_initial));
// Get all the variables in the signature.
bdd sup_sig = bdd_support(sig);
// Get the variables in the signature which represent the
// conditions.
bdd sup_all_atomic_prop = bdd_exist(sup_sig, nonapvars);
// Get the part of the signature composed only with the atomic
// proposition.
bdd all_atomic_prop = bdd_exist(sig, nonapvars);
if (!will_use_basis)
{
for (bdd one: minterms_of(all_atomic_prop, sup_all_atomic_prop))
create_edges(src.id(), one, bdd_restrict(sig, one));
}
else
{
auto& [begin, end] = split_cond[all_atomic_prop];
if (begin == end)
{
begin = res->num_edges() + 1;
for (bdd label: es.basis())
create_edges(src.id(), label,
bdd_relprod(label, sig,
res->ap_vars()));
end = res->num_edges() + 1;
}
else
{
// We have already split all_atomic_prop once, so
// we can simply reuse the set of labels we used
// then, avoiding the iteration on es.basis().
auto& g = res->get_graph();
bdd last = bddfalse;
for (unsigned i = begin; i < end; ++i)
{
bdd label = g.edge_storage(i).cond;
if (label == last)
continue;
last = label;
create_edges(src.id(), label,
bdd_relprod(label, sig,
res->ap_vars()));
} }
} }
} }
++srcst; ++srcst;
} }

View file

@ -221,6 +221,7 @@ TESTS_twa = \
core/521.test \ core/521.test \
core/522.test \ core/522.test \
core/566.test \ core/566.test \
core/568.test \
core/acc.test \ core/acc.test \
core/acc2.test \ core/acc2.test \
core/bdddict.test \ core/bdddict.test \

77
tests/core/568.test Executable file
View file

@ -0,0 +1,77 @@
#!/bin/sh
# -*- coding: utf-8 -*-
# Copyright (C) by the Spot authors, see the AUTHORS file for details.
#
# This file is part of Spot, a model checking library.
#
# Spot is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# Spot is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
# License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
. ./defs
set -e
# For issue #568. Before the patch implemented for #568, running
# simulation-based reduction on automata with many APs was getting
# exponentially slow even if few different labels were used.
#
# For instance we had this:
#
# % genaut --cycle-onehot=10..20 --name='size %L' |
# autfilt --small --stats='%M: %S->%s in %r seconds'
# size 10: 100->10 in 0.0395407 seconds
# size 11: 121->11 in 0.0950484 seconds
# size 12: 144->12 in 0.227828 seconds
# size 13: 169->13 in 0.391545 seconds
# size 14: 196->14 in 0.954784 seconds
# size 15: 225->15 in 2.34656 seconds
# size 16: 256->16 in 5.80549 seconds
# size 17: 289->17 in 14.3545 seconds
# size 18: 324->18 in 47.1589 seconds
# size 19: 361->19 in 138.023 seconds
#
# The test below shows that even with --cycle-onehot-nba=80 we don't
# have any problem now.
genaut --cycle-onehot-nba=20 \
--cycle-onehot-nba=40 \
--cycle-onehot-nba=60 \
--cycle-onehot-nba=80 \
--cycle-log-nba=20 \
--cycle-log-nba=40 \
--cycle-log-nba=60 \
--cycle-log-nba=80 \
| autfilt --small --stats='%S -> %s' > out
cat >expected <<EOF
400 -> 20
1600 -> 40
3600 -> 60
6400 -> 6400
400 -> 20
1600 -> 40
3600 -> 60
6400 -> 6400
EOF
diff expected out
# the reason for the 6400 states above is that simulation-based
# reductions are disabled above 4096 states by default. This can be
# changed with -x simul-max=N.
genaut --cycle-onehot-nba=80 --cycle-log-nba=80 \
| autfilt -x simul-max=6400 --small --stats='%S -> %s' > out
cat >expected <<EOF
6400 -> 80
6400 -> 80
EOF
diff expected out