Salut Pollux,

Thanks for the prompt reply, that's much appreciated.

I did have a feeling that I was abusing the method, given its "semi-internal" status. :)

As for terminology, "defrag" didn't clearly indicate to me that it could only remove unreachable states.  I agree that it shouldn't "remove anything that is used", but since we pass as argument a use-vector and a number of used states, I thought that the notion of "being used" was taken from these args.

In my use case, I compute a vector v of size num_states() via a fix point algorithm and every position above a given threshold corresponds to a state that I'd want flushed.  In code, I was doing:

std::vector<unsigned> v = compute_fp (aut);
std::vector<unsigned> defrag (aut->num_states ());
unsigned used = 0, delta = 0;
for (size_t q = 0; q < aut->num_states(); ++q)
  if (v[q] > threshold) {
    delta++;
    defrag[q] = -1;
  }
  else {
    defrag[q] = q - delta;
    used++;
  }
aut->defrag_states (std::move (defrag), used);

(By the way, not sure why defrag_states insists on receiving an rvalue.)

This is certainly pandering to the oddity of erasing using defrag.  I like your kill_state + purge suggestion, it seems to be more robust and less condition-dependent.  It also leaves renumbering to the internals, which feels saner.

Cheers,
M.

On Thu, Feb 18, 2021 at 5:01 AM Alexandre Duret-Lutz <adl@lrde.epita.fr> wrote:
Hi Michaël!

Michaël Cadilhac <michael@cadilhac.name> writes:
> In a preprocessing step, I want to flush a few states from a TωA. 
> The one method that seems to be available is defrag_states, the
> description of which having the expected semantics.

The intend of this function is to help removing unused entries from the
state vector, to make it less fragmented.  So far it's only used
on states that are unreachable, I guess we should make that clear in the
documentation.  (For me, the name "defrag" evokes the similar process
on file systems, where we do not remove anything that is used.)

What one want to do with the incoming edge when removing a reachable
state might differ according to the context.  For instance in the case
of an alternating automaton, if a removed state is part of the
destination set of a universal edge, we might want to remove the
universal edge altogether (as if the removed state had an empty
language), or might want to just remove this state from the destination
set (as if the removed state had a universal language).  So I'd rather
build some more user-friendly interface above defrag_state() rather than
extend this helper function.

What would be a nice interface for you?  Two iterators to specify a
range of states to remove?  Another idea would be to have a method that
simply removes all the outgoing edges of one state, let's say
aut->kill_state(s), so that multiple states could be killed this way
before eventually calling aut->purge_dead_states() to remove them for
good.

The functions mask_keep_states() and mask_keep_accessible_states() seem
close to your need.   They just don't work in place.

> However, after defrag, it sometimes happens that the dst of some
> transitions points to delete states (-1U).

I'm tempted to throw an exception in this case, but the automaton
will already be in a sad state :-(