Segfault when postprocessing large automaton

List overview All Threads
Download

newer

older

Spot Polish Notation Parsing

accepting_run() returning words...

Reed Oei

30 Mar 2020 30 Mar '20

2:14 p.m.

Hello,

I am encountering a bug where Spot segfaults when I try to postprocess a relatively large automaton. I have uploaded the file to http://reedoei.com/files/smaller-inner.aut.zip; it has 113930 states and 1537155 edges. I tried to come up with a smaller example but was having trouble doing so. Below is the sequence of commands, using the Python bindings, to reproduce this bug for me:

import spot A = spot.automaton('smaller-inner.aut') B = A.postprocess('BA', 'Deterministic', 'Low') # Crashes here B.save('postprocessed.aut')

I also tried using various other others for postprocess such as Small and Low, or Any and High (because Any and Low/Medium does nothing).

I am using Ubuntu 18.04, Python 3.6.9, and Spot 2.8.7 (but also tried with 2.8.6).

Thank you, Reed Oei

Attachments:

attachment.html (text/html — 1.2 KB)

Show replies by date

Alexandre Duret-Lutz

30 Mar 30 Mar

3:56 p.m.

On Mon, Mar 30, 2020 at 2:15 PM Reed Oei reedoei2@illinois.edu wrote:

...

Hello,

I am encountering a bug where Spot segfaults when I try to postprocess a relatively large automaton. I have uploaded the file to http://reedoei.com/files/smaller-inner.aut.zip; it has 113930 states and 1537155 edges. I tried to come up with a smaller example but was having trouble doing so. Below is the sequence of commands, using the Python bindings, to reproduce this bug for me:

import spot A = spot.automaton('smaller-inner.aut') B = A.postprocess('BA', 'Deterministic', 'Low') # Crashes here B.save('postprocessed.aut')

I also tried using various other others for postprocess such as Small and Low, or Any and High (because Any and Low/Medium does nothing).

Hi Reed,

This is a HUGE automaton to pass to postprocess.

Running

% autfilt --det --low smaller-inner.aut

did not finish within 20 minutes, so I interrupted it and noticed the code was busy in the simulation-based reduction code. The simulation-based reduction is notoriously slow: it uses a BDD-based representation that requires one variable per equivalence classe (maybe the number of states), and when too many variables are used, it's possible that the garbage collection for the BuDDy library crashes with a stack overflow (see https://gitlab.lrde.epita.fr/spot/spot/-/issues/396 ) Maybe that's the crash you got.

You can disable simulation-based reductions with

% autfilt -x simul=0,ba-simul=0,det-simul=0 --det --low smaller-inner.aut

but this also did not finish withing 20 minutes. When I interrupted it it was stuck in the determinization code. Running an exponential algorithm on an automaton with soo many states is unlikely to finish quickly.

Running the same command without requiring a determinisation, and without simulation-based reductions worked and reduced the automaton to 107083 states:

% autfilt -x simul=0,ba-simul=0 --small --low smaller-inner.aut

-- Alexandre Duret-Lutz

Reed Oei

19 Apr 19 Apr

6:33 p.m.

Hi,

Thanks for the response. I decided to look into the isuse a little bit, and it seems like the problem is mainly in apply_rec() in bddop.c. I was able to rewrite it into a stack-based version, which seems to have fixed the crashing (though postprocessing still takes forever, which is the original result I was expecting, anyway). I also tried my new version on the command from that issue you linked ( https://gitlab.lrde.epita.fr/spot/spot/-/issues/396), and that also works now, though it takes about 4-5 hours to finish.

It's probably not the ideal implementation, but I attached a patch file containing the fix, if you are interested. It seems to work (all the tests pass), and it's about the same speed in most cases.

Thanks, Reed

Le lun. 30 mars 2020 à 08:56, Alexandre Duret-Lutz adl@lrde.epita.fr a écrit :

...

On Mon, Mar 30, 2020 at 2:15 PM Reed Oei reedoei2@illinois.edu wrote:

...
Hello,

I am encountering a bug where Spot segfaults when I try to postprocess a

relatively large automaton. I have uploaded the file to http://reedoei.com/files/smaller-inner.aut.zip; it has 113930 states and 1537155 edges. I tried to come up with a smaller example but was having trouble doing so. Below is the sequence of commands, using the Python bindings, to reproduce this bug for me:

...
import spot A = spot.automaton('smaller-inner.aut') B = A.postprocess('BA', 'Deterministic', 'Low') # Crashes here B.save('postprocessed.aut')

I also tried using various other others for postprocess such as Small

and Low, or Any and High (because Any and Low/Medium does nothing).

Hi Reed,

This is a HUGE automaton to pass to postprocess.

Running

% autfilt --det --low smaller-inner.aut

did not finish within 20 minutes, so I interrupted it and noticed the code was busy in the simulation-based reduction code. The simulation-based reduction is notoriously slow: it uses a BDD-based representation that requires one variable per equivalence classe (maybe the number of states), and when too many variables are used, it's possible that the garbage collection for the BuDDy library crashes with a stack overflow (see https://gitlab.lrde.epita.fr/spot/spot/-/issues/396 ) Maybe that's the crash you got.

You can disable simulation-based reductions with

% autfilt -x simul=0,ba-simul=0,det-simul=0 --det --low smaller-inner.aut

but this also did not finish withing 20 minutes. When I interrupted it it was stuck in the determinization code. Running an exponential algorithm on an automaton with soo many states is unlikely to finish quickly.

Running the same command without requiring a determinisation, and without simulation-based reductions worked and reduced the automaton to 107083 states:

% autfilt -x simul=0,ba-simul=0 --small --low smaller-inner.aut

-- Alexandre Duret-Lutz

Alexandre Duret-Lutz

20 Apr 20 Apr

12:13 p.m.

On Sun, Apr 19, 2020 at 6:33 PM Reed Oei reedoei2@illinois.edu wrote:

...

Thanks for the response. I decided to look into the isuse a little bit, and it seems like the problem is mainly in apply_rec() in bddop.c. I was able to rewrite it into a stack-based version, which seems to have fixed the crashing (though postprocessing still takes forever, which is the original result I was expecting, anyway). I also tried my new version on the command from that issue you linked (https://gitlab.lrde.epita.fr/spot/spot/-/issues/396), and that also works now, though it takes about 4-5 hours to finish.

It's probably not the ideal implementation, but I attached a patch file containing the fix, if you are interested. It seems to work (all the tests pass), and it's about the same speed in most cases.

Hi Reed !

Thank you very much for this implementation! I've pushed it on a branch and updated the issue with some comments until we can improve upon this patch. Clearly this kind of implementation needs to be generalized to all recursive BDD functions, but it's encouraging that the original issue can be fixed only by changing apply_rec. Unless you are willing to do it, I'll do it eventually, but that will be after the release of Spot 2.9 (which we hope will by the end of the month).

Best regards,

-- Alexandre Duret-Lutz

1930

Age (days ago)

1951

Last active (days ago)

spot@lrde.epita.fr

3 comments

2 participants

tags (0)

participants (2)

Alexandre Duret-Lutz
Reed Oei