[Buildroot] Autobuilders timeouts [was: Re: autobuilder flock hangs]

Hollis Blanchard hollis_blanchard at mentor.com
Thu Sep 27 17:18:01 UTC 2018


On 08/25/2018 06:05 AM, Yann E. MORIN wrote:
> On 2018-08-24 16:45 -0700, Hollis Blanchard spake thusly:
>> I suspect a) something goes wrong with the buildroot job, b) it's killed in
>> a way that leaves a dangling flock, c) future buildroot jobs run headlong
>> into the lingering flock and trigger a timeout.
> So I had a look at the autobuilder code, and we kill the build process
> with SIGKILL (-9), so it has no chance of propagating it down to its
> children.
>
> I wonder if, were we to use SIGTERM instead, there would be an
> improvement. Could you try to leave your autobuilder running with this
> patch, please?
>
> diff --git a/scripts/autobuild-run b/scripts/autobuild-run
> index 3d2e99a..ba86d3d 100755
> --- a/scripts/autobuild-run
> +++ b/scripts/autobuild-run
> @@ -390,7 +390,7 @@ def stop_on_build_hang(monitor_thread_hung_build_flag,
>                   if sub_proc.poll() is None:
>                       monitor_thread_hung_build_flag.set() # Used by do_build() to determine build hang
>                       log_write(log, "INFO: build hung")
> -                    sub_proc.kill()
> +                    sub_proc.terminate()
>                   break
>           monitor_thread_stop_flag.wait(30)
By the way, I tried this, came back weeks later, and found my 
autobuilder hung like so:

    [Fri, 07 Sep 2018 18:40:22] INFO: generate the configuration
    KCONFIG_SEED=0xEB4D3CE
    [Fri, 07 Sep 2018 18:40:40] INFO: build started
    [Fri, 07 Sep 2018 22:39:10] INFO: build hung

output/logfile says:

     >>> solarus v1.5.3 Building

    <snip>

    [ 88%] Building CXX object CMakeFiles/solarus.dir/src/third_party/snes_spc/SNES_SPC_state.cpp.o
    [ 88%] Building CXX object CMakeFiles/solarus.dir/src/third_party/snes_spc/spc.cpp.o
    [ 89%] Building CXX object CMakeFiles/solarus.dir/src/third_party/snes_spc/SPC_DSP.cpp.o
    [ 89%] Building CXX object CMakeFiles/solarus.dir/src/third_party/snes_spc/SPC_Filter.cpp.o
    [ 90%] Linking CXX shared library libsolarus.so


So it seems solarus exceeded the build timeout (perhaps simply a host
performance issue?), and .terminate() was not enough to get the
autobuilder moving again. However, I bet .kill() would have left behind
those subprocesses we saw earlier. (Also, in hindsight, I suspect they
were all flocks for the same directory or two, so they all disappeared
when the lock owner died.)
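
For what it's worth, one way to avoid both failure modes might be to
signal the whole process group rather than just the top-level process:
SIGTERM first, then SIGKILL for anything still alive after a grace
period. The following is only a rough sketch, not the autobuild-run
code. stop_process_group and the grace value are hypothetical names of
mine, and it assumes the build was started with
subprocess.Popen(..., start_new_session=True), so that the child's pid
is also its process group id. POSIX only.

```python
import os
import signal
import subprocess


def stop_process_group(proc, grace=10.0):
    """Stop proc and every process in its group (hypothetical helper).

    Assumes proc was created with start_new_session=True, which makes
    proc.pid the id of a process group that the build's children inherit.
    """
    pgid = proc.pid
    if proc.poll() is None:
        # Polite request first: every process in the group gets SIGTERM
        # and a chance to clean up (release locks, remove temp files).
        try:
            os.killpg(pgid, signal.SIGTERM)
        except ProcessLookupError:
            pass
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            pass
    # Anything in the group that ignored SIGTERM, or outlived the group
    # leader, is killed unconditionally. The group persists as long as
    # any member is alive, even after the leader has been reaped.
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # nothing left in the group
    proc.wait()
    return proc.returncode
```

In the monitor thread, this would stand in for the sub_proc.kill() /
sub_proc.terminate() call, provided the Popen() call that starts the
build is changed to create a new session. I have not tested this in the
autobuilder itself.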

I have not tried your other patch series that changes download timeouts, 
but I don't think it would have affected the particular case above, 
since it's not stuck downloading.

Hollis Blanchard
Mentor Graphics Emulation Division
