[nas] nasd outputs only part of a sample, client stalls -- except under strace

Jon Trulson jon at radscan.com
Tue Oct 2 16:41:33 MDT 2007


On Mon, 10 Sep 2007, Erik Auerswald wrote:

> Hi,
>
> On Mon, Sep 03, 2007 at 01:28:38PM -0600, Jon Trulson wrote:
>> On Mon, 3 Sep 2007, Erik Auerswald wrote:
>>> On Sun, Sep 02, 2007 at 08:24:18PM -0600, Jon Trulson wrote:
>>>> On Sun, 2 Sep 2007, Erik Auerswald wrote:
>>>>> On Mon, Jul 23, 2007 at 08:54:00AM -0400, mmurray wrote:
>>>>>>
>>>>>> To the best of my reasoning, the SIGALRM is being lost in a race
>>>>>> somewhere.
>>>>>
>>>>> This seems to be correct -- sending a SIGALRM to nasd whenever the
>>>>> playback hangs results in continued playback.
>>
>>   I'll have to see about installing one of these kernels and try to
>>   track the problem further (unless someone beats me to it :).
>
> I've written a test program to reproduce the problem. Setting the
> interval timer to values used by auvoxware (or to values an order of
> magnitude higher or lower) works on every tested kernel (2.6.11, 2.6.18,
> 2.6.20 and 2.6.22). Ignoring most of the generated SIGALRMs by having a
> long running signal handler works as well.
>
> Using sleep() inside the signal handler for SIGALRM shows similar
> behaviour to that of auvoxware: After a while the program does not
> receive any SIGALRMs; sending a SIGALRM to the process lets it continue.
> According to the man-page of sleep(), sleep can be implemented using
> SIGALRM:
>

   Erik, I just wanted to let you know I haven't forgotten about this
   issue.  There's just been a lot going on around here.  I have a
   laptop with the latest 2.6.22 kernel (DynaTicks enabled) that I will
   use to test this, but it may be a couple of weekends still before I
   can scavenge the time to look at it.

   The issue is almost certainly the fact that we disable SIGALRM in
   intervalProc.  Ideally we should never disable the alarm once
   started, only block it when necessary.  However, simply removing the
   calls causes AuProcessData to stop the flow when it's called and
   there's nothing to do, with the result that the flow is continually
   stopped/started as a sample is playing.  Hence this will require
   some time - something that's been in short supply lately :)
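
To illustrate the "block, don't disable" idea in isolation, here is a
minimal sketch in plain POSIX C.  It is not nasd code: the helper
names (start_interval_timer, block_alarm, restore_alarm,
ticks_while_blocked) are made up for this example, and how
intervalProc actually disables the alarm is not shown here.  The
point is only that a blocked SIGALRM stays pending and is delivered
once the mask is restored, while the timer keeps running throughout.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical tick counter, incremented once per delivered SIGALRM. */
static volatile sig_atomic_t tick_count = 0;

static void on_alarm(int signo) { (void)signo; tick_count++; }

int ticks_seen(void) { return tick_count; }

/* Install the handler and start a periodic ITIMER_REAL timer. */
int start_interval_timer(long usec)
{
    struct sigaction sa;
    struct itimerval itv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    itv.it_interval.tv_sec = usec / 1000000L;
    itv.it_interval.tv_usec = usec % 1000000L;
    itv.it_value = itv.it_interval;
    return setitimer(ITIMER_REAL, &itv, NULL);
}

/* Disarm the timer completely (the "disable" variant). */
int stop_interval_timer(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}

/* Block SIGALRM; the timer keeps running, a tick stays pending. */
int block_alarm(sigset_t *old)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    return sigprocmask(SIG_BLOCK, &set, old);
}

/* Restore the saved mask; a pending SIGALRM is delivered here. */
int restore_alarm(const sigset_t *old)
{
    return sigprocmask(SIG_SETMASK, old, NULL);
}

/* Hold SIGALRM blocked for hold_msec and return the number of ticks
 * delivered while blocked (expected: 0), or -1 on error. */
int ticks_while_blocked(long hold_msec)
{
    sigset_t old;
    struct timespec ts;
    int before, during;

    if (block_alarm(&old) != 0)
        return -1;
    before = tick_count;
    ts.tv_sec = hold_msec / 1000;
    ts.tv_nsec = (hold_msec % 1000) * 1000000L;
    nanosleep(&ts, NULL);   /* not interrupted: SIGALRM is blocked */
    during = tick_count - before;
    if (restore_alarm(&old) != 0)
        return -1;
    return during;
}
```

Standard signals do not queue, so several timer expirations that
occur while SIGALRM is blocked collapse into a single delivery when
the mask is restored.  Code that must account for every tick would
need to handle that separately.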

> --- man page extract ---
>
> BUGS
>     sleep()  may  be implemented using SIGALRM; mixing calls to alarm() and
>     sleep() is a bad idea.
>
>     Using longjmp() from a signal handler  or  modifying  the  handling  of
>     SIGALRM while sleeping will cause undefined results.
>
> --- end ---
>
> I did not see any obvious use of sleep() in auvoxware while audio is
> processed (a few sleep()s are used in the code).
>

   I think we only use these when waiting for the device to be
   available when ReleaseDevice is enabled...

> The hangs happen with kernel 2.6.22 only, with the test program as well
> as nasd. The test program did not hang on an x86_64 2.6.22 kernel (which
> does not use the tickless code under suspicion).
>
> If you want to try my test programs: An interval value of 11 corresponds
> to auvoxware processing stereo 44.1kHz audio. Only itimertest_sleep is
> supposed to hang after a while (possibly a few hours).
>
> Erik
>
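
For readers without Erik's test programs at hand, the basic
measurement can be sketched roughly as below.  This is an
illustration only, not Erik's code: count_alarms and its parameters
are hypothetical, it uses microseconds for the interval (the unit of
the "11" above is not stated in this mail), and it does not include
the sleep()-in-handler variant.  It arms a periodic ITIMER_REAL
timer, runs for a fixed wall-clock time, and reports how many
SIGALRMs actually arrived.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical counter of SIGALRMs delivered to the process. */
static volatile sig_atomic_t alarms_seen = 0;

static void count_alarm(int signo) { (void)signo; alarms_seen++; }

/* Run a periodic timer with the given interval (must be > 0) for
 * run_msec milliseconds of wall-clock time and return the number of
 * SIGALRMs received, or -1 on error. */
long count_alarms(long interval_usec, long run_msec)
{
    struct sigaction sa;
    struct itimerval itv, off;
    struct timespec start, now, nap;
    long ms;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    alarms_seen = 0;
    itv.it_interval.tv_sec = interval_usec / 1000000L;
    itv.it_interval.tv_usec = interval_usec % 1000000L;
    itv.it_value = itv.it_interval;
    if (setitimer(ITIMER_REAL, &itv, NULL) != 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        clock_gettime(CLOCK_MONOTONIC, &now);
        ms = (now.tv_sec - start.tv_sec) * 1000L
           + (now.tv_nsec - start.tv_nsec) / 1000000L;
        if (ms >= run_msec)
            break;
        /* Short nap; returns early whenever a SIGALRM arrives, and
         * still lets the loop end if the alarms stop coming. */
        nap.tv_sec = 0;
        nap.tv_nsec = 1000000L;
        nanosleep(&nap, NULL);
    }

    /* Disarm the timer before returning. */
    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);
    return alarms_seen;
}
```

Comparing the return value with run_msec * 1000 / interval_usec gives
a rough health check: a count far below that figure would suggest the
alarms stopped arriving at some point during the run.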

-- 
Happy cheese in fear                 | Jon Trulson
against oppressor, rebel!            | mailto:jon at radscan.com 
Brocolli, hostage.       -Unknown    | #include <std/disclaimer.h>
