[nas] NAS or Solaris deadlock...

Erik Inge Bolsø knan at mo.himolde.no
Thu Feb 14 03:20:57 MST 2002


On Wed, 13 Feb 2002, Jon Trulson wrote:
>On Thu, 14 Feb 2002, Erik Inge Bolsø wrote:
>> Greetings Jon and all.
>>
>> After some rather extensive debugging, it seems that we've hit either a
>> NAS Solaris server bug, or a Solaris OS bug.
>>
>> Means of triggering: run nasd on a Solaris box, then run mpg123 by way of
>> libaudiooss. This triggers a very long loop of open/close of connections
>> to the NAS server, and halfway through, both nasd and the client
>> (mpg123/libaudiooss) hang.
>>
>> Have not managed to reproduce it against nasd running on Linux.
>>
>> NASD version: 1.4.2, 1.5
>>
>> The last part of our debugging session attached below. Including backtrace
>> of mpg123/libaudiooss and the hung nasd process.
>>
>> Suggestions?
>
>	Strange... The stacktrace of mpg123 just shows that it's
>waiting for a response from the server - normal if the server is
>otherwise wedged.
>
>	The second trace just appears to be libaudiooss waiting on
>nasd again, also normal if nasd is hung.
>
>	The first trace (nas) seems to be the issue - it's waiting in
>open(), which it should never be doing for any significant amount of
>time.  This - at first glance anyway - looks like a kernel problem.
>If the open() never returns, what can nasd do?  Out of curiosity, how
>many opens via libaudio are occurring?

50-200, I'd guess - mpg123 iterates through all combinations of a small
list of frequencies, formats and mono/stereo, and tries each one to see
what the device supports. In current audiooss, each attempt is handled by
a teardown-and-reopen-with-new-settings. I believe I've seen a patch to
handle it more lazily... Tobias?
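
As a rough illustration of why that estimate is plausible, here is a minimal
sketch of the probing pattern described above. The rate and format lists and
the try_settings() callback are hypothetical stand-ins, not the actual mpg123
or libaudiooss code; the only thing assumed from the description is that each
attempt costs one teardown and reopen of the server connection.

```python
from itertools import product

# Illustrative lists only (assumptions, not taken from the mpg123 source).
RATES = [8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000]
FORMATS = ["s16", "u16", "u8", "s8", "ulaw", "alaw"]
CHANNELS = [1, 2]  # mono, stereo


def probe(try_settings):
    """Try every combination; return (supported, reopen_count)."""
    supported = []
    reopens = 0
    for rate, fmt, channels in product(RATES, FORMATS, CHANNELS):
        reopens += 1  # one close + open of the connection per attempt
        if try_settings(rate, fmt, channels):
            supported.append((rate, fmt, channels))
    return supported, reopens


if __name__ == "__main__":
    # Hypothetical device that only accepts signed 16-bit audio.
    ok, reopens = probe(lambda rate, fmt, channels: fmt == "s16")
    print(len(ok), "supported;", reopens, "reopens")
```

With these made-up lists that is 9 x 6 x 2 = 108 reopens for a single probe
pass, which sits inside the 50-200 range guessed above.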

>	When this happens, can you kill nasd and restart it and then
>have everything working again?  What does a 'truss' show on the nasd
>process?

Kill & restart works. As for "truss" ... ziying / rick, could you check?

>	Does it only happen with 1.4.2d and 1.5?  Did earlier versions
>show this behavior?

Well, libaudiooss wasn't working on Solaris until recently, so this has
never worked AFAIK. I have tried 1.4.1, 1.4.2d and 1.5, as far as I
can remember.

>	Since mpg123 has native NAS support - does that work?  Would
>be interesting since it would be using the same libaudio library as
>libaudiooss is using... Just throwing things out here based on what
>I've seen so far.

Works, no problem. But with native NAS output it doesn't do the open/close
millions of times trick. :)
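
One way to check whether plain connection churn is enough to wedge the
server, independent of mpg123, is a small open/close loop with a timeout. The
sketch below is only that: it works at the TCP level and does not speak the
NAS protocol, so it may well not exercise the same code path in nasd. The
hammer() name is made up for illustration, and the port is an assumption
(8000 is, as far as I know, the usual TCP port for NAS server :0; adjust for
your setup).

```python
import socket


def hammer(host, port, attempts=200, timeout=5.0):
    """Open and close a TCP connection `attempts` times.

    Returns the number of successful connects before the first failure
    or timeout, so a server that stops accepting shows up as a short
    count instead of a hung client.
    """
    for i in range(attempts):
        try:
            # create_connection raises OSError (timeouts included) on failure
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connect, then close immediately
        except OSError:
            return i
    return attempts
```

For example, hammer("localhost", 8000) returning 200 means every connect
succeeded; a smaller number means the loop stopped at that attempt.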

--
Erik I. Bolsø | email: <knan at mo.himolde.no>
The UNIX philosophy basically involves giving you enough rope to
hang yourself.  And then a couple of feet more, just to be sure.






