[nas] NAS or Solaris deadlock...
Erik Inge Bolsø
knan at mo.himolde.no
Thu Feb 14 03:20:57 MST 2002
On Wed, 13 Feb 2002, Jon Trulson wrote:
>On Thu, 14 Feb 2002, Erik Inge Bolsø wrote:
>> Greetings Jon and all.
>>
>> After some rather extensive debugging, it seems that we've hit either a
>> NAS Solaris server bug or a Solaris OS bug.
>>
>> Means of triggering: run nasd on a Solaris box, then run mpg123 by way of
>> libaudiooss. This triggers a very long loop of opening and closing
>> connections to the NAS server, and halfway through, both nasd and the
>> client (mpg123/libaudiooss) hang.
>>
>> Have not managed to reproduce it against nasd running on Linux.
>>
>> NASD version: 1.4.2, 1.5
>>
>> The last part of our debugging session is attached below, including
>> backtraces of mpg123/libaudiooss and the hung nasd process.
>>
>> Suggestions?
>
> Strange... The stack trace of mpg123 just shows that it's
>waiting for a response from the server - normal if the server is
>otherwise wedged.
>
> The second trace just appears to be libaudiooss waiting on
>nasd again, also normal if nasd is hung.
>
> The first trace (nasd) seems to be the issue - it's waiting in
>open(), which it should never be doing for any significant amount of
>time. This - at first glance anyway - looks like a kernel problem.
>If the open() never returns, what can nasd do? Out of curiosity, how
>many opens via libaudio are occurring?

50-200, I'd guess - mpg123 iterates through all combinations of a small
list of frequencies, formats and mono/stereo, and tries to apply each
combination to see what is available. In current audiooss, each attempt is
handled by tearing down the connection and reopening it with the new
settings. I believe I've seen a patch to handle it more lazily... Tobias?

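To make the open/close count concrete, here is a minimal Python sketch of the probing pattern described above, next to a lazy variant. Everything in it is an assumption for illustration: the rate, format and channel lists are made up, and CountingServer, EagerDevice and LazyDevice are hypothetical stand-ins, not code from mpg123, audiooss or NAS.

```python
from itertools import product

# Hypothetical probe lists; the values mpg123 really tries are not given here.
RATES = [8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000]
FORMATS = ["u8", "s8", "s16", "ulaw"]
CHANNELS = [1, 2]


class CountingServer:
    """Stand-in for an audio server that only counts connections."""

    def __init__(self):
        self.opens = 0

    def connect(self, settings):
        self.opens += 1
        return settings


class EagerDevice:
    """Tears down and reopens the connection on every settings change."""

    def __init__(self, server):
        self.server = server
        self.conn = None

    def configure(self, settings):
        self.conn = None                            # teardown
        self.conn = self.server.connect(settings)   # reopen with new settings

    def write(self, data):
        return len(data)


class LazyDevice:
    """Records settings and connects only when audio is first written."""

    def __init__(self, server):
        self.server = server
        self.conn = None
        self.pending = None

    def configure(self, settings):
        self.pending = settings   # remember the request, do not connect yet
        self.conn = None          # any old connection is now stale

    def write(self, data):
        if self.conn is None:
            self.conn = self.server.connect(self.pending)
        return len(data)


def probe(device):
    """Try every rate/format/channel combination, then play once."""
    for settings in product(RATES, FORMATS, CHANNELS):
        device.configure(settings)
    device.write(b"\x00" * 16)


def count_opens(device_cls):
    """Return how many connections one probe-and-play cycle causes."""
    server = CountingServer()
    probe(device_cls(server))
    return server.opens
```

With these made-up lists, count_opens(EagerDevice) gives 72 (9 x 4 x 2) and count_opens(LazyDevice) gives 1. The trade-off in the lazy version is that a caller no longer learns at configure time whether a given combination is actually supported.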
> When this happens, can you kill nasd and restart it and then
>have everything working again? What does a 'truss' show on the nasd
>process?

Kill & restart works. As for "truss"... ziying / rick, could you check?

> Does it only happen with 1.4.2d and 1.5? Did earlier versions
>show this behavior?

Well, libaudiooss wasn't working on Solaris until recently, so this has
never worked AFAIK. I have tried 1.4.1, 1.4.2d and 1.5, as far as I
can remember.

> Since mpg123 has native NAS support - does that work? Would
>be interesting since it would be using the same libaudio library as
>libaudiooss is using... Just throwing things out here based on what
>I've seen so far.

Works, no problem. But with native NAS support it doesn't do the
open/close-millions-of-times trick. :)

--
Erik I. Bolsø | email: <knan at mo.himolde.no>
The UNIX philosophy basically involves giving you enough rope to
hang yourself. And then a couple of feet more, just to be sure.