From: Jesse Pollard <pollard@cats-chateau.net>
Subject: Re: 1.0.9b-pre3 SMP test (was 1.0.9b-pre2 uploaded)
Date: Thu, 16 Mar 2000 21:25:24 -0600
Next Article (by Subject): Re: 1.0.9b-pre3 SMP test (was 1.0.9b-pre2 uploaded) ao@morpork.shnet.org (A. Ott)
Previous Article (by Subject): Re: 1.0.9b-pre2 uploaded ao@morpork.shnet.org (A. Ott)
Next in Thread: Re: 1.0.9b-pre3 SMP test (was 1.0.9b-pre2 uploaded) ao@morpork.shnet.org (A. Ott)
On Wed, 15 Mar 2000, you wrote:
>********* ***************** ********** **** ***** ***** ************
> To subject Re: 1.0.9b-pre2 uploaded
> pollard@cats-chateau.net (Jesse Pollard) wrote:
>********** ******************** ****** ******** ******* *************
>
>> Hi,
>> A followup on SMP testing -
>> 1. I've finished reconfiguring my system. I now have a single 2G partition
>>    for testing. The test systems only use the one partition + swap. (Haven't
>>    finished web reconfig...)
>> 2. I did load 2.2.13, and 2.3.47 onto it. The problems still occur -
>>    a. once I traced it down to a page fault (looked like disk failure...)
>>       this under 2.2.13
>>    b. once I traced it to the keyboard, also under 2.2.13
>>    c. Under 2.3.47, I couldn't get any output after the problem occurred. I
>>       could type in 6 characters, and receive the echo. Then it hung. No
>>       output dump trace ever.
>>    d. Under 2.3.47, I tried a maintenance kernel, but the same thing
>>       occurred.
>>
>> In all three cases I did a little extra testing while booting:
>>
>> I added the following code to rc.S, after enabling swap and running
>> /bin/update:
>>
>> if [ "`/bin/uname -r`" != "2.2.13.SMP" ]; then
>>     echo "CRASH TEST - echo of output"
>>     echo "CRASH TEST" >/CRASH.TEST
>>     echo "append test" >>/CRASH.TEST
>>     echo "after append test"
>>     echo " reading contents of CRASH.TEST"
>>     cat /CRASH.TEST
>>     echo "beginning keyboard read test"
>>     echo "beginning keyboard read test" >>CRASH.TEST
>>     read junk
>>     echo "READ ...${junk}..." >>CRASH.TEST
>>
>>     echo " reading contents again:"
>>     cat /CRASH.TEST
>> fi
>>
>> When I boot the system I do have the disk write enabled to see if anything
>> occurred. The failure only occurs at the "read junk" line. I do get the
>> contents of the CRASH.TEST file output, even though it doesn't quite make it
>> to disk (might if I put a sync in there...)
>
>I admit I currently have no idea what happens here, but I will
>reinvestigate the locking.
>
>> One other thing I noticed -- from another post (AUTH problems):
>>
>> > kernel: rsbac_reg_init(): Initializing RSBAC: REG module registration
>> > Mar 8 12:20:05 ganja kernel: rsbac_init(): Starting rsbacd thread
>> > Mar 8 12:20:05 ganja kernel: rsbac_init(): Setting RSBAC auto timer
>> > Mar 8 12:20:05 ganja kernel: rsbac_init(): Ready.
>>
>> I don't get the line "kernel: rsbac_init(): Ready.". This may be due to
>> it being the very first boot.
>
>No, from pre3 on it should always be there. It is logged on the same level
>as the first one ('initializing'): KERN_INFO.
>
>> A default /rsbac/useraci file was created.
>
>Good.
>
>> If you have some debugging suggestions/configuration changes I'm ready to
>> try them out.
>
>Please try changing the rsbac locking functions in aci_data_structures.h
>to using irqsave/irqrestore (see include/asm/spinlock.h). The flags
>parameter should be correctly provided in all locking calls.
>
>This is to make sure that really nothing can bypass the locks, but it
>cannot be a long term solution.

I switched to the pre3 version: I do see the "ready" message in both SMP
and uniprocessor tests shown below. The SMP errors still occur too.

The following (quickie) trace is what I saw:

wait_on_irq, CPU 1:
irq: 0 [0 0]
bh: 1 [1 0]
<[c010bc21]> __global_cli
<[c01f588b]> vgacon_set_cursor_size
<[c01bf9be]> update_region
<[c01c3542]> con_flush_chars
<[c01c690c]> opost_block

I did try the exact same configuration with a uniprocessor kernel. This
system did what I expected: (taken from the messages file...):

....
Mar 16 21:09:31 tabby kernel: Partition check:
Mar 16 21:09:31 tabby kernel:  sda: sda1 sda2
Mar 16 21:09:31 tabby kernel:  sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 >
Mar 16 21:09:31 tabby kernel:  sdc: sdc1
Mar 16 21:09:31 tabby kernel:  sdd: sdd1
Mar 16 21:09:31 tabby kernel: Real Time Clock Driver v1.10
Mar 16 21:09:31 tabby kernel: Linux PCMCIA Card Services 3.1.11
Mar 16 21:09:31 tabby kernel:   options:  [pci] [cardbus]
Mar 16 21:09:31 tabby kernel: Databook TCIC-2 PCMCIA probe: not found.
Mar 16 21:09:31 tabby kernel: ds: no socket drivers loaded!
Mar 16 21:09:31 tabby kernel: rsbac_init(): Initializing RSBAC v1.0.9b-pre3
Mar 16 21:09:31 tabby kernel: rsbac_init(): compiled modules: MAC FC SIM PM MS FF AUTH REG ACL
Mar 16 21:09:31 tabby kernel: rsbac_init(): Registering RSBAC proc dir
Mar 16 21:09:31 tabby kernel: rsbac_init_pm(): Initializing RSBAC: PM subsystem
Mar 16 21:09:31 tabby kernel: rsbac_init_auth(): Initializing RSBAC: AUTH subsystem
Mar 16 21:09:31 tabby kernel: rsbac_init_acl(): Initializing RSBAC: ACL subsystem
Mar 16 21:09:31 tabby kernel: rsbac_reg_init(): Initializing RSBAC: REG module registration
Mar 16 21:09:31 tabby kernel: rsbac_init(): Starting rsbacd thread
Mar 16 21:09:31 tabby kernel: rsbac_init(): Setting RSBAC auto timer
Mar 16 21:09:31 tabby kernel: rsbac_init(): Ready.
Mar 16 21:09:31 tabby kernel: scsi0: Tagged Queuing now active for Target 0
Mar 16 21:09:31 tabby kernel: Adding Swap: 265064k swap-space (priority -1)
Mar 16 21:09:31 tabby kernel: rsbac_adf_request(): request CHANGE_OWNER, caller_pid 76, caller_prog_name rpc.portmap, caller_uid 0, target-type PROCESS, tid 76, attr owner, value 1, result NOT_GRANTED by AUTH
Mar 16 21:16:25 tabby syslogd 1.3-3: restart.

The last entry is when I rebooted the system. From another mail message I
saw the answer to the "...request CHANGE_OWNER...NOT_GRANTED by AUTH" message
where it hung. (I have to give init, ... and other daemons the privileges
needed).
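
As a rough way to find which daemons still need AUTH privileges, the denials
can be pulled out of the messages file. This is only a sketch: the function
name list_auth_denials and the default path /var/log/messages are
assumptions, and it relies on each rsbac_adf_request() entry being a single
line in the format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of RSBAC): count AUTH denials per program
# and request type, based on the rsbac_adf_request() log format seen above.
list_auth_denials() {
    # $1: log file to scan; /var/log/messages is an assumed default.
    grep 'rsbac_adf_request()' "${1:-/var/log/messages}" |
        grep 'NOT_GRANTED by AUTH' |
        # Keep only "<program> <request>" from each matching entry.
        sed -n 's/.*request \([A-Z_]*\),.*caller_prog_name \([^,]*\),.*/\2 \1/p' |
        sort | uniq -c
}
```

Calling list_auth_denials with the path of the messages file prints one line
per program and request type, prefixed by the number of denials, which gives
a starting list of the programs that need their AUTH settings reviewed.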

I did get the expected warning about the read/write root in the uniprocessor
test.

It's looking more and more like a missing lock in the vga console driver...
It looks like my little keyboard test above isn't showing the problem quite
yet. I'm going to think about trying a serial console (a temporary link to
my firewall's unused serial lines...). That may separate the problem from
the vga virtual terminals and the console device.

--
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@cats-chateau.net

Any opinions expressed are solely my own.
-
To unsubscribe from the rsbac list, send a mail to
majordomo@morpork.shnet.org with
unsubscribe rsbac
as single line in the body.