orphan process resulting into Signal 11 stack trace causing server crash
02-21-2012, 06:06 AM
Post: #1
orphan process resulting into Signal 11 stack trace causing server crash
Hi folks,

We have been experiencing frequent "Signal 11 stack trace" errors in our production environment. This is the second time an orphan process has caused a "Signal 11 stack trace" resulting in a server crash.

We are using Sybase Adaptive Server Enterprise/15.5/EBF 18664 Cluster Edition ESD#4.

Does anyone have any idea what this error means and what can be done to rectify it?

############################################ Signal 11 Stack Trace ---1 ###########################################
01:06:00000:00281:2012/02/19 09:27:49.23 kernel timeslice -501, current process infected at 1002db6d8 (aix_get_lock+0x48)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel **** Saved signal context (0x0000000169847570): ****
01:06:00000:00281:2012/02/19 09:27:49.24 kernel __sc_onstack: 0x0, __sc_uerror: 11
01:06:00000:00281:2012/02/19 09:27:49.24 kernel uc_sigmask: 0x8e000 0x0 0x0 0x0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Machine Save State:
01:06:00000:00281:2012/02/19 09:27:49.24 kernel PC (iar): 00000001002db6d8 (aix_get_lock+0x48)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Link Register (lr): 000000010028f444 (ubfree+0x8c)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Stack Pointer (stkp): 0000000169847a60
01:06:00000:00281:2012/02/19 09:27:49.24 kernel msr : a00000000000d032 ctr : 0000000000000009 cr : 42442028
01:06:00000:00281:2012/02/19 09:27:49.24 kernel xer : 00000011
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r0 : 0000000000000000 r1 : 0000000169847a60 r2 : 000000011069f768
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r3 : 0000000000596191 r4 : 000000016203da40 r5 : 000000010028f444
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r6 : 0000000000000020 r7 : 0000000000000108 r8 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r9 : 000000000028d5e4 r10 : 0000000000036e9a r11 : 0000000000036ef9
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r12 : 00000001002e3f44 r13 : 000000011089bf40 r14 : 0000000000000012
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r15 : 0000000022288800 r16 : 0000000000000000 r17 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r18 : 0000000044442028 r19 : 000000042aec5400 r20 : 0000000413861bc0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r21 : 000000040af2dbc8 r22 : 0000000000000000 r23 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r24 : 00000002000006cd r25 : 000000040af1dbc8 r26 : 000000042aec50e0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r27 : 000000016203da40 r28 : 00000000deadbabe r29 : 00000001106c7cd8
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r30 : 000000042af65800 r31 : 0000000413827800
01:06:00000:00281:2012/02/19 09:27:49.24 kernel **** end of signal context ****
01:06:00000:00281:2012/02/19 09:27:49.24 kernel timeslice error: spid 281 exhausted its 'time slice' of 100 milliseconds and additional 'cpu grace time' of 500 ticks (50000 milliseconds). It has been marked for termination.
01:02:00000:00281:2012/02/19 09:28:11.44 server Unable to do cleanup for the killed process; received Msg 1142
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Current process (0x25fa0053) infected with signal 11 (SIGSEGV)
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Address 0x0000000100bbcfdc (lock__ins_syslocks+0x22c), siginfo (code, address) = (51, 0x00000000039b5a9a)
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlocks held by kpid 637141075
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlock fglockspins at address 0000000162240a00 owned by 637141075
01:00:00000:00532:2012/02/19 13:46:00.51 kernel End of spinlock display.
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fca1c pcstkwalk+0x88()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fc310 ucstkgentrace+0x1ac()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fb7c8 ucbacktrace+0x90()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100c81cb8 terminate_process__fdpr_5+0xa0()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001021cf458 kisignal__fdpr_1+0xc8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001006330ac std_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100bbcfdc lock__ins_syslocks+0x22c()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100bbcd50 lock_make_syslocks+0x13c()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001014febe4 make_fake__fdpr_3+0xd0()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100472ae4 s__setup_tabsdes__fdpr_20+0x64()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001002a9460 s_execute__fdpr_50+0x10()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x000000010027c364 execproc+0x408()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x000000010027cf08 s_execute__fdpr_36+0xf8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001003dd110 tdsrecv_language+0xb8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x000000010046eb20 ut_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100326ee0 conn_hdlr__fdpr_3+0x60()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel end of stack trace, spid 532, kpid 637141075, suid 1
01:00:00000:00532:2012/02/19 13:46:00.51 kernel ueshutdown: exiting

############################################ Signal 11 Stack Trace ---2 ###########################################
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Current process (0x6a520177) infected with signal 11 (SIGSEGV)
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Address 0x0000000100bbcfdc (lock__ins_syslocks+0x22c), siginfo (code, address) = (51, 0x00000000039b5a9a)
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Spinlocks held by kpid 1783759223
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Spinlock fglockspins at address 0000000162164280 owned by 1783759223
01:04:00000:00401:2012/02/20 17:57:02.65 kernel End of spinlock display.
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fca1c pcstkwalk+0x88()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fc310 ucstkgentrace+0x1ac()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fb7c8 ucbacktrace+0x90()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x0000000100c81cb8 terminate_process__fdpr_5+0xa0()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001021cf458 kisignal__fdpr_1+0xc8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001006330ac std_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100bbcfdc lock__ins_syslocks+0x22c()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100bbcd50 lock_make_syslocks+0x13c()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001014febe4 make_fake__fdpr_3+0xd0()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100472ae4 s__setup_tabsdes__fdpr_20+0x64()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001002a9460 s_execute__fdpr_50+0x10()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001003dd110 tdsrecv_language+0xb8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x000000010046eb20 ut_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100326ee0 conn_hdlr__fdpr_3+0x60()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel end of stack trace, spid 401, kpid 1783759223, suid 28
01:04:00000:00401:2012/02/20 17:57:02.67 kernel ueshutdown: exiting
02-21-2012, 08:39 AM
Post: #2
RE: orphan process resulting into Signal 11 stack trace causing server crash
You had better raise a ticket with Sybase Technical Support immediately.
JP, TechSupport-Member(SybaseTeam.Com)
04-15-2012, 03:49 PM
Post: #3
RE: orphan process resulting into Signal 11 stack trace causing server crash
Yeah. You are trying to tune the server too much at the lower levels.

When a process gets corrupted (Signal 11), it is normally not an issue (it does not cause problems for other processes). The exception is when the process is holding a spinlock, which only it can release. A Signal 11 with a spinlock held affects other processes.

1. Back off on your timeslice setting. Changing the timeslice setting is very serious; you really have to know what you are doing, and maintain monitoring (via sysmon or MDA).
2. Reduce the load on the spinlocks. That means: figure out from sysmon which ones are being overused, and increase the number of spinlocks covering that resource. E.g. increase global_cache_partition_number or the local partition (per cache).
3. Increase the stack_guard_size.

The overall problem is that your server is badly lock-bound: too many DPL/DRL tables, and transactions that are too large.

BTW, that is not an orphan, it is a zombie, and one that will kill the server.

Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA
Website
Selection of Useful Documents for the Sybase DBA
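To check how often a Signal 11 coincides with a held spinlock (the case described above as the one that affects other processes), the error log can be scanned with a small script. The following is only a rough sketch in Python, not a Sybase tool: the function name `find_sigsegv_with_spinlocks` is hypothetical, and the field names given to the line prefix (instance, engine, family, spid) are an assumption based on the lines posted in this thread.

```python
import re

# Line prefix as it appears in the trace posted in this thread:
#   NN:NN:NNNNN:NNNNN:YYYY/MM/DD HH:MM:SS.ss <source> <message>
# The group names (inst, engine, family, spid) are an assumption.
LINE_RE = re.compile(
    r"^(?P<inst>\d{2}):(?P<engine>\d{2}):(?P<family>\d{5}):(?P<spid>\d{5}):"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{2}) "
    r"(?P<source>\w+)\s+(?P<msg>.*)$"
)
# Matches lines such as:
#   Spinlock fglockspins at address 0000000162240a00 owned by 637141075
SPINLOCK_RE = re.compile(r"Spinlock (\S+) at address (\S+) owned by (\d+)")


def find_sigsegv_with_spinlocks(lines):
    """Return one dict per 'infected with signal 11' event, listing the
    spinlock names reported for the same spid before its trace ends."""
    events = []
    open_events = {}  # spid -> event still being collected
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip banners and anything not in log-line format
        spid, msg = int(m["spid"]), m["msg"]
        if "infected with signal 11" in msg:
            event = {"spid": spid, "time": m["ts"], "spinlocks": []}
            events.append(event)
            open_events[spid] = event
        elif spid in open_events:
            held = SPINLOCK_RE.search(msg)
            if held:
                open_events[spid]["spinlocks"].append(held.group(1))
            elif "end of stack trace" in msg:
                del open_events[spid]
    return events


if __name__ == "__main__":
    # Four lines copied from the first trace in this thread.
    sample = [
        "01:00:00000:00532:2012/02/19 13:46:00.51 kernel Current process (0x25fa0053) infected with signal 11 (SIGSEGV)",
        "01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlocks held by kpid 637141075",
        "01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlock fglockspins at address 0000000162240a00 owned by 637141075",
        "01:00:00000:00532:2012/02/19 13:46:00.51 kernel end of stack trace, spid 532, kpid 637141075, suid 1",
    ]
    for event in find_sigsegv_with_spinlocks(sample):
        print(event)
```

Events whose `spinlocks` list is non-empty are the ones worth correlating with the sysmon spinlock contention figures.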