orphan process resulting into Signal 11 stack trace causing server crash
02-21-2012, 06:06 AM
Post: #1


Hi folks,

We have been experiencing frequent "Signal 11 stack trace" errors in our production environment. This is the second time that an orphan process has caused a "Signal 11 stack trace" resulting in a server crash.

We are using Sybase Adaptive Server Enterprise/15.5/EBF 18664 Cluster Edition ESD#4.

Does anyone have any idea what this error means and what can be done to rectify it?

############################################
Signal 11 Stack Trace ---1
###########################################
01:06:00000:00281:2012/02/19 09:27:49.23 kernel timeslice -501, current process infected at 1002db6d8 (aix_get_lock+0x48)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel **** Saved signal context (0x0000000169847570): ****
01:06:00000:00281:2012/02/19 09:27:49.24 kernel __sc_onstack: 0x0, __sc_uerror: 11
01:06:00000:00281:2012/02/19 09:27:49.24 kernel uc_sigmask: 0x8e000 0x0 0x0 0x0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Machine Save State:
01:06:00000:00281:2012/02/19 09:27:49.24 kernel PC (iar): 00000001002db6d8 (aix_get_lock+0x48)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Link Register (lr): 000000010028f444 (ubfree+0x8c)
01:06:00000:00281:2012/02/19 09:27:49.24 kernel Stack Pointer (stkp): 0000000169847a60
01:06:00000:00281:2012/02/19 09:27:49.24 kernel msr : a00000000000d032 ctr : 0000000000000009 cr : 42442028
01:06:00000:00281:2012/02/19 09:27:49.24 kernel xer : 00000011
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r0 : 0000000000000000 r1 : 0000000169847a60 r2 : 000000011069f768
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r3 : 0000000000596191 r4 : 000000016203da40 r5 : 000000010028f444
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r6 : 0000000000000020 r7 : 0000000000000108 r8 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r9 : 000000000028d5e4 r10 : 0000000000036e9a r11 : 0000000000036ef9
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r12 : 00000001002e3f44 r13 : 000000011089bf40 r14 : 0000000000000012
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r15 : 0000000022288800 r16 : 0000000000000000 r17 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r18 : 0000000044442028 r19 : 000000042aec5400 r20 : 0000000413861bc0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r21 : 000000040af2dbc8 r22 : 0000000000000000 r23 : 0000000000000001
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r24 : 00000002000006cd r25 : 000000040af1dbc8 r26 : 000000042aec50e0
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r27 : 000000016203da40 r28 : 00000000deadbabe r29 : 00000001106c7cd8
01:06:00000:00281:2012/02/19 09:27:49.24 kernel r30 : 000000042af65800 r31 : 0000000413827800
01:06:00000:00281:2012/02/19 09:27:49.24 kernel **** end of signal context ****
01:06:00000:00281:2012/02/19 09:27:49.24 kernel timeslice error: spid 281 exhausted its 'time slice' of 100 milliseconds and additional 'cpu grace time' of 500 ticks (50000 milliseconds). It has been marked for termination.
01:02:00000:00281:2012/02/19 09:28:11.44 server Unable to do cleanup for the killed process; received Msg 1142

01:00:00000:00532:2012/02/19 13:46:00.51 kernel Current process (0x25fa0053) infected with signal 11 (SIGSEGV)
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Address 0x0000000100bbcfdc (lock__ins_syslocks+0x22c), siginfo (code, address) = (51, 0x00000000039b5a9a)
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlocks held by kpid 637141075
01:00:00000:00532:2012/02/19 13:46:00.51 kernel Spinlock fglockspins at address 0000000162240a00 owned by 637141075
01:00:00000:00532:2012/02/19 13:46:00.51 kernel End of spinlock display.
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fca1c pcstkwalk+0x88()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fc310 ucstkgentrace+0x1ac()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001005fb7c8 ucbacktrace+0x90()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100c81cb8 terminate_process__fdpr_5+0xa0()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001021cf458 kisignal__fdpr_1+0xc8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001006330ac std_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100bbcfdc lock__ins_syslocks+0x22c()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100bbcd50 lock_make_syslocks+0x13c()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001014febe4 make_fake__fdpr_3+0xd0()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100472ae4 s__setup_tabsdes__fdpr_20+0x64()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001002a9460 s_execute__fdpr_50+0x10()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x000000010027c364 execproc+0x408()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x000000010027cf08 s_execute__fdpr_36+0xf8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x00000001003dd110 tdsrecv_language+0xb8()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel [Handler pc: 0x000000010046eb20 ut_handle+0x0 installed by the following function:-]
01:00:00000:00532:2012/02/19 13:46:00.51 kernel pc: 0x0000000100326ee0 conn_hdlr__fdpr_3+0x60()
01:00:00000:00532:2012/02/19 13:46:00.51 kernel end of stack trace, spid 532, kpid 637141075, suid 1
01:00:00000:00532:2012/02/19 13:46:00.51 kernel ueshutdown: exiting

############################################
Signal 11 Stack Trace ---2
###########################################

01:04:00000:00401:2012/02/20 17:57:02.65 kernel Current process (0x6a520177) infected with signal 11 (SIGSEGV)
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Address 0x0000000100bbcfdc (lock__ins_syslocks+0x22c), siginfo (code, address) = (51, 0x00000000039b5a9a)
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Spinlocks held by kpid 1783759223
01:04:00000:00401:2012/02/20 17:57:02.65 kernel Spinlock fglockspins at address 0000000162164280 owned by 1783759223
01:04:00000:00401:2012/02/20 17:57:02.65 kernel End of spinlock display.
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fca1c pcstkwalk+0x88()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fc310 ucstkgentrace+0x1ac()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001005fb7c8 ucbacktrace+0x90()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x0000000100c81cb8 terminate_process__fdpr_5+0xa0()
01:04:00000:00401:2012/02/20 17:57:02.65 kernel pc: 0x00000001021cf458 kisignal__fdpr_1+0xc8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001006330ac std_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100bbcfdc lock__ins_syslocks+0x22c()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100bbcd50 lock_make_syslocks+0x13c()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001014febe4 make_fake__fdpr_3+0xd0()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100472ae4 s__setup_tabsdes__fdpr_20+0x64()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001002a9460 s_execute__fdpr_50+0x10()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001004a3d74 hdl_stack+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x00000001004ef49c s_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100290cc8 sequencer+0x2e8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x00000001003dd110 tdsrecv_language+0xb8()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel [Handler pc: 0x000000010046eb20 ut_handle+0x0 installed by the following function:-]
01:04:00000:00401:2012/02/20 17:57:02.67 kernel pc: 0x0000000100326ee0 conn_hdlr__fdpr_3+0x60()
01:04:00000:00401:2012/02/20 17:57:02.67 kernel end of stack trace, spid 401, kpid 1783759223, suid 28
01:04:00000:00401:2012/02/20 17:57:02.67 kernel ueshutdown: exiting


02-21-2012, 08:39 AM
Post: #2
RE: orphan process resulting into Signal 11 stack trace causing server crash


You had better raise a ticket with Sybase Technical Support immediately.



JP,
TechSupport-Member(SybaseTeam.Com)
04-15-2012, 03:49 PM
Post: #3
RE: orphan process resulting into Signal 11 stack trace causing server crash


Yeah. You are trying to tune the server too much at the lower levels.

When a process gets corrupted (Signal 11), it is normally not an issue (it does not cause problems for other processes). The exception is when the process is holding a spinlock, which only it can release. A Signal 11 with a spinlock held affects other processes.
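
To check which Signal 11 events in an errorlog had a spinlock held, the log can be scanned mechanically. What follows is only a rough Python sketch, assuming the errorlog lines look like the ones quoted in the first post; the function name `summarize_sigsegv` and the regular expressions are illustrative assumptions, not anything shipped with ASE.

```python
import re

# Patterns follow the errorlog lines quoted in this thread (assumption:
# other ASE versions or platforms may word these lines differently).
SIGSEGV_RE = re.compile(r"Current process \((0x[0-9a-fA-F]+)\) infected with signal 11")
ADDRESS_RE = re.compile(r"Address 0x[0-9a-fA-F]+ \(([^)]+)\)")
SPINLOCK_RE = re.compile(r"Spinlock (\S+) at address ([0-9a-fA-F]+) owned by (\d+)")
END_RE = re.compile(r"end of stack trace, spid (\d+), kpid (\d+)")


def summarize_sigsegv(lines):
    """Return one dict per signal 11 event found in the errorlog lines."""
    events = []
    current = None
    for line in lines:
        if SIGSEGV_RE.search(line):
            # A new signal 11 event starts here.
            current = {"function": None, "spinlocks": [], "spid": None, "kpid": None}
            events.append(current)
            continue
        if current is None:
            continue  # ignore lines outside a signal 11 trace
        m = ADDRESS_RE.search(line)
        if m and current["function"] is None:
            current["function"] = m.group(1)  # faulting function+offset
        m = SPINLOCK_RE.search(line)
        if m:
            current["spinlocks"].append(m.group(1))  # spinlock name
        m = END_RE.search(line)
        if m:
            current["spid"] = int(m.group(1))
            current["kpid"] = int(m.group(2))
            current = None  # the trace for this event is complete
    return events
```

An event whose `spinlocks` list is non-empty is the dangerous kind described above. Run against the two crash traces in the first post, both events should report `fglockspins` held and the same faulting function, `lock__ins_syslocks+0x22c`.
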

1. Back off on your timeslice setting. Changing the timeslice setting is very serious: you really have to know what you are doing, and maintain monitoring (via sysmon or MDA).

2. Reduce the load on the spinlocks. That means figuring out from sysmon which ones are being overused, and increasing the number of spinlocks covering that resource. E.g. increase global_cache_partition_number or the local partition count (per cache).

3. Increase the stack_guard_size.
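
Before touching the timeslice setting, it helps to confirm which values the server was actually running with when the task was terminated. Those figures are printed in the 'timeslice error' line of the errorlog. A minimal Python sketch, assuming the message wording matches the line quoted in the first post; `parse_timeslice_error` is a hypothetical helper name, not part of ASE.

```python
import re

# Pattern follows the 'timeslice error' line quoted earlier in this thread
# (assumption: the wording may differ in other ASE versions).
TIMESLICE_RE = re.compile(
    r"spid (\d+) exhausted its 'time slice' of (\d+) milliseconds "
    r"and additional 'cpu grace time' of (\d+) ticks \((\d+) milliseconds\)"
)


def parse_timeslice_error(line):
    """Return the timeslice figures from one errorlog line, or None."""
    m = TIMESLICE_RE.search(line)
    if m is None:
        return None
    spid, slice_ms, grace_ticks, grace_ms = map(int, m.groups())
    return {
        "spid": spid,
        "time_slice_ms": slice_ms,
        "cpu_grace_ticks": grace_ticks,
        "cpu_grace_ms": grace_ms,
        # Sum of the two durations reported in the message.
        "total_ms": slice_ms + grace_ms,
    }
```

For the line in the first trace this yields spid 281, a time slice of 100 ms and a grace time of 500 ticks (50000 ms), so the message reports 50100 ms in total before the task was marked for termination.
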

The overall problem is that your server is badly lock-bound: too many DPL/DRL tables, and transactions that are too large.

BTW, that is not an orphan, it is a zombie, and one that will kill the server.



Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA