Help needed: analyzing ORA-00445 and ORA-609 errors

Hi everyone,

Could you please take some time to help me analyze this?
Related log files:
Link: http://pan.baidu.com/s/1o6ii73s Password: ariq
(1) Full alert log, 160 MB: alert_orcl2.log
(2) Log for 17 June 2015: alert_orcl2(20150617).txt
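Problem description:
Environment: Solaris, Oracle RAC 11.2.0.3.0
Symptom: sqlplus logins hang, and the application hangs as well.
Time: starting at 17:00 on 17 June 2015
Actions taken:
(1) Ran a delete of the specified log data from a large table.
(2) The project team noticed high system resource usage and interrupted the operation.
Resolution: restarted the database. shutdown immediate was extremely slow, so we killed the processes and then ran startup. That resolved the problem.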
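Initial analysis:
(1) The server has 64 GB of memory and the SGA is 30 GB. Between 16:49 and 18:30, the nmon report shows abnormally high memory usage, with several momentary spikes to 100%.
(2) ORA-609 has been occurring for a long time. After 17:30, however, it was reported continuously. My initial judgment is that the Oracle processes could not handle requests normally.
(3) ORA-00445: background process "W001" did not start after 120 seconds. Background processes failing to start like this has happened before. I don't understand why these background processes would stop on their own. The official explanation lists these causes:
1. Too much activity on your machine.
2. NFS latency issues.
3. Disk latency issue (that affects I/O).
4. Network latency.
I think it is the first cause, although sqlnet.ora has not been tuned yet. I believe the default of 60s should suit connections from most programs. At the time, though, ps -ef |grep ora |grep LOCAL=NO|wc -l showed about 300 connections that were not being released. I am not sure whether these long-lived, unreleased connections caused the memory usage.
(4) http://blog.itpub.net/12679300/viewspace-1169288/ This page attributes the error to the randomize parameter value, but our system is Solaris.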
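The connection count in item (3) lumps every instance on the host together. A minimal sketch of a per-instance breakdown follows, assuming a POSIX awk (nawk on Solaris) and that dedicated server processes show up in ps -ef as oracle<SID> (LOCAL=NO). The function name count_local_no is made up for illustration.

```shell
# Count remote dedicated server processes per instance.
# Reads `ps -ef`-style lines on stdin; prints "<process name> <count>".
# count_local_no is a hypothetical helper name, not an Oracle tool.
count_local_no() {
  awk '
    /\(LOCAL=NO\)/ {
      # Find the field "oracle<SID>" that is directly followed by "(LOCAL=NO)".
      for (i = 1; i < NF; i++)
        if ($i ~ /^oracle/ && $(i + 1) == "(LOCAL=NO)") n[$i]++
    }
    END { for (k in n) print k, n[k] }
  '
}

# Usage: ps -ef | count_local_no
```

Running this a few times while the problem is happening shows whether the count keeps growing or stays flat, which helps separate a connection leak from a stable pool of long-lived sessions.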
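Questions:
(1) Is ORA-609 normal behavior? Can it be ignored?
(2) What usually causes ORA-00445: background process "W001" did not start after 120 seconds? (My own view is that some background processes could not work properly, which led to the failure.)


Could you help me work out how this problem developed, and how to approach the analysis?
Thanks in advance, everyone!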
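One way to judge whether ORA-609 is background noise or tied to the incident is to count its occurrences per hour in the alert log and compare the 17:00 window with earlier hours. The sketch below assumes the 11g text alert log layout, where each entry is preceded by a timestamp line such as "Wed Jun 17 17:31:02 2015", and a POSIX awk (nawk on Solaris). The function name ora_errors_per_hour is made up for illustration.

```shell
# Count lines containing an error code, grouped by the hour of the
# most recent timestamp line. Reads the alert log on stdin.
# $1: error code as written in the log, e.g. ORA-609 or ORA-00445
# ora_errors_per_hour is a hypothetical helper name.
ora_errors_per_hour() {
  awk -v code="$1" '
    # Timestamp line: weekday, month, day, HH:MM:SS, year
    NF == 5 && $4 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
      hour = $2 " " $3 " " substr($4, 1, 2) ":00"
      next
    }
    index($0, code) {
      if (!(hour in n)) order[++m] = hour   # remember first-seen order
      n[hour]++
    }
    END { for (i = 1; i <= m; i++) print order[i], n[order[i]] }
  '
}

# Usage: ora_errors_per_hour ORA-609 < alert_orcl2.log
```

A roughly constant hourly count before the incident and a sharp rise after 17:30 would suggest the errors during the hang are a symptom of the overloaded host rather than a separate fault.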
牛角书生

P3 | Posted 2018-3-1 16:16:25
It should be noted that this problem has only been positively diagnosed in Redhat 5 and Oracle 11.2.0.2.
It is also likely, as per unpublished BUG:8527473, that this issue will reproduce on generic Linux platforms running any Oracle 11.2.0.x or 12.1.0.x release on Redhat/OEL kernels which have ASLR.

This issue has been seen in both Single Instance and RAC environments.

ASLR also exists in SLES 10 and SLES 11 kernels, where it is turned on by default. To date no problem has been seen on SuSE servers running Oracle, but Novell confirms that ASLR may cause problems. Please refer to

http://www.novell.com/support/kb/doc.php?id=7004855 mmap occasionally infringes on stack
You can verify whether ASLR is being used as follows:

# /sbin/sysctl -a | grep randomize
kernel.randomize_va_space = 1

If the parameter is set to any value other than 0 then ASLR is in use.
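The rule above can be wrapped in a small check, sketched here for Linux only. It assumes the value is read from /proc/sys/kernel/randomize_va_space, and the function name aslr_status is made up for illustration.

```shell
# Interpret a kernel.randomize_va_space value: 0 means off, anything else means on.
# $1: the value to interpret
# aslr_status is a hypothetical helper name.
aslr_status() {
  case "$1" in
    0) echo "ASLR disabled" ;;
    *) echo "ASLR enabled (randomize_va_space=$1)" ;;
  esac
}

# Usage on Linux: aslr_status "$(cat /proc/sys/kernel/randomize_va_space)"
```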

To permanently disable ASLR on Redhat 5, add or modify these parameters in /etc/sysctl.conf:
kernel.randomize_va_space=0
kernel.exec-shield=0

You need to reboot for kernel.exec-shield parameter to take effect.

Note that both kernel parameters are required for ASLR to be switched off.



There may be other reasons for a process failing to start, however, by switching ASLR off, you can quickly discount ASLR being the problem. More and more issues are being identified when ASLR is in operation.



Note:  "In RHEL/OEL 7 exec-shield is not modifiable anymore, so changing the exec-shield parameter produces an error."