

check_mk raw 1.2.8p17 FAQ

Q: Are there any existing production deployments?
A:

Q: How do I install check-mk-agent?
A: Use the EPEL repository:
yum -y install check-mk-agent xinetd && /etc/init.d/xinetd start && chkconfig xinetd on
Also open port tcp:6556. The highest version currently in EPEL is 1.2.6p16-3.el6.

Q: How does the client collect application monitoring data, such as mysql?
A: Place the plugin in the /usr/share/check-mk-agent/plugins directory. To run a plugin asynchronously, create a subdirectory under plugins whose name is the interval in seconds, for example /usr/share/check-mk-agent/plugins/60.

Q: How do I filter out unneeded monitoring data on the server?
A: Edit /opt/omd/sites/monitor/etc/check_mk/ and add the following:
ignored_checktypes = ["chrony", "kernel", "cpu.threads", "logwatch", "megaraid_pdisks", "ipmi", "ipmi_sensors", "mounts", "ps", "ps.perf", "postfix_mailq", "postfix_mailq_status", "logins", "omd_apache", "omd_status", "livestatus_status"]

Q: How do I optimize when the volume of monitoring data is large?
A: On a single server, optimize by organizing hosts into directories. With multiple servers, distributed monitoring is also supported.

Q: How do I change the check interval and the number of retries?
A:
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Normal check interval for service checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Retry check interval for service checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Maximum number of check attempts for service: set to 1
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Normal check interval for host checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Retry check interval for host checks: set to 1 minute
Host & Service Parameters -> Monitoring Configuration -> Service Checks -> Maximum number of check attempts for host: set to 1

Q: How do I automatically add newly discovered services?
A: Host & Service Parameters -> Monitoring Configuration -> Inventory and Check_MK settings -> Periodic service discovery, add a rule:
Perform service discovery every: 1h
Severity of unmonitored services: warning
Severity of vanished services: ok
Automatically update service configuration: set mode to Add unmonitored services & remove vanished services
Group discovery and activation for up to: 10m

Q: Is bulk host import supported?
A: Host -> Bulk host import

Q: How do I create a host group and assign hosts to it?
A:
Host & Service Groups -> new host group
Host Tags -> new tag group
Host & Service Parameters -> Grouping -> Assignment of hosts to host groups: create a rule, select group_bj under Assignment of hosts to host groups, and under Conditions set the host tag to location is group_bj

Q: How do I change the disk IO mode from summary to separate?
A: Host & Service Parameters -> Parameters for discovered services -> Storage, Filesystems and Files -> Discovery mode for Disk IO check: create a rule and select Create a separate check for each physical disk

Q: How do I change network interface service names from id to description?
A: Host & Service Parameters -> Parameters for discovered services -> Discovery - automatic service detection -> Network Interface and Switch Port Discovery: create a rule, select Use description as service name for network interface checks, and choose use description

Q: How do I change monitoring thresholds?
A:
#load
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> CPU load (not utilization!): add a rule, select fixed level, and set warning & critical to 10

#cpu
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> CPU utilization on Linux/UNIX: add a rule, select Alert on too high CPU utilization and set warning & critical to 95, then select Alert on too high disk wait (IO wait) and set warning & critical to 50

#memory
Host & Service Parameters -> Parameters for discovered services -> Operating System Resources -> Memory and Swap usage on Linux: add a rule, select Levels for RAM, and set warning & critical to 95

#storage
Host & Service Parameters -> Parameters for discovered services -> Storage, Filesystems and Files -> Filesystems (used space and growth): add a rule, select Levels for filesystem free space, dynamic level:
Filesystem larger than 1T -> Absolute free space -> warning & critical 20000MB
Filesystem larger than 300G -> Absolute free space -> warning & critical 10000MB
Filesystem larger than 100G -> Absolute free space -> warning & critical 5000MB
Filesystem larger than 10G -> Absolute free space -> warning & critical 1000MB
Filesystem larger than 1G -> Absolute free space -> warning & critical 100MB
Filesystem larger than 300M -> Absolute free space -> warning & critical 20MB
Filesystem larger than 100M -> Absolute free space -> warning & critical 10MB
Filesystem larger than 0B -> Absolute free space -> warning & critical 0MB

#traffic
Host & Service Parameters -> Parameters for discovered services -> Networking -> Network interfaces and switch ports: create one rule per link speed.
1g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 1g, Assumed output speed -> 1g, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> em1 em2 em3 em4 eth0 eth1 eth2 eth3
2g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 2000000000 bit, Assumed output speed -> 2000000000 bit, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> bond0 bond1
10g
Operating speed -> ignore speed, Operational state -> 1 - up, Assumed input speed -> 10g, Assumed output speed -> 10g, Measurement unit -> Bits, Used bandwidth (minimum or maximum traffic) -> in/out & upper & percent level & warning 95 & critical 95, Average values -> 10mins, Port Specification -> p1p1 p1p2 p2p1 p2p2

cmk -L | grep redis
tcp (no man page present)
cmk -I kvm-48-113
cmk --debug -nv kvm-48-113

posted on 2017-02-20 12:57 北京涛子
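The size-dependent filesystem levels in the #storage rule can be easier to reason about as a lookup table. The following is only an illustrative sketch, not Check_MK's own code: the function name `free_space_level_mb` is hypothetical, binary units (1G = 1024^3 bytes) are an assumption, and so is the idea that the entry with the largest size bound the filesystem exceeds is the one that applies.

```python
# Illustrative sketch of the dynamic free-space levels listed in the FAQ.
# Assumptions (not confirmed by the source): binary units, and "largest
# matching size bound wins" as the selection rule.
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# (filesystem must be larger than this many bytes, required free space in MB)
DYNAMIC_LEVELS = [
    (1 * TB, 20000),
    (300 * GB, 10000),
    (100 * GB, 5000),
    (10 * GB, 1000),
    (1 * GB, 100),
    (300 * MB, 20),
    (100 * MB, 10),
    (0, 0),
]


def free_space_level_mb(fs_size_bytes):
    """Return the free-space threshold in MB for a filesystem of this size."""
    # Walk the table from the largest bound down and take the first match.
    for min_size, level_mb in sorted(DYNAMIC_LEVELS, reverse=True):
        if fs_size_bytes > min_size:
            return level_mb
    return 0  # size 0 matches no "larger than" entry


if __name__ == "__main__":
    for size in (2 * TB, 500 * GB, 50 * GB, 200 * MB, 50 * MB):
        print(size, free_space_level_mb(size))
```

Under these assumptions a 500G filesystem alerts below 10000MB free, while a 50M filesystem effectively never alerts because its level is 0MB.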

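To verify that an agent is reachable on tcp:6556 and that a plugin placed in the plugins directory actually contributes output, it can help to read the agent output directly. The sketch below assumes the usual agent text format in which each section starts with a `<<<name>>>` header line; the helper names `fetch_agent_output` and `split_sections` are hypothetical and not part of Check_MK.

```python
import socket


def fetch_agent_output(host, port=6556, timeout=10.0):
    """Read everything the agent sends on connect and return it as text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # agent closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_sections(output):
    """Group agent output lines by their <<<section>>> header.

    Options after a colon in the header (for example "sep(9)") are dropped,
    so "<<<mysql:sep(9)>>>" is stored under the key "mysql".
    """
    sections = {}
    current = None
    for line in output.splitlines():
        if line.startswith("<<<") and line.endswith(">>>"):
            current = line[3:-3].split(":", 1)[0]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


if __name__ == "__main__":
    # Hostname taken from the cmk examples in the FAQ.
    output = fetch_agent_output("kvm-48-113")
    print(sorted(split_sections(output)))
```

If a newly installed plugin does not show up as a section name here, it will not be found by `cmk -I` either.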

