We will cover only four approaches to Spark monitoring here.
Spark's Web UI gives us a very useful job-monitoring interface. By examining its pages carefully we can learn a lot, such as the details of the jobs of a running Spark application: duration, GC time, launch time, all of which are worth watching in production. However, once a job finishes or dies, we can no longer see any of this information.
When a Spark application starts, it prints the address of this Web UI, http://hadoop001:4040. Try opening it, then run a join in Spark and look at the DAG diagram in the Web UI. Once we shut the Spark application down and refresh http://hadoop001:4040, the page can no longer be opened. As a result, we cannot see why a job died in production, and therefore cannot fix it.
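As a concrete way to produce a DAG with a shuffle stage, a tiny join can be fed into spark-shell. This is only a sketch with made-up sample data; note that the 4040 UI is only reachable while the shell session is still alive, so for interactive viewing it is better to paste the statements into an open spark-shell:

```shell
# Feed a minimal join into spark-shell; while it runs, the DAG with its
# shuffle stages is visible at http://hadoop001:4040 (sample data is made up)
spark-shell <<'EOF'
val orders = sc.parallelize(Seq((1, "order-a"), (2, "order-b"), (3, "order-c")))
val users  = sc.parallelize(Seq((1, "alice"), (2, "bob")))
orders.join(users).collect().foreach(println)
EOF
```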
With the Spark HistoryServer, we can view Spark applications that have already finished.
To use the HistoryServer, it must be configured first. Be careful with the configuration, or problems will follow. The steps below follow the latest content on the official site.
Enable the following two options in spark-defaults.conf:

spark.eventLog.enabled true                              # enable event logging
spark.eventLog.dir hdfs://hadoop001:9000/g6_directory    # where event logs are stored

Here hadoop001:9000 must match whatever fs.defaultFS is set to in your Hadoop core-site.xml (e.g. /home/hadoop/app/hadoop/etc/hadoop/core-site.xml): write exactly what is configured there.
As the red box in the screenshot above shows, all spark.history.* parameters must be configured inside SPARK_HISTORY_OPTS.
The table below lists all spark.history.* parameters (the notes in parentheses are this article's annotations):

| Property Name | Default | Meaning |
| --- | --- | --- |
| spark.history.provider | org.apache.spark.deploy.history.FsHistoryProvider | Name of the class implementing the application history backend. Currently there is only one implementation, provided by Spark, which looks for application logs stored in the file system. |
| spark.history.fs.logDirectory | file:/tmp/spark-events | (the log directory) For the filesystem history provider, the URL to the directory containing application event logs to load. This can be a local file:// path, an HDFS path hdfs://namenode/shared/spark-logs or that of an alternative filesystem supported by the Hadoop APIs. |
| spark.history.fs.update.interval | 10s | (how often logs are re-read) The period at which the filesystem history provider checks for new or updated logs in the log directory. A shorter interval detects new applications faster, at the expense of more server load re-reading updated applications. As soon as an update has completed, listings of the completed and incomplete applications will reflect the changes. |
| spark.history.retainedApplications | 50 | (at most this many applications are held in memory; beyond that, data must be read back from disk) The number of applications to retain UI data for in the cache. If this cap is exceeded, then the oldest applications will be removed from the cache. If an application is not in the cache, it will have to be loaded from disk if it is accessed from the UI. |
| spark.history.ui.maxApplications | Int.MaxValue | The number of applications to display on the history summary page. Application UIs are still available by accessing their URLs directly even if they are not displayed on the history summary page. |
| spark.history.ui.port | 18080 | (the default web UI port) The port to which the web interface of the history server binds. |
| spark.history.kerberos.enabled | false | Indicates whether the history server should use kerberos to login. This is required if the history server is accessing HDFS files on a secure Hadoop cluster. If this is true, it uses the configs spark.history.kerberos.principal and spark.history.kerberos.keytab. |
| spark.history.kerberos.principal | (none) | Kerberos principal name for the History Server. |
| spark.history.kerberos.keytab | (none) | Location of the kerberos keytab file for the History Server. |
| spark.history.fs.cleaner.enabled | false | (whether to clean up old event log data; in production this must be enabled) Specifies whether the History Server should periodically clean up event logs from storage. |
| spark.history.fs.cleaner.interval | 1d | How often the filesystem job history cleaner checks for files to delete. Files are only deleted if they are older than spark.history.fs.cleaner.maxAge. |
| spark.history.fs.cleaner.maxAge | 7d | Job history files older than this will be deleted when the filesystem history cleaner runs. |
| spark.history.fs.endEventReparseChunkSize | 1m | How many bytes to parse at the end of log files looking for the end event. This is used to speed up generation of application listings by skipping unnecessary parts of event log files. It can be disabled by setting this config to 0. |
| spark.history.fs.inProgressOptimization.enabled | true | Enable optimized handling of in-progress logs. This option may leave finished applications that fail to rename their event logs listed as in-progress. |
| spark.history.fs.numReplayThreads | 25% of available cores | Number of threads that will be used by history server to process event logs. |
| spark.history.store.maxDiskUsage | 10g | Maximum disk usage for the local directory where the cache application history information are stored. |
| spark.history.store.path | (none) | Local directory where to cache application history data. If set, the history server will store application data on disk instead of keeping it in memory. The data written to disk will be re-used in the event of a history server restart. |
Note: the -D prefix is mandatory and must come first; each entry has the form -Dx=y, where x is a spark.history.* parameter and y is the value assigned to it.
The hdfs://hadoop001:9000/g6_directory directory in SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hadoop001:9000/g6_directory" must be identical to the spark.eventLog.dir path set in spark-defaults.conf in step 1, because logs have to be read from wherever they are written.
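Putting this together, a minimal spark-env.sh entry might look like the following sketch. The cleaner settings are taken from the table above (production should enable the cleaner); the host, path, and port are this tutorial's examples:

```shell
# spark-env.sh -- all spark.history.* options packed into SPARK_HISTORY_OPTS as -Dx=y pairs
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hadoop001:9000/g6_directory \
-Dspark.history.ui.port=18080 \
-Dspark.history.fs.cleaner.enabled=true \
-Dspark.history.fs.cleaner.interval=1d \
-Dspark.history.fs.cleaner.maxAge=7d"
```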
Note: the log directory must be created on HDFS in advance.
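The directory can be created and the HistoryServer started with the standard scripts (the hostname and path are this tutorial's examples):

```shell
# Create the event-log directory on HDFS, then start the HistoryServer
hdfs dfs -mkdir -p hdfs://hadoop001:9000/g6_directory
$SPARK_HOME/sbin/start-history-server.sh
# The HistoryServer UI is then served at http://hadoop001:18080
```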
After starting the HistoryServer, we can launch a Spark application and then stop it. This time the information on the web page does not disappear when the application ends. In fact, hadoop001:18080 only shows job information after a job has finished; while the job is still running, it cannot be seen there.
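To try this, any short-lived job will do; for instance the SparkPi example that ships with Spark (the examples jar name depends on your Spark build, so the wildcard path below is an assumption to adjust):

```shell
# Run a short example job; once it finishes, an entry for it
# appears on the HistoryServer page at http://hadoop001:18080
spark-submit --master local[2] \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_*.jar 100
```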
As shown in the figure:
Under the applications path, as in the screenshot above, we can look up whatever we need by a job's completion status, earliest start time, latest start time, and so on. This is only a subset; for other details see the official docs: http://spark.apache.org/docs/latest/monitoring.html
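The same listing is also exposed as a JSON REST API under /api/v1, as described on the official monitoring page; the hostname below is this tutorial's example:

```shell
# List completed applications known to the HistoryServer, as JSON
curl 'http://hadoop001:18080/api/v1/applications?status=completed'
# Job details for one application (replace <app-id> with a real application ID)
curl 'http://hadoop001:18080/api/v1/applications/<app-id>/jobs'
```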
Reposted from: https://www.cnblogs.com/xuziyu/p/11049350.html