SparkListener
SparkListener is a developer API for custom Spark listeners. It is an abstract class that extends SparkListenerInterface with empty, no-op implementations of all the callback methods. With all the callbacks being no-ops, you can focus on the events you care about and process just that subset.
Tip: Developing a custom SparkListener is an excellent introduction to the low-level details of Spark's Execution Model. Check out the exercise Developing Custom SparkListener to monitor DAGScheduler in Scala.
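To give a flavour of the API, here is a minimal sketch of a custom listener (the class name AppLifecycleListener is made up for the example). It overrides just two callbacks and leaves every other one as a no-op. A listener can be registered programmatically with SparkContext.addSparkListener or declaratively with the spark.extraListeners configuration property.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd, SparkListenerApplicationStart}

// Overrides just two callbacks; every other callback keeps its no-op default.
class AppLifecycleListener extends SparkListener {
  override def onApplicationStart(event: SparkListenerApplicationStart): Unit =
    println(s"Application ${event.appName} started at ${event.time} as ${event.sparkUser}")

  override def onApplicationEnd(event: SparkListenerApplicationEnd): Unit =
    println(s"Application ended at ${event.time}")
}

// Register programmatically on a live SparkContext:
//   sc.addSparkListener(new AppLifecycleListener)
// or declaratively when submitting the application:
//   --conf spark.extraListeners=my.package.AppLifecycleListener
```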
Caution: FIXME Give a less code-centric description of the times for the events.
SparkListenerApplicationStart(
appName: String,
appId: Option[String],
time: Long,
sparkUser: String,
appAttemptId: Option[String],
driverLogs: Option[Map[String, String]] = None)
SparkListenerApplicationStart is posted when SparkContext does postApplicationStart.
SparkListenerJobStart(
jobId: Int,
time: Long,
stageInfos: Seq[StageInfo],
properties: Properties = null)
SparkListenerJobStart is posted when DAGScheduler does handleJobSubmitted and handleMapStageSubmitted.
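For illustration only (the class name below is made up), a listener can inspect the stages a submitted job consists of through the event's stageInfos:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

// Log how many stages and tasks a submitted job consists of.
class JobStartListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    val numStages = jobStart.stageInfos.size
    val numTasks = jobStart.stageInfos.map(_.numTasks).sum
    println(s"Job ${jobStart.jobId} submitted at ${jobStart.time} " +
      s"with $numStages stage(s) and $numTasks task(s)")
  }
}
```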
SparkListenerStageSubmitted(stageInfo: StageInfo, properties: Properties = null)
SparkListenerStageSubmitted is posted when DAGScheduler does submitMissingTasks.
SparkListenerTaskStart(stageId: Int, stageAttemptId: Int, taskInfo: TaskInfo)
SparkListenerTaskStart is posted when DAGScheduler does handleBeginEvent.
SparkListenerTaskGettingResult(taskInfo: TaskInfo)
SparkListenerTaskGettingResult is posted when DAGScheduler does handleGetTaskResult.
SparkListenerTaskEnd(
stageId: Int,
stageAttemptId: Int,
taskType: String,
reason: TaskEndReason,
taskInfo: TaskInfo,
// may be null if the task has failed
@Nullable taskMetrics: TaskMetrics)
SparkListenerTaskEnd is posted when DAGScheduler does handleTaskCompletion.
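As an illustrative sketch (the class name is made up), an onTaskEnd callback can look at both the end reason and the task metrics; since taskMetrics may be null for failed tasks, it is wrapped in Option first:

```scala
import org.apache.spark.Success
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// taskMetrics may be null when the task has failed, so wrap it in Option.
class TaskEndListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val outcome = taskEnd.reason match {
      case Success => "succeeded"
      case other   => s"ended with $other"
    }
    val runTime = Option(taskEnd.taskMetrics).map(_.executorRunTime).getOrElse(-1L)
    println(s"Task ${taskEnd.taskInfo.taskId} of stage ${taskEnd.stageId} " +
      s"$outcome (executorRunTime: $runTime ms)")
  }
}
```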
SparkListenerStageCompleted(stageInfo: StageInfo)
SparkListenerStageCompleted is posted when DAGScheduler does markStageAsFinished.
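For example (illustrative sketch, class name made up), whether the completed stage actually failed can be read from stageInfo.failureReason:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

// A stage is considered failed when stageInfo.failureReason is defined.
class StageCompletedListener extends SparkListener {
  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
    val info = stageCompleted.stageInfo
    info.failureReason match {
      case Some(reason) => println(s"Stage ${info.stageId} (${info.name}) failed: $reason")
      case None         => println(s"Stage ${info.stageId} (${info.name}) completed")
    }
  }
}
```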
SparkListenerJobEnd(
jobId: Int,
time: Long,
jobResult: JobResult)
SparkListenerJobEnd is posted when DAGScheduler does cleanUpAfterSchedulerStop, handleTaskCompletion, failJobAndIndependentStages, and markMapStageJobAsFinished.
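Since both SparkListenerJobStart and SparkListenerJobEnd carry a time (in milliseconds) and a jobId, a listener can correlate them to measure job duration. The sketch below (class name made up) assumes events are delivered to the listener sequentially on the listener bus thread, so a plain mutable map is enough:

```scala
import scala.collection.mutable
import org.apache.spark.scheduler.{JobSucceeded, SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

// Correlate job start and end events on jobId to compute job duration.
class JobDurationListener extends SparkListener {
  private val startTimes = mutable.Map.empty[Int, Long]

  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    startTimes(jobStart.jobId) = jobStart.time

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    val duration = startTimes.remove(jobEnd.jobId).map(jobEnd.time - _)
    val outcome = if (jobEnd.jobResult == JobSucceeded) "succeeded" else "failed"
    println(s"Job ${jobEnd.jobId} $outcome after ${duration.getOrElse(-1L)} ms")
  }
}
```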
SparkListenerApplicationEnd(time: Long)
SparkListenerApplicationEnd is posted when SparkContext does postApplicationEnd.
SparkListenerEnvironmentUpdate(environmentDetails: Map[String, Seq[(String, String)]])
SparkListenerEnvironmentUpdate is posted when SparkContext does postEnvironmentUpdate.
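As a sketch (the class name is made up, and the "Spark Properties" section name is an assumption based on the sections shown on the Environment tab in web UI), a listener can read the configuration snapshot carried by the event:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerEnvironmentUpdate}

// Print the Spark properties section of the environment details.
class EnvironmentListener extends SparkListener {
  override def onEnvironmentUpdate(environmentUpdate: SparkListenerEnvironmentUpdate): Unit = {
    val sparkProps = environmentUpdate.environmentDetails.getOrElse("Spark Properties", Seq.empty)
    sparkProps.foreach { case (key, value) => println(s"$key = $value") }
  }
}
```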
SparkListenerBlockManagerAdded(
time: Long,
blockManagerId: BlockManagerId,
maxMem: Long)
SparkListenerBlockManagerAdded is posted when BlockManagerMasterEndpoint registers a BlockManager.
SparkListenerBlockManagerRemoved(
time: Long,
blockManagerId: BlockManagerId)
SparkListenerBlockManagerRemoved is posted when BlockManagerMasterEndpoint removes a BlockManager.
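A sketch (illustrative class name) that reacts to both block manager events:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockManagerAdded, SparkListenerBlockManagerRemoved}

// Track BlockManager registrations and removals as they happen.
class BlockManagerTracker extends SparkListener {
  override def onBlockManagerAdded(added: SparkListenerBlockManagerAdded): Unit =
    println(s"BlockManager ${added.blockManagerId} registered with ${added.maxMem} bytes of memory")

  override def onBlockManagerRemoved(removed: SparkListenerBlockManagerRemoved): Unit =
    println(s"BlockManager ${removed.blockManagerId} removed")
}
```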
SparkListenerBlockUpdated(blockUpdatedInfo: BlockUpdatedInfo)
SparkListenerBlockUpdated is posted when BlockManagerMasterEndpoint receives an UpdateBlockInfo message.
SparkListenerUnpersistRDD(rddId: Int)
SparkListenerUnpersistRDD is posted when SparkContext does unpersistRDD.
SparkListenerExecutorAdded(
time: Long,
executorId: String,
executorInfo: ExecutorInfo)
SparkListenerExecutorAdded is posted when the DriverEndpoint RPC endpoint (of CoarseGrainedSchedulerBackend) handles a RegisterExecutor message, when MesosFineGrainedSchedulerBackend does resourceOffers, and when LocalSchedulerBackendEndpoint starts.
SparkListenerExecutorRemoved(
time: Long,
executorId: String,
reason: String)
SparkListenerExecutorRemoved is posted when the DriverEndpoint RPC endpoint (of CoarseGrainedSchedulerBackend) does removeExecutor and when MesosFineGrainedSchedulerBackend does removeExecutor.
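Taken together, the two executor events allow a listener to keep a live count of executors. The sketch below (class name made up) is one way to do it:

```scala
import java.util.concurrent.atomic.AtomicInteger
import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded, SparkListenerExecutorRemoved}

// Keep a live count of executors from the add/remove events.
class ExecutorCountListener extends SparkListener {
  private val liveExecutors = new AtomicInteger(0)

  override def onExecutorAdded(added: SparkListenerExecutorAdded): Unit = {
    val count = liveExecutors.incrementAndGet()
    println(s"Executor ${added.executorId} added on ${added.executorInfo.executorHost} " +
      s"with ${added.executorInfo.totalCores} cores; live executors: $count")
  }

  override def onExecutorRemoved(removed: SparkListenerExecutorRemoved): Unit = {
    val count = liveExecutors.decrementAndGet()
    println(s"Executor ${removed.executorId} removed (${removed.reason}); live executors: $count")
  }
}
```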
The following is the complete list of all known Spark listeners:
- ExecutorsListener that prepares information to be displayed on the Executors tab in web UI.
- SparkFirehoseListener that allows users to receive all SparkListenerEvent events by overriding the single onEvent method only (see the sketch after this list).
- ExecutorAllocationListener
- Web UI and EventLoggingListener listeners
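As a minimal sketch of SparkFirehoseListener (the class name EventLogger is made up), a single onEvent callback receives every event, e.g. to forward all of them to an external system:

```scala
import org.apache.spark.SparkFirehoseListener
import org.apache.spark.scheduler.SparkListenerEvent

// A single onEvent callback receives every SparkListenerEvent.
class EventLogger extends SparkFirehoseListener {
  override def onEvent(event: SparkListenerEvent): Unit =
    println(s"${event.getClass.getSimpleName} received")
}
```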
Caution: FIXME Make it complete.