Type parameters: OUT, OP
@Internal public abstract class StreamTask<OUT,OP extends StreamOperator<OUT>> extends Object implements org.apache.flink.runtime.jobgraph.tasks.TaskInvokable, org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask, org.apache.flink.runtime.jobgraph.tasks.CoordinatedTask, org.apache.flink.runtime.taskmanager.AsyncExceptionHandler, ContainingTaskDetails
Base class for all streaming tasks. A task is the unit of local processing that is deployed and executed by the TaskManagers. Each task runs one or more StreamOperators which form the Task's operator chain. Operators that are chained together execute synchronously in the same thread and hence on the same stream partition. A common case for these chains is successive map/flatmap/filter tasks.
The task chain contains one "head" operator and multiple chained operators. The StreamTask is specialized for the type of the head operator: one-input and two-input tasks, as well as for sources, iteration heads and iteration tails.
The Task class deals with the setup of the streams read by the head operator, and the streams produced by the operators at the ends of the operator chain. Note that the chain may fork and thus have multiple ends.
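To illustrate what "chained operators execute synchronously in the same thread" means, here is a minimal sketch. It is not Flink code: the Output interface and the map/filter helpers are hypothetical stand-ins, and the chain here is linear (no forks). Each operator hands its result to the next one by a direct method call, so a single call on the head walks the whole chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class OperatorChainSketch {

    /** Hypothetical stand-in for the output that connects one chained operator to the next. */
    interface Output<T> {
        void collect(T record);
    }

    /** A "map" operator: transforms the record and pushes it downstream by a direct call. */
    static <I, O> Output<I> map(Function<I, O> fn, Output<O> next) {
        return record -> next.collect(fn.apply(record));
    }

    /** A "filter" operator: forwards the record only when the predicate holds. */
    static <T> Output<T> filter(Predicate<T> keep, Output<T> next) {
        return record -> {
            if (keep.test(record)) {
                next.collect(record);
            }
        };
    }

    /** Builds the chain map -> filter -> map and runs every record through it on the calling thread. */
    static List<String> run(List<Integer> input) {
        List<String> sink = new ArrayList<>();
        Output<String> tail = sink::add;
        Output<Integer> toText = map(i -> "v" + i, tail);
        Output<Integer> bigOnly = filter(i -> i > 4, toText);
        Output<Integer> head = map(i -> i * 2, bigOnly);
        for (Integer record : input) {
            head.collect(record); // one call walks the whole chain synchronously
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3, 4)));
    }
}
```

Because no queue or thread hand-off sits between the operators, records keep their order and never leave the partition they arrived on.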
The life cycle of the task is set up as follows:
-- setInitialState -> provides state of all operators in the chain
-- invoke()
|
+----> Create basic utils (config, etc) and load the chain of operators
+----> operators.setup()
+----> task specific init()
+----> initialize-operator-states()
+----> open-operators()
+----> run()
+----> finish-operators()
+----> close-operators()
+----> common cleanup
+----> task specific cleanup()
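The phase order above can be pictured as a template method. The sketch below is only an illustration under stated assumptions: SketchTask and its trace list are hypothetical, each trace entry stands in for real work, and running the cleanup phases in a finally block is a choice of this sketch rather than something the diagram states.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    /** Hypothetical skeleton; each trace entry stands in for real work done by the task. */
    abstract static class SketchTask {
        final List<String> trace = new ArrayList<>();

        /** Task specific initialization, supplied by the subclass. */
        protected abstract void init() throws Exception;

        /** Task specific processing, supplied by the subclass. */
        protected abstract void run() throws Exception;

        /** Task specific cleanup hook; subclasses may override it. */
        protected void cleanup() {
            trace.add("task specific cleanup");
        }

        /** Fixed phase order; the method is final so subclasses cannot reorder it. */
        public final void invoke() throws Exception {
            try {
                trace.add("load operator chain");
                trace.add("setup operators");
                init();
                trace.add("initialize operator states");
                trace.add("open operators");
                run();
                trace.add("finish operators");
                trace.add("close operators");
            } finally {
                // Assumption of this sketch: cleanup also runs when an earlier phase throws.
                trace.add("common cleanup");
                cleanup();
            }
        }
    }

    /** Runs a trivial task and returns the recorded phase order. */
    static List<String> traceOf(boolean failInRun) {
        SketchTask task = new SketchTask() {
            @Override
            protected void init() {
                trace.add("init");
            }

            @Override
            protected void run() {
                trace.add("run");
                if (failInRun) {
                    throw new IllegalStateException("simulated failure");
                }
            }
        };
        try {
            task.invoke();
        } catch (Exception e) {
            task.trace.add("failed: " + e.getMessage());
        }
        return task.trace;
    }

    public static void main(String[] args) {
        traceOf(false).forEach(System.out::println);
    }
}
```

Subclasses only fill in init() and run(), which mirrors how the abstract init() method of this class is the task specific hook.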
The StreamTask has a lock object called lock. All calls to methods on a StreamOperator must be synchronized on this lock object to ensure that no methods are called concurrently.
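The locking rule can be illustrated with a plain Java monitor. This is a sketch under assumptions, not the class's actual code: CountingOperator is a hypothetical operator with unsynchronized state, and the worker threads stand in for any threads (for example timer threads) that want to call into it.

```java
public class OperatorLockSketch {

    /** Hypothetical operator whose state is not thread-safe on its own. */
    static class CountingOperator {
        private long seen;

        void processElement(long value) {
            seen += value; // read-modify-write, unsafe without external locking
        }

        long getSeen() {
            return seen;
        }
    }

    /** Calls the operator from several threads, always while holding the shared lock. */
    static long runConcurrently(int threads, int callsPerThread) throws InterruptedException {
        final Object lock = new Object();
        final CountingOperator operator = new CountingOperator();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    synchronized (lock) { // every operator call goes through the same lock
                        operator.processElement(1);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        synchronized (lock) {
            return operator.getSeen();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrently(4, 10_000));
    }
}
```

Because every call holds the same monitor, no two operator methods overlap and no update is lost.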
| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.flink.runtime.state.CheckpointStorage | checkpointStorage: Our checkpoint storage. |
| protected StreamConfig | configuration: The configuration of this streaming task. |
| protected StreamInputProcessor | inputProcessor: The input processor. |
| protected static org.slf4j.Logger | LOG: The logger used by the StreamTask and its subclasses. |
| protected MailboxProcessor | mailboxProcessor |
| protected OP | mainOperator: The main operator that consumes the input streams of this task. |
| protected OperatorChain<OUT,OP> | operatorChain: The chain of operators executed by this task. |
| protected org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate<org.apache.flink.runtime.plugable.SerializationDelegate<StreamRecord<OUT>>> | recordWriter |
| protected org.apache.flink.runtime.state.StateBackend | stateBackend: Our state backend. |
| protected TimerService | systemTimerService: In contrast to timerService we should not register any user timers here. |
| protected TimerService | timerService: The internal TimerService used to define the current processing time (default = System.currentTimeMillis()) and register timers for tasks to be executed in the future. |
| static ThreadGroup | TRIGGER_THREAD_GROUP: The thread group that holds all trigger timer threads. |
| Modifier | Constructor and Description |
|---|---|
| protected | StreamTask(org.apache.flink.runtime.execution.Environment env): Constructor for initialization, possibly with initial state (recovery / savepoint / etc). |
| protected | StreamTask(org.apache.flink.runtime.execution.Environment env, TimerService timerService): Constructor for initialization, possibly with initial state (recovery / savepoint / etc). |
| protected | StreamTask(org.apache.flink.runtime.execution.Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler) |
| protected | StreamTask(org.apache.flink.runtime.execution.Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor): Constructor for initialization, possibly with initial state (recovery / savepoint / etc). |
| protected | StreamTask(org.apache.flink.runtime.execution.Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor, TaskMailbox mailbox) |
| Modifier and Type | Method and Description |
|---|---|
| void | abortCheckpointOnBarrier(long checkpointId, org.apache.flink.runtime.checkpoint.CheckpointException cause) |
| protected void | advanceToEndOfEventTime(): Emits the MAX_WATERMARK so that all registered timers are fired. |
| protected void | afterInvoke() |
| void | cancel() |
| protected void | cancelTask() |
| void | cleanUp(Throwable throwable) |
| protected void | cleanUpInternal() |
| static <OUT> org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate<org.apache.flink.runtime.plugable.SerializationDelegate<StreamRecord<OUT>>> | createRecordWriterDelegate(StreamConfig configuration, org.apache.flink.runtime.execution.Environment environment) |
| StreamTaskStateInitializer | createStreamTaskStateInitializer() |
| protected void | declineCheckpoint(long checkpointId) |
| void | dispatchOperatorEvent(org.apache.flink.runtime.jobgraph.OperatorID operator, org.apache.flink.util.SerializedValue<org.apache.flink.runtime.operators.coordination.OperatorEvent> event) |
| protected void | endData(org.apache.flink.runtime.io.network.api.StopMode mode) |
| protected void | finalize(): The finalize method shuts down the timer. |
| protected long | getAsyncCheckpointStartDelayNanos() |
| ExecutorService | getAsyncOperationsThreadPool() |
| org.apache.flink.core.fs.CloseableRegistry | getCancelables() |
| protected Optional<CheckpointBarrierHandler> | getCheckpointBarrierHandler(): Acquires the optional CheckpointBarrierHandler associated with this stream task. |
| org.apache.flink.runtime.state.CheckpointStorageWorkerView | getCheckpointStorage() |
| protected CompletableFuture<Void> | getCompletionFuture() |
| StreamConfig | getConfiguration() |
| org.apache.flink.runtime.execution.Environment | getEnvironment() |
| MailboxExecutorFactory | getMailboxExecutorFactory() |
| String | getName(): Gets the name of the task, in the form "taskname (2/5)". |
| ProcessingTimeServiceFactory | getProcessingTimeServiceFactory() |
| void | handleAsyncException(String message, Throwable exception): Handles an exception thrown by another thread (e.g. a TriggerTask), other than the one executing the main task by failing the task entirely. |
| protected abstract void | init() |
| void | invoke() |
| boolean | isCanceled() |
| boolean | isFailing() |
| boolean | isMailboxLoopRunning() |
| boolean | isRunning() |
| boolean | isUsingNonBlockingInput() |
| void | maybeInterruptOnCancel(Thread toInterrupt, String taskName, Long timeout) |
| Future<Void> | notifyCheckpointAbortAsync(long checkpointId, long latestCompletedCheckpointId) |
| Future<Void> | notifyCheckpointCompleteAsync(long checkpointId) |
| Future<Void> | notifyCheckpointSubsumedAsync(long checkpointId) |
| protected void | processInput(MailboxDefaultAction.Controller controller): This method implements the default action of the task (e.g. processing one event from the input). |
| void | restore() |
| void | runMailboxLoop() |
| boolean | runMailboxStep() |
| protected void | setSynchronousSavepoint(long checkpointId) |
| protected org.apache.flink.metrics.Counter | setupNumRecordsInCounter(StreamOperator streamOperator) |
| String | toString() |
| CompletableFuture<Boolean> | triggerCheckpointAsync(org.apache.flink.runtime.checkpoint.CheckpointMetaData checkpointMetaData, org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions) |
| void | triggerCheckpointOnBarrier(org.apache.flink.runtime.checkpoint.CheckpointMetaData checkpointMetaData, org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions, org.apache.flink.runtime.checkpoint.CheckpointMetricsBuilder checkpointMetrics) |
Inherited methods: getExecutionConfig, getIndexInSubtaskGroup, getJobConfiguration, getUserCodeClassLoader
public static final ThreadGroup TRIGGER_THREAD_GROUP
protected static final org.slf4j.Logger LOG
@Nullable protected StreamInputProcessor inputProcessor
The input processor. Initialized in the init() method.
protected OP extends StreamOperator<OUT> mainOperator
protected OperatorChain<OUT,OP extends StreamOperator<OUT>> operatorChain
protected final StreamConfig configuration
protected final org.apache.flink.runtime.state.StateBackend stateBackend
protected final org.apache.flink.runtime.state.CheckpointStorage checkpointStorage
protected final TimerService timerService
The internal TimerService used to define the current processing time (default = System.currentTimeMillis()) and register timers for tasks to be executed in the future.
protected final TimerService systemTimerService
In contrast to timerService we should not register any user timers here. It should be used only for system level timers.
protected final org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate<org.apache.flink.runtime.plugable.SerializationDelegate<StreamRecord<OUT>>> recordWriter
protected final MailboxProcessor mailboxProcessor
protected StreamTask(org.apache.flink.runtime.execution.Environment env) throws Exception
Parameters:
env - The task environment for this task.
Throws:
Exception
protected StreamTask(org.apache.flink.runtime.execution.Environment env, @Nullable TimerService timerService) throws Exception
Parameters:
env - The task environment for this task.
timerService - Optionally, a specific timer service to use.
Throws:
Exception
protected StreamTask(org.apache.flink.runtime.execution.Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler) throws Exception
Throws:
Exception
protected StreamTask(org.apache.flink.runtime.execution.Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor) throws Exception
This constructor accepts a special TimerService. By default (and if null is passed for the timer service) a DefaultTimerService will be used.
Parameters:
environment - The task environment for this task.
timerService - Optionally, a specific timer service to use.
uncaughtExceptionHandler - to handle uncaught exceptions in the async operations thread pool
actionExecutor - a means to wrap all actions performed by this task thread. Currently, only SynchronizedActionExecutor can be used to preserve locking semantics.
Throws:
Exception
protected StreamTask(org.apache.flink.runtime.execution.Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor, TaskMailbox mailbox) throws Exception
Throws:
Exception
protected void processInput(MailboxDefaultAction.Controller controller) throws Exception
Parameters:
controller - controller object for collaborative interaction between the action and the stream task.
Throws:
Exception - on any problems in the action.
protected void endData(org.apache.flink.runtime.io.network.api.StopMode mode) throws Exception
Throws:
Exception
protected void setSynchronousSavepoint(long checkpointId)
protected void advanceToEndOfEventTime() throws Exception
Emits the MAX_WATERMARK so that all registered timers are fired.
This is used by the source task when the job is TERMINATED. In that case, we want all the timers registered throughout the pipeline to fire and the related state (e.g. windows) to be flushed.
For tasks other than the source task, this method does nothing.
Throws:
Exception
public StreamTaskStateInitializer createStreamTaskStateInitializer()
protected org.apache.flink.metrics.Counter setupNumRecordsInCounter(StreamOperator streamOperator)
public final void restore() throws Exception
Specified by: restore in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
Throws:
Exception
public final void invoke() throws Exception
Specified by: invoke in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
Throws:
Exception
@VisibleForTesting public boolean isMailboxLoopRunning()
public final void cleanUp(Throwable throwable) throws Exception
Specified by: cleanUp in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
Throws:
Exception
protected CompletableFuture<Void> getCompletionFuture()
public final void cancel() throws Exception
Specified by: cancel in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
Throws:
Exception
public MailboxExecutorFactory getMailboxExecutorFactory()
public final boolean isRunning()
public final boolean isCanceled()
public final boolean isFailing()
protected void finalize() throws Throwable
The finalize method shuts down the timer. This should not be relied upon! It will cause shutdown to happen much later than if manual shutdown is attempted, and cause threads to linger for longer than needed.
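As an illustration of the manual shutdown this note recommends, the sketch below closes a timer deterministically instead of waiting for finalization. It is not the class's own shutdown path: SketchTimerService is a hypothetical wrapper around a standard ScheduledExecutorService, and the five-second wait is an arbitrary choice.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExplicitShutdownSketch {

    /** Hypothetical timer wrapper that is shut down explicitly through close(). */
    static class SketchTimerService implements AutoCloseable {
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();

        /** Registers a timer; this also starts the worker thread. */
        void registerTimer(Runnable action, long delayMillis) {
            timer.schedule(action, delayMillis, TimeUnit.MILLISECONDS);
        }

        boolean isTerminated() {
            return timer.isTerminated();
        }

        /** Cancels pending timers and waits briefly for the worker thread to exit. */
        @Override
        public void close() throws InterruptedException {
            timer.shutdownNow();
            timer.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    /** Returns true when the timer thread is gone right after the try block ends. */
    static boolean runAndShutDown() throws Exception {
        SketchTimerService service = new SketchTimerService();
        try (SketchTimerService scoped = service) {
            scoped.registerTimer(() -> { }, TimeUnit.HOURS.toMillis(1));
        } // close() runs here, not at some later garbage collection
        return service.isTerminated();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAndShutDown());
    }
}
```

With try-with-resources the timer thread stops at a known point, whereas a finalizer runs only if and when the garbage collector gets to the object.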
public final String getName()
public org.apache.flink.runtime.state.CheckpointStorageWorkerView getCheckpointStorage()
public StreamConfig getConfiguration()
public CompletableFuture<Boolean> triggerCheckpointAsync(org.apache.flink.runtime.checkpoint.CheckpointMetaData checkpointMetaData, org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions)
Specified by: triggerCheckpointAsync in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
protected Optional<CheckpointBarrierHandler> getCheckpointBarrierHandler()
Acquires the optional CheckpointBarrierHandler associated with this stream task. The CheckpointBarrierHandler should exist if the task has data inputs and needs to align the barriers.
public void triggerCheckpointOnBarrier(org.apache.flink.runtime.checkpoint.CheckpointMetaData checkpointMetaData, org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions, org.apache.flink.runtime.checkpoint.CheckpointMetricsBuilder checkpointMetrics) throws IOException
Specified by: triggerCheckpointOnBarrier in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
Throws:
IOException
public void abortCheckpointOnBarrier(long checkpointId, org.apache.flink.runtime.checkpoint.CheckpointException cause) throws IOException
Specified by: abortCheckpointOnBarrier in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
Throws:
IOException
protected void declineCheckpoint(long checkpointId)
public final ExecutorService getAsyncOperationsThreadPool()
public Future<Void> notifyCheckpointCompleteAsync(long checkpointId)
Specified by: notifyCheckpointCompleteAsync in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
public Future<Void> notifyCheckpointAbortAsync(long checkpointId, long latestCompletedCheckpointId)
Specified by: notifyCheckpointAbortAsync in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
public Future<Void> notifyCheckpointSubsumedAsync(long checkpointId)
Specified by: notifyCheckpointSubsumedAsync in interface org.apache.flink.runtime.jobgraph.tasks.CheckpointableTask
public void dispatchOperatorEvent(org.apache.flink.runtime.jobgraph.OperatorID operator, org.apache.flink.util.SerializedValue<org.apache.flink.runtime.operators.coordination.OperatorEvent> event) throws org.apache.flink.util.FlinkException
Specified by: dispatchOperatorEvent in interface org.apache.flink.runtime.jobgraph.tasks.CoordinatedTask
Throws:
org.apache.flink.util.FlinkException
public ProcessingTimeServiceFactory getProcessingTimeServiceFactory()
public void handleAsyncException(String message, Throwable exception)
Handles an exception thrown by another thread (e.g. a TriggerTask), other than the one executing the main task by failing the task entirely.
In more detail, it marks task execution failed for an external reason (a reason other than the task code itself throwing an exception). If the task is already in a terminal state (such as FINISHED, CANCELED, FAILED), or if the task is already canceling this does nothing. Otherwise it sets the state to FAILED, and, if the invokable code is running, starts an asynchronous thread that aborts that code.
This method never blocks.
Specified by: handleAsyncException in interface org.apache.flink.runtime.taskmanager.AsyncExceptionHandler
public final org.apache.flink.core.fs.CloseableRegistry getCancelables()
@VisibleForTesting public static <OUT> org.apache.flink.runtime.io.network.api.writer.RecordWriterDelegate<org.apache.flink.runtime.plugable.SerializationDelegate<StreamRecord<OUT>>> createRecordWriterDelegate(StreamConfig configuration, org.apache.flink.runtime.execution.Environment environment)
protected long getAsyncCheckpointStartDelayNanos()
public boolean isUsingNonBlockingInput()
Specified by: isUsingNonBlockingInput in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
public void maybeInterruptOnCancel(Thread toInterrupt, @Nullable String taskName, @Nullable Long timeout)
Specified by: maybeInterruptOnCancel in interface org.apache.flink.runtime.jobgraph.tasks.TaskInvokable
public final org.apache.flink.runtime.execution.Environment getEnvironment()
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.