Interface DataflowPipelineDebugOptions
-
- All Superinterfaces:
org.apache.beam.sdk.options.ExperimentalOptions, org.apache.beam.sdk.transforms.display.HasDisplayData, org.apache.beam.sdk.options.MemoryMonitorOptions, org.apache.beam.sdk.options.PipelineOptions
- All Known Subinterfaces:
DataflowPipelineOptions, DataflowWorkerHarnessOptions, TestDataflowPipelineOptions
@Hidden public interface DataflowPipelineDebugOptions extends org.apache.beam.sdk.options.ExperimentalOptions, org.apache.beam.sdk.options.MemoryMonitorOptions, org.apache.beam.sdk.options.PipelineOptions
Internal. Options used to control execution of the Dataflow SDK for debugging and testing purposes.
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Interface, Description
static class DataflowPipelineDebugOptions.DataflowClientFactory: Returns the default Dataflow client built from the passed in PipelineOptions.
static class DataflowPipelineDebugOptions.StagerFactory: Creates a Stager object using the class specified in getStagerClass().
static class DataflowPipelineDebugOptions.UnboundedReaderMaxReadTimeFactory: Sets Integer value based on old, deprecated field (getUnboundedReaderMaxReadTimeSec()).
-
Nested classes/interfaces inherited from interface org.apache.beam.sdk.options.PipelineOptions
org.apache.beam.sdk.options.PipelineOptions.AtomicLongFactory, org.apache.beam.sdk.options.PipelineOptions.CheckEnabled, org.apache.beam.sdk.options.PipelineOptions.DirectRunner, org.apache.beam.sdk.options.PipelineOptions.JobNameFactory, org.apache.beam.sdk.options.PipelineOptions.UserAgentFactory
-
-
Method Summary
Modifier and Type, Method, Description
java.lang.String getApiRootUrl(): The root URL for the Dataflow API.
com.google.api.services.dataflow.Dataflow getDataflowClient(): An instance of the Dataflow client.
java.lang.String getDataflowEndpoint(): Dataflow endpoint to use.
java.lang.String getDataflowJobFile(): The path to write the translated Dataflow job specification out to at job submission time.
int getDesiredNumUnboundedSourceSplits(): The desired number of initial splits for UnboundedSources.
boolean getDumpHeapOnOOM(): If true, save a heap dump before killing a thread or process which is GC thrashing or out of memory.
int getJfrRecordingDurationSec()
int getNumberOfWorkerHarnessThreads(): Number of threads to use on the Dataflow worker harness.
java.lang.Integer getReaderCacheTimeoutSec(): The amount of time before UnboundedReaders are considered idle and closed during streaming execution.
boolean getRecordJfrOnGcThrashing(): If true, save a JFR profile when GC thrashing is first detected.
java.lang.String getSaveHeapDumpsToGcsPath(): CAUTION: This option implies dumpHeapOnOOM, and has similar caveats.
java.lang.String getSdkHarnessContainerImageOverrides(): Overrides for SDK harness container images.
Stager getStager(): The resource stager instance that should be used to stage resources.
java.lang.Class<? extends Stager> getStagerClass(): The class responsible for staging resources to be accessible by workers during job execution.
java.util.Map<java.lang.String,java.lang.String> getTransformNameMapping(): Mapping of old PTransform names to new ones, specified as JSON {"oldName":"newName",...}.
java.lang.Integer getUnboundedReaderMaxElements(): The max elements read from an UnboundedReader before checkpointing.
java.lang.Integer getUnboundedReaderMaxReadTimeMs(): The max amount of time an UnboundedReader is consumed before checkpointing.
java.lang.Integer getUnboundedReaderMaxReadTimeSec(): Deprecated. Use getUnboundedReaderMaxReadTimeMs() instead.
java.lang.Integer getUnboundedReaderMaxWaitForElementsMs(): The max amount of time waiting for elements when reading from UnboundedReader.
java.lang.Integer getWorkerCacheMb(): The size of the worker's in-memory cache, in megabytes.
void setApiRootUrl(java.lang.String value)
void setDataflowClient(com.google.api.services.dataflow.Dataflow value)
void setDataflowEndpoint(java.lang.String value)
void setDataflowJobFile(java.lang.String value)
void setDesiredNumUnboundedSourceSplits(int value)
void setDumpHeapOnOOM(boolean dumpHeapBeforeExit)
void setJfrRecordingDurationSec(int value)
void setNumberOfWorkerHarnessThreads(int value)
void setReaderCacheTimeoutSec(java.lang.Integer value)
void setRecordJfrOnGcThrashing(boolean value)
void setSaveHeapDumpsToGcsPath(java.lang.String gcsPath)
void setSdkHarnessContainerImageOverrides(java.lang.String value)
void setStager(Stager stager)
void setStagerClass(java.lang.Class<? extends Stager> stagerClass)
void setTransformNameMapping(java.util.Map<java.lang.String,java.lang.String> value)
void setUnboundedReaderMaxElements(java.lang.Integer value)
void setUnboundedReaderMaxReadTimeMs(java.lang.Integer value)
void setUnboundedReaderMaxReadTimeSec(java.lang.Integer value)
void setUnboundedReaderMaxWaitForElementsMs(java.lang.Integer value)
void setWorkerCacheMb(java.lang.Integer value)
-
Methods inherited from interface org.apache.beam.sdk.options.ExperimentalOptions
getExperiments, setExperiments
-
Methods inherited from interface org.apache.beam.sdk.transforms.display.HasDisplayData
populateDisplayData
-
-
-
-
Method Detail
-
getApiRootUrl
@String("https://dataflow.googleapis.com/") java.lang.String getApiRootUrl()
The root URL for the Dataflow API. dataflowEndpoint can override this value if it contains an absolute URL; otherwise apiRootUrl will be combined with dataflowEndpoint to generate the full URL to communicate with the Dataflow API.
-
setApiRootUrl
void setApiRootUrl(java.lang.String value)
-
getDataflowEndpoint
@String("") java.lang.String getDataflowEndpoint()
Dataflow endpoint to use. Defaults to the current version of the Google Cloud Dataflow API, at the time the current SDK version was released.
If the string contains "://", then this is treated as a URL; otherwise getApiRootUrl() is used as the root URL.
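The resolution rule for apiRootUrl and dataflowEndpoint can be sketched in plain Java. This is only an illustration under assumptions: EndpointResolver and resolve are hypothetical names that do not exist in the SDK, and plain string concatenation stands in for however the SDK actually combines the two values.

```java
/** Hypothetical helper for illustration; not part of the Beam SDK. */
final class EndpointResolver {
  private EndpointResolver() {}

  /**
   * Returns the URL to use for the Dataflow API.
   * Assumption: a relative endpoint is appended to the root URL verbatim.
   */
  static String resolve(String apiRootUrl, String dataflowEndpoint) {
    // An endpoint containing "://" is treated as an absolute URL and overrides the root.
    if (dataflowEndpoint.contains("://")) {
      return dataflowEndpoint;
    }
    // Otherwise the endpoint is interpreted relative to the API root URL.
    return apiRootUrl + dataflowEndpoint;
  }

  public static void main(String[] args) {
    String root = "https://dataflow.googleapis.com/";
    System.out.println(resolve(root, ""));
    System.out.println(resolve(root, "https://example.test/custom/"));
  }
}
```

With the documented defaults (root URL "https://dataflow.googleapis.com/" and an empty endpoint), this sketch falls through to the root URL unchanged.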
-
setDataflowEndpoint
void setDataflowEndpoint(java.lang.String value)
-
getDataflowJobFile
java.lang.String getDataflowJobFile()
The path to write the translated Dataflow job specification out to at job submission time. The Dataflow job specification will be represented in JSON format.
-
setDataflowJobFile
void setDataflowJobFile(java.lang.String value)
-
getStagerClass
@Class(GcsStager.class) java.lang.Class<? extends Stager> getStagerClass()
The class responsible for staging resources to be accessible by workers during job execution. If stager has not been set explicitly, an instance of this class will be created and used as the resource stager.
-
setStagerClass
void setStagerClass(java.lang.Class<? extends Stager> stagerClass)
-
getStager
@InstanceFactory(StagerFactory.class) Stager getStager()
The resource stager instance that should be used to stage resources. If no stager has been set explicitly, the default is to use the instance factory that constructs a resource stager based upon the currently set stagerClass.
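The fallback between stager and stagerClass might look roughly like the sketch below. Everything in it is a stand-in: the nested Stager interface, DefaultStager, and resolve are hypothetical, and the sketch assumes the stager class has an accessible no-argument constructor, which may not match how the real StagerFactory builds its instance.

```java
/** Hypothetical stand-ins for illustration; the real Stager type lives in the Dataflow runner. */
final class StagerResolution {
  private StagerResolution() {}

  /** Placeholder for the runner's Stager type. */
  interface Stager {}

  /** Plays the role of the default stager class in this sketch. */
  static final class DefaultStager implements Stager {}

  /**
   * Returns the explicitly set stager, or creates one from stagerClass.
   * Assumption: stagerClass has an accessible no-argument constructor.
   */
  static Stager resolve(Stager explicitStager, Class<? extends Stager> stagerClass) {
    if (explicitStager != null) {
      // A stager set explicitly always takes precedence.
      return explicitStager;
    }
    try {
      // No explicit stager: instantiate the configured class reflectively.
      return stagerClass.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + stagerClass.getName(), e);
    }
  }
}
```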
-
setStager
void setStager(Stager stager)
-
getDataflowClient
@InstanceFactory(DataflowClientFactory.class) com.google.api.services.dataflow.Dataflow getDataflowClient()
An instance of the Dataflow client. Defaults to creating a Dataflow client using the current set of options.
-
setDataflowClient
void setDataflowClient(com.google.api.services.dataflow.Dataflow value)
-
getTransformNameMapping
java.util.Map<java.lang.String,java.lang.String> getTransformNameMapping()
Mapping of old PTransform names to new ones, specified as JSON {"oldName":"newName",...}. To mark a transform as deleted, make newName the empty string.
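To make the semantics of the mapping concrete, here is a small self-contained sketch. The class, the apply helper, and the transform names are all hypothetical, and the sketch assumes that names absent from the mapping carry over unchanged, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of how an old-to-new transform name mapping reads. */
final class TransformNameMappingSketch {
  private TransformNameMappingSketch() {}

  /**
   * Applies the mapping to a list of old transform names.
   * A non-empty new name renames the transform; an empty new name marks it deleted.
   * Assumption: names that are not in the mapping are kept as they are.
   */
  static List<String> apply(List<String> oldNames, Map<String, String> mapping) {
    List<String> result = new ArrayList<>();
    for (String oldName : oldNames) {
      String newName = mapping.getOrDefault(oldName, oldName);
      if (!newName.isEmpty()) {
        result.add(newName);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> mapping = new LinkedHashMap<>();
    mapping.put("ParseEvents", "ParseEventsV2"); // renamed
    mapping.put("DebugLog", "");                 // deleted
    System.out.println(apply(Arrays.asList("ReadInput", "ParseEvents", "DebugLog"), mapping));
  }
}
```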
-
setTransformNameMapping
void setTransformNameMapping(java.util.Map<java.lang.String,java.lang.String> value)
-
getNumberOfWorkerHarnessThreads
int getNumberOfWorkerHarnessThreads()
Number of threads to use on the Dataflow worker harness. If left unspecified, the Dataflow service will compute an appropriate number of threads to use.
-
setNumberOfWorkerHarnessThreads
void setNumberOfWorkerHarnessThreads(int value)
-
getDumpHeapOnOOM
boolean getDumpHeapOnOOM()
If true, save a heap dump before killing a thread or process which is GC thrashing or out of memory. The location of the heap file will either be echoed back to the user, or the user will be given the opportunity to download the heap file.
CAUTION: Heap dumps can be of comparable size to the default boot disk. Consider increasing the boot disk size before setting this flag to true.
-
setDumpHeapOnOOM
void setDumpHeapOnOOM(boolean dumpHeapBeforeExit)
-
getRecordJfrOnGcThrashing
boolean getRecordJfrOnGcThrashing()
If true, save a JFR profile when GC thrashing is first detected. The profile will run for the amount of time set by --jfrRecordingDurationSec, or 60 seconds by default.
Note: JFR profiles are only supported on Java 9 and up.
-
setRecordJfrOnGcThrashing
void setRecordJfrOnGcThrashing(boolean value)
-
getJfrRecordingDurationSec
@Integer(60) int getJfrRecordingDurationSec()
-
setJfrRecordingDurationSec
void setJfrRecordingDurationSec(int value)
-
getWorkerCacheMb
@Integer(100) java.lang.Integer getWorkerCacheMb()
The size of the worker's in-memory cache, in megabytes. Currently, this cache is used for storing read values of side inputs in batch as well as the user state for streaming jobs.
-
setWorkerCacheMb
void setWorkerCacheMb(java.lang.Integer value)
-
getReaderCacheTimeoutSec
@Integer(60) java.lang.Integer getReaderCacheTimeoutSec()
The amount of time before UnboundedReaders are considered idle and closed during streaming execution.
-
setReaderCacheTimeoutSec
void setReaderCacheTimeoutSec(java.lang.Integer value)
-
getUnboundedReaderMaxReadTimeSec
@Deprecated @Integer(10) java.lang.Integer getUnboundedReaderMaxReadTimeSec()
Deprecated. Use getUnboundedReaderMaxReadTimeMs() instead.
The max amount of time an UnboundedReader is consumed before checkpointing.
-
setUnboundedReaderMaxReadTimeSec
void setUnboundedReaderMaxReadTimeSec(java.lang.Integer value)
-
getUnboundedReaderMaxReadTimeMs
@InstanceFactory(UnboundedReaderMaxReadTimeFactory.class) java.lang.Integer getUnboundedReaderMaxReadTimeMs()
The max amount of time an UnboundedReader is consumed before checkpointing.
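Since this option's default comes from UnboundedReaderMaxReadTimeFactory, which derives it from the deprecated getUnboundedReaderMaxReadTimeSec() (default 10), the fallback can be pictured as below. This is a sketch under assumptions: the class and method names are hypothetical, and the conversion is assumed to be a plain seconds-to-milliseconds multiplication.

```java
/** Hypothetical illustration of deriving the millisecond limit from the deprecated seconds option. */
final class MaxReadTimeSketch {
  private MaxReadTimeSketch() {}

  /** Default of the deprecated seconds-based option, per its @Integer(10) annotation. */
  static final int DEFAULT_MAX_READ_TIME_SEC = 10;

  /**
   * Returns the effective max read time in milliseconds.
   * Assumption: an explicit millisecond value wins; otherwise seconds are multiplied by 1000.
   */
  static int resolveMaxReadTimeMs(Integer explicitMs, Integer deprecatedSec) {
    if (explicitMs != null) {
      return explicitMs;
    }
    int sec = deprecatedSec != null ? deprecatedSec : DEFAULT_MAX_READ_TIME_SEC;
    return sec * 1000;
  }
}
```

Under these assumptions, a pipeline that still sets only the deprecated option to 3 seconds would end up with a 3000 ms limit.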
-
setUnboundedReaderMaxReadTimeMs
void setUnboundedReaderMaxReadTimeMs(java.lang.Integer value)
-
getUnboundedReaderMaxElements
@Integer(10000) java.lang.Integer getUnboundedReaderMaxElements()
The max elements read from an UnboundedReader before checkpointing.
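One way to picture how the element limit and the read-time limit interact is a read loop that stops at whichever limit is reached first. The sketch below is not the worker's actual loop: BoundedReadSketch, readBatch, and the injected clock are hypothetical, and "first limit reached ends the batch" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical illustration of reading until an element or time limit is hit. */
final class BoundedReadSketch {
  private BoundedReadSketch() {}

  /**
   * Reads from source until maxElements have been read, maxReadTimeMs has elapsed
   * according to clockMs, or the source has nothing more to offer.
   * The point where this method returns is where a checkpoint would be taken.
   */
  static <T> List<T> readBatch(
      Iterator<T> source, int maxElements, long maxReadTimeMs, LongSupplier clockMs) {
    List<T> batch = new ArrayList<>();
    long start = clockMs.getAsLong();
    while (batch.size() < maxElements
        && clockMs.getAsLong() - start < maxReadTimeMs
        && source.hasNext()) {
      batch.add(source.next());
    }
    return batch;
  }

  public static void main(String[] args) {
    Iterator<Integer> source = Arrays.asList(1, 2, 3, 4, 5).iterator();
    // Element limit of 3 with a generous time limit, using the wall clock.
    System.out.println(readBatch(source, 3, 10_000L, System::currentTimeMillis));
  }
}
```

Passing the clock in as a parameter is a design choice of this sketch so that the time limit can be exercised deterministically.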
-
setUnboundedReaderMaxElements
void setUnboundedReaderMaxElements(java.lang.Integer value)
-
getUnboundedReaderMaxWaitForElementsMs
@Integer(1000) java.lang.Integer getUnboundedReaderMaxWaitForElementsMs()
The max amount of time waiting for elements when reading from UnboundedReader.
-
setUnboundedReaderMaxWaitForElementsMs
void setUnboundedReaderMaxWaitForElementsMs(java.lang.Integer value)
-
getDesiredNumUnboundedSourceSplits
@Integer(0) int getDesiredNumUnboundedSourceSplits()
The desired number of initial splits for UnboundedSources. If this value is <=0, the splits will be computed based on the number of user workers.
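The "explicit value or worker-based fallback" rule can be sketched as follows. The names are hypothetical, and SPLITS_PER_WORKER in particular is an invented placeholder: the text above only says the fallback is based on the number of user workers, not how.

```java
/** Hypothetical illustration of choosing the initial split count for an UnboundedSource. */
final class UnboundedSplitSketch {
  private UnboundedSplitSketch() {}

  /** Invented multiplier for this sketch; the SDK's real heuristic is not documented here. */
  static final int SPLITS_PER_WORKER = 4;

  /**
   * Returns desiredNumSplits when it is positive; otherwise derives a value
   * from the worker count (assumption: at least one worker, times a fixed factor).
   */
  static int initialSplits(int desiredNumSplits, int numWorkers) {
    if (desiredNumSplits > 0) {
      return desiredNumSplits;
    }
    return Math.max(1, numWorkers) * SPLITS_PER_WORKER;
  }
}
```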
-
setDesiredNumUnboundedSourceSplits
void setDesiredNumUnboundedSourceSplits(int value)
-
getSaveHeapDumpsToGcsPath
java.lang.String getSaveHeapDumpsToGcsPath()
CAUTION: This option implies dumpHeapOnOOM, and has similar caveats. Specifically, heap dumps can be of comparable size to the default boot disk. Consider increasing the boot disk size before setting this flag to true.
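The statement that this option implies dumpHeapOnOOM can be read as the following rule. This is a minimal sketch with hypothetical names, and it assumes that a null or empty path counts as "not set".

```java
/** Hypothetical illustration of how a GCS heap dump path implies dumpHeapOnOOM. */
final class HeapDumpSketch {
  private HeapDumpSketch() {}

  /**
   * Returns whether a heap dump should be taken on OOM or GC thrashing.
   * Assumption: a null or empty saveHeapDumpsToGcsPath means the option is unset.
   */
  static boolean shouldDumpHeap(boolean dumpHeapOnOOM, String saveHeapDumpsToGcsPath) {
    boolean gcsPathSet = saveHeapDumpsToGcsPath != null && !saveHeapDumpsToGcsPath.isEmpty();
    return dumpHeapOnOOM || gcsPathSet;
  }
}
```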
-
setSaveHeapDumpsToGcsPath
void setSaveHeapDumpsToGcsPath(java.lang.String gcsPath)
-
getSdkHarnessContainerImageOverrides
java.lang.String getSdkHarnessContainerImageOverrides()
Overrides for SDK harness container images.
-
setSdkHarnessContainerImageOverrides
void setSdkHarnessContainerImageOverrides(java.lang.String value)
-
-