Interface DataflowPipelineDebugOptions

  • All Superinterfaces:
    org.apache.beam.sdk.options.ExperimentalOptions, org.apache.beam.sdk.transforms.display.HasDisplayData, org.apache.beam.sdk.options.MemoryMonitorOptions, org.apache.beam.sdk.options.PipelineOptions
    All Known Subinterfaces:
    DataflowPipelineOptions, DataflowWorkerHarnessOptions, TestDataflowPipelineOptions

    @Hidden
    public interface DataflowPipelineDebugOptions
    extends org.apache.beam.sdk.options.ExperimentalOptions, org.apache.beam.sdk.options.MemoryMonitorOptions, org.apache.beam.sdk.options.PipelineOptions
    Internal. Options used to control execution of the Dataflow SDK for debugging and testing purposes.
    • Method Detail

      • getApiRootUrl

        @String("https://dataflow.googleapis.com/")
        java.lang.String getApiRootUrl()
        The root URL for the Dataflow API. If dataflowEndpoint contains an absolute URL, it overrides this value; otherwise, apiRootUrl is combined with dataflowEndpoint to form the full URL used to communicate with the Dataflow API.
      • setApiRootUrl

        void setApiRootUrl​(java.lang.String value)
      • getDataflowEndpoint

        @String("")
        java.lang.String getDataflowEndpoint()
        Dataflow endpoint to use.

        Defaults to the current version of the Google Cloud Dataflow API, at the time the current SDK version was released.

        If the string contains "://", it is treated as a URL; otherwise, getApiRootUrl() is used as the root URL.

      • setDataflowEndpoint

        void setDataflowEndpoint​(java.lang.String value)
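        The interaction between apiRootUrl and dataflowEndpoint described above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the class EndpointResolution and its resolve method are hypothetical, and joining the two values by plain string concatenation is an assumption about what "combined" means.

```java
final class EndpointResolution {
    /** Hypothetical helper: returns the URL that would be used to reach the API. */
    static String resolve(String apiRootUrl, String dataflowEndpoint) {
        if (dataflowEndpoint.contains("://")) {
            // The endpoint is already an absolute URL, so it overrides the root URL.
            return dataflowEndpoint;
        }
        // Otherwise the endpoint is treated as a path under the root URL
        // (assumption: simple concatenation with exactly one separating slash).
        String root = apiRootUrl.endsWith("/") ? apiRootUrl : apiRootUrl + "/";
        String path = dataflowEndpoint.startsWith("/")
                ? dataflowEndpoint.substring(1)
                : dataflowEndpoint;
        return root + path;
    }

    public static void main(String[] args) {
        // Default values from the annotations: root URL plus an empty endpoint.
        System.out.println(resolve("https://dataflow.googleapis.com/", ""));
        // An absolute endpoint (placeholder host) replaces the root URL entirely.
        System.out.println(resolve("https://dataflow.googleapis.com/", "https://example.test/api/"));
    }
}
```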
      • getDataflowJobFile

        java.lang.String getDataflowJobFile()
        The path to which the translated Dataflow job specification is written at job submission time. The specification is written in JSON format.
      • setDataflowJobFile

        void setDataflowJobFile​(java.lang.String value)
      • getStagerClass

        @Class(GcsStager.class)
        java.lang.Class<? extends Stager> getStagerClass()
        The class responsible for staging resources to be accessible by workers during job execution. If stager has not been set explicitly, an instance of this class will be created and used as the resource stager.
      • setStagerClass

        void setStagerClass​(java.lang.Class<? extends Stager> stagerClass)
      • getStager

        @InstanceFactory(StagerFactory.class)
        Stager getStager()
        The resource stager instance that should be used to stage resources. If no stager has been set explicitly, the default is to use the instance factory that constructs a resource stager based upon the currently set stagerClass.
      • setStager

        void setStager​(Stager stager)
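        The relationship between stager and stagerClass (an explicitly set instance wins; otherwise one is built from the configured class) can be sketched as a lazy default. The types below are hypothetical stand-ins, not Beam's Stager, GcsStager, or StagerFactory, and the real factory may construct instances differently (for example, by passing the options object).

```java
final class StagerDefaulting {
    /** Hypothetical stand-in for the Stager interface. */
    interface Stager {
        String name();
    }

    /** Hypothetical stand-in for a default implementation such as GcsStager. */
    static final class DefaultStager implements Stager {
        public DefaultStager() {}

        @Override
        public String name() {
            return "default";
        }
    }

    private Class<? extends Stager> stagerClass = DefaultStager.class;
    private Stager stager; // null until set explicitly or first requested

    void setStagerClass(Class<? extends Stager> stagerClass) {
        this.stagerClass = stagerClass;
    }

    void setStager(Stager stager) {
        this.stager = stager;
    }

    Stager getStager() {
        if (stager == null) {
            // No explicit stager: instantiate the currently configured class.
            try {
                stager = stagerClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + stagerClass, e);
            }
        }
        return stager;
    }
}
```

        In this sketch, calling setStager bypasses stagerClass entirely, which mirrors the precedence described for the two options.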
      • getDataflowClient

        @InstanceFactory(DataflowClientFactory.class)
        com.google.api.services.dataflow.Dataflow getDataflowClient()
        An instance of the Dataflow client. Defaults to creating a Dataflow client using the current set of options.
      • setDataflowClient

        void setDataflowClient​(com.google.api.services.dataflow.Dataflow value)
      • getTransformNameMapping

        java.util.Map<java.lang.String,​java.lang.String> getTransformNameMapping()
        Mapping of old PTransform names to new ones, specified as JSON {"oldName":"newName",...} . To mark a transform as deleted, make newName the empty string.
      • setTransformNameMapping

        void setTransformNameMapping​(java.util.Map<java.lang.String,​java.lang.String> value)
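        The semantics of the mapping entries (rename on a non-empty value, delete on an empty string) can be illustrated with a small sketch. The helper class and the transform names are hypothetical, and keeping unmapped names unchanged is an assumption; how the Dataflow service actually applies the mapping is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TransformNameMappingSketch {
    /** Hypothetical helper: applies an old-to-new name mapping to transform names. */
    static List<String> apply(List<String> oldNames, Map<String, String> mapping) {
        List<String> result = new ArrayList<>();
        for (String name : oldNames) {
            // Assumption: names without a mapping entry are kept unchanged.
            String mapped = mapping.getOrDefault(name, name);
            // An empty new name marks the transform as deleted.
            if (!mapped.isEmpty()) {
                result.add(mapped);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("ParseEvents", "ParseEventsV2"); // renamed transform
        mapping.put("DebugLog", "");                 // deleted transform
        System.out.println(apply(Arrays.asList("ReadInput", "ParseEvents", "DebugLog"), mapping));
    }
}
```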
      • getNumberOfWorkerHarnessThreads

        int getNumberOfWorkerHarnessThreads()
        Number of threads to use on the Dataflow worker harness. If left unspecified, the Dataflow service will compute an appropriate number of threads to use.
      • setNumberOfWorkerHarnessThreads

        void setNumberOfWorkerHarnessThreads​(int value)
      • getDumpHeapOnOOM

        boolean getDumpHeapOnOOM()
        If true, save a heap dump before killing a thread or process which is GC thrashing or out of memory. The location of the heap file will either be echoed back to the user, or the user will be given the opportunity to download the heap file.

        CAUTION: Heap dumps can be of comparable size to the default boot disk. Consider increasing the boot disk size before setting this flag to true.

      • setDumpHeapOnOOM

        void setDumpHeapOnOOM​(boolean dumpHeapBeforeExit)
      • getRecordJfrOnGcThrashing

        boolean getRecordJfrOnGcThrashing()
        If true, save a JFR profile when GC thrashing is first detected. The profile will run for the amount of time set by --jfrRecordingDurationSec, or 60 seconds by default.

        Note: JFR profiles are only supported on Java 9 and later.

      • setRecordJfrOnGcThrashing

        void setRecordJfrOnGcThrashing​(boolean value)
      • getJfrRecordingDurationSec

        @Integer(60)
        int getJfrRecordingDurationSec()
      • setJfrRecordingDurationSec

        void setJfrRecordingDurationSec​(int value)
      • getWorkerCacheMb

        @Integer(100)
        java.lang.Integer getWorkerCacheMb()
        The size of the worker's in-memory cache, in megabytes.

        Currently, this cache stores values read from side inputs in batch jobs and user state in streaming jobs.

      • setWorkerCacheMb

        void setWorkerCacheMb​(java.lang.Integer value)
      • getReaderCacheTimeoutSec

        @Integer(60)
        java.lang.Integer getReaderCacheTimeoutSec()
        The amount of time, in seconds, after which UnboundedReaders are considered idle and are closed during streaming execution.
      • setReaderCacheTimeoutSec

        void setReaderCacheTimeoutSec​(java.lang.Integer value)
      • getUnboundedReaderMaxReadTimeSec

        @Deprecated
        @Integer(10)
        java.lang.Integer getUnboundedReaderMaxReadTimeSec()
        Deprecated.
        The maximum amount of time, in seconds, that an UnboundedReader is read from before checkpointing.
      • setUnboundedReaderMaxReadTimeSec

        void setUnboundedReaderMaxReadTimeSec​(java.lang.Integer value)
      • getUnboundedReaderMaxReadTimeMs

        @InstanceFactory(UnboundedReaderMaxReadTimeFactory.class)
        java.lang.Integer getUnboundedReaderMaxReadTimeMs()
        The maximum amount of time, in milliseconds, that an UnboundedReader is read from before checkpointing.
      • setUnboundedReaderMaxReadTimeMs

        void setUnboundedReaderMaxReadTimeMs​(java.lang.Integer value)
      • getUnboundedReaderMaxElements

        @Integer(10000)
        java.lang.Integer getUnboundedReaderMaxElements()
        The maximum number of elements read from an UnboundedReader before checkpointing.
      • setUnboundedReaderMaxElements

        void setUnboundedReaderMaxElements​(java.lang.Integer value)
      • getUnboundedReaderMaxWaitForElementsMs

        @Integer(1000)
        java.lang.Integer getUnboundedReaderMaxWaitForElementsMs()
        The maximum amount of time, in milliseconds, to wait for elements when reading from an UnboundedReader.
      • setUnboundedReaderMaxWaitForElementsMs

        void setUnboundedReaderMaxWaitForElementsMs​(java.lang.Integer value)
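        The element and time limits above jointly bound how much is read before a checkpoint. A minimal sketch of such a bounded read loop follows. It is not the worker's actual implementation: the class, method, and injected clock are hypothetical, and the wait-for-elements limit is not modeled (an exhausted iterator stands in for a reader with no more elements available).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

final class BoundedReadLoop {
    /**
     * Hypothetical read loop: pulls elements until the element limit or the
     * time limit is reached, then returns so the caller can checkpoint.
     *
     * @param clockMs supplies the current time in milliseconds (injected for testing)
     */
    static <T> List<T> readBatch(
            Iterator<T> source, int maxElements, long maxReadTimeMs, LongSupplier clockMs) {
        List<T> batch = new ArrayList<>();
        long deadline = clockMs.getAsLong() + maxReadTimeMs;
        while (batch.size() < maxElements      // analogous to unboundedReaderMaxElements
                && clockMs.getAsLong() < deadline // analogous to unboundedReaderMaxReadTimeMs
                && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }
}
```

        Injecting the clock keeps the sketch deterministic; in ordinary use, System::currentTimeMillis could be passed instead.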
      • getDesiredNumUnboundedSourceSplits

        @Integer(0)
        int getDesiredNumUnboundedSourceSplits()
        The desired number of initial splits for UnboundedSources. If this value is <=0, the splits will be computed based on the number of user workers.
      • setDesiredNumUnboundedSourceSplits

        void setDesiredNumUnboundedSourceSplits​(int value)
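        The fallback rule for desiredNumUnboundedSourceSplits can be sketched as below. The formula used when the value is <= 0 is not specified in this documentation; the splitsPerWorker multiplier is a purely hypothetical parameter chosen for illustration.

```java
final class SplitCountSketch {
    /**
     * Hypothetical helper: uses the explicit value when it is positive,
     * otherwise derives a count from the number of workers.
     */
    static int effectiveSplits(int desiredNumSplits, int numWorkers, int splitsPerWorker) {
        if (desiredNumSplits > 0) {
            return desiredNumSplits; // explicit setting wins
        }
        // Assumption: scale with worker count, never below one split.
        return Math.max(1, numWorkers * splitsPerWorker);
    }
}
```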
      • getSaveHeapDumpsToGcsPath

        java.lang.String getSaveHeapDumpsToGcsPath()
        CAUTION: This option implies dumpHeapOnOOM, and has similar caveats. Specifically, heap dumps can be of comparable size to the default boot disk. Consider increasing the boot disk size before setting this option.
      • setSaveHeapDumpsToGcsPath

        void setSaveHeapDumpsToGcsPath​(java.lang.String gcsPath)
      • getSdkHarnessContainerImageOverrides

        java.lang.String getSdkHarnessContainerImageOverrides()
        Overrides for SDK harness container images.
      • setSdkHarnessContainerImageOverrides

        void setSdkHarnessContainerImageOverrides​(java.lang.String value)