Interface DataflowPipelineWorkerPoolOptions
-
- All Superinterfaces:
org.apache.beam.sdk.options.FileStagingOptions, org.apache.beam.sdk.extensions.gcp.options.GcpOptions, org.apache.beam.sdk.extensions.gcp.options.GoogleApiDebugOptions, org.apache.beam.sdk.transforms.display.HasDisplayData, org.apache.beam.sdk.options.PipelineOptions
- All Known Subinterfaces:
DataflowPipelineOptions, DataflowWorkerHarnessOptions, TestDataflowPipelineOptions
public interface DataflowPipelineWorkerPoolOptions extends org.apache.beam.sdk.extensions.gcp.options.GcpOptions, org.apache.beam.sdk.options.FileStagingOptions
Options that are used to configure the Dataflow pipeline worker pool.
-
-
Nested Class Summary
Nested Classes
- static class DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType: Type of autoscaling algorithm to use.
-
Nested classes/interfaces inherited from interface org.apache.beam.sdk.extensions.gcp.options.GcpOptions
org.apache.beam.sdk.extensions.gcp.options.GcpOptions.DefaultProjectFactory, org.apache.beam.sdk.extensions.gcp.options.GcpOptions.EnableStreamingEngineFactory, org.apache.beam.sdk.extensions.gcp.options.GcpOptions.GcpOAuthScopesFactory, org.apache.beam.sdk.extensions.gcp.options.GcpOptions.GcpTempLocationFactory, org.apache.beam.sdk.extensions.gcp.options.GcpOptions.GcpUserCredentialsFactory
-
Nested classes/interfaces inherited from interface org.apache.beam.sdk.extensions.gcp.options.GoogleApiDebugOptions
org.apache.beam.sdk.extensions.gcp.options.GoogleApiDebugOptions.GoogleApiTracer
-
Nested classes/interfaces inherited from interface org.apache.beam.sdk.options.PipelineOptions
org.apache.beam.sdk.options.PipelineOptions.AtomicLongFactory, org.apache.beam.sdk.options.PipelineOptions.CheckEnabled, org.apache.beam.sdk.options.PipelineOptions.DirectRunner, org.apache.beam.sdk.options.PipelineOptions.JobNameFactory, org.apache.beam.sdk.options.PipelineOptions.UserAgentFactory
-
-
Method Summary
- DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType getAutoscalingAlgorithm(): The autoscaling algorithm to use for the worker pool.
- int getDiskSizeGb(): Remote worker disk size, in gigabytes, or 0 to use the default size.
- int getMaxNumWorkers(): The maximum number of workers to use for the worker pool.
- @Nullable java.lang.String getMinCpuPlatform(): Specifies a minimum CPU platform for VM instances.
- java.lang.String getNetwork(): GCE network for launching workers.
- int getNumWorkers(): Number of workers to use when executing the Dataflow job.
- java.lang.String getSdkContainerImage(): Container image used to configure SDK execution environment on worker.
- java.lang.String getSubnetwork(): GCE subnetwork for launching workers.
- @Nullable java.lang.Boolean getUsePublicIps(): Specifies whether worker pools should be started with public IP addresses.
- java.lang.String getWorkerDiskType(): Specifies what type of persistent disk is used.
- java.lang.String getWorkerHarnessContainerImage(): Deprecated. Use getSdkContainerImage() instead.
- java.lang.String getWorkerMachineType(): Machine type to use for Dataflow worker VMs.
- void setAutoscalingAlgorithm(DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType value)
- void setDiskSizeGb(int value)
- void setMaxNumWorkers(int value)
- void setMinCpuPlatform(java.lang.String minCpuPlatform)
- void setNetwork(java.lang.String value)
- void setNumWorkers(int value)
- void setSdkContainerImage(java.lang.String value)
- void setSubnetwork(java.lang.String value)
- void setUsePublicIps(@Nullable java.lang.Boolean value)
- void setWorkerDiskType(java.lang.String value)
- void setWorkerHarnessContainerImage(java.lang.String value): Deprecated. Use setSdkContainerImage(java.lang.String) instead.
- void setWorkerMachineType(java.lang.String value)
-
Methods inherited from interface org.apache.beam.sdk.options.FileStagingOptions
getFilesToStage, setFilesToStage
-
Methods inherited from interface org.apache.beam.sdk.extensions.gcp.options.GcpOptions
getCredentialFactoryClass, getDataflowKmsKey, getGcpCredential, getGcpOauthScopes, getGcpTempLocation, getImpersonateServiceAccount, getProject, getWorkerRegion, getWorkerZone, getZone, isEnableStreamingEngine, setCredentialFactoryClass, setDataflowKmsKey, setEnableStreamingEngine, setGcpCredential, setGcpOauthScopes, setGcpTempLocation, setImpersonateServiceAccount, setProject, setWorkerRegion, setWorkerZone, setZone
-
Methods inherited from interface org.apache.beam.sdk.extensions.gcp.options.GoogleApiDebugOptions
getGoogleApiTrace, setGoogleApiTrace
-
-
-
-
Method Detail
-
getNumWorkers
int getNumWorkers()
Number of workers to use when executing the Dataflow job. Note that selecting an autoscaling algorithm other than NONE will affect the size of the worker pool. If left unspecified, the Dataflow service will determine the number of workers.
-
setNumWorkers
void setNumWorkers(int value)
-
getAutoscalingAlgorithm
DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType getAutoscalingAlgorithm()
The autoscaling algorithm to use for the worker pool.
- NONE: does not change the size of the worker pool.
- BASIC: autoscale the worker pool size up to maxNumWorkers until the job completes.
- THROUGHPUT_BASED: autoscale the worker pool based on throughput (up to maxNumWorkers).
-
setAutoscalingAlgorithm
void setAutoscalingAlgorithm(DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType value)
-
getMaxNumWorkers
int getMaxNumWorkers()
The maximum number of workers to use for the worker pool. This option limits the size of the worker pool for the lifetime of the job, including pipeline updates. If left unspecified, the Dataflow service will compute a ceiling.
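The three sizing options (numWorkers, autoscalingAlgorithm, maxNumWorkers) are typically supplied together. The following is a minimal sketch, not part of Beam, of how they might be expressed as command-line flags. It assumes the usual Beam convention that a flag name is the getter name without its "get" prefix and that the resulting array would be passed to something like PipelineOptionsFactory.fromArgs(args). WorkerPoolArgs and its Algorithm enum are hypothetical names; the enum only mirrors the three values listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Beam: builds worker pool sizing flags. */
final class WorkerPoolArgs {

  /** Local stand-in mirroring the three values listed for AutoscalingAlgorithmType. */
  enum Algorithm { NONE, BASIC, THROUGHPUT_BASED }

  /**
   * Returns flags such as "--numWorkers=3". A null argument means
   * "leave unspecified", so no flag is emitted for it.
   */
  static String[] build(Integer numWorkers, Integer maxNumWorkers, Algorithm algorithm) {
    List<String> args = new ArrayList<>();
    if (numWorkers != null) {
      args.add("--numWorkers=" + numWorkers);
    }
    if (maxNumWorkers != null) {
      args.add("--maxNumWorkers=" + maxNumWorkers);
    }
    if (algorithm != null) {
      args.add("--autoscalingAlgorithm=" + algorithm.name());
    }
    return args.toArray(new String[0]);
  }

  public static void main(String[] unused) {
    // Start with 3 workers and cap throughput-based autoscaling at 10.
    String[] args = build(3, 10, Algorithm.THROUGHPUT_BASED);
    System.out.println(String.join(" ", args));
  }
}
```

Omitting a value (passing null in this sketch) leaves the corresponding option unspecified, in which case the descriptions above say the Dataflow service determines it.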
-
setMaxNumWorkers
void setMaxNumWorkers(int value)
-
getDiskSizeGb
int getDiskSizeGb()
Remote worker disk size, in gigabytes, or 0 to use the default size.
-
setDiskSizeGb
void setDiskSizeGb(int value)
-
getWorkerHarnessContainerImage
@Deprecated @Hidden java.lang.String getWorkerHarnessContainerImage()
Deprecated. Use getSdkContainerImage() instead.
-
setWorkerHarnessContainerImage
@Deprecated @Hidden void setWorkerHarnessContainerImage(java.lang.String value)
Deprecated. Use setSdkContainerImage(java.lang.String) instead.
-
getSdkContainerImage
java.lang.String getSdkContainerImage()
Container image used to configure SDK execution environment on worker. Used for custom containers on portable pipelines only.
-
setSdkContainerImage
void setSdkContainerImage(java.lang.String value)
-
getNetwork
java.lang.String getNetwork()
GCE network for launching workers. If unset, the Dataflow service chooses the default.
-
setNetwork
void setNetwork(java.lang.String value)
-
getSubnetwork
java.lang.String getSubnetwork()
GCE subnetwork for launching workers. If unset, the Dataflow service chooses the default. The expected format is regions/REGION/subnetworks/SUBNETWORK or the fully qualified subnetwork name, beginning with https://..., e.g. https://www.googleapis.com/compute/alpha/projects/PROJECT/regions/REGION/subnetworks/SUBNETWORK
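As an illustration of the two accepted subnetwork formats, here is a small sketch that builds the short form and performs a loose shape check on a value before it is handed to setSubnetwork. SubnetworkNames and its methods are hypothetical and not part of Beam; the patterns are assumptions derived only from the formats described above, and the Dataflow service remains the authority on what it accepts.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Beam: formats and loosely checks subnetwork values. */
final class SubnetworkNames {

  // Short form: regions/REGION/subnetworks/SUBNETWORK
  private static final Pattern SHORT_FORM =
      Pattern.compile("regions/[^/]+/subnetworks/[^/]+");

  // Fully qualified form: an https:// URL ending in the same path segments.
  private static final Pattern FULL_FORM =
      Pattern.compile("https://.+/regions/[^/]+/subnetworks/[^/]+");

  /** Builds the short form from a region and a subnetwork name. */
  static String shortForm(String region, String subnetwork) {
    return "regions/" + region + "/subnetworks/" + subnetwork;
  }

  /** Returns true if the value has the shape of either documented format. */
  static boolean looksValid(String value) {
    return SHORT_FORM.matcher(value).matches() || FULL_FORM.matcher(value).matches();
  }

  public static void main(String[] unused) {
    // "us-central1" and "my-subnet" are placeholder names for this example.
    String value = shortForm("us-central1", "my-subnet");
    System.out.println(value + " valid=" + looksValid(value));
  }
}
```

A bare subnetwork name such as "my-subnet" fails this check, which matches the description above: the region must be part of the value.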
-
setSubnetwork
void setSubnetwork(java.lang.String value)
-
getWorkerMachineType
java.lang.String getWorkerMachineType()
Machine type to use for Dataflow worker VMs. See GCE machine types for a list of valid options.
If unset, the Dataflow service will choose a reasonable default.
-
setWorkerMachineType
void setWorkerMachineType(java.lang.String value)
-
getWorkerDiskType
java.lang.String getWorkerDiskType()
Specifies what type of persistent disk is used. The value is a full disk type resource, e.g., compute.googleapis.com/projects//zones//diskTypes/pd-ssd. For more information, see the API reference documentation for DiskTypes.
-
setWorkerDiskType
void setWorkerDiskType(java.lang.String value)
-
getUsePublicIps
@Nullable java.lang.Boolean getUsePublicIps()
Specifies whether worker pools should be started with public IP addresses. WARNING: This feature is available only through an allowlist.
-
setUsePublicIps
void setUsePublicIps(@Nullable java.lang.Boolean value)
-
getMinCpuPlatform
@Nullable java.lang.String getMinCpuPlatform()
Specifies a minimum CPU platform for VM instances. For more details, see Specifying Pipeline Execution Parameters.
-
setMinCpuPlatform
void setMinCpuPlatform(java.lang.String minCpuPlatform)
-
-