Class ReadChangeStreamPartitionDoFn
- java.lang.Object
-
- org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,DataChangeRecord>
-
- org.apache.beam.sdk.io.gcp.spanner.changestreams.dofn.ReadChangeStreamPartitionDoFn
-
- All Implemented Interfaces:
java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
@UnboundedPerElement public class ReadChangeStreamPartitionDoFn extends org.apache.beam.sdk.transforms.DoFn<PartitionMetadata,DataChangeRecord> implements java.io.Serializable
An SDF (Splittable DoFn) class that performs a change stream query for a given partition. A different action is taken depending on the type of record received from the query. This component also reflects the partition state in the partition metadata tables.
The processing of a partition is delegated to the QueryChangeStreamAction.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.beam.sdk.transforms.DoFn
org.apache.beam.sdk.transforms.DoFn.AlwaysFetched, org.apache.beam.sdk.transforms.DoFn.BoundedPerElement, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer, org.apache.beam.sdk.transforms.DoFn.Element, org.apache.beam.sdk.transforms.DoFn.FieldAccess, org.apache.beam.sdk.transforms.DoFn.FinishBundle, org.apache.beam.sdk.transforms.DoFn.FinishBundleContext, org.apache.beam.sdk.transforms.DoFn.GetInitialRestriction, org.apache.beam.sdk.transforms.DoFn.GetInitialWatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.GetRestrictionCoder, org.apache.beam.sdk.transforms.DoFn.GetSize, org.apache.beam.sdk.transforms.DoFn.GetWatermarkEstimatorStateCoder, org.apache.beam.sdk.transforms.DoFn.Key, org.apache.beam.sdk.transforms.DoFn.MultiOutputReceiver, org.apache.beam.sdk.transforms.DoFn.NewTracker, org.apache.beam.sdk.transforms.DoFn.NewWatermarkEstimator, org.apache.beam.sdk.transforms.DoFn.OnTimer, org.apache.beam.sdk.transforms.DoFn.OnTimerContext, org.apache.beam.sdk.transforms.DoFn.OnTimerFamily, org.apache.beam.sdk.transforms.DoFn.OnWindowExpiration, org.apache.beam.sdk.transforms.DoFn.OnWindowExpirationContext, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<T extends java.lang.Object>, org.apache.beam.sdk.transforms.DoFn.ProcessContext, org.apache.beam.sdk.transforms.DoFn.ProcessContinuation, org.apache.beam.sdk.transforms.DoFn.ProcessElement, org.apache.beam.sdk.transforms.DoFn.RequiresStableInput, org.apache.beam.sdk.transforms.DoFn.RequiresTimeSortedInput, org.apache.beam.sdk.transforms.DoFn.Restriction, org.apache.beam.sdk.transforms.DoFn.Setup, org.apache.beam.sdk.transforms.DoFn.SideInput, org.apache.beam.sdk.transforms.DoFn.SplitRestriction, org.apache.beam.sdk.transforms.DoFn.StartBundle, org.apache.beam.sdk.transforms.DoFn.StartBundleContext, org.apache.beam.sdk.transforms.DoFn.StateId, org.apache.beam.sdk.transforms.DoFn.Teardown, org.apache.beam.sdk.transforms.DoFn.TimerFamily, org.apache.beam.sdk.transforms.DoFn.TimerId, 
org.apache.beam.sdk.transforms.DoFn.Timestamp, org.apache.beam.sdk.transforms.DoFn.TruncateRestriction, org.apache.beam.sdk.transforms.DoFn.UnboundedPerElement, org.apache.beam.sdk.transforms.DoFn.WatermarkEstimatorState, org.apache.beam.sdk.transforms.DoFn.WindowedContext
-
-
Constructor Summary
Constructors
Constructor | Description
ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics) | This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query.
-
Method Summary
Modifier and Type | Method | Description
org.joda.time.Instant | getInitialWatermarkEstimatorState(PartitionMetadata partition) |
double | getSize(PartitionMetadata partition, TimestampRange range) |
TimestampRange | initialRestriction(PartitionMetadata partition) | The restriction for a partition will be defined from the start and end timestamp to query the partition for.
ReadChangeStreamPartitionRangeTracker | newTracker(PartitionMetadata partition, TimestampRange range) |
org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> | newWatermarkEstimator(org.joda.time.Instant watermarkEstimatorState) |
org.apache.beam.sdk.transforms.DoFn.ProcessContinuation | processElement(PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer) | Performs a change stream query for a given partition.
void | setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator) | Sets the estimator to calculate the backlog of this function.
void | setup() | Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction, PartitionStartRecordAction, PartitionEndRecordAction, PartitionEventRecordAction and QueryChangeStreamAction.
-
-
-
Constructor Detail
-
ReadChangeStreamPartitionDoFn
public ReadChangeStreamPartitionDoFn(DaoFactory daoFactory, MapperFactory mapperFactory, ActionFactory actionFactory, ChangeStreamMetrics metrics)
This class needs a DaoFactory to build DAOs to access the partition metadata tables and to perform the change streams query. It uses mappers to transform database rows into the ChangeStreamRecord model. It uses the ActionFactory to construct the action dispatchers, which will perform the change stream query and process each type of record received. It emits metrics for the partition using the ChangeStreamMetrics.
- Parameters:
daoFactory - the DaoFactory to construct PartitionMetadataDaos and ChangeStreamDaos
mapperFactory - the MapperFactory to construct ChangeStreamRecordMappers
actionFactory - the ActionFactory to construct actions
metrics - the ChangeStreamMetrics to emit partition-related metrics
-
-
Method Detail
-
getInitialWatermarkEstimatorState
@GetInitialWatermarkEstimatorState public org.joda.time.Instant getInitialWatermarkEstimatorState(@Element PartitionMetadata partition)
-
newWatermarkEstimator
@NewWatermarkEstimator public org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> newWatermarkEstimator(@WatermarkEstimatorState org.joda.time.Instant watermarkEstimatorState)
-
initialRestriction
@GetInitialRestriction public TimestampRange initialRestriction(@Element PartitionMetadata partition)
The restriction for a partition is defined by the start and end timestamps to query the partition for. The TimestampRange restriction represents a closed-open interval, while the start / end timestamps represent a closed-closed interval, so we add 1 nanosecond to the end timestamp to convert it to closed-open.
In this function we also update the partition state to PartitionMetadata.State.RUNNING.
- Parameters:
partition - the partition to be queried
- Returns:
- the timestamp range from the partition start timestamp to the partition end timestamp + 1 nanosecond
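The closed-closed to closed-open conversion can be illustrated with a minimal, self-contained sketch. It uses java.time.Instant and a hypothetical ClosedOpenRange class as a stand-in; it is not the connector's TimestampRange or com.google.cloud.Timestamp, and the names are assumptions made for illustration only.

```java
import java.time.Instant;

/** Hypothetical stand-in for a closed-open timestamp range; not the Beam TimestampRange class. */
final class ClosedOpenRange {
  final Instant from; // inclusive lower bound
  final Instant to;   // exclusive upper bound

  ClosedOpenRange(Instant from, Instant to) {
    this.from = from;
    this.to = to;
  }

  /** Builds a closed-open range that covers the closed-closed interval [start, end]. */
  static ClosedOpenRange fromClosedClosed(Instant start, Instant end) {
    // Adding one nanosecond keeps the inclusive end timestamp inside the range.
    return new ClosedOpenRange(start, end.plusNanos(1));
  }

  /** Returns true when from <= t < to. */
  boolean contains(Instant t) {
    return !t.isBefore(from) && t.isBefore(to);
  }
}

final class RestrictionSketch {
  public static void main(String[] args) {
    Instant start = Instant.parse("2024-01-01T00:00:00Z");
    Instant end = Instant.parse("2024-01-01T00:00:10Z");
    ClosedOpenRange range = ClosedOpenRange.fromClosedClosed(start, end);
    System.out.println(range.contains(end));              // true
    System.out.println(range.contains(end.plusNanos(1))); // false
  }
}
```

Without the extra nanosecond, a record stamped exactly at the partition end timestamp would fall outside the closed-open range and could not be claimed.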
-
getSize
@GetSize public double getSize(@Element PartitionMetadata partition, @Restriction TimestampRange range) throws java.lang.Exception
- Throws:
java.lang.Exception
-
newTracker
@NewTracker public ReadChangeStreamPartitionRangeTracker newTracker(@Element PartitionMetadata partition, @Restriction TimestampRange range)
-
setup
@Setup public void setup()
Constructs instances for the PartitionMetadataDao, ChangeStreamDao, ChangeStreamRecordMapper, PartitionMetadataMapper, DataChangeRecordAction, HeartbeatRecordAction, ChildPartitionsRecordAction, PartitionStartRecordAction, PartitionEndRecordAction, PartitionEventRecordAction and QueryChangeStreamAction.
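A plausible reason for building these collaborators in setup() rather than in the constructor is that the DoFn itself is Serializable, while DAOs and similar objects typically are not. The sketch below illustrates that general pattern with JDK classes only. SketchDao, SketchDaoFactory and SketchFn are hypothetical names, and the motivation is an assumption rather than something this page states.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical non-serializable collaborator, standing in for a DAO. */
final class SketchDao {
  final String table;

  SketchDao(String table) {
    this.table = table;
  }
}

/** Hypothetical serializable factory, standing in for a DaoFactory. */
final class SketchDaoFactory implements Serializable {
  private static final long serialVersionUID = 1L;
  private final String table;

  SketchDaoFactory(String table) {
    this.table = table;
  }

  SketchDao create() {
    return new SketchDao(table);
  }
}

/** Hypothetical function object: holds the factory, builds the DAO lazily in setup(). */
final class SketchFn implements Serializable {
  private static final long serialVersionUID = 1L;
  private final SketchDaoFactory daoFactory;
  private transient SketchDao dao; // skipped by serialization, rebuilt in setup()

  SketchFn(SketchDaoFactory daoFactory) {
    this.daoFactory = daoFactory;
  }

  void setup() {
    dao = daoFactory.create();
  }

  SketchDao dao() {
    return dao;
  }

  /** Serializes and deserializes the object, as happens when it is shipped to a worker. */
  static SketchFn roundTrip(SketchFn fn) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(fn);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SketchFn) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

After roundTrip, dao() returns null until setup() is called, which mirrors why the real collaborators are only available once the setup lifecycle method has run.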
-
processElement
@ProcessElement public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation processElement(@Element PartitionMetadata partition, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<DataChangeRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, org.apache.beam.sdk.transforms.DoFn.BundleFinalizer bundleFinalizer)
Performs a change stream query for a given partition. A different action will be taken depending on the type of record received from the query. This component will also reflect the partition state in the partition metadata tables.
The processing of a partition is delegated to the QueryChangeStreamAction.
- Parameters:
partition - the partition to be queried
tracker - an instance of ReadChangeStreamPartitionRangeTracker
receiver - a DataChangeRecord DoFn.OutputReceiver
watermarkEstimator - a ManualWatermarkEstimator of Instant
bundleFinalizer - the bundle finalizer
- Returns:
- a DoFn.ProcessContinuation.stop() if a record timestamp could not be claimed or if the partition processing has finished
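The control flow described here (claim each record timestamp, act according to the record type, stop when a claim fails or the records run out) can be sketched without Beam. Every type below is a hypothetical stand-in and not the connector's model, and the per-type behaviour (data records are emitted, heartbeats only advance the watermark) is an assumption made for illustration.

```java
import java.util.List;

/** Hypothetical record model; the real connector uses its own ChangeStreamRecord types. */
interface SketchRecord {
  long timestamp();
}

final class SketchDataRecord implements SketchRecord {
  final long ts;
  final String payload;

  SketchDataRecord(long ts, String payload) {
    this.ts = ts;
    this.payload = payload;
  }

  public long timestamp() {
    return ts;
  }
}

final class SketchHeartbeat implements SketchRecord {
  final long ts;

  SketchHeartbeat(long ts) {
    this.ts = ts;
  }

  public long timestamp() {
    return ts;
  }
}

/** Hypothetical tracker: a claim succeeds only below the exclusive end of the range. */
final class SketchTracker {
  private final long endExclusive;

  SketchTracker(long endExclusive) {
    this.endExclusive = endExclusive;
  }

  boolean tryClaim(long ts) {
    return ts < endExclusive;
  }
}

/** Hypothetical holder for the last timestamp the watermark was advanced to. */
final class SketchWatermark {
  long value;
}

final class PartitionSketch {
  enum Continuation { STOP }

  static Continuation process(
      List<SketchRecord> records, SketchTracker tracker, List<String> output, SketchWatermark wm) {
    for (SketchRecord record : records) {
      if (!tracker.tryClaim(record.timestamp())) {
        return Continuation.STOP; // the timestamp could not be claimed
      }
      if (record instanceof SketchDataRecord) {
        output.add(((SketchDataRecord) record).payload); // data records are emitted
      }
      wm.value = record.timestamp(); // every claimed record advances the watermark
    }
    return Continuation.STOP; // all records processed: the partition has finished
  }
}
```

Both exit paths return STOP, matching the documented return value: a failed claim and a finished partition look the same to the caller.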
-
setThroughputEstimator
public void setThroughputEstimator(BytesThroughputEstimator<DataChangeRecord> throughputEstimator)
Sets the estimator to calculate the backlog of this function. Must be called after the initialization of this DoFn.
- Parameters:
throughputEstimator - an estimator to calculate local throughput.
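To make the idea of estimating backlog from local throughput concrete, here is a minimal sliding-window sketch. SketchThroughputEstimator is a hypothetical class and does not reproduce the connector's BytesThroughputEstimator. The backlog model (recent bytes per second multiplied by the seconds of range still unread) is likewise an assumption, not a description of what getSize actually computes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window estimator; not the connector's BytesThroughputEstimator. */
final class SketchThroughputEstimator {
  private final long windowSeconds;
  private final Deque<long[]> samples = new ArrayDeque<>(); // each entry is {second, bytes}
  private long bytesInWindow;

  SketchThroughputEstimator(long windowSeconds) {
    this.windowSeconds = windowSeconds;
  }

  /** Records that the given number of bytes was emitted at the given second (non-decreasing). */
  void update(long second, long bytes) {
    samples.addLast(new long[] {second, bytes});
    bytesInWindow += bytes;
    evict(second);
  }

  /** Average bytes per second over the window that ends at nowSecond. */
  double bytesPerSecond(long nowSecond) {
    evict(nowSecond);
    return (double) bytesInWindow / windowSeconds;
  }

  /** Assumed backlog model: recent throughput times the seconds of range still unread. */
  double estimateBacklogBytes(long nowSecond, long remainingSeconds) {
    return bytesPerSecond(nowSecond) * remainingSeconds;
  }

  /** Drops samples that are older than the window. */
  private void evict(long nowSecond) {
    while (!samples.isEmpty() && samples.peekFirst()[0] <= nowSecond - windowSeconds) {
      bytesInWindow -= samples.removeFirst()[1];
    }
  }
}
```

With a 10-second window, two updates of 100 and 300 bytes give 40 bytes per second, so 5 remaining seconds would be sized at 200 bytes under this model.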
-
-