Class ChildPartitionsRecordAction

    • Method Summary

      Modifier and Type Method Description
      java.util.Optional<org.apache.beam.sdk.transforms.DoFn.ProcessContinuation> run(PartitionMetadata partition, ChildPartitionsRecord record, org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker, RestrictionInterrupter<com.google.cloud.Timestamp> interrupter, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator)
      This is the main processing function for a ChildPartitionsRecord.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • run

        public java.util.Optional<org.apache.beam.sdk.transforms.DoFn.ProcessContinuation> run(
            PartitionMetadata partition,
            ChildPartitionsRecord record,
            org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<TimestampRange,com.google.cloud.Timestamp> tracker,
            RestrictionInterrupter<com.google.cloud.Timestamp> interrupter,
            org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator)
        This is the main processing function for a ChildPartitionsRecord. It returns an Optional of DoFn.ProcessContinuation that tells the calling function whether to keep processing. An empty Optional means the caller can continue with the next record. An Optional of DoFn.ProcessContinuation.stop() means this function was unable to claim the timestamp of the ChildPartitionsRecord, so the caller should stop. An Optional of DoFn.ProcessContinuation.resume() means this function should commit what has already been processed and resume.
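
        The return contract can be illustrated from the caller's side. The following is a minimal sketch, not the connector's actual calling code: Continuation and RecordAction are hypothetical stand-ins for DoFn.ProcessContinuation and for an action such as this class, and records are plain strings. It only shows that an empty Optional lets the loop move on, while a non-empty one ends it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins only; the real types come from the Beam SDK.
class RunResultHandlingSketch {

  // Stand-in for DoFn.ProcessContinuation (stop or resume).
  enum Continuation { STOP, RESUME }

  // Stand-in for an action whose run(...) returns Optional<ProcessContinuation>.
  interface RecordAction {
    Optional<Continuation> run(String record);
  }

  // Runs the action on each record in order and returns the first non-empty
  // result. An empty return value means every record was handled.
  static Optional<Continuation> firstContinuation(List<String> records, RecordAction action) {
    for (String record : records) {
      Optional<Continuation> maybeContinuation = action.run(record);
      if (maybeContinuation.isPresent()) {
        // The action asked the caller to stop or resume: do not touch later records.
        return maybeContinuation;
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Illustrative action: pretends the timestamp of "unclaimable" cannot be claimed.
    RecordAction action = record ->
        record.equals("unclaimable")
            ? Optional.of(Continuation.STOP)
            : Optional.<Continuation>empty();
    System.out.println(firstContinuation(Arrays.asList("r1", "unclaimable", "r2"), action));
  }
}
```

        In this sketch the record "r2" is never passed to the action, because the preceding record produced a non-empty Optional.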

        When processing the ChildPartitionsRecord the following procedure is applied:

        1. We try to claim the child partition record timestamp. If the claim fails, we stop here and return.
        2. We update the watermark to the child partition record timestamp.
        3. For each child partition, we try to insert it into the partition metadata table if it does not already exist.
        4. For each child partition, we check whether it originates from a split or a merge and increment the corresponding metric.
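
        The four steps above can be sketched with in-memory stand-ins. This is an illustration under stated assumptions, not the connector's implementation: Tracker, State, Child and Continuation are hypothetical types, timestamps are plain long values, and the sketch assumes that a child with exactly one parent token comes from a split and any other child comes from a merge. The interrupter is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of the procedure; no Beam or Spanner types are used.
class ChildPartitionsProcedureSketch {

  enum Continuation { STOP, RESUME }

  // A child partition: its own token plus the tokens of its parents.
  static final class Child {
    final String token;
    final List<String> parentTokens;

    Child(String token, List<String> parentTokens) {
      this.token = token;
      this.parentTokens = parentTokens;
    }
  }

  // Stand-in for a restriction tracker: claims succeed below an exclusive end.
  static final class Tracker {
    private final long endExclusive;

    Tracker(long endExclusive) {
      this.endExclusive = endExclusive;
    }

    boolean tryClaim(long timestamp) {
      return timestamp < endExclusive;
    }
  }

  // Stand-ins for the watermark estimator, the metadata table and the metrics.
  static final class State {
    long watermark = Long.MIN_VALUE;
    final Map<String, List<String>> metadataTable = new HashMap<>();
    int splitCount;
    int mergeCount;
  }

  static Optional<Continuation> run(
      long recordTimestamp, List<Child> children, Tracker tracker, State state) {
    // Step 1: claim the record timestamp, or tell the caller to stop.
    if (!tracker.tryClaim(recordTimestamp)) {
      return Optional.of(Continuation.STOP);
    }
    // Step 2: advance the watermark to the record timestamp.
    state.watermark = recordTimestamp;
    for (Child child : children) {
      // Step 3: insert the child only if its token is not present yet.
      state.metadataTable.putIfAbsent(child.token, child.parentTokens);
      // Step 4 (assumption): one parent token means split, otherwise merge.
      if (child.parentTokens.size() == 1) {
        state.splitCount++;
      } else {
        state.mergeCount++;
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    State state = new State();
    List<Child> children = Arrays.asList(
        new Child("child-1", Arrays.asList("parent")),
        new Child("child-2", Arrays.asList("parent")));
    System.out.println(run(50L, children, new Tracker(100L), state));
    System.out.println(state.metadataTable.keySet());
  }
}
```

        A failed claim returns before the watermark, the table or the counters are touched, which matches the early return described in step 1.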
        Partition splits and merges are handled as follows:
        • Partition Splits: child partition tokens should not yet exist in the partition metadata table, so new rows are simply added to that table. On a bundle retry, duplicate entries are silently ignored.
        • Partition Merges: the first parent partition that receives the child token should succeed in inserting it. The remaining parents silently skip the insertion.
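
        The insert-if-absent behaviour for merges and bundle retries can be sketched as follows. This is only an illustration: the partition metadata table is modelled as an in-memory ConcurrentHashMap, and IdempotentChildInsertSketch and insertIfAbsent are hypothetical names, not part of the connector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of the partition metadata table.
class IdempotentChildInsertSketch {

  // Child partition token -> parent tokens recorded by the first successful insert.
  private final Map<String, List<String>> rows = new ConcurrentHashMap<>();

  // Returns true if this call added the row, false if the token already existed.
  // An existing row is left unchanged and no error is raised.
  boolean insertIfAbsent(String childToken, List<String> parentTokens) {
    return rows.putIfAbsent(childToken, parentTokens) == null;
  }

  int rowCount() {
    return rows.size();
  }

  public static void main(String[] args) {
    IdempotentChildInsertSketch table = new IdempotentChildInsertSketch();
    List<String> parents = Arrays.asList("parent-a", "parent-b");
    // Merge: both parents report the same child token; only the first call inserts.
    System.out.println(table.insertIfAbsent("merged-child", parents));
    System.out.println(table.insertIfAbsent("merged-child", parents));
    System.out.println(table.rowCount());
  }
}
```

        A bundle retry after a split behaves like the second call in this sketch: the row is already present, so the insertion is skipped.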
        Parameters:
        partition - the current partition being processed
        record - the change stream child partition record received
        tracker - the restriction tracker of the ReadChangeStreamPartitionDoFn SDF
        interrupter - the restriction interrupter suggesting early termination of the processing
        watermarkEstimator - the watermark estimator of the ReadChangeStreamPartitionDoFn SDF
        Returns:
        Optional.empty() if the caller can continue processing more records. A non-empty Optional with DoFn.ProcessContinuation.stop() if this function was unable to claim the ChildPartitionsRecord timestamp. A non-empty Optional with DoFn.ProcessContinuation.resume() if this function should commit what has already been processed and resume.