Class DetectNewPartitionsAction


  • @Internal
    public class DetectNewPartitionsAction
    extends java.lang.Object
    Implements the processing logic for DetectNewPartitionsDoFn.
    • Method Summary

      Modifier and Type Method Description
      org.apache.beam.sdk.transforms.DoFn.ProcessContinuation run(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<org.apache.beam.sdk.io.range.OffsetRange, java.lang.Long> tracker, org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionRecord> receiver, org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator, InitialPipelineState initialPipelineState)
      Perform the necessary steps to manage the initial set of partitions and new partitions.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • run

        public org.apache.beam.sdk.transforms.DoFn.ProcessContinuation run(org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker<org.apache.beam.sdk.io.range.OffsetRange, java.lang.Long> tracker,
                                                                           org.apache.beam.sdk.transforms.DoFn.OutputReceiver<PartitionRecord> receiver,
                                                                           org.apache.beam.sdk.transforms.splittabledofn.ManualWatermarkEstimator<org.joda.time.Instant> watermarkEstimator,
                                                                           InitialPipelineState initialPipelineState)
                                                                    throws java.lang.Exception
        Perform the necessary steps to manage the initial set of partitions and new partitions. New partitions are currently processed once per second.
        1. On the very first run, look up the initial list of partitions to stream.
        2. On subsequent runs, try advancing the watermark if needed.
        3. Update the metadata table with info about this DoFn.
        4. Check whether this pipeline has reached the end time. Terminate if it has.
        5. Process new partitions and output them.
        6. Reconcile any partitions that haven't been streaming for a long time.
        7. Register a callback to clean up processed partitions after the bundle has been finalized.
        Parameters:
        tracker - offset tracker that simply increments by 1 on every run
        receiver - outputs new partitions
        watermarkEstimator - updates the watermark, which represents the low watermark of the entire Beam pipeline
        Returns:
        DoFn.ProcessContinuation.resume() with a 1-second delay if the stream continues, otherwise DoFn.ProcessContinuation.stop()
        Throws:
        com.google.protobuf.InvalidProtocolBufferException - if failing to process new partitions
        java.lang.Exception
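
        The resume-or-stop contract described under Returns can be illustrated with a minimal plain-Java sketch. It does not use Beam and is not the actual implementation: DetectNewPartitionsSketch, Continuation, runOnce, and runUntilStopped are hypothetical stand-ins, and java.time is used in place of Joda-Time. It assumes the end-time check is the only stop condition and that each run advances the offset by 1.

```java
import java.time.Duration;
import java.time.Instant;

/** Plain-Java model of the resume-or-stop decision; it does not use Beam. */
public class DetectNewPartitionsSketch {

    /** Hypothetical stand-in for DoFn.ProcessContinuation. */
    public static final class Continuation {
        public final boolean shouldResume;
        public final Duration resumeDelay;

        private Continuation(boolean shouldResume, Duration resumeDelay) {
            this.shouldResume = shouldResume;
            this.resumeDelay = resumeDelay;
        }

        public static Continuation stop() {
            return new Continuation(false, Duration.ZERO);
        }

        public static Continuation resumeAfter(Duration delay) {
            return new Continuation(true, delay);
        }
    }

    /** Delay between runs, matching the 1-second interval in the description. */
    public static final Duration RESUME_DELAY = Duration.ofSeconds(1);

    /** One run: stop once the end time is reached, otherwise ask to be resumed. */
    public static Continuation runOnce(Instant now, Instant endTime) {
        if (!now.isBefore(endTime)) {
            return Continuation.stop();
        }
        // The real method would process and output new partitions here.
        return Continuation.resumeAfter(RESUME_DELAY);
    }

    /** Drives runOnce repeatedly and returns the total number of runs. */
    public static long runUntilStopped(Instant start, Instant endTime) {
        long offset = 0; // models the tracker offset, incremented by 1 per run
        Instant now = start;
        while (true) {
            Continuation result = runOnce(now, endTime);
            offset++;
            if (!result.shouldResume) {
                return offset;
            }
            // Simulate the runner waiting for the requested delay.
            now = now.plus(result.resumeDelay);
        }
    }

    public static void main(String[] args) {
        Instant start = Instant.EPOCH;
        long runs = runUntilStopped(start, start.plusSeconds(3));
        System.out.println("total runs: " + runs);
    }
}
```

        In this sketch the simulated clock advances by the requested delay after each resume, so a pipeline with an end time three seconds after the start performs three resuming runs followed by one run that stops.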