Class PubsubIO.Write<T>

  • All Implemented Interfaces:
    java.io.Serializable, org.apache.beam.sdk.transforms.display.HasDisplayData
    Enclosing class:
    PubsubIO

    public abstract static class PubsubIO.Write<T>
    extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<T>,​org.apache.beam.sdk.values.PDone>
    Implementation of write methods.
    See Also:
    Serialized Form
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      class  PubsubIO.Write.PubsubBoundedWriter
      Writer to Pubsub which batches messages from bounded collections.
    • Field Summary

      • Fields inherited from class org.apache.beam.sdk.transforms.PTransform

        annotations, displayData, name, resourceHints
    • Constructor Summary

      Constructors 
      Constructor Description
      Write()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      org.apache.beam.sdk.values.PDone expand​(org.apache.beam.sdk.values.PCollection<T> input)  
      void populateDisplayData​(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)  
      PubsubIO.Write<T> to​(java.lang.String topic)
      Publishes to the specified topic.
      PubsubIO.Write<T> to​(org.apache.beam.sdk.options.ValueProvider<java.lang.String> topic)
      Like to(java.lang.String), but with a ValueProvider.
      PubsubIO.Write<T> to​(org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.ValueInSingleWindow<T>,​java.lang.String> topicFunction)
      Provides a function to dynamically specify the target topic per message.
      void validate​(org.apache.beam.sdk.options.PipelineOptions options)  
      PubsubIO.Write<T> withClientFactory​(PubsubClient.PubsubClientFactory factory)
      The default client to write to Pub/Sub is the PubsubJsonClient, created by the PubsubJsonClient.PubsubJsonClientFactory.
      PubsubIO.Write<T> withErrorHandler​(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,​?> badRecordErrorHandler)
      Writes any serialization failures out to the Error Handler.
      PubsubIO.Write<T> withIdAttribute​(java.lang.String idAttribute)
      Writes to Pub/Sub, adding each record's unique identifier to the published messages in an attribute with the specified name.
      PubsubIO.Write<T> withMaxBatchBytesSize​(int maxBatchBytesSize)
      Writes to Pub/Sub are generally limited to 10 MB.
      PubsubIO.Write<T> withMaxBatchSize​(int batchSize)
      Writes to Pub/Sub are batched to efficiently send data.
      PubsubIO.Write<T> withOrderingKey()
      Writes to Pub/Sub with each record's ordering key.
      PubsubIO.Write<T> withPubsubRootUrl​(java.lang.String pubsubRootUrl)  
      PubsubIO.Write<T> withTimestampAttribute​(java.lang.String timestampAttribute)
      Writes to Pub/Sub and adds each record's timestamp to the published messages in an attribute with the specified name.
      PubsubIO.Write<T> withValidation()
      Enables validation of the Pub/Sub write.
      • Methods inherited from class org.apache.beam.sdk.transforms.PTransform

        addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • Write

        public Write()
    • Method Detail

      • to

        public PubsubIO.Write<T> to​(org.apache.beam.sdk.options.ValueProvider<java.lang.String> topic)
        Like to(java.lang.String), but with a ValueProvider.
      • to

        public PubsubIO.Write<T> to​(org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.ValueInSingleWindow<T>,​java.lang.String> topicFunction)
        Provides a function to dynamically specify the target topic per message. Not compatible with any of the other to methods. If to(java.lang.String) is called again specifying a topic, then this topicFunction will be ignored.
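
        As a rough illustration of per-message topic selection, the sketch below picks a topic from the content of each element. It is not Beam code: in a pipeline the function receives a ValueInSingleWindow<T> and is passed to to(topicFunction), whereas here a plain String stands in for the element so the example runs on the JDK alone. The project name, the topic names, and the "ERROR" prefix rule are hypothetical, and the sketch assumes topics are given in the full projects/<project>/topics/<topic> form.

```java
public class TopicRoutingSketch {
    // Hypothetical project name, used only for illustration.
    static final String PROJECT = "my-project";

    /** Chooses a topic path for one element (a String stands in for the real element type). */
    public static String topicFor(String element) {
        // Hypothetical routing rule: error records go to their own topic.
        String topic = element.startsWith("ERROR") ? "errors" : "events";
        return "projects/" + PROJECT + "/topics/" + topic;
    }

    public static void main(String[] args) {
        System.out.println(topicFor("ERROR disk full"));
        System.out.println(topicFor("user signed in"));
    }
}
```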
      • withMaxBatchSize

        public PubsubIO.Write<T> withMaxBatchSize​(int batchSize)
        Writes to Pub/Sub are batched to efficiently send data. The batch size is the number of Pub/Sub messages to queue before sending the bulk request. For example, given 1000, the write sink waits until 1000 messages have been received or the pipeline has finished, whichever comes first.

        Pub/Sub limits each individual request (batch) to 10 MB. The batch size is configurable so that larger Pub/Sub messages can be sent using this sink: it allows customizable batches and control over the number of messages queued before the 10 MB size limit is hit.

      • withMaxBatchBytesSize

        public PubsubIO.Write<T> withMaxBatchBytesSize​(int maxBatchBytesSize)
        Writes to Pub/Sub are generally limited to 10 MB. This setting controls the maximum number of bytes sent to Pub/Sub in a single batch.
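
        To show how a message-count limit and a byte limit interact, here is a minimal sketch that splits payloads into batches bounded by both. It is an assumption-laden illustration, not Beam's writer: the class and method names are hypothetical, only payload bytes are counted (attributes and request overhead are ignored), and a payload larger than the byte limit is simply placed in a batch of its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not the batching code used by PubsubIO. */
public class BatchSketch {
    /** Splits payloads into batches bounded by message count and total payload bytes. */
    public static List<List<byte[]>> batch(
            List<byte[]> payloads, int maxBatchSize, int maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        int currentBytes = 0;
        for (byte[] payload : payloads) {
            // Flush first if adding this payload would break either limit.
            boolean full = current.size() >= maxBatchSize
                    || currentBytes + payload.length > maxBatchBytes;
            if (full && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(payload);
            currentBytes += payload.length;
        }
        // End of input: send whatever is still queued.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            payloads.add("msg".getBytes(StandardCharsets.UTF_8)); // 3 bytes each
        }
        // Count limit of 2 with a generous byte limit.
        System.out.println(batch(payloads, 2, 1000).size());
        // Generous count limit with a byte limit that fits two 3-byte payloads.
        System.out.println(batch(payloads, 100, 7).size());
    }
}
```

        Whichever limit is reached first closes the batch, so lowering either value produces smaller, more frequent requests.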
      • withOrderingKey

        public PubsubIO.Write<T> withOrderingKey()
        Writes to Pub/Sub with each record's ordering key. A subscription with message ordering enabled will receive messages published in the same region with the same ordering key in the order in which they were received by the service. Note that the order in which Beam publishes records to the service remains unspecified.
        See Also:
        Pub/Sub documentation on message ordering
      • withTimestampAttribute

        public PubsubIO.Write<T> withTimestampAttribute​(java.lang.String timestampAttribute)
        Writes to Pub/Sub and adds each record's timestamp to the published messages in an attribute with the specified name. The value of the attribute will be a number representing the number of milliseconds since the Unix epoch. For example, if using the Joda time classes, Instant(long) can be used to parse this value.

        If the output from this sink is being read by another Beam pipeline, then PubsubIO.Read.withTimestampAttribute(String) can be used to ensure the other source reads these timestamps from the appropriate attribute.
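
        For consumers outside Beam, the attribute can be decoded with the JDK as well. The sketch below assumes the message attributes are available as a string map and uses java.time.Instant rather than the Joda classes mentioned above; the class name, the helper method, and the attribute name "ts" are hypothetical.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.Map;

public class TimestampAttributeSketch {
    /** Parses an attribute that holds milliseconds since the Unix epoch. */
    public static Instant parseTimestamp(Map<String, String> attributes, String attributeName) {
        String raw = attributes.get(attributeName);
        if (raw == null) {
            throw new IllegalArgumentException("Missing attribute: " + attributeName);
        }
        // The attribute value is a decimal number of milliseconds.
        return Instant.ofEpochMilli(Long.parseLong(raw));
    }

    public static void main(String[] args) {
        // "ts" is a hypothetical attribute name chosen for this example.
        Map<String, String> attributes = Collections.singletonMap("ts", "1700000000000");
        System.out.println(parseTimestamp(attributes, "ts"));
    }
}
```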

      • withIdAttribute

        public PubsubIO.Write<T> withIdAttribute​(java.lang.String idAttribute)
        Writes to Pub/Sub, adding each record's unique identifier to the published messages in an attribute with the specified name. The value of the attribute is an opaque string.

        If the output from this sink is being read by another Beam pipeline, then PubsubIO.Read.withIdAttribute(String) can be used to ensure that the other source reads these unique identifiers from the appropriate attribute.

      • withPubsubRootUrl

        public PubsubIO.Write<T> withPubsubRootUrl​(java.lang.String pubsubRootUrl)
      • withErrorHandler

        public PubsubIO.Write<T> withErrorHandler​(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,​?> badRecordErrorHandler)
        Writes any serialization failures out to the Error Handler. See ErrorHandler for details on how to configure an Error Handler. Error Handlers are not well supported when writing to topics with schemas, and it is not recommended to configure an error handler if the target topic has a schema.
      • withValidation

        public PubsubIO.Write<T> withValidation()
        Enables validation of the Pub/Sub write.
      • expand

        public org.apache.beam.sdk.values.PDone expand​(org.apache.beam.sdk.values.PCollection<T> input)
        Specified by:
        expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<T>,​org.apache.beam.sdk.values.PDone>
      • validate

        public void validate​(org.apache.beam.sdk.options.PipelineOptions options)
        Overrides:
        validate in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<T>,​org.apache.beam.sdk.values.PDone>
      • populateDisplayData

        public void populateDisplayData​(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
        Specified by:
        populateDisplayData in interface org.apache.beam.sdk.transforms.display.HasDisplayData
        Overrides:
        populateDisplayData in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<T>,​org.apache.beam.sdk.values.PDone>