Class AvroCoder<T>

  • Type Parameters:
    T - the type of elements handled by this coder
    All Implemented Interfaces:
    java.io.Serializable
    Direct Known Subclasses:
    AvroGenericCoder

    public class AvroCoder<T>
    extends org.apache.beam.sdk.coders.CustomCoder<T>
    A Coder using Avro binary format.

    Each instance of AvroCoder<T> encapsulates an Avro datum factory and schema for objects of type T.

    The Avro datum factory and schema may be provided explicitly via of(AvroDatumFactory, Schema), or omitted via specific(Class) or reflect(Class), in which case they are inferred using Avro's SpecificData or ReflectData.

    For complete details about schema generation and how it can be controlled, see the org.apache.avro.specific and org.apache.avro.reflect packages.

    To use, specify the Coder type on a PCollection:

    
     PCollection<MyCustomElement> records =
         input.apply(...)
              .setCoder(AvroCoder.of(MyCustomElement.class));
     

    or annotate the element class using @DefaultCoder:

     @DefaultCoder(AvroCoder.class)
     public class MyCustomElement {
         ...
     }
     

    The implementation attempts to determine if the Avro encoding of the given type will satisfy the criteria of Coder.verifyDeterministic() by inspecting both the type and the Schema provided or generated by Avro. Only coders that are deterministic can be used in GroupByKey operations.
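
The determinism check described above can be probed directly before a coder is used in a GroupByKey. The following is a minimal sketch, not the library's own usage pattern. It assumes AvroCoder lives in org.apache.beam.sdk.extensions.avro.coders (this page does not state the package); DeterminismSketch, isDeterministic and the "Click" schema are hypothetical names chosen for illustration.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.coders.Coder;
// Assumed package; adjust to wherever AvroCoder lives in your Beam version.
import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

final class DeterminismSketch {

  /** Returns true if verifyDeterministic() passes, false if it throws. */
  static boolean isDeterministic(Coder<?> coder) {
    try {
      coder.verifyDeterministic();
      return true;
    } catch (Coder.NonDeterministicException e) {
      // The exception message lists the reasons the encoding may vary.
      return false;
    }
  }

  public static void main(String[] args) {
    // Hypothetical schema with only integral fields.
    Schema schema = SchemaBuilder.record("Click").namespace("example")
        .fields()
        .requiredLong("userId")
        .requiredInt("count")
        .endRecord();
    AvroCoder<GenericRecord> coder = AvroCoder.generic(schema);
    System.out.println("deterministic: " + isDeterministic(coder));
  }
}
```

Returning a boolean keeps the sketch short; in a real pipeline it is usually more useful to let the NonDeterministicException propagate, since its message explains which part of the type or schema failed the check.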

    See Also:
    Serialized Form
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.beam.sdk.coders.Coder

        org.apache.beam.sdk.coders.Coder.Context, org.apache.beam.sdk.coders.Coder.NonDeterministicException
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected AvroCoder​(java.lang.Class<T> type, org.apache.avro.Schema schema)  
      protected AvroCoder​(java.lang.Class<T> type, org.apache.avro.Schema schema, boolean useReflectApi)  
      protected AvroCoder​(AvroDatumFactory<T> datumFactory, org.apache.avro.Schema schema)  
    • Method Summary

      Modifier and Type Method Description
      T decode​(java.io.InputStream inStream)  
      void encode​(T value, java.io.OutputStream outStream)  
      boolean equals​(@Nullable java.lang.Object other)  
      static AvroCoder<org.apache.avro.generic.GenericRecord> generic​(org.apache.avro.Schema schema)
      Returns an AvroCoder instance for the Avro schema.
      static org.apache.beam.sdk.coders.CoderProvider getCoderProvider()
      Returns a CoderProvider which uses the AvroCoder if possible for all types.
      AvroDatumFactory<T> getDatumFactory()
      Returns the datum factory used for encoding/decoding.
      org.apache.avro.io.DatumReader<T> getDatumReader()
      Returns the DatumReader used for decoding.
      org.apache.avro.io.DatumWriter<T> getDatumWriter()
      Returns the DatumWriter used for encoding.
      org.apache.beam.sdk.values.TypeDescriptor<T> getEncodedTypeDescriptor()  
      org.apache.avro.Schema getSchema()
      Returns the schema used by this coder.
      java.lang.Class<T> getType()
      Returns the type this coder encodes/decodes.
      int hashCode()  
      static <T> AvroCoder<T> of​(java.lang.Class<T> clazz)
      Returns an AvroCoder instance for the provided element class.
      static <T> AvroCoder<T> of​(java.lang.Class<T> type, boolean useReflectApi)
      Returns an AvroCoder instance for the given class, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
      static <T> AvroCoder<T> of​(java.lang.Class<T> type, org.apache.avro.Schema schema)
      Returns an AvroCoder instance for the provided element type using the provided Avro schema.
      static <T> AvroCoder<T> of​(java.lang.Class<T> type, org.apache.avro.Schema schema, boolean useReflectApi)
      Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
      static AvroGenericCoder of​(org.apache.avro.Schema schema)
      Returns an AvroGenericCoder instance for the Avro schema.
      static <T> AvroCoder<T> of​(AvroDatumFactory<T> datumFactory, org.apache.avro.Schema schema)
      Returns an AvroCoder instance for the provided AvroDatumFactory using the provided Avro schema.
      static <T> AvroCoder<T> of​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
      Returns an AvroCoder instance for the provided element type.
      static <T> AvroCoder<T> of​(org.apache.beam.sdk.values.TypeDescriptor<T> type, boolean useReflectApi)
      Returns an AvroCoder instance for the provided element type, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
      static <T> AvroCoder<T> reflect​(java.lang.Class<T> type)
      Returns an AvroCoder instance for the provided element type respecting Avro's Reflect* suite for encoding and decoding.
      static <T> AvroCoder<T> reflect​(java.lang.Class<T> type, org.apache.avro.Schema schema)
      Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting Avro's Reflect* suite for encoding and decoding.
      static <T> AvroCoder<T> reflect​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
      Returns an AvroCoder instance for the provided element type respecting Avro's Reflect* suite for encoding and decoding.
      static <T> AvroCoder<T> specific​(java.lang.Class<T> type)
      Returns an AvroCoder instance for the provided element type respecting Avro's Specific* suite for encoding and decoding.
      static <T> AvroCoder<T> specific​(java.lang.Class<T> type, org.apache.avro.Schema schema)
      Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting Avro's Specific* suite for encoding and decoding.
      static <T> AvroCoder<T> specific​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
      Returns an AvroCoder instance for the provided element type respecting Avro's Specific* suite for encoding and decoding.
      boolean useReflectApi()
      Deprecated.
      Kept for backward API compatibility only.
      void verifyDeterministic()  
      • Methods inherited from class org.apache.beam.sdk.coders.CustomCoder

        getCoderArguments
      • Methods inherited from class org.apache.beam.sdk.coders.Coder

        consistentWithEquals, decode, encode, getEncodedElementByteSize, getEncodedElementByteSizeUsingCoder, isRegisterByteSizeObserverCheap, registerByteSizeObserver, structuralValue, verifyDeterministic, verifyDeterministic
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • AvroCoder

        protected AvroCoder​(java.lang.Class<T> type,
                            org.apache.avro.Schema schema)
      • AvroCoder

        protected AvroCoder​(java.lang.Class<T> type,
                            org.apache.avro.Schema schema,
                            boolean useReflectApi)
      • AvroCoder

        protected AvroCoder​(AvroDatumFactory<T> datumFactory,
                            org.apache.avro.Schema schema)
    • Method Detail

      • generic

        public static AvroCoder<org.apache.avro.generic.GenericRecord> generic​(org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the Avro schema. The implicit type is GenericRecord.
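
As an illustration of generic(Schema), the sketch below builds a schema with Avro's SchemaBuilder and creates a coder for GenericRecord values. It is a sketch under assumptions: the import package for AvroCoder is assumed (it is not stated on this page), and GenericCoderSketch, userSchema, userCoder and the "User" schema are hypothetical names.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
// Assumed package; adjust to wherever AvroCoder lives in your Beam version.
import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

final class GenericCoderSketch {

  /** Hypothetical example schema: a record with a string and an int field. */
  static Schema userSchema() {
    return SchemaBuilder.record("User").namespace("example")
        .fields()
        .requiredString("name")
        .requiredInt("age")
        .endRecord();
  }

  /** Builds a coder for GenericRecord values that conform to the schema. */
  static AvroCoder<GenericRecord> userCoder() {
    return AvroCoder.generic(userSchema());
  }

  public static void main(String[] args) {
    AvroCoder<GenericRecord> coder = userCoder();
    System.out.println(coder.getType());   // the implicit element type
    System.out.println(coder.getSchema()); // the schema passed to generic(...)
  }
}
```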
      • specific

        public static <T> AvroCoder<T> specific​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
        Returns an AvroCoder instance for the provided element type respecting Avro's Specific* suite for encoding and decoding.
      • specific

        public static <T> AvroCoder<T> specific​(java.lang.Class<T> type)
        Returns an AvroCoder instance for the provided element type respecting Avro's Specific* suite for encoding and decoding.
      • specific

        public static <T> AvroCoder<T> specific​(java.lang.Class<T> type,
                                                org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting Avro's Specific* suite for encoding and decoding.

        The schema must correspond to the type provided.

      • reflect

        public static <T> AvroCoder<T> reflect​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
        Returns an AvroCoder instance for the provided element type respecting Avro's Reflect* suite for encoding and decoding.
      • reflect

        public static <T> AvroCoder<T> reflect​(java.lang.Class<T> type)
        Returns an AvroCoder instance for the provided element type respecting Avro's Reflect* suite for encoding and decoding.
      • reflect

        public static <T> AvroCoder<T> reflect​(java.lang.Class<T> type,
                                               org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting Avro's Reflect* suite for encoding and decoding.

        The schema must correspond to the type provided.
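
The two Class-based reflect overloads can be compared side by side. In this sketch, the explicit schema is obtained from Avro's ReflectData so that it corresponds to the type, as required above. The AvroCoder import package is an assumption, and Point, ReflectCoderSketch, inferred and explicit are hypothetical names.

```java
import org.apache.avro.Schema;
import org.apache.avro.reflect.ReflectData;
// Assumed package; adjust to wherever AvroCoder lives in your Beam version.
import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

final class ReflectCoderSketch {

  /** Lets Avro infer the schema from the class by reflection. */
  static AvroCoder<Point> inferred() {
    return AvroCoder.reflect(Point.class);
  }

  /** Supplies the schema explicitly; it must correspond to Point. */
  static AvroCoder<Point> explicit() {
    Schema schema = ReflectData.get().getSchema(Point.class);
    return AvroCoder.reflect(Point.class, schema);
  }

  public static void main(String[] args) {
    System.out.println(inferred().getSchema());
    System.out.println(explicit().getSchema());
  }
}

/** Hypothetical element class; Avro reflection reads its instance fields. */
final class Point {
  int x;
  int y;

  Point() {} // no-arg constructor, needed by Avro to instantiate on decode

  Point(int x, int y) {
    this.x = x;
    this.y = y;
  }
}
```

Passing the schema explicitly is mainly useful when the schema is managed outside the class, for example when it is loaded from a file; in that case it is up to the caller to keep it in step with the class.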

      • of

        public static AvroGenericCoder of​(org.apache.avro.Schema schema)
        Returns an AvroGenericCoder instance for the Avro schema. The implicit type is GenericRecord.
      • of

        public static <T> AvroCoder<T> of​(org.apache.beam.sdk.values.TypeDescriptor<T> type)
        Returns an AvroCoder instance for the provided element type.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(org.apache.beam.sdk.values.TypeDescriptor<T> type,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the provided element type, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> clazz)
        Returns an AvroCoder instance for the provided element class.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the given class, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the provided element type using the provided Avro schema.

        The schema must correspond to the type provided.

        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(AvroDatumFactory<T> datumFactory,
                                          org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the provided AvroDatumFactory using the provided Avro schema.

        The schema must correspond to the provided datumFactory's type.

        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          org.apache.avro.Schema schema,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the provided element type using the provided Avro schema, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.

        The schema must correspond to the type provided.

        Type Parameters:
        T - the element type
      • getCoderProvider

        public static org.apache.beam.sdk.coders.CoderProvider getCoderProvider()
        Returns a CoderProvider which uses the AvroCoder if possible for all types.

        It is unsafe to register this as a CoderProvider because Avro will reflectively accept dangerous types such as Object.

        This method is invoked reflectively from DefaultCoder.

      • getType

        public java.lang.Class<T> getType()
        Returns the type this coder encodes/decodes.
      • getDatumFactory

        public AvroDatumFactory<T> getDatumFactory()
        Returns the datum factory used for encoding/decoding.
      • getDatumWriter

        public org.apache.avro.io.DatumWriter<T> getDatumWriter()
        Returns the DatumWriter used for encoding.
      • getDatumReader

        public org.apache.avro.io.DatumReader<T> getDatumReader()
        Returns the DatumReader used for decoding.
      • encode

        public void encode​(T value,
                           java.io.OutputStream outStream)
                    throws java.io.IOException
        Specified by:
        encode in class org.apache.beam.sdk.coders.Coder<T>
        Throws:
        java.io.IOException
      • decode

        public T decode​(java.io.InputStream inStream)
                 throws java.io.IOException
        Specified by:
        decode in class org.apache.beam.sdk.coders.Coder<T>
        Throws:
        java.io.IOException
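
A round trip through encode and decode can be sketched with in-memory streams. This is an illustration under assumptions, not a prescribed usage: the AvroCoder import package is assumed, and RoundTripSketch, SCHEMA and roundTrip are hypothetical names. The IOException is wrapped because byte-array streams are not expected to fail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
// Assumed package; adjust to wherever AvroCoder lives in your Beam version.
import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

final class RoundTripSketch {

  /** Hypothetical example schema. */
  static final Schema SCHEMA = SchemaBuilder.record("User").namespace("example")
      .fields()
      .requiredString("name")
      .requiredInt("age")
      .endRecord();

  /** Encodes the record to bytes and decodes it back with the same coder. */
  static GenericRecord roundTrip(GenericRecord record) {
    AvroCoder<GenericRecord> coder = AvroCoder.generic(SCHEMA);
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      coder.encode(record, out);
      return coder.decode(new ByteArrayInputStream(out.toByteArray()));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    GenericRecord user = new GenericData.Record(SCHEMA);
    user.put("name", "Ada");
    user.put("age", 36);
    GenericRecord copy = roundTrip(user);
    // Decoded strings may be Avro Utf8 objects, so compare via toString().
    System.out.println(copy.get("name") + " " + copy.get("age"));
  }
}
```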
      • verifyDeterministic

        public void verifyDeterministic()
                                 throws org.apache.beam.sdk.coders.Coder.NonDeterministicException
        Overrides:
        verifyDeterministic in class org.apache.beam.sdk.coders.CustomCoder<T>
        Throws:
        org.apache.beam.sdk.coders.Coder.NonDeterministicException - when the type may not be deterministically encoded using the given Schema, the directBinaryEncoder, and the ReflectDatumWriter or GenericDatumWriter.
      • getSchema

        public org.apache.avro.Schema getSchema()
        Returns the schema used by this coder.
      • getEncodedTypeDescriptor

        public org.apache.beam.sdk.values.TypeDescriptor<T> getEncodedTypeDescriptor()
        Overrides:
        getEncodedTypeDescriptor in class org.apache.beam.sdk.coders.Coder<T>
      • equals

        public boolean equals​(@Nullable java.lang.Object other)
        Overrides:
        equals in class java.lang.Object
        Returns:
        true if the two AvroCoder instances have the same class, type and schema.
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
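
The equality contract stated for equals (same class, type and schema) can be illustrated with coders built from equal and from different schemas. This is a small sketch; the AvroCoder import package is an assumption, and EqualitySketch and recordSchema are hypothetical names.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
// Assumed package; adjust to wherever AvroCoder lives in your Beam version.
import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

final class EqualitySketch {

  /** Builds a one-field record schema with the given (hypothetical) name. */
  static Schema recordSchema(String name) {
    return SchemaBuilder.record(name).namespace("example")
        .fields()
        .requiredInt("id")
        .endRecord();
  }

  public static void main(String[] args) {
    AvroCoder<GenericRecord> a = AvroCoder.generic(recordSchema("A"));
    AvroCoder<GenericRecord> sameAsA = AvroCoder.generic(recordSchema("A"));
    AvroCoder<GenericRecord> b = AvroCoder.generic(recordSchema("B"));

    System.out.println(a.equals(sameAsA));                  // same type and equal schema
    System.out.println(a.hashCode() == sameAsA.hashCode()); // consistent with equals
    System.out.println(a.equals(b));                        // schemas differ
  }
}
```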