Class DynamicDestinations<T,DestinationT>
- java.lang.Object
  - org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations<T,DestinationT>
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
PortableBigQueryDestinations,StorageApiDynamicDestinationsTableRow
public abstract class DynamicDestinations<T,DestinationT> extends java.lang.Object implements java.io.Serializable

This class provides the most general way of specifying dynamic BigQuery table destinations. Destinations can be extracted from the input element, and stored as a custom type. Mappings are provided to convert the destination into a BigQuery table reference and a BigQuery schema. The class can read side inputs while performing these mappings.

For example, consider a PCollection of events, each containing a user-id field. You want to write each user's events to a separate table with a separate schema per user. Since the user-id field is a string, you will represent the destination as a string.

```java
events.apply(BigQueryIO.<UserEvent>write()
    .to(new DynamicDestinations<UserEvent, String>() {
      public String getDestination(ValueInSingleWindow<UserEvent> element) {
        return element.getValue().getUserId();
      }
      public TableDestination getTable(String user) {
        return new TableDestination(tableForUser(user), "Table for user " + user);
      }
      public TableSchema getSchema(String user) {
        return tableSchemaForUser(user);
      }
    })
    .withFormatFunction(new SerializableFunction<UserEvent, TableRow>() {
      public TableRow apply(UserEvent event) {
        return convertUserEventToTableRow(event);
      }
    }));
```

An instance of DynamicDestinations can also use side inputs using sideInput(PCollectionView). The side inputs must be present in getSideInputs(). Side inputs are accessed in the global window, so they must be globally windowed.

DestinationT is expected to provide proper hash and equality members. Ideally it will be a compact type with an efficient coder, as these objects may be used as a key in a GroupByKey.

- See Also:
- Serialized Form
-
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| DynamicDestinations() | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| abstract DestinationT | getDestination(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element) | Returns an object that represents at a high level which table is being written to. |
| @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> | getDestinationCoder() | Returns the coder for DestinationT. |
| abstract @Nullable com.google.api.services.bigquery.model.TableSchema | getSchema(DestinationT destination) | Returns the table schema for the destination. |
| java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> | getSideInputs() | Specifies that this object needs access to one or more side inputs. |
| abstract TableDestination | getTable(DestinationT destination) | Returns a TableDestination object for the destination. |
| @Nullable com.google.api.services.bigquery.model.TableConstraints | getTableConstraints(DestinationT destination) | Returns TableConstraints (including primary and foreign key) to be used when creating the table. |
| protected <SideInputT> SideInputT | sideInput(org.apache.beam.sdk.values.PCollectionView<SideInputT> view) | Returns the value of a given side input. |
-
-
-
Method Detail
-
getSideInputs
public java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> getSideInputs()
Specifies that this object needs access to one or more side inputs. These side inputs must be globally windowed, as they will be accessed from the global window.
-
sideInput
protected final <SideInputT> SideInputT sideInput(org.apache.beam.sdk.values.PCollectionView<SideInputT> view)
Returns the value of a given side input. The view must be present in getSideInputs().
-
getDestination
public abstract DestinationT getDestination(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element)
Returns an object that represents at a high level which table is being written to. May not return null. The method must return a distinct object for each distinct destination table across all BigQueryIO write transforms in the same pipeline. See https://github.com/apache/beam/issues/32335 for details.
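The class description asks that DestinationT provide proper hash and equality members. When a plain String is not enough, a small value type can serve as the destination. The following is a minimal sketch using only the JDK; `UserTableKey` and its fields are hypothetical names and are not part of Beam.

```java
import java.util.Objects;

// Hypothetical destination type: identifies a table by user id and region.
final class UserTableKey {
    private final String userId;
    private final String region;

    UserTableKey(String userId, String region) {
        this.userId = Objects.requireNonNull(userId, "userId");
        this.region = Objects.requireNonNull(region, "region");
    }

    String userId() { return userId; }

    String region() { return region; }

    // Two keys are equal exactly when they name the same destination.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof UserTableKey)) return false;
        UserTableKey that = (UserTableKey) other;
        return userId.equals(that.userId) && region.equals(that.region);
    }

    // Consistent with equals: equal keys always share a hash code.
    @Override
    public int hashCode() { return Objects.hash(userId, region); }

    @Override
    public String toString() { return region + "/" + userId; }
}
```

A type like this would still need a coder, either registered in the coder registry or returned from getDestinationCoder().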
-
getDestinationCoder
public @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> getDestinationCoder()
Returns the coder for DestinationT. If this is not overridden, then BigQueryIO will look in the coder registry for a suitable coder. This must be a deterministic coder, as DestinationT will be used as a key type in a GroupByKey.
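To illustrate what "deterministic" means in this context: equal destination values should always encode to the same bytes. The sketch below shows the idea with plain JDK streams rather than Beam's Coder API; `DestinationEncoding` and `encode` are hypothetical names, and the destination is assumed to be a map of string attributes purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Beam: encodes a destination made of
// string attributes into bytes in a reproducible way.
final class DestinationEncoding {
    private DestinationEncoding() {}

    // Writes entries in sorted key order so that equal maps always yield
    // identical bytes, regardless of the iteration order of the input map.
    static byte[] encode(Map<String, String> attributes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            Map<String, String> sorted = new TreeMap<>(attributes);
            out.writeInt(sorted.size());
            for (Map.Entry<String, String> entry : sorted.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        } catch (IOException e) {
            // In-memory streams are not expected to fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Encoding the entries in the input map's own iteration order instead could produce different bytes for equal maps, which is the kind of behavior a deterministic coder has to rule out.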
-
getTable
public abstract TableDestination getTable(DestinationT destination)
Returns a TableDestination object for the destination. May not return null. The return value must be unique to each destination: the method may not return the same TableDestination for different destinations.
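The uniqueness requirement can be checked in a unit test before a pipeline runs. The following is a minimal sketch using only the JDK, in which a plain function from destination to table spec string stands in for getTable; `TableMappingCheck`, `findCollisions`, and the table spec strings in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical test helper, not part of Beam.
final class TableMappingCheck {
    private TableMappingCheck() {}

    // Returns the table specs that more than one distinct destination maps to.
    // Destinations are assumed to be non-null.
    static <D> List<String> findCollisions(
            Iterable<D> destinations, Function<D, String> tableSpecFor) {
        Map<String, D> owner = new HashMap<>();
        List<String> collisions = new ArrayList<>();
        for (D destination : destinations) {
            String spec = tableSpecFor.apply(destination);
            // Remember the first destination seen for each table spec.
            D previous = owner.putIfAbsent(spec, destination);
            if (previous != null
                    && !previous.equals(destination)
                    && !collisions.contains(spec)) {
                collisions.add(spec);
            }
        }
        return collisions;
    }
}
```

For instance, with the destinations "alice", "bob", and "Alice", a mapping that lower-cases the user id before building the table name would report a collision, because "alice" and "Alice" would end up in the same table.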
-
getSchema
public abstract @Nullable com.google.api.services.bigquery.model.TableSchema getSchema(DestinationT destination)
Returns the table schema for the destination.
-
getTableConstraints
public @Nullable com.google.api.services.bigquery.model.TableConstraints getTableConstraints(DestinationT destination)
Returns TableConstraints (including primary and foreign keys) to be used when creating the table. Note: this is not currently supported when using FILE_LOADS.