@Internal public abstract class LazyBinaryFormat<T> extends Object implements BinaryFormat
An abstract BinaryFormat implementation that is lazily serialized into binary form or
lazily deserialized into a Java object.
This data structure exists to avoid repeated (de)serialization in nested function calls. Consider the following function call chain:
UDF0(input) -> UDF1(result0) -> UDF2(result1) -> UDF3(result2)
In such nested calls, if each UDF returns its value as a Java object, every step forces a conversion between Java object and binary format:
converterToBinary(UDF0(converterToJavaObject(input))) ->
converterToBinary(UDF1(converterToJavaObject(result0))) ->
converterToBinary(UDF2(converterToJavaObject(result1))) ->
...
LazyBinaryFormat was introduced to avoid this redundant cost. It has three forms:

- Binary form
- Java object form
- Binary and Java object both exist

It defers conversions as long as possible and converts to the required form only when that form is needed.
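The saving described above can be illustrated with a small, self-contained sketch. It does not use the Flink classes: `LazyString`, `udf`, and the `conversions` counter are hypothetical stand-ins, and a `byte[]` plays the role of the binary form. The sketch assumes each UDF consumes and produces the Java object form, so only the first input and the final output need converting.

```java
import java.nio.charset.StandardCharsets;

public class LazyChainSketch {

    /** Hypothetical stand-in for a lazy value; NOT the Flink class. */
    static final class LazyString {
        static int conversions = 0; // counts every object <-> binary conversion
        private String javaObject;  // null until the Java object form is needed
        private byte[] binary;      // null until the binary form is needed

        LazyString(String javaObject) { this.javaObject = javaObject; }

        LazyString(byte[] binary) { this.binary = binary; }

        /** Deserializes only if the Java object form does not exist yet. */
        String getJavaObject() {
            if (javaObject == null) {
                javaObject = new String(binary, StandardCharsets.UTF_8);
                conversions++;
            }
            return javaObject;
        }

        /** Serializes only if the binary form does not exist yet. */
        byte[] getBinary() {
            if (binary == null) {
                binary = javaObject.getBytes(StandardCharsets.UTF_8);
                conversions++;
            }
            return binary;
        }
    }

    /** Hypothetical UDF: works on the Java object form and returns it unconverted. */
    static LazyString udf(LazyString in, String suffix) {
        return new LazyString(in.getJavaObject() + suffix);
    }

    /** Chains three UDFs over a binary input and asks for a binary result. */
    public static int runChain() {
        LazyString.conversions = 0;
        LazyString input = new LazyString("in".getBytes(StandardCharsets.UTF_8));
        LazyString result = udf(udf(udf(input, "0"), "1"), "2");
        result.getBinary(); // the only serialization in the whole chain
        return LazyString.conversions;
    }

    public static void main(String[] args) {
        System.out.println("conversions: " + runChain()); // prints "conversions: 2"
    }
}
```

In this sketch the chain performs two conversions in total (one deserialization of the input, one serialization of the final result). An eager chain would convert in both directions at each of the three UDFs, six conversions in all.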
Fields inherited from interface BinaryFormat: HIGHEST_FIRST_BIT, HIGHEST_SECOND_TO_EIGHTH_BIT, MAX_FIX_PART_DATA_SIZE

| Constructor and Description |
|---|
| LazyBinaryFormat() |
| LazyBinaryFormat(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes) |
| LazyBinaryFormat(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes, T javaObject) |
| LazyBinaryFormat(T javaObject) |
| LazyBinaryFormat(T javaObject, BinarySection binarySection) |


| Modifier and Type | Method and Description |
|---|---|
| void | ensureMaterialized(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer): Ensure we have materialized binary format. |
| BinarySection | getBinarySection() |
| T | getJavaObject() |
| int | getOffset(): Gets the start offset of this binary data in the MemorySegments. |
| org.apache.flink.core.memory.MemorySegment[] | getSegments(): Gets the underlying MemorySegments this binary format spans. |
| int | getSizeInBytes(): Gets the size in bytes of this binary data. |
| protected abstract BinarySection | materialize(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer): Materialize java object to binary format. |
| void | setJavaObject(T javaObject): Must be public as it is used during code generation. |

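The relationship between the final `ensureMaterialized` and the abstract `materialize` looks like a template method: the base class decides when to serialize and the subclass decides how. The following is a minimal sketch of that pattern under stated assumptions, not the Flink implementation. `LazyValue`, `LazyUtf8`, and `materializeCalls` are hypothetical names, a `byte[]` stands in for `BinarySection`, and a `java.util.function.Function` stands in for `TypeSerializer`. That materialization happens at most once is an assumption of the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class MaterializeSketch {

    /** Hypothetical, simplified analogue of a lazily materialized value. */
    abstract static class LazyValue<T> {
        protected T javaObject;
        protected byte[] binary; // stands in for BinarySection; null until materialized

        LazyValue(T javaObject) { this.javaObject = javaObject; }

        /** Final: subclasses cannot change WHEN materialization happens. */
        public final void ensureMaterialized(Function<T, byte[]> serializer) {
            if (binary == null) { // assumption: materialize at most once
                binary = materialize(serializer);
            }
        }

        /** Subclasses decide HOW the Java object becomes binary. */
        protected abstract byte[] materialize(Function<T, byte[]> serializer);
    }

    /** Hypothetical subclass that delegates to the supplied serializer. */
    static final class LazyUtf8 extends LazyValue<String> {
        int materializeCalls = 0; // instrumentation for the demo only

        LazyUtf8(String s) { super(s); }

        @Override
        protected byte[] materialize(Function<String, byte[]> serializer) {
            materializeCalls++;
            return serializer.apply(javaObject);
        }
    }

    /** Returns {number of materialize calls, size of the binary form in bytes}. */
    public static int[] demo() {
        Function<String, byte[]> utf8 = s -> s.getBytes(StandardCharsets.UTF_8);
        LazyUtf8 value = new LazyUtf8("flink");
        value.ensureMaterialized(utf8);
        value.ensureMaterialized(utf8); // second call is a no-op
        return new int[] {value.materializeCalls, value.binary.length};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " call(s), " + r[1] + " bytes"); // prints "1 call(s), 5 bytes"
    }
}
```

Keeping the guard in a final method means every caller can request the binary form unconditionally, while the serialization cost is paid only the first time.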
public LazyBinaryFormat()
public LazyBinaryFormat(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes, T javaObject)
public LazyBinaryFormat(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes)
public LazyBinaryFormat(T javaObject)
public LazyBinaryFormat(T javaObject, BinarySection binarySection)
public T getJavaObject()
public BinarySection getBinarySection()
public void setJavaObject(T javaObject)
public org.apache.flink.core.memory.MemorySegment[] getSegments()
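Description copied from interface: BinaryFormat
Gets the underlying MemorySegments this binary format spans.
Specified by: getSegments in interface BinaryFormat
public int getOffset()
Description copied from interface: BinaryFormat
Gets the start offset of this binary data in the MemorySegments.
Specified by: getOffset in interface BinaryFormat
public int getSizeInBytes()
Description copied from interface: BinaryFormat
Gets the size in bytes of this binary data.
Specified by: getSizeInBytes in interface BinaryFormat
public final void ensureMaterialized(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer)
Ensure we have materialized binary format.
protected abstract BinarySection materialize(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer) throws IOException
Materialize java object to binary format. Inherited classes need to hold the information they need (for example, RawValueData needs javaObjectSerializer).
Throws: IOException
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.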