Class IsmFormat
- java.lang.Object
  - org.apache.beam.runners.dataflow.internal.IsmFormat
public class IsmFormat extends java.lang.Object
An Ism file is a prefix encoded composite key value file broken into shards. Each composite key is composed of a fixed number of component keys. A fixed number of those sub keys represent the shard key portion; see IsmFormat.IsmRecord and IsmFormat.IsmRecordCoder for further details around the data format. In addition to the data, there is a bloom filter, and multiple indices to allow for efficient retrieval.
An Ism file is composed of these high level sections (in order):
- shard block
- bloom filter (See ScalableBloomFilter for details on encoding format)
- shard index
- footer (See IsmFormat.Footer for details on encoding format)
The shard block is composed of multiple copies of the following:
- data block
- data index
The data block is composed of multiple copies of the following:
- key prefix (See IsmFormat.KeyPrefix for details on encoding format)
- unshared key bytes
- value bytes
- optional 0x00 0x00 bytes followed by metadata bytes (if the following 0x00 0x00 bytes are not present, then there are no metadata bytes)
1225801234 as the seed value.
The data index is composed of N copies of the following:
- key prefix (See IsmFormat.KeyPrefix for details on encoding format)
- unshared key bytes
- byte offset to key prefix in data block (variable length long coding)
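The pairing of a key prefix with unshared key bytes is a form of prefix compression: each key records how many leading bytes it shares with the previous key and then stores only the bytes that differ. The following is a minimal Java sketch of that idea. The class `PrefixEncodingSketch` and its method names are hypothetical and chosen for illustration; they are not part of IsmFormat and do not reproduce the actual IsmFormat.KeyPrefix wire encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of prefix-compressing consecutive keys. */
final class PrefixEncodingSketch {

    /** Counts the leading bytes that the two keys have in common. */
    public static int sharedPrefixLength(byte[] previousKey, byte[] currentKey) {
        int limit = Math.min(previousKey.length, currentKey.length);
        int shared = 0;
        while (shared < limit && previousKey[shared] == currentKey[shared]) {
            shared++;
        }
        return shared;
    }

    /** Returns the tail of currentKey that is not shared with previousKey. */
    public static byte[] unsharedBytes(byte[] previousKey, byte[] currentKey) {
        int shared = sharedPrefixLength(previousKey, currentKey);
        return Arrays.copyOfRange(currentKey, shared, currentKey.length);
    }

    /** Rebuilds a key from the previous key, a shared length, and the unshared bytes. */
    public static byte[] reconstruct(byte[] previousKey, int sharedLength, byte[] unshared) {
        // Copy the shared prefix (zero-padded to the final size), then append the tail.
        byte[] key = Arrays.copyOf(previousKey, sharedLength + unshared.length);
        System.arraycopy(unshared, 0, key, sharedLength, unshared.length);
        return key;
    }

    public static void main(String[] args) {
        byte[] previous = "apple".getBytes(StandardCharsets.UTF_8);
        byte[] current = "apply".getBytes(StandardCharsets.UTF_8);
        int shared = sharedPrefixLength(previous, current);
        byte[] tail = unsharedBytes(previous, current);
        byte[] rebuilt = reconstruct(previous, shared, tail);
        System.out.println(shared + " shared, " + tail.length + " unshared, rebuilt="
            + new String(rebuilt, StandardCharsets.UTF_8));
    }
}
```

A reader of such a block has to process keys in order, because every key depends on the one before it. An index that stores byte offsets to key prefixes, as the data index does, lets a reader seek to an offset instead of scanning the whole block.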
The shard index is composed of a variable length integer encoding representing the number of shard index records followed by that many shard index records. See IsmFormat.IsmShardCoder for further details as to its encoding scheme.
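The text above only says that the record count uses a variable length integer encoding; the authoritative format is whatever the referenced coders define. As a rough illustration, the sketch below assumes the common base-128 scheme, in which each byte carries seven payload bits (low-order group first) and the high bit marks that more bytes follow. `VarIntSketch` is a hypothetical name, not a Beam class.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical base-128 variable length integer codec (an assumed scheme). */
final class VarIntSketch {

    /** Encodes value as 1 to 5 bytes, least significant 7-bit group first. */
    public static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The unsigned shift guarantees termination for negative inputs as well.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value); // final byte has the high bit clear
        return out.toByteArray();
    }

    /** Decodes a single value from the start of bytes. */
    public static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift >= 32) {
                throw new IllegalArgumentException("varint too long for an int");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        byte[] encoded = encode(300);
        System.out.println(encoded.length + " bytes, decoded=" + decode(encoded));
    }
}
```

Under this assumed scheme, small counts such as a typical number of shards fit in a single byte, which is the usual motivation for a variable length encoding.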
Nested Class Summary
Nested Classes
- static class IsmFormat.Footer: The footer stores the relevant information required to locate the index and bloom filter.
- static class IsmFormat.FooterCoder: A Coder for IsmFormat.Footer.
- static class IsmFormat.IsmRecord<V>: A record containing a composite key and either a value or metadata.
- static class IsmFormat.IsmRecordCoder<V>: A Coder for IsmFormat.IsmRecords.
- static class IsmFormat.IsmShard: A shard descriptor containing shard id, the data block offset, and the index offset for the given shard.
- static class IsmFormat.IsmShardCoder: A coder for IsmFormat.IsmShards.
- static class IsmFormat.KeyPrefix: The prefix used before each key which contains the number of shared and unshared bytes from the previous key that was read.
- static class IsmFormat.KeyPrefixCoder: A Coder for IsmFormat.KeyPrefix.
- static class IsmFormat.MetadataKeyCoder<K>: A coder for metadata key component.
Field Summary
Fields
- static org.apache.beam.sdk.coders.Coder<java.util.List<IsmFormat.IsmShard>> ISM_SHARD_INDEX_CODER: A ListCoder wrapping an IsmFormat.IsmShardCoder used to encode the shard index.
- static int SHARD_BITS
Constructor Summary
Constructors
- IsmFormat()
Method Summary
All Methods
- static java.lang.Object getMetadataKey(): An object representing a wild card for a key component.
- static boolean isMetadataKey(java.util.List<?> keyComponents): Returns true if and only if any of the passed in key components represent a metadata key.
- static void validateCoderIsCompatible(IsmFormat.IsmRecordCoder<?> coder): Validates that the key portion of the given coder is deterministic.
Field Detail
SHARD_BITS
public static final int SHARD_BITS
- See Also:
- Constant Field Values
ISM_SHARD_INDEX_CODER
public static final org.apache.beam.sdk.coders.Coder<java.util.List<IsmFormat.IsmShard>> ISM_SHARD_INDEX_CODER
A ListCoder wrapping an IsmFormat.IsmShardCoder used to encode the shard index. See ListCoder and IsmFormat.IsmShardCoder for their encoding specifications.
Method Detail
validateCoderIsCompatible
public static void validateCoderIsCompatible(IsmFormat.IsmRecordCoder<?> coder)
Validates that the key portion of the given coder is deterministic.
isMetadataKey
public static boolean isMetadataKey(java.util.List<?> keyComponents)
Returns true if and only if any of the passed-in key components represents a metadata key.
getMetadataKey
public static java.lang.Object getMetadataKey()
An object representing a wild card for a key component. Encoded using IsmFormat.MetadataKeyCoder.
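The relationship between getMetadataKey() and isMetadataKey(...) can be pictured as a sentinel check: one distinguished object stands for the wild card, and a composite key counts as a metadata key when any of its components is that object. The sketch below is a stand-alone approximation that does not depend on Beam. `MetadataKeySketch` is a hypothetical class, and the use of reference identity to detect the sentinel is an assumption, not something the documentation above states.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the metadata key helpers; not the Beam implementation. */
final class MetadataKeySketch {

    // Assumed: a single shared sentinel object represents the wild card component.
    private static final Object METADATA_KEY = new Object() {
        @Override
        public String toString() {
            return "META";
        }
    };

    /** Returns the sentinel that marks a key component as metadata. */
    public static Object getMetadataKey() {
        return METADATA_KEY;
    }

    /** Returns true if and only if at least one component is the sentinel. */
    public static boolean isMetadataKey(List<?> keyComponents) {
        for (Object component : keyComponents) {
            if (component == METADATA_KEY) { // identity comparison (assumption)
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> dataKey = Arrays.asList("user", 42L);
        List<Object> metaKey = Arrays.asList("user", getMetadataKey());
        System.out.println(isMetadataKey(dataKey) + " " + isMetadataKey(metaKey));
    }
}
```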