Package org.apache.arrow.vector.complex
Class BaseRepeatedValueVector
java.lang.Object
org.apache.arrow.vector.BaseValueVector
org.apache.arrow.vector.complex.BaseRepeatedValueVector
- All Implemented Interfaces:
Closeable, AutoCloseable, Iterable<ValueVector>, BaseListVector, RepeatedValueVector, DensityAwareVector, FieldVector, ValueVector
- Direct Known Subclasses:
ListVector
public abstract class BaseRepeatedValueVector
extends BaseValueVector
implements RepeatedValueVector, BaseListVector
Base class for Vectors that contain repeated values.
-
Field Summary
Fields
Modifier and Type        Field
static final String      DATA_VECTOR_NAME
static final FieldVector DEFAULT_DATA_VECTOR
protected String         defaultDataVectorName
static final byte        OFFSET_WIDTH
protected long           offsetAllocationSizeInBytes
protected ArrowBuf       offsetBuffer
protected final CallBack repeatedCallBack
protected int            valueCount
protected FieldVector    vector
Fields inherited from class org.apache.arrow.vector.BaseValueVector
allocator, fieldReader, INITIAL_VALUE_ALLOCATION, MAX_ALLOCATION_SIZE, MAX_ALLOCATION_SIZE_PROPERTY
Fields inherited from interface org.apache.arrow.vector.complex.RepeatedValueVector
DEFAULT_REPEAT_PER_RECORD
-
Constructor Summary
Constructors
Modifier   Constructor
protected  BaseRepeatedValueVector(String name, BufferAllocator allocator, FieldVector vector, CallBack callBack)
protected  BaseRepeatedValueVector(String name, BufferAllocator allocator, CallBack callBack)
-
Method Summary
Modifier and Type  Method
  Description
<T extends ValueVector> AddOrGetResult<T>  addOrGetVector(FieldType fieldType)
  Initialize the data vector (and execute callback) if it hasn't already been done, returns the data vector.
boolean  allocateNewSafe()
  Allocates new buffers.
protected ArrowBuf  allocateOffsetBuffer(long size)
void  clear()
  Release any owned ArrowBuf and reset the ValueVector to the initial state.
ArrowBuf[]  getBuffers(boolean clear)
  Return the underlying buffers associated with this vector.
int  getBufferSize()
  Get the number of bytes used by this vector.
int  getBufferSizeFor(int valueCount)
  Returns the number of bytes that is used by this vector if it holds the given number of values.
getDataVector()
  Get the data vector.
int  getInnerValueCount()
int  getInnerValueCountAt(int index)
  Returns the value count for inner data vector at a particular index.
getName()
  Gets the name of the vector.
protected int  getOffsetBufferValueCapacity()
getOffsetVector()
  Deprecated. This API will be removed, as the current implementations no longer hold inner offset vectors.
int  getValueCapacity()
  Returns the maximum number of values that can be stored in this vector instance.
int  getValueCount()
  Gets the number of values.
abstract boolean  isEmpty(int index)
  Return if value at index is empty.
iterator()
void  reAlloc()
  Allocate new buffer with double capacity, and copy data into the new buffer.
protected void  reallocOffsetBuffer()
protected void  replaceDataVector
void  reset()
  Reset the ValueVector to the initial state without releasing any owned ArrowBuf.
void  setInitialCapacity(int numRecords)
  Set the initial record capacity.
void  setInitialCapacity(int numRecords, double density)
  Specialized version of setInitialCapacity() for ListVector.
void  setInitialTotalCapacity(int numRecords, int totalNumberOfElements)
  Specialized version of setInitialTotalCapacity() for ListVector.
void  setValueCount(int valueCount)
  Preallocates the number of repeated values.
int  size()
  Get value indicating if inner vector is set.
int  startNewValue(int index)
  Starts a new repeated value.
Methods inherited from class org.apache.arrow.vector.BaseValueVector
checkBufRefs, close, copyFrom, copyFromSafe, getAllocator, getReader, getReaderImpl, getTransferPair, getValidityBufferSizeFromCount, releaseBuffer, toString, transferBuffer
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.arrow.vector.complex.BaseListVector
getElementEndIndex, getElementStartIndex
Methods inherited from interface org.apache.arrow.vector.FieldVector
exportBuffer, exportCDataBuffers, getChildrenFromFields, getDataBufferAddress, getExportedCDataBufferCount, getFieldBuffers, getFieldInnerVectors, getOffsetBufferAddress, getValidityBufferAddress, initializeChildrenFromFields, loadFieldBuffers, setNull
Methods inherited from interface java.lang.Iterable
forEach, spliterator
Methods inherited from interface org.apache.arrow.vector.ValueVector
accept, allocateNew, close, copyFrom, copyFromSafe, getAllocator, getDataBuffer, getField, getMinorType, getNullCount, getObject, getOffsetBuffer, getReader, getTransferPair, getTransferPair, getTransferPair, getTransferPair, getTransferPair, getValidityBuffer, hashCode, hashCode, isNull, makeTransferPair, validate, validateFull
-
Field Details
-
DEFAULT_DATA_VECTOR
-
DATA_VECTOR_NAME
-
OFFSET_WIDTH
public static final byte OFFSET_WIDTH
-
offsetBuffer
-
vector
-
repeatedCallBack
-
valueCount
protected int valueCount
-
offsetAllocationSizeInBytes
protected long offsetAllocationSizeInBytes
-
defaultDataVectorName
-
-
Constructor Details
-
BaseRepeatedValueVector
-
BaseRepeatedValueVector
protected BaseRepeatedValueVector(String name, BufferAllocator allocator, FieldVector vector, CallBack callBack)
-
-
Method Details
-
getName
Description copied from interface: ValueVector
Gets the name of the vector.
- Specified by:
getName in interface ValueVector
- Specified by:
getName in class BaseValueVector
- Returns:
- the name of the vector.
-
allocateNewSafe
public boolean allocateNewSafe()
Description copied from interface: ValueVector
Allocates new buffers. ValueVector implements logic to determine how much to allocate.
- Specified by:
allocateNewSafe in interface ValueVector
- Returns:
- Returns true if allocation was successful.
-
allocateOffsetBuffer
-
reAlloc
public void reAlloc()
Description copied from interface: ValueVector
Allocates a new buffer with double the capacity and copies the data into it, then replaces the vector's buffer with the new buffer and releases the old one.
- Specified by:
reAlloc in interface ValueVector
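The double-and-copy behaviour described above can be pictured with a small plain-Java model. This is only a sketch: `DoublingBufferSketch`, its `int[]` storage, and the `setSafe` helper are hypothetical stand-ins for illustration and are not part of Arrow, which works on reference-counted ArrowBuf instances.

```java
// Plain-Java model of "allocate double capacity, copy, replace"; not Arrow code.
final class DoublingBufferSketch {
    // Stand-in for the vector's buffer; the initial capacity of 4 is arbitrary.
    private int[] buffer = new int[4];

    int capacity() {
        return buffer.length;
    }

    int get(int index) {
        return buffer[index];
    }

    void set(int index, int value) {
        buffer[index] = value;
    }

    // Allocate a buffer of twice the capacity, copy the old contents, swap it in.
    void reAlloc() {
        int[] bigger = new int[buffer.length * 2];
        System.arraycopy(buffer, 0, bigger, 0, buffer.length);
        // The old array is simply dropped here; a real vector would release its old buffer.
        buffer = bigger;
    }

    // Hypothetical helper: grow until index is addressable, then write.
    void setSafe(int index, int value) {
        while (index >= buffer.length) {
            reAlloc();
        }
        buffer[index] = value;
    }
}
```

Because each call doubles the capacity, a write far past the current end triggers only a logarithmic number of reallocations, and previously written values survive every step.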
-
reallocOffsetBuffer
protected void reallocOffsetBuffer()
-
getOffsetVector
Deprecated.
This API will be removed, as the current implementations no longer hold inner offset vectors.
Get the offset vector.
- Specified by:
getOffsetVector in interface RepeatedValueVector
- Returns:
- the underlying offset vector or null if none exists.
-
getDataVector
Description copied from interface: RepeatedValueVector
Get the data vector.
- Specified by:
getDataVector in interface RepeatedValueVector
- Returns:
- the underlying data vector or null if none exists.
-
setInitialCapacity
public void setInitialCapacity(int numRecords)
Description copied from interface: ValueVector
Set the initial record capacity.
- Specified by:
setInitialCapacity in interface ValueVector
- Parameters:
numRecords - the initial record capacity.
-
setInitialCapacity
public void setInitialCapacity(int numRecords, double density)
Specialized version of setInitialCapacity() for ListVector. Callers use it when they want explicit, conservative control over the memory allocated for the inner data vector. This is most useful when a query runs under memory constraints and a fixed amount of memory is reserved for the record batch. In such cases, reserving memory for a record batch with value count x and then calling setInitialCapacity(x), so that each vector allocates only what is necessary rather than the default amount, can still lead to OOM or related problems, because the multiplier forces the memory requirement beyond what was needed.
- Specified by:
setInitialCapacity in interface DensityAwareVector
- Parameters:
numRecords - value count
density - density of ListVector. Density is the average size of list per position in the List vector. For example, a density value of 10 implies each position in the list vector has a list of 10 values. A density value of 0.1 implies that out of 10 positions in the list vector, 1 position has a list of size 1 and the remaining positions are null (no lists) or empty lists. This helps in tightly controlling the memory provisioned for the inner data vector.
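To make the density arithmetic concrete, here is a minimal plain-Java sketch. It assumes the inner capacity is the product of numRecords and density, truncated to an int and floored at 1; that rule, the class name `DensitySketch`, and the exception type are illustrative assumptions, not a statement of what the library does internally.

```java
// Plain-Java illustration of density-based sizing; not Arrow code.
final class DensitySketch {
    // Assumed rule: inner capacity = numRecords * density, truncated, at least 1.
    static int innerValueCapacity(int numRecords, double density) {
        double requested = numRecords * density;
        if (requested >= Integer.MAX_VALUE) {
            // Hypothetical guard: refuse sizes that cannot be expressed as an int.
            throw new IllegalArgumentException("requested inner capacity is too large");
        }
        return Math.max((int) requested, 1);
    }

    public static void main(String[] args) {
        // Density 10: every position holds a list of 10 values.
        System.out.println(innerValueCapacity(1000, 10.0));
        // Density 0.1: about one position in ten holds a single value.
        System.out.println(innerValueCapacity(1000, 0.1));
    }
}
```

Under this assumption, 1000 records need room for 10000 inner values at density 10, but only 100 at density 0.1, which is the kind of saving the density parameter is meant to expose.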
-
setInitialTotalCapacity
public void setInitialTotalCapacity(int numRecords, int totalNumberOfElements)
Specialized version of setInitialTotalCapacity() for ListVector. Callers use it when they want explicit, conservative control over the memory allocated for the inner data vector. This is most useful when a query runs under memory constraints and a fixed amount of memory is reserved for the record batch. In such cases, reserving memory for a record batch with value count x and then calling setInitialCapacity(x), so that each vector allocates only what is necessary rather than the default amount, can still lead to OOM or related problems, because the multiplier forces the memory requirement beyond what was needed.
- Parameters:
numRecords - value count
totalNumberOfElements - the total number of elements to allow for in this vector across all records.
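When a fixed memory budget is the concern, a back-of-the-envelope estimate can help pick the two arguments. The sketch below is plain Java and rests on stated assumptions: offsets are 4 bytes wide (the value of OFFSET_WIDTH is not given on this page), the offset buffer holds one entry per record plus a trailing end offset, inner elements have a fixed byte width, and validity buffers and allocator rounding are ignored. `TotalCapacitySketch` and its methods are hypothetical names.

```java
// Rough byte estimate for a list vector with an explicit element total; not Arrow code.
final class TotalCapacitySketch {
    // Assumption: each offset occupies 4 bytes.
    static final int ASSUMED_OFFSET_WIDTH = 4;

    // One offset per record plus one trailing end offset.
    static long offsetBytes(int numRecords) {
        return ((long) numRecords + 1) * ASSUMED_OFFSET_WIDTH;
    }

    // Inner data bytes for fixed-width elements.
    static long innerDataBytes(int totalNumberOfElements, int elementWidthBytes) {
        return (long) totalNumberOfElements * elementWidthBytes;
    }

    // True if the estimated footprint stays within the reserved budget.
    static boolean fitsBudget(int numRecords, int totalNumberOfElements,
                              int elementWidthBytes, long budgetBytes) {
        long estimate = offsetBytes(numRecords)
                + innerDataBytes(totalNumberOfElements, elementWidthBytes);
        return estimate <= budgetBytes;
    }
}
```

For example, 1000 records holding 2500 eight-byte elements in total come to an estimated 4004 + 20000 bytes under these assumptions.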
-
getValueCapacity
public int getValueCapacity()
Description copied from interface: ValueVector
Returns the maximum number of values that can be stored in this vector instance.
- Specified by:
getValueCapacity in interface ValueVector
- Returns:
- the maximum number of values that can be stored in this vector instance.
-
getOffsetBufferValueCapacity
protected int getOffsetBufferValueCapacity()
-
getBufferSize
public int getBufferSize()
Description copied from interface: ValueVector
Get the number of bytes used by this vector.
- Specified by:
getBufferSize in interface ValueVector
- Returns:
- the number of bytes that is used by this vector instance.
-
getBufferSizeFor
public int getBufferSizeFor(int valueCount)
Description copied from interface: ValueVector
Returns the number of bytes that is used by this vector if it holds the given number of values. The result will be the same as if setValueCount() were called, followed by calling getBufferSize(), but without any of the closing side effects that setValueCount() implies with respect to finishing off the population of a vector. Some operations might wish to use this to determine how much memory has been used by a vector so far, even though it is not finished being populated.
- Specified by:
getBufferSizeFor in interface ValueVector
- Parameters:
valueCount - the number of values to assume this vector contains
- Returns:
- the buffer size if this vector is holding valueCount values
-
iterator
- Specified by:
iterator in interface Iterable<ValueVector>
- Overrides:
iterator in class BaseValueVector
-
clear
public void clear()
Description copied from interface: ValueVector
Release any owned ArrowBuf and reset the ValueVector to the initial state. If the vector has any child vectors, they will also be cleared.
- Specified by:
clear in interface ValueVector
- Overrides:
clear in class BaseValueVector
-
reset
public void reset()
Description copied from interface: ValueVector
Reset the ValueVector to the initial state without releasing any owned ArrowBuf. Buffer capacities will remain unchanged and any previous data will be zeroed out. This includes buffers for data, validity, offset, etc. If the vector has any child vectors, they will also be reset.
- Specified by:
reset in interface ValueVector
-
getBuffers
Description copied from interface: ValueVector
Return the underlying buffers associated with this vector. Note that this doesn't impact the reference counts for these buffers, so it should only be used for in-context access. Also note that these buffers change regularly, thus external classes shouldn't hold a reference to them (unless they change them).
- Specified by:
getBuffers in interface ValueVector
- Parameters:
clear - Whether to clear vector before returning; the buffers will still be refcounted, but the returned array will be the only reference to them
- Returns:
- The underlying buffers that are used by this vector instance.
-
size
public int size()
Get value indicating if inner vector is set.
- Returns:
- 1 if inner vector is explicitly set via addOrGetVector, else 0
-
addOrGetVector
Initializes the data vector (and executes the callback) if this has not already been done, and returns the data vector.
-
replaceDataVector
-
getValueCount
public int getValueCount()
Description copied from interface: ValueVector
Gets the number of values.
- Specified by:
getValueCount in interface ValueVector
- Returns:
- number of values in the vector
-
getInnerValueCount
public int getInnerValueCount()
-
getInnerValueCountAt
public int getInnerValueCountAt(int index)
Returns the value count for inner data vector at a particular index.
-
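As an illustration of how a per-index inner value count can be derived from an offset buffer, the plain-Java sketch below models offsets as an `int[]` with one more entry than there are lists, so that list i occupies the half-open range from offsets[i] to offsets[i + 1]. This mirrors the general list layout of the Arrow columnar format; `OffsetModel` and its methods are hypothetical and do not reproduce this class's implementation.

```java
// Plain-Java model of list offsets; not Arrow code.
final class OffsetModel {
    // Number of inner values belonging to the list at the given index.
    static int innerValueCountAt(int[] offsets, int index) {
        return offsets[index + 1] - offsets[index];
    }

    // Total number of inner values across the first valueCount lists,
    // assuming the first offset is the start of the inner data.
    static int innerValueCount(int[] offsets, int valueCount) {
        return offsets[valueCount] - offsets[0];
    }

    public static void main(String[] args) {
        // Three lists: sizes 3, 0 and 2.
        int[] offsets = {0, 3, 3, 5};
        for (int i = 0; i < 3; i++) {
            System.out.println("list " + i + " has " + innerValueCountAt(offsets, i) + " values");
        }
    }
}
```

An empty list shows up as two equal consecutive offsets, which is why the count at that index is zero.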
isEmpty
public abstract boolean isEmpty(int index)
Returns whether the value at index is empty.
-
startNewValue
public int startNewValue(int index)
Starts a new repeated value.
-
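One way to picture starting a new repeated value is as bookkeeping on an offset array: the new value begins where the previous one ended and is initially empty. The plain-Java sketch below works under that assumption. `StartNewValueSketch`, its `endValue` helper, and the growth policy are hypothetical and are not taken from this class's source.

```java
// Plain-Java model of starting and ending repeated values; not Arrow code.
final class StartNewValueSketch {
    // offsets[i] is the start of value i; offsets[i + 1] is its end.
    private int[] offsets = new int[4];
    private int valueCount = 0;

    // Begin the value at index: it starts at the current offset and is empty for now.
    // Returns the position in the inner data where its elements would be written.
    int startNewValue(int index) {
        while (index + 1 >= offsets.length) {
            // Grow by doubling so that offsets[index + 1] is addressable.
            offsets = java.util.Arrays.copyOf(offsets, offsets.length * 2);
        }
        int start = offsets[index];
        offsets[index + 1] = start;
        valueCount = index + 1;
        return start;
    }

    // Hypothetical helper: record that the value at index received size elements.
    void endValue(int index, int size) {
        offsets[index + 1] = offsets[index] + size;
    }

    int valueCount() {
        return valueCount;
    }
}
```

In this model a caller would start a value, write its elements into the inner data beginning at the returned position, and then record the size so the next value starts right after it.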
setValueCount
public void setValueCount(int valueCount)
Preallocates the number of repeated values.
- Specified by:
setValueCount in interface ValueVector
-