Class ParquetSource
java.lang.Object
org.apache.wayang.core.plan.wayangplan.OperatorBase
org.apache.wayang.core.plan.wayangplan.UnarySource<Record>
org.apache.wayang.basic.operators.ParquetSource
- All Implemented Interfaces:
Serializable, ActualOperator, ElementaryOperator, Operator
- Direct Known Subclasses:
JavaParquetSource, SparkParquetSource
This source reads a Parquet file and outputs its rows as Record units.
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
protected class: Custom CardinalityEstimator for FlatMapOperators.
Nested classes/interfaces inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
OperatorBase.GsonSerializer
-
Field Summary
Fields inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
inputSlots, outputSlots, STANDARD_OPERATOR_ARGS
Fields inherited from interface org.apache.wayang.core.plan.wayangplan.Operator
FIRST_EPOCH
-
Constructor Summary
Constructors
Constructor, Description
ParquetSource(String inputUrl, String[] projection, String... fieldNames)
ParquetSource(String inputUrl, String[] projection, DataSetType<Record> type)
ParquetSource(ParquetSource that): Copies an instance (exclusive of broadcasts).
-
Method Summary
Modifier and Type, Method, Description
static ParquetSource create: Creates a new instance.
Optional<CardinalityEstimator> createCardinalityEstimator(int outputIndex, Configuration configuration)
getInputUrl
org.apache.parquet.hadoop.metadata.ParquetMetadata getMetadata()
String[] getProjection
org.apache.parquet.schema.MessageType getSchema()
Methods inherited from class org.apache.wayang.core.plan.wayangplan.UnarySource
getOutput, getType
Methods inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
accept, addBroadcastInput, addTargetPlatform, at, collectMappedInputSlots, collectMappedOutputSlots, copy, createCopy, getAllInputs, getAllOutputs, getCardinalityEstimator, getContainer, getEpoch, getName, getOriginal, getSimpleClassName, getTargetPlatforms, isAuxiliary, isSupportingBroadcastInputs, propagateInputCardinality, propagateOutputCardinality, setAuxiliary, setCardinalityEstimator, setContainer, setEpoch, setName, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.ActualOperator
accept
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.ElementaryOperator
getCardinalityEstimator, isAuxiliary, setAuxiliary, setCardinalityEstimator
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.Operator
addBroadcastInput, addTargetPlatform, broadcastTo, broadcastTo, collectMappedInputSlots, collectMappedOutputSlots, connectTo, connectTo, getAllInputs, getAllOutputs, getCardinalityPusher, getContainer, getEffectiveOccupant, getEffectiveOccupant, getEpoch, getEstimationContextProperties, getForwards, getInnermostLoop, getInput, getInput, getLoopStack, getName, getNumBroadcastInputs, getNumInputs, getNumOutputs, getNumRegularInputs, getOuterInputSlot, getOutermostInputSlot, getOutermostOutputSlots, getOutput, getOutput, getParent, getTargetPlatforms, isAlternative, isConversion, isElementary, isExecutionOperator, isFeedbackInput, isFeedforwardOutput, isLoopHead, isLoopSubplan, isOwnerOf, isReading, isSink, isSource, isSubplan, isSupportingBroadcastInputs, isUnconnected, propagateInputCardinality, propagateOutputCardinality, propagateOutputCardinality, replaceWith, setContainer, setEpoch, setInput, setName, setOutput
-
Constructor Details
-
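ParquetSource
public ParquetSource(String inputUrl, String[] projection, String... fieldNames)
-
ParquetSource
public ParquetSource(String inputUrl, String[] projection, DataSetType<Record> type)
-
ParquetSource
public ParquetSource(ParquetSource that)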
Copies an instance (exclusive of broadcasts).
Parameters:
that - the instance that should be copied
-
-
Method Details
-
create
Creates a new instance.
Parameters:
inputUrl - name of the file to be read
projection - names of the columns to keep; can be omitted, but supplying it allows for an early projection
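The effect of an early projection can be sketched in plain Java, without Wayang or Parquet on the classpath. Everything in this block is a hypothetical stand-in: the class EarlyProjectionSketch, the map-based rows, and the choice to treat a null or empty projection as "keep every column" are assumptions made for illustration, not the behaviour of ParquetSource itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Wayang or Parquet.
final class EarlyProjectionSketch {

    /**
     * Keeps only the requested columns, in the requested order.
     * Assumption: a null or empty projection means "keep every column".
     */
    static List<Object[]> project(List<Map<String, Object>> rows, String[] projection) {
        List<Object[]> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (projection == null || projection.length == 0) {
                // No projection given: pass all column values through.
                out.add(row.values().toArray());
            } else {
                // Projection given: copy only the named columns.
                Object[] values = new Object[projection.length];
                for (int i = 0; i < projection.length; i++) {
                    values[i] = row.get(projection[i]);
                }
                out.add(values);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "ada");
        row.put("score", 9.5);

        List<Object[]> projected = project(List.of(row), new String[] {"name", "id"});
        System.out.println(Arrays.toString(projected.get(0))); // prints [ada, 1]
    }
}
```

Dropping unused columns at the source keeps later operators from carrying values they never read, which is the usual motivation for projecting early on a columnar format.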
-
getInputUrl
-
getProjection
-
getMetadata
public org.apache.parquet.hadoop.metadata.ParquetMetadata getMetadata()
-
getSchema
public org.apache.parquet.schema.MessageType getSchema()
-
createCardinalityEstimator
public Optional<CardinalityEstimator> createCardinalityEstimator(int outputIndex, Configuration configuration)
Description copied from interface: ElementaryOperator
Parameters:
outputIndex - index of the OutputSlot for which the CardinalityEstimator is requested
configuration - if the CardinalityEstimator depends on further ones, use this to obtain the latter
Returns:
an Optional that might provide the requested instance
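Because the return value is an Optional that might be empty, callers need a fallback path. The following is a minimal, self-contained sketch of that calling pattern. The Estimator interface, the createEstimator method, and the rule that only output index 0 yields an estimator are hypothetical stand-ins chosen for this example; they do not reproduce Wayang's CardinalityEstimator API.

```java
import java.util.Optional;

// Hypothetical illustration only; not Wayang's CardinalityEstimator API.
final class OptionalEstimatorSketch {

    /** Stand-in for an estimator: returns an estimated number of records. */
    interface Estimator {
        long estimate();
    }

    /**
     * Provides an estimator only when one can sensibly be built.
     * Assumption: the source has a single output (index 0) and a known,
     * non-negative row count.
     */
    static Optional<Estimator> createEstimator(int outputIndex, long knownRowCount) {
        if (outputIndex != 0 || knownRowCount < 0) {
            return Optional.empty();
        }
        return Optional.of(() -> knownRowCount);
    }

    /** Uses the estimator if present, otherwise falls back to a default. */
    static long estimateOrDefault(int outputIndex, long knownRowCount, long fallback) {
        return createEstimator(outputIndex, knownRowCount)
                .map(Estimator::estimate)
                .orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(estimateOrDefault(0, 1000L, -1L)); // prints 1000
        System.out.println(estimateOrDefault(1, 1000L, -1L)); // prints -1
    }
}
```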
-