Class ParquetSource
java.lang.Object
org.apache.wayang.core.plan.wayangplan.OperatorBase
org.apache.wayang.core.plan.wayangplan.UnarySource<Record>
org.apache.wayang.basic.operators.ParquetSource
- All Implemented Interfaces:
Serializable, ActualOperator, ElementaryOperator, Operator
- Direct Known Subclasses:
JavaParquetSource, SparkParquetSource
This source reads a Parquet file and outputs its rows as Record units.
- See Also:
-
Nested Class Summary
Nested Classes
Modifier and Type: protected class
Description: Custom CardinalityEstimator for FlatMapOperators.
Nested classes/interfaces inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
OperatorBase.GsonSerializer
-
Field Summary
Fields inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
inputSlots, outputSlots, STANDARD_OPERATOR_ARGS
Fields inherited from interface org.apache.wayang.core.plan.wayangplan.Operator
FIRST_EPOCH
-
Constructor Summary
Constructors
ParquetSource(String inputUrl, String[] projection, String... fieldNames)
ParquetSource(String inputUrl, String[] projection, DataSetType<Record> type)
ParquetSource(ParquetSource that): Copies an instance (exclusive of broadcasts).
-
Method Summary
Modifier and Type, Method, Description
static ParquetSource create: Creates a new instance.
Optional<CardinalityEstimator> createCardinalityEstimator(int outputIndex, Configuration configuration)
org.apache.parquet.hadoop.metadata.ParquetMetadata getMetadata()
String[] getProjection()
org.apache.parquet.schema.MessageType getSchema()
Methods inherited from class org.apache.wayang.core.plan.wayangplan.UnarySource
getOutput, getType
Methods inherited from class org.apache.wayang.core.plan.wayangplan.OperatorBase
accept, addBroadcastInput, addTargetPlatform, at, collectMappedInputSlots, collectMappedOutputSlots, copy, createCopy, getAllInputs, getAllOutputs, getCardinalityEstimator, getContainer, getEpoch, getName, getOriginal, getSimpleClassName, getTargetPlatforms, isAuxiliary, isSupportingBroadcastInputs, propagateInputCardinality, propagateOutputCardinality, setAuxiliary, setCardinalityEstimator, setContainer, setEpoch, setName, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.ActualOperator
accept
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.ElementaryOperator
getCardinalityEstimator, isAuxiliary, setAuxiliary, setCardinalityEstimator
Methods inherited from interface org.apache.wayang.core.plan.wayangplan.Operator
addBroadcastInput, addTargetPlatform, broadcastTo, broadcastTo, collectMappedInputSlots, collectMappedOutputSlots, connectTo, connectTo, getAllInputs, getAllOutputs, getCardinalityPusher, getContainer, getEffectiveOccupant, getEffectiveOccupant, getEpoch, getEstimationContextProperties, getForwards, getInnermostLoop, getInput, getInput, getLoopStack, getName, getNumBroadcastInputs, getNumInputs, getNumOutputs, getNumRegularInputs, getOuterInputSlot, getOutermostInputSlot, getOutermostOutputSlots, getOutput, getOutput, getParent, getTargetPlatforms, isAlternative, isConversion, isElementary, isExecutionOperator, isFeedbackInput, isFeedforwardOutput, isLoopHead, isLoopSubplan, isOwnerOf, isReading, isSink, isSource, isSubplan, isSupportingBroadcastInputs, isUnconnected, propagateInputCardinality, propagateOutputCardinality, propagateOutputCardinality, replaceWith, setContainer, setEpoch, setInput, setName, setOutput
-
Constructor Details
-
ParquetSource
-
ParquetSource
-
ParquetSource
Copies an instance (exclusive of broadcasts).
- Parameters:
that - the instance that should be copied
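To make "exclusive of broadcasts" concrete, here is a minimal sketch of a copy constructor that carries over configuration but leaves broadcast inputs behind. `CopySketch` and its fields are hypothetical stand-ins, not the Wayang classes, and the defensive clone of the projection array is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the Wayang ParquetSource.
final class CopySketch {
    final String inputUrl;
    final String[] projection;
    // Stand-in for broadcast inputs registered on an operator.
    final List<String> broadcasts = new ArrayList<>();

    CopySketch(String inputUrl, String[] projection) {
        this.inputUrl = inputUrl;
        this.projection = projection;
    }

    /** Copies the configuration of that, but deliberately not its broadcasts. */
    CopySketch(CopySketch that) {
        this.inputUrl = that.inputUrl;
        // Assumption: the array is cloned so the copy cannot be mutated through the original.
        this.projection = that.projection == null ? null : that.projection.clone();
        // The broadcasts list is intentionally left empty in the copy.
    }

    public static void main(String[] args) {
        CopySketch original = new CopySketch("file:///tmp/data.parquet", new String[] {"id"});
        original.broadcasts.add("lookupTable");
        CopySketch copy = new CopySketch(original);
        System.out.println(copy.inputUrl + " broadcasts=" + copy.broadcasts.size());
    }
}
```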
-
-
Method Details
-
create
Creates a new instance.
- Parameters:
inputUrl - name of the file to be read
projection - names of the columns to keep; can be omitted, but supplying them allows for an early projection
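The effect of the projection parameter can be illustrated with a small, self-contained sketch. It does not use the Wayang or Parquet APIs: `ProjectionSketch` and `project` are hypothetical names, rows are modeled as plain maps, and treating a null or empty array as "projection omitted" is an assumption made for this example only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not part of Wayang: shows what an early projection does to each row.
final class ProjectionSketch {

    /** Keeps only the named columns; a null or empty projection keeps every column. */
    static List<Map<String, Object>> project(List<Map<String, Object>> rows, String[] projection) {
        if (projection == null || projection.length == 0) {
            return rows; // projection omitted: all columns pass through
        }
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String column : projection) {
                if (row.containsKey(column)) {
                    projected.put(column, row.get(column));
                }
            }
            result.add(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "a");
        row.put("payload", "large value");
        // Only "id" and "name" survive; "payload" is dropped before any later processing.
        System.out.println(project(List.of(row), new String[] {"id", "name"}));
    }
}
```

Dropping unneeded columns at the source is what makes the projection "early": later operators never see, and never pay for, the columns that were left out.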
-
getInputUrl
-
getProjection
-
getMetadata
public org.apache.parquet.hadoop.metadata.ParquetMetadata getMetadata()
-
getSchema
public org.apache.parquet.schema.MessageType getSchema()
-
createCardinalityEstimator
public Optional<CardinalityEstimator> createCardinalityEstimator(int outputIndex, Configuration configuration)
Description copied from interface: ElementaryOperator
- Parameters:
outputIndex - index of the OutputSlot for which the CardinalityEstimator is requested
configuration - if the CardinalityEstimator depends on further ones, use this to obtain the latter
- Returns:
an Optional that might provide the requested instance
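Because the return value is an Optional that only might provide an estimator, callers need a fallback path. The following sketch shows one way to handle that using only java.util.Optional. `EstimatorSketch`, `createEstimator`, and the fixed estimate of 1000 are hypothetical stand-ins, not Wayang types or values, and the rule that only output index 0 yields an estimator is an assumption for illustration.

```java
import java.util.Optional;

// Hypothetical stand-in for an estimator; not the Wayang CardinalityEstimator.
interface EstimatorSketch {
    long estimate();
}

final class OptionalEstimatorSketch {

    /** Mimics the documented contract: the requested instance may or may not be provided. */
    static Optional<EstimatorSketch> createEstimator(int outputIndex) {
        // Assumption for this sketch: only output index 0 has an estimator.
        if (outputIndex != 0) {
            return Optional.empty();
        }
        EstimatorSketch fixed = () -> 1_000L; // placeholder value, not a real statistic
        return Optional.of(fixed);
    }

    /** Uses the estimator when present, otherwise returns the caller-supplied fallback. */
    static long estimateOrDefault(int outputIndex, long fallback) {
        return createEstimator(outputIndex)
                .map(EstimatorSketch::estimate)
                .orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(estimateOrDefault(0, -1L));
        System.out.println(estimateOrDefault(1, -1L));
    }
}
```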
-