ark::serialization::ParquetReader

Defined in header “ark/serialization/parquet/parquet_reader.hh”.


This class reads rows from a Parquet file (produced by ParquetWriter, or any file with an equivalent schema) and converts each row into an rbuf message. It uses reflection to set values on an rbuf Object, so it is slower than reading directly into generated structures.

Methods

  • ParquetReader(const std::filesystem::path & input_path, const std::string & message_schema, const std::string & message_type)
    Constructor. Reads the Parquet file at the given path. The message schema/type come from rbuf.

  • ParquetReader(const core::ByteBuffer & input_buffer, const std::string & message_schema, const std::string & message_type)
    Constructor. Wraps the given byte buffer and reads Parquet data from it. The caller retains ownership of the buffer. The message schema/type come from rbuf.

  • ParquetReader(ParquetReader && other)
    Move constructor.

  • ~ParquetReader()
    Destructor.

  • void close()
    Closes the underlying file/reader. No further reads can take place.

  • bool read(RbufType & output)
    Reads the next row and materializes it into a concrete rbuf type. Returns false on EOF.

  • std::optional< std::string > read()
    Reads the next row and returns it as a JSON string.
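
Example

The two read() overloads lend themselves to a simple drain loop. The sketch below is illustrative only: StubReader and Row are hypothetical stand-ins that mimic the shape of the methods listed above and serve rows from memory, so the snippet compiles without the ark headers. It also assumes that the JSON overload signals end of input with std::nullopt, which this page does not state.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type standing in for a generated rbuf structure.
struct Row {
    int id = 0;
};

// Hypothetical stand-in for ParquetReader. It is NOT the real class; it only
// mirrors the documented read()/close() signatures over in-memory rows.
class StubReader {
public:
    explicit StubReader(std::vector<int> ids) : ids_(std::move(ids)) {}

    // Same shape as `bool read(RbufType & output)`: false once rows run out.
    bool read(Row & output) {
        if (closed_ || next_ >= ids_.size()) return false;
        output.id = ids_[next_++];
        return true;
    }

    // Same shape as `std::optional<std::string> read()`.
    // ASSUMPTION: end of input is reported as std::nullopt.
    std::optional<std::string> read() {
        if (closed_ || next_ >= ids_.size()) return std::nullopt;
        return "{\"id\": " + std::to_string(ids_[next_++]) + "}";
    }

    // After close(), both read() overloads report end of input.
    void close() { closed_ = true; }

private:
    std::vector<int> ids_;
    std::size_t next_ = 0;
    bool closed_ = false;
};

// Reads every remaining row as JSON, then closes the reader.
template <typename Reader>
std::vector<std::string> drain_json(Reader & reader) {
    std::vector<std::string> rows;
    while (auto json = reader.read()) {
        rows.push_back(std::move(*json));
    }
    reader.close();
    return rows;
}

// Reads every remaining row into a typed message, counts them, then closes.
template <typename Message, typename Reader>
std::size_t count_rows(Reader & reader) {
    Message message;
    std::size_t count = 0;
    while (reader.read(message)) {
        ++count;
    }
    reader.close();
    return count;
}
```

Because both helpers are templates over the reader type, the same loops should apply to any reader exposing these two overloads; with the real class, Row would be replaced by the generated rbuf type named in the constructor's message_type argument.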