Image Concepts

All images in Ark are represented by the ark::image::Image class. It contains a few fields of metadata and the core payload:

  • capture_time_ns - The time the image was captured on the sensor (or best estimate)
  • width - The width of the image, in pixels.
  • height - The height of the image, in pixels.
  • data_format - The data format of the payload.
  • sequence_number - Typically increments as images are read from their source
  • data - The actual payload (raw bytes).

Ark supports numerous data formats out of the box, including:

  • RGB and BGR (one byte per channel, interleaved, like RGBRGB …)
  • RGBA and BGRA (one byte per channel, interleaved, like RGBARGBA …)
  • YUYV 4:2:2 (interleaved, like YUYVYUYV …)
  • YUV 4:2:0 (planar, like YYYY, YYYY, UU, VV, YYYY, …)
  • NV12 (planar/interleaved, like YYYY, YYYY, UV, UV, …)
  • Bayer (8bit, 10bit, and 16bit, like RGGB, GRGB, …)
  • Greyscale 8-bit (single channel, 8-bits)
  • Depth 16-bit (single channel, 16-bits)
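
As a rough illustration of how these layouts differ, the sketch below computes the expected payload size for the uncompressed formats. It is not part of Ark: the SketchFormat enum and expected_payload_bytes function are hypothetical names, and the arithmetic assumes tightly packed rows with no stride padding. Bayer is left out because its size depends on how each bit depth is packed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the formats listed above. The real enumeration
// lives in image_data_format.rbuf and may use different names.
enum class SketchFormat { Rgb, Bgr, Rgba, Bgra, Yuyv, Yuv420, Nv12, Grey8, Depth16 };

// Expected payload size in bytes, assuming tightly packed rows (no padding).
inline std::size_t expected_payload_bytes(SketchFormat format, std::size_t width, std::size_t height)
{
    const std::size_t pixels = width * height;

    switch (format)
    {
        case SketchFormat::Rgb:
        case SketchFormat::Bgr:
            return pixels * 3;  // three interleaved one-byte channels
        case SketchFormat::Rgba:
        case SketchFormat::Bgra:
            return pixels * 4;  // four interleaved one-byte channels
        case SketchFormat::Yuyv:
            return pixels * 2;  // 4:2:2, two pixels share one U and one V
        case SketchFormat::Yuv420:
        case SketchFormat::Nv12:
            // Full-resolution Y plane plus quarter-resolution U and V samples.
            assert(width % 2 == 0 && height % 2 == 0);
            return pixels + pixels / 2;
        case SketchFormat::Grey8:
            return pixels;      // one byte per pixel
        case SketchFormat::Depth16:
            return pixels * 2;  // two bytes per pixel
    }

    return 0;
}
```

A check like this can be handy when validating that the data field of an image matches its width, height, and data_format before indexing into it.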

Additionally, it supports the following compression formats:

  • H264 (in some situations, depending on how you’ve built the software)

All supported formats are enumerated in the image_data_format.rbuf file.

Data Format Conversions

An API exists to get your image into whatever format you desire. For example, if you want your image in RGB, you can run:

#include "ark/image/convert.hh"

auto converted = ark::image::format_convert(original_image, ark::image::ImageDataFormat::Rgb);

If the image is already in RGB, this call has no cost. If the image is compressed, it is decompressed automatically. Otherwise, it is converted to RGB in the most efficient way possible.
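
To make the conversion case concrete, here is what the simplest conversion (BGR to RGB) amounts to at the byte level. This is a standalone sketch, not Ark's implementation; bgr_to_rgb is a hypothetical helper that operates on a raw interleaved payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Converts an interleaved BGR payload to RGB by swapping the first and
// third byte of every pixel. The input is left untouched.
inline std::vector<std::uint8_t> bgr_to_rgb(const std::vector<std::uint8_t>& bgr)
{
    assert(bgr.size() % 3 == 0);  // whole pixels only

    std::vector<std::uint8_t> rgb(bgr);

    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3)
    {
        std::swap(rgb[i], rgb[i + 2]);
    }

    return rgb;
}
```

Conversions between other formats (for example YUV to RGB) involve per-pixel arithmetic and chroma upsampling, which is why letting format_convert pick the path is usually preferable to hand-rolled loops.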

When reading images, it is almost always advisable to convert the source image into your assumed format before operating on it.

Scaling Images

You can scale images to different dimensions if your algorithms need it (or if you just want to make thumbnails of things).

While scaling an image, you can also convert its data format – this can be more efficient than converting first and scaling later (or even scaling first and converting later).

The API is also relatively simple:

#include "ark/image/scale.hh"

auto scaled = ark::image::scale_image(original_image, new_width, new_height);

If you want to change data format at the same time:

#include "ark/image/scale.hh"

auto scaled = ark::image::scale_image(original_image, new_width, new_height, ark::image::ImageDataFormat::Yuv420);

That scales the original image to the new width and height and converts it to planar YUV420.

Note that if you scale an image and provide the same width/height as the existing image, the call is equivalent to format_convert. If the width, height, and format all already match your desired values, it simply returns the same image you passed in.
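
For intuition about what scaling does to the payload, the following sketch resizes a single-channel 8-bit image with nearest-neighbour sampling and short-circuits when the dimensions already match. It is an illustration only: GreyPlane and scale_nearest are hypothetical names, and the source does not say which filter scale_image actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel image, tightly packed, one byte per pixel.
struct GreyPlane
{
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;
};

// Nearest-neighbour resize: every destination pixel copies the source pixel
// whose coordinates map onto it.
inline GreyPlane scale_nearest(const GreyPlane& src, std::size_t new_width, std::size_t new_height)
{
    assert(src.width > 0 && src.height > 0);
    assert(src.data.size() == src.width * src.height);
    assert(new_width > 0 && new_height > 0);

    // Same dimensions: nothing to do, so hand back the input unchanged.
    if (new_width == src.width && new_height == src.height)
    {
        return src;
    }

    GreyPlane dst{new_width, new_height, std::vector<std::uint8_t>(new_width * new_height)};

    for (std::size_t y = 0; y < new_height; ++y)
    {
        const std::size_t src_y = y * src.height / new_height;

        for (std::size_t x = 0; x < new_width; ++x)
        {
            const std::size_t src_x = x * src.width / new_width;
            dst.data[y * new_width + x] = src.data[src_y * src.width + src_x];
        }
    }

    return dst;
}
```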


Transforming Images

You can apply basic image transformations with the ark::image::transform_image API. This allows you to do simple things like rotate or transpose images, using CPU-efficient APIs.

The supported operations are:

  • Rotate (0, 90, 180, or 270 degrees) - Rotates the image (counter-clockwise)
  • Transpose (90 for horizontal mirror, 270 for a vertical mirror, 180 for both)


#include "ark/image/transform.hh"

auto flipped = ark::image::transform_image(original, ark::image::TransformType::TransposeRotate270);
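
To show what a counter-clockwise rotation means for the pixel layout, here is a standalone sketch that rotates a single-channel image by 90 degrees. GreyPlane and rotate90_ccw are hypothetical names used only for this illustration; the real transform_image handles the supported data formats for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel image, tightly packed, one byte per pixel.
struct GreyPlane
{
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;
};

// Rotates the image 90 degrees counter-clockwise. The output swaps width and
// height, and the top-right source pixel becomes the top-left output pixel.
inline GreyPlane rotate90_ccw(const GreyPlane& src)
{
    assert(src.data.size() == src.width * src.height);

    GreyPlane dst{src.height, src.width, std::vector<std::uint8_t>(src.data.size())};

    for (std::size_t y = 0; y < src.height; ++y)
    {
        for (std::size_t x = 0; x < src.width; ++x)
        {
            const std::size_t dst_x = y;
            const std::size_t dst_y = src.width - 1 - x;
            dst.data[dst_y * dst.width + dst_x] = src.data[y * src.width + x];
        }
    }

    return dst;
}
```

For a 3x2 input with rows (1 2 3) and (4 5 6), the result is a 2x3 image with rows (3 6), (2 5), and (1 4).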


Compressing and Decompressing Images

There are two ways to compress and decompress images. If your data is self-contained within a single frame, consider using the single-shot APIs:

#include "ark/image/convert.hh"

auto decompressed = ark::image::decompress(compressed);

The is_image_compressed API tells you whether an image is compressed. In most cases, though, you may just want to call format_convert before working on your image to ensure it’s in the format you expect.

For more complex compression/decompression schemes, including formats that might not be pure keyframes, you can use the AbstractVideoCompressor and AbstractVideoDecompressor.

These classes will maintain state, and can emit a stream of images that form a video.

The following example uses H264.

#include "ark/image/abstract_video_compressor.hh"

using namespace ark::image;

// Configure the compressor. When using the abstract compressor, all
// configuration is done via key/value pairs. You can use structured
// types if you use the non-abstract compressors.

CompressorCodecConfig config;

config["codec_name"] = "libx264";
config["bit_rate"] = "2000000";
config["gop_size"] = "1";

// Creates the video compressor -- this is using libavcodec, so it assumes
// you are linked against libavcodec with x264 support.

auto compressor = create_video_compressor("avcodec", config);

while (/* read a source image from some place */)
{
    // Write the source image to the compressor here.

    // For many codecs, there is not a 1:1 relationship between frames
    // written and frames read. Sometimes you may get one frame back,
    // sometimes zero, and sometimes more than one.
    // To handle this, loop until read() returns false.

    Image compressed_image;

    while (compressor->read(compressed_image))
    {
        /* handle your compressed image here */
    }
}
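
The reason for the inner drain loop is easier to see with a toy model. The sketch below is not an Ark class: ToyBatchEncoder and outputs_per_write are hypothetical, and plain integers stand in for images. The encoder holds frames back until a batch is full, so a single write can yield zero, one, or several outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy encoder: it buffers frames and only emits them once
// `batch` frames have been written, mimicking a codec that delays output.
class ToyBatchEncoder
{
public:
    explicit ToyBatchEncoder(std::size_t batch) : batch_(batch) { assert(batch > 0); }

    void write(int frame)
    {
        pending_.push_back(frame);

        if (pending_.size() >= batch_)
        {
            ready_.insert(ready_.end(), pending_.begin(), pending_.end());
            pending_.clear();
        }
    }

    // Returns false when no output is currently available.
    bool read(int& out)
    {
        if (ready_.empty())
        {
            return false;
        }

        out = ready_.front();
        ready_.pop_front();
        return true;
    }

private:
    std::size_t batch_;
    std::vector<int> pending_;
    std::deque<int> ready_;
};

// Writes each frame, drains the encoder, and records how many outputs
// each individual write produced.
inline std::vector<std::size_t> outputs_per_write(const std::vector<int>& frames, std::size_t batch)
{
    ToyBatchEncoder encoder(batch);
    std::vector<std::size_t> counts;

    for (int frame : frames)
    {
        encoder.write(frame);

        std::size_t produced = 0;
        int out = 0;

        while (encoder.read(out))
        {
            ++produced;
        }

        counts.push_back(produced);
    }

    return counts;
}
```

With a batch size of 3 and seven input frames, the per-write output counts are 0, 0, 3, 0, 0, 3, 0. Code that assumes exactly one output per input would drop frames or stall under the same conditions.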

There are a few different compressors available out of the box, including:

  • jpeg - This provides basic JPEG image compression/decompression
  • avcodec - This wraps libavcodec (when available) to support many codec types.
  • v4l - This uses Video4Linux to do hardware-accelerated compression

Video4Linux is commonly used to do hardware accelerated compression on many platforms, including the Nvidia Jetson series. Our compressor will make use of hardware acceleration for compression if possible, which can dramatically reduce the CPU cost of compressing images/videos.

A sample configuration for Nvidia might look like:

config["device_path"] = "/dev/nvhost-msenc";
config["plugin_path"] = "/usr/lib/aarch64-linux-gnu/libv4l/plugins/nv/";
config["mpeg_video_bitrate"] = "2000000";
config["mpeg_video_h264_profile"] = "1";
config["mpeg_video_gop_size"] = "10";

Take note of the plugin path – this is a V4L2 compatible plugin. Many systems require this to make use of hardware acceleration. You will likely get ioctl errors if you forget to configure this properly.

This is all typically provided by the VideoCompressor and VideoDecompressor stages.

Converting to OpenCV

It’s possible to convert an ark::image::Image type into an OpenCV cv::Mat without much overhead. This is supported for BGR, BGRA, Greyscale, and DepthZ16 image types.

auto opencv_image = to_opencv(image);

At this point opencv_image contains the same contents as image, just wrapped in a cv::Mat. You must maintain a reference to the original image: the cv::Mat only points at the data contained in that image and does not copy it.
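
One way to avoid a dangling view is to keep the owner and the view together. The sketch below shows that pattern without OpenCV or Ark: PixelView, OwnedView, and make_owned_view are hypothetical names, with PixelView playing the role of a cv::Mat that wraps memory it does not own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Non-owning view over pixel bytes, analogous to a matrix header that
// points at memory owned by something else.
struct PixelView
{
    const std::uint8_t* data;
    std::size_t size;
};

// Bundles the view with a shared reference to the payload, so the bytes
// stay alive for as long as the view is in use.
struct OwnedView
{
    std::shared_ptr<const std::vector<std::uint8_t>> owner;
    PixelView view;
};

inline OwnedView make_owned_view(std::shared_ptr<const std::vector<std::uint8_t>> payload)
{
    assert(payload != nullptr);

    // Build the view before moving the pointer into the result.
    const PixelView view{payload->data(), payload->size()};
    return OwnedView{std::move(payload), view};
}
```

The same idea applies to the real types: store the ark::image::Image next to the cv::Mat (for example in the same struct), and let them go out of scope together.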

Capturing Video

The ark::image::VideoDevice class exists to capture images using the Video4Linux APIs. It is a thin wrapper over the V4L APIs that abstracts them and makes them easier to use and manage.

An example of using the device to capture images:

#include "ark/image/video_device.hh"

using namespace ark::image;

// Configure the device first -- note that a single device may have multiple
// streams (such as a capture stream and an output stream). For many cameras,
// a single stream is sufficient.

VideoStreamConfiguration stream_config;

stream_config.width = 1280;
stream_config.height = 720;
stream_config.format = ImageDataFormat::Jpeg;
stream_config.frame_rate = 30;

VideoDeviceConfiguration device_config;

device_config.device_path = "/dev/video0";
device_config.stream_configs[VideoStreamType::VideoCapture] = stream_config;

// Instantiate the device and enable streaming.
VideoDevice device(device_config);


// Now, just loop and read buffers. This will block until a buffer is available.
// Note that you can specify the non_blocking flag in configuration, and this will
// simply return a nullptr if the device isn't ready.
auto buffer = device.read_buffer(VideoStreamType::VideoCapture);

// The buffer contains pointers to the raw data, including width/height information.
// Each buffer contains 1 or more planes (typically 1 plane for most image formats,
// but YUV420 will contain multiple planes).

// When you are done with the buffer, release it back to the device.

Note that a read_scoped_buffer API also exists. It returns a buffer that is automatically released back to the device when you are done with it, which can help avoid image leaks.
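
The scoped variant follows the usual RAII pattern. The sketch below illustrates that pattern in isolation and makes no claim about how Ark implements it: ToyBufferPool stands in for the device, and ScopedToyBuffer and demo_scoped_release are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the device: it only counts checked-out buffers.
class ToyBufferPool
{
public:
    void acquire() { ++outstanding_; }

    void release()
    {
        assert(outstanding_ > 0);
        --outstanding_;
    }

    std::size_t outstanding() const { return outstanding_; }

private:
    std::size_t outstanding_ = 0;
};

// Checks a buffer out on construction and returns it on destruction, so the
// release cannot be forgotten on early returns or exceptions.
class ScopedToyBuffer
{
public:
    explicit ScopedToyBuffer(ToyBufferPool& pool) : pool_(&pool) { pool_->acquire(); }

    ~ScopedToyBuffer()
    {
        if (pool_ != nullptr)
        {
            pool_->release();
        }
    }

    ScopedToyBuffer(const ScopedToyBuffer&) = delete;
    ScopedToyBuffer& operator=(const ScopedToyBuffer&) = delete;
    ScopedToyBuffer& operator=(ScopedToyBuffer&&) = delete;

    // Moving transfers the responsibility to release.
    ScopedToyBuffer(ScopedToyBuffer&& other) noexcept : pool_(other.pool_) { other.pool_ = nullptr; }

private:
    ToyBufferPool* pool_;
};

// Returns the outstanding buffer count seen inside the scope and after it.
inline std::pair<std::size_t, std::size_t> demo_scoped_release()
{
    ToyBufferPool pool;
    std::size_t inside = 0;

    {
        ScopedToyBuffer buffer(pool);
        inside = pool.outstanding();
    }

    return {inside, pool.outstanding()};
}
```

Capture devices usually expose only a small, fixed number of buffers, so a buffer that is never released can eventually stall the stream; tying the release to scope exit removes that failure mode.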

If you are outputting video, use the VideoStreamType::VideoOutput type. In that case, you will want to use the write_buffer APIs:

// Get an image suitable for writing.
auto buffer = device.write_buffer(VideoStreamType::VideoOutput);

// Copy your image to the buffer -- this is the same structure/format as used
// by the capturing system.

// Release the image -- this will output the image to the device, potentially
// displaying it (or compressing it).

Finally, many devices provide customized controls. These can be queried through the available_controls() API, and you can retrieve their current state through the get_controls_state() API. You can adjust them by using set_controls.

All of this is provided through the VideoCaptureStage – that is the more traditional way to use these APIs in your pipelines.