Binary Data

Introduction

The C-Craft library includes functions to help deal with binary data blocks.

Binary Data

Binary data is managed using BIN. A BIN can be created using:

#include <c-craft/bin.h>

size_t len = 64;
unsigned char *data = sm_alloc(mem, len);

BIN bin = bin_create(mem, data, len);

Data can be retrieved from a BIN using:

unsigned char *data = bin_data(bin);
size_t len = bin_len(bin);

Abstracting binary data using BIN provides a consistent method for functions to return binary data blocks.

Data Compression

Overview

A block of binary data can be compressed with bin_compress:

#include <c-craft/bin.h>

unsigned char data[512];        /* Data to compress. */

MEM_SCOPE mem = sm_create(0);

/* The result needs its own name: "data" is already the input buffer. */
BIN bin = bin_compress(mem, data, sizeof(data), NULL);

Compressed data can be expanded again with bin_expand:

#include <c-craft/bin.h>

unsigned char data[512];        /* Compressed data. */

MEM_SCOPE mem = sm_create(0);

/* The result needs its own name: "data" is already the input buffer. */
BIN bin = bin_expand(mem, data, sizeof(data), NULL);

Data compression in C-Craft is implemented as a sequence of encodings. You can access the individual compression encoding stages if you need more detailed control over the compression process.

Run Length Encoding

Run length encoding replaces each run of identical bytes with a byte and count pair. It works well on data that contains long runs of the same byte value. Data with few or no runs may not shrink at all.

#include <c-craft/bin.h>

unsigned char data[512];        /* Data to compress. */

MEM_SCOPE mem = sm_create(0);

BIN bin = rle_encode(mem, data, sizeof(data));

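/* Fetch the encoded bytes under new names; "data" is already declared above. */
unsigned char *encoded = bin_data(bin);
size_t encoded_len = bin_len(bin);

/* Decode the encoded bytes back to the original data. */
bin = rle_decode(mem, encoded, encoded_len);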
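
To make the byte and count idea concrete, here is a minimal standalone sketch. It is not C-Craft's implementation and does not use its encoded format: the names rle_sketch_encode and rle_sketch_decode, the caller-supplied output buffers, and the one-byte count (runs capped at 255) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of C-Craft.
 * Writes (value, count) pairs to dst, with count in 1..255.
 * Returns the number of bytes written, or 0 if dst is too small.
 * The worst case (no runs at all) needs 2 * len bytes. */
size_t rle_sketch_encode(const unsigned char *src, size_t len,
                         unsigned char *dst, size_t cap)
{
    size_t in = 0, out = 0;

    assert(src != NULL && dst != NULL);

    while (in < len) {
        unsigned char value = src[in];
        size_t run = 1;

        /* Extend the run while the next byte matches, up to 255. */
        while (in + run < len && src[in + run] == value && run < 255)
            run++;

        if (out + 2 > cap)
            return 0;

        dst[out++] = value;
        dst[out++] = (unsigned char)run;
        in += run;
    }
    return out;
}

/* Hypothetical helper, not part of C-Craft.
 * Expands (value, count) pairs back into dst.
 * Returns the number of bytes written, or 0 on malformed input
 * or if dst is too small. */
size_t rle_sketch_decode(const unsigned char *src, size_t len,
                         unsigned char *dst, size_t cap)
{
    size_t in, out = 0;

    assert(src != NULL && dst != NULL);

    if (len % 2 != 0)
        return 0;               /* Input must be whole pairs. */

    for (in = 0; in < len; in += 2) {
        size_t run = src[in + 1];

        if (run == 0 || out + run > cap)
            return 0;

        memset(dst + out, src[in], run);
        out += run;
    }
    return out;
}
```

With this pair layout, seven bytes such as 7 7 7 7 9 9 3 encode to six bytes, while four distinct bytes encode to eight, which is why data without runs gains nothing from this stage.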

Back Reference

Back reference encoding (LZ77) maintains a sliding window of previously seen data. It replaces repeated patterns with back references that point to an earlier occurrence within the window. The window size is a configurable parameter. Larger windows take more memory and are slower to process, but can find more matches and so may compress better. For in-memory compression such as this, the window can be as large as the whole data block. The only reason to choose a smaller window is processing speed.

#include <c-craft/bin.h>

unsigned char data[512];        /* Data to compress. */

MEM_SCOPE mem = sm_create(0);

/* Use 0 to set maximum window size. */
BIN bin = lz77_encode(mem, data, sizeof(data), 0);

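/* Fetch the encoded bytes under new names; "data" is already declared above. */
unsigned char *encoded = bin_data(bin);
size_t encoded_len = bin_len(bin);

/* Decode the encoded bytes back to the original data. */
bin = lz77_decode(mem, encoded, encoded_len);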
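
The effect of the window size can be seen in the match search at the heart of any LZ77 encoder. The sketch below is a standalone brute-force illustration, not C-Craft's implementation: the lz_match type and lz_sketch_find function are hypothetical names, and treating a window of 0 as "everything before the current position" simply mirrors the convention used in the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, not part of C-Craft: one back reference. */
typedef struct {
    size_t offset;  /* Distance back from the current position. */
    size_t length;  /* Number of matching bytes; 0 means no match. */
} lz_match;

/* Hypothetical helper, not part of C-Craft.
 * Finds the longest match for data[pos..] that starts within the
 * `window` bytes before pos. A window of 0 means "search everything
 * before pos". Brute force: every candidate start is tried. */
lz_match lz_sketch_find(const unsigned char *data, size_t len,
                        size_t pos, size_t window)
{
    lz_match best = { 0, 0 };
    size_t start, cand;

    assert(data != NULL && pos <= len);

    /* Oldest position the window still covers. */
    start = (window == 0 || window > pos) ? 0 : pos - window;

    for (cand = start; cand < pos; cand++) {
        size_t n = 0;

        /* The match may run past pos and overlap the bytes being
         * encoded; this is how LZ77 represents repeating patterns. */
        while (pos + n < len && data[cand + n] == data[pos + n])
            n++;

        if (n > best.length) {
            best.offset = pos - cand;
            best.length = n;
        }
    }
    return best;
}
```

For the input "abcabcabcx" at position 3, an unlimited window finds a six-byte match three bytes back, whereas a window of 2 finds nothing because the earlier "abc" has already slid out of view. The inner loop also shows the cost side: each extra byte of window is one more candidate to compare at every position.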

Huffman Encoding

Huffman encoding is normally the final compression stage. It replaces each byte value with a variable-length bit pattern, giving the shortest patterns to the most frequent values. This requires building a frequency table of byte values in order to assign a bit pattern to each one. Consequently, it requires more processing and memory than the other algorithms.

#include <c-craft/bin.h>

unsigned char data[512];        /* Data to compress. */

MEM_SCOPE mem = sm_create(0);

BIN bin = huffman_encode(mem, data, sizeof(data), 0);

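/* Fetch the encoded bytes under new names; "data" is already declared above. */
unsigned char *encoded = bin_data(bin);
size_t encoded_len = bin_len(bin);

/* Decode the encoded bytes back to the original data. */
bin = huffman_decode(mem, encoded, encoded_len);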
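
The table-building step can be sketched as follows. This is a standalone illustration of the classic Huffman construction, not C-Craft's implementation: huff_sketch_lengths is a hypothetical name, and the function only computes how many bits each byte value would receive rather than producing an encoded stream.

```c
#include <assert.h>
#include <stddef.h>

#define HUFF_SYMBOLS 256
#define HUFF_NODES   (2 * HUFF_SYMBOLS - 1)

/* Hypothetical helper, not part of C-Craft.
 * Computes the Huffman code length, in bits, for every byte value
 * that occurs in data. lengths[v] is 0 for values that never occur. */
void huff_sketch_lengths(const unsigned char *data, size_t len,
                         unsigned char lengths[HUFF_SYMBOLS])
{
    size_t weight[HUFF_NODES];
    int parent[HUFF_NODES];
    int alive[HUFF_NODES];
    int nodes = HUFF_SYMBOLS;   /* Leaves occupy 0..255. */
    int live = 0;
    int i;
    size_t k;

    assert(data != NULL && lengths != NULL);

    for (i = 0; i < HUFF_NODES; i++) {
        weight[i] = 0;
        parent[i] = -1;
        alive[i] = 0;
    }

    /* Frequency table: one counter per possible byte value. */
    for (k = 0; k < len; k++)
        weight[data[k]]++;

    for (i = 0; i < HUFF_SYMBOLS; i++) {
        if (weight[i] > 0) {
            alive[i] = 1;
            live++;
        }
    }

    /* Repeatedly merge the two lightest live nodes into a parent. */
    while (live > 1) {
        int a = -1, b = -1;     /* a: lightest, b: second lightest. */

        for (i = 0; i < nodes; i++) {
            if (!alive[i])
                continue;
            if (a < 0 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b < 0 || weight[i] < weight[b]) {
                b = i;
            }
        }

        weight[nodes] = weight[a] + weight[b];
        parent[a] = parent[b] = nodes;
        alive[a] = alive[b] = 0;
        alive[nodes] = 1;
        nodes++;
        live--;
    }

    /* A symbol's code length is its depth below the root. */
    for (i = 0; i < HUFF_SYMBOLS; i++) {
        int depth = 0;
        int n = i;

        while (parent[n] >= 0) {
            n = parent[n];
            depth++;
        }
        /* A lone symbol sits at the root but still needs one bit. */
        if (weight[i] > 0 && depth == 0)
            depth = 1;
        lengths[i] = (unsigned char)depth;
    }
}
```

For the input "aaaabbc" the most frequent byte, 'a', gets a 1-bit code while 'b' and 'c' get 2 bits each, so the seven bytes need 10 bits instead of 56. The three 511-entry work arrays give a feel for the extra memory this stage needs compared with the others.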