version 3.0

Bramfeld Team 2015-08-31 14:01:44 +02:00
commit d837490606
209 changed files with 19662 additions and 0 deletions

79
xcodec/FUTURE Normal file

@ -0,0 +1,79 @@
o) We should really be looking up hashes both in our cache and in our cache
of hashes from the peer. If it's only in the peer's cache, we can add it
to ours and send it as if it were a new piece of data. This could either
be a big speedup or a big slowdown for the encoding process. It depends
on how often data we see from the peer turns out to be data we need to send
back to the peer.
XCodec 0.9.0 goals:
o) Stop using hashes like names and use actual names. This abstraction will
allow us to minimize the cost of collisions, speed lookup, etc. It also
means that different systems will be able to use different encode/hash
algorithms for lookup based on their requirements.
o) Expand the protocol to have different ops for referencing hashes in our
namespace vs. those of our peer...
o) Exchange not just our UUIDs but a list of all of the UUIDs of other systems
we're talking to, allowing us to also reference hashes in other namespaces
that we share access to.
o) Also exchange other parameters, like size of the backref window, using the
minimum between the two peers.
Past ideas:
o) Create a new XCodecTag that incorporates a hash and a counter and
perhaps other things...
o) The counter will increment for each collision (or perhaps just be a
random number after the first collision and bail out if there's a
collision on the random number) and add new variants of extract, etc.,
that give a counter to append to the hash to get the Tag.
o) Allow either powers of two above 128 or multiples of 128 to be usable
chunk sizes instead of just 128. Include any time we define a tag a
bitmap of 128-byte blocks (or blocks of each size down to 128) within
the chunk that are to be learned, too, so that we still deal well with
changes. Eventually allow defining new e.g. 4K blocks based on old 4K
blocks with a single 128-byte block difference?
o) Add a pass number to the Tag so we can do recursive encoding. Use a
limited number of bits and put this above the opcode so that we can have
separate back-reference windows, etc., for each pass and so that we can
avoid escaping for subsequent passes, perhaps?
o) Deflate after recursive encoding.
o) A new encoder that can exploit all of those features, possibly keeping
the old encoder around for applications that need low latency and high
throughput.
To-do:
o) Add a 'count' field to the hash and allow incrementing it to do collision
overflow. For this it'd be nice to have an interface that would return a
range of matches in the dictionary. Put the count at the end to make this
possible. Would need to change the encoding logic to use a different OP
for these that took, say, a count or even just the full hash/identifier.
o) Only have N bytes outstanding at any given time (say 128k?) and add some
type of ACK, perhaps? This is necessary to:
o) Write a garbage-collector for the dictionary. LRU?
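The 'count' item in the To-do list above can be sketched roughly as follows. This is an illustration only, under assumptions: `TagKey`, `Dictionary`, `enter` and `matches` are hypothetical names that do not exist in XCodec, and an ordered `std::map` stands in for whatever dictionary the real code would use. The point is that putting the count last keeps all variants of one hash adjacent, so a range of matches falls out of an ordered lookup.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key: the 64-bit hash followed by a collision counter.
// With the counter last, all variants of one hash sort next to each other.
typedef std::pair<uint64_t, uint32_t> TagKey;
typedef std::map<TagKey, std::string> Dictionary;

// Enter data under the first free counter for this hash and return that
// counter. If the same data is already stored, return its existing counter.
uint32_t enter(Dictionary& dict, uint64_t hash, const std::string& data)
{
    uint32_t count = 0;
    Dictionary::const_iterator it;
    while ((it = dict.find(TagKey(hash, count))) != dict.end()) {
        if (it->second == data)
            return count;
        assert(count < UINT32_MAX);  // counter space exhausted otherwise
        count++;
    }
    dict[TagKey(hash, count)] = data;
    return count;
}

// Return every entry stored for the hash, in counter order.
std::vector<std::string> matches(const Dictionary& dict, uint64_t hash)
{
    std::vector<std::string> out;
    Dictionary::const_iterator it = dict.lower_bound(TagKey(hash, 0));
    for (; it != dict.end() && it->first.first == hash; ++it)
        out.push_back(it->second);
    return out;
}
```

An encoder built on this would presumably need the separate OP mentioned above to carry the counter alongside the hash.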
Possibly-bad future ideas:
o) Incorporate run-length encoding.
o) Incorporate occasional (figure out frequency) CRCs or such of the next N
bytes of decoded data to make it possible to detect any hash mismatches,
using a different hash function to any that go into the hash.
o) If the encoded version of a stream is larger than the escaped source would
be, it'd be nice to just transmit it escaped and to have some way to
tell the remote side how to pick out chunks to be taken as known to both
parties in the future. One approach would be to send a list of offsets
at which hashes were declared.
%%%
Hash-set deduplication:
For a given number of hashes (say 64), put an unordered list (hash?) of each
group of 64 hashes that is encountered into a database.
When data is encoded, check whether its 64 hashes have appeared previously. If
they have, then use a compact encoding to list the order in which they appear
and the offsets within the list at which escaped or new data is to be inserted.
Eventually extend with one of the Computational Biology algorithms for finding
sequences missing an element or with one element changed so that we can do work
with deltas and offset sequences/sets more reliably.
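A rough sketch of the hash-set idea above, under assumptions: `HashList`, `HashSetDatabase`, `canonical` and `order_of` are hypothetical names, the "database" is just an in-memory `std::set`, and a sorted copy of the window serves as the unordered form. The insertion of escaped or new data is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

typedef std::vector<uint64_t> HashList;

// Order-independent form of a window of hashes: a sorted copy.
HashList canonical(const HashList& window)
{
    HashList sorted(window);
    std::sort(sorted.begin(), sorted.end());
    return sorted;
}

// Hypothetical stand-in for the database of hash sets seen so far.
class HashSetDatabase {
    std::set<HashList> seen_;
public:
    // Record the window; returns true if the same set of hashes
    // (in any order) had already been recorded.
    bool seen_before(const HashList& window)
    {
        return !seen_.insert(canonical(window)).second;
    }
};

// Compact ordering: for each position in the window, the index of that
// hash within the canonical (sorted) list.
std::vector<uint16_t> order_of(const HashList& window)
{
    assert(window.size() <= 65536);  // indices must fit into 16 bits
    HashList sorted = canonical(window);
    std::vector<uint16_t> order;
    for (size_t i = 0; i < window.size(); i++) {
        HashList::const_iterator it =
            std::lower_bound(sorted.begin(), sorted.end(), window[i]);
        order.push_back(uint16_t(it - sorted.begin()));
    }
    return order;
}
```

With a window of 64 hashes each index would fit into 6 bits, which is where the compact encoding would come from.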

5
xcodec/Makefile Normal file

@ -0,0 +1,5 @@
SUBDIR+=example
SUBDIR+=test
SUBDIR+=cache/coss
include ../common/subdir.mk

19
xcodec/TODO Normal file

@ -0,0 +1,19 @@
o) Go back to a lookahead decoder so we don't have to do one ASK/LEARN at a time;
that will be really painful on long, slow links.
o) Add a PAUSE/RESUME mechanism so that we don't have, say, more than 1MB of data
queued up during an ASK/LEARN session? PAUSE when we send an ASK with more
than 1MB of data or get more than 1MB of data with an ASK outstanding, and then
send a RESUME once we get <1MB of data outstanding?
o) Use a 16-bit window counter rather than an 8-bit one so we have an 8MB window
rather than a 32KB one.
XXX Preliminary tests show this to be a big throughput hit. Need to check
whether the gains are worth it.
o) Don't let a peer claim to have our UUID?
o) Permanent storage.
o) Decide whether to keep a std::set (or something fancier) of hashes associated
with each UUID (i.e. ones we have sent to them). We could even make it a
set of <UUID,UUID,hash> so that we can distribute updates like routing
tables.
o) Do lookups in the peer's dictionary and ours at the same time.
o) Put a generation number in the hashes so that if the remote side recycles a
hash, we can do something about it.
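The window sizes quoted in the 16-bit counter item above follow from simple arithmetic, sketched here. Assumption: each counter value refers to one 128-byte segment, which is consistent with the 32KB and 8MB figures; `kSegmentLength` and `window_bytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed segment length in bytes (one window slot per segment).
static const uint64_t kSegmentLength = 128;

// Bytes covered by a back-reference window whose counter has `bits` bits.
uint64_t window_bytes(unsigned bits)
{
    assert(bits < 57);  // keep the product inside 64 bits
    return (uint64_t(1) << bits) * kSegmentLength;
}
```

Here `window_bytes(8)` is 256 * 128 = 32768 bytes (32KB) and `window_bytes(16)` is 65536 * 128 = 8388608 bytes (8MB).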

4
xcodec/cache/coss/Makefile vendored Normal file

@ -0,0 +1,4 @@
SUBDIR+=test
include ../../../common/subdir.mk

5
xcodec/cache/coss/lib.mk vendored Normal file

@ -0,0 +1,5 @@
VPATH+= ${TOPDIR}/xcodec/cache/coss
SRCS+= xcodec_cache_coss.cc

3
xcodec/cache/coss/test/Makefile vendored Normal file

@ -0,0 +1,3 @@
SUBDIR+=xcodec-coss1
include ../../../../common/subdir.mk


@ -0,0 +1,7 @@
TEST=xcodec-coss1
TOPDIR=../../../../..
USE_LIBS=common common/uuid xcodec xcodec/cache/coss
include ${TOPDIR}/common/program.mk
LDADD+=-lboost_filesystem -lboost_system


@ -0,0 +1,96 @@
#include <common/buffer.h>
#include <common/test.h>
#include <common/uuid/uuid.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
#include <xcodec/xcodec_decoder.h>
#include <xcodec/xcodec_encoder.h>
#include <xcodec/xcodec_hash.h>
#include <xcodec/cache/coss/xcodec_cache_coss.h>
#include <boost/filesystem.hpp>
#include <fstream>
#include <stdlib.h>
using namespace boost::filesystem;
int
main(void)
{
char tmp_template[] = "/tmp/cache-coss-XXXXXX";
path cache_path = mkdtemp(tmp_template);
create_directory(cache_path);
typedef pair<uint64_t, const uint8_t*> segment_list_element_t;
typedef deque<segment_list_element_t> segment_list_t;
segment_list_t segment_list;
{
TestGroup g("/test/xcodec/encode-decode-coss/2/char_kat",
"XCodecEncoder::encode / XCodecDecoder::decode #2");
UUID uuid;
std::string cache_path_str = cache_path.string();
unsigned i, j;
uuid.generate();
for (j = 0; j < 4; j++) {
XCodecCache *cache = new XCodecCacheCOSS(uuid, cache_path_str,
10, 10, 10);
for (i = 0; i < 10000; i++) {
uint8_t random[XCODEC_SEGMENT_LENGTH];
ifstream rand_fd("/dev/urandom");
rand_fd.read((char *) random, sizeof(random));
ASSERT("xcodec-coss1", rand_fd.good());
uint64_t hash = XCodecHash::hash(random);
const uint8_t* data = cache->lookup(hash);
if (data)
continue;
segment_list.push_front(make_pair(hash, data));
Buffer buf (data, XCODEC_SEGMENT_LENGTH);
cache->enter(hash, buf, 0);
}
delete cache;
cache = new XCodecCacheCOSS(uuid, cache_path_str,
10, 10, 10);
segment_list_element_t el;
const uint8_t *seg1, *seg2;
uint64_t hash;
while (!segment_list.empty()){
el = segment_list.back();
segment_list.pop_back();
seg1 = el.second;
seg2 = cache->lookup(el.first);
hash = el.first;
if (!seg2)
cout << "Segment not found: " << hash << endl;
if (seg2) {
if (memcmp (seg1, seg2, XCODEC_SEGMENT_LENGTH))
cout << "Segments are not equal: " << hash <<
endl;
Test _(g, "Segments are equal.",
memcmp (seg1, seg2, XCODEC_SEGMENT_LENGTH) == 0);
}
}
delete cache;
}
}
remove_all(cache_path);
return (0);
}

371
xcodec/cache/coss/xcodec_cache_coss.cc vendored Normal file

@ -0,0 +1,371 @@
/*
*
* XCodec COSS Cache
*
* COSS = Cyclic Object Storage System
*
* Idea taken from Squid COSS cache.
*
* Diego Woitasen <diegows@xtech.com.ar>
* XTECH
*
*/
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <xcodec/cache/coss/xcodec_cache_coss.h>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_cache_coss.cc //
// Description: persistent cache on disk for xcodec protocol streams //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
XCodecCacheCOSS::XCodecCacheCOSS (const UUID& uuid, const std::string& cache_dir, size_t cache_size)
: XCodecCache(uuid, cache_size),
log_("xcodec/cache/coss")
{
uint8_t str[UUID_STRING_SIZE + 1];
uuid.to_string (str);
file_path_ = cache_dir;
if (file_path_.size() > 0 && file_path_[file_path_.size() - 1] != '/')
file_path_.append ("/");
file_path_.append ((const char*) str, UUID_STRING_SIZE);
file_path_.append (".wpc");
struct stat st;
if (::stat (file_path_.c_str(), &st) == 0 && S_ISREG (st.st_mode))
file_size_ = st.st_size;
else
{
ofstream tmp (file_path_.c_str());
file_size_ = 0;
}
serial_number_ = 0;
stripe_range_ = 0;
if (! cache_size)
cache_size = CACHE_BASIC_SIZE;
uint64_t size = ROUND_UP((uint64_t) cache_size * 1048576, sizeof (COSSStripe));
stripe_limit_ = size / sizeof (COSSStripe);
freshness_level_ = 0;
active_ = 0;
directory_ = new COSSMetadata[stripe_limit_];
memset (directory_, 0, sizeof (COSSMetadata) * stripe_limit_);
stream_.open (file_path_.c_str(), fstream::in | fstream::out | fstream::binary);
if (! read_file ())
{
stream_.close ();
stream_.open (file_path_.c_str(), fstream::in | fstream::out | fstream::trunc | fstream::binary);
file_size_ = 0;
initialize_stripe (stripe_range_, active_);
}
DEBUG(log_) << "Cache file: " << file_path_;
DEBUG(log_) << "Max size: " << size;
DEBUG(log_) << "Stripe size: " << sizeof (COSSStripe);
DEBUG(log_) << "Stripe header size: " << sizeof (COSSStripeHeader);
DEBUG(log_) << "Serial: " << serial_number_;
DEBUG(log_) << "Stripe number: " << stripe_range_;
}
XCodecCacheCOSS::~XCodecCacheCOSS()
{
for (int i = 0; i < LOADED_STRIPE_COUNT; ++i)
if (stripe_[i].header.metadata.state == 1)
store_stripe (i, (i == active_ ? sizeof (COSSStripe) : sizeof (COSSStripeHeader)));
stream_.close();
delete[] directory_;
/*
INFO(log_) << "Stats: ";
INFO(log_) << "\tLookups=" << stats_.lookups;
INFO(log_) << "\tHits=" << (stats_.found_1 + stats_.found_2) << " (" << stats_.found_1 << " + " << stats_.found_2 << ")";
if (stats_.lookups > 0)
INFO(log_) << "\tHit ratio=" << ((stats_.found_1 + stats_.found_2) * 100) / stats_.lookups << "%";
*/
DEBUG(log_) << "Closing coss file: " << file_path_;
DEBUG(log_) << "Serial: " << serial_number_;
DEBUG(log_) << "Stripe number: " << stripe_range_;
DEBUG(log_) << "Index size: " << cache_index_.size();
}
bool XCodecCacheCOSS::read_file ()
{
COSSStripeHeader header;
COSSIndexEntry entry;
uint64_t serial, range, limit, level;
uint64_t hash;
serial = range = limit = level = 0;
limit = file_size_ / sizeof (COSSStripe);
if (limit * sizeof (COSSStripe) != file_size_)
return false;
if (limit > stripe_limit_)
limit = stripe_limit_;
stream_.seekg (0);
for (uint64_t n = 0; n < limit; ++n)
{
stream_.read ((char*) &header, sizeof header);
if (! stream_.good() || stream_.gcount () != sizeof header)
return false;
if (header.metadata.signature != CACHE_SIGNATURE)
return false;
if (header.metadata.segment_count > STRIPE_SEGMENT_COUNT)
return false;
stream_.seekg (sizeof (COSSStripe) - sizeof header, ios::cur);
if (header.metadata.serial_number > serial)
serial = header.metadata.serial_number, range = n;
if (header.metadata.freshness > level)
level = header.metadata.freshness;
directory_[n] = header.metadata;
directory_[n].state = 0;
for (int i = 0; i < STRIPE_SEGMENT_COUNT; ++i)
{
if ((hash = header.hash_array[i]))
{
entry.stripe_range = n;
entry.position = i;
cache_index_.insert (hash, entry);
}
}
}
if (serial > 0)
{
serial_number_ = serial;
stripe_range_ = range;
freshness_level_ = level;
load_stripe (stripe_range_, active_);
}
else
{
initialize_stripe (stripe_range_, active_);
}
return true;
}
void XCodecCacheCOSS::enter (const uint64_t& hash, const Buffer& buf, unsigned off)
{
COSSIndexEntry entry;
while (stripe_[active_].header.metadata.segment_index >= STRIPE_SEGMENT_COUNT)
new_active ();
COSSStripe& act = stripe_[active_];
act.header.hash_array[act.header.metadata.segment_index] = hash;
buf.copyout (act.segment_array[act.header.metadata.segment_index].bytes, off, XCODEC_SEGMENT_LENGTH);
entry.stripe_range = act.header.metadata.stripe_range;
entry.position = act.header.metadata.segment_index;
act.header.metadata.segment_index++;
while (act.header.metadata.segment_index < STRIPE_SEGMENT_COUNT &&
act.header.hash_array[act.header.metadata.segment_index])
act.header.metadata.segment_index++;
act.header.metadata.segment_count++;
act.header.metadata.freshness = ++freshness_level_;
cache_index_.insert (hash, entry);
}
bool XCodecCacheCOSS::lookup (const uint64_t& hash, Buffer& buf)
{
const COSSIndexEntry* entry;
const uint8_t* data;
int slot;
stats_.lookups++;
if ((data = find_recent (hash)))
{
buf.append (data, XCODEC_SEGMENT_LENGTH);
stats_.found_1++;
return true;
}
if (! (entry = cache_index_.lookup (hash)))
return false;
for (slot = 0; slot < LOADED_STRIPE_COUNT; ++slot)
if (stripe_[slot].header.metadata.stripe_range == entry->stripe_range)
break;
if (slot >= LOADED_STRIPE_COUNT)
{
slot = best_unloadable_slot ();
detach_stripe (slot);
load_stripe (entry->stripe_range, slot);
}
if (stripe_[slot].header.hash_array[entry->position] != hash)
return false;
stripe_[slot].header.metadata.freshness = ++freshness_level_;
stripe_[slot].header.metadata.uses++;
stripe_[slot].header.metadata.credits++;
stripe_[slot].header.metadata.load_uses++;
stripe_[slot].header.flags[entry->position] |= 3;
data = stripe_[slot].segment_array[entry->position].bytes;
remember (hash, data);
buf.append (data, XCODEC_SEGMENT_LENGTH);
stats_.found_2++;
return true;
}
void XCodecCacheCOSS::initialize_stripe (uint64_t range, int slot)
{
memset (&stripe_[slot].header, 0, sizeof (COSSStripeHeader));
stripe_[slot].header.metadata.signature = CACHE_SIGNATURE;
stripe_[slot].header.metadata.version = CACHE_VERSION;
stripe_[slot].header.metadata.serial_number = ++serial_number_;
stripe_[slot].header.metadata.stripe_range = range;
stripe_[slot].header.metadata.state = 1;
directory_[range] = stripe_[slot].header.metadata;
}
bool XCodecCacheCOSS::load_stripe (uint64_t range, int slot)
{
uint64_t pos = range * sizeof (COSSStripe);
if (pos < file_size_)
{
stream_.seekg (pos);
stream_.read ((char*) &stripe_[slot], sizeof (COSSStripe));
if (stream_.gcount () == sizeof (COSSStripe))
{
stripe_[slot].header.metadata.stripe_range = range;
stripe_[slot].header.metadata.load_uses = 0;
stripe_[slot].header.metadata.state = 1;
directory_[range].state = 1;
return true;
}
}
stream_.clear ();
return false;
}
void XCodecCacheCOSS::store_stripe (int slot, size_t size)
{
uint64_t pos = stripe_[slot].header.metadata.stripe_range * sizeof (COSSStripe);
if (pos != (uint64_t) stream_.tellp ())
stream_.seekp (pos);
stream_.write ((char*) &stripe_[slot], size);
if (stream_.good () && pos + sizeof (COSSStripe) > file_size_)
file_size_ = pos + sizeof (COSSStripe);
stream_.clear ();
}
void XCodecCacheCOSS::new_active ()
{
store_stripe (active_, sizeof (COSSStripe));
active_ = best_unloadable_slot ();
detach_stripe (active_);
stripe_range_ = best_erasable_stripe ();
if (load_stripe (stripe_range_, active_))
purge_stripe (active_);
else
initialize_stripe (stripe_range_, active_);
}
int XCodecCacheCOSS::best_unloadable_slot ()
{
uint64_t v, n = 0xFFFFFFFFFFFFFFFFull;
int j = 0;
for (int i = 0; i < LOADED_STRIPE_COUNT; ++i)
{
if (i == active_)
continue;
if (stripe_[i].header.metadata.signature == 0)
return i;
if ((v = stripe_[i].header.metadata.freshness + stripe_[i].header.metadata.load_uses) < n)
j = i, n = v;
}
return j;
}
uint64_t XCodecCacheCOSS::best_erasable_stripe ()
{
COSSMetadata* m;
uint64_t v, n = 0xFFFFFFFFFFFFFFFFull;
uint64_t i, j = 0;
for (m = directory_, i = 0; i < stripe_limit_; ++i, ++m)
{
if (m->state == 1)
continue;
if (m->signature == 0)
return i;
if ((v = m->freshness + m->uses) < n)
j = i, n = v;
}
return j;
}
void XCodecCacheCOSS::detach_stripe (int slot)
{
if (stripe_[slot].header.metadata.state == 1)
{
uint64_t range = stripe_[slot].header.metadata.stripe_range;
directory_[range] = stripe_[slot].header.metadata;
directory_[range].state = 2;
for (int i = 0; i < STRIPE_SEGMENT_COUNT; ++i)
{
if (stripe_[slot].header.flags[i] & 1)
{
forget (stripe_[slot].header.hash_array[i]);
stripe_[slot].header.flags[i] &= ~1;
}
}
stripe_[slot].header.metadata.state = 0;
store_stripe (slot, sizeof (COSSStripeHeader));
}
}
void XCodecCacheCOSS::purge_stripe (int slot)
{
for (int i = STRIPE_SEGMENT_COUNT - 1; i >= 0; --i)
{
uint64_t hash = stripe_[slot].header.hash_array[i];
if (hash && ! (stripe_[slot].header.flags[i] & 2))
{
cache_index_.erase (hash);
stripe_[slot].header.hash_array[i] = 0;
stripe_[slot].header.flags[i] = 0;
stripe_[slot].header.metadata.segment_count--;
}
stripe_[slot].header.flags[i] &= ~2;
if (! stripe_[slot].header.hash_array[i])
stripe_[slot].header.metadata.segment_index = i;
}
stripe_[slot].header.metadata.serial_number = ++serial_number_;
stripe_[slot].header.metadata.uses = stripe_[slot].header.metadata.credits;
stripe_[slot].header.metadata.credits = 0;
if (stripe_[slot].header.metadata.segment_count >= STRIPE_SEGMENT_COUNT)
INFO(log_) << "No more space available in cache";
}

223
xcodec/cache/coss/xcodec_cache_coss.h vendored Normal file

@ -0,0 +1,223 @@
/*
*
* XCodec COSS Cache
*
* COSS = Cyclic Object Storage System
*
* Idea taken from Squid COSS.
*
* Diego Woitasen <diegows@xtech.com.ar>
* XTECH
*
*/
#ifndef XCODEC_XCODEC_CACHE_COSS_H
#define XCODEC_XCODEC_CACHE_COSS_H
#include <string>
#include <map>
#include <fstream>
#include <common/buffer.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_cache_coss.h //
// Description: persistent cache on disk for xcodec protocol streams //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
using namespace std;
/*
* - In COSS, we have one file per cache (UUID). The file is divided into
* stripes.
*
* - Each stripe is composed of:
* metadata + hash array + segment size array + segment array.
*
* - The array elements are in the same order:
* hash1, hash2, ..., hashN - size1, size2, ..., sizeN - seg1, seg2, ..., segN.
*
* - The segments are indexed in memory. This index is loaded when the cache is
* opened, by reading the hash array of each stripe. This takes a few
* milliseconds in a 10 GB cache.
*
* - We have one active stripe in memory at a time. New segments are written
* in the current stripe in order of appearance.
*
* - When a cached segment is requested and it's outside the active stripe,
* it is copied into the active stripe.
*
* - When the current stripe is full, we move to the next one.
*
* - When we reach the EOF, the first stripe is zeroed and becomes active.
*
*/
// Changes introduced in version 2:
//
// - segment size is made independent of BUFFER_SEGMENT_SIZE and not stored explicitly
// since it is always XCODEC_SEGMENT_LENGTH
// - several stripes can be held simultaneously in memory, and when a segment is requested
// which lies outside the active stripe it is read into an alternate slot together
// with the whole stripe to be ready to satisfy requests for neighbour segments
// - an array of bits keeps track of the state of each segment within a stripe, signaling
// whether it has been recently used
// - when no more space is available, the LRU stripe is purged and any segments
// not used during the last period are erased
/*
* These values should be page aligned.
*/
#define CACHE_SIGNATURE 0xF150E964
#define CACHE_VERSION 2
#define STRIPE_SEGMENT_COUNT 512 // segments of XCODEC_SEGMENT_LENGTH per stripe (must fit into 16 bits)
#define LOADED_STRIPE_COUNT 4 // number of stripes held in memory (must be greater than 1)
#define CACHE_BASIC_SIZE 1024 // MB
#define CACHE_ALIGNEMENT 4096
#define HEADER_ARRAY_SIZE (STRIPE_SEGMENT_COUNT * (sizeof (uint64_t) + sizeof (uint32_t)))
#define METADATA_SIZE (sizeof (COSSMetadata))
#define ROUND_UP(N, S) ((((N) + (S) - 1) / (S)) * (S))
#define HEADER_ALIGNED_SIZE ROUND_UP(HEADER_ARRAY_SIZE + METADATA_SIZE, CACHE_ALIGNEMENT)
#define METADATA_PADDING (HEADER_ALIGNED_SIZE - HEADER_ARRAY_SIZE - METADATA_SIZE)
struct COSSIndexEntry
{
uint64_t stripe_range : 48;
uint64_t position : 16;
};
class COSSIndex
{
typedef __gnu_cxx::hash_map<Hash64, COSSIndexEntry> index_t;
index_t index;
public:
void insert (const uint64_t& hash, const COSSIndexEntry& entry)
{
index[hash] = entry;
}
const COSSIndexEntry* lookup (const uint64_t& hash)
{
index_t::iterator it = index.find (hash);
return (it != index.end () ? &it->second : 0);
}
void erase (const uint64_t& hash)
{
index.erase (hash);
}
size_t size()
{
return index.size();
}
};
struct COSSOnDiskSegment
{
uint8_t bytes[XCODEC_SEGMENT_LENGTH];
string hexdump() {
string dump;
char buf[8];
int i;
for (i = 0; i < 70; i++) {
snprintf(buf, 8, "%02x", bytes[i]);
dump += buf;
}
return dump;
}
};
struct COSSMetadata
{
uint32_t signature;
uint32_t version;
uint64_t serial_number;
uint64_t stripe_range;
uint32_t segment_index;
uint32_t segment_count;
uint64_t freshness;
uint64_t uses;
uint64_t credits;
uint32_t load_uses;
uint32_t state;
};
struct COSSStripeHeader
{
COSSMetadata metadata;
char padding[METADATA_PADDING];
uint32_t flags[STRIPE_SEGMENT_COUNT];
uint64_t hash_array[STRIPE_SEGMENT_COUNT];
};
struct COSSStripe
{
COSSStripeHeader header;
COSSOnDiskSegment segment_array[STRIPE_SEGMENT_COUNT];
public:
COSSStripe() { memset (&header, 0, sizeof header); }
};
struct COSSStats
{
uint64_t lookups;
uint64_t found_1;
uint64_t found_2;
public:
COSSStats() { lookups = found_1 = found_2 = 0; }
};
class XCodecCacheCOSS : public XCodecCache
{
std::string file_path_;
uint64_t file_size_;
fstream stream_;
uint64_t serial_number_;
uint64_t stripe_range_;
uint64_t stripe_limit_;
uint64_t freshness_level_;
COSSStripe stripe_[LOADED_STRIPE_COUNT];
int active_;
COSSMetadata* directory_;
COSSIndex cache_index_;
COSSStats stats_;
LogHandle log_;
public:
XCodecCacheCOSS (const UUID& uuid, const std::string& cache_dir, size_t cache_size);
~XCodecCacheCOSS();
virtual void enter (const uint64_t& hash, const Buffer& buf, unsigned off);
virtual bool lookup (const uint64_t& hash, Buffer& buf);
private:
bool read_file ();
void initialize_stripe (uint64_t range, int slot);
bool load_stripe (uint64_t range, int slot);
void store_stripe (int slot, size_t size);
void new_active ();
int best_unloadable_slot ();
uint64_t best_erasable_stripe ();
void detach_stripe (int slot);
void purge_stripe (int slot);
};
#endif /* !XCODEC_XCODEC_CACHE_COSS_H */
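The sizes implied by the macros in this header can be worked out with a small sketch. Assumptions, not taken from the source: XCODEC_SEGMENT_LENGTH is 128 bytes, and COSSMetadata occupies 64 bytes (its fields sum to 64 with natural alignment). All names below are hypothetical and only mirror the macros and the offset computation used by load_stripe and store_stripe.

```cpp
#include <cassert>
#include <cstdint>

static const uint64_t kSegmentLength = 128;  // assumed XCODEC_SEGMENT_LENGTH
static const uint64_t kSegmentCount = 512;   // STRIPE_SEGMENT_COUNT
static const uint64_t kAlignment = 4096;     // CACHE_ALIGNEMENT
static const uint64_t kMetadataSize = 64;    // assumed sizeof (COSSMetadata)

// Same arithmetic as the ROUND_UP macro.
uint64_t round_up(uint64_t n, uint64_t s)
{
    assert(s > 0);
    return ((n + s - 1) / s) * s;
}

// Mirrors HEADER_ALIGNED_SIZE: flags + hash arrays plus metadata, padded
// up to a multiple of the alignment.
uint64_t header_size()
{
    uint64_t arrays = kSegmentCount * (sizeof(uint64_t) + sizeof(uint32_t));
    return round_up(arrays + kMetadataSize, kAlignment);
}

// One stripe on disk: aligned header followed by the segment array.
uint64_t stripe_size()
{
    return header_size() + kSegmentCount * kSegmentLength;
}

// File offset of a stripe, as computed before each seek.
uint64_t stripe_offset(uint64_t range)
{
    return range * stripe_size();
}

// Number of stripes for a cache of `cache_mb` megabytes, as in the constructor.
uint64_t stripe_limit(uint64_t cache_mb)
{
    return round_up(cache_mb * 1048576, stripe_size()) / stripe_size();
}
```

Under these assumptions the header is 6144 + 64 = 6208 bytes rounded up to 8192, a stripe is 8192 + 65536 = 73728 bytes, and the default 1024 MB cache holds 14564 stripes.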

3
xcodec/example/Makefile Normal file

@ -0,0 +1,3 @@
SUBDIR+=xcodec-hash-roll1
include ../../common/subdir.mk


@ -0,0 +1,7 @@
PROGRAM=xcodec-hash-roll1
SRCS+= xcodec-hash-roll1.cc
TOPDIR=../../..
USE_LIBS=common common/thread common/time common/uuid event io xcodec
include ${TOPDIR}/common/program.mk


@ -0,0 +1,112 @@
/*
* Copyright (c) 2012 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <unistd.h>
#include <event/event_callback.h>
#include <event/event_system.h>
#include <io/stream_handle.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_hash.h>
class Sink {
LogHandle log_;
StreamHandle fd_;
XCodecHash hash_;
unsigned length_;
Action *action_;
public:
Sink(int fd)
: log_("/sink"),
fd_(fd),
hash_(),
length_(0),
action_(NULL)
{
EventCallback *cb = callback(this, &Sink::read_complete);
action_ = fd_.read(0, cb);
}
~Sink()
{
ASSERT(log_, action_ == NULL);
}
void read_complete(Event e)
{
action_->cancel();
action_ = NULL;
switch (e.type_) {
case Event::Done:
case Event::EOS:
break;
default:
HALT(log_) << "Unexpected event: " << e;
return;
}
while (!e.buffer_.empty()) {
BufferSegment *seg;
e.buffer_.moveout(&seg);
const uint8_t *p, *q = seg->end();
if (length_ == XCODEC_SEGMENT_LENGTH) {
for (p = seg->data(); p < q; p++) {
hash_.roll(*p);
}
} else {
for (p = seg->data(); p < q; p++) {
if (length_ == XCODEC_SEGMENT_LENGTH) {
hash_.roll(*p);
} else {
hash_.add(*p);
length_++;
}
}
}
seg->unref();
}
if (e.type_ == Event::EOS) {
fd_.close();
return;
}
EventCallback *cb = callback(this, &Sink::read_complete);
action_ = fd_.read(0, cb);
}
};
int
main(void)
{
Sink sink(STDIN_FILENO);
event_system.run();
}

5
xcodec/lib.mk Normal file

@ -0,0 +1,5 @@
VPATH+= ${TOPDIR}/xcodec
SRCS+= xcodec_encoder.cc
SRCS+= xcodec_decoder.cc
SRCS+= xcodec_filter.cc

4
xcodec/test/Makefile Normal file

@ -0,0 +1,4 @@
SUBDIR+=xcodec-encode-decode1
SUBDIR+=xcodec-hash1
include ../../common/subdir.mk


@ -0,0 +1,5 @@
TEST=xcodec-encode-decode1
TOPDIR=../../..
USE_LIBS=common common/uuid xcodec
include ${TOPDIR}/common/program.mk


@ -0,0 +1,108 @@
/*
* Copyright (c) 2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <common/buffer.h>
#include <common/test.h>
#include <common/uuid/uuid.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
#include <xcodec/xcodec_decoder.h>
#include <xcodec/xcodec_encoder.h>
int
main(void)
{
{
TestGroup g("/test/xcodec/encode-decode/1/char_kat", "XCodecEncoder::encode / XCodecDecoder::decode #1");
unsigned i;
for (i = 0; i < 256; i++) {
Buffer in;
unsigned j;
for (j = 0; j < XCODEC_SEGMENT_LENGTH; j++)
in.append((uint8_t)i);
for (j = 0; j < 8; j++) {
Buffer tmp(in);
tmp.append(in);
in = tmp;
}
Buffer original(in);
UUID uuid;
uuid.generate();
XCodecCache *cache = new XCodecMemoryCache(uuid);
XCodecEncoder encoder(cache);
Buffer out;
encoder.encode(&out, &in);
{
Test _(g, "Empty input buffer after encode.", in.empty());
}
{
Test _(g, "Non-empty output buffer after encode.", !out.empty());
}
{
Test _(g, "Reduction in size.", out.length() < original.length());
}
out.moveout(&in);
XCodecDecoder decoder(cache);
std::set<uint64_t> unknown_hashes;
bool ok = decoder.decode(&out, &in, unknown_hashes);
{
Test _(g, "Decoder success.", ok);
}
{
Test _(g, "No unknown hashes.", unknown_hashes.empty());
}
{
Test _(g, "Empty input buffer after decode.", in.empty());
}
{
Test _(g, "Non-empty output buffer after decode.", !out.empty());
}
{
Test _(g, "Expected data.", out.equal(&original));
}
delete cache;
}
}
return (0);
}


@ -0,0 +1,5 @@
TEST=xcodec-hash1
TOPDIR=../../..
USE_LIBS=common xcodec
include ${TOPDIR}/common/program.mk


@ -0,0 +1,314 @@
/*
* Copyright (c) 2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <common/test.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_hash.h>
/*
* Single-character run known-answer tests.
*/
static uint64_t char_kats[] = {
0x0000000080200400ull,
0x8200400000400800ull,
0x0400800080600c00ull,
0x8200400000801000ull,
0x8600c00080a01400ull,
0x8200400000c01800ull,
0x0400800080e01c00ull,
0x8200400001002000ull,
0x0801000081202400ull,
0x8200400001402800ull,
0x0400800081602c00ull,
0x8200400001803000ull,
0x8600c00081a03400ull,
0x8200400001c03800ull,
0x0400800081e03c00ull,
0x8200400002004000ull,
0x8a01400082204400ull,
0x8200400002404800ull,
0x0400800082604c00ull,
0x8200400002805000ull,
0x8600c00082a05400ull,
0x8200400002c05800ull,
0x0400800082e05c00ull,
0x8200400003006000ull,
0x0801000083206400ull,
0x8200400003406800ull,
0x0400800083606c00ull,
0x8200400003807000ull,
0x8600c00083a07400ull,
0x8200400003c07800ull,
0x0400800083e07c00ull,
0x8200400004008000ull,
0x0c01800084208400ull,
0x8200400004408800ull,
0x0400800084608c00ull,
0x8200400004809000ull,
0x8600c00084a09400ull,
0x8200400004c09800ull,
0x0400800084e09c00ull,
0x820040000500a000ull,
0x080100008520a400ull,
0x820040000540a800ull,
0x040080008560ac00ull,
0x820040000580b000ull,
0x8600c00085a0b400ull,
0x8200400005c0b800ull,
0x0400800085e0bc00ull,
0x820040000600c000ull,
0x8a0140008620c400ull,
0x820040000640c800ull,
0x040080008660cc00ull,
0x820040000680d000ull,
0x8600c00086a0d400ull,
0x8200400006c0d800ull,
0x0400800086e0dc00ull,
0x820040000700e000ull,
0x080100008720e400ull,
0x820040000740e800ull,
0x040080008760ec00ull,
0x820040000780f000ull,
0x8600c00087a0f400ull,
0x8200400007c0f800ull,
0x0400800087e0fc00ull,
0x8200400008010000ull,
0x8e01c00088210400ull,
0x8200400008410800ull,
0x0400800088610c00ull,
0x8200400008811000ull,
0x8600c00088a11400ull,
0x8200400008c11800ull,
0x0400800088e11c00ull,
0x8200400009012000ull,
0x0801000089212400ull,
0x8200400009412800ull,
0x0400800089612c00ull,
0x8200400009813000ull,
0x8600c00089a13400ull,
0x8200400009c13800ull,
0x0400800089e13c00ull,
0x820040000a014000ull,
0x8a0140008a214400ull,
0x820040000a414800ull,
0x040080008a614c00ull,
0x820040000a815000ull,
0x8600c0008aa15400ull,
0x820040000ac15800ull,
0x040080008ae15c00ull,
0x820040000b016000ull,
0x080100008b216400ull,
0x820040000b416800ull,
0x040080008b616c00ull,
0x820040000b817000ull,
0x8600c0008ba17400ull,
0x820040000bc17800ull,
0x040080008be17c00ull,
0x820040000c018000ull,
0x0c0180008c218400ull,
0x820040000c418800ull,
0x040080008c618c00ull,
0x820040000c819000ull,
0x8600c0008ca19400ull,
0x820040000cc19800ull,
0x040080008ce19c00ull,
0x820040000d01a000ull,
0x080100008d21a400ull,
0x820040000d41a800ull,
0x040080008d61ac00ull,
0x820040000d81b000ull,
0x8600c0008da1b400ull,
0x820040000dc1b800ull,
0x040080008de1bc00ull,
0x820040000e01c000ull,
0x8a0140008e21c400ull,
0x820040000e41c800ull,
0x040080008e61cc00ull,
0x820040000e81d000ull,
0x8600c0008ea1d400ull,
0x820040000ec1d800ull,
0x040080008ee1dc00ull,
0x820040000f01e000ull,
0x080100008f21e400ull,
0x820040000f41e800ull,
0x040080008f61ec00ull,
0x820040000f81f000ull,
0x8600c0008fa1f400ull,
0x820040000fc1f800ull,
0x040080008fe1fc00ull,
0x8200400010020000ull,
0x1002000090220400ull,
0x8200400010420800ull,
0x0400800090620c00ull,
0x8200400010821000ull,
0x8600c00090a21400ull,
0x8200400010c21800ull,
0x0400800090e21c00ull,
0x8200400011022000ull,
0x0801000091222400ull,
0x8200400011422800ull,
0x0400800091622c00ull,
0x8200400011823000ull,
0x8600c00091a23400ull,
0x8200400011c23800ull,
0x0400800091e23c00ull,
0x8200400012024000ull,
0x8a01400092224400ull,
0x8200400012424800ull,
0x0400800092624c00ull,
0x8200400012825000ull,
0x8600c00092a25400ull,
0x8200400012c25800ull,
0x0400800092e25c00ull,
0x8200400013026000ull,
0x0801000093226400ull,
0x8200400013426800ull,
0x0400800093626c00ull,
0x8200400013827000ull,
0x8600c00093a27400ull,
0x8200400013c27800ull,
0x0400800093e27c00ull,
0x8200400014028000ull,
0x0c01800094228400ull,
0x8200400014428800ull,
0x0400800094628c00ull,
0x8200400014829000ull,
0x8600c00094a29400ull,
0x8200400014c29800ull,
0x0400800094e29c00ull,
0x820040001502a000ull,
0x080100009522a400ull,
0x820040001542a800ull,
0x040080009562ac00ull,
0x820040001582b000ull,
0x8600c00095a2b400ull,
0x8200400015c2b800ull,
0x0400800095e2bc00ull,
0x820040001602c000ull,
0x8a0140009622c400ull,
0x820040001642c800ull,
0x040080009662cc00ull,
0x820040001682d000ull,
0x8600c00096a2d400ull,
0x8200400016c2d800ull,
0x0400800096e2dc00ull,
0x820040001702e000ull,
0x080100009722e400ull,
0x820040001742e800ull,
0x040080009762ec00ull,
0x820040001782f000ull,
0x8600c00097a2f400ull,
0x8200400017c2f800ull,
0x0400800097e2fc00ull,
0x8200400018030000ull,
0x8e01c00098230400ull,
0x8200400018430800ull,
0x0400800098630c00ull,
0x8200400018831000ull,
0x8600c00098a31400ull,
0x8200400018c31800ull,
0x0400800098e31c00ull,
0x8200400019032000ull,
0x0801000099232400ull,
0x8200400019432800ull,
0x0400800099632c00ull,
0x8200400019833000ull,
0x8600c00099a33400ull,
0x8200400019c33800ull,
0x0400800099e33c00ull,
0x820040001a034000ull,
0x8a0140009a234400ull,
0x820040001a434800ull,
0x040080009a634c00ull,
0x820040001a835000ull,
0x8600c0009aa35400ull,
0x820040001ac35800ull,
0x040080009ae35c00ull,
0x820040001b036000ull,
0x080100009b236400ull,
0x820040001b436800ull,
0x040080009b636c00ull,
0x820040001b837000ull,
0x8600c0009ba37400ull,
0x820040001bc37800ull,
0x040080009be37c00ull,
0x820040001c038000ull,
0x0c0180009c238400ull,
0x820040001c438800ull,
0x040080009c638c00ull,
0x820040001c839000ull,
0x8600c0009ca39400ull,
0x820040001cc39800ull,
0x040080009ce39c00ull,
0x820040001d03a000ull,
0x080100009d23a400ull,
0x820040001d43a800ull,
0x040080009d63ac00ull,
0x820040001d83b000ull,
0x8600c0009da3b400ull,
0x820040001dc3b800ull,
0x040080009de3bc00ull,
0x820040001e03c000ull,
0x8a0140009e23c400ull,
0x820040001e43c800ull,
0x040080009e63cc00ull,
0x820040001e83d000ull,
0x8600c0009ea3d400ull,
0x820040001ec3d800ull,
0x040080009ee3dc00ull,
0x820040001f03e000ull,
0x080100009f23e400ull,
0x820040001f43e800ull,
0x040080009f63ec00ull,
0x820040001f83f000ull,
0x8600c0009fa3f400ull,
0x820040001fc3f800ull,
0x040080009fe3fc00ull,
0x8200400020040000ull
};
int
main(void)
{
{
TestGroup g("/test/xcodec/hash1/char_kat", "XCodecHash #1 / Single-character KATs");
unsigned i;
for (i = 0; i < sizeof char_kats / sizeof char_kats[0]; i++) {
XCodecHash hash;
unsigned j;
for (j = 0; j < XCODEC_SEGMENT_LENGTH; j++)
hash.add((uint8_t)i);
std::ostringstream os;
os << "KAT #" << i;
Test _(g, os.str(), char_kats[i] == hash.mix());
}
}
return (0);
}

80
xcodec/xcodec.h Normal file

@ -0,0 +1,80 @@
/*
* Copyright (c) 2008-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef XCODEC_XCODEC_H
#define XCODEC_XCODEC_H
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec.h //
// Description: symbolic constants for the basic xcodec protocol //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
#define XCODEC_MAGIC ((uint8_t)0xf1) /* Magic! */
/*
* Usage:
* <MAGIC> <OP_ESCAPE>
*
* Effects:
* A literal XCODEC_MAGIC appears in the output stream.
*
*/
#define XCODEC_OP_ESCAPE ((uint8_t)0x00)
/*
* Usage:
* <MAGIC> <OP_EXTRACT> data[uint8_t x XCODEC_SEGMENT_LENGTH]
*
* Effects:
* The `data' is hashed, the hash is associated with the data if possible
* and the data is inserted into the output stream.
*
* If other data is already known by the hash of `data', the decoder will
* indicate an error.
*
*/
#define XCODEC_OP_EXTRACT ((uint8_t)0x01)
/*
* Usage:
* <MAGIC> <OP_REF> hash[uint64_t]
*
* Effects:
* The data associated with the hash `hash' is looked up and inserted into
* the output stream if possible.
*
* If the `hash' is not known, an OP_ASK will be sent in response.
*
*/
#define XCODEC_OP_REF ((uint8_t)0x02)
#define XCODEC_SEGMENT_LENGTH (2048)
#endif /* !XCODEC_XCODEC_H */
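
The escape rule documented in xcodec.h can be illustrated in isolation. The following is only a minimal sketch, not the project's implementation: it works on std::vector<uint8_t> instead of the project's Buffer class, and the helper names escape_magic and unescape_magic are hypothetical. The two constants mirror the values in the header.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values copied from xcodec.h.
static const uint8_t kMagic = 0xf1;     // XCODEC_MAGIC
static const uint8_t kOpEscape = 0x00;  // XCODEC_OP_ESCAPE

// Hypothetical helper: follow every literal magic byte with OP_ESCAPE,
// so the output contains <MAGIC> <OP_ESCAPE> wherever the input had MAGIC.
std::vector<uint8_t> escape_magic(const std::vector<uint8_t>& in)
{
	std::vector<uint8_t> out;
	for (std::size_t i = 0; i < in.size(); i++) {
		out.push_back(in[i]);
		if (in[i] == kMagic)
			out.push_back(kOpEscape);
	}
	return out;
}

// Hypothetical helper: inverse of escape_magic. Returns false on a
// truncated stream or on any op other than OP_ESCAPE, which this sketch
// deliberately does not handle.
bool unescape_magic(const std::vector<uint8_t>& in, std::vector<uint8_t>& out)
{
	for (std::size_t i = 0; i < in.size(); i++) {
		if (in[i] != kMagic) {
			out.push_back(in[i]);
			continue;
		}
		if (i + 1 >= in.size() || in[i + 1] != kOpEscape)
			return false;
		out.push_back(kMagic);
		i++; // skip the op byte
	}
	return true;
}
```

Escaping grows the stream by one byte per literal magic byte, which is why the encoder only escapes data it could neither declare nor reference.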

201
xcodec/xcodec_cache.h Normal file

@ -0,0 +1,201 @@
/*
* Copyright (c) 2008-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef XCODEC_XCODEC_CACHE_H
#define XCODEC_XCODEC_CACHE_H
#include <ext/hash_map>
#include <map>
#include <common/buffer.h>
#include <common/uuid/uuid.h>
#include <xcodec/xcodec.h>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_cache.h //
// Description: base cache class and in-memory cache implementation //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
#define XCODEC_WINDOW_COUNT 64 // must be a power of two (used as a wrap mask)
/*
* XXX
* GCC supports hash<unsigned long> but not hash<unsigned long long>. On some
* of our platforms, the former is 64-bits, on some the latter. As a result,
* we need this wrapper structure to throw our hashes into so that GCC's hash
* function can be reliably defined to use them.
*/
struct Hash64
{
uint64_t hash_;
Hash64(const uint64_t& hash)
: hash_(hash)
{ }
bool operator== (const Hash64& hash) const
{
return (hash_ == hash.hash_);
}
bool operator< (const Hash64& hash) const
{
return (hash_ < hash.hash_);
}
};
namespace __gnu_cxx
{
template<>
struct hash<Hash64>
{
size_t operator() (const Hash64& x) const
{
return (x.hash_);
}
};
}
class XCodecCache
{
private:
UUID uuid_;
size_t size_;
struct WindowItem {uint64_t hash; const uint8_t* data;};
WindowItem window_[XCODEC_WINDOW_COUNT];
unsigned cursor_;
protected:
XCodecCache (const UUID& uuid, size_t size)
: uuid_(uuid),
size_(size)
{
memset (window_, 0, sizeof window_);
cursor_ = 0;
}
public:
virtual ~XCodecCache()
{ }
const UUID& identifier ()
{
return uuid_;
}
size_t nominal_size ()
{
return size_;
}
virtual void enter (const uint64_t& hash, const Buffer& buf, unsigned off) = 0;
virtual bool lookup (const uint64_t& hash, Buffer& buf) = 0;
protected:
void remember (const uint64_t& hash, const uint8_t* data)
{
window_[cursor_].hash = hash;
window_[cursor_].data = data;
cursor_ = (cursor_ + 1) & (XCODEC_WINDOW_COUNT - 1);
}
const uint8_t* find_recent (const uint64_t& hash)
{
WindowItem* w;
int n;
for (w = window_, n = XCODEC_WINDOW_COUNT; n > 0; --n, ++w)
if (w->hash == hash)
return w->data;
return 0;
}
void forget (const uint64_t& hash)
{
WindowItem* w;
int n;
for (w = window_, n = XCODEC_WINDOW_COUNT; n > 0; --n, ++w)
if (w->hash == hash)
w->hash = 0;
}
};
class XCodecMemoryCache : public XCodecCache
{
typedef __gnu_cxx::hash_map<Hash64, const uint8_t*> segment_hash_map_t;
segment_hash_map_t segment_hash_map_;
LogHandle log_;
public:
XCodecMemoryCache (const UUID& uuid, size_t size)
: XCodecCache(uuid, size),
log_("/xcodec/cache/memory")
{ }
~XCodecMemoryCache()
{
segment_hash_map_t::const_iterator it;
for (it = segment_hash_map_.begin(); it != segment_hash_map_.end(); ++it)
delete[] it->second;
segment_hash_map_.clear();
}
void enter (const uint64_t& hash, const Buffer& buf, unsigned off)
{
ASSERT(log_, segment_hash_map_.find(hash) == segment_hash_map_.end());
uint8_t* data = new uint8_t[XCODEC_SEGMENT_LENGTH];
buf.copyout (data, off, XCODEC_SEGMENT_LENGTH);
segment_hash_map_[hash] = data;
}
bool lookup (const uint64_t& hash, Buffer& buf)
{
const uint8_t* data;
if ((data = find_recent (hash)))
{
buf.append (data, XCODEC_SEGMENT_LENGTH);
return true;
}
segment_hash_map_t::const_iterator it = segment_hash_map_.find (hash);
if (it != segment_hash_map_.end ())
{
buf.append (it->second, XCODEC_SEGMENT_LENGTH);
remember (hash, it->second);
return true;
}
return false;
}
};
#endif /* !XCODEC_XCODEC_CACHE_H */
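
The recent-segment window in XCodecCache is a fixed-size ring whose cursor wraps with a bit mask, which is why the slot count has to be a power of two. The sketch below isolates that idea under a few assumptions: the class name RecentWindow and the constant kCount are hypothetical, and nothing here owns or frees the data pointers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the window kept by XCodecCache.
class RecentWindow
{
	static const unsigned kCount = 64; // must be a power of two
	struct Item { uint64_t hash; const uint8_t* data; };
	Item items_[kCount];
	unsigned cursor_;
public:
	RecentWindow() : cursor_(0)
	{
		for (unsigned i = 0; i < kCount; i++) {
			items_[i].hash = 0;
			items_[i].data = 0;
		}
	}

	// Overwrite the oldest slot; the mask wraps the cursor without a modulo.
	void remember(uint64_t hash, const uint8_t* data)
	{
		items_[cursor_].hash = hash;
		items_[cursor_].data = data;
		cursor_ = (cursor_ + 1) & (kCount - 1);
	}

	// Linear scan over all slots; returns 0 when the hash is not present.
	const uint8_t* find_recent(uint64_t hash) const
	{
		for (unsigned i = 0; i < kCount; i++)
			if (items_[i].hash == hash)
				return items_[i].data;
		return 0;
	}
};
```

As in the header, an empty slot is marked by hash 0 with a null data pointer, so a lookup of hash 0 simply misses unless such an entry was explicitly remembered.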

176
xcodec/xcodec_decoder.cc Normal file

@ -0,0 +1,176 @@
/*
* Copyright (c) 2008-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <common/buffer.h>
#include <common/endian.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
#include <xcodec/xcodec_decoder.h>
#include <xcodec/xcodec_encoder.h>
#include <xcodec/xcodec_hash.h>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_decoder.cc //
// Description: decoding routines for the xcodec protocol //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
XCodecDecoder::XCodecDecoder(XCodecCache* cache)
: log_("/xcodec/decoder"),
cache_(cache)
{ }
XCodecDecoder::~XCodecDecoder()
{ }
/*
* XXX These comments are out-of-date.
*
* Decode an XCodec-encoded stream. Returns false if there was an
* inconsistency, error or unrecoverable condition in the stream.
* Returns true if we were able to process the stream entirely or
* expect to be able to finish processing it once more data arrives.
* The input buffer is cleared of anything we can parse right now.
*
* Since some events later in the stream (e.g. ASK or LEARN) may need
* to be processed before some earlier in the stream (e.g. REF), we
* parse the stream into a list of actions to take, performing them
* as we go if possible, otherwise queueing them to occur until the
* action that is blocking the stream has been satisfied or the stream
* has been closed.
*
* XXX For now we will ASK in every stream where an unknown hash has
* occurred and expect a LEARN in all of them. In the future, it is
* desirable to optimize this. Especially once we start putting an
* instance UUID in the HELLO message and can tell which streams
* share an originator.
*/
bool
XCodecDecoder::decode (Buffer& output, Buffer& input, std::set<uint64_t>& unknown_hashes)
{
uint8_t data[XCODEC_SEGMENT_LENGTH];
Buffer old;
uint64_t behash;
uint64_t hash;
unsigned off;
uint8_t op;
while (! input.empty())
{
if (! input.find (XCODEC_MAGIC, &off))
{
input.moveout (&output);
break;
}
if (off > 0)
{
output.append (input, off);
input.skip (off);
}
ASSERT(log_, !input.empty());
/*
* Need the following byte at least.
*/
if (input.length() == 1)
break;
input.extract (&op, sizeof XCODEC_MAGIC);
switch (op)
{
case XCODEC_OP_ESCAPE:
output.append (XCODEC_MAGIC);
input.skip (sizeof XCODEC_MAGIC + sizeof op);
break;
case XCODEC_OP_EXTRACT:
if (input.length() < sizeof XCODEC_MAGIC + sizeof op + XCODEC_SEGMENT_LENGTH)
return (true);
input.skip (sizeof XCODEC_MAGIC + sizeof op);
input.copyout (data, XCODEC_SEGMENT_LENGTH);
hash = XCodecHash::hash (data);
if (cache_->lookup (hash, old))
{
if (old.equal (data, sizeof data))
{
DEBUG(log_) << "Declaring segment already in cache.";
}
else
{
ERROR(log_) << "Collision in <EXTRACT>.";
return (false);
}
old.clear ();
}
else
cache_->enter (hash, input, 0);
output.append (input, XCODEC_SEGMENT_LENGTH);
input.skip (XCODEC_SEGMENT_LENGTH);
break;
case XCODEC_OP_REF:
if (input.length() < sizeof XCODEC_MAGIC + sizeof op + sizeof behash)
return (true);
input.extract (&behash, sizeof XCODEC_MAGIC + sizeof op);
hash = BigEndian::decode (behash);
if (cache_->lookup (hash, output))
{
input.skip (sizeof XCODEC_MAGIC + sizeof op + sizeof behash);
}
else
{
if (unknown_hashes.find (hash) == unknown_hashes.end())
{
DEBUG(log_) << "Sending <ASK>, waiting for <LEARN>.";
unknown_hashes.insert (hash);
}
else
{
DEBUG(log_) << "Already sent <ASK>, waiting for <LEARN>.";
}
return (true);
}
break;
default:
ERROR(log_) << "Unsupported XCodec opcode " << (unsigned)op << ".";
return (false);
}
}
return (true);
}
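
The op dispatch in decode() can be followed more easily on a toy model. The sketch below is an illustration under loudly stated assumptions, not the project's decoder: it uses std::vector and std::map instead of Buffer and XCodecCache, a 4-byte segment instead of XCODEC_SEGMENT_LENGTH, and FNV-1a as a stand-in for XCodecHash (which is not shown in this part of the commit). The names toy_hash and toy_decode are hypothetical. Unlike the real decoder, it treats truncated input and unknown references as errors instead of waiting for more data or recording the hash to ask for.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

typedef std::vector<uint8_t> Bytes;

static const uint8_t kMagic = 0xf1;      // XCODEC_MAGIC
static const uint8_t kOpEscape = 0x00;   // XCODEC_OP_ESCAPE
static const uint8_t kOpExtract = 0x01;  // XCODEC_OP_EXTRACT
static const uint8_t kOpRef = 0x02;      // XCODEC_OP_REF
static const std::size_t kSegLen = 4;    // assumption: tiny demo segment

// Assumption: FNV-1a stands in for XCodecHash.
uint64_t toy_hash(const uint8_t* p, std::size_t n)
{
	uint64_t h = 14695981039346656037ull;
	for (std::size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 1099511628211ull;
	}
	return h;
}

// Decode a complete stream. Returns false on an unknown op, truncated
// input, a hash collision or an unknown reference.
bool toy_decode(const Bytes& in, std::map<uint64_t, Bytes>& cache, Bytes& out)
{
	std::size_t i = 0;
	while (i < in.size()) {
		if (in[i] != kMagic) {
			out.push_back(in[i++]); // plain data passes through
			continue;
		}
		if (i + 1 >= in.size())
			return false;
		uint8_t op = in[i + 1];
		i += 2;
		if (op == kOpEscape) {
			out.push_back(kMagic);
		} else if (op == kOpExtract) {
			if (in.size() - i < kSegLen)
				return false;
			Bytes seg(in.begin() + i, in.begin() + i + kSegLen);
			uint64_t h = toy_hash(&seg[0], kSegLen);
			std::map<uint64_t, Bytes>::iterator it = cache.find(h);
			if (it != cache.end() && it->second != seg)
				return false; // same hash, different data
			cache[h] = seg;
			out.insert(out.end(), seg.begin(), seg.end());
			i += kSegLen;
		} else if (op == kOpRef) {
			if (in.size() - i < 8)
				return false;
			uint64_t h = 0;
			for (int k = 0; k < 8; k++)
				h = (h << 8) | in[i + k]; // big-endian on the wire
			std::map<uint64_t, Bytes>::iterator it = cache.find(h);
			if (it == cache.end())
				return false;
			out.insert(out.end(), it->second.begin(), it->second.end());
			i += 8;
		} else {
			return false;
		}
	}
	return true;
}
```

Note that the segment following OP_EXTRACT is read raw, without escape processing, which matches how the encoder appends declared segments.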

54
xcodec/xcodec_decoder.h Normal file

@ -0,0 +1,54 @@
/*
* Copyright (c) 2008-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef XCODEC_XCODEC_DECODER_H
#define XCODEC_XCODEC_DECODER_H
#include <set>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_decoder.h //
// Description: decoding routines for the xcodec protocol //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
class XCodecCache;
class XCodecDecoder {
LogHandle log_;
XCodecCache* cache_;
public:
XCodecDecoder(XCodecCache*);
~XCodecDecoder();
bool decode (Buffer&, Buffer&, std::set<uint64_t>&);
};
#endif /* !XCODEC_XCODEC_DECODER_H */

252
xcodec/xcodec_encoder.cc Normal file

@ -0,0 +1,252 @@
/*
* Copyright (c) 2009-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <common/buffer.h>
#include <common/endian.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
#include <xcodec/xcodec_encoder.h>
#include <xcodec/xcodec_hash.h>
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_encoder.cc //
// Description: encoding routines for the xcodec protocol //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
struct candidate_symbol
{
bool set_;
unsigned offset_;
uint64_t symbol_;
};
XCodecEncoder::XCodecEncoder(XCodecCache *cache)
: log_("/xcodec/encoder"),
cache_(cache)
{ }
XCodecEncoder::~XCodecEncoder()
{ }
/*
* This takes a view of a data stream and turns it into a series of references
* to other data, declarations of data to be referenced, and data that needs
* to be escaped.
*/
void
XCodecEncoder::encode (Buffer& output, Buffer& input)
{
XCodecHash xcodec_hash;
candidate_symbol candidate = {0, 0, 0};
unsigned offset = 0;
unsigned o = 0;
Buffer old;
for (Buffer::SegmentIterator it = input.segments (); ! it.end (); it.next ())
{
const BufferSegment* seg = *it;
const uint8_t *p, *q = seg->end ();
for (p = seg->data (); p < q; ++p)
{
/*
* Add bytes to the hash until we have a complete hash.
*/
if (++o < XCODEC_SEGMENT_LENGTH)
xcodec_hash.add (*p);
else
{
if (o == XCODEC_SEGMENT_LENGTH)
xcodec_hash.add (*p);
else
xcodec_hash.roll (*p);
/*
* And then mix the hash's internal state into a
* uint64_t that we can use to refer to that data
* and to look up possible past occurrences of that
* data in the XCodecCache.
*/
uint64_t hash = xcodec_hash.mix ();
/*
* If there is a pending candidate hash that wouldn't
* overlap with the data that the rolling hash presently
* covers, declare it now.
*/
if (candidate.set_ && candidate.offset_ + (XCODEC_SEGMENT_LENGTH * 2) <= offset + o)
{
encode_declaration (output, input, offset, candidate.offset_, candidate.symbol_);
o -= (candidate.offset_ + XCODEC_SEGMENT_LENGTH - offset);
offset = (candidate.offset_ + XCODEC_SEGMENT_LENGTH);
candidate.set_ = false;
}
/*
* Now attempt to encode this hash as a reference if it
* has been defined before.
*/
if (cache_->lookup (hash, old))
{
/*
* This segment already exists. If it's
* identical to this chunk of data, then that's
* positively fantastic.
*/
if (encode_reference (output, input, offset, offset + o - XCODEC_SEGMENT_LENGTH, hash, old))
{
/*
* We have output any data before this hash
* in escaped form, so any candidate hash
* before it is invalid now.
*/
offset += o;
o = 0;
xcodec_hash.reset();
candidate.set_ = false;
}
else
{
/*
* This hash isn't usable because it collides
* with another, so keep looking for something
* viable.
*/
DEBUG(log_) << "Collision in first pass.";
}
old.clear ();
}
else
{
/*
* Not defined before, it's a candidate for declaration
* if we don't already have one.
*/
if (candidate.set_)
{
/*
* We already have a hash that occurs earlier,
* isn't a collision and includes data that's
* covered by this hash, so don't remember it
* and keep going.
*/
ASSERT(log_, candidate.offset_ + (XCODEC_SEGMENT_LENGTH * 2) > offset + o);
}
else
{
/*
* The hash at this offset doesn't collide with any
* other and is the first viable hash we've seen so far
* in the stream, so remember it so that if we don't
* find something to reference we can declare this one
* for future use.
*/
candidate.offset_ = offset + o - XCODEC_SEGMENT_LENGTH;
candidate.symbol_ = hash;
candidate.set_ = true;
}
}
}
}
}
/*
* There's a hash we can declare, do it.
*/
if (candidate.set_)
{
encode_declaration (output, input, offset, candidate.offset_, candidate.symbol_);
o -= (candidate.offset_ + XCODEC_SEGMENT_LENGTH - offset);
offset = (candidate.offset_ + XCODEC_SEGMENT_LENGTH);
candidate.set_ = false;
}
/*
* There's data after that hash or no candidate hash, so
* just escape it.
*/
if (offset < input.length ())
encode_escape (output, input, offset, input.length ());
}
void
XCodecEncoder::encode_declaration (Buffer& output, Buffer& input, unsigned offset, unsigned start, uint64_t hash)
{
if (offset < start)
encode_escape (output, input, offset, start);
cache_->enter (hash, input, start);
output.append (XCODEC_MAGIC);
output.append (XCODEC_OP_EXTRACT);
output.append (input, start, XCODEC_SEGMENT_LENGTH);
}
void
XCodecEncoder::encode_escape (Buffer& output, Buffer& input, unsigned offset, unsigned limit)
{
unsigned pos;
while (offset < limit && input.find (XCODEC_MAGIC, offset, limit - offset, &pos))
{
if (offset < pos)
output.append (input, offset, pos - offset);
output.append (XCODEC_MAGIC);
output.append (XCODEC_OP_ESCAPE);
offset = pos + 1;
}
if (offset < limit)
output.append (input, offset, limit - offset);
}
bool
XCodecEncoder::encode_reference (Buffer& output, Buffer& input, unsigned offset, unsigned start, uint64_t hash, Buffer& old)
{
uint8_t data[XCODEC_SEGMENT_LENGTH];
input.copyout (data, start, XCODEC_SEGMENT_LENGTH);
if (old.equal (data, sizeof data))
{
if (offset < start)
encode_escape (output, input, offset, start);
output.append (XCODEC_MAGIC);
output.append (XCODEC_OP_REF);
uint64_t behash = BigEndian::encode (hash);
output.append (&behash);
return true;
}
return false;
}
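
The encoder's inner loop relies on a hash that can be filled byte by byte (add) and then slid forward one byte at a time (roll). Since XCodecHash itself is not part of this excerpt, the sketch below uses a plain polynomial rolling hash as a stand-in, purely to show why rolling gives the same value as rehashing the window from scratch. The class name ToyRollingHash, the base 257 and the method value() are assumptions made for illustration and say nothing about the real algorithm.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Assumption: arbitrary odd base; arithmetic wraps modulo 2^64.
static const uint64_t kBase = 257;

// Hypothetical polynomial rolling hash over a fixed-size window.
// This is NOT the algorithm used by XCodecHash.
class ToyRollingHash
{
	uint64_t top_;               // kBase^(window - 1)
	uint64_t hash_;
	std::deque<uint8_t> bytes_;  // bytes currently covered by the hash
public:
	explicit ToyRollingHash(std::size_t window)
	: top_(1), hash_(0)
	{
		for (std::size_t i = 1; i < window; i++)
			top_ *= kBase;
	}

	// Append a byte while the window is still filling.
	void add(uint8_t b)
	{
		hash_ = hash_ * kBase + b;
		bytes_.push_back(b);
	}

	// Drop the oldest byte and append a new one in constant time.
	// Precondition: the window is full.
	void roll(uint8_t b)
	{
		hash_ -= top_ * bytes_.front();
		bytes_.pop_front();
		add(b);
	}

	uint64_t value() const { return hash_; }
};
```

With a full window of w bytes, each roll costs a constant amount of work instead of w, which is what makes it practical to test every byte offset of the input for a cached segment.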

57
xcodec/xcodec_encoder.h Normal file

@ -0,0 +1,57 @@
/*
* Copyright (c) 2008-2011 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef XCODEC_XCODEC_ENCODER_H
#define XCODEC_XCODEC_ENCODER_H
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_encoder.h //
// Description: encoding routines for the xcodec protocol //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
class XCodecCache;
class XCodecEncoder
{
LogHandle log_;
XCodecCache* cache_;
public:
XCodecEncoder(XCodecCache*);
~XCodecEncoder();
void encode (Buffer&, Buffer&);
private:
void encode_declaration (Buffer&, Buffer&, unsigned, unsigned, uint64_t);
void encode_escape (Buffer&, Buffer&, unsigned, unsigned);
bool encode_reference (Buffer&, Buffer&, unsigned, unsigned, uint64_t, Buffer&);
};
#endif /* !XCODEC_XCODEC_ENCODER_H */

495
xcodec/xcodec_filter.cc Normal file

@ -0,0 +1,495 @@
/*
* Copyright (c) 2011-2012 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#include <common/buffer.h>
#include <common/endian.h>
#include <programs/wanproxy/wanproxy.h>
#include "xcodec_filter.h"
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_filter.cc //
// Description: instantiation of encoder/decoder in a data filter pair //
// Project: WANProxy XTech //
// Adapted by: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
/*
* Usage:
* <OP_HELLO> length[uint8_t] data[uint8_t x length]
*
* Effects:
* Must appear at the start of and only at the start of an encoded stream.
*
* Side-effects:
* Possibly many.
*/
#define XCODEC_PIPE_OP_HELLO ((uint8_t)0xff)
/*
* Usage:
* <OP_LEARN> data[uint8_t x XCODEC_SEGMENT_LENGTH]
*
* Effects:
* The `data' is hashed, the hash is associated with the data if possible.
*
* Side-effects:
* None.
*/
#define XCODEC_PIPE_OP_LEARN ((uint8_t)0xfe)
/*
* Usage:
* <OP_ASK> hash[uint64_t]
*
* Effects:
* An OP_LEARN will be sent in response with the data corresponding to the
* hash.
*
* If the hash is unknown, an error will be indicated.
*
* Side-effects:
* None.
*/
#define XCODEC_PIPE_OP_ASK ((uint8_t)0xfd)
/*
* Usage:
* <OP_EOS>
*
* Effects:
* Alert the other party that we have no intention of sending more data.
*
* Side-effects:
* The other party will send <OP_EOS_ACK> when it has processed all of
* the data we have sent.
*/
#define XCODEC_PIPE_OP_EOS ((uint8_t)0xfc)
/*
* Usage:
* <OP_EOS_ACK>
*
* Effects:
* Alert the other party that we have no intention of reading more data.
*
* Side-effects:
* The connection will be torn down.
*/
#define XCODEC_PIPE_OP_EOS_ACK ((uint8_t)0xfb)
/*
* Usage:
* <OP_FRAME> length[uint16_t] data[uint8_t x length]
*
* Effects:
* Frames an encoded chunk. The length is sent big-endian and must be
* between 1 and XCODEC_PIPE_MAX_FRAME.
*
* Side-effects:
* None.
*/
#define XCODEC_PIPE_OP_FRAME ((uint8_t)0x00)
#define XCODEC_PIPE_MAX_FRAME (32768)
// Encoding
bool EncodeFilter::consume (Buffer& buf)
{
Buffer output;
Buffer enc;
ASSERT(log_, ! flushing_);
if (! encoder_)
{
if (! cache_ || ! cache_->identifier().is_valid ())
{
ERROR(log_) << "Could not encode UUID for <HELLO>.";
return false;
}
output.append (XCODEC_PIPE_OP_HELLO);
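uint64_t mb = cache_->nominal_size ();
output.append ((uint8_t) (UUID_STRING_SIZE + sizeof mb));
cache_->identifier().encode (output);
// NB: unlike frame lengths and hashes, the cache size is not explicitly
// passed through BigEndian::encode here, and the decoder reads it back
// without an explicit conversion as well.
output.append (&mb);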
if (! (encoder_ = new XCodecEncoder (cache_)))
return false;
}
encoder_->encode (enc, buf);
// Split the encoded data into frames of at most XCODEC_PIPE_MAX_FRAME bytes.
while (! enc.empty ())
{
int n = enc.length ();
if (n > XCODEC_PIPE_MAX_FRAME)
n = XCODEC_PIPE_MAX_FRAME;
Buffer frame;
enc.moveout (&frame, n);
uint16_t len = n;
len = BigEndian::encode (len);
output.append (XCODEC_PIPE_OP_FRAME);
output.append (&len);
output.append (frame);
}
return produce (output);
}
void EncodeFilter::flush (int flg)
{
if (flg == XCODEC_PIPE_OP_EOS_ACK)
eos_ack_ = true;
else
{
flushing_ = true;
flush_flags_ |= flg;
if (! sent_eos_)
{
Buffer output;
output.append (XCODEC_PIPE_OP_EOS);
sent_eos_ = produce (output);
}
}
if (flushing_ && eos_ack_)
Filter::flush (flush_flags_);
}
// Decoding
bool DecodeFilter::consume (Buffer& buf)
{
if (! upstream_)
{
ERROR(log_) << "Decoder not configured";
return false;
}
pending_.append (buf);
while (! pending_.empty ())
{
uint8_t op = pending_.peek ();
switch (op)
{
case XCODEC_PIPE_OP_HELLO:
if (decoder_cache_)
{
ERROR(log_) << "Got <HELLO> twice.";
return false;
}
else
{
uint8_t len;
if (pending_.length() < sizeof op + sizeof len)
return true;
pending_.extract (&len, sizeof op);
if (pending_.length() < sizeof op + sizeof len + len)
return true;
uint64_t mb;
if (len != UUID_STRING_SIZE + sizeof mb)
{
ERROR(log_) << "Unsupported <HELLO> length: " << (unsigned)len;
return false;
}
UUID uuid;
pending_.skip (sizeof op + sizeof len);
if (! uuid.decode (pending_))
{
ERROR(log_) << "Invalid UUID in <HELLO>.";
return false;
}
pending_.extract (&mb);
pending_.skip (sizeof mb);
if (! (decoder_cache_ = wanproxy.find_cache (uuid)))
decoder_cache_ = wanproxy.add_cache (uuid, mb);
ASSERT(log_, decoder_ == NULL);
if (decoder_cache_)
decoder_ = new XCodecDecoder (decoder_cache_);
DEBUG(log_) << "Peer connected with UUID: " << uuid;
}
break;
case XCODEC_PIPE_OP_ASK:
if (! encoder_cache_)
{
ERROR(log_) << "Got <ASK> but no encoder cache is configured.";
return false;
}
else
{
uint64_t hash;
if (pending_.length() < sizeof op + sizeof hash)
return true;
pending_.skip (sizeof op);
pending_.moveout (&hash);
hash = BigEndian::decode (hash);
Buffer learn;
learn.append (XCODEC_PIPE_OP_LEARN);
if (encoder_cache_->lookup (hash, learn))
{
DEBUG(log_) << "Responding to <ASK> with <LEARN>.";
if (! upstream_->produce (learn))
return false;
}
else
{
ERROR(log_) << "Unknown hash in <ASK>: " << hash;
return false;
}
}
break;
case XCODEC_PIPE_OP_LEARN:
if (! decoder_cache_)
{
ERROR(log_) << "Got <LEARN> before <HELLO>.";
return false;
}
else
{
if (pending_.length() < sizeof op + XCODEC_SEGMENT_LENGTH)
return true;
pending_.skip (sizeof op);
uint8_t data[XCODEC_SEGMENT_LENGTH];
pending_.copyout (data, XCODEC_SEGMENT_LENGTH);
uint64_t hash = XCodecHash::hash (data);
if (unknown_hashes_.find (hash) == unknown_hashes_.end ())
INFO(log_) << "Gratuitous <LEARN> without <ASK>.";
else
unknown_hashes_.erase (hash);
Buffer old;
if (decoder_cache_->lookup (hash, old))
{
if (old.equal (data, sizeof data))
{
DEBUG(log_) << "Redundant <LEARN>.";
}
else
{
ERROR(log_) << "Collision in <LEARN>.";
return false;
}
old.clear ();
}
else
{
DEBUG(log_) << "Successful <LEARN>.";
decoder_cache_->enter (hash, pending_, 0);
}
pending_.skip (XCODEC_SEGMENT_LENGTH);
}
break;
case XCODEC_PIPE_OP_EOS:
if (received_eos_)
{
ERROR(log_) << "Duplicate <EOS>.";
return false;
}
pending_.skip (sizeof op);
received_eos_ = true;
break;
case XCODEC_PIPE_OP_EOS_ACK:
if (received_eos_ack_)
{
ERROR(log_) << "Duplicate <EOS_ACK>.";
return false;
}
pending_.skip (sizeof op);
received_eos_ack_ = true;
break;
case XCODEC_PIPE_OP_FRAME:
if (! decoder_)
{
ERROR(log_) << "Got frame data before decoder initialized.";
return false;
}
else
{
uint16_t len;
if (pending_.length() < sizeof op + sizeof len)
return true;
pending_.extract (&len, sizeof op);
len = BigEndian::decode (len);
if (len == 0 || len > XCODEC_PIPE_MAX_FRAME)
{
ERROR(log_) << "Invalid framed data length.";
return false;
}
if (pending_.length() < sizeof op + sizeof len + len)
return true;
pending_.moveout (&frame_buffer_, sizeof op + sizeof len, len);
}
break;
default:
ERROR(log_) << "Unsupported operation in pipe stream.";
return false;
}
if (frame_buffer_.empty ())
continue;
if (! unknown_hashes_.empty ())
{
DEBUG(log_) << "Waiting for unknown hashes to continue processing data.";
continue;
}
Buffer output;
if (! decoder_->decode (output, frame_buffer_, unknown_hashes_))
{
ERROR(log_) << "Decoder exiting with error.";
return false;
}
if (! output.empty ())
{
ASSERT(log_, ! flushing_);
if (! produce (output))
return false;
}
else
{
/*
* We should only get no output from the decoder if
* we're waiting on the next frame or we need an
* unknown hash. It would be nice to make the
* encoder framing aware so that it would not end
* up with encoded data that straddles a frame
* boundary. (Fixing that would also allow us to
* simplify length checking within the decoder
* considerably.)
*/
ASSERT(log_, !frame_buffer_.empty() || !unknown_hashes_.empty());
}
Buffer ask;
std::set<uint64_t>::const_iterator it;
for (it = unknown_hashes_.begin(); it != unknown_hashes_.end(); ++it)
{
uint64_t hash = *it;
hash = BigEndian::encode (hash);
ask.append (XCODEC_PIPE_OP_ASK);
ask.append (&hash);
}
if (! ask.empty ())
{
DEBUG(log_) << "Sending <ASK>s.";
if (! upstream_->produce (ask))
return false;
}
}
if (received_eos_ && ! sent_eos_ack_ && frame_buffer_.empty ())
{
DEBUG(log_) << "Decoder received <EOS>, sending <EOS_ACK>.";
Buffer eos_ack;
eos_ack.append (XCODEC_PIPE_OP_EOS_ACK);
sent_eos_ack_ = true;
if (! upstream_->produce (eos_ack))
return false;
}
/*
* If we have received <EOS> and not yet passed it on, we can do so now.
* The only caveat is that if we have outstanding <ASK>s, i.e. we have
* not yet emptied unknown_hashes_, then we can't send <EOS> yet.
*/
if (received_eos_ && ! flushing_)
{
if (unknown_hashes_.empty ())
{
if (! frame_buffer_.empty ())
return false;
DEBUG(log_) << "Decoder received <EOS>, shutting down decoder output channel.";
flushing_ = true;
Filter::flush (0);
}
else
{
if (frame_buffer_.empty ())
return false;
DEBUG(log_) << "Decoder waiting to send <EOS> until <ASK>s are answered.";
}
}
/*
* NB:
* This is related to the comment above. If we ever use some kind of
* hierarchical decoding, we will need to handle the case where the
* response to an <ASK> requires us to send another <ASK>, or something
* of that sort. There are other conditions where we may still need to
* send something out of the encoder, but thankfully none seem to arise
* yet.
*/
if (sent_eos_ack_ && received_eos_ack_ && ! upflushed_)
{
ASSERT(log_, pending_.empty());
ASSERT(log_, frame_buffer_.empty());
DEBUG(log_) << "Decoder finished, got <EOS_ACK>, shutting down encoder output channel.";
upflushed_ = true;
upstream_->flush (XCODEC_PIPE_OP_EOS_ACK);
}
return true;
}
void DecodeFilter::flush (int flg)
{
flushing_ = true;
flush_flags_ |= flg;
if (! pending_.empty ())
DEBUG(log_) << "Flushing decoder with data outstanding.";
if (! frame_buffer_.empty ())
DEBUG(log_) << "Flushing decoder with frame data outstanding.";
if (! upflushed_ && upstream_)
upstream_->flush (XCODEC_PIPE_OP_EOS_ACK);
Filter::flush (flush_flags_);
}
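
The framing performed by EncodeFilter::consume and checked by DecodeFilter::consume can be sketched outside WANProxy as follows. This is a minimal illustration only, not the project's implementation: it uses std::vector<uint8_t> in place of the Buffer class, and the names frame_stream, parse_frame, ParseResult, kOpFrame and kMaxFrame are hypothetical. Only the opcode value (0x00), the big-endian 16-bit length and the 32768-byte limit are taken from the source above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values taken from the defines in xcodec_filter.cc; the names are illustrative.
static const uint8_t kOpFrame = 0x00;
static const size_t kMaxFrame = 32768;

enum ParseResult { kNeedMore, kFrameOk, kBadFrame };

// Split `enc` into frame records: opcode, big-endian uint16 length, payload.
std::vector<uint8_t> frame_stream(const std::vector<uint8_t>& enc)
{
    std::vector<uint8_t> out;
    size_t pos = 0;
    while (pos < enc.size()) {
        size_t n = enc.size() - pos;
        if (n > kMaxFrame)
            n = kMaxFrame;
        assert(n <= 0xffff);                            // must fit the 16-bit length field
        out.push_back(kOpFrame);
        out.push_back(static_cast<uint8_t>(n >> 8));    // high byte first
        out.push_back(static_cast<uint8_t>(n & 0xff));  // then low byte
        out.insert(out.end(), enc.begin() + pos, enc.begin() + pos + n);
        pos += n;
    }
    return out;
}

// Try to read one frame starting at `pos` (which must not exceed in.size()).
// On kFrameOk the payload is appended to `payload` and `pos` is advanced.
// kNeedMore means the record is incomplete and more input is required.
ParseResult parse_frame(const std::vector<uint8_t>& in, size_t& pos,
                        std::vector<uint8_t>& payload)
{
    const size_t header = 3;                            // opcode + 2 length bytes
    if (in.size() - pos < header)
        return kNeedMore;
    if (in[pos] != kOpFrame)
        return kBadFrame;
    size_t len = (static_cast<size_t>(in[pos + 1]) << 8) | in[pos + 2];
    if (len == 0 || len > kMaxFrame)
        return kBadFrame;                               // same bounds the decoder enforces
    if (in.size() - pos < header + len)
        return kNeedMore;
    payload.insert(payload.end(), in.begin() + pos + header,
                   in.begin() + pos + header + len);
    pos += header + len;
    return kFrameOk;
}
```

Returning kNeedMore instead of failing corresponds to the `return true` paths in the decoder, which leave the partial record in pending_ until more bytes arrive.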

74
xcodec/xcodec_filter.h Normal file

@ -0,0 +1,74 @@
////////////////////////////////////////////////////////////////////////////////
// //
// File: xcodec_filter.h //
// Description: instantiation of encoder/decoder in a data filter pair //
// Project: WANProxy XTech //
// Author: Andreu Vidal Bramfeld-Software //
// Last modified: 2015-08-31 //
// //
////////////////////////////////////////////////////////////////////////////////
#ifndef XCODEC_FILTER_H
#define XCODEC_FILTER_H
#include <set>
#include <common/filter.h>
#include <xcodec/xcodec.h>
#include <xcodec/xcodec_cache.h>
#include <xcodec/xcodec_hash.h>
#include <xcodec/xcodec_encoder.h>
#include <xcodec/xcodec_decoder.h>
class EncodeFilter : public BufferedFilter
{
private:
XCodecCache* cache_;
XCodecEncoder* encoder_;
bool sent_eos_;
bool eos_ack_;
public:
EncodeFilter (const LogHandle& log, XCodecCache* cc) : BufferedFilter (log)
{
cache_ = cc; encoder_ = 0; sent_eos_ = eos_ack_ = false;
}
virtual ~EncodeFilter ()
{
delete encoder_;
}
virtual bool consume (Buffer& buf);
virtual void flush (int flg);
};
class DecodeFilter : public LogisticFilter
{
private:
XCodecCache* encoder_cache_;
XCodecDecoder* decoder_;
XCodecCache* decoder_cache_;
std::set<uint64_t> unknown_hashes_;
Buffer frame_buffer_;
bool received_eos_;
bool sent_eos_ack_;
bool received_eos_ack_;
bool upflushed_;
public:
DecodeFilter (const LogHandle& log, XCodecCache* cc) : LogisticFilter (log)
{
encoder_cache_ = cc; decoder_ = 0; decoder_cache_ = 0;
received_eos_ = sent_eos_ack_ = received_eos_ack_ = upflushed_ = false;
}
~DecodeFilter ()
{
delete decoder_;
}
virtual bool consume (Buffer& buf);
virtual void flush (int flg);
};
#endif /* !XCODEC_FILTER_H */
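
The four shutdown flags declared in DecodeFilter (received_eos_, sent_eos_ack_, received_eos_ack_, upflushed_) drive the <EOS>/<EOS_ACK> handshake in DecodeFilter::consume. The sketch below models only that flag logic as a standalone state machine. It is a simplified reading of the code above, not the filter itself: ShutdownState, Action and step are hypothetical names, and the real filter also handles pending data, outstanding <ASK>s and the downstream flush.

```cpp
#include <cassert>

// Standalone model of the DecodeFilter shutdown flags (names are illustrative).
struct ShutdownState {
    bool received_eos;      // peer sent <EOS>
    bool sent_eos_ack;      // we answered with <EOS_ACK>
    bool received_eos_ack;  // peer acknowledged our own <EOS>
    bool upflushed;         // upstream encoder was told to finish
    ShutdownState()
        : received_eos(false), sent_eos_ack(false),
          received_eos_ack(false), upflushed(false) { }
};

// Actions the filter would take; values can be combined as a bitmask.
enum Action { kNone = 0, kSendEosAck = 1, kFlushUpstream = 2 };

// Evaluate the two handshake conditions once, as the end of consume() does.
int step(ShutdownState& s, bool frame_buffer_empty)
{
    int actions = kNone;
    // <EOS_ACK> is sent once, and only after all framed data was processed.
    if (s.received_eos && !s.sent_eos_ack && frame_buffer_empty) {
        s.sent_eos_ack = true;
        actions |= kSendEosAck;
    }
    // Both directions are acknowledged: the upstream side can be flushed.
    if (s.sent_eos_ack && s.received_eos_ack && !s.upflushed) {
        s.upflushed = true;
        actions |= kFlushUpstream;
    }
    return actions;
}
```

Each action fires at most once because its guard flag is set in the same step, which is also why the real code can evaluate these conditions on every call to consume.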

177
xcodec/xcodec_hash.h Normal file

@ -0,0 +1,177 @@
/*
* Copyright (c) 2008-2012 Juli Mallett. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
#ifndef XCODEC_XCODEC_HASH_H
#define XCODEC_XCODEC_HASH_H
#include <stdint.h> /* uint8_t, uint32_t, uint64_t */
#include <strings.h> /* ffs() */
class XCodecHash {
struct RollingHash {
uint32_t sum1_; /* Really <16-bit. */
uint32_t sum2_; /* Really <32-bit. */
uint32_t buffer_[XCODEC_SEGMENT_LENGTH]; /* Really >8-bit. */
RollingHash(void)
: sum1_(0),
sum2_(0),
buffer_()
{ }
void add(uint32_t ch, unsigned start)
{
buffer_[start] = ch;
sum1_ += ch;
sum2_ += sum1_;
}
void reset(void)
{
sum1_ = 0;
sum2_ = 0;
}
void roll(uint32_t ch, unsigned start)
{
uint32_t dead;
dead = buffer_[start];
sum1_ -= dead;
sum2_ -= dead * XCODEC_SEGMENT_LENGTH;
buffer_[start] = ch;
sum1_ += ch;
sum2_ += sum1_;
}
};
RollingHash bytes_;
RollingHash bits_;
unsigned start_;
#ifndef NDEBUG
unsigned length_;
#endif
public:
XCodecHash(void)
: bytes_(),
bits_(),
start_(0)
#ifndef NDEBUG
, length_(0)
#endif
{ }
~XCodecHash()
{ }
void add(uint8_t ch)
{
unsigned bit = ffs(ch);
unsigned word = (unsigned)ch + 1;
#ifndef NDEBUG
ASSERT("/xcodec/hash", length_ < XCODEC_SEGMENT_LENGTH);
#endif
bytes_.add(word, start_);
bits_.add(bit, start_);
#ifndef NDEBUG
length_++;
#endif
start_ = (start_ + 1) % XCODEC_SEGMENT_LENGTH;
}
void reset(void)
{
bytes_.reset();
bits_.reset();
#ifndef NDEBUG
length_ = 0;
#endif
start_ = 0;
}
void roll(uint8_t ch)
{
unsigned bit = ffs(ch);
unsigned word = (unsigned)ch + 1;
#ifndef NDEBUG
ASSERT("/xcodec/hash", length_ == XCODEC_SEGMENT_LENGTH);
#endif
bytes_.roll(word, start_);
bits_.roll(bit, start_);
start_ = (start_ + 1) % XCODEC_SEGMENT_LENGTH;
}
/*
* XXX
* Need to write a compression function for this; get rid of the
* completely non-entropic bits, anyway, and try to mix the others.
*
* Need to look at what bits can even possibly be set in sum2_,
* looking at all possible ranges resulting from:
* 128*data[0] + 127*data[1] + ... 1*data[127]
*
* It seems like since each data[] must have at least 1 bit set,
* there are a great many impossible values, and there is a large
* minimal value. Should be easy to compress, and would rather
* have the extra bits changing from the normalized (i.e. non-zero)
* input and have to compress than have no bits changing...but
* perhaps those are extensionally-equivalent and that's just
* hocus pocus computer science. Need to think more clearly and
* fully about it.
*/
uint64_t mix(void) const
{
#ifndef NDEBUG
ASSERT("/xcodec/hash", length_ == XCODEC_SEGMENT_LENGTH);
#endif
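/*
* NB: sum1_ is a uint32_t, so the two shifts of sum1_ below are
* evaluated in 32-bit arithmetic before being widened. Changing that
* would change the resulting hash values, and both ends of a
* connection must agree on them.
*/
uint64_t bits_hash = (bits_.sum1_ << 16) + bits_.sum2_;
uint64_t bytes_hash = (bytes_.sum1_ << 20) + bytes_.sum2_;
return ((bits_hash << 36) + bytes_hash);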
}
static uint64_t hash(const uint8_t *data)
{
XCodecHash xchash;
unsigned i;
for (i = 0; i < XCODEC_SEGMENT_LENGTH; i++)
xchash.add(*data++);
return (xchash.mix());
}
};
#endif /* !XCODEC_XCODEC_HASH_H */
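
The key property of RollingHash is that roll() yields the same sums as recomputing them from scratch over the shifted window. The standalone sketch below checks this for the byte sums only (the `ch + 1` input used for bytes_), leaving out the ffs-based bits_ sums and mix(). It assumes XCODEC_SEGMENT_LENGTH is 128, which is not shown in this file. Rolling, from_scratch, rolling_matches and kSegment are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static const unsigned kSegment = 128;  // assumed value of XCODEC_SEGMENT_LENGTH

// Same arithmetic as RollingHash above, with the window position kept inside.
struct Rolling {
    uint32_t sum1, sum2;
    uint32_t window[kSegment];
    unsigned start;
    Rolling() : sum1(0), sum2(0), window(), start(0) { }
    void add(uint32_t v)
    {
        window[start] = v;
        sum1 += v;
        sum2 += sum1;
        start = (start + 1) % kSegment;
    }
    void roll(uint32_t v)
    {
        uint32_t dead = window[start];  // oldest value, about to leave the window
        sum1 -= dead;
        sum2 -= dead * kSegment;        // it was counted kSegment times in sum2
        window[start] = v;
        sum1 += v;
        sum2 += sum1;
        start = (start + 1) % kSegment;
    }
};

// Sums over data[off .. off + kSegment) computed without rolling.
Rolling from_scratch(const std::vector<uint8_t>& data, size_t off)
{
    assert(off + kSegment <= data.size());
    Rolling r;
    for (unsigned i = 0; i < kSegment; i++)
        r.add(data[off + i] + 1u);      // same `ch + 1` normalisation as XCodecHash
    return r;
}

// True if rolling agrees with recomputation at every window offset.
bool rolling_matches(const std::vector<uint8_t>& data)
{
    if (data.size() < kSegment)
        return true;
    Rolling r = from_scratch(data, 0);
    for (size_t off = 1; off + kSegment <= data.size(); off++) {
        r.roll(data[off + kSegment - 1] + 1u);
        Rolling fresh = from_scratch(data, off);
        if (r.sum1 != fresh.sum1 || r.sum2 != fresh.sum2)
            return false;
    }
    return true;
}
```

The check works because after kSegment calls to add(), the oldest value has been accumulated into sum2 exactly kSegment times, which is the amount roll() subtracts before adding the new value.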