Designed a Backend System
Created a product prototype
Last week I finally found some "downtime" in between projects and wanted to play with Rust 🦀 a bit. I also had a problem on hand that it could work well for, so I was in luck.

There's not really time to start a new project in the downtime between a few meetings, right? So the gist of the goal was a simple client process to collect basic server info and push it to another service I'll build at another time.

So far it's just a prototype, but eventually the goal is to provide it as an internal product/solution for our support teams. The part I wanted to highlight, though, is the data collection and reporting efficiency.

Given that this tool needs to run on customer systems, low bandwidth usage is a must. With that in mind, I've implemented a naive but effective data deduplication method.

So when the agent is installed, two important config values are set: the interval at which to collect data (in seconds) and the interval at which to push data (measured in data collection cycles).
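
For a rough idea, here's what that config could look like in Rust (the field names are my own, not the actual agent's):

```rust
/// Agent configuration, as a sketch. Field names are hypothetical.
struct AgentConfig {
    /// How often to collect server info, in seconds.
    collect_interval_secs: u64,
    /// How many collection cycles to buffer before pushing.
    push_every_n_collections: u32,
}
```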

Meaning you could set the collection interval to 15 seconds and the push interval to 4, and the result would be data collected every 15 seconds but pushed only every 60 seconds. During the push event it would send data for all 4 collection events (e.g. 15s, 30s, 45s, 60s).
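
Here's a minimal sketch of that loop, using only the standard library; `collect` and `push` are stand-ins for the real collection and network code:

```rust
use std::thread;
use std::time::Duration;

// Stand-in for gathering basic server info.
fn collect() -> String {
    "…server info…".to_string()
}

// Stand-in for sending a batch to the collection service.
fn push(batch: &[String]) {
    println!("pushing {} collection events", batch.len());
}

fn main() {
    let collect_interval = Duration::from_secs(15); // collect every 15s
    let push_every = 4; // push once per 4 collection cycles

    let mut batch = Vec::new();
    loop {
        thread::sleep(collect_interval);
        batch.push(collect());
        // After 4 cycles (60s), send everything in one request.
        if batch.len() >= push_every {
            push(&batch);
            batch.clear();
        }
    }
}
```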

This type of system keeps things flexible so that the tool isn't constantly piping data to the collection server. However, it leaves something to be desired: not every data collection event will contain unique info, after all.

In other words, capturing changes is important, but changes don't happen often. So the solution was to deduplicate events into a structure like:
```
{
	"initial_data": (the first event payload),
	"collect_events": [
		(a subsequent event payload),
		(a subsequent event payload),
		(a subsequent event payload)
	]
}
```
However, the `collect_events` payloads are stored as diffs against the data in `initial_data`, so only the fields that have changed since the first event are included!
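
A naive version of that diff is easy to picture if you treat a payload as flat key/value pairs (a simplification on my part; the real payloads have more structure):

```rust
use std::collections::HashMap;

type Payload = HashMap<String, String>;

/// Keep only the fields whose values differ from the initial payload.
fn diff_against_initial(initial: &Payload, current: &Payload) -> Payload {
    current
        .iter()
        .filter(|(key, value)| initial.get(*key) != Some(*value))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect()
}

fn main() {
    let initial = Payload::from([
        ("hostname".into(), "web-01".into()),
        ("load_avg".into(), "0.40".into()),
    ]);
    let later = Payload::from([
        ("hostname".into(), "web-01".into()),
        ("load_avg".into(), "1.70".into()),
    ]);
    // Only "load_avg" changed, so only it would be pushed for this event.
    let diff = diff_against_initial(&initial, &later);
    assert_eq!(diff.len(), 1);
    assert_eq!(diff.get("load_avg").map(String::as_str), Some("1.70"));
}
```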

Meaning we can still act on the stream of events as if we had collected them in real time, when really we're only collecting and processing them in near-real time.
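
And on the receiving side, rebuilding the full stream is just the inverse: merge each diff onto a copy of `initial_data`. Again a sketch, under the same flat key/value assumption:

```rust
use std::collections::HashMap;

type Payload = HashMap<String, String>;

/// Rebuild full events from the batched payload: each diff is merged
/// onto a fresh copy of the initial data, since diffs are relative to
/// `initial_data` rather than to the previous event.
fn rehydrate(initial: &Payload, diffs: &[Payload]) -> Vec<Payload> {
    let mut events = vec![initial.clone()];
    for diff in diffs {
        let mut event = initial.clone();
        event.extend(diff.iter().map(|(k, v)| (k.clone(), v.clone())));
        events.push(event);
    }
    events
}

fn main() {
    let initial = Payload::from([("load_avg".into(), "0.40".into())]);
    let diffs = vec![
        Payload::from([("load_avg".into(), "1.70".into())]),
        Payload::new(), // identical to initial_data this cycle
    ];
    let events = rehydrate(&initial, &diffs);
    // One full event per collection cycle, including the initial one.
    assert_eq!(events.len(), 3);
    assert_eq!(events[1].get("load_avg").map(String::as_str), Some("1.70"));
    assert_eq!(events[2].get("load_avg").map(String::as_str), Some("0.40"));
}
```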