I’ve got a process that analyzes data frames as they arrive over a network connection. Each frame is about 128 bytes, and processing one takes about 1.8us. While that might seem plenty fast, the sheer number of frames is large enough that I want to evaluate multi-threading the work.
The code below is my best attempt to simulate what’s going on, using stand-ins for the real thing (random bytes for the frames, a sleep for the processing).
- The linear function shows how I’ve processed the data without any threading.
- The tokio function shows my really poor attempt to multi-thread it. I’ve added elements as I hit various compiler errors, but it seems far more complex than any of the examples I see in the Rust or Tokio documentation.
The Cargo.toml file is:
[package]
name = "codereview"
version = "0.1.0"
edition = "2021"
[dependencies]
tokio = {version = "1.38.0", features = ["full"] }
rand = "0.8.5"
The main.rs file is:
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;
use std::time::Duration;
use tokio::{task, time::{Instant}};
use Vec;
use tokio::sync::mpsc;
use rand::Rng;

fn fill_vec(count: u8) -> Vec<u8> {
    let mut rng = rand::thread_rng();
    (0..count).map(|_| rng.gen()).collect()
}

// simulate the actual process, which takes about 2us.
fn process_data(data: &Vec<u8>) -> bool {
    thread::sleep(Duration::from_micros(2));
    return true;
}

fn do_linear_process(data: &Vec<Vec<u8>>, max_iterations:u32) {
    let mut iteration = 0;
    let iteration_start = Instant::now();
    let mut elements = 0;
    loop {
        iteration += 1;
        if iteration > max_iterations {
            break;
        }
        // simply classify the frame
        let data = data[iteration as usize % 100].clone();
        let val = process_data(&data);
        if val {
            elements += 1;
        }
    }
    let iteration_elapsed = iteration_start.elapsed();
    let time_elements = iteration_elapsed.as_micros() / elements;
    println!("Iterated {} elements in {:?}us at {} us/element", elements, iteration_elapsed.as_micros(), time_elements);
}

fn do_tokio_mpsc(data: &Vec<Vec<u8>>, max_iterations:u32) {
    println!("---- Repeating with tokio mpsc threading ----");
    // now we'll repeat the last part with some threading action...
    let mut iteration = 0;
    let iteration_start = Instant::now();
    let elements = Arc::new(AtomicU64::new(0));
    let (data_tx, mut data_rx) = mpsc::channel::<bool>(2000);;
    let (frame_tx, mut frame_rx) = mpsc::channel::<Vec<u8>>(2000);
    // fully decode the frame
    tokio::spawn(async move {
        while let Some(frame) = frame_rx.recv().await {
            if let was_successful = process_data(&frame) {
                data_tx.send(was_successful).await;
            }
        }
    });
    let elements_clone = elements.clone();
    let mut process = || async move {
        while let Some(_) = data_rx.recv().await {
            elements_clone.fetch_add(1, Ordering::SeqCst);
        }
        println!("Finished receiving {} elements.", elements_clone.load(Ordering::SeqCst));
    };
    let handle = task::spawn(process());
    loop {
        iteration += 1;
        if iteration > max_iterations {
            drop(frame_tx);
            break;
        }
        let data = data[iteration as usize% 100].clone();
        frame_tx.send(data);
    }
    tokio::spawn(async move {
        match handle.await {
            Ok(_) => println!("Task completed successfully."),
            Err(e) => println!("Task failed: {:?}", e),
        }
    });
    let iteration_elapsed = iteration_start.elapsed();
    if elements.load(Ordering::SeqCst) != 0 {
        let time_elements = iteration_elapsed.as_micros() / elements.load(Ordering::SeqCst) as u128;
        println!("Iterated {} elements in {:?}us at {} us/element", elements.load(Ordering::SeqCst), iteration_elapsed.as_micros(), time_elements);
    } else {
        println!("Elements is 0. Time to complete was: {}us", iteration_elapsed.as_micros());
    }
}

#[tokio::main]
async fn main() {
    // create random data to simulate actual data
    let mut frame_data:Vec<Vec<u8>> = Vec::new();
    for _ in 0..100 {
        frame_data.insert(0,fill_vec(128));
    }
    // maximum iterations for the demo...
    let max_iterations = 1_000_000;
    // process the data in a single, linear thread
    do_linear_process(&frame_data, max_iterations);
    do_tokio_mpsc(&frame_data, max_iterations);
}
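A caveat on the simulation itself: `thread::sleep` only guarantees a minimum pause, and the real pause depends on the OS scheduler, so a 2us request can take noticeably longer and skew the us/element figures. A spin-wait is one way to approximate a fixed CPU cost more closely. This is a minimal sketch using only the standard library; `busy_wait` is a hypothetical helper name, not part of the code above.

```rust
use std::hint;
use std::time::{Duration, Instant};

// Hypothetical helper: burn CPU for roughly `duration` instead of sleeping.
fn busy_wait(duration: Duration) {
    let start = Instant::now();
    while start.elapsed() < duration {
        // Tells the CPU we are spinning; does not yield to the scheduler.
        hint::spin_loop();
    }
}

fn main() {
    let start = Instant::now();
    for _ in 0..1_000 {
        busy_wait(Duration::from_micros(2));
    }
    println!("1000 simulated frames took {:?}", start.elapsed());
}
```

Swapping something like this into `process_data` would keep a core busy for the whole 2us, which is probably closer to what the real classifier does than a sleep.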
The more I work on it, the worse it gets. Here are the issues that I’m aware of:
- I don’t think the data is actually being sent to the thread because I’m not calling await on it. (compiler warning)
- I can’t call await because it says it’s not async (although it’s being called from async main).
- Is it really that hard to have a variable that shares a simple state like “count” where I’m just trying to make sure I counted the number of data elements processed?
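For reference on the last two points: `.await` is only allowed inside an `async fn` or `async` block, and `do_tokio_mpsc` is declared as a plain `fn`. The shared count itself can be small. The sketch below is one possible shape, not a definitive fix. It assumes the work is CPU-bound and so uses plain OS threads and `std::sync::mpsc` instead of Tokio. `count_processed` and the worker count are made-up names and values for illustration, and `process_data` is a trivial stand-in.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Stand-in for the real per-frame work (assumption: non-empty frames succeed).
fn process_data(data: &[u8]) -> bool {
    !data.is_empty()
}

// Push `frames` through `workers` OS threads and return how many succeeded.
fn count_processed(frames: Vec<Vec<u8>>, workers: usize) -> u64 {
    let processed = Arc::new(AtomicU64::new(0));
    let (frame_tx, frame_rx) = mpsc::channel::<Vec<u8>>();
    // std's Receiver is single-consumer, so the workers share it behind a Mutex.
    let frame_rx = Arc::new(Mutex::new(frame_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let frame_rx = Arc::clone(&frame_rx);
            let processed = Arc::clone(&processed);
            thread::spawn(move || loop {
                // The lock is released at the end of this statement,
                // before the frame is processed.
                let frame = match frame_rx.lock().unwrap().recv() {
                    Ok(frame) => frame,
                    Err(_) => break, // sender dropped and queue drained
                };
                if process_data(&frame) {
                    processed.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for frame in frames {
        frame_tx.send(frame).expect("receiver dropped");
    }
    drop(frame_tx); // closing the channel lets the workers exit their loops

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    // join() orders the workers' increments before this load.
    processed.load(Ordering::Relaxed)
}

fn main() {
    let frames = vec![vec![0u8; 128]; 1_000];
    println!("processed {} frames", count_processed(frames, 4));
}
```

The counter is just an `Arc<AtomicU64>` cloned into each worker. Joining the threads before reading it is what makes the final value reliable, which is the step the tokio version skips when it reads `elements` without waiting for the counting task. Note that the channel here is unbounded, so a real feed would likely want `mpsc::sync_channel` or some other form of backpressure.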
Any help or suggestions to get me back on track would be greatly appreciated.