Analytical Model

This repository contains Python scripts that implement an analytical model of distributed-memory \(k\)-mer counting.

The default scripts will reproduce plots from our conference paper, based on hardware parameters and experiments performed using the Phoenix supercomputer at Georgia Tech.

DAKC GitHub Repository

Description

kcount.py: Function definitions for the analytical model of distributed-memory k-mer counting.
memory.py: Separate model that shows the memory overhead of multi-layered message aggregation.
params.py: Machine parameters used by the analytical model. Edit this file with the parameters of your target machine.
experiments.py: Input and experimental values of k-mer counting for different synthetic datasets, as observed on the Phoenix machine. Edit this file with the inputs and experimental results from your target machine.
defaultplot.py: Default plotting options.
cachepred.py: Analytically predicts the L3 cache misses for inputs mentioned in experiments.py and compares them against experimental results.
hwresource.py: Analytically predicts what percentage of k-mer counting time is spent on "memory access", "communication", and "computation".
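
To illustrate the kind of breakdown that hwresource.py reports, the sketch below converts three time estimates into percentages. It is not taken from the repository: the function name percentage_breakdown and the input values are hypothetical, and it assumes the three components are additive and do not overlap, which may differ from how the actual model combines them.

```python
def percentage_breakdown(memory_access, communication, computation):
    """Return each component's share of the total time, in percent."""
    parts = {
        "memory access": memory_access,
        "communication": communication,
        "computation": computation,
    }
    # Assumption: the three components simply add up to the total time.
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {name: 100.0 * value / total for name, value in parts.items()}


# Made-up times in seconds, not measurements from the paper.
shares = percentage_breakdown(2.0, 1.0, 1.0)
for name, percent in shares.items():
    print(f"{name}: {percent:.1f}%")
```

If the real model overlaps communication with computation, the shares would need to be computed from the overlapped total instead.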

How to execute

Edit params.py and experiments.py as required, then run python <scriptname>.py. The script produces a figure inside the figures directory.
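
The steps above can also be driven from Python. The following is a minimal sketch, not part of the repository: the helper names script_command and run_script are hypothetical, and it assumes it is run from the repository root and that the output directory is named figures.

```python
import subprocess
import sys
from pathlib import Path


def script_command(script_name):
    """Build the command equivalent to `python <scriptname>.py`."""
    # Accept either "cachepred" or "cachepred.py".
    if not script_name.endswith(".py"):
        script_name += ".py"
    # sys.executable is the interpreter running this helper.
    return [sys.executable, script_name]


def run_script(script_name, figures_dir="figures"):
    """Run one script, then return the sorted file names in figures_dir."""
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(script_command(script_name), check=True)
    figures = Path(figures_dir)
    if not figures.is_dir():
        return []
    return sorted(p.name for p in figures.iterdir() if p.is_file())
```

For example, calling run_script("cachepred") from the repository root would run cachepred.py and return the names of the files currently in the figures directory.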