baijia - papers and notes

Full Version: Agrawal...Generating realistic impressions for file-system benchmarking. FAST'09
Generating Realistic Impressions for File-System Benchmarking
Nitin Agrawal, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
FAST'09

* * *

Impressions is a framework for generating a file-system image: a namespace (directory hierarchy) populated with files. Most of the model's defaults come from a five-year study of file-system metadata at Microsoft, but the framework exposes a number of parameters so that users can customize the generated image.

The file system image is created as follows.
1. Generate the directory hierarchy with a generative model.
2. Draw file sizes from a hybrid distribution: a lognormal body with a Pareto tail.
3. Assign file names and extensions based on known statistics.
4. Assign a depth to each file based on the distribution of file count by depth and the distribution of bytes by depth.
5. Place each file in a directory of the appropriate depth, following the distribution of directories by file count, which is modeled as an inverse polynomial of degree 2.
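
Steps 1 and 2 above can be sketched in a few lines of Python. This is only an illustration, not the Impressions implementation. It assumes a generative rule in which each new directory picks its parent with probability proportional to the parent's current subdirectory count plus 2. The function names and all numeric parameters (mu, sigma, tail_prob, tail_start, alpha) are hypothetical placeholders, not values from the notes.

```python
import random

def generate_tree(num_dirs, rng):
    """Grow a directory tree one directory at a time (step 1).

    Assumption: parent d is chosen with probability proportional
    to subdirs(d) + 2. Returns (parent, depth) lists indexed by
    directory id, where id 0 is the root.
    """
    parent = [None]
    depth = [0]
    subdirs = [0]
    for _ in range(1, num_dirs):
        weights = [c + 2 for c in subdirs]
        p = rng.choices(range(len(parent)), weights=weights, k=1)[0]
        parent.append(p)
        depth.append(depth[p] + 1)
        subdirs.append(0)
        subdirs[p] += 1
    return parent, depth

def sample_file_size(rng, mu=9.0, sigma=2.5, tail_prob=0.01,
                     tail_start=512 * 2**20, alpha=1.0):
    """Draw one file size in bytes from a lognormal body with a Pareto tail (step 2).

    All default parameter values are illustrative placeholders.
    """
    if rng.random() < tail_prob:
        # Pareto tail: paretovariate returns a value >= 1, scaled by tail_start.
        return int(tail_start * rng.paretovariate(alpha))
    # Lognormal body, resampled so that it stays below the tail.
    size = rng.lognormvariate(mu, sigma)
    while size >= tail_start:
        size = rng.lognormvariate(mu, sigma)
    return max(1, int(size))

rng = random.Random(42)
parent, depth = generate_tree(1000, rng)
sizes = [sample_file_size(rng) for _ in range(5000)]
print("max depth:", max(depth), "total bytes:", sum(sizes))
```

Passing an explicit random.Random instance keeps a generated image reproducible from its seed, which matters when the same image has to be recreated for repeated benchmark runs.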

Statistical and algorithmic methods are provided to sample the distributions and to generate and test the resulting file system image. Impressions handles interpolation and extrapolation, tests goodness-of-fit, and emulates fragmentation.
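
As an illustration of a goodness-of-fit check, the sketch below computes a two-sample Kolmogorov-Smirnov statistic between a generated sample and a reference sample. The notes do not say which test Impressions uses, so the choice of K-S here is an assumption, and the function names are hypothetical. The rejection threshold uses the standard large-sample approximation.

```python
import bisect
import math

def ks_statistic(sample_a, sample_b):
    """Maximum gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both ECDFs are step functions, so the largest gap occurs at a data point.
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d

def ks_threshold(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample K-S test."""
    c = math.sqrt(-math.log(alpha / 2) / 2)
    return c * math.sqrt((n + m) / (n * m))

generated = [1, 2, 3, 4, 5, 6, 7, 8]
reference = [2, 3, 4, 5, 6, 7, 8, 9]
d = ks_statistic(generated, reference)
print("reject fit" if d > ks_threshold(len(generated), len(reference)) else "fit accepted")
```

A statistic below the threshold means the test found no evidence that the two samples come from different distributions. It does not prove that they come from the same one.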

Impressions also provides a method for resolving constraints. We assume this resolution is best-effort in some cases.
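
To make "best effort" concrete, here is a minimal sketch of one possible constraint: a fixed number of files whose sizes must sum to a target within a tolerance. The notes do not describe the actual resolution method. The resampling loop, the function name, and the parameters (tolerance, max_tries) are all assumptions made for illustration.

```python
import random

def resolve_total_size(num_files, target_bytes, draw,
                       tolerance=0.05, max_tries=10000, rng=None):
    """Best-effort search for num_files sizes summing to about target_bytes.

    draw(rng) returns one sampled file size. Returns (sizes, satisfied),
    where satisfied is False if the constraint could not be met.
    """
    rng = rng or random.Random()
    sizes = [draw(rng) for _ in range(num_files)]
    total = sum(sizes)
    limit = tolerance * target_bytes
    for _ in range(max_tries):
        if abs(total - target_bytes) <= limit:
            return sizes, True
        # Replace one size with a fresh draw if that moves the sum closer.
        i = rng.randrange(num_files)
        candidate = draw(rng)
        new_total = total - sizes[i] + candidate
        if abs(new_total - target_bytes) < abs(total - target_bytes):
            sizes[i] = candidate
            total = new_total
    return sizes, abs(total - target_bytes) <= limit

def uniform_size(rng):
    # Hypothetical stand-in for a real file-size sampler.
    return rng.randint(1, 1000)

sizes, ok = resolve_total_size(100, 50000, uniform_size, rng=random.Random(1))
print("constraint satisfied:", ok, "total:", sum(sizes))
```

Selective resampling skews the sizes away from the distribution they were drawn from, so a resolver of this kind would need to re-check goodness-of-fit on the result. When the target is unreachable, the function returns its closest attempt with the flag set to False.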