Mark Gritter (markgritter) wrote,

Visualizing histograms vs. time

Dear Lazyweb,

I have time-series data which consists of response time histograms. What is a good way to visualize them?

Each histogram usually has 5-13 buckets (max 20) on a logarithmic scale, from 1 microsecond through about 1 second (2^0 microseconds through 2^20 microseconds). What I'm particularly interested in is highlighting cases where there are more entries in the long-latency buckets than usual, even though they are usually a small component of the overall distribution. But I am also interested in seeing the modal response time and any changes to the overall shape of the distribution over time.

                "7":    14,
                "8":    10834,
                "9":    6344,
                "10":   1997,
                "11":   5016,
                "12":   6665,
                "13":   13858,
                "14":   80563,
                "15":   202353,
                "16":   19600,
                "17":   341,
                "18":   1118,
                "19":   1320,
                "20":   726
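
To make the two quantities of interest concrete, here is a minimal Python sketch that pulls the modal bucket and the long-latency share out of the sample above. It assumes each key is the power-of-two exponent of the bucket (so "15" means roughly 2^15 microseconds); the cutoff of bucket 17 for "long latency" is a hypothetical choice for illustration, not something defined by the data.

```python
# Sample histogram from above: bucket exponent -> count.
# Assumption: bucket k holds response times of roughly 2**k microseconds.
sample = {
    7: 14, 8: 10834, 9: 6344, 10: 1997, 11: 5016, 12: 6665, 13: 13858,
    14: 80563, 15: 202353, 16: 19600, 17: 341, 18: 1118, 19: 1320, 20: 726,
}


def modal_bucket(hist):
    """Return the bucket exponent with the largest count."""
    return max(hist, key=hist.get)


def tail_fraction(hist, first_slow_bucket=17):
    """Fraction of all entries in buckets >= first_slow_bucket.

    first_slow_bucket=17 is a hypothetical threshold for "long latency".
    """
    total = sum(hist.values())
    slow = sum(count for k, count in hist.items() if k >= first_slow_bucket)
    return slow / total if total else 0.0


print(modal_bucket(sample))            # 15
print(f"{tail_fraction(sample):.1%}")  # 1.0%
```

For this sample the mode is bucket 15 and buckets 17-20 hold about 1% of the entries, which is why a linear-scale chart would hide them.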

Ideas so far:
* Stacked bar per sample. Color-code each bucket to emphasize the "bad" buckets. But, for example, 726 is a very small fraction of 202353, so the height of each bar would need to be logarithmically scaled. Alignment is also difficult to get right; it might help to let the user select which bucket "lines up" along the time series.
* Cumulative distribution per sample in a 3-D graph
* Show 'low/medium/high' as three separate graphs or charts stacked together on a common timeline so that they can be scaled independently.
* Show deviations from the overall average somehow, maybe as a delta per bucket.
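
The last idea, a delta per bucket, could be sketched roughly as follows. This is only one possible reading: it normalizes each sample to fractions, takes the mean fraction per bucket across all samples as the "overall average", and reports the log2 ratio against that baseline so that small tail buckets are not drowned out by the mode. The function names and the smoothing constant `eps` are assumptions made for the example.

```python
import math


def fractions(hist, buckets):
    """Normalize one histogram to per-bucket fractions, in bucket order."""
    total = sum(hist.values())
    return [hist.get(k, 0) / total if total else 0.0 for k in buckets]


def log_ratio_deltas(samples, buckets=range(21), eps=1e-6):
    """For each sample, log2(bucket fraction / mean bucket fraction).

    samples: list of {bucket exponent: count} dicts, one per time step.
    eps: hypothetical smoothing term so empty buckets do not hit log(0).
    Positive values mean the bucket is fuller than average at that time.
    """
    buckets = list(buckets)
    fracs = [fractions(h, buckets) for h in samples]
    n = len(fracs)
    baseline = [sum(f[i] for f in fracs) / n for i in range(len(buckets))]
    return [
        [math.log2((f[i] + eps) / (baseline[i] + eps))
         for i in range(len(buckets))]
        for f in fracs
    ]
```

Each row of the result could then be drawn as one column of a diverging-color heat map, with time on the x axis and bucket on the y axis, so an unusually heavy tail shows up as a warm patch in the upper rows.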

While you're at it, you could also pop over and answer David Eppstein's question on 3-D illustration tools. I would probably end up using POV-Ray too.
Tags: visualization