In contrast, today’s computers and software permit straightforward manipulation of large amounts of data, allowing the analyst to explore data without the drudgery. Interactively “playing around” with the data, creating graphs and plots, is perhaps the best way to understand it and extract its real meaning without relying on the distributional assumptions of statistical inference. For many if not most purposes, descriptive techniques are all that is needed to characterize sites effectively and to make decisions such as where and how to treat and monitor. Simple tools, used effectively, reveal the essence of the data in ways that determine the follow-up actions.
This paper illustrates common exploratory data analysis techniques and visual methods. Box plots, quantile plots, quantile-quantile plots, ratio and double ratio plots, stem-and-leaf plots, and scatter plots are used to identify patterns or trends in various types of environmental data. Also shown are a simple but effective means of assessing the distribution of data (the 5-point data characterization method) and an example of a user-friendly statistical and graphical software package (STATA).
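As a rough illustration of what a five-point characterization can look like in practice, the sketch below computes the minimum, lower quartile, median, upper quartile, and maximum of a sample. It assumes the 5-point method resembles the conventional five-number summary, which this section does not confirm. It is written in Python rather than STATA, the function name `five_point_summary` is hypothetical, and the concentration values are invented for illustration only.

```python
import statistics


def five_point_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Assumption: quartiles use the 'inclusive' method, which treats the
    sample minimum and maximum as the 0th and 100th percentiles. Other
    quartile conventions give slightly different values on small samples.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


if __name__ == "__main__":
    # Hypothetical concentrations (arbitrary units), not real site data.
    concentrations = [0.8, 1.2, 1.5, 2.1, 2.4, 3.0, 4.7, 9.6, 31.0]
    labels = ("min", "Q1", "median", "Q3", "max")
    for label, value in zip(labels, five_point_summary(concentrations)):
        print(f"{label:>6}: {value}")
```

A maximum that sits far from the upper quartile while the minimum sits close to the lower quartile suggests a right-skewed sample, which is the kind of distributional clue such a summary is meant to surface before any formal inference.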
In one particularly striking example, a plot prepared using exploratory and graphical data analysis showed that contaminant from a dense non-aqueous phase liquid (DNAPL) was present in soils above the confining stratum, contrary to what had been thought; the implications for treatment were dramatic. While this paper focuses mainly on analytical data, the methods can be applied to field parameter data as well.