GroundWater Spatiotemporal Data Analysis Tool (GWSDAT) V2.12
GWSDAT is an open source, user-friendly, software application for the visualisation and interpretation of groundwater monitoring data.
Key features include:
- Visualisation of site wide trends in solute concentrations, NAPL thickness and groundwater flow velocity for conceptual site model development.
- Spatiotemporal analysis: variation in groundwater solute concentration is modelled as a function of X,Y and time.
- Automatic generation of concentration contour plots at user specified time intervals, with the option to overlay groundwater elevation contours and NAPL thickness/ footprint data.
- Automatic report generation tools.
- Improved data transparency helps design and optimise groundwater monitoring or remediation programmes (i.e. avoid the collection of redundant data).
- Early identification of new releases, migration pathways, need for corrective action and stable/ declining trends that may aid in closure determinations.
- Rapid interpretation of complex data sets from large monitoring networks (e.g. refineries, terminals).
- More efficient evaluation and reporting of groundwater monitoring trends via simple, standardised plots and tables created at the 'click of a mouse'.
The GroundWater Spatiotemporal Data Analysis Tool (GWSDAT) has been developed by Shell Global Solutions for the analysis of groundwater monitoring data. It is designed to work with simple time-series data for solute concentration and groundwater elevation, but can also plot non-aqueous phase liquid (NAPL) thickness if required.
Spatial data is input in the form of well coordinates, and wells can be grouped to separate data from different aquifer units. The software also allows the import of a site basemap in GIS shapefile format. Concentration trend and 2D contour plots generated using GWSDAT can be exported directly to Microsoft PowerPoint and Word to expedite reporting.
The application is supported for Windows XP, Vista, Windows 7, 8 and 10 and the corresponding version of Microsoft Office (including 64 bit operating systems). Data input to GWSDAT is via a standardized Excel spreadsheet and the data analysis and plot functions are accessed through an Excel Add-in application.
The statistical engine used to perform geo-statistical modelling and display graphical output is the open-source statistical programming language R (www.r-project.org). A user manual and two example datasets are provided with the software for training and demonstration purposes.
Spatiotemporal Data Analysis
The modelling of solute distribution in groundwater is typically restricted to either the analysis of trends in individual wells or independent fitting of spatial concentration distributions (e.g. by Kriging) to data from monitoring events. Neither of these techniques satisfactorily elucidates the interaction between spatial and temporal components of the data.
GWSDAT applies a spatiotemporal model smoother for a more coherent and smooth interpretation of the interaction in spatial and time-series components of groundwater solute concentrations. A spatiotemporal concentration smoother is fitted for each analyte using a non-parametric regression technique known as Penalised Splines (Eilers and Marx, 1992, 1996).
A Bayesian methodology is used to select the appropriate degree of model smoothness (Evers et al., 2013). The fit of the spatiotemporal algorithm to the monitoring data can be evaluated in either graphical or numerical format (export to MS Excel).
The GWSDAT graphical user interface (GUI) allows the user to navigate through a groundwater dataset and explore concentration/ groundwater elevation trends in individual wells and across the site as a whole.
Left-clicking on any of the user interface plots generates an identical but expanded plot in a separate window that can be saved in a variety of formats including “jpeg”, “postscript”, “pdf” and “metafile”. Plots can also be automatically exported to Microsoft PowerPoint or Word.
GWSDAT includes the following tools for trend visualization and detection:
Spatial plot: For the analysis of spatial trends in solute concentrations, groundwater flow and, if present, NAPL thickness. Overlaid on this plot are the predictions of the spatiotemporal solute concentration smoother which is a function that simultaneously estimates both the spatial and time series trend in site solute concentrations. GIS shapefiles can also be overlaid on this plot.
Well Trend plot: For the investigation of historical time-series trends in solute concentrations, groundwater elevation and, if present, NAPL thickness for individual wells. Users can overlay a nonparametric smoother which estimates the time-series trend in solute concentration. The advantage of this nonparametric method is that the trend estimate is not constrained to be monotonic, i.e. the trend can change direction.
Trend and Threshold Indicator Matrix: This feature provides a summary of the level and time series trend in solute concentrations at a particular model output interval.
Plume Diagnostics: New since version 2.1, GWSDAT calculates plume metrics quantifying and reporting temporal plume evolution. The quantities of plume mass, average concentration, area, and location of the centre of mass are calculated by spatial integration of the plume concentrations above a predefined concentration threshold. Further details can be found in the GWSDAT user manual and references therein.
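As a rough illustration of the spatial integration described under Plume Diagnostics, the sketch below sums over the cells of a regular grid whose predicted concentration exceeds a threshold. The function name `plume_metrics`, the dictionary-based grid layout and the `porosity` parameter are assumptions made for this example and are not taken from the GWSDAT source code; the user manual remains the reference for the actual calculation.

```python
def plume_metrics(grid, dx, dy, threshold, porosity=0.25):
    """Integrate a gridded concentration surface above a threshold.

    grid      : dict mapping (x, y) cell centres to predicted concentration
    dx, dy    : cell dimensions (same length unit as x and y)
    threshold : concentration that delineates the plume
    porosity  : hypothetical effective porosity (an assumption of this sketch)

    Mass is returned per unit aquifer thickness, in whatever consistent
    units the caller supplies.
    """
    cell_area = dx * dy
    area = mass = moment_x = moment_y = 0.0
    for (x, y), conc in grid.items():
        if conc > threshold:
            cell_mass = conc * cell_area * porosity
            area += cell_area
            mass += cell_mass
            moment_x += cell_mass * x
            moment_y += cell_mass * y
    if mass == 0.0:
        return None  # no cell exceeds the threshold, so there is no plume
    return {
        "area": area,
        "mass": mass,
        "average_concentration": mass / (area * porosity),
        "centre_of_mass": (moment_x / mass, moment_y / mass),
    }


# Four 1 x 1 cells; only the two cells on the row y = 0 exceed the threshold.
example_grid = {(0, 0): 10.0, (1, 0): 30.0, (0, 1): 1.0, (1, 1): 0.5}
metrics = plume_metrics(example_grid, dx=1.0, dy=1.0, threshold=5.0)
```

Repeating the calculation for each model output interval gives the time series of plume area, mass and centre of mass that the plume diagnostics report.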
Input, output and reporting
Data entry is via a Microsoft Excel input template (Figure S1), comprising three data input tables.
Groundwater concentration data is entered for different constituents in different wells as a function of time. Input in this table can also comprise ‘non-detects’ (which can be specified at either the detection limit or half the detection limit); groundwater elevations (which can be used for drawing additional groundwater contours); and light non-aqueous phase liquid (LNAPL) thickness (which can be automatically replaced by groundwater concentrations for interpolation purposes: either the maximum groundwater concentration observed in the dataset, or a concentration calculated from the effective aqueous solubility).
The second table contains the well coordinates and the third table can be used to specify the location of a basemap (in ArcGIS shapefile format).
Results and reporting
The main output of GWSDAT consists of the screen shown in Figure S2 which has 4 panels (highlighted with A to D in Figure S2).
In panel A, the user can select which constituent to plot, and whether to plot concentration trends, groundwater levels and/or LNAPL thickness. The user can also scroll through the time series and select a time slice (which is related to the concentration map and trend box, panels C and D, respectively). Panel B contains a graph with the observation data for the selected constituent and well (selected in panel A) and the selected trends (including 95% confidence percentiles). The grey line in the plot represents the time for which panels C and D are drawn.
The concentration map (panel C) shows a spatiotemporal model for a given time slice (selected in panel A), including the locations of wells and observed concentrations, with optional groundwater elevations and LNAPL thicknesses. The plume boundary contour is illustrated relative to a specified background or regulatory compliance concentration value. If the boundary contour is closed (i.e. the entire plume is captured by the spatiotemporal model), plume mass per meter of aquifer thickness and plume area will be calculated. The trend and threshold indicator matrix (panel D) provides a fast way to determine concentration trends in the monitoring network for a given time slice (green = declining concentrations, white = stable, red = increasing). When using the matrix in ‘threshold mode’, the user can enter water quality threshold values and screen the data at the given time period against those thresholds (if the ‘statistical threshold’ option is used, the 95th percentile of the data is screened against the threshold criteria).
GWSDAT can generate a number of different reports in different formats. The plots from the output screen can be exported directly as ‘jpeg’, ‘postscript’ or ‘pdf’ files. It is also possible to export a sequence of plots capturing different time slices (either graphs from a single well, or spatiotemporal maps) and import these directly into Microsoft Word or PowerPoint.
A complete series of graphs can be exported using the well reporting feature. This generates a matrix of graphs (one for each well) in which a selection of constituents can be plotted.
Lastly, a number of summary graphs can be exported plotting the plume metrics based on the method by Ricker (2008) for any given constituent over time (Figure S3). This feature in particular, often in combination with a movie-like presentation of the contour plots in PowerPoint, rapidly creates a comprehensive view of plume behavior.
Downloads and Installation
System Requirements: Windows XP, Vista, 7, 8 or 10. Microsoft Office versions: 2003, XP, 2007, 2010.
- Download and install the latest version of the open source statistical application "R" available from: http://cran.r-project.org/bin/windows/base/ . Please accept all default settings during installation. (Users must have administrator access rights to install R).
- Download the
- Open up Excel and install the GWSDAT add-in by choosing: "Office button"-> "Excel Options"->"Add-Ins"->"Go"->"Browse" and then select "GWSDAT V2.12.xla" located in the "C:\Apps\GWSDAT_v2.12" folder. (If prompted to copy the GWSDAT add-in to another location please select 'No')
- A menu called "GWSDAT v2.12" will appear in the "Add-Ins" tab of Excel.
- To get started with a basic example select "GWSDAT v2.12->Insert Data File->Basic Example" and then "GWSDAT v2.12->GWSDAT Analysis".
- To get started with a more complex example select "GWSDAT v2.12->Insert Data File->Comprehensive Example" and then "GWSDAT v2.12->GWSDAT Analysis".
- You can view the user manual by selecting "GWSDAT v2.12->User Manual".
FAQs and Support
Additional guidance is provided in the GWSDAT v2.0 User Manual, which can be accessed from the GWSDAT add-in menu.
The answers to frequently asked questions are given below.
- Time series groundwater solute concentration (in ng/L, ug/L or mg/L units)
- Time series groundwater elevation (relative to a common ordnance datum)
- Time series NAPL thickness
- Well coordinates (in Cartesian coordinates, not latitude, longitude)
Yes, scalar X,Y well coordinates can be measured directly from a site plan. For example, by aligning a transparent grid of numbered squares with the N-S arrow on the site plan (north upwards) and reading off the relative X,Y locations of each well.
Yes, site plans in GIS Shapefile format can be imported as background images. The filepath to the Shapefile folder is entered in the third table of the Excel data input worksheet (entitled “GIS ShapeFiles”). The user can either enter the shapefile location manually or use the ‘Browse for Shapefile’ function in the GWSDAT Excel menu for interactive file selection. Only the location of the main shapefile (the file ending with a ‘.shp’ extension) needs to be specified in this table; the associated data files (e.g. .dbf, .sbn, .sbx, .shx) will be picked up automatically, provided they are in the same folder. It is possible to overlay multiple shapefiles, up to a maximum of seven. Please refer to the GWSDAT user manual for additional information, including the conversion of CAD drawing layers to Shapefile format using ArcGIS.
No, it is not possible to model vertical concentration distribution using GWSDAT. However, it is possible to group monitoring wells (e.g. by aquifer) and then plot each group separately. Multiple concentration values for a given solute at the same X,Y location, well group and sampling time are detected by GWSDAT and averaged prior to fitting of the spatiotemporal model.
In the event that a site-wide 3D interpretation of groundwater flow and solute transport is required, we would recommend the use of numerical modelling software such as FEFLOW or MODFLOW. The considerable time and effort required to populate and run such complex models may be justified for high profile sites when working with high-cost three-dimensional aquifer data.
The minimum input data requirements for GWSDAT to run correctly are as follows:
For plotting of groundwater flow direction arrows:
- No solute concentration data required
- Minimum 3 well locations in coordinate table
- Minimum 1 measurement of groundwater elevation at each of these 3 wells within the user-selected model output interval
For plotting of groundwater elevation contours:
- No solute concentration data required
- Minimum 4 well locations in coordinate table
- Minimum 1 measurement of groundwater elevation at each of these 4 well locations within the user-selected model output interval
For plotting of solute concentration trends at individual wells:
- Minimum one solute: No groundwater elevation data required
- Minimum 1 well location in coordinate table
- Minimum 1 measurement of groundwater solute concentration at this well location
For fitting of valid spatiotemporal model and plotting of solute concentration contours:
- Minimum one solute: No groundwater elevation data required
- Minimum 3 well locations in coordinate table
- Minimum 2 (concentration, time) data points for each of these 3 well coordinates
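The minimum requirements listed above can be restated as a simple pre-flight check. This is only a sketch of the stated rules: the function name `available_outputs` and its two arguments are hypothetical and do not correspond to anything in the GWSDAT add-in.

```python
def available_outputs(wells_with_elevation, conc_points_per_well):
    """Report which GWSDAT outputs the stated minimum data rules allow.

    wells_with_elevation : number of wells in the coordinate table that have
                           at least one groundwater elevation measurement
                           within the selected model output interval
    conc_points_per_well : dict mapping well name to the number of
                           (concentration, time) points for one solute
    """
    counts = list(conc_points_per_well.values())
    wells_with_two_points = sum(1 for n in counts if n >= 2)
    return {
        "flow_direction_arrows": wells_with_elevation >= 3,
        "elevation_contours": wells_with_elevation >= 4,
        "well_trend_plot": any(n >= 1 for n in counts),
        "spatiotemporal_model": wells_with_two_points >= 3,
    }


# Three wells with elevations; only two wells have two or more samples.
check = available_outputs(3, {"MW1": 2, "MW2": 2, "MW3": 1})
```

In this example flow arrows and well trend plots are possible, but elevation contours and the spatiotemporal model are not.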
In order to generate representative concentration contour plots, the spacing of monitoring wells needs to reflect the characteristic distance over which solute concentrations vary in the groundwater. This will vary from site to site: if groundwater flow rates are low or solute transport retarded then concentration hotspots are likely to occur and a closer well spacing will be required to map the concentration distribution. Conversely, if groundwater flow rates are high and solute transport is not significantly retarded then a larger well spacing may be adequate to map the concentration distribution.
Because the minimum well spacing required for effective concentration contouring varies from site to site, the user’s judgement is required in deciding whether the available data merits contouring. The presence of “redundant” data points that can be removed without significantly changing the concentration distribution is an indication that the monitoring well spacing is more than sufficient.
In the event that only a small number of wells (i.e. <4) are present, then GWSDAT v2.0 includes a circle plot option, which represents the data as circles coloured and sized to solute concentration, thereby avoiding the need to use potentially misleading concentration contours.
Similar arguments apply to the contouring of groundwater elevation data, although in the absence of significant topographic variation/ geological heterogeneity or groundwater abstraction/ water injection, groundwater piezometric surfaces should be locally planar. The adaptive kriging algorithm used by GWSDAT to derive the piezometric surface requires a minimum of 4 well locations; flow direction arrows can, however, be generated from only 3 well locations.
GWSDAT handles non-detect data by a method of substitution. In accordance with general convention, the default option is to substitute the non-detect data with half its detection limit, e.g. ND<50ug/l is substituted with 25ug/l. Alternatively, non-detect data can be substituted with its full detection limit, e.g. ND<50ug/l is substituted with 50ug/l. Note that the entry of zero concentration values is not permitted.
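The substitution rule can be sketched in a few lines. The example assumes that a non-detect is entered as a string of the form `ND<50`, with units held elsewhere; that format, the function name `substitute_non_detect` and the `method` argument are assumptions for illustration only.

```python
def substitute_non_detect(value, method="half"):
    """Convert a reported result to a numeric concentration.

    value  : a number, or a string such as "ND<50" (assumed entry format)
    method : "half" substitutes half the detection limit (the default),
             "full" substitutes the full detection limit
    """
    if method not in ("half", "full"):
        raise ValueError("method must be 'half' or 'full'")
    text = str(value).strip()
    if text.upper().startswith("ND<"):
        limit = float(text[3:])
        result = limit / 2.0 if method == "half" else limit
    else:
        result = float(text)
    if result <= 0:
        # Zero (or negative) concentrations are not permitted as input.
        raise ValueError("concentration must be greater than zero")
    return result


half_limit = substitute_non_detect("ND<50")          # 25.0
full_limit = substitute_non_detect("ND<50", "full")  # 50.0
measured = substitute_non_detect(12.5)               # 12.5
```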
During data analysis the user has the option to ignore the presence of NAPL when fitting the spatiotemporal model, or substitute detections of NAPL with site maximum solute concentrations. NAPL substitution should only be used if it is known that the solutes entered into GWSDAT are derived from dissolution of the NAPL. This functionality was introduced to avoid the situation whereby an area of wells containing NAPL appears as a minimum on concentration contour plots because groundwater solute concentration data is not available.
Note: Any solutes that are not derived from the NAPL can be excluded from the NAPL substitution process by flagging them as “NotInNAPL” or “E-acc” in the historical monitoring data table of the input worksheet. Note also that only one solute data point needs to be flagged to remove that solute from the substitution algorithm.
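A minimal sketch of NAPL substitution is given below. It assumes that every record flagged as containing NAPL receives the site maximum concentration for that solute, and that excluded solutes are passed in as a set. The record layout, the function name `substitute_napl` and the `excluded_solutes` argument are hypothetical, and the real add-in may handle edge cases differently.

```python
def substitute_napl(records, excluded_solutes=()):
    """Replace concentrations at NAPL-bearing wells with the site maximum.

    records          : list of dicts with keys "well", "solute", "conc"
                       (None when no concentration is available) and "napl"
    excluded_solutes : solutes flagged as not derived from the NAPL
    """
    excluded = set(excluded_solutes)

    # Site maximum per solute, taken over all measured concentrations.
    site_max = {}
    for rec in records:
        conc = rec["conc"]
        if conc is not None:
            solute = rec["solute"]
            site_max[solute] = max(conc, site_max.get(solute, conc))

    result = []
    for rec in records:
        new_rec = dict(rec)  # leave the caller's records untouched
        solute = rec["solute"]
        if rec["napl"] and solute not in excluded and solute in site_max:
            new_rec["conc"] = site_max[solute]
        result.append(new_rec)
    return result


data = [
    {"well": "MW1", "solute": "Benzene", "conc": 120.0, "napl": False},
    {"well": "MW2", "solute": "Benzene", "conc": 900.0, "napl": False},
    {"well": "MW3", "solute": "Benzene", "conc": None, "napl": True},
    {"well": "MW1", "solute": "Chloride", "conc": 40.0, "napl": False},
    {"well": "MW3", "solute": "Chloride", "conc": None, "napl": True},
]
substituted = substitute_napl(data, excluded_solutes={"Chloride"})
```

Here the benzene value at MW3 becomes 900.0, so the NAPL area no longer appears as a minimum on a benzene contour plot, while the excluded chloride record is left empty.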
Data Analysis/Plotting functions: Spatial plot window
The “GWSDAT options” dialogue box, which appears when “GWSDAT analysis” is selected, allows the user to select the time interval between spatiotemporal model output plots. The pre-defined user options are “None”, “Monthly” or “Quarterly”: the model sets the start and end dates for the intervals by working backwards from the most recent sampling date. Concentration contour plots are generated by exporting data from the spatiotemporal model at the end of each specified time interval.
The “GWSDAT options” dialogue box also controls the handling of groundwater elevation data. If no aggregation (i.e. “None”) is selected then the software will attempt to generate a groundwater contour plan for every date in the input dataset. In practice, however, groundwater elevation surveys are often spread over a number of days and so this approach is likely to generate incomplete contour plots. If “monthly” or “quarterly” aggregation is selected, the software collates daily groundwater elevation data into monthly or quarterly blocks, thereby increasing the size of the dataset available for piezometric contouring.
Solute concentration contours are generated using a spatiotemporal smoother algorithm, which fits a model to the solute concentration distribution through space (XY well coordinates) and time. This does not involve any temporal collation of the input concentration dataset. For further details refer to the GWSDAT user manual.
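The interval logic can be sketched as follows. The exact rules GWSDAT uses to place interval boundaries are not given here, so this example assumes that intervals are whole calendar months counted back from the most recent sampling date, and that each interval includes its end date. All function names are hypothetical.

```python
import calendar
from bisect import bisect_left
from datetime import date


def shift_months(d, months):
    """Return the date `months` calendar months before `d`, clamping the day."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def interval_ends(sample_dates, months_per_interval):
    """End dates of the output intervals, counted back from the latest sample."""
    if months_per_interval < 1:
        raise ValueError("months_per_interval must be at least 1")
    latest, earliest = max(sample_dates), min(sample_dates)
    ends = []
    k = 0
    while True:
        ends.append(shift_months(latest, k * months_per_interval))
        start = shift_months(latest, (k + 1) * months_per_interval)
        if start < earliest:
            break  # this interval already covers the earliest sample
        k += 1
    return sorted(ends)


def assign_interval(d, ends):
    """Return the end date of the interval (start, end] that contains `d`."""
    return ends[bisect_left(ends, d)]


surveys = [date(2020, 1, 15), date(2020, 2, 20), date(2020, 3, 31)]
monthly = interval_ends(surveys, 1)    # 1 = "Monthly"
quarterly = interval_ends(surveys, 3)  # 3 = "Quarterly"
```

Groundwater elevations that map to the same interval end date would then be pooled before piezometric contouring, which is the aggregation effect described above.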
Piezometric contours are generated using an adaptive kriging algorithm. The degree of flexibility allowed by the kriging algorithm is a function of the number of groundwater elevation data points in the selected model output interval, which improves the contour quality for smaller datasets.
Solute concentration contour plots have default logarithmic scales of 0 to >50,000 ng/L, 0 to >5000 ug/L or 0 to >500 mg/L, dependent on the units selected. The concentration scale is fixed so that contour plots for successive time slices are directly comparable. The user can, however, select to “scale colours to data” to produce a colour key scaled to the concentration range for each model output interval.
Data Analysis/Plotting functions: Well trend and indicator matrix plot windows
The use of non-parametric statistics allows the analysis of cyclical trends in groundwater solute concentrations (e.g. concentrations that increase and decrease through time). By contrast, the Mann-Kendall test, which is commonly used to evaluate trends in groundwater solute concentrations, detects only monotonic trends and so cannot capture cyclical variation/ short term fluctuations.
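To see why a monotonic trend test misses cyclical behaviour, consider the Mann-Kendall S statistic, which counts increasing pairs minus decreasing pairs over all pairs of observations in time order. The short sketch below is a textbook calculation of S, not GWSDAT code, and the two concentration series are invented for illustration.

```python
def mann_kendall_s(values):
    """Mann-Kendall S: number of increasing pairs minus decreasing pairs."""
    s = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


steadily_rising = [1, 2, 3, 4, 5]
rise_then_fall = [1, 3, 5, 3, 1]

s_rising = mann_kendall_s(steadily_rising)  # every pair increases
s_cyclic = mann_kendall_s(rise_then_fall)   # increases and decreases cancel
```

The rise-and-fall series gives S = 0, the same as a series with no trend at all, even though concentrations clearly change over time. A smoother that is free to change direction does not have this limitation.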
Cells in the trend/ threshold indicator matrix will switch to grey in the event that the concentration trend cannot be calculated because there is insufficient data, or if confidence in the trend smoother estimate is poor. Regions of poor confidence in the trend fit are indicated where the trend smoother and 95% confidence limit curves are coloured grey, rather than blue, on the well trend plot. Regions of poor confidence are defined as where the upper 95% confidence limit (dashed blue/ grey line) exceeds 10x the trend fit value (solid blue line).
Cells in the threshold indicator matrix switch colour dependent on the value of the upper 95% confidence limit relative to the user specified threshold concentrations. So, for example, a threshold matrix cell will only switch from red to green when the upper 95% confidence limit of the trend fit is below the user specified threshold concentration.
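The colouring rules for a threshold matrix cell can be summarised in a small function. This is a sketch of the rules as stated above; the function names and the use of `None` to represent a trend that could not be calculated are assumptions of the example.

```python
def poor_confidence(trend_fit, upper_95):
    """True where the upper 95% confidence limit exceeds 10x the trend fit."""
    return upper_95 > 10 * trend_fit


def threshold_cell_colour(trend_fit, upper_95, threshold):
    """Colour of a threshold indicator matrix cell.

    trend_fit : trend smoother estimate at the selected time (None if the
                trend could not be calculated)
    upper_95  : upper 95% confidence limit of the trend fit
    threshold : user specified threshold concentration
    """
    if trend_fit is None or upper_95 is None:
        return "grey"  # insufficient data to calculate a trend
    if poor_confidence(trend_fit, upper_95):
        return "grey"  # trend estimate too uncertain to classify
    return "green" if upper_95 < threshold else "red"


below = threshold_cell_colour(10.0, 40.0, threshold=50.0)       # "green"
above = threshold_cell_colour(10.0, 60.0, threshold=50.0)       # "red"
uncertain = threshold_cell_colour(10.0, 150.0, threshold=500.0)  # "grey"
```

Note that the second case is red even though the trend fit itself (10.0) is below the threshold, because the test is applied to the upper confidence limit.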
The user can compare measured concentration data directly against the user specified threshold values by choosing the “Threshold absolute” option in the trend matrix display table. This option is useful for highlighting any one-off measurements that exceed the concentration threshold but are not statistically significant. The user can then determine if it is worthwhile collecting additional data to validate the result.
This issue sometimes arises with low screen resolution. To modify the size of the GWSDAT graphical user interface, open the Excel VBA code module “ConfigParams” within “GWSDAT V2.0.xla” (press Alt+F11 from Excel to access the VBA editor).
Change the code:
Public Const PanelScaling = 1 'Sizing of GWSDAT interface
To something like this:
Public Const PanelScaling = 0.75 'Sizing of GWSDAT interface
When you are happy with the size of the GWSDAT user interface, save the add-in by choosing File->Save GWSDAT V2.0.xla from the VBA editor.
Useful Links & Presentations
Supporting information for the Groundwater article:
The authors gratefully acknowledge those people who have contributed their knowledge and time to the development of GWSDAT.
The authors wish to express their gratitude to Adrian Bowman, Ludger Evers and Daniel Molinari from the Department of Statistics, University of Glasgow, for their invaluable contributions to the development of the spatiotemporal algorithm.
Thanks also to Ewan Crawford from the University of Glasgow for his assistance in the development of the GWSDAT user interface.
We acknowledge and thank the R project for Statistical Computing and all its contributors without which this project would not have been possible.
A big thank you to Shell's worldwide environmental consultants for assistance in evaluating and testing the earlier versions of GWSDAT. Thanks also to the Shell Year in Industry students who spent a great deal of time testing GWSDAT and making suggestions for improvements.
We thank both current and former colleagues including Matthew Lahvis, Jonathan Smith, George Devaull, Dan Walsh, Curtis Stanley, Marco Giannitrapani and Philip Jonathan for their support, vision and advocacy of GWSDAT.
- W. R. Jones, M. J. Spence, A. W. Bowman, L. Evers, D. A. Molinari, 2014. A software tool for the spatiotemporal analysis and reporting of groundwater monitoring data. Environmental Modelling & Software, 55, 242-249.
- Evers, L., Molinari, D. A., Bowman, A. W., Jones, W. R., Spence, M. J., (in Press). Efficient and automatic methods for flexible regression on spatiotemporal data, with applications to groundwater monitoring, Environmetrics.
- Adrian W. Bowman and Adelchi Azzalini. sm: Smoothing methods for nonparametric regression and density estimation. R package, www.stats.gla.ac.uk/~adrian/sm
- Adrian W. Bowman and A. Azzalini. Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford, 1997.
- Eilers, P. H. C., Marx, B. D., 1992. Generalized Linear Models with P-Splines, in Advances in GLIM and Statistical Modelling (L. Fahrmeir et al., eds.). Springer, New York.
- Eilers, P. H. C., Marx, B. D., 1996. Flexible smoothing with B-splines and penalties. Statistical Science 11, 89–121.
- R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2008. ISBN 3-900051-07-0, http://www.r-project.org
- W. R. Jones, M. J. Spence, Matthijs Bonte. Analyzing Groundwater Quality Data and Contamination Plumes with GWSDAT. Groundwater. doi:10.1111/gwat.12340.
- Ricker, J.A. 2008. A Practical Method to Evaluate Ground Water Contaminant Plume Stability. Ground Water Monitoring & Remediation 28, no. 4: 85–94.