Made changes pertaining to changes from sptatstat on which this package depends greatly.
Mostly some minor modifications but some limited image warping capability has been added via iwarper and warper. Per request by Gregor Skok, the previously added (v. 0.8) dNSS functions have been removed.
A new function for calculating G from Gilleland (2021) has been added in addition to the ones previously added in 0.6-6. Gregor Skok's contributed functions for calculating dFSS have been fixed so that they do not quit from R under certain circumstances (now the function will merely stop and/or produce a warning in those situations). Additional new contributions from Gregor for calculating the dNSS have also been added. Finally, some changes to spatstat required some additonal modifications that the user need not be concerned about.
Fixed a bug in the new Gbeta (and similar) functions whereby the y1 and y2 terms were large enough that when multiplied together before dividing by β caused computational problems. The end result was that Gbeta often produced a value of zero instead of the correct value. Now each term is divided by the square root of β before multiplying them together.
A circle histogram (aka windrose diagram) has been added, called CircleHistogram as originally ported into R by Matt Pocernich.
New spatial-alignment summary measures proposed in Gilleland (2021) have been added. Namely, Gβ, Gβ,IL, and G2,β,IL.
craer was removed because it was not clear if it was working properly.
Most of the ICP test cases needed to be removed because of new restrictions on the size of R packages. A new package has been made, and are now available from these web pages (ICPtestcases_1.0.tar.gz).
Axes have been added to all plots using the map function from package maps using map.axes() from the self-same package.
An error was found in the Cindex function wherreby it was adding up the number of zero-valued grid points instead of the one-valued ones. This error has been fixed beginning with this release.
Minor changes to remain current with future R versions.
Primarily, this version fixes a would-be bug in future releases because of a change to spatstat on which this package greatly depends.
The centered and squared version of Baddeley's Delta metric now takes the p and constant arguments.
The absolute value operator was missing from the structurogram function, which is only a problem for odd values of q. The bug is fixed in this version.
Functions to calculate the new dFSS measure are included from Gregor Skok.
Per a user's suggestion, bias will now return NaN when hits + misses = 0 in vxstats. Before it returned a large number because it re-set the denominator to 1e-8 instead of zero (i.e., approaching the limit). Also, for hoods2dPlot and fss2dPlot, a new matplotcol argument has been added so that the color scheme can be changed instead of set at 1:6 (default). In some cases, the incorrect colors might have been plotted.
With the new thresholder function, a comment given when verbose is TRUE in the hoods2d function still tried to grab the threshold from the old procedure, which resulted in an error. The bug has been fixed, but a quick fix before this version is released would be to simply set verbose to FALSE when running the function.
A bug was found where hoods2d was not returning the correct information for the fuzzy logic neighborhood method. It is now fixed.
A new function called censqdelta calculates the Baddeley delta metric after first centering the binary sets, A and B, onto a square grid to facilitate comparisons across cases. The Baddeley delta metric is useful, but it is also sensitive to the domain size and the position of the sets within the domain, and this function alleviates the issue. See the examples in the help file to see how it solves the problem.
The plot method function for "matched" objects was not plotting the correct colors for feature numbers. The bug has been fixed for this version.
The variography measure introduced in Ekström (2016) has been added (see ?variographier).
The partial Hausdorff method was taking the k-th lowest (instead of the k-th highest) value. This bug has been fixed in this version. For earlier versions, simply use the N - k + 1 value, where N is the total number of grid points (e.g., if you want the 4-th highest, use k = N - 4 + 1 instead of 4).
Re-wrote the plot method function for both "SpatialVx" and "matched" objects to be considerably more streamlined, which also fixed a bug or two. Removed the set.pw argument from all plotting routines, replacing most with an mfrow argument, and not giving the option for others (use par to change mfrow or mfcol yourself instead, as one normally would do).
Added a generic function called thresholder with a default method for matrices and a method for "SpatialVx" objects. In particular, many functions now allow the user to change the rule (e.g., so that you can keep values below a threshold instead of above, or use non-inclusive greater/less than statements). Most, but not all, functions now use the new functionality; in particular, hoods2d.
Updated the functions for performing wavelet-based approaches to be more consistent with the newer style (e.g., generic function that calls a function to handle "SpatialVx" objects versus matrices. Streamlined their plotting routines as well.
Several bugs were found in deltamm that either caused errors, or just gave bad results. These bugs have been fixed, but note that I still consider this function to be under testing.
The 2-d interpolation function, Fint2d, was returning all zeros for the gradients (if derivs = TRUE) because of a scoping issue, which has now been fixed.
The saller function estimated the location component, L1, by using the centroid of the field with only identified features rather than the raw field as was originally proposed. Now, the L and L1 components should better match those found using other software. Alternative L1 and L components are returned with the object that use the centroid originally used in this package as such a definition may be reasonable. Originally, it was intended to have a slightly different SAL, but it was decided to have the originally prposed one as the default, with this alternative returned but not emphasized.
The craer function was calculating the MSE breakdowns on the binary fields defining the features rather than on the original fields. This problem has been fixed.
The partial Hausdorff distance was not the partial Hausdorff distance, but something else (something still potentially useful). The partial Hausdorff now gives the correct value and a new measure (qdmapdiff) gives the k-th largest (or a quantile) of the difference in distance maps. Thanks to Barbara Casati for finding this error.
The mean error (and mean square error) distances now give both directions.
A bug in locmeasures2d (or, really, locperf) was discovered by Barbara Casati whereby the modified Hausdorff distance returned some kind of gibberish. An additional bug was also discovered whereby if one did not call one of a few of the other measures, the function would error out. Attempting to fix this bug proved to be more trouble than it is worth. Therefore, this measure has been removed from consideration in this package. It can, however, easily be computed by applying locmeasures2d to compute mean error distance in both directions. That is,
A <- locmeasures2d( object = obsField, Y = fcstField, which.stats = "med" )
B <- locmeasures2d( object = fcstField, Y = obsField, which.stats = "med")
and then taking the maximum of the two values A and B (or the minimum, as is defined in some more recent literature).
Per a request, a function to do the spatial prediction comparison test (SPCT) for irregular spatial locations has been added. Namely, spct, which works with expvgram; not to be confused with expvg, which works with the old spatMLD (now lossdiff) to do SPCT on regular grids.
spatMLD has been renamed to lossdiff, fit.spatMLD to flossdiff and empiricalVG.lossdiff has been added as an additional step. The trend estimation used by spatMLD did not test for significance of the trend. It is an important step that perhaps should not have been automated. The procedure now has a new step conducted by the new function empiricalVG.lossdiff that fits the empirical variogram to the loss differential field created by lossdiff. It is now up to the user to determine if a trend exists, and if so, estimate that trend before calling empiricalVG.lossdiff. The trend can then either be removed before calling empiricalVG.lossdiff or within the call (the trend removed is returned in the output object). The function lossdiff still estimates a trend using lm, and the output object is printed in the call to print so that one can quickly see the results.
The three feature identification functions (convthresh, threshfac and threshsizer) have been combined into a single function called FeatureFinder, which also allows for more flexibility including: (i) combining the different methods (e.g., convolution smoothing with a further requirement that all features be at least min.size grid squares large), (ii) allows one to identify only smaller features (using the max.size argument), (iii) allow for a different smoothing parameter for each of the two fields (if smoothing is performed), (iv) allow for a different min/max size across fields, and (v) allow for different threshold factors for the two fields.
The fit.spatMLD function has been modified to use a more reliable fitting routine so that, with luck, it will not fail to find a good fitting variogram as often. It is possible to write your own variogram function to use with fit.spatMLD.
A bug fix from a dependent package caused an error on a developmental version of R. This version fixes that problem.
This version includes a bug fix where centmatch would sometimes fail erroneously if only one feature existed in one or both fields.
Additionally, new functions now exist that allow one to carry out the contiguous rain area (CRA) method of Ebert and McBride (2000). In particular, the new functions rigider (finds an optimal rigid transformation), rigidTransform (performs a rigid transformation for given translation and rotation parameters), Fint2d (2-d interpolation), and craer (perform the CRA method for any object of class "matched". Also, a new feature matching function based on minimum boundary separation has been added (minboundmatch).
This version fixes a user-found bug in the print function for deltamm objects whereby it could not figure out that matches existed even when they did. The function imomenter has also been added for calculating image moments.
This update addresses some new changes in the spatstat package related to some now depracated functions. Thanks to Adrian Baddeley for informing me about them.
A bug in the vxstats function has been fixed whereby it used to give erroneous results for the contingency-table based smoothing filter methods.
Additionally, functionality for compositing features (compositer) and for identifying boundary points along user-specified angles (hiw) has been added. In the former case, similar methods as proposed by Nachamkin (2004) and Micheas et al. (2007) can be performed. Note, however, that the exact methodology differs in each case (see the associated help files). The function hiw allows for an object to be set up that can then be analyzed using the shapes package in R (its summary function gives some of the SS values, namely: location/translation, and average, minimum and maximum intensity SS).
This version fixes the bugs introduced with the previous version (0.1-8), but also includes some new functionality. Specifically, a new function called MergeForce takes objects returned by functions like centmatch and deltamm and forces the mergings into a new "matched" object. Further, new MODE functions have been addd that more-or-less round out the MODE method (one could do infinitely many other things with the MODE approach). In particular, FeatureTable conducts the feature-based contingency table approach and interester carries out the fuzzy logic approach of MODE.
No new methods added to this version, but numerous bugs have been fixed, and the feature-based code has been greatly simplified. In particular, FeatureSuite, FeatureFunPrep and their method functions have been removed in favor of the users' performing each step in the feature-based paradigm themselves. This change should not only make the coding side of things easier, but it should also prove easier and more transparent for the user. One bug since this release that has been found (and will be fixed for the next version) is that saller, threshfac and threshsizer did not get the necessary updates, and so are temporarily unusable (this package is still a work in progress).
Notable for this release is that the centroid distance is no longer calculated using the spatstat function centroid.owin and can be given in either lon/lat or indexed coordinates. Centroid distance can now be computed using any user-provided distance function with the correct formatting. The default (which generally should be preferred) is to use rdist from package fields, but if great-circle distance is appropriate, then rdist.earth (also from fields) can be used (in which case the centroid will be automatically calculated from the user-provided lon/lat coordinates given in the call to make.SpatialVx). Additionally, both centmatch and deltamm have been fixed and simplified (they were not working correctly in all cases). Further, deltamm was slow because it was having to re-calculate the distance maps for every call to deltametric. Now, deltametric is not called, the distance maps are first calculated and re-used as appropriate, and most (not all) metrics that were calculated multiple times are now only calculated once. The result is a much faster function.
Only minor bug fixes (more to come) and some new method functions.
This version contains some improvements to the features-based verification functions; e.g., some new method functions for some of the main functions. There is also now a function for calculating the bearing from one feature to another (see the function help file for bearing), which was adapted from code written by Randy G. Bullock.
A function for doing the cluster analysis procedure described in Marzban and Sandgathe (2006), called clusterer has been added. Additionally, Caren Marzban provided some functions written by Hillary Lyons for doing the variation of the cluster analysis approach proposed in Marzban and Sandgathe (2008), which have been modified and are included as CSIsamples. The package fastcluster is now required, which enables use of a fast cluster procedure through hclust (Daniel Müllner, fastcluster: Fast hierarchical clustering routines for R and Python, Version 1.1.6, 2012. Available at CRAN http://cran.r-project.org and http://math.stanford.edu/~muellner).
In addition to the cluster analysis, Caren Marzban also provided code for performing the optical flow verification procedure introduced in Marzban and Sandgathe (2010), which too has been modified and made available primarily through the new funciton OF.
Functionality to estimate the empirical structure function described in Harris et al (2001) has been added via the functions structurogram and structurogram.matrix.
For each of the objects returned by the cluster analysis, optical flow and structure functions, there are method functions (e.g., plot, summary, hist) available. See the help files for each function to see which specific method functions are available.
In addition to the new required package fastcluster, two other packages are now required: CircStats (S-plus original by Ulric Lund and R port by Claudio Agostinelli
(2009). CircStats: Circular Statistics, from "Topics in circular Statistics" (2001). R package version 0.2-4. http://CRAN.R-project
.org/package=CircStats), which allows for computing statistics for cicular data (e.g., feature axis angles, bearings, etc.) as well as plotting features such as rose.diag. The maps package is also now required (Original S code by Richard A. Becker and Allan R. Wilks. R version by Ray Brownrigg. Enhancements by Thomas P. Minka
New for this version of SpatialVx is the 2-d Gaussian Mixture Model (GMM) method introduced in Lakshmanan and Kain (2010). In particular, the function gmm2d fits a GMM to both fields of a verification set and its summary function calculates the various errors. For speed, the turboem function from package turboEM is employed (Bobb, J. F., H. Zhao and R. Varadhan, 2012. turboEM: A Suite of Convergence Acceleration Schemes for EM and MM algorithms. R package version 2012.2-1. http://CRAN.R-project.org/package=turboEM).
Also new in this version are the functions S1, which calculates the S1 score, and ACC, which calculates the anomoly correlation between two fields (provided climatologies are supplied).
This version adds a few new methods to the mix. In particular, the spatial prediction comparison test (SPCT) proposed by Hering and Genton (2011) has been added via the functions spatMLD, fit.spatMLD and their summary and plot method functions (summary actually does the test once
In association with the above method, a variogram function is included that is a modification of vgram.matrix from the fields package called variogram.matrix, which works exaclty the same as vgram.matrix, but allows for missing values. The function vgram.matrix is a fast function for calculating the empirical variogram for matrices (i.e., appropriate when the data are gridded so that grid information can be used to calculate distances). In particular, the plot method function for vgram.matrix also works for variogram.matrix. Therefore, the SPCT can be applied to precipitation fields when grid points with zeros in the verification field, and two forecast fields are all zero. This is accomplished (automatically by spatMLD if zero.out=TRUE) by setting zero values to NA so that they are subsequently not involved in any calculations. Thus, this can be used to make variogram plots such as those proposed in Marzban and Sandgathe (2009).
Some geometric characterization measures introduced in AghaKouchak et al. (2011) are included here. These include the connectivity index (Cindex), shape index (Sindex), and area index (Aindex).
Finally, the bias-corrected versions of the threat score and Gilbert Skill Score introduced in Mesinger (2008) are included in the vxstats function.
While still a work-in-progress, this version expands considerably on the intial release. Future releases will be aimed at including all of the methods applied in the ICP, as well as some others. This version contains most of the location measures/metrics including (see Baddeley, 1992) : (i) Baddeley's binary image metric, (ii) Hausdorff metric, (iii) partial Hausdorff measure, (iv) mean error distance, (v) mean square error distance, (vi) Pratt's Figure of Merit (FOM), (vii) minimum separation between boundaries, (viii) modified Hausdorff distance, (ix) metrV (Zhu et al, 2011), and (x) the Forecast Quality Index (FQI) of Venugopal et al. (2005).
In addition to the above location metrics/measures, the wavelet denoising technique of Briggs and Levine (1997) (a neighborhood type method) is included, along with the scale separation wavelet methods of Briggs and Levine (1997), Casati et al. (2004), and Casati (2010). Functions from the waveslim package are used, and in particular it is allowed to have non-dyadic fields using the maximal overlap discrete wavelet transform (see help file for SpatialVx, and the associated functions for more information and references).
Some features-based functionality has also been added, though these functions are still in their infancy, and major changes could happen in future releases. However, there is functionality to do object identification in the manner of convolution thresholding (Davis et al., 2006) and just thresholding (both of which use the connected function from the spatstat package, which uses a very computationally efficient algorithm for labelling connected objects in a field), some merging and matching (e.g., by centroid distance ala Davis et al., 2006, and using Baddeley's delta metric ala Gilleland et al., 2008), and some analysis techniques including the SAL method (Wernli et al., 2008), as well as functions for simply computing various properties of either single objects or object pairs (e.g., axis angle, angle difference, aspect ratio, centroid, centroid distance, the various location metrics and measures mentioned above, area, intersection area, area ratio).
In addition to the methods outlined in Gilleland et al. (2009, 2010), there is also the field significance method of Elmore et al. (2006). The functions used here rely on functions from package boot.
This new version of SpatialVx relies heavily on existing R packages, and therefore has new dependencies. In particular, the location metrics/measures that rely on quickly computing distances between every point in the field, and some binary event utilize functions such as deltametric and distmap from package spatstat (A. Baddeley and R. Turner, 2005. Spatstat: an R package for analyzing spatial point patterns. Journal of Statistical Software 12 (6), 1-42. ISSN: 1548-7660. URL: www.jstatsoft.org). Many functions from this same package are also used for the features-based methods, as this package already contains many functions for doing image analysis work. Additionally, the package smatr is used for finding the major axis of features/objects (David Warton, Remko Duursma, Daniel Falster and Sara Taskinen, 2011. smatr: (Standardised) Major Axis Estimation and Testing Routines. R package version 3.2.4. http://CRAN.R-project.org/package=smatr). Package waveslim is used for all of the wavelet analysis functions (Brandon Whitcher, 2010. waveslim: Basic wavelet routines for one-, two- and three-dimensional signal processing. R package version 1.6.4. http://CRAN.R-project.org/package=waveslim). Finally, the boot package is used for the field significance approach (Angelo Canty and Brian Ripley, 2012. boot: Bootstrap R (S-Plus) Functions. R package version 1.3-4.). Thanks to Brandon Whitcher for agreeing to re-build/submit waveslim in the new R version so that it can be used here.
At present, the maps package (Original S code by Richard A. Becker and Allan R. Wilks. R version by Ray Brownrigg. Enhancements by Thomas P Minka
The initial release (version 0.1-0) included functions for doing most of the neighborhood forecast verification methods including (see Ebert, 2008): (i) Fractions Skill Score (FSS), (ii) upscaling, (iii) minimum coverage, (iv) fuzzy logic, (v) multi-event contingency table, (vi) pragmatic, and (vii) practically perfect hindcast. At this point, no functionality for handling more than one forecast time is provided, but such ability is planned for a future release.
Package dependcies for this initial release are limited to fields (Reinhard Furrer, Douglas Nychka and Stephen Sain (2012). fields: Tools for spatial data. R package version 6.6.3. http://CRAN.R-project.org/package=fields), and it is expected that many more of these functions will utilize tools from this package. Thanks to Doug Nychka for re-building/submitting this package under the newer R versions so that it could be installed with this package!