---
title: "Comparison of CoordinateCleaner to other tools"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Comparison of CoordinateCleaner to other tools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

##Background
Erroneous database entries and problematic geographic coordinates are a central issue in biogeography and there is a set of tools available to address different dimensions of the problem. CoordinateCleaner focuses on the fast and reproducible flagging of large amounts of records, and additional functions to detect dataset-level and fossil-specific biases. In the R-environment the [scrubr](https://github.com/ropensci-archive/scrubr) and [biogeo](https://CRAN.R-project.org/package=biogeo) offer cleaning approaches complementary to CoordinateCleaner. The scrubr package combines basic geographic cleaning (comparable to cc_dupl, cc_zero and cc_count in CoordinateCleaner) but adds options to clean taxonomic names (See also [taxize](https://github.com/ropensci/taxize)) and date information. biogeo includes some basic automated geographic cleaning (similar to `cc_val`, `cc_count` and `cc_outl`) but rather focusses on correcting suspicious coordinates on a manual basis using environmental information.


Table 1. Function by function comparison of CoordinateCleaner, scrubr and biogeo.

| Functionality                                                 | CoordinateCleaner 2.0-2                                   | scrubr 0.1.1     |biogeo 1.0| Percent overlap                      |
|--------------------------------|--------------|------------------|------------------|--------------------|
| Missing coordinates                                           | cc_val                                                    | coord_incomplete |missingvalsexclude| 100%                                 |
| Coordinates outside CRS                                       | cc_val                                                    | coord_impossible |-| 100%                                 |
| Duplicated records                                            | cc_dupl                                                   | dedup            |duplicatesexclude| The aim is identical, methods differ |
| 0/0 coordinates                                               | cc_zero                                                   | coord_unlikely   |-| 100%                                 |
| Identical lon/lat                                             | cc_equ                                                    | -                |-| 0%                                   |
| Country capitals                                              | cc_cap                                                    | -                |-| 0%                                   |
| Political unit centroids                                      | cc_cen                                                    | -  |-| 0%                                   |
| Coordinates  in-congruent with additional location information | cc_count                                                  | coord_within    |errorcheck, quickclean| 100%                                 |
| Coordinates assigned to GBIF headquarters                      | cc_gbif                                                   | -                |-| 0%                                   |
| Coordinates assigned to the location of biodiversity institutions  |    cc_inst                                     | -                |-| 0%                                   |
| Coordinates outside natural range                             | cc_iucn                                                  |-                 |-| 0%                                   |
| Spatial outliers                                              | cc_outl                                                   | -                |outliers| 50%, biogeo uses environmental distance                                   |
| Coordinates within the ocean                                  | cc_sea                                                    | -                |-| 0%                                   |
| Coordinates in urban area                                     | cc_urb                                                    | -                |-| 0%                                   |
| Coordinate conversion error                                   | dc_ddmm                                                   | -                |-| 0%                                   |
| Rounded coordinates/rasterized collection                    | dc_round                                                  | -                |precisioncheck| 20%, biogeo test for predefined rasters                                   |
| Fossils: invalid age range                                    | tc_equal                                                  | -                |-| 0%                                   |
| Fossils: excessive age range                                  | tc_range                                                  | -                |-| 0%                                   |
| Fossils: temporal outlier                                     | tc_outl                                                   | -                |-| 0%                                   |
| Fossils: PyRate interface                                     | WritePyrate                                               | -                |-| 0%                                   |
| Wrapper functions to run all test                             | CleanCoordinates, CleanCoordinatesDS, CleanCoordinatesFOS | -                |-| 0%                                   |
| Database of biodiversity institutions                          | institutions                                              | -                |-| 0%                                   |
| Taxonomic cleaning                                            | -                                                         | tax_no_epithet   |-| 0%                                   |
| Missing date                                                  | -                                                         | date_missing     |-| 0%                                   |
| Add date                                                      | -                                                         | date_create      |-| 0%                                   |
| Date format                                                   | -                                                         | date_standardize |-| 0%
| Reformatting coordinate annotation |-|-|a large set of functions|0 %|
| Correcting coordinates using guessing and environmental distance |-|-|a large set of functions|0 %|