Data Quality: Detection and Management of Outliers

Abstract

Tracking data can be affected by many kinds of errors at different steps of data acquisition and processing. Erroneous data can heavily affect analysis, leading to biased inference and misleading wildlife management and conservation recommendations. Data quality assessment is therefore a key step in data management. This chapter focuses on biased locations, or ‘outliers’. Some incorrect data are evident, but in many situations locations cannot be clearly identified as outliers: although they are suspicious (e.g. long distances covered by animals in a short time or repeated extreme values), they might still be correct, which leaves a margin of uncertainty. The chapter identifies different potential errors and proposes a general approach to managing outliers that tags records rather than deleting them. Following this approach, practical methods to find and mark errors are illustrated on the database created in Chaps. 2, 3, 4, 5, 6 and 7.

Publication
Spatial Database for GPS Wildlife Tracking Data (eds Urbano F. & Cagnacci F.)
Date
2014

Reference: Urbano F., Basille M. & Cagnacci F. (2014) Data Quality: Detection and Management of Outliers. In Spatial Database for GPS Wildlife Tracking Data (eds Urbano F. & Cagnacci F.), Springer International Publishing, Switzerland, pp. 115–137. DOI: 10.1007/978-3-319-03743-1_8