COPOD: Copula-Based Outlier Detection

Li, Zheng; Zhao, Yue; Botta, Nicola; Ionescu, Cezar; Hu, Xiyang

Li, Z., Zhao, Y., Botta, N., Ionescu, C., Hu, X. (in press): COPOD: Copula-Based Outlier Detection. - In: IEEE International Conference on Data Mining (ICDM), IEEE.

Item is Released

Basic

Item Permalink: https://publications.pik-potsdam.de/pubman/item/item_24536
Version Permalink: https://publications.pik-potsdam.de/pubman/item/item_24536_1

Genre: Book Chapter

Files

Zheng Li, Botta_etal 2020 COPOD_ Copula-Based Outlier Detection-1.pdf (Preprint), 333KB

File Permalink: -
Name: Zheng Li, Botta_etal 2020 COPOD_ Copula-Based Outlier Detection-1.pdf
Description: -
OA-Status:
Visibility: Private
MIME-Type / Checksum: application/pdf
Technical Metadata:
Copyright Date: -
Copyright Info: -
License: -

Creators

Creators:
Li, Zheng¹, Author
Zhao, Yue¹, Author
Botta, Nicola², Author
Ionescu, Cezar², Author
Hu, Xiyang¹, Author

Affiliations:
¹ External Organizations, ou_persistent22
² Potsdam Institute for Climate Impact Research, ou_persistent13

Content

Free keywords: outlier detection, anomaly detection, copula

Abstract: Outlier detection refers to the identification of rare items that deviate from the general data distribution. Existing approaches suffer from high computational complexity, low predictive capability, and limited interpretability. As a remedy, we present a novel outlier detection algorithm called COPOD, which is inspired by copulas for modeling multivariate data distributions. COPOD first constructs an empirical copula and then uses it to predict the tail probabilities of each given data point to determine its level of “extremeness”. Intuitively, we think of this as calculating an anomalous p-value. This makes COPOD parameter-free, highly interpretable, and computationally efficient. In this work, we make three key contributions: 1) we propose a novel, parameter-free outlier detection algorithm with both great performance and interpretability, 2) we perform extensive experiments on 30 benchmark datasets to show that COPOD outperforms competing methods in most cases and is also one of the fastest algorithms, and 3) we release an easy-to-use Python implementation for reproducibility.
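
The tail-probability idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified illustration that assumes each dimension's empirical CDF is used to obtain left-tail and right-tail probabilities, which are then combined as summed negative logarithms. The function names `ecdf_tail_probs` and `tail_scores` and the toy data are hypothetical, and the published algorithm may include further steps that this record does not describe.

```python
import math


def ecdf_tail_probs(column):
    """Return (left, right) empirical tail probabilities for each value.

    left[i]  = fraction of values <= column[i]
    right[i] = fraction of values >= column[i]
    Each value counts itself, so every probability is at least 1/n.
    """
    n = len(column)
    left = [sum(1 for v in column if v <= x) / n for x in column]
    right = [sum(1 for v in column if v >= x) / n for x in column]
    return left, right


def tail_scores(rows):
    """Score each row by its summed negative log tail probabilities.

    Assumption for this sketch: the score is the larger of the left-tail
    and right-tail sums, so a point that is extreme in either direction
    receives a high score.
    """
    n = len(rows)
    d = len(rows[0])
    left_sum = [0.0] * n
    right_sum = [0.0] * n
    for j in range(d):
        column = [row[j] for row in rows]
        left, right = ecdf_tail_probs(column)
        for i in range(n):
            left_sum[i] += -math.log(left[i])
            right_sum[i] += -math.log(right[i])
    return [max(l, r) for l, r in zip(left_sum, right_sum)]


# Toy data: four similar points and one point far out in both dimensions.
data = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0], [8.0, 9.0]]
scores = tail_scores(data)
most_anomalous = max(range(len(data)), key=scores.__getitem__)
print(most_anomalous, scores)
```

In this toy run the last row has the largest score, because it sits in the extreme right tail of both dimensions. The sketch needs no tuning parameters, which is in the spirit of the parameter-free property claimed in the abstract, but its quadratic-time ECDF computation is chosen for readability rather than speed.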

Details

Language(s):

Dates: Accepted: 2020-09-01

Publication Status: Accepted / In Press

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: MDB-ID: No data to archive
PIKDOMAIN: RD4 - Complexity Science
Organisational keyword: RD4 - Complexity Science
Model / method: Machine Learning
Model / method: Nonlinear Data Analysis

Degree: -

Source 1

Title: IEEE International Conference on Data Mining (ICDM)

Source Genre: Book

Creator(s):

Affiliations:

Publ. Info: IEEE

Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: -
Identifier: -