Currently implemented data quality checks
A list of available checks is shown below:
Check | Description | Flag | Arguments | Check function | Target |
---|---|---|---|---|---|
Missing values | Checks for missing values in the data. | missing | completeness* | missing_values, missing_values_data* | Constant, series** and dataseries* |
Outlier values | Checks for outlier values in the data. | outliers | outliers_method, outliers_nstd, outliers_niqr | outlier_values | Constant and data |
Series range | Checks if the series is inside a range. | series_range | series_range_values | series_range | Series |
Series monotony | Checks if the series is monotonically increasing. | series_monotony | | series_monotony | Series |
Series increment type | Checks the series increment type. | series_increment | series_increment_type | series_increment_type | Series |
* The completeness argument is only used for dataseries, which call missing_values_data.
** The missing values check is always applied to series values, because missing values in the series dimension have to be removed before the other checks are run.
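The series checks listed above can be illustrated with a minimal sketch in plain Python. The function names below (`drop_missing`, `in_range`, `is_monotonically_increasing`, `has_linear_increment`) are hypothetical and are not the library's check functions. The sketch also assumes that "monotonically increasing" means strictly increasing and that a linear increment means a constant step between consecutive values; the library's actual criteria may differ.

```python
import math


def drop_missing(series):
    """Remove None and NaN entries so later checks only see valid values."""
    return [v for v in series if v is not None and not math.isnan(v)]


def in_range(series, value_range=(-math.inf, math.inf)):
    """Return True if every series value lies within [minimum, maximum]."""
    minimum, maximum = value_range
    return all(minimum <= v <= maximum for v in series)


def is_monotonically_increasing(series):
    """Return True if each value is greater than the previous one.

    Assumption: strict increase; equal neighbours count as a failure.
    """
    return all(b > a for a, b in zip(series, series[1:]))


def has_linear_increment(series, rel_tol=1e-9):
    """Return True if consecutive values differ by a constant step."""
    steps = [b - a for a, b in zip(series, series[1:])]
    return all(math.isclose(s, steps[0], rel_tol=rel_tol) for s in steps)


# Missing series values are removed first, then the other checks are run.
depth = [0.0, 0.5, None, 1.0, 1.5]
clean = drop_missing(depth)

print(in_range(clean, (0.0, 10.0)))         # True
print(is_monotonically_increasing(clean))   # True
print(has_linear_increment(clean))          # True
```

Running the missing values step first mirrors the note above: a `None` or NaN left in the series would make the range and increment comparisons meaningless.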
Information about each check argument is shown in the table below:
Argument | Check | Description | Possible values | Default |
---|---|---|---|---|
completeness | Missing values | If set to ‘any’, the check fails if any data value is missing for any series value. If set to ‘all’, the check fails only if all the data values are missing for a given series value (column). It only has an effect when the data is a matrix (2 or more dimensions). | ‘any’ or ‘all’ | ‘any’ |
outliers_method | Outlier values | The method to be used. Can be ‘std’ for the standard deviation method or ‘iqr’ for the interquartile range method. | ‘std’ or ‘iqr’ | ‘std’ |
outliers_nstd | Outlier values | For the ‘std’ method, the number of standard deviations used to define outliers. | float > 0 | 2 |
outliers_niqr | Outlier values | For the ‘iqr’ method, the number of interquartile ranges used to define outliers. | float > 0 | 1.5 |
series_range_values | Series range | The minimum and maximum value of the series. | [float, float] | [-inf, inf] |
series_increment_type | Series increment type | The series distribution. If ‘linear’, the check verifies that the series increments linearly. | ‘linear’ | ‘linear’ |
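To make the outlier and completeness arguments more concrete, here is a rough sketch using only the Python standard library. It assumes the conventional definitions: the ‘std’ method flags values further than `nstd` standard deviations from the mean, and the ‘iqr’ method flags values below Q1 - `niqr` * IQR or above Q3 + `niqr` * IQR. The functions `find_outliers` and `passes_missing_check`, and the list-of-rows data layout, are hypothetical and only illustrate the idea; the library's own implementation may compute these differently.

```python
import math
import statistics


def find_outliers(values, method="std", nstd=2.0, niqr=1.5):
    """Return the indices of the values flagged as outliers."""
    if method == "std":
        center = statistics.fmean(values)
        spread = statistics.pstdev(values)  # population standard deviation
        lower, upper = center - nstd * spread, center + nstd * spread
    elif method == "iqr":
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - niqr * iqr, q3 + niqr * iqr
    else:
        raise ValueError(f"unknown outliers method: {method!r}")
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def _is_missing(value):
    return value is None or math.isnan(value)


def passes_missing_check(data, completeness="any"):
    """Check a 2D data set given as a list of rows.

    Assumption: each column holds the data for one series value.
    """
    columns = list(zip(*data))
    if completeness == "any":
        # Fail as soon as a single value is missing anywhere.
        return not any(_is_missing(v) for col in columns for v in col)
    if completeness == "all":
        # Fail only if some column has no valid value at all.
        return not any(all(_is_missing(v) for v in col) for col in columns)
    raise ValueError(f"unknown completeness: {completeness!r}")


readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_outliers(readings, method="std"))  # [7]
print(find_outliers(readings, method="iqr"))  # [7]

data = [[1.0, None, 3.0],
        [4.0, 5.0, float("nan")]]
print(passes_missing_check(data, "any"))  # False
print(passes_missing_check(data, "all"))  # True
```

In this example every column still has at least one valid value, so the data passes with ‘all’ but fails with ‘any’.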