Google News
logo
Data Analyst - Interview Questions
What are the ways to detect outliers? Explain different ways to deal with it.
Outliers can be detected in several ways, including visual methods and statistical techniques :

* Box Plots : A box plot (or box-and-whisker plot) can help you visually identify outliers. Points that are located outside the whiskers of the box plot are often considered outliers.

* Scatter Plots : These can be useful for spotting outliers in multivariate data.

* Z-Scores : Z-scores measure how many standard deviations a data point is from the mean. A common rule of thumb is that a data point is considered an outlier if its z-score is greater than 3 or less than -3.

* IQR Method : The interquartile range (IQR) method identifies as outliers any points that fall below the first quartile minus 1.5 times the IQR or above the third quartile plus 1.5 times the IQR.

* DBSCAN Clustering : Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm, which can be used to detect outliers in the data.

Explain a hash table :

A hash table, also known as a hash map, is a data structure that implements an associative array abstract data type, a structure that can map keys to values. Hash tables use a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Hash tables are widely used because they are efficient. In a well-dimensioned hash table, the average cost (in terms of time complexity) for each lookup is independent of the number of elements stored in the table. Many programming languages have built-in support for hash tables, including Python (dictionaries), JavaScript (objects), and Java (HashMap).
Advertisement