The Violin Plot shows the probability density of the data at different values, and it is quite similar to the Matplotlib Box Plot.
- These plots are essentially a combination of a Box Plot and a kernel density plot (a smoothed Histogram).
- A violin plot usually shows the distribution, median, and interquartile range of the data.
- The interquartile range and median are the statistics that a box plot provides, whereas the shape of the distribution is what a histogram would provide.
- Like Box plots, violin plots are also used to compare the distribution of a variable across different “categories”.
- Violin plots are more informative than Box plots because they show the full distribution of the data.
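To make the last point concrete, here is a minimal sketch that draws a box plot and a violin plot of the same sample side by side. The bimodal sample and the helper name compare_box_and_violin are assumptions made up for this illustration; only boxplot() and violinplot() come from Matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np


def compare_box_and_violin(seed=0):
    """Draw a box plot and a violin plot of the same bimodal sample.

    Hypothetical helper for illustration; it is not part of Matplotlib.
    """
    rng = np.random.default_rng(seed)
    # Illustrative data: two normal clusters joined into one sample.
    sample = np.concatenate([rng.normal(40, 5, 300), rng.normal(70, 5, 300)])

    fig, (ax_box, ax_violin) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))

    # The box plot summarizes the sample with its median and quartiles only.
    ax_box.boxplot(sample)
    ax_box.set_title("Box plot")

    # The violin plot adds a kernel density estimate of the same sample.
    parts = ax_violin.violinplot(sample, showmedians=True)
    ax_violin.set_title("Violin plot")

    return fig, parts


if __name__ == "__main__":
    compare_box_and_violin()
    plt.show()
```

With data like this, the box plot should look like a single wide box, while the violin should show two separate bulges, one for each cluster.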
Here is a figure showing common components of the Box Plot and Violin Plot:
Creation of the Violin Plot
The violinplot() method is used to create a violin plot.
The syntax required for the method is as follows:
violinplot(dataset, positions=None, vert=True, widths=0.5, showmeans=False, showextrema=True, showmedians=False, quantiles=None, points=100, bw_method=None, *, data=None)
Parameters
The parameters of this function are described below:
- dataset: The input data, given as an array or a sequence of vectors.
- positions: Sets the positions of the violins; the ticks and limits are set automatically to match the positions. It is array-like, with the default [1, 2, …, n].
- vert: A boolean value. If it is True (the default), a vertical violin plot is created; otherwise, a horizontal one is created.
- showmeans: A boolean value, False by default. If it is True, the means are rendered.
- showextrema: A boolean value, True by default. If it is True, the extrema are rendered.
- showmedians: A boolean value, False by default. If it is True, the medians are rendered.
- quantiles: An array-like value, None by default. If it is not None, it gives a list of floats in the interval [0, 1] for each violin; these are the quantiles that will be rendered for that violin.
- points: A scalar, 100 by default, that defines the number of points at which each Gaussian kernel density estimate is evaluated.
- bw_method: The method used to calculate the estimator bandwidth, for which there are several options. The default is Scott's Rule ('scott'), but you can also choose 'silverman', a scalar constant, or a callable.
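As a rough illustration of how these parameters fit together, the sketch below passes several of them in a single call. The sample data, the group labels, and the helper name violin_with_options are assumptions chosen for this example; the keyword arguments are the ones described above.

```python
import matplotlib.pyplot as plt
import numpy as np


def violin_with_options(seed=10):
    """Draw two violins using several optional violinplot() parameters.

    Hypothetical helper for illustration; it is not part of Matplotlib.
    """
    rng = np.random.default_rng(seed)
    # Illustrative data: two normally distributed groups.
    data = [rng.normal(100, 10, 200), rng.normal(120, 25, 200)]

    fig, ax = plt.subplots()
    parts = ax.violinplot(
        data,
        positions=[1, 3],                      # x positions of the violins
        showmeans=True,                        # draw a line at each mean
        showmedians=True,                      # draw a line at each median
        showextrema=True,                      # draw the min and max
        quantiles=[[0.25, 0.75], [0.1, 0.9]],  # one list per violin
        points=200,                            # KDE evaluation points
        bw_method="silverman",                 # bandwidth rule
    )
    ax.set_xticks([1, 3])
    ax.set_xticklabels(["group A", "group B"])
    return fig, parts


if __name__ == "__main__":
    violin_with_options()
    plt.show()
```

The call returns a dictionary of the drawn artists (for example "bodies", "cmeans", "cmedians", and "cquantiles"), which can be used to restyle the plot afterwards.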
Now it's time to dive into some examples to make these concepts clear:
Violin Plot Basic Example:
Below is a simple example that creates violin plots for four different collections of data.
import matplotlib.pyplot as plt
import numpy as np

# Fix the seed so that the random samples are reproducible
np.random.seed(10)

# Four normally distributed collections: normal(mean, std, size)
collectn_1 = np.random.normal(120, 10, 200)
collectn_2 = np.random.normal(150, 30, 200)
collectn_3 = np.random.normal(50, 20, 200)
collectn_4 = np.random.normal(100, 25, 200)
data_to_plot = [collectn_1, collectn_2, collectn_3, collectn_4]

# Use subplots() so the tick labels stay inside the figure area
fig, ax = plt.subplots()
# Draw one violin per collection
vp = ax.violinplot(data_to_plot)
plt.show()
The output will be as follows: