Introduction to List of R Packages
A package in the R programming language is a unit of functionality that becomes available once it is loaded into the R environment. An R package is similar to a library in C, C++ or Java. A package can bundle functions, constants, datasets, and so on, which the user can apply to a particular problem. In R, a package is loaded using the library() function. If a package is not installed, it can be installed using the install.packages() function. Packages make seemingly difficult tasks easy through their ready-made functionality.
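The load-or-install pattern described above can be sketched as follows. The helper name load_or_install is hypothetical and not part of base R; only requireNamespace(), install.packages() and library() are real base functions.

```r
# Hypothetical helper (not part of base R): load a package,
# installing it from CRAN first if it is not yet available.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so no installation is triggered here.
load_or_install("stats")
```

Checking with requireNamespace() first avoids reinstalling a package on every run.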
What are R Packages?
There are many packages in R, and the selection of a package depends on its application. Though there are certain packages that are widely used due to the functionalities they provide, it isn’t the case that other packages are less important. Different packages have different purposes; some are related to statistical techniques, some pertain to visualizations, etc.
In the following section, we will look at some of the important packages in R:
The car package is the Companion to Applied Regression. It is a large package that provides a wide range of functions for statistical analysis. It relies on other packages such as MASS, stats, and graphics. Some of the functions in the package include Anova, avPlots, Boxplot, carPalette, densityPlot, infIndexPlot, linearHypothesis, logit, outlierTest, qqPlot, residualPlots, scatterplot, scatterplotMatrix, etc. The extensive capabilities of the package can be gauged from the number of functions it provides.
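As a minimal sketch, assuming the package described above is car and that it is installed, a few of these functions can be applied to a linear model fitted on the built-in mtcars data. The choice of model and variables is only for illustration.

```r
library(car)

# Fit an ordinary linear model on a built-in dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)

# ANOVA table for the model terms (Type II tests by default)
anova_tab <- Anova(fit)
print(anova_tab)

# Bonferroni outlier test on the studentized residuals
print(outlierTest(fit))

# Quantile-comparison plot of the studentized residuals
qqPlot(fit)
```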
The corrplot package provides a graphical display of a correlation matrix and its confidence intervals. The package also provides algorithms to perform matrix reordering. Numerous options control colors, text labels, color labels, layout, etc. The method parameter of the corrplot function accepts “circle”, “square”, “ellipse”, “number”, “shade”, “color”, and “pie”. With these options, the corrplot function gives a visually appealing representation of the correlation among variables, which is otherwise difficult to interpret from a table of numbers. By default, positive correlations are displayed in blue and negative correlations in red. The intensity of the color and the size of the circle are proportional to the correlation coefficients.
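A minimal sketch of a correlation plot, assuming the corrplot package is installed. The mtcars data and the choice of hierarchical-clustering order are illustrative assumptions.

```r
library(corrplot)

# Correlation matrix of the numeric columns in a built-in dataset
M <- cor(mtcars)

# Circles sized and shaded by correlation; variables reordered by
# hierarchical clustering so related variables sit together
corrplot(M, method = "circle", order = "hclust")
```

Swapping method for "number" or "ellipse" changes only how each cell is drawn.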
The DataExplorer package deals with automated data exploration and treatment. It provides an automated data exploration process meant for analytic tasks and predictive modeling. This is crucial as it lets the user focus on understanding the data and extracting insights. The package scans and analyzes each variable in the data. Further, it visualizes these variables using typical graphical techniques. It also provides common data processing methods to treat and format data.
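The description above matches the DataExplorer package, so the following sketch assumes that package and that it is installed. The airquality dataset is used only because it ships with R and contains missing values.

```r
library(DataExplorer)

# One-row overview: row and column counts, missing values, and so on
overview <- introduce(airquality)
print(overview)

# Share of missing values per variable
plot_missing(airquality)

# Histograms of all continuous variables
plot_histogram(airquality)
```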
The gmodels package provides various tools in R for model fitting. It contains functions such as glh.test, which is used to test, print, or summarize a general linear hypothesis for a regression model. The function make.contrasts converts human-readable contrasts into the form that R requires for computation. The matrix returned by make.contrasts can be passed to the contrasts argument of model functions. The coefFrame function fits a model to each subgroup defined by its by argument, then returns a data frame with one row for each fit and one column for each parameter. The estimable function computes and tests contrasts and other estimable linear functions of model coefficients for lm, glm, etc. The function fit.contrast computes and tests arbitrary contrasts for regression objects.
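A minimal sketch of fit.contrast, assuming gmodels is installed. The data frame dat and the contrast chosen (8-cylinder versus 4-cylinder cars) are assumptions made for illustration.

```r
library(gmodels)

# Treat the number of cylinders as a three-level factor (4, 6, 8)
dat <- transform(mtcars, cyl = factor(cyl))
fit <- lm(mpg ~ cyl, data = dat)

# Contrast: mean mpg of 8-cylinder cars minus mean mpg of 4-cylinder cars
res <- fit.contrast(fit, "cyl", c(-1, 0, 1))
print(res)
```

The estimate should equal the difference between the two group means, which can be checked with tapply(dat$mpg, dat$cyl, mean).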
The gplots package provides visualization functionality through a variety of programming tools. The functions in the package work on the concept of calculation and plotting. The graphical capabilities of the package are demonstrated by functions such as bandplot, boxplot2, col2hex, ci2d, hist2d, textplot, sinkplot, balloonplot, plotCI, plotmeans, etc. These functions give control over color, text, and other intricate graphical aspects of a visualization. They also deal with the complex elements involved in statistics-based visualization, e.g. the lmplot2 and residplot functions let the user derive a detailed regression diagnosis through diagnostic plots. If multiple datasets need to be plotted in the same region, but with separate axes, this is possible using the overplot function in the package.
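A short sketch of two of these functions, assuming the package described above is gplots and that it is installed. The dataset and axis labels are illustrative.

```r
library(gplots)

# Convert color names to hexadecimal RGB strings
hex <- col2hex(c("red", "blue"))
print(hex)

# Group means with confidence intervals for each cylinder count
plotmeans(mpg ~ factor(cyl), data = mtcars,
          xlab = "Cylinders", ylab = "Mean mpg")
```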
ggplot2 is one of the best-known packages in R. It provides extensive visual capabilities and can present the results of even complex statistical and mathematical techniques. The numerous functions provided by the package enable the analyst to derive insights from data in an intuitive fashion. The R description for the package is “a system for declaratively creating graphics which is based on the Grammar of Graphics”. This grammar of graphics means that the user tells ggplot2 how variables are mapped to aesthetics and which graphical elements to use, and ggplot2 takes care of the details.
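The idea of mapping variables to aesthetics can be sketched as follows, assuming ggplot2 is installed. The dataset and the particular mappings are chosen only for illustration.

```r
library(ggplot2)

# Map weight to x, fuel economy to y, and cylinder count to color,
# then ask for points as the graphical element
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

Replacing geom_point() with another geom changes the drawing while the mappings stay the same.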
The lubridate package makes it easier to work with dates and times in R. It parses strings and numbers into date-time objects, and its parse functions handle a wide variety of formats and separators, which simplifies the parsing process. One notable feature is that the package provides functions to handle dates in different time zones.
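A minimal sketch of parsing and time-zone handling, assuming lubridate is installed. The example dates and the target time zone are arbitrary.

```r
library(lubridate)

# The same date written with different orders and separators
d1 <- ymd("2021-03-15")
d2 <- dmy("15/03/2021")
d3 <- mdy("03.15.2021")

# A UTC timestamp shown as the same instant in another time zone
utc   <- ymd_hms("2021-03-15 12:00:00", tz = "UTC")
tokyo <- with_tz(utc, tzone = "Asia/Tokyo")
print(tokyo)
```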
Named Harrell Miscellaneous, the Hmisc package contains many functions that can be leveraged for data analysis, high-level graphics and utility operations. It also includes functions for computing sample size and power, importing and annotating datasets, imputing missing values, building advanced tables, clustering variables, manipulating character strings, converting R objects to HTML code, etc.
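As a small illustration of missing-value imputation and data description, assuming Hmisc is installed. The vector x is made up for the example.

```r
library(Hmisc)

# Replace the missing value with the mean of the observed values
x <- c(1, NA, 3)
filled <- impute(x, mean)
print(filled)

# Compact summary of a single variable
print(describe(mtcars$mpg))
```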
The lattice package offers a high-level data visualization system inspired by Trellis graphics, with an emphasis on multivariate data. Some of the notable functions in the package are cloud and wireframe, which produce 3D scatter plots and wireframe surface plots; level.colors, which computes false colors representing a numeric or categorical variable; levelplot and contourplot, which generate level plots and contour plots; tmd, which generates a Tukey mean-difference plot; and oneway, which fits a one-way model. The corresponding help topics are named B_07_cloud, D_level.colors, B_06_levelplot, B_09_tmd and B_11_oneway, and the topic A_01_Lattice gives an overview of the package's graphical capabilities. The package, thus, provides extensive visualization functionality through these functions.
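A minimal sketch of a few of these plot types, assuming the package described above is lattice (it ships with standard R distributions). The datasets are built in and the variable choices are illustrative.

```r
library(lattice)

# Conditioned scatter plot: one panel per cylinder count
p1 <- xyplot(mpg ~ wt | factor(cyl), data = mtcars, layout = c(3, 1))
print(p1)

# Level plot of a numeric matrix
p2 <- levelplot(volcano)
print(p2)

# 3D scatter plot
p3 <- cloud(mpg ~ wt * hp, data = mtcars)
print(p3)
```

Lattice functions return "trellis" objects, which are drawn when printed.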
The MatrixModels package allows modeling with sparse and dense ‘Matrix’ matrices. To accomplish this it uses modular prediction and response module classes. Notable functions include lm.fit.sparse, a fitter function for sparse linear models; solveCoef, which solves for the coefficients or the coefficient increment; model.Matrix, which constructs possibly sparse design or model matrices; and glm4, which fits generalized linear models.
The multcomp package allows for multiple comparisons of k groups in generalized linear models. A list of nine standard procedures, viz. Dunnett, Tukey, Sequen, AVE, Changepoint, Williams, Marcus, McDermott, and Tetrade, is provided, and the user selects the comparisons based on the requirement. In addition, a free input interface is provided for the contrast matrix, which allows for special comparisons. The noteworthy feature is that the comparisons themselves are not restricted to any particular design, such as balanced or simple; rather, the programs are designed to suit multiple comparisons within the general linear model, which allows for covariates, correlated means, missing values, etc.
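The description above appears to match the multcomp package, so the sketch below assumes multcomp and that it is installed. It runs Tukey all-pairs comparisons with the glht interface of current releases; the data frame dat is an assumption made for the example.

```r
library(multcomp)

# One-way layout: fuel economy by number of cylinders (3 groups)
dat <- transform(mtcars, cyl = factor(cyl))
fit <- aov(mpg ~ cyl, data = dat)

# All pairwise (Tukey) comparisons between the three groups
cmp <- glht(fit, linfct = mcp(cyl = "Tukey"))
print(summary(cmp))
```

Passing "Dunnett" instead compares each group against the first level only.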
The OpenMx package deals with extended structural equation modeling. It provides functions to create structural equation models, which can then be generated and manipulated programmatically. The models may be specified with matrices or paths, in LISREL or RAM form. Example model types include multiple group, confirmatory factor, mixture distribution, categorical threshold, differential equation models, etc.
The plyr package is an important package for data manipulation. It provides tools for splitting, applying and combining data, which together solve a common set of problems. For example, we may need to break a big task into smaller, manageable pieces, operate on each piece, and finally put all the pieces back together.
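The split-apply-combine idea can be sketched as follows, assuming the package described above is plyr and that it is installed. The grouping variable and summary columns are illustrative.

```r
library(plyr)

# Split mtcars by cylinder count, summarise each piece,
# and combine the results into one data frame
by_cyl <- ddply(mtcars, .(cyl), summarise,
                mean_mpg = mean(mpg),
                n = length(mpg))
print(by_cyl)
```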
The qcc package is significant for the quality analysis functions it provides. It provides Shewhart quality control charts for continuous, attribute and count data. Other important charts are Cusum and EWMA charts and operating characteristic curves. It also offers process capability analysis. Pareto charts, cause-and-effect charts and multivariate control charts are further useful tools provided by the package.
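A minimal sketch of a Shewhart x-bar chart, assuming the package described above is qcc and that it is installed. The measurements are simulated, not real process data.

```r
library(qcc)

# Simulated process data: 20 subgroups (rows) of 5 measurements each
set.seed(1)
x <- matrix(rnorm(100, mean = 10, sd = 0.5), ncol = 5)

# Shewhart x-bar chart of the subgroup means
chart <- qcc(x, type = "xbar")
print(chart$center)   # center line: the grand mean
print(chart$limits)   # lower and upper control limits
```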
As the name suggests, the randomForest package is used to build random forest models. The package implements Breiman’s random forest algorithm, based on Breiman and Cutler’s original FORTRAN code. The algorithm is used for classification and regression. The package can also be used in unsupervised mode to assess proximities among data points.
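A minimal classification sketch, assuming the package described above is randomForest and that it is installed. The dataset, seed and number of trees are arbitrary choices.

```r
library(randomForest)

# Classification forest predicting species from the four measurements
set.seed(42)
rf <- randomForest(Species ~ ., data = iris, ntree = 200)
print(rf)

# Variable importance and predictions for a few rows
print(importance(rf))
pred <- predict(rf, head(iris))
print(pred)
```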
The psych package is meant for a special purpose. It provides procedures for psychological, psychometric, and personality research. Its functions are primarily for multivariate analysis using various multivariate statistical techniques.
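As a small illustration, assuming the package described above is psych and that it is installed, its describe function gives per-variable descriptive statistics. The columns picked from mtcars are only an example.

```r
library(psych)

# Descriptive statistics (n, mean, sd, skew, and so on) per variable
items <- mtcars[, c("mpg", "hp", "wt")]
stats_tab <- psych::describe(items)
print(stats_tab)
```

The psych:: prefix avoids a clash with the describe function in Hmisc when both packages are loaded.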