Correct Answer : Both (A) and (B)
Explanation : R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinformaticians and statisticians for data analysis and developing statistical software. Users have created packages to augment the functions of the R language.
Correct Answer : August 1993
Explanation : R was initially written by Ross Ihaka and Robert Gentleman at the Department of Statistics of the University of Auckland in Auckland, New Zealand. R made its first appearance in August 1993.
Correct Answer : All of the other answers are correct
Correct Answer : # This is a comment
Correct Answer : x <- 5
Correct Answer : paste()
Correct Answer : break
Correct Answer : dim()
Correct Answer : while (x < y)
Correct Answer : if (x > y)
Correct Answer : cbind()
Correct Answer : plot()
Correct Answer : var1 <- var2 <- var3 <- "Orange"
Correct Answer : The + sign
Correct Answer : numeric
Correct Answer : C
Explanation : R allows integration with the procedures written in the C, C++, .Net, Python or FORTRAN languages for efficiency.
Correct Answer : GNU S
Explanation : R is free software distributed under a GNU-style copyleft, and an official part of the GNU project called GNU S.
Correct Answer : All of the above
Correct Answer : 4095
Explanation : Elementary commands can be grouped together into one compound expression by braces (‘{’ and ‘}’).
Correct Answer : S
Explanation : The R language is a dialect of S which was designed in the 1980s. Since the early 90’s the life of the S language has gone down a rather winding path. The scoping rules for R are the main feature that makes it different from the original S language.
Correct Answer : 6
Explanation : R has 6 atomic data types: logical, integer, real (double), complex, string (or character) and raw. Raw objects are not commonly used directly in data analysis.
Correct Answer : ==
Correct Answer : sqrt()
Correct Answer : Find the number of characters in the str string
Correct Answer : my_function <- function()
Correct Answer : my_function()
Correct Answer : data.frame()
Correct Answer : fruits <- list("banana", "apple", "orange")
Correct Answer : fruits <- c("banana", "apple", "orange")
Correct Answer : in
Correct Answer : Scalar variable
Explanation : A scalar variable is a variable which holds one value at a time. It is a single component which assumes a range of number or string values. A scalar value is associated with every point in space.
Correct Answer : mapply()
Explanation : The mapply() function can be used to automatically “vectorize” a function. What this means is that it can be used to take a function that typically only takes single arguments and create a new function that can take vector arguments.
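As a rough illustration of the vectorizing behaviour described above, the sketch below uses a hypothetical helper, repeat_value(), that is written for single arguments, and lets mapply() call it once per pair of elements. The helper name and the inputs are assumptions made for this example only.

```r
# Hypothetical helper that expects one value and one count
repeat_value <- function(value, times) {
  rep(value, times)
}

# mapply() calls repeat_value(1, 3), repeat_value(2, 2) and repeat_value(3, 1)
result <- mapply(repeat_value, 1:3, 3:1)

# The pieces have different lengths, so the result stays a list
print(result)
```

Because the three results have different lengths, mapply() cannot simplify them into a vector or matrix and returns a list instead.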
Correct Answer : Statistics, Probability, Distributions
Explanation : R has many functions for all types of mathematical objects. For example, Statistics, Probability, Distributions like Multivariate, Continuous, Simple, Discrete etc.
> x <- 1
> print(x)
Correct Answer : 1
Explanation : When a complete expression is entered at the prompt, it is evaluated and the result of the evaluated expression is returned.
> x <- 5 > x
> x <- 5 > print(x)
> x <- "auto" > x
> x <- "auto" > x <- "auto"
Correct Answer : > x <- 5 > print(x)
Explanation : The print() function is used for outputting the value.
Correct Answer : source("commands.R")
Explanation : For Windows, Source is also available on the File menu.
Correct Answer : sink
Explanation : sink() restores it to the console once again.
Correct Answer : workspace
Explanation : All objects created during an R session can be stored permanently in a file for use in future R sessions.
x <- c(4, 5, 1, 2, 3, 3, 4, 4, 5, 6)
x <- as.factor(x)
Correct Answer : x becomes a factor
Explanation : Factors are used to represent categorical data and can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label. Factors are important in statistical modelling and are treated specially by modelling functions like lm() and glm().
paste("Everybody", "is", "a", "warrior")
Correct Answer : “Everybody is a warrior”
Explanation : paste() returns, and cat() prints to the console, text formed by combining multiple character vectors together. It is impossible for those functions to know in advance how many character vectors will be passed to the function by the user.
cat("Everybody", "is", "a", "warrior", sep="*")
Correct Answer : Everybody*is*a*warrior
Explanation : paste() returns, and cat() prints to the console, text formed by combining multiple character vectors together. It is impossible for those functions to know in advance how many character vectors will be passed to the function by the user.
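A minimal sketch of the two functions side by side, assuming nothing beyond base R. It shows that paste() hands back a string while cat() writes to the console, and that both accept any number of arguments.

```r
# paste() returns a character vector; nothing is printed unless we ask
joined <- paste("Everybody", "is", "a", "warrior")
print(joined)

# cat() writes the pieces straight to the console and returns NULL
shown <- cat("Everybody", "is", "a", "warrior", sep = "*")
cat("\n")

# Both take a variable number of arguments
longer <- paste("a", "b", "c", "d", "e", sep = "-")
print(longer)
```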
Sys.Date()
Correct Answer : Present date
Explanation : Sys.time() and Sys.Date() return the system’s idea of the current date, with and without the time respectively. Sys.time() returns an absolute date-time value which can be converted to various time zones and may return different days. Sys.Date() returns the current day in the current time zone.
Sys.time()
Correct Answer : Present date and time
Explanation : Sys.time() returns the present date-time value, which can be converted to various time zones and may return different days. Sys.time() and Sys.Date() return the system’s idea of the current date with and without the time.
Correct Answer : 5
Explanation : The most basic type of R object is a vector.
Correct Answer : double
Explanation : This means that even if you see a number like “1” or “2” in R, which you might think of as integers, they are likely represented behind the scenes as numeric objects something like “1.00” or “2.00”.
Correct Answer : attributes()
Explanation : Not all R objects contain attributes, in which case the attributes() function returns NULL.
Correct Answer : c()
Explanation : The simplest such structure is the numeric vector, which is a single entity consisting of an ordered collection of numbers.
> x <- vector("numeric", length = 10)
> x
Correct Answer : 0 0 0 0 0 0 0 0 0 0
Explanation : You can also use the vector() function to initialize vectors.
> x <- 6
> class(x)
Correct Answer : “numeric”
Explanation : class() is used to determine the class (data type) of the variable.
> sqrt(-17)
Correct Answer : NaN
Explanation : These metadata can be very useful in that they help to describe the object.
Correct Answer : > v <- 2*x + y + 1
Explanation : The elementary arithmetic operators are the usual +, -, *, / and ^ for raising to a power.
Correct Answer : sort()
Explanation : There are other more flexible sorting facilities available like order() or sort.list() which produce a permutation to do the sorting.
> x <- 0:6
Correct Answer : as.character(x)
Explanation : as.character(x) converts the numbers from 0 to 6 into character strings.
Correct Answer : > m <- matrix(nrow = 2, ncol = 3)
Explanation : Matrix Entries can be thought of starting in the “upper left” corner and running down the columns.
> m <- matrix(nrow = 2, ncol = 3)
> dim(m)
Correct Answer : 2 3
Explanation : Matrices are constructed column-wise.
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
[,1] [,2] [,3] [1,] 1 3 5 [2,] 2 4 6
[,0] [,1] [,2] [1,] 1 3 5 [2,] 2 4 6
[,1] [,2] [,3] [1,] 1 3 6 [2,] 2 4 5
[,5] [,6] [,7] [1,] 1 3 6 [2,] 2 4 5
Correct Answer :
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Explanation : Matrices can also be created directly from vectors by adding a dimension attribute.
> m <- 1:10
> m
Correct Answer : 1 2 3 4 5 6 7 8 9 10
Explanation : Matrices can be created by column-binding.
a <- c(1,2,5.3,6,-2,4)
Correct Answer : Numeric
Explanation : The labels are always character, irrespective of whether the input vector is numeric, character, Boolean, etc. The nlevels() function gives the count of levels. Factors are created using the factor() function.
Correct Answer : Same
Explanation : All columns in a matrix must have the same mode (numeric, character, etc.) and also the same length. byrow=TRUE indicates that the matrix should be filled by rows. byrow=FALSE indicates that the matrix should be filled by columns (the default).
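The effect of byrow can be sketched with the same six values filled both ways. The variable names by_col and by_row are made up for this example.

```r
# Default (byrow = FALSE): values run down the columns
by_col <- matrix(1:6, nrow = 2, ncol = 3)

# byrow = TRUE: values run across the rows
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

print(by_col)
print(by_row)
```

With the default, the first row holds 1, 3, 5; with byrow = TRUE it holds 1, 2, 3.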
Correct Answer : Arrays
Explanation : Arrays are similar to matrices but can have more than two dimensions. See help(array) for details. Factors are created using the factor() function. We can identify elements of a list using the [[]] convention.
Correct Answer : Lists
Explanation : An ordered collection of objects is called a list. A list allows you to gather a variety of (possibly unrelated) objects under one name. We can identify elements of a list using the [[]] convention.
Correct Answer : Factor
Explanation : The factor stores the nominal values as a vector of integers in the range [1… k] (where k is the no. of unique values of the nominal variable), and an internal vector of character strings (the original values) mapped to these integers.
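A small sketch of the storage described above, using a made-up vector of colour names. as.integer() exposes the integer codes and levels() exposes the character labels they map to.

```r
colours <- factor(c("red", "green", "red", "blue"))

# Internal integer codes in the range 1..k
codes <- as.integer(colours)

# The k unique labels, sorted alphabetically by default
labels <- levels(colours)

print(codes)
print(labels)

# Mapping the codes back through the labels recovers the original values
print(labels[codes])
```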
Correct Answer : Data types
Explanation : Based on the data type of a variable, the OS allocates the memory and decides what can be stored on the reserved memory. This means that when you create a variable you reserve some space in memory.
Correct Answer : Array
Explanation : The array function takes a dim attribute which creates the required number of dimensions. While matrices are confined to two dimensions, arrays could be of any number of dimensions.
Correct Answer : Function()
Explanation : Factors are created using the factor() function. The labels are always character, irrespective of whether the vector is numeric, character, Boolean, etc. The nlevels() function gives the count of levels.
Correct Answer : Rownames()
Explanation : Data frames can have additional attributes such as rownames(), which can be useful for annotating data, like subject_id or sample_id. But most of the time they are not used. A data frame is an important data type in R.
Explanation : The four most frequently used types of data objects in R are vectors, matrices, data frames and lists. A list is a generalisation of a vector and represents a collection of data objects.
Correct Answer : Seq
Explanation : The rep function replicates elements of vectors. The seq function creates a regular sequence of values to form a vector. The four most frequently used types of data objects in R are vectors, matrices, data frames and lists.
Correct Answer : Scan
Explanation : The scan function is used to enter data at the terminal. This is useful for small datasets but tiresome for entering in large datasets.
Correct Answer : Rbind
Explanation : To bind a row onto an already existing matrix, the rbind function can be used. The scan function is used to enter data at the terminal.
Correct Answer : Iris
Explanation : The iris dataset is a three dimensional dataset. One dimension is represented for each species: Setosa, Versicolor and Virginica.
Correct Answer : Indexing
Explanation : Accessing elements is achieved through a process called indexing. Indexing may be done by a vector of positive integers to indicate inclusion or negative integers to indicate exclusion.
Correct Answer : Data frames
Explanation : Data frames can be indexed by either row or column using a specific name (that corresponds to either the row or column) or a number.
Correct Answer : Subsetting Commands
Explanation : To access elements with a value greater than five we can use some subsetting commands and logical operators to produce the desired result.
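The indexing and subsetting ideas above can be sketched as follows. The vector x and the data frame scores are invented for illustration.

```r
x <- c(4, 9, 2, 7, 6)

first_two <- x[c(1, 2)]        # positive integers include positions
without_first <- x[-1]         # negative integers exclude positions
greater_than_five <- x[x > 5]  # a logical test keeps matching elements

# Data frames can be indexed by row condition and by column name
scores <- data.frame(id = c("a", "b", "c"), value = c(3, 8, 6))
big_rows <- scores[scores$value > 5, ]
value_col <- scores[, "value"]

print(greater_than_five)
print(big_rows)
```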
Correct Answer : List
Explanation : Lists can be created using the list function. Like data frames, they can incorporate a mixture of modes into the one list and each component can be of a different length or size.
Correct Answer : Number of the Position
Explanation : There are a number of ways of accessing the first component of a list. We can either access it through the name of that component (if names are assigned) or by using a number corresponding to the position of the component.
Correct Answer : Spline
Explanation : The spline function returns a list of densities (y) corresponding to bin values (x). These can be passed to the plot routine to produce a line graph of the density.
Correct Answer : Concat
Explanation : Joining two lists can be achieved either using the concatenation function or the append function.
Correct Answer : Equal
Explanation : The length of a list is equal to the number of components in that list. Lists can be created using the list function. Like data frames, they can incorporate a mixture of modes into the one list and each component can be of a different length or size.
Correct Answer : Extraction
Explanation : There are a number of ways of accessing the first component of a list. We can either access it through the name of that component (if names are assigned) or by using a number corresponding to the position the component corresponds to. The former approach can be performed using subsetting ([[]]) or alternatively, by the extraction operator ($).
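A brief sketch of these access routes on a hypothetical list called person:

```r
person <- list(name = "Ada", scores = c(90, 85))

by_name <- person[["name"]]   # subsetting with the component name
by_position <- person[[1]]    # subsetting with the position
by_dollar <- person$name      # extraction operator

# Single brackets return a list containing the component, not the component itself
sub_list <- person[1]

print(by_name)
print(class(sub_list))
```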
Correct Answer : copy()
Explanation : Any numbers given among the arguments are coerced into character strings in the evident way, that is, in the same way they would be if they were printed.
Correct Answer : > x <- factor(c("yes", "yes", "no", "yes", "no"))
Explanation : Factor objects can be created with the factor() function.
> x <- vector("list", length = 5)
> x
Correct Answer : NULL
Explanation : We can also create an empty list of a pre-specified length with the vector() function.
> x <- factor(c("yes", "yes", "no", "yes", "no"))
> table(x)
yes no 2 3
no yes 2 3
no yes 2 2
yes yes 6 2
Correct Answer :
 no yes
  2   3
Explanation : The order of the levels of a factor can be set using the levels argument to factor().
> x <- c(1, 2, NaN, NA, 4)
> is.na(x)
Correct Answer : FALSE FALSE TRUE TRUE FALSE
Explanation : Missing values are denoted by NA, or by NaN for undefined mathematical operations.
Correct Answer : as.matrix()
Explanation : as.matrix() function should be used to coerce a data frame to a matrix.
> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> ncol(x)
Correct Answer : 2
Explanation : Data frames are represented as a special type of list where every element of the list has to have the same length.
> m <- matrix(1:4, nrow = 2, ncol = 2)
> dimnames(m) <- list(c("a", "b"), c("c", "d"))
> m
c d a 1 3 b 2 4
c d a 1 2 b 2 3
c d a 1 3 b 4 2
Correct Answer :
  c d
a 1 3
b 2 4
Explanation : Matrices can have both column and row names.
Correct Answer : colnames(m) <- c("h", "f")
Explanation : Column names and row names can be set separately using the colnames() and rownames() functions.
Correct Answer : data <- read.table("foo.txt")
Explanation : R will automatically skip lines that begin with a #.
> y <- data.frame(a = 1, b = "a")
> dput(y, file = "y.R")
> new.y <- dget("y.R")
> new.y
b a 1 a a
a b 1 1 a
a b 2 1 a
a b 1 2 b
Correct Answer :
  a b
1 1 a
Explanation : Multiple objects can be deparsed at once using the dump function and read back in using source.
Correct Answer : Save
Explanation : The key functions for converting R objects into a binary format are save(), save.image(), and serialize().
> a <- data.frame(x = rnorm(100), y = runif(100))
> b <- c(3, 4.4, 1 / 3)
keep(a, b, file = "mydata.rda")
keep_image(a, b, file = "mydata.rda")
save(a, b, file = "mydata.rda")
save_image(a, b, file = "mydata.rda")
Correct Answer : save(a, b, file = "mydata.rda")
Explanation : You can save all objects in your workspace using the save.image() function.
Correct Answer : load("mydata.RData")
Explanation : .rda and .RData are fairly common extensions and you may want to use them because they are recognized by other software
Correct Answer : con <- file("foo.txt")
Explanation : Open is used for opening connection to ‘foo.txt’ in read-only mode.
Correct Answer : con <- gzfile("words.gz")
Explanation : For more structured text data like CSV files or tab-delimited files, there are other functions like read.csv() or read.table().
Correct Answer : data <- read.csv("foo.txt")
Explanation : Connections must be opened, then they are read from or written to, and then they are closed.
> x <- c("a", "b", "c", "c", "d", "a")
Correct Answer : x[1]
Explanation : The element which we want to extract will be in the format of variable[index value of the element] in R script.
Correct Answer : x[1:4]
Explanation : The multiple successive elements which we want to extract will be in the format of variable[index value of the start element:index value of the last element] in R script.
> x <- matrix(1:6, 2, 3)
> x[1, , drop = FALSE]
[,1] [,2] [,3] [1,] 2 3 5
[,1] [,2] [,3] [1,] 1 2 5
[,1] [,2] [,3] [1,] 1 3 5
Correct Answer :
     [,1] [,2] [,3]
[1,]    1    3    5
Explanation : By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a $1\times 1$ matrix.
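A minimal sketch of what drop = FALSE changes, using the same matrix as the question. The names as_vector and as_matrix are assumptions for this example.

```r
x <- matrix(1:6, 2, 3)

# Default behaviour: the dimension of length 1 is dropped
as_vector <- x[1, ]

# drop = FALSE keeps the result as a 1 x 3 matrix
as_matrix <- x[1, , drop = FALSE]

print(dim(as_vector))
print(dim(as_matrix))
```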
Correct Answer : Select
Explanation : One important contribution of the dplyr package is that it provides a “grammar” for data manipulation and for operating on data frames.
Correct Answer : rename
Explanation : rename is used to rename variables in a dataframe.
Correct Answer : mutate
Correct Answer : install.packages("dplyr")
Explanation : After installing the package it is important that you load it into your R session with the library() function.
Correct Answer : devtools
Explanation : The GitHub repository will usually contain the latest updates to the package and the development version.
Correct Answer : tapply()
Explanation : Functions can be passed as arguments to other functions.
Correct Answer : function()
Explanation : In particular, they are R objects of class “function”.
> f <- function() {
+     ## This is an empty function
+ }
> class(f)
Correct Answer : “function”
Explanation : Functions have their own class.
> f <- function(num) {
+     hello <- "Hello, world!\n"
+     for(i in seq_len(num)) {
+         cat(hello)
+     }
+     chars <- nchar(hello) * num
+     chars
+ }
> meaningoflife <- f(3)
> print(meaningoflife)
Correct Answer : 42
Explanation : This function returns the total number of characters printed to the console. The string "Hello, world!\n" has 14 characters, so three repetitions give 14 * 3 = 42.
Correct Answer : Import data
Explanation : R Commander is used to import data in R language. To start the R commander GUI, the user should type in the command Rcmdr into the console. There are 3 different types in which data can be imported in R language.
M <- c(3, 2, 4)
N <- c(1, 2)
Z <- M*N
Correct Answer : Z <- (3, 4, 4)
Explanation : In R, when the vectors have different lengths, the elements of the shorter vector are recycled until all the elements in the longer vector have been multiplied. Here N is reused as (1, 2, 1), so Z is (3, 4, 4), and R also warns that the longer length is not a multiple of the shorter one.
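The recycling behaviour can be checked directly. In this sketch suppressWarnings() is used only to keep the output quiet; without it R warns that the longer length is not a multiple of the shorter one.

```r
M <- c(3, 2, 4)
N <- c(1, 2)

# N is recycled to c(1, 2, 1) before the element-wise multiplication
Z <- suppressWarnings(M * N)
print(Z)

# When the longer length is a multiple of the shorter one, there is no warning
even <- c(1, 2, 3, 4) * c(10, 100)
print(even)
```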
Correct Answer : 6000
Explanation : R language has several packages for solving a particular problem. CRAN package ecosystem has more than 6000 packages.
Correct Answer : t.test()
Explanation : The t.test() function in R is used to find out whether the means of 2 groups are equal to each other. It is not used most commonly in R. It is used in some specific conditions.
Correct Answer : Transpose()
Explanation : The transpose function t() is the easiest method for reshaping the data before analysis. The transpose (reversing rows and columns) is always the simplest method of reshaping a dataset. Use the t() function to transpose a matrix or a data frame.
Correct Answer : With()
Explanation : The with() function is used to apply an expression to a given dataset. R has a large number of in-built functions and the user can create their own functions. In R, a function is an object, so the R interpreter is able to pass control to the function.
Correct Answer : rbind()
Explanation : The rbind() function can be used to add datasets in R, provided the columns in the datasets are the same. R has a large number of in-built functions and also the user can create their own functions.
Correct Answer : 8 TB
Explanation : 8TB is the memory limit for 64-bit system memory and 3GB is the limit for 32-bit system memory. A solid understanding of R’s memory management will help you predict how much memory you’ll need for a given task.
Correct Answer : 3GB
Correct Answer : By()
Explanation : The by() function is used for applying a function to each level of a factor. R has a large number of in-built functions and also the user can create their own functions. In R, a function is an object, so the R interpreter is able to pass control to the function.
Correct Answer : loglm()
Explanation : loglm() fits log-linear models by iterative proportional scaling. This function provides a front-end to the standard function, loglin(), to allow log-linear models to be specified and fitted in a manner similar to that of other fitting functions.
Correct Answer : Matrix Object
Explanation : If the function call is.matrix(X) returns TRUE, then X can be termed a matrix data object. R has a large number of in-built functions and also the user can create their own functions.
Correct Answer : Logistic Regression
Explanation : Logistic regression can be used for this, and the glm() function in R provides this functionality. Logistic regression is a statistical method for analysing a dataset in which there are one or more independent variables which determine an outcome.
Correct Answer : Glm()
Explanation : The glm() function in R provides this functionality. Logistic regression is a statistical method for analysing a dataset in which there are one or more independent variables that determine an outcome.
Correct Answer : Sample()
Explanation : The sample() function can be used to select a random sample of size ‘n’ from a huge dataset. R has a large number of in-built functions and also the user can create their own functions.
Correct Answer : Subset()
Explanation : The subset() function is used to select variables and observations from a given dataset. R has a large number of in-built functions and also the user can create their own functions.
Correct Answer : Match()
Explanation : It can be done using the match() function, which returns the first appearance of a particular element. The other way is to use %in%, which returns a Boolean value, either TRUE or FALSE.
Correct Answer : warning()
Explanation : warning is an indication that something is wrong but not necessarily fatal; execution of the function continues.
Correct Answer : Sys.Date
Explanation : The POSIXlt class stores date/time values as a list of components (hour, min, sec, mon, etc.) making it easy to extract these parts.
> printmessage <- function(x) {
+     if(x > 0)
+         print("x is greater than zero")
+     else
+         print("x is less than or equal to zero")
+     invisible(x)
+ }
> printmessage(NA)
Correct Answer : Error
Explanation : You can’t do that test if x is an NA or NaN value.
> lm <- function(x) { x * x }
> lm
Correct Answer : function(x) { x * x }
Explanation : When R tries to bind a value to a symbol, it searches through a series of environments to find the appropriate value.
Correct Answer : function
Explanation : The function closure model can be used to create functions that “carry around” data with them.
Correct Answer : scoping rules
Explanation : This function never actually uses the argument b, so calling f(2) will not produce an error because the 2 gets positionally matched to a.
> g <- function(x) {
+     a <- 3
+     x+a+y
+     ## 'y' is a free variable
+ }
> g(2)
Explanation : An “object ‘y’ not found” error is displayed.
Correct Answer : make.power()
Explanation : Typically, a function is defined in the global environment, so that the values of free variables are just found in the user’s workspace.
Correct Answer : apply()
Explanation : An auxiliary function split is also useful, particularly in conjunction with lapply.
Correct Answer : sapply()
Explanation : sapply() tries to simplify the result of lapply().
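A short sketch of the difference between the two functions, using a made-up list called values: lapply() always returns a list, while sapply() simplifies the result to a vector where it can.

```r
values <- list(a = 1:4, b = c(10, 20))

# lapply() returns a list with one element per input element
as_list <- lapply(values, mean)

# sapply() runs the same computation and simplifies to a named numeric vector
as_vec <- sapply(values, mean)

print(as_list)
print(as_vec)
```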
Explanation : t in tapply stands for table.
Correct Answer : four
Explanation : This function takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its____argument.
Correct Answer : Error in log(c(-1, 2)): NaNs produced
Explanation : Warning is produced due to negative values.
Explanation : When you choose a frame number, you will be put in the browser and will have the ability to poke around.
Correct Answer : rnorm
Explanation : The “r” function is the one that actually simulates random numbers from that distribution.
Correct Answer : pnorm
Explanation : p stands for cumulative distribution.
Correct Answer : dnorm
Explanation : That point can be a vector of points.
Correct Answer : Gaussian
Explanation : Working with the Normal distributions requires using four functions.
> pnorm(2)
Correct Answer : 0.9772499
Explaination : If you wanted to know what was the probability of a random Normal variable of being less than 2, you could use the pnorm() function to do that calculation.
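The four Normal distribution functions mentioned above (rnorm, dnorm, pnorm and qnorm) can be sketched together. The seed value is arbitrary and only makes the simulated draws repeatable.

```r
set.seed(1)                    # arbitrary seed, for repeatable draws
draws <- rnorm(5)              # r: simulate random Normal values

density_at_zero <- dnorm(0)    # d: density at a point
prob_below_two <- pnorm(2)     # p: cumulative probability P(X < 2)
back_to_two <- qnorm(prob_below_two)  # q: quantile, the inverse of pnorm

print(round(prob_below_two, 7))
print(back_to_two)
```

Here pnorm(2) rounds to 0.9772499, and qnorm() applied to that probability returns 2 again, up to floating-point error.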
Correct Answer : Profiling
Explanation : Sometimes profiling becomes necessary as a project grows and layers of code are placed on top of each other.
Correct Answer : debugger
Explanation : In general, it’s usually a bad idea to focus on optimizing your code at the very beginning of development.
Correct Answer : system.time()
Explanation : The system.time() function takes an arbitrary R expression as input and returns the amount of time taken to evaluate the expression.
Correct Answer : proc_time
Explanation : if there’s an error, gives the time until the error occurred.
Correct Answer : parallel
Explanation : When you have multiple processors/cores/machines working in parallel, the amount of time that the collection of CPUs spends working on a problem is the same as with a single CPU, but because they are operating in parallel, there is a savings in elapsed time.
Correct Answer : longer
Explanation : If your expression is getting pretty long (more than 2 or 3 lines), it might be better to either break it into smaller pieces or to use the profiler.
Correct Answer : ggplot2
Explanation : The emphasis in ggplot2 is reducing the amount of thinking time by making it easier to go from the plot in your brain to the plot on the page.
Correct Answer : cut_interval
Explanation : cut_number cuts a numeric vector into intervals containing an equal number of points.
Correct Answer : ggorder
Explanation : ggsave saves a ggplot with sensible defaults.
Correct Answer : translate_qplot_base
Explanation : translate_qplot_gpl is used for translating between qplot and Graphics Production Library (GPL).
Correct Answer : translate_qplot_defaults
Correct Answer : ggmissing
Explanation : The missing values plot is a useful tool to get a rapid overview of the number and pattern of missing values in a dataset.
Correct Answer : geom_contour
Explanation : A layer specific dataset – only needed if you want to override the plot defaults.
Correct Answer : geom_density
Explanation : geom_density displays a smooth density estimate, calculated by stat_density.
Correct Answer : geom_pointrange
Explanation : autoplot uses ggplot2 to draw a particular plot for an object of a particular class in a single command.
Explanation : Character vector is given for the creation of identity.
Correct Answer : tidyneat
Explanation : Tidy data is data that’s easy to work with in R.
Correct Answer : dplyr
Explanation : It’s easy to munge with dplyr.
Correct Answer : gather()
Explanation : gather() gathers columns into key-value pairs.
Correct Answer : print.ggplot
Explanation : spread() makes “long” data wide.
Correct Answer : Show causality, mechanism, explanation
Explanation : Only do what your tools allow you to do.
Correct Answer : Plots are created and annotated with separate functions
Explanation :
Correct Answer : SVG
Explanation : SVG stands for scalable vector graphics.
Correct Answer : Scatterplots with many many points
Explanation : Scatterplots would be used frequently for particular dimension.
Correct Answer : boxplot()
Explanation : text() can also be used to add elements to a plot. boxplot() is used to add elements to a plot in the base graphics system.
Correct Answer : quartz()
Explanation : quartz starts a graphics device driver for the Mac.
Correct Answer : the plotting symbol/character in the base graphics system
Explanation : R makes it easy to combine multiple plots into one overall graph, using either the par() or layout() function.
Correct Answer : x1 <- c(rnorm(n))
Explanation : rnorm generates random deviates.
Correct Answer : read.table(filename,header=TRUE,sep=",")
Explanation : Each row of the table appears as one line of the file.
Correct Answer : read.table(filename,header=TRUE)
Explanation : read.csv and read.csv2 are identical to read.table except for the defaults.
Correct Answer : data.df[data.df=logical]
Explanation : subset(data.df,select=variables,logical) get those objects from a data frame that meet a criterion.
Correct Answer : x[rev(order(x$B)),]
Explanation : x[rev(order(x$B)),] sorts the data frame in reverse order of column B.
Correct Answer : rm(list=ls())
Explanation : attach(mat) makes the names of the variables in the matrix or data frame available in the workspace.
Correct Answer : browse.workspace
Explanation : It is a Mac menu command that creates a window with information about all variables in the workspace.
Correct Answer : rev(x)
Explanation : rev provides a reversed version of its argument.
Correct Answer : rowMeans(x, na.rm = FALSE, dims = 1)
Explanation : False value leads to unexpected result.
Correct Answer : apply(X, MARGIN, FUN, …)
Explanation : apply(x,2,max) finds the maximum for each column.
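As a small sketch of the MARGIN argument, the matrix below (values chosen arbitrarily) is summarised once per column and once per row.

```r
# Filled by column: row 1 is 1, 4, 7 and row 2 is 9, 2, 3
x <- matrix(c(1, 9, 4, 2, 7, 3), nrow = 2)

col_max <- apply(x, 2, max)  # MARGIN = 2: one result per column
row_max <- apply(x, 1, max)  # MARGIN = 1: one result per row

print(col_max)
print(row_max)
```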
Correct Answer : apply(x,2,max)
Explanation : col.max(x) is another way to find which column has the maximum value for each row.
Correct Answer : Solve(A,B)
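Explanation : solve(A, B) returns the inverse of A multiplied by B, that is, the solution X of A %*% X = B. Note that R is case-sensitive and the function name is written in lowercase.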
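A minimal check of this relationship, using a small diagonal matrix chosen so the answer is easy to verify by hand. In R the function is spelled solve(), in lowercase.

```r
A <- matrix(c(2, 0, 0, 4), nrow = 2)  # diagonal matrix with entries 2 and 4
B <- c(6, 8)

# Solve A %*% X = B directly
X <- solve(A, B)

# The same result via the explicit inverse of A
A_inv <- solve(A)
via_inverse <- as.vector(A_inv %*% B)

print(X)
print(via_inverse)
```

Passing B to solve() is generally preferable to forming the inverse first, since it avoids an extra matrix multiplication.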
Correct Answer : par(omi=c(0,0,1,0) )
Explanation : par can be used to set or query graphical parameters.
Correct Answer : title("R language")
Explanation : This function can be used to add labels to a plot.
Correct Answer : plyr
Explanation : sqldf uses SQLite syntax. plyr implements the split-apply-combine paradigm for R. forecast is a generic function for forecasting from time series or time series models. The function invokes particular methods which depend on the class of the first argument.
Correct Answer : forecast
Explanation : Most important feature is the resulting forecast plot.
Correct Answer : install.packages("forecast")
Explanation : forecast is used for time series analysis.
Correct Answer : lubridate
Explanation : lubridate is one of those magical libraries that just seems to do exactly what you expect it to.
Correct Answer : reshape2
Explanation : reshape2 is used in conjunction with ggplot2 and plyr.
Correct Answer : melt
Explanation : dcast is used to go from long to wide.
Correct Answer : kBestShortestPaths
Explanation : This package provides some routines to conduct the K-adaptive partitioning (kaps) and recursive partitioning (lrtree) for survival data.
Correct Answer : editrules
Explanation : editrules is a package for parsing, applying, and manipulating data cleaning rules. edrGraphicalTools provides tools for dimension reduction methods.
Correct Answer : namespace
Explanation : The package namespace is one of the most confusing parts of building a package. nbpMatching contains functions for non-bipartite optimal matching.
Correct Answer : paleoTS
Explanation : This package contains parfossil parallelized functions for palaeoecological and palaeogeographical analysis.
Correct Answer : uniPlot
Explanation : uniPlot() allows you to change parameters of the packages graphics, lattice and ggplot2 and to make these changes persistent over one R session.
install.packages(c("devtools", "roxygen2"))
Correct Answer : Installs the given packages
Explanation : Make sure you have the latest version of R and then run the above code to get the packages you’ll need. It installs the given packages. Confirm that you have a recent version of RStudio.
Correct Answer : Create()
Explanation : To get started with your new package in RStudio, double-click the pkgname.Rproj file that create() just made. This will open a new RStudio project for your package. Projects are the way to develop packages.
Correct Answer : Path
Explanation : If you have an existing package that doesn't have an .Rproj file, you can use devtools::use_rstudio("path/to/package") to add it.
Correct Answer : Single
Explanation : A bundled package is a package that’s been compressed into a single file. A source package is just a directory with components like R/, DESCRIPTION, and so on.
Correct Answer : Vignettes
Explanation : In a bundled package the vignettes are built, so you get HTML and PDF output instead of Markdown or LaTeX input. A bundled package is a package that's been compressed into a single file.
Correct Answer : bundle
Explanation : Files listed in .Rbuildignore are not included in the bundle. .Rbuildignore prevents files in the source package from appearing in the bundled package. It allows you to have some extra directories in your source package that will not be included in the package bundle.
Correct Answer : source
Explanation : .Rbuildignore prevents files in the source package from appearing in the bundled package. It allows you to have some extra directories in your source package that will not be included in the package bundle.
Correct Answer : quantile()
Explanation : barplot() produces a bar graph.
Correct Answer : read
Explanation : table() lists all values of a variable with their frequencies.
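A minimal sketch of table() in base R. The vector and its values are made up for illustration and are not part of the quiz; prop.table() is included only to show how the counts turn into proportions.

```r
# Hypothetical data: answers to a survey question
answers <- c("yes", "no", "yes", "maybe", "yes", "no")

# table() counts how often each distinct value occurs
counts <- table(answers)
print(counts)

# prop.table() converts those counts into proportions that sum to 1
shares <- prop.table(counts)
print(shares)
```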
table()
Correct Answer : factor.model
Explanation : factor.congruence is used to find the factor congruence coefficients.
Correct Answer : prop.table()
Explanation : par() is used to query and edit graphical settings.
Correct Answer : phi2poly
Explanation : In statistics, polychoric correlation is a technique for estimating the correlation between two theorized normally distributed continuous latent variables, from two observed ordinal variables.
Correct Answer : count.pairwise
Explanation : Pairwise comparison generally is any process of comparing entities in pairs to judge which of each entity is preferred.
Correct Answer : lm
Explanation : describe() gives the mean, sd, skew, n, and se.
Correct Answer : Scatterplot, Boxplot, Density plot
Explanation : Each plot highlights a specific feature. A scatter plot is used to visualise the relationship between the variables, a box plot is used to spot outliers that affect the line of best fit, and a density plot shows the distribution of a variable.
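As a rough illustration, the three plots can be drawn with base R graphics. The built-in cars data set and the temporary PDF file are assumptions made for this sketch, not something the quiz specifies.

```r
# Built-in cars data: speed (predictor) and dist (response)
data(cars)

# Draw to a temporary PDF so the sketch also runs without a display
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# Scatter plot: relationship between the two variables
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")

# Box plot: points beyond the whiskers are candidate outliers
boxplot(cars$dist, main = "dist")

# Density plot: shape of the distribution of the response
plot(density(cars$dist), main = "Density of dist")

invisible(dev.off())

# boxplot.stats() returns the same outliers without drawing anything
outliers <- boxplot.stats(cars$dist)$out
print(outliers)
```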
Correct Answer : first+second+first:second
Explanation : A terms specification of the form “first + second” indicates all the terms in first together with all the terms in second with duplicates removed.
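The expanded form in the answer is what R's crossing operator `*` produces. A small sketch using terms() shows this; y, first and second are placeholder names and no data is needed.

```r
# first * second expands to the main effects plus their interaction
crossed <- attr(terms(y ~ first * second), "term.labels")

# Writing the expansion out by hand gives the same set of terms
expanded <- attr(terms(y ~ first + second + first:second), "term.labels")

# Terms joined with + are de-duplicated: the repeated "first" appears once
dedup <- attr(terms(y ~ first + second + first), "term.labels")

print(crossed)
print(dedup)
```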
Correct Answer : Extrapolation
Explanation : Predicting y for a value of x that is within the interval of points that we saw in the original data is called interpolation. Predicting y for a value of x that’s outside the range of values we actually saw for x in the original data is called extrapolation.
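The distinction can be sketched with lm() and predict(). The built-in cars data set and the two new speed values are assumptions chosen for illustration.

```r
# Fit a simple linear model on the built-in cars data
data(cars)
fit <- lm(dist ~ speed, data = cars)

# Range of x values seen in the original data
observed_range <- range(cars$speed)

# Hypothetical new x values: one inside the range, one outside it
new_speeds <- data.frame(speed = c(15, 40))
predictions <- predict(fit, newdata = new_speeds)

# Label each prediction by whether x falls inside the observed range
inside <- new_speeds$speed >= observed_range[1] &
  new_speeds$speed <= observed_range[2]
kind <- ifelse(inside, "interpolation", "extrapolation")

print(data.frame(new_speeds, predicted_dist = predictions, kind))
```

predict() returns a number in both cases; only the comparison against the observed range tells you which predictions are extrapolations.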
Correct Answer : Interpolation
Correct Answer : ANOVA
Explanation : If the ANOVA test determines that the model explains a significant portion of the variability in the data, then we can consider testing each of the hypotheses and correcting for multiple comparisons.
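A minimal sketch of the first step, the overall ANOVA test, assuming a simple linear model on the built-in cars data. It does not cover the follow-up tests or the correction for multiple comparisons.

```r
# Fit a model and build its ANOVA table
data(cars)
fit <- lm(dist ~ speed, data = cars)
aov_table <- anova(fit)
print(aov_table)

# p-value of the F test for the speed term: a small value means the
# model explains a significant portion of the variability in dist
p_value <- aov_table[["Pr(>F)"]][1]
print(p_value)
```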
Correct Answer : Linear regression
Explanation : Linear regression is a simple approach to supervised learning. It assumes that the dependence of Y on X1, X2, ..., Xp is linear. Linear regression is an incredibly powerful tool for analysing data.
Correct Answer : Reverse Regression Method
Explanation : The sum of squares of the difference between the observations and the line in the horizontal direction in the scatter diagram can be minimized to obtain the estimates of β₀ and β₁. This is generally called a reverse or inverse regression method.
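One way to sketch this in R is to regress x on y, which minimizes the horizontal distances, and then rearrange the fitted line into the form y = β₀ + β₁x. The cars data set and all variable names here are assumptions made for illustration.

```r
data(cars)
x <- cars$speed
y <- cars$dist

# Direct regression: minimizes vertical distances
direct <- coef(lm(y ~ x))

# Reverse regression: fit x = a + b * y, minimizing horizontal distances
rev_fit <- coef(lm(x ~ y))

# Rearranging x = a + b * y gives y = -a / b + (1 / b) * x
beta1_reverse <- 1 / rev_fit[["y"]]
beta0_reverse <- -rev_fit[["(Intercept)"]] / rev_fit[["y"]]

print(direct)
print(c(beta0 = beta0_reverse, beta1 = beta1_reverse))
```

The two slopes agree only when the points lie exactly on a line; in general the product of the direct slope and the x-on-y slope equals the squared correlation.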
Correct Answer : Variance
Explanation : In order to calculate confidence intervals and hypothesis tests, it is assumed that the errors are independent and normally distributed with mean zero and constant variance.
Correct Answer : Normal
Explanation : When hypothesis tests and confidence limits are to be used, the residuals are assumed to follow the normal distribution.
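A rough sketch of how the normality assumption on the residuals might be checked in practice. The cars model, the Q-Q plot and the Shapiro-Wilk test are illustrative choices, not the quiz's prescribed method.

```r
data(cars)
fit <- lm(dist ~ speed, data = cars)
res <- residuals(fit)

# With an intercept in the model, least-squares residuals average to zero
print(mean(res))

# Q-Q plot against the normal distribution, written to a temporary PDF
qq_file <- tempfile(fileext = ".pdf")
pdf(qq_file)
qqnorm(res)
qqline(res)
invisible(dev.off())

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(res)
print(sw)
```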
Correct Answer : Watson
Explanation : IBM Watson is a system based on cognitive computing. With the addition of Revolution R Enterprise for IBM Netezza, you can use the power of the R language to build predictive models on Big Data.
Correct Answer : Hadoop
Explanation : Hadoop has a reputation for not being a suitable environment for high-performance complex iterative algorithms such as logistic regression, generalized linear models, and decision trees.
Correct Answer : ZingChart
Explanation : ZingChart lets you create HTML5 Canvas charts and more.
Correct Answer : SAS
Explanation : SAS (Statistical Analysis System) is a software suite developed by SAS Institute for advanced analytics.
Correct Answer : Classification
Explanation : Classification techniques are widely used in data mining to classify data.
Correct Answer : Descriptive
Explanation : Descriptive analytics is the simplest class of analytics. Predictive analytics can only forecast what might happen in the future, because all predictive analytics are probabilistic in nature.