Wednesday 25 April 2018

R Language Interview Questions

1) Explain about data import in R language

R Commander is a GUI that can be used to import data in R. To start it, the user loads the package by typing library(Rcmdr) into the console. Data can then be imported in 3 different ways:
• Users can select the data set in the dialog box or enter the name of the data set (if they know it).
• Data can be entered directly using the editor of R Commander via Data -> New Data Set. However, this only works well when the data set is not too large.
• Data can be imported from a URL, from a plain text file (ASCII), from another statistical package, or from the clipboard.
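
Outside the GUI, the same kinds of sources can be read from the console with base R. The sketch below is only an illustration: the CSV content and the names csv_text and scores are made up, and in practice the text would usually be replaced by a file path or URL.

```r
# Hypothetical CSV content; a real import would point at a file path or URL.
csv_text <- "name,score
alice,90
bob,85"

# read.csv() accepts a file path, a URL, or (as here) literal text.
scores <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the imported data frame.
str(scores)
```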

2) Two vectors X and Y are defined as follows: X <- c(3, 2, 4) and Y <- c(1, 2). What will be the value of the vector Z that is defined as Z <- X*Y?

When two vectors have different lengths, R recycles the elements of the shorter vector until it matches the length of the longer one, so Y is treated as c(1, 2, 1). Because the longer length (3) is not a multiple of the shorter length (2), R also issues a warning.
The output of the above code will be:
Z = 3 4 4
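
The recycling rule can be checked directly in the console. This is a minimal sketch using the vectors from the question:

```r
X <- c(3, 2, 4)
Y <- c(1, 2)

# Y is recycled to c(1, 2, 1); R warns because 3 is not a multiple of 2.
Z <- X * Y

print(Z)  # 3 4 4
```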

3)  Compare R and Python programming languages for Predictive Modelling.


| Feature                | Python                                     | R                                                  |
|------------------------|--------------------------------------------|----------------------------------------------------|
| Model Building         | Both are similar                           | Both are similar                                   |
| Model Interpretability | Not better than R                          | R is better                                        |
| Production             | Python is better                           | Not better than Python                             |
| Community Support      | Not better than R                          | R has good community support over Python           |
| Data Science Libraries | Both are similar                           | Both are similar                                   |
| Data Visualizations    | Not better than R                          | R has good data visualization libraries and tools  |
| Learning Curve         | Learning Python is easier than learning R  | R has a steep learning curve                       |

4) How are missing values and impossible values represented in R language?

NaN (Not a Number) is used to represent impossible values, such as the result of 0/0, whereas NA (Not Available) is used to represent missing values. The best way to answer this question is to mention that simply deleting missing values is not a good idea, because they may point to a problem with data collection, programming, or the query. It is better to find the root cause of the missing values and then take the necessary steps to handle them.
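
A short sketch of how the two markers behave in practice. The vector x is made up for illustration; note that is.na() is also TRUE for NaN, while is.nan() is TRUE only for NaN.

```r
# 0/0 evaluates to NaN; NA marks a value that is simply missing.
x <- c(1, NA, 3, 0 / 0)

missing_or_nan <- is.na(x)   # FALSE TRUE FALSE TRUE
nan_only       <- is.nan(x)  # FALSE FALSE FALSE TRUE

# na.rm = TRUE drops both NA and NaN before computing the mean.
clean_mean <- mean(x, na.rm = TRUE)  # 2
```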

5) R language has several packages for solving a particular problem. How do you make a decision on which one is the best to use?

The CRAN package ecosystem has more than 6000 packages. The best way for beginners to answer this question is to mention that they would look for a package that follows good software development principles. The next step would be to look at user reviews and find out whether other data scientists or analysts have been able to solve a similar problem with it.

6) Which function in R language is used to find out whether the means of 2 groups are equal to each other or not?

t.test()
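
A minimal usage sketch, assuming two small made-up samples (group_a and group_b are hypothetical names and values). Called with two numeric vectors, t.test() runs Welch's two-sample test by default.

```r
# Hypothetical measurements for two groups.
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0)
group_b <- c(6.9, 7.2, 6.5, 7.0, 7.4)

# Null hypothesis: both groups have the same mean.
result <- t.test(group_a, group_b)

# A small p-value is evidence against equal means.
print(result$p.value)
```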

7) What is the best way to communicate the results of data analysis using R language?

The best possible way to do this is to combine the data, code and analysis results in a single document using knitr for reproducible research. This helps others to verify the findings, add to them and engage in discussions. Reproducible research also makes it easy to redo the analysis with new data or to apply it to a different problem.

8) How many data structures does R language have?

R language has homogeneous and heterogeneous data structures. Homogeneous data structures hold objects of the same type: vector, matrix and array. Heterogeneous data structures can hold objects of different types: data frames and lists.
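
The five structures can be sketched as follows. All object names and values are illustrative only.

```r
# Homogeneous: every element has the same type.
v <- c(1.5, 2, 3)                      # vector
m <- matrix(1:6, nrow = 2)             # matrix, 2 rows x 3 columns
a <- array(1:24, dim = c(2, 3, 4))     # 3-dimensional array

# Mixing types in a vector forces coercion to a single type.
mixed <- c(1, "a", TRUE)               # becomes a character vector

# Heterogeneous: elements or columns may differ in type.
l  <- list(id = 1L, name = "a", ok = TRUE)
df <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = FALSE)
```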

9) What is the value of f(2) for the following R code?

b <- 4
f <- function(a)
{
    b <- 3
    b^3 + g(a)
}
g <- function(a)
{
    a*b
}
The answer to the above code snippet is 35. The value of "a" passed to the function is 2, and the value of "b" defined inside f is 3, so the output is 3^3 + g(2). The function g is defined in the global environment, so under R's lexical scoping it looks up b where g was defined and finds 4, not the 3 that is local to f. It therefore returns 2*4 = 8 to f, and the result is 3^3 + 8 = 35.
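
The scoping behaviour can be checked by running the snippet. The sketch below also adds a hypothetical variant, f_nested (not part of the question), in which the helper is defined inside the function and therefore sees the local b.

```r
b <- 4

g <- function(a) {
  a * b            # b is looked up where g was defined: the global b (4)
}

f <- function(a) {
  b <- 3           # local to f; invisible to g
  b^3 + g(a)
}

# Hypothetical variant: the helper is defined inside the function,
# so its enclosing environment contains the local b (3).
f_nested <- function(a) {
  b <- 3
  g_local <- function(a) a * b
  b^3 + g_local(a)
}

print(f(2))         # 27 + 2 * 4 = 35
print(f_nested(2))  # 27 + 2 * 3 = 33
```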

10) What is the process to create a table in R language without using external files?

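MyTable <- data.frame()
MyTable <- edit(MyTable)
The above code opens R's built-in spreadsheet-style data editor (not Excel) for entering data. edit() returns the modified copy, so the result must be assigned back to MyTable.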
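
The data editor needs an interactive session. In a script, a table can be built directly in code instead. A minimal sketch, with made-up column names and values:

```r
# Build the table in code; no external file or interactive editor is needed.
my_table <- data.frame(
  name  = c("alice", "bob"),
  score = c(90, 85),
  stringsAsFactors = FALSE
)

# Append one more row with matching column names.
new_row  <- data.frame(name = "carol", score = 78, stringsAsFactors = FALSE)
my_table <- rbind(my_table, new_row)

print(my_table)
```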

11) Explain the significance of transpose in R language

The transpose function t() swaps the rows and columns of a matrix or data frame. It is the simplest way to reshape data before analysis.
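
A small sketch of t() on a made-up 2 x 3 matrix:

```r
# matrix() fills column by column: columns are (1,2), (3,4), (5,6).
m <- matrix(1:6, nrow = 2)

# t() swaps rows and columns, giving a 3 x 2 matrix.
tm <- t(m)

print(dim(m))   # 2 3
print(dim(tm))  # 3 2
```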

12) What are the with() and by() functions used for?

The with() function evaluates an expression in the context of a given data set, and the by() function applies a function to each level of a factor (or combination of factors). R is case sensitive, so both names are written in lower case.
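
A minimal sketch of both functions on a made-up data frame (df, group and score are illustrative names):

```r
df <- data.frame(
  group = c("a", "a", "b", "b"),
  score = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# with(): columns can be referenced by name without the df$ prefix.
overall_mean <- with(df, mean(score))

# by(): split the rows by group and apply the function to each subset.
group_means <- by(df, df$group, function(d) mean(d$score))

print(overall_mean)
print(group_means)
```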

13) dplyr package is used to speed up data frame management code. Which package can be integrated with dplyr for large fast tables?

data.table

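14) In the base graphics system, which functions are used to add elements to a plot?

Low-level functions such as text(), points(), lines() and legend() add elements to an existing plot. boxplot(), by contrast, starts a new plot by default.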
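
A short sketch of the pattern: one high-level call creates the plot, then low-level calls annotate it. The data and labels are made up.

```r
x <- 1:5
y <- c(2, 4, 3, 6, 5)

# High-level call: opens a new plot.
plot(x, y, main = "Base graphics example")

# Low-level calls: each adds to the plot that is already open.
lines(x, y, lty = 2)
points(3, 3, pch = 19, col = "red")
text(3, 3, labels = "dip", pos = 3)
```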

15) What are the different types of sorting algorithms available in R language?

Bucket Sort
Selection Sort
Quick Sort
Bubble Sort
Merge Sort
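
The algorithms listed above are general-purpose and can each be written in R by hand. Base R's own sort.int() separately exposes a method argument with the choices "auto", "shell", "quick" and "radix". The sketch below shows both, with bubble_sort as a hypothetical hand-written helper rather than a built-in function.

```r
x <- c(5, 2, 9, 1, 7)

# Built-in sorting with an explicit method.
quick_sorted <- sort.int(x, method = "quick")
radix_sorted <- sort.int(x, method = "radix")

# Hypothetical hand-written bubble sort, for illustration only.
bubble_sort <- function(v) {
  n <- length(v)
  if (n < 2) return(v)
  for (i in seq_len(n - 1)) {
    # After pass i, the last i elements are in their final positions.
    for (j in seq_len(n - i)) {
      if (v[j] > v[j + 1]) {
        tmp      <- v[j]
        v[j]     <- v[j + 1]
        v[j + 1] <- tmp
      }
    }
  }
  v
}

bubble_sorted <- bubble_sort(x)
```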

16) What is the command used to store R objects in a file?
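
Base R provides save() for writing one or more named objects to a file (restored with load()) and saveRDS() for writing a single object (restored with readRDS()). A minimal sketch using temporary files; the object and path names are illustrative.

```r
scores <- c(a = 1, b = 2)

# saveRDS()/readRDS(): one object per file, name chosen when reading back.
rds_path <- tempfile(fileext = ".rds")
saveRDS(scores, rds_path)
restored <- readRDS(rds_path)

# save()/load(): objects are stored and restored under their original names.
rdata_path <- tempfile(fileext = ".RData")
save(scores, file = rdata_path)
rm(scores)
load(rdata_path)   # recreates `scores` in the current environment
```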
