Data Analysis with R Programming: Weekly Challenge 3
1. A data analyst is working with a dataset in R that has more than 50,000 observations. Why might they choose to use a tibble instead of the standard data frame? Select all that apply.
- Tibbles can create row names
- Tibbles automatically only preview the first 10 rows of data
- Tibbles can automatically change the names of variables
- Tibbles automatically only preview as many columns as fit on screen
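The printing behavior described in these options can be checked by converting a built-in data frame to a tibble. This is a minimal sketch, assuming the tibble package is installed; iris is used only as a convenient stand-in for a larger dataset.

```r
library(tibble)

# iris is a built-in data frame with 150 rows
iris_tbl <- as_tibble(iris)

# Printing a tibble shows only the first 10 rows
# and only as many columns as fit on screen
print(iris_tbl)

# Printing the plain data frame would show every row instead
# print(iris)
```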
2. A data analyst is exploring their data to get more familiar with it. They want a preview of just the first six rows to get a better idea of how the data frame is laid out. What function should they use?
- print()
- preview()
- head()
- colnames()
3. You are working with the ToothGrowth dataset. You want to use the head() function to get a preview of the dataset. Write the code chunk that will give you this preview.
What are the names of the columns in the ToothGrowth dataset?
- VC, supp, dose
- len, supp, dose
- len, supp, VC
- len, VC, dose
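The preview asked for in questions 2 and 3 can be sketched with base R alone, since ToothGrowth ships with the built-in datasets package. The variable name preview is just an illustrative choice.

```r
# ToothGrowth comes with base R, so no library() call is needed
data("ToothGrowth")

# head() returns the first six rows by default
preview <- head(ToothGrowth)
print(preview)

# colnames() lists the column names directly
print(colnames(ToothGrowth))
```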
4. A data analyst is working with a data frame named sales. They write the following code:
sales %>%
The data frame contains a column named q1_sales. What code chunk does the analyst add to change the name of the column from q1_sales to quarter1_sales?
- rename(quarter1_sales = q1_sales)
- rename(q1_sales <- "quarter1_sales")
- rename(quarter1_sales <- "q1_sales")
- rename(q1_sales = quarter1_sales)
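The argument order of rename() is easy to mix up, so a small sketch may help. It assumes the dplyr package is installed; the sales data frame below is a hypothetical stand-in with made-up values, not the analyst's real data.

```r
library(dplyr)

# Hypothetical stand-in for the sales data frame
sales <- data.frame(region = c("east", "west"), q1_sales = c(1200, 950))

# rename() takes new_name = old_name
sales_renamed <- sales %>%
  rename(quarter1_sales = q1_sales)

print(colnames(sales_renamed))
```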
5. A data analyst is working with the penguins data. They write the following code:
penguins %>%
The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. What code chunk does the analyst add to create a data frame that only includes the Gentoo species?
- filter(species == "Gentoo")
- filter(species <- "Gentoo")
- filter(Gentoo == species)
- filter(species == "Adelie")
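A filter on species can be sketched as follows. This assumes the dplyr and palmerpenguins packages are installed and that penguins refers to the dataset from palmerpenguins; the name gentoo is only illustrative.

```r
library(dplyr)
library(palmerpenguins)

# filter() keeps the rows where the condition is TRUE;
# == tests equality, while <- would be an assignment
gentoo <- penguins %>%
  filter(species == "Gentoo")

# Only one species value should remain among the rows
print(unique(as.character(gentoo$species)))
```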
6. You are working with the penguins dataset. You want to use the summarize() and max() functions to find the maximum value for the variable flipper_length_mm. You write the following code:
penguins %>%
drop_na() %>%
group_by(species) %>%
Add the code chunk that lets you find the maximum value for the variable flipper_length_mm.
penguins %>%
drop_na() %>%
group_by(species) %>%
Add the code chunk that lets you find the minimum value for the variable bill_depth_mm.
What is the minimum bill depth in mm for the Chinstrap species?
What is the maximum flipper length in mm for the Gentoo species?
- 200
- 212
- 210
- 231
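A grouped summary along the lines of question 6 might look like the sketch below. It assumes dplyr, tidyr (which provides drop_na()), and palmerpenguins are installed; the output column names max_flipper and min_bill_depth are illustrative choices.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Drop rows with missing values, then summarize once per species
penguin_summary <- penguins %>%
  drop_na() %>%
  group_by(species) %>%
  summarize(
    max_flipper = max(flipper_length_mm),
    min_bill_depth = min(bill_depth_mm)
  )

# One row per species, with the maximum and minimum in their own columns
print(penguin_summary)
```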
7. A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns. What code chunk lets the analyst create the total_wages column?
- mutate(salary_data, standard_wages = total_wages + overtime_wages)
- mutate(salary_data, total_wages = standard_wages + overtime_wages)
- mutate(salary_data, total_wages = standard_wages * overtime_wages)
- mutate(total_wages = standard_wages + overtime_wages)
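The mutate() call in question 7 can be tried on a small example. This sketch assumes dplyr is installed; the salary_data rows below are hypothetical values made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for salary_data
salary_data <- data.frame(
  employee = c("A", "B"),
  standard_wages = c(4000, 3500),
  overtime_wages = c(300, 0)
)

# The first argument is the data frame; new_column = expression follows
salary_data <- mutate(salary_data, total_wages = standard_wages + overtime_wages)

print(salary_data$total_wages)
```

Outside a pipe, mutate() needs the data frame as its first argument. The form without it only works when the data frame is piped in, as in salary_data %>% mutate(...).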
8. A data analyst is working with a data frame named stores. It has separate columns for city (city) and state (state). The analyst wants to combine the two columns into a single column named location, with the city and state separated by a comma. What code chunk lets the analyst create the location column?
- unite(stores, "location", city, state, sep = ",")
- unite(stores, "location", city, sep = ",")
- unite(stores, city, state, sep = ",")
- unite(stores, "location", city, state)
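The effect of unite() and its sep argument can be seen on a toy example. This sketch assumes the tidyr package is installed; the stores rows are hypothetical.

```r
library(tidyr)

# Hypothetical stand-in for the stores data frame
stores <- data.frame(city = c("Austin", "Denver"), state = c("TX", "CO"))

# unite(data, new_column_name, columns_to_combine..., sep = separator)
stores_located <- unite(stores, "location", city, state, sep = ",")

print(stores_located$location)
```

By default unite() removes the input columns, so only location remains here. If sep is omitted, the values are joined with an underscore instead of a comma.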
9. A data analyst writes the following code chunk to return a statistical summary of their dataset:
quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y))
Which function will return the average value of the y column?
- mean(y)
- mean(x)
- cor(x, y)
- sd(x)
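The summary in question 9 can be reproduced in spirit without the quartet dataset itself. The sketch below assumes dplyr is installed and builds a stand-in named quartet_like from base R's anscombe data; that stand-in and the named output columns are assumptions made for illustration.

```r
library(dplyr)

# Stand-in for quartet: reshape base R's anscombe data into set / x / y columns
quartet_like <- data.frame(
  set = rep(c("I", "II", "III", "IV"), each = nrow(anscombe)),
  x = c(anscombe$x1, anscombe$x2, anscombe$x3, anscombe$x4),
  y = c(anscombe$y1, anscombe$y2, anscombe$y3, anscombe$y4)
)

# Each function inside summarize() produces one column per group
stats <- quartet_like %>%
  group_by(set) %>%
  summarize(
    mean_x = mean(x), sd_x = sd(x),
    mean_y = mean(y), sd_y = sd(y),
    cor_xy = cor(x, y)
  )

print(stats)
```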
10. A data analyst uses the bias() function to compare the actual outcome with the predicted outcome to determine if the model is biased. They get a score of 0.8. What does this mean?
- Bias cannot be determined
- The model is biased
- Bias can be determined
- The model is not biased
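To make the bias score in question 10 concrete, the sketch below computes the average difference between actual and predicted values by hand in base R. This is an assumption about what the bias() function reports, not its actual implementation, and the two vectors are made-up numbers.

```r
# Hypothetical actual and predicted outcomes
actual <- c(10, 12, 14)
predicted <- c(9, 11, 14)

# Average difference between actual and predicted values;
# a result close to zero suggests an unbiased model
bias_score <- mean(actual - predicted)

print(bias_score)
```

A score well away from zero, such as 0.8, points to predictions that are consistently off in one direction.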
Shuffle Q/A 1
11. What is an advantage of using data frames instead of tibbles?
- Data frames allow you to create row names
- Data frames make printing easier
- Data frames allow you to use column names
- Data frames never change variable names
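The row-name difference between data frames and tibbles can be checked directly. This sketch assumes the tibble package is installed; the df values and the column name "name" are illustrative.

```r
library(tibble)

# A base data frame can carry row names
df <- data.frame(score = c(90, 85), row.names = c("alice", "bob"))
print(rownames(df))

# Converting to a tibble drops the row names by default
tbl <- as_tibble(df)
print(has_rownames(tbl))

# To keep them, move the row names into a regular column
tbl_keep <- as_tibble(df, rownames = "name")
print(tbl_keep)
```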
12. A data analyst is examining a new dataset for the first time. They load the dataset into a data frame to learn more about it. What function(s) will allow them to review the names of all of the columns in the data frame? Select all that apply.
- colnames()
- head()
- str()
- library()
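The functions in question 12 can be compared on any built-in dataset. The sketch below uses base R's airquality data, chosen only as an example.

```r
# airquality ships with base R
data("airquality")

# colnames() returns just the column names
print(colnames(airquality))

# str() lists each column with its type and first few values
str(airquality)

# head() shows the first six rows under a header of column names
print(head(airquality))
```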