Process Data from Dirty to Clean: Weekly Challenge 4 Answers
1. Verification and reporting come directly before the data-cleaning process.
- True
- False
2. What is the first step in the verification process?
- Compare the cleaned data with the original, uncleaned dataset
- Create a chronological list of modifications made to the data
- Determine the quality of the data
- Inform others of your data-cleaning effort
3. Which of the following functions automatically removes extra spaces when cleaning data?
- SNIP
- REMOVE
- TRIM
- CLEAR
4. What tool can a data analyst use to figure out how many identical errors occur in a dataset?
- CASE
- COUNTA
- CONFIRM
- COUNT
5. Fill in the blank: A data analyst uses the CASE statement to consider one or more _____, then returns a value.
- additions
- conditions
- identifications
- changes
6. What is the process of tracking changes, additions, deletions, and errors during data cleaning?
- Recording
- Observation
- Cataloging
- Documentation
7. Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions.
- presenting
- verification
- documentation
- visualization
8. Reviewing version history is an effective way to view a changelog in SQL.
- True
- False
9. In what step of the data-cleaning process do you find mistakes before you begin analyzing the data?
- Confirming
- Publishing
- Verifying
- Processing
10. During the data-cleaning process, you find a significant amount of data that contains irrelevant spaces. Which function do you use to remove leading, trailing, or repeated spaces?
- CUT
- DELETE
- TRIM
- TIDY
11. A data analyst is checking for errors in a dataset. They want to determine how many times the name of a country is in the dataset using a pivot table. What function can they use to find this count?
- COUNTA
- CHECK
- COUNT
- CASE
12. You’re writing the SQL query below and need to change “World Wide Web” to “www”. What function would you use to accomplish this task?
SELECT
_____
WHEN 'World Wide Web' THEN 'www'
END AS some_column
FROM
some_table
- THEN
- CASE
- ELSE
- WHEN
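The query pattern in question 12 can be tried out with a small sketch. This one uses Python's built-in sqlite3 module and keeps the question's placeholder names (`some_table`, `some_column`). The sample rows, the column name placed after CASE, and the ELSE branch are assumptions added so that the query runs on its own; they are not part of the original question.

```python
import sqlite3

# In-memory database with the placeholder names from the question.
# The two sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (some_column TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?)",
    [("World Wide Web",), ("email",)],
)

# CASE compares some_column against each WHEN value and returns the
# THEN value for the first match. ELSE (an addition here) keeps every
# other value unchanged instead of returning NULL.
query = """
SELECT
  CASE some_column
    WHEN 'World Wide Web' THEN 'www'
    ELSE some_column
  END AS some_column
FROM
  some_table
ORDER BY rowid
"""

cleaned = [row[0] for row in conn.execute(query)]
print(cleaned)  # ['www', 'email']
```

Without the ELSE branch, rows that match no WHEN value would come back as NULL, which is rarely what a cleaning step intends.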
13. What should a data analyst actively track throughout the data cleaning process?
- Additions, changes, and queries
- Errors, deletions, and notes
- Changes, resolutions, and deletions
- Errors, additions, and deletions
14. A data analyst is in the verification process and needs to verify the modifications that they have made to the data. What could the analyst reference to find the changes they made throughout data cleaning?
- Changelog
- Notepad
- Spreadsheet
- Metadata
15. A data analyst commits a query to the repository as a new and improved query. Then, they specify the changes they made and why they made them. This scenario is part of what process?
- Reporting data
- Visualizing data
- Communicating with stakeholders
- Creating a changelog
16. The data collected for an analysis project has just been cleaned. What are the next steps for a data analyst? Select all that apply.
- Reporting
- Certification
- Validation
- Verification
17. As a data analyst, you will need to keep the big picture in mind throughout any project when verifying data cleaning. What must the analyst do to take a big picture view of the project? Select all that apply.
- Consider the data
- Consider the goal
- Consider the business problem
- Consider the reporting
18. During the verification process, you find that you missed a few leading spaces during data cleaning. What function can you use to eliminate these spaces?
- TRIM
- TIDY
- CUT
- CROP
19. Which SQL tool considers one or more conditions, then returns a value as soon as a condition is met?
- THEN
- WHEN
- CASE
- ELSE
20. Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply.
- additions
- deletions
- changes
- inactivity
21. Fill in the blank: A changelog contains a _____ list of modifications made to a project.
- random
- approximate
- chronological
- synchronized
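As a rough illustration of what a chronological list of modifications can look like, the sketch below models changelog entries in Python and sorts them by date. The `ChangelogEntry` class, its fields, and the sample entries are all hypothetical; real changelogs are often plain text files or version-control commit messages.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChangelogEntry:
    """One hypothetical changelog record: when, what kind, and a note."""
    when: date
    kind: str   # e.g. "Added", "Changed", "Removed", "Fixed"
    note: str


# Entries recorded out of order on purpose (made-up examples).
entries = [
    ChangelogEntry(date(2024, 3, 5), "Fixed", "Trimmed trailing spaces in country names"),
    ChangelogEntry(date(2024, 3, 1), "Removed", "Dropped duplicate customer rows"),
    ChangelogEntry(date(2024, 3, 3), "Changed", "Standardized date format"),
]

# Sorting by date gives the chronological order a changelog relies on.
changelog = sorted(entries, key=lambda entry: entry.when)
lines = [f"{e.when.isoformat()} {e.kind}: {e.note}" for e in changelog]

for line in lines:
    print(line)
```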
22. You start a complex project that will take more than a year to complete. You need to document modifications made to your queries throughout the project. What is the correct way to store these modifications?
- Creating a changelog
- Creating a notepad
- Visualizing data
- Creating a spreadsheet
23. Fill in the blank: A process to confirm that a data-cleaning effort was well-executed and the resulting data is accurate and reliable is known as _____.
- verification
- publishing
- manipulation
- processing
24. A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?
- Reporting on the data
- Considering the stakeholders
- Seeing the big picture
- Visualizing the data
25. During data cleaning, you find an error in a username where the ID number was accidentally joined to the user’s last name. You need to figure out if this username has been entered incorrectly more than once in your dataset. If you use a pivot table, what function can you use to determine the number of times this error occurs in your dataset?
- CASE
- COUNT
- COUNTA
- CHECK
26. You’re working with a dataset that contains categorical variables. You notice that some of the strings are misspelled or are not capitalized. What function can you use to fix these errors when a condition is met?
- ELSE
- CASE
- WHEN
- THEN
27. A data analyst uses a changelog while cleaning data. What process does a changelog support?
- Illumination
- Examination
- Disclosure
- Documentation
28. A changelog is essential for storing chronological modifications made during the data cleaning process. When will an analyst refer to the information in the changelog to certify data integrity?
- Documentation
- Verification
- Presenting
- Visualization
29. Fill in the blank: As a data analyst, you should always create a _____ to track your additions, deletions, errors, and changes to a query.
- notepad
- database
- changelog
- spreadsheet
30. Fill in the blank: TRIM is a function that removes _____ spaces in data. Select all that apply.
- repeated
- trailing
- leading
- inner
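The behavior named in question 30 (removing leading, trailing, and repeated spaces) can be sketched in Python. `spreadsheet_style_trim` is a hypothetical helper that only imitates that behavior and is not the spreadsheet function itself. For contrast, the sketch also runs SQLite's TRIM, which strips spaces from the two ends only; the sample string is made up.

```python
import re
import sqlite3


def spreadsheet_style_trim(text: str) -> str:
    """Hypothetical helper: strip spaces at both ends, then collapse
    any run of two or more spaces into a single space."""
    return re.sub(r" {2,}", " ", text.strip(" "))


raw = "  New   York  "

# Leading, trailing, and repeated spaces are all removed.
trimmed = spreadsheet_style_trim(raw)
print(repr(trimmed))      # 'New York'

# SQLite's TRIM removes only the leading and trailing spaces.
conn = sqlite3.connect(":memory:")
sql_trimmed = conn.execute("SELECT TRIM(?)", (raw,)).fetchone()[0]
print(repr(sql_trimmed))  # 'New   York'
```

The difference matters when moving between tools: a value that looks clean after a spreadsheet TRIM may still contain repeated inner spaces after a SQL TRIM.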
31. While verifying cleaned data, a data analyst encounters a misspelled name. Which function can they use to determine the number of misspelled occurrences in the dataset?
- CASE
- CHECK
- COUNT
- COUNTA
32. At what point during the analysis process does a data analyst use a changelog?
- While cleaning the data
- While visualizing the data
- While gathering the data
- While reporting the data
33. Your manager points out an error in a product ID number in your dataset. The Product IDs can be numbers like 42 or text like "CAD-425". Using a pivot table, what function can you use to find how many times this error occurs in the dataset?
- COUNT
- CHECK
- COUNTA
- CASE
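Question 33 turns on the difference between counting only numeric values and counting every non-empty value. The Python sketch below imitates that difference as spreadsheets define it for COUNT and COUNTA; the two helper functions and the sample product IDs are hypothetical and are not a pivot-table API.

```python
def count(values):
    """Imitates spreadsheet COUNT: tallies numeric values only."""
    return sum(
        1
        for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


def counta(values):
    """Imitates spreadsheet COUNTA: tallies every non-empty value."""
    return sum(1 for v in values if v is not None and v != "")


# Made-up product IDs mixing numbers, text, and empty cells.
product_ids = [42, "CAD-425", 17, "CAD-425", None, ""]

print(count(product_ids))   # 2
print(counta(product_ids))  # 4
```

With mixed numeric and text IDs, the numeric-only tally misses every text ID, which is why the choice of counting function matters for this kind of error check.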
34. While reviewing your coworker’s data cleaning process, you find a few cases of trailing spaces in the data. What function can you use to remove these spaces?
- REMOVE TRAILING
- DELETE
- CUT
- TRIM
35. Which of the following queries considers one or more conditions and returns a value as soon as that condition is met?
- SELECT * WHEN CASE COLUMN = VARIABLE
- SELECT * CASE IF COLUMN = VARIABLE
- SELECT * CASE WHEN COLUMN = VARIABLE
- SELECT * IF CASE COLUMN = VARIABLE
36. Fill in the blank: Once data is clean, a data analyst moves on to _____ and verification.
- processing
- confirming
- publishing
- reporting
37. A data analyst is starting a large-scale project. The project will be crucial to business success, and the data analyst needs to keep the big picture at the forefront when verifying their data cleaning. What is the first step in the verification process?
- Create a chronological list of modifications made to the data
- Compare the cleaned data with the original, uncleaned dataset
- Inform others of the data-cleaning effort
- Determine the quality of the data
38. You use SQL to clean your data. You make comments whenever you modify your queries to keep track of any changes. What documentation will this practice help you create when you’re done cleaning the data?
- A changelog
- A query repository
- A new dataset
- A database
39. A data analyst is starting a large-scale project that is crucial to business success. The data analyst needs to remember the big picture when verifying their data cleaning. What is involved when focusing on the big-picture view of the project? Select all that apply.
- Consider the reporting
- Consider the business problem
- Consider the stakeholders
- Consider the goal