Module 3: Data Cleaning with SQL Answers (Part 4: Q46–53)
This is Part 3 of the Module 1 quiz answers for “Data Cleaning with SQL ” from the Google Data Analytics Professional Certificate on Coursera.
Here, we’ll walk through questions 46 to 53 with detailed explanations to support your learning.
To find answers to the remaining questions, check out the full module breakdown below:
46. A data analyst notices their Boolean column is incorrectly storing True/False values as strings. What function can the analyst use to convert the data type from a string to Boolean?
- CAST ✅
- TRIM
- LENGTH
- SUBSTR
Explanation: CAST('True' AS BOOLEAN)
converts string values into Boolean.
47. A data analyst is cleaning survey data. The results for an optional question contain many nulls. What function can the analyst use to eliminate the null values from the results?
- COALESCE ✅
- CONCAT
- CAST
- LENGTH
Explanation:
Use COALESCE()
to replace or filter out nulls effectively.
48. A data analyst is analyzing medical data for a health insurance company. The dataset contains billions of rows of data. Which of the following tools will handle the data most efficiently?
- A word processor
- CSV
- A spreadsheet
- SQL ✅
Explanation:
SQL is optimized for high-performance queries on big data.
49. You are working with a database table that contains invoice data. The table includes columns for invoice_id and billing_city. You want to remove duplicate entries for billing_city and sort the results by invoice_id.
You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the billing_city column.
NOTE: The three dots (...) indicate where to add the clause.

What billing city appears in row 15 of your query result?
NOTE: The query index starts at 1 not 0.
- London
- Reno ✅
- Santiago
- Oslo
Explanation:
Use SELECT DISTINCT billing_city ORDER BY invoice_id
. The 15th city is “Reno”.
50. You are working with a database table that contains invoice data. The table includes columns about billing location such as billing_city, billing_state, and billing_postal_code. You use the SUBSTR function to retrieve the first 4 numbers of each billing_postal_code, and use the AS command to store the result in a new column called new_postal_code.
You write the SQL query below. Add a statement to your SQL query that will retrieve the first 4 numbers of each billing postal code and store the result in a new column as new_postal_code.
NOTE: The three dots (...) indicate where to add the statement.
NOTE: SUBSTR takes in three arguments being column, starting_index, ending_index

What invoice id is located in row 4?
NOTE: The query index starts at 1 not 0.
- 32
- 104
- 52
- 206 ✅
Explanation:
SUBSTR(billing_postal_code, 1, 4) AS new_postal_code
grabs the start of the code. Row 4 has invoice ID 206.
51. In which of the following situations would a data analyst use SQL instead of a spreadsheet? Select all that apply.
- When quickly pulling information from many different sources in a database ✅
- When recording queries and changes throughout a project ✅
- When working with a huge amount of data ✅
- When using the COUNTIF function to find a specific piece of information
Explanation:
SQL is used in situations where data is stored in databases, particularly for handling large datasets, complex queries, and when tracking changes over time. The COUNTIF function is specific to spreadsheets, not SQL.
52. You are working with a database table named invoice that contains invoice data. The table includes columns for customer_id and total. You want to remove duplicate customers and identify which unique customers have a total greater than 5.
You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the customer_id column.
NOTE: The three dots (...) indicate where to add the clause.

What customer_id number is located in row 5?
NOTE: The query index starts at 1 not 0.
- 57
- 14
- 52 ✅
- 4
Explanation:
The DISTINCT
clause removes duplicate entries from the customer_id
column, and the query returns the unique customer IDs where the total is greater than 5.
53. You are working with a database table named customer that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. You want to check what country names are greater than 7 characters.
You write the SQL query below. Add a LENGTH function that will return any country names that are greater than 7 characters long.
NOTE: The three dots (...) indicate where to add the clause.

What is the value of postal_code in row 1 of your query result? (Hint: you will have to scroll to the right with your mouse or track pad to locate the indicated column.)
NOTE: The query index starts at 1 not 0.
- 14700 ✅
- None
- 2010
- N1 5LH
Explanation:
The LENGTH(country) > 7
condition filters countries with names longer than 7 characters. The postal code for Ireland (in this case) is N1 5LH
in row 1.
Congratulations! You’ve completed all questions. Share this post if it helped you, and check out other Coursera quiz answers below.
Related contents:
Module 1: The importance of integrity
Module 2: Clean data for more accurate insights
Module 4: Verify and report on cleaning results
Module 6: Course wrap-up
You might also like:
Course 1: Foundations: Data, Data, Everywhere
Course 2: Ask Questions to Make Data-Driven Decisions
Course 3: Prepare Data for Exploration
Course 5: Analyze Data to Answer Questions
Course 6: Share Data Through the Art of Visualization
Course 7: Data Analysis with R Programming
Course 8: Google Data Analytics Capstone: Complete a Case Study