Module 3: Data Cleaning with SQL Answers (Part 2: Q16–30)

This is Part 2 of the Module 1 quiz answers for “Data Cleaning with SQL ” from the Google Data Analytics Professional Certificate on Coursera.

Here, we’ll walk through questions 16 to 30 with detailed explanations to support your learning.

To find answers to the remaining questions, check out the full module breakdown below:

16. You are working with a database table that contains employee data. The table includes columns about employee location such as city, state, country, and postal_code. You want to retrieve the first 3 characters of each postal code. You decide to use the SUBSTR function to retrieve the first 3 characters of each postal code, and use the AS command to store the result in a new column called new_postal_code.

You write the SQL query below. Add a statement to your SQL query that will retrieve the first 3 characters of each postal code and store the result in a new column as new_postal_code.

NOTE: The three dots (...) indicate where to add the statement.

What employee ID number appears in row 5 of your query result?

NOTE: The query index starts at 1 not 0.

  • 3 ✅
  • 1
  • 8
  • 7

Explanation:
Using SUBSTR(postal_code, 1, 3) extracts part of the postal code, and naming it AS new_postal_code stores the result in a new column. Sorting places employee ID 3 in the fifth row.

17. Why do data analysts choose to work with SQL? Select all that apply.

  • SQL can handle huge amounts of data. ✅
  • SQL is a powerful software program.
  • SQL is a well-known standard in the professional community. ✅
  • SQL is a programming language that can also create web apps.

Explanation:
SQL is a powerful tool widely used in data analysis and is well-known for its ability to handle large datasets and serve as a standard across different databases.

18. A team of data analysts is working on a large project that will take months to complete and contains a huge amount of data. They need to document their process and communicate with multiple databases. The team decides to use a SQL server as the main analysis tool for this project and SQL for the queries. What makes this the most efficient tool? Select all that apply.

  • SQL efficiently handles large amounts of data. ✅
  • SQL records queries and changes throughout a project. ✅
  • SQL contains commands that build visualizations.
  • SQL allows you to connect to multiple databases. ✅

Explanation:
SQL is designed for scalability and logging, which is critical for long-term, collaborative projects. It can query multiple sources and maintain query history for documentation and analysis.

19. Fill in the blank: _____ refers to the process of converting data from one type to another.

  • Formatting
  • Cleaning
  • Typecasting ✅
  • Querying

Explanation:
Typecasting involves changing the data type of a value, such as converting strings to integers or dates to timestamps, which is essential in cleaning and preparing data.

20. A data analyst is working with product sales data. They import new data into a database. The database recognizes the data for product price as text strings. What SQL function can the analyst use to convert text strings to floats?

  • LENGTH
  • TRIM
  • SUBSTR
  • CAST ✅

Explanation: The CAST function is used to convert data from one type to another, such as converting a text string to a float.

21. Fill in the blank: The _____ function can be used to join strings to create a new column.

  • CAST
  • COALESCE
  • TRIM
  • CONCAT ✅

Explanation:
CONCAT() is used in SQL to combine multiple strings into one. This is helpful when creating new columns or labels by merging existing text fields.

22. As a data analyst, you are working on a quick project containing a small amount of data. As the data was emailed to you, there is no need to query the data. What tool should you use to perform your analysis?

  • Spreadsheet ✅
  • SQL
  • word process
  • CSV

Explanation: Since the data is small and already sent (not stored in a database), there’s no need for SQL. Spreadsheets (like Excel or Google Sheets) are perfect for quick data analysis and visualization of small datasets.

23. A data analyst has added a massive table to their database on accident and needs to remove the table. What command can the analyst use to correct their mistake?

  • DROP TABLE IF NOT EXISTS
  • INSERT INTO
  • REMOVE TABLE IF EXISTS
  • DROP TABLE IF EXISTS ✅

Explanation: This SQL command deletes a table if it exists, avoiding errors if the table name is incorrect or already deleted. Perfect for fixing accidental additions.

24. You are working with a database table that contains invoice data. The table includes a column for customer_id. You want to remove duplicate entries for customer_id and get a count of total customers in the database.

You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the customer_id column.

NOTE: The three dots (...) indicate where to add the clause.

What is the total number of customers in the database?

  • 84
  • 105
  • 43
  • 59 ✅

Explanation: Use SELECT COUNT(DISTINCT customer_id) to eliminate duplicates and count unique customers.

25. In SQL databases, what data type refers to a number that does not contain a decimal?

  • String
  • Integer ✅
  • Boolean
  • Float

Explanation:
An integer is a whole number. It doesn’t have decimal points, unlike a float.

26. After joining multiple tables you find your data contains a significant amount of null values. What function can you use to return only the non-null values in a list ?

  • CAST
  • COALESCE ✅
  • TRIM
  • CONCAT

Explanation:
The COALESCE() function returns the first non-null value from a list of arguments, helping clean up data with nulls.

27. You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, and country. The state names are abbreviated. You want to retrieve the first 2 letters of each state name. You decide to use the SUBSTR function to retrieve the first 2 letters of each state name, and use the AS command to store the result in a new column called new_state.

You write the SQL query below. Add a statement to your SQL query that will retrieve the first 2 letters of each state name and store the result in a new column as new_state.

NOTE: The three dots (...) indicate where to add the statement.

NOTE: SUBSTR takes in three arguments being column, starting_index, ending_index

What customer ID number is in row 9 of your query result?

NOTE: The query index starts at 1 not 0.

  • 3
  • 55
  • 47 ✅
  • 10

Explanation:
The SUBSTR(state, 1, 2) function extracts the first 2 characters of the state column and stores them in the new_state column. Customer ID 47 appears in row 9 after sorting by state.

28. A junior data analyst joins a new company. The analyst learns that SQL is heavily utilized within the organization. Why would the organization choose to invest in SQL? Select all that apply.

  • SQL is a programming language that can also create web apps.
  • SQL can handle huge amounts of data. ✅
  • SQL is a powerful software program.
  • SQL is a well-known standard in the professional community. ✅

Explanation:
SQL is scalable and widely supported, making it ideal for large, collaborative environments.

29. You are working with a database table that contains invoice data. The table includes columns for invoice_id and billing_state. You want to remove duplicate entries for billing state and sort the results by invoice ID.

You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the billing_state column.

NOTE: The three dots (...) indicate where to add the clause.

What billing state appears in row 17 of your query result?

NOTE: The query index starts at 1 not 0.

  • AZ ✅
  • NV
  • CA
  • WI

Explanation:
Use SELECT DISTINCT billing_state to remove duplicates, then ORDER BY invoice_id. The 17th row is AZ.

30. You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. You want to check for city names that are greater than 9 characters long.

You write the SQL query below. Add a LENGTH function that will return any city names that are greater than 9 characters long.

What is the first name of the customer that is in row 7 of your query result?

NOTE: The query index starts at 1 not 0.

  • Diego
  • Kara
  • Julia ✅
  • Roberto

Explanation:
Use LENGTH(city) > 9 in the WHERE clause. The 7th row in the result has Julia.

Hope this helped! Use the buttons below to move to the previous or next part.

Leave a Reply