Nashville Housing Data Cleaning
Tools & Skills
- SQL
- CASE statements
- JOIN
- SPLIT_PART
- CTE
- ALTER TABLE
- COALESCE
In this project, I loaded a CSV file of Nashville housing data into PostgreSQL and cleaned it. The goal was to standardize the data into a more readable and usable format.
Steps taken to clean the data:
1. When I examined the data, I found null values in the Propertyaddress column; however, the corresponding addresses appeared elsewhere in the data. I used a JOIN with COALESCE to match those rows and fill in the null address values.
2. I split the address into separate Address, City, and State columns, using ALTER TABLE to add the new columns and SPLIT_PART to populate them.
3. Next, I changed the 'Y' and 'N' values in the SoldAsVacant column to 'Yes' and 'No' for readability, using a CASE statement.
4. I checked for duplicate rows and removed them using a CTE with PARTITION BY.
5. Lastly, I removed columns that were unnecessary for my analysis using ALTER TABLE and DROP COLUMN.
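
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the exact queries used in the project. The table name `nashville_housing`, the columns `uniqueid`, `parcelid`, `owneraddress`, and `taxdistrict`, the new column names, the sample rows, the use of `parcelid` as the match key, and the use of `ROW_NUMBER()` with the CTE are all assumptions made for the example.

```sql
-- Hypothetical miniature of the table; all names and rows are assumptions.
CREATE TEMP TABLE nashville_housing (
    uniqueid        integer,
    parcelid        text,
    propertyaddress text,
    owneraddress    text,
    soldasvacant    text,
    taxdistrict     text
);

INSERT INTO nashville_housing VALUES
    (1, '001', '100 Main St, Nashville',      '100 Main St, Nashville, TN',      'Y',  'URBAN'),
    (2, '001', NULL,                          '100 Main St, Nashville, TN',      'N',  'URBAN'),
    (3, '002', '200 Oak Ave, Goodlettsville', '200 Oak Ave, Goodlettsville, TN', 'No', 'GENERAL'),
    (4, '002', '200 Oak Ave, Goodlettsville', '200 Oak Ave, Goodlettsville, TN', 'No', 'GENERAL');

-- Step 1: fill null addresses from another row with the same parcelid
-- (assumed match key) by joining the table to itself.
UPDATE nashville_housing AS a
SET propertyaddress = COALESCE(a.propertyaddress, b.propertyaddress)
FROM nashville_housing AS b
WHERE a.parcelid = b.parcelid
  AND a.uniqueid <> b.uniqueid
  AND a.propertyaddress IS NULL
  AND b.propertyaddress IS NOT NULL;

-- Step 2: add the new columns, then populate them with SPLIT_PART.
-- Assumes owneraddress is formatted as 'street, city, state'.
ALTER TABLE nashville_housing
    ADD COLUMN owner_street text,
    ADD COLUMN owner_city   text,
    ADD COLUMN owner_state  text;

UPDATE nashville_housing
SET owner_street = TRIM(SPLIT_PART(owneraddress, ',', 1)),
    owner_city   = TRIM(SPLIT_PART(owneraddress, ',', 2)),
    owner_state  = TRIM(SPLIT_PART(owneraddress, ',', 3));

-- Step 3: normalize 'Y' / 'N' to 'Yes' / 'No'; other values are left as-is.
UPDATE nashville_housing
SET soldasvacant = CASE soldasvacant
                       WHEN 'Y' THEN 'Yes'
                       WHEN 'N' THEN 'No'
                       ELSE soldasvacant
                   END;

-- Step 4: number the rows within each group of identical values and
-- delete every row after the first. Assumes uniqueid is unique per row.
WITH ranked AS (
    SELECT uniqueid,
           ROW_NUMBER() OVER (
               PARTITION BY parcelid, propertyaddress, soldasvacant
               ORDER BY uniqueid
           ) AS row_num
    FROM nashville_housing
)
DELETE FROM nashville_housing AS h
USING ranked AS r
WHERE h.uniqueid = r.uniqueid
  AND r.row_num > 1;

-- Step 5: drop columns not needed for the analysis (hypothetical choice).
ALTER TABLE nashville_housing
    DROP COLUMN owneraddress,
    DROP COLUMN taxdistrict;

-- Inspect the cleaned result.
SELECT * FROM nashville_housing ORDER BY uniqueid;
```

In this sample, row 2 receives its address from row 1, and row 4 is removed as a duplicate of row 3. On the real table, the PARTITION BY list would need to include every column that defines a duplicate, and running the CTE as a plain SELECT first is a safe way to preview which rows would be deleted.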
