
Profiling and Analyzing the Yelp Dataset

Tools & Skills Used
  • SQL

    • IS NULL

    • AVG, MIN, MAX

    • SUM

    • LIKE

    • JOIN

    • Aliasing

This project was completed as part of the final assessment for the 'SQL for Data Science' course on Coursera.com, where I learned how to interpret the structure, meaning, and relationships in source data and how to use SQL professionally to shape data for targeted analysis.

 

This first section of the project focused on profiling the Yelp dataset to help understand the relationship between the many tables it contains. I used queries to:

  • Count how many unique values there are in each table

  • Determine if there were any null values

  • Calculate basic statistical values for various given fields

  • Determine the cities with the most reviews

  • Find the users with the most reviews and fans

  • Search reviews for key phrases 
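
The profiling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the queries from the project itself: it uses Python's built-in sqlite3 module, and the table names, column names, and rows are all hypothetical stand-ins for the Yelp tables.

```python
import sqlite3

# Toy stand-in for the Yelp data. Every table, column, and row here is
# hypothetical and exists only to make the queries runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (id TEXT, name TEXT, city TEXT, stars REAL, review_count INTEGER);
CREATE TABLE review (id TEXT, business_id TEXT, stars INTEGER, text TEXT);
CREATE TABLE user (id TEXT, name TEXT, review_count INTEGER, fans INTEGER);
INSERT INTO business VALUES
  ('b1', 'Cafe A',  'Las Vegas', 4.5, 120),
  ('b2', 'Diner B', 'Las Vegas', 3.0, 80),
  ('b3', 'Bar C',   'Phoenix',   4.0, 50),
  ('b4', 'Shop D',  NULL,        2.5, 10);
INSERT INTO review VALUES
  ('r1', 'b1', 5, 'I love this place'),
  ('r2', 'b1', 4, 'Good coffee'),
  ('r3', 'b2', 2, 'Would not return'),
  ('r4', 'b3', 5, 'We love the patio');
INSERT INTO user VALUES
  ('u1', 'Ana', 300, 12),
  ('u2', 'Ben', 150, 40),
  ('u3', 'Cy',  20,  1);
""")


def scalar(sql):
    """Run a query and return the first column of its first row."""
    return conn.execute(sql).fetchone()[0]


# Unique values: how many distinct businesses appear in the review table.
unique_businesses = scalar("SELECT COUNT(DISTINCT business_id) FROM review")

# Null check: rows in business that have no city recorded.
null_cities = scalar("SELECT COUNT(*) FROM business WHERE city IS NULL")

# Basic statistics for one field.
star_stats = conn.execute(
    "SELECT AVG(stars), MIN(stars), MAX(stars) FROM review"
).fetchone()

# Cities with the most reviews, using SUM and an alias.
top_cities = conn.execute("""
    SELECT city, SUM(review_count) AS reviews
    FROM business
    WHERE city IS NOT NULL
    GROUP BY city
    ORDER BY reviews DESC
""").fetchall()

# Users with the most reviews and the most fans.
top_reviewer = scalar("SELECT name FROM user ORDER BY review_count DESC LIMIT 1")
top_by_fans = scalar("SELECT name FROM user ORDER BY fans DESC LIMIT 1")

# Key-phrase search with LIKE.
love_reviews = scalar("SELECT COUNT(*) FROM review WHERE text LIKE '%love%'")

print(unique_businesses, null_cities, star_stats, top_cities,
      top_reviewer, top_by_fans, love_reviews)
```

Each query maps to one bullet in the list, so the same pattern could be repeated per table or per column when profiling a larger schema.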

 

The second section of the project entailed choosing one city and one business within the Yelp dataset. I then grouped them by their overall star ratings to analyze how their ratings could be affected by hours of operation, number of reviews, and location. To perform this task, I used JOIN to combine the tables needed for the analysis.
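
A grouping of this kind might look like the sketch below. It is only an assumption about the general shape of such a query: the business and hours tables, their columns, the city name, and the two rating bands are hypothetical placeholders, and the data is invented. The hours table is aggregated per business before the JOIN so that businesses with several hours rows are not counted more than once in the averages.

```python
import sqlite3

# Hypothetical schema and rows, used only to make the example runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (id TEXT, name TEXT, city TEXT, stars REAL, review_count INTEGER);
CREATE TABLE hours (business_id TEXT, hours TEXT);
INSERT INTO business VALUES
  ('b1', 'Cafe A',  'Toronto', 4.5, 120),
  ('b2', 'Diner B', 'Toronto', 2.5, 40),
  ('b3', 'Bar C',   'Toronto', 4.0, 60),
  ('b4', 'Shop D',  'Phoenix', 5.0, 10);
INSERT INTO hours VALUES
  ('b1', 'Monday|8:00-18:00'),
  ('b1', 'Saturday|9:00-17:00'),
  ('b2', 'Monday|11:00-22:00'),
  ('b3', 'Friday|17:00-2:00'),
  ('b3', 'Saturday|17:00-2:00'),
  ('b4', 'Monday|10:00-16:00');
""")

rating_groups = conn.execute("""
    SELECT
        CASE WHEN b.stars >= 4 THEN '4-5 stars' ELSE '2-3 stars' END AS rating_group,
        COUNT(*)            AS businesses,
        AVG(b.review_count) AS avg_reviews,
        AVG(d.days_open)    AS avg_days_open
    FROM business AS b
    JOIN (
        -- One row per business: how many days have listed hours.
        SELECT business_id, COUNT(*) AS days_open
        FROM hours
        GROUP BY business_id
    ) AS d ON d.business_id = b.id
    WHERE b.city = 'Toronto'
    GROUP BY rating_group
    ORDER BY rating_group
""").fetchall()

for row in rating_groups:
    print(row)
```

Comparing the averages across the rating groups gives a first impression of whether review volume or days of operation differ between higher-rated and lower-rated businesses.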

 

The final section of the project asked us to pick our own type of analysis to conduct on the Yelp dataset. I had to describe what I intended to analyze, which tables and functions I would use for the analysis, and the conclusions I was able to draw.
