Data Science in San Francisco: Predicting At-Risk Restaurants
Abstract
San Francisco is one of the most popular cities in the US and reportedly has more restaurants per capita than any other major US city. However, this abundance creates a problem for San Francisco’s health inspectors: the more restaurants there are, the harder it is to track and maintain their quality. In our research, we address this issue by proposing a way to use social media to identify the restaurants most in need of inspection.
Because social media has become prevalent in our society, it can provide valuable information that was not previously available. Social media is used not only to connect with friends and post personal information, but also to write reviews of businesses, movies, and restaurants. Customers who take the time to write a review on a social media platform generally provide a more genuine opinion than customers who were asked to write one. Taking this into consideration, customer reviews on social media platforms such as Yelp should provide a more accurate dataset than surveys that customers were asked to fill out. Our work uses a dataset of Yelp reviews of San Francisco restaurants to predict which restaurants have a higher need of a health inspection. We present various graphs, produced in RStudio, to visualize and explain our results.