Use Cases
While the potential applications are broad, at this stage we set out two clear goals. First, predict the price of an Airbnb listing as if it were completely new; this is aimed at potential hosts thinking of joining the Airbnb community. Second, predict the same price using detailed historical performance data; this is aimed at current hosts who want to understand the value of their listing and to evaluate prospective new features.
Dataset, Preprocessing, and Evaluation
Our dataset comes from three Kaggle repositories that detail the same features for Boston, Seattle, and New York City (all five boroughs included). To evaluate our models, we performed 5-fold cross-validation and measured R² (coefficient of determination), where the task was to predict the price of an unseen listing. Below is a visualization of each dataset.
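To make this protocol concrete, the following is a minimal sketch of the 5-fold cross-validated R² evaluation; the file names, the price column name, and the baseline Ridge model are illustrative placeholders rather than our exact pipeline.

    # Illustrative evaluation sketch; file and column names are placeholders.
    import pandas as pd
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold, cross_val_score

    # One CSV per city from the Kaggle repositories (placeholder file names).
    frames = [pd.read_csv(name) for name in
              ("boston_listings.csv", "seattle_listings.csv", "nyc_listings.csv")]
    listings = pd.concat(frames, ignore_index=True)

    # Assume features are already numeric/encoded; "price" is the target column.
    X = listings.drop(columns=["price"])
    y = listings["price"]

    # 5-fold cross-validation scored with R^2 on held-out (unseen) listings.
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(Ridge(), X, y, cv=cv, scoring="r2")
    print("mean R^2:", scores.mean())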
High-Level Approach
Our approach was to apply multiple learners to the same dataset while progressively increasing the complexity and representational richness of the features fed into each model. We chose two separate sets of learners: traditional regressors and ensemble regressors. Our traditional regressors were linear regression, Ridge, Lasso, and ElasticNet; we chose these because they span a range of regularization complexity and each handles different underlying data distributions well. Our ensemble methods were a random forest and a gradient-boosted regressor, two of the most popular ensemble techniques, which manage bias and variance in fundamentally different ways. Below is a visualization of the major stages we executed, and the Key Results section highlights our takeaways from the process.
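To illustrate this setup, the sketch below instantiates both families of learners and scores each with the same 5-fold R² protocol, reusing the X, y, and cv objects from the evaluation sketch above; the hyperparameters shown are library defaults chosen for illustration, not the values we ultimately tuned.

    # Illustrative comparison of the two learner families (default hyperparameters).
    from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
    from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
    from sklearn.model_selection import cross_val_score

    learners = {
        # traditional regressors
        "linear": LinearRegression(),
        "ridge": Ridge(),
        "lasso": Lasso(),
        "elasticnet": ElasticNet(),
        # ensemble regressors
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
    }

    for name, model in learners.items():
        # X, y, and cv come from the evaluation sketch above.
        r2 = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
        print(f"{name}: mean R^2 = {r2:.3f}")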
Key Results
Throughout the stages of our project, we gained several key insights around the importance of particular
features and the efficacy of different learners. First, we learned that the critical feature in this
case was an accurate representation of location – namely using the neighborhood of a listing as opposed
to latitude and longitude. Additionally, we learned that making sure a given dataset was accurately
represented is essential (this came from shuffling our NYC dataset more carefully). These two actions
produced the biggest effect on performance across all learners (represented in stage 1 -> 2). Removing
outliers in the dataset also was helpful in our understanding and modeling (stage 2 -> 3). Additionally,
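For concreteness, the sketch below shows one way these preprocessing steps can be expressed; the column names (neighbourhood, latitude, longitude, price) and the percentile cutoffs are illustrative assumptions rather than our exact choices.

    # Hypothetical preprocessing sketch: neighborhood encoding, shuffling, and outlier trimming.
    import pandas as pd

    def preprocess(listings: pd.DataFrame) -> pd.DataFrame:
        # Represent location by neighborhood rather than raw coordinates.
        listings = listings.drop(columns=["latitude", "longitude"])
        listings = pd.get_dummies(listings, columns=["neighbourhood"])

        # Shuffle rows so cross-validation folds are not ordered by location or time.
        listings = listings.sample(frac=1, random_state=0).reset_index(drop=True)

        # Trim extreme price outliers (cutoffs here are placeholders).
        low, high = listings["price"].quantile([0.01, 0.99])
        return listings[listings["price"].between(low, high)]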
Additionally, our initial hypothesis that reviews and host features would make excellent model attributes proved misguided; we believe this is due to human rating tendencies (a bifurcation of scores) and potentially manufactured feedback (stage 3 -> 4). We were also surprised that the sentiment of the listing description was not as relevant, though there is likely additional work that could extract more meaning from it (stage 4 -> 5).
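As one illustration of how description sentiment can be turned into a feature, the sketch below attaches a compound polarity score to each listing; the use of NLTK's VADER analyzer and the description column name are assumptions for the sketch, not necessarily the tooling we used.

    # Hypothetical sentiment feature: compound polarity of each listing description.
    # "listings" is the combined DataFrame from the earlier sketches.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")  # one-time lexicon download
    sia = SentimentIntensityAnalyzer()

    listings["description_sentiment"] = listings["description"].fillna("").map(
        lambda text: sia.polarity_scores(text)["compound"]
    )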
In terms of model performance, we learned that no matter how complex we made the simple regressors (Ridge and Lasso), they eventually had a difficult time reaching acceptable accuracy on such a complex task. We had originally hypothesized that one of the traditional regressors would eventually beat out the others, but our work clearly shows the importance of using ensemble methods for a complex regression task.

Finally, throughout the project we learned just how important the complexity of the underlying dataset is. New York was by far the most difficult market to model, and we tried many different variations of both our models and the dataset to alleviate this bottleneck. This was a valuable lesson about working with a real-world dataset: sometimes significantly more work is required to reach acceptable performance, and it is not always clear why. Below we present our outputs for all models and stages, both with and without NYC, to demonstrate this complexity.
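For reference, the with-and-without-NYC comparison can be produced by filtering on the city before re-running the same protocol; the "city" column name below is a placeholder, and listings, learners, and cv refer to the earlier sketches.

    # Hypothetical sketch: score every learner with and without the NYC listings.
    from sklearn.model_selection import cross_val_score

    subsets = {
        "all cities": listings,
        "without NYC": listings[listings["city"] != "New York City"],
    }

    for label, subset in subsets.items():
        X_s = subset.drop(columns=["price", "city"])
        y_s = subset["price"]
        for name, model in learners.items():
            r2 = cross_val_score(model, X_s, y_s, cv=cv, scoring="r2").mean()
            print(f"{label} / {name}: mean R^2 = {r2:.3f}")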