Airbnb Research

Table of Contents

Open Findings Here
Key Points
Files Description

Key Points

CRISP-DM Process (Cross-Industry Standard Process for Data Mining)
Business Understanding
Data Understanding
Prepare Data
Data Modeling
Evaluate Results
Working with Categorical Values


The Anaconda distribution of Python is all that is needed to run this project.

This project follows the CRISP-DM Process (Cross-Industry Standard Process for Data Mining):

  1. Business Understanding
  2. Data Understanding
  3. Preparing Data
  4. Data Modeling
  5. Results


In this project we are interested in answering questions about the Airbnb business. A few of the questions we want to answer are: What can we do to attract more customers? What are the trends for people visiting our city? What influences people to travel more during a specific time? Are prices a factor in losing business? There are many others.

Possible questions:

  - Is there a relationship between ratings and price?
  - Are good reviews directly associated with a higher price?
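One way to start exploring the price and rating question is a simple correlation. The sketch below is only illustrative: the rows are made up, and the column names `price` and `review_scores_rating`, as well as the assumption that prices are stored as strings such as `$1,200.00`, should be checked against the actual listings file.

```python
import pandas as pd

# Hypothetical sample rows; the real listings file has many more rows and columns.
listings = pd.DataFrame({
    "price": ["$85.00", "$150.00", "$1,200.00", "$60.00", "$210.00"],
    "review_scores_rating": [92.0, 97.0, None, 88.0, 95.0],  # assumed column name
})

# Strip the currency symbol and thousands separator, then convert to float.
listings["price_num"] = (
    listings["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# Drop listings without a rating before measuring the relationship.
rated = listings.dropna(subset=["review_scores_rating"])

# Pearson correlation between numeric price and rating.
corr = rated["price_num"].corr(rated["review_scores_rating"])
print(f"Correlation between price and rating: {corr:.2f}")
```

A correlation only describes a linear association; it does not show that good reviews cause higher prices.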


To answer these questions we need a better understanding of the data available to us. We use open data published on the Kaggle website, where we found data on Airbnb listings in the city of Boston. Looking at the listing.csv file, we can see columns such as price, bathrooms, bedrooms, minimum_nights, and maximum_nights, among many others. This is a good starting point for answering a few of the questions.
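A first look at these columns could be sketched as follows. To keep the example self-contained it reads a small in-memory stand-in with made-up rows instead of the real file; with the actual data, the path to the downloaded CSV would be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# In-memory stand-in for the listings CSV (hypothetical rows, real column names
# taken from the description above).
sample_csv = io.StringIO(
    "id,price,bathrooms,bedrooms,minimum_nights,maximum_nights\n"
    "1,$85.00,1.0,1,2,30\n"
    "2,$150.00,1.5,2,1,1125\n"
    "3,$60.00,,1,3,14\n"
)
listings = pd.read_csv(sample_csv)

columns_of_interest = [
    "price", "bathrooms", "bedrooms", "minimum_nights", "maximum_nights",
]

# Data types show which columns need cleaning before modeling.
print(listings[columns_of_interest].dtypes)

# Share of missing values per column.
missing_share = listings[columns_of_interest].isna().mean()
print(missing_share)
```

Checking types and missing values early helps decide which columns need cleaning or imputing in the data preparation step.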

Next, we will prepare the data so that we can start visualizing it with the help of data modeling techniques.


We are working with two sets of data, Boston and Seattle.


We do this to see how well our model fits.


The main findings of the code can be found at the post available here.



Our data comes from the internet. We use the Seattle Airbnb CSV files that we found on Kaggle.