The goal of this project is to examine rental price growth across Victorian suburbs and predict which areas will see the greatest growth in the coming years. External factors such as crime, population, income and public transport are taken into account.
The property data was collected in September 2022. The external data is from 2011, 2016 and 2021.
To run the pipeline, please install the dependencies listed in requirements.txt. Please note that the Chrome browser is required for the program to run.
Then, please visit the scripts directory and run the files in order:
(Please note that only step 1, shapefiles.py, needs to be run, since the shapefiles were too large to upload. The raw data for steps 2-7 has already been uploaded, but those steps may be run again if required.)
1. shapefiles.py: This script downloads the Australian SA2 shapefiles from the Australian Bureau of Statistics (ABS) and saves them to the data\raw\shapefiles directory.
2. suburb.py: This script scrapes all 307 SA2-level suburbs and saves them to the data\raw directory.
3. search_suburbs.py: This script uses domain.com.au's autocomplete feature to get the URLs for Victoria's suburbs and saves them to data\raw.
4. generate_urls.py: This script generates all the property URLs by suburb.
5. scrape.py: This script scrapes properties, saving them to the data\raw directory.
6. download_census.py: This script downloads census data for 2011, 2016 and 2021 from the Australian Bureau of Statistics (ABS) at abs.gov.au, saving it to the data\raw directory.
Please visit the notebooks directory at this time.
7. download_crime.ipynb: This notebook downloads crime data, saving it to the data\raw directory.
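The script order above can also be expressed as a small runner. This is only a sketch: the helper names `select_scripts` and `run_scripts` are hypothetical, and it assumes that each script takes no command-line arguments and is started from inside the scripts directory, as the instructions describe. The notebook in step 7 is not covered.

```python
import subprocess
import sys

# Scripts in the order listed above (steps 1-6).
SCRIPTS = [
    "shapefiles.py",
    "suburb.py",
    "search_suburbs.py",
    "generate_urls.py",
    "scrape.py",
    "download_census.py",
]


def select_scripts(only_required=True):
    """Return the scripts to run, in the documented order.

    With only_required=True, only shapefiles.py is returned, since the raw
    data produced by the later steps has already been uploaded.
    """
    return SCRIPTS[:1] if only_required else list(SCRIPTS)


def run_scripts(scripts_dir="scripts", only_required=True):
    """Run each selected script in turn, stopping at the first failure."""
    for name in select_scripts(only_required):
        # Assumption: scripts are launched from inside the scripts directory
        # and need no arguments. check=True raises if a script exits non-zero.
        subprocess.run([sys.executable, name], cwd=scripts_dir, check=True)
```

Calling `run_scripts()` from the repository root would run only shapefiles.py; `run_scripts(only_required=False)` would re-run all six scripts and re-download the raw data.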
Then, please visit the notebooks directory and run these files in order:
- external data preprocessing:
  preprocessing_crime.ipynb
  income_&_population_2021.ipynb
  income_&_population_2011_2016.ipynb
  routing_assignments.ipynb
  shapefiles_visualisation.ipynb
- property data preprocessing:
  preprocess_property.ipynb
  assign_suburbs.ipynb
  visualisation_housing.ipynb
- joining datasets:
  join_datasets.ipynb
  join_isochrones_and_crime_data.ipynb
- analysis and modelling:
  feature_analysis.ipynb
  linear_model.ipynb
  liveability_ranking.ipynb
  neural_network_model.ipynb
  xgboost_model.ipynb
- summary:
  summary.ipynb
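If running the notebooks by hand is inconvenient, the order above could be executed non-interactively with `jupyter nbconvert`. The following is a sketch under assumptions: it presumes nbconvert is installed and `jupyter` is on the PATH (this README does not say whether requirements.txt includes it), that the notebooks are run from inside the notebooks directory, and the helper names `notebook_commands` and `run_notebooks` are hypothetical.

```python
import subprocess

# Notebooks in the order listed above, grouped by stage.
STAGES = {
    "external data preprocessing": [
        "preprocessing_crime.ipynb",
        "income_&_population_2021.ipynb",
        "income_&_population_2011_2016.ipynb",
        "routing_assignments.ipynb",
        "shapefiles_visualisation.ipynb",
    ],
    "property data preprocessing": [
        "preprocess_property.ipynb",
        "assign_suburbs.ipynb",
        "visualisation_housing.ipynb",
    ],
    "joining datasets": [
        "join_datasets.ipynb",
        "join_isochrones_and_crime_data.ipynb",
    ],
    "analysis and modelling": [
        "feature_analysis.ipynb",
        "linear_model.ipynb",
        "liveability_ranking.ipynb",
        "neural_network_model.ipynb",
        "xgboost_model.ipynb",
    ],
    "summary": ["summary.ipynb"],
}


def notebook_commands(stages=STAGES):
    """Build one nbconvert command per notebook, preserving the listed order."""
    commands = []
    for notebooks in stages.values():  # dicts keep insertion order
        for name in notebooks:
            commands.append([
                "jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", name,
            ])
    return commands


def run_notebooks(notebooks_dir="notebooks"):
    """Execute every notebook in place, stopping at the first failure."""
    for command in notebook_commands():
        subprocess.run(command, cwd=notebooks_dir, check=True)
```

Each command is passed as a list rather than a single shell string, so the `&` in file names such as income_&_population_2021.ipynb is not interpreted by a shell.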
Andrew Dharmaputra, 1213935
Arshia Azarhoush, 1175924
Ayesha Tabassum, 1166531
Sophie Sarwesvaran, 1063490
Sureen Tiwana, 912147