Skills demonstarted:

(pandas, numpy, tensorflow/keras, scikit-learn & matplotlib)

Classifying-Quasars-Galaxies-Stars-using-Photometry

Note: There are 2 notebooks photometry.ipynb (models were initially created and ensembled here) and production.ipynb (already saved models were loaded and used in production)

I have tried to classify Quasars, Galaxies and Stars using photometric data.

Motivations:

Large catalogs of unlabelled stellar objects are available. Labelling these stellar objects is important for a number of reasons. E.g: for statistical poulation analyses and for testing cosmological models to name a few. Although these stellar objects can be classified by analyzing their optical spectrums but that process is time consuming. Next generation of telescopes will increase the quantity of available unlabelled data even more! That's why I have tried to use the photometric data and a combination of machine learning approaches to label the stellar objects.

Based on Kaggle Dataset

Table of Contents photometry.ipynb

1. Data Cleaning & Data Preparation

Loading data as a Pandas Data Frame
Selecting relevant columns/features
Removing unphysical records/rows
Undersampling
Creating training, validation and test splits
Converting data into numpy arrays
Normalizing The Data

2. Data Visualisation

Principal Component Analysis
- Intercative 3d scatter plot

3. Data Classification

Unsupervised Learning
- Gaussian Mixture Model - Clustering
  - Choosing the best Permutation
Supervised Learning
- Neural Network
- XGBoost
- Random Forest

4. Model Ensembling

Simple-Random-Search for weight optimization

5. Conclusions

6. What's Next?

How can we achieve this?
Possible Limitations

Results

The results were:

For objects correctly_classified as Quasars, 90.3% of them had a probability greater than 0.9 of being a Quasar.
For objects correctly_classified as Galaxies, 86.7% of them had a probability greater than 0.9 of being a Galaxy.
For objects correctly_classified as Stars, 99.6% of them had a probability greater than 0.9 of being a Star.

Screenshots from the notebook production.ipynb:

Features Used

Feature Name	Description
u	Ultraviolet filter in the photometric system
g	Green filter in the photometric system
r	Red filter in the photometric system
i	Near Infrared filter in the photometric system
z	Infrared filter in the photometric system
redshift	Redshift value based on the increase in wavelength
class	Object class (galaxy, star, or quasar object)

ansimran / classifying-quasars-galaxies-stars-using-photometry Goto Github PK