HTRU1

The HTRU1 Batched Dataset is a subset of the HTRU Medlat Training Data, a collection of labeled pulsar candidates from the intermediate galactic latitude part of the HTRU survey. HTRU1 was originally assembled to train the SPINN pulsar classifier. If you use this dataset please cite:

SPINN: a straightforward machine learning solution to the pulsar candidate selection problem. V. Morello, E.D. Barr, M. Bailes, C.M. Flynn, E.F. Keane and W. van Straten, 2014, Monthly Notices of the Royal Astronomical Society, vol. 443, pp. 1651-1662, arXiv:1406.3627

The High Time Resolution Universe Pulsar Survey - I. System Configuration and Initial Discoveries. M. J. Keith et al., 2010, Monthly Notices of the Royal Astronomical Society, vol. 409, pp. 619-627, arXiv:1006.5744

The full HTRU dataset is available here.

The HTRU1 Batched Dataset

The HTRU1 Batched Dataset consists of 60000 32x32 images in 2 classes: pulsar & non-pulsar. Each image has 3 channels (equivalent to RGB), but the channels contain different information:

  • Channel 0: Period Correction - Dispersion Measure surface
  • Channel 1: Phase - Sub-band surface
  • Channel 2: Phase - Sub-integration surface

There are 50000 training images and 10000 test images. The HTRU1 Batched Dataset is inspired by the CIFAR-10 Dataset.

The dataset is divided into five training batches and one test batch. Each batch contains 10000 images. These are in random order, but each batch contains the same balance of pulsar and non-pulsar images. Between them, the six batches contain 1194 true pulsars and 58806 non-pulsars.

This is an imbalanced dataset: only about 2% of the images (1194 of 60000) are true pulsars.
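The counts quoted above are enough to work out the per-batch balance and the accuracy of a trivial classifier. The following is a minimal sketch using only those numbers; the inverse-frequency weighting at the end is one common convention chosen here for illustration, not something the dataset prescribes.

```python
# Counts quoted in this README for all six batches combined.
N_PULSAR = 1194
N_NON_PULSAR = 58806
N_BATCHES = 6

total = N_PULSAR + N_NON_PULSAR
pulsar_fraction = N_PULSAR / total

# Every batch has the same balance, so the counts divide evenly.
pulsars_per_batch = N_PULSAR // N_BATCHES
non_pulsars_per_batch = N_NON_PULSAR // N_BATCHES

# Accuracy of a classifier that always answers "non-pulsar".
baseline_accuracy = N_NON_PULSAR / total

# Assumed convention: weight = total / (n_classes * class_count).
weights = {
    "pulsar": total / (2 * N_PULSAR),
    "non-pulsar": total / (2 * N_NON_PULSAR),
}

print(f"pulsar fraction: {pulsar_fraction:.4f}")
print(f"per batch: {pulsars_per_batch} pulsars, {non_pulsars_per_batch} non-pulsars")
print(f"always-non-pulsar accuracy: {baseline_accuracy:.4f}")
print(f"class weights: {weights}")
```

Because the always-non-pulsar baseline already scores about 98%, plain accuracy says little here; per-class measures such as recall on the pulsar class are usually more informative.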

Pulsar: ten example images (pulsar1 to pulsar10)

Non-pulsar: ten example images (cand1 to cand10)

Using the Dataset in PyTorch

The htru1.py file defines the HTRU1 class, a Dataset in the style of the torchvision datasets, for the HTRU1 Batched Dataset.

To use it with PyTorch in Python, first import the torchvision datasets and transforms libraries:

from torchvision import datasets
import torchvision.transforms as transforms

Then import the HTRU1 class:

from htru1 import HTRU1

Define the transform:

# convert data to a normalized torch.FloatTensor
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(), # randomly flip and rotate
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])
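To see what the last two steps of this transform do to pixel values, the arithmetic can be reproduced in plain Python. This is a sketch of the scaling only, assuming 8-bit input pixels; the helper names to_tensor_value and normalize_value are hypothetical and are not part of torchvision.

```python
def to_tensor_value(pixel):
    """Mimic the scaling ToTensor applies to an 8-bit pixel: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def normalize_value(value, mean=0.5, std=0.5):
    """Mimic Normalize for a single value: (value - mean) / std."""
    return (value - mean) / std


# Walk a few pixel values through both steps.
for pixel in (0, 128, 255):
    scaled = to_tensor_value(pixel)
    print(pixel, "->", round(scaled, 4), "->", round(normalize_value(scaled), 4))
```

With a mean and standard deviation of 0.5 per channel, the [0, 1] range produced by ToTensor is mapped onto [-1, 1].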

Read the HTRU1 dataset:

# choose the training and test datasets
train_data = HTRU1('data', train=True, download=True, transform=transform)
test_data = HTRU1('data', train=False, download=True, transform=transform)
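Once loaded, the datasets would typically be wrapped in a DataLoader for batching. The sketch below assumes HTRU1 behaves like any map-style PyTorch Dataset returning (image, label) pairs; to stay runnable without downloading anything it uses a random stand-in with the same image shape, and the helper make_loader and the batch size of 20 are illustrative choices.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(dataset, batch_size=20, shuffle=True):
    """Wrap a map-style Dataset in a DataLoader (hypothetical helper)."""
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)


# Stand-in data: 100 random 3 x 32 x 32 "images" with binary labels.
# With the real dataset, pass train_data or test_data instead.
fake_images = torch.rand(100, 3, 32, 32)
fake_labels = torch.randint(0, 2, (100,))
fake_data = TensorDataset(fake_images, fake_labels)

loader = make_loader(fake_data, batch_size=20)

# Pull one batch and inspect its shape.
images, labels = next(iter(loader))
print(tuple(images.shape), tuple(labels.shape))
```

For the real data the call would be make_loader(train_data) for training and make_loader(test_data, shuffle=False) for evaluation.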

Using Individual Channels in PyTorch

If you want to use only one of the "channels" in the HTRU1 Batched Dataset, you can extract it using the torchvision generic transform transforms.Lambda.

This function extracts a specific channel ("c") and writes the image of that channel out as a greyscale PIL Image:

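import numpy as np
from PIL import Image

def select_channel(x, c):
    # convert the 3-channel PIL Image to a (height, width, 3) uint8 array
    np_img = np.array(x, dtype=np.uint8)
    # keep only channel c as a 2-D array
    ch_img = np_img[:, :, c]
    # a 2-D uint8 array is read as a greyscale (mode 'L') PIL Image
    img = Image.fromarray(ch_img)

    return img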

You can add it to your PyTorch transforms like this:

transform = transforms.Compose(
   [transforms.Lambda(lambda x: select_channel(x,0)),
    transforms.ToTensor(),
    transforms.Normalize([0.5],[0.5])])
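A quick way to check the channel extraction without downloading the dataset is to run it on a synthetic image. This sketch assumes the input to the transform is a 3-channel PIL Image of size 32x32; it repeats a compact version of select_channel so that it runs on its own, and cross-checks the result against Pillow's built-in Image.getchannel.

```python
import numpy as np
from PIL import Image


def select_channel(x, c):
    """Return channel c of a 3-channel PIL Image as a greyscale PIL Image."""
    np_img = np.array(x, dtype=np.uint8)
    return Image.fromarray(np_img[:, :, c])


# Synthetic 32 x 32 image with 3 channels of random 8-bit values.
rng = np.random.default_rng(0)
fake = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
rgb = Image.fromarray(fake)

grey = select_channel(rgb, 1)
print(grey.mode, grey.size)

# The extracted image matches the original channel and Pillow's own getchannel.
same_as_array = np.array_equal(np.array(grey), fake[:, :, 1])
same_as_pillow = np.array_equal(np.array(grey), np.array(rgb.getchannel(1)))
print(same_as_array, same_as_pillow)
```

After ToTensor, a single-channel image of this kind becomes a 1 x 32 x 32 tensor, which is why Normalize takes one-element lists in the transform above.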

Jupyter Notebooks

Examples of classification using the HTRU1 class in PyTorch are provided as Jupyter notebooks: one treats each image as a 3-channel (RGB) image, and the other extracts an individual channel as a greyscale image.

These are examples for demonstration only - please don't use them for science!
