Giter Club home page Giter Club logo

cms_image_classification's Introduction

Application of Transfer Learning Tuning to a Convolutional Neural Network for Image classification to the analysis of collisions in High Energy Physics

Table of contents

General Info

This project is based of this paper: https://arxiv.org/pdf/1708.07034.pdf

The program of LHC and CMS records the collision of high energy physics and shares the data in the worldwide LHC Computing Grid, which gives a platform for physicists in 42 countries.With CMS open data, this research applied convolutional neural network to classify tt + jets, W + jets and Drell-Yan processes. We compare the performance of five well-known CNN models and test transfer learning in particle classification.

Project process:

JSON files -> Image Dataset -> Augmented Dataset (optional) -> Transfer Learning Model -> classification results

Technologies

  • Google colab
  • Jupyter notebooks

Original Repos

Requirements

All required module can be found in the requirements.txt file

How to Download Project

git clone https://github.com/jzyee/cms_image_classification

How to get json files

json-collisons README.md

How to Create Images

  1. After creating the json files transfer the json files
  • from folder: json-cms/AnalysisFW/python/outputjsons
  • to folder: json_files
  1. Run create_images.ipynb

This will create the images in the CreatedImages folder

How to Create Augmented Images(optional)

  • not advised if user is planning to use the free gpu from provided by google colab. User will require a stronger GPU for training to have decent training times
  1. Run create_augments.ipynb

This will create the images in the CreatedImages/output folder

How to Tune a Transfer Learning Model

In this part, we can use Google Colaboratory to run the notebook as it provides GPU accelerators which help accelerate the training process. You will need a google drive account to upload your dataset into.

  1. Upload the CreatedImages folder to your google drive
  2. Rename the folders DYjets Images,TTjets Images,Wjets Images to 0,1,2 respectively.
  3. Download a model that you would like to use or all the .ipynb files in this folder, and their filenames represent the different models and activation function (e.g. “vgg19_quark_classification_RMSprop.ipynb” means using VGG19 model with RMSprop as its activation function)

An example of how to download an .ipynb file:
wget https://github.com/jzyee/cms_image_classification/blob/master/Training_Model/incepV3_quark_classification.ipynb
  1. Open the link: https://colab.research.google.com, create a new python3 notebook, and upload the .ipynb files via “File->Upload notebook” in the menu at the top left of the page. Then you can see the codes are successfully loaded in google colaboratory.

  2. To use GPU accelerator, you can change the setting via “Runtime->Change runtime type->Hardware accelerator->GPU->SAVE”. If loaded successfully, in the third cell it would print out ‘Found GPU at: /device:GPU:0’.

  3. To rerun the codes, images are needed. According to our codes, images should be uploaded to your google drive.In the fifth cell, it will grant you access to your google drive by mounting the drive. You will need to change the image paths accordingly to where you have stored them on your drive

  4. Each notebook is essestially is made up of 2 main parts.

    • notebook setup
    • model

    For notebook setup:
    Run all the cells under this header, the cells under this header will help you install the packages necessary and create your dataset. You will need to alter your working directory and your image path to where you have stored your images in your google drive account, you will find these particular cells under the heading data prep

    An example of setting the current working directory in colab to 'drive/My Drive/lhc_durham':

     %cd drive/'My Drive/lhc_durham'
    

    An example of setting the image path in relation to the working directory in colab (drive/My Drive/lhc_durham/filetered_images):

     img_folder = '/filtered_images'
    

    Model section :
    Each model section is broken up into 4 parts:

    1. entirely frozen
    2. few layers unfrozen
    3. many layers unfrozen
    4. entirely unfrozen

    alt text alt text

    Figure 1.0 Transfer learning tuning graph

    You can choose to run one section to achieve a model tuned via a particular frozen-unfrozen proportion or run all the cells in the notebook to see the performance of the model with different frozen-unfrozen proportions. This technique of tuning the model is inspired by the above graph

  5. If you run all the cells in the notebook, the trained models will be saved along with fitting history. This is so that you can load the model at a later time to carry the prediction again without the training.

  6. Congratulations you have carried out transfer learning model tuning!

References and Useful Links

  1. the European Organization for Nuclear Research (2011). CMS detector design. Available at: http://cms.web.cern.ch/news/cms-detector-design. (Accessed: November 2011)
  2. the European Organization for Nuclear Research (2011). How CMS detects particles. Available at: http://cms.web.cern.ch/news/how-cms-detects-particles. (Accessed: November 2011)
  3. CREN Openlab (2017). White Paper: Future Challenges in Scientific Research. Available at:http://cds.cern.ch/record/2301895/files/Whitepaper_brochure_ONLINE.pdf. (Accessed: September 2017)
  4. towards data science (2018) Transfer learning from pre-trained models. Available at: https://towardsdatascience.com/transfer-learning-from-pre-trained-models-f2393f124751

cms_image_classification's People

Contributors

jzyee avatar chuckwong13 avatar ladderseeker avatar jiawanghao1997 avatar mammmmmm avatar

Stargazers

 avatar

Watchers

 avatar paper2code - bot avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.