Soil-Prediction

Summary

This document describes the 2nd prize solution to the Africa Soil Property Prediction Challenge [1]. The competition requires predicting 5 target soil functional properties from diffuse reflectance infrared spectroscopy measurements. The solution consists of two steps: data preprocessing and model prediction. For the preprocessing stage we used two methods, one applied to targets 1-4 (Ca, P, pH, SOC) and the other to target 5 (Sand). The second step was to feed the processed features to a neural network. To stabilize the CV error we had to average enough models; we ended up with 100 models to get a reasonably stable error.

Feature selection and extraction

To optimize the prediction results, most of the effort went into feature reduction and processing. We first decimated the features by a factor of 8 in the standard manner: low-pass filtering with a 16-tap Hamming window, then keeping every 8th sample. This decimation is coarse and ignores data of a different type (such as Topsoil/Subsoil). We also skipped features 41-99, as they do not contain much information. We ended up with 391 features. For target 5 (Sand), these were used as the input (after linear normalization) to a neural network with two hidden layers (1-4-4-1 architecture). For targets 1-4 we did more processing. First, we calculated the derivative (difference sequence), resulting in 390 features. We then centered the result, and in the last stage we enhanced the high-variance features by multiplying the features point-wise by the standard deviation vector of the data (normalized to the range 0-1).
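The preprocessing steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in soil.m. Several details are assumptions because the text does not pin them down: the unit-sum Hamming window is used directly as the FIR low-pass taps, spectrum edges are handled by replicating the end samples, centering subtracts the per-feature mean across samples, and "normalized to range 0-1" is read as min-max scaling of the standard deviation vector. All function names are hypothetical.

```python
import math


def hamming_window(n_taps):
    """Symmetric Hamming window, scaled to unit sum (unit DC gain)."""
    w = [0.54 - 0.46 * math.cos(2 * math.pi * k / (n_taps - 1))
         for k in range(n_taps)]
    total = sum(w)
    return [v / total for v in w]


def lowpass_decimate(spectrum, factor=8, n_taps=16):
    """Low-pass filter with Hamming-window taps, then keep every `factor`-th sample.

    Assumption: edges are padded by repeating the first/last value.
    """
    taps = hamming_window(n_taps)
    n = len(spectrum)
    half = n_taps // 2
    out = []
    for start in range(0, n, factor):
        acc = 0.0
        for k, t in enumerate(taps):
            idx = min(max(start + k - half, 0), n - 1)
            acc += t * spectrum[idx]
        out.append(acc)
    return out


def difference(row):
    """First difference of one sample: n features become n - 1."""
    return [b - a for a, b in zip(row, row[1:])]


def center_columns(rows):
    """Subtract the per-feature mean (assumption: mean taken across samples)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def std_weights(rows):
    """Per-feature standard deviation, min-max scaled to the range 0-1."""
    n = len(rows)
    stds = []
    for col in zip(*rows):
        m = sum(col) / n
        stds.append(math.sqrt(sum((v - m) ** 2 for v in col) / n))
    lo, hi = min(stds), max(stds)
    span = hi - lo
    return [(s - lo) / span if span > 0 else 1.0 for s in stds]


def preprocess_targets_1_to_4(rows):
    """Difference, center, then weight each feature by its scaled std."""
    centered = center_columns([difference(r) for r in rows])
    weights = std_weights(centered)
    return [[v * w for v, w in zip(row, weights)] for row in centered]
```

Under the min-max reading, the lowest-variance feature gets weight 0 and is effectively dropped. Dividing by the maximum standard deviation instead would keep every feature with a small weight, and the text does not say which was used.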

Modeling Techniques and Training

At the start of the competition we tried a few commonly used ML models such as SVM, KNN, and neural networks. We quickly noted that, for all models, the cross-validation error was very noisy. Given also that the training and test sets are relatively small (1158 data elements for training and 727 for testing), it was clear that overfitting is a big issue here. This makes the leaderboard (LB) results very problematic.

Our model used the MATLAB implementation of neural networks with the trainlm training function, which uses the robust Levenberg-Marquardt algorithm. The layer architecture was 1-4-4-1 (2 hidden layers). We trained the model with 5-fold cross-validation and repeated this 20 times, averaging 100 models overall.
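The averaging scheme (5 folds repeated 20 times, giving 100 models) can be sketched as below. This is an illustration in pure Python, not the MATLAB code: `fit_mean_model` is a hypothetical stand-in for training a trainlm network, and averaging the test predictions of all fold models with equal weight is an assumption about how the 100 models were combined.

```python
import random


def kfold_indices(n_samples, n_folds, rng):
    """Shuffle the sample indices and deal them into n_folds held-out sets."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def fit_mean_model(x_train, y_train):
    """Hypothetical stand-in for network training: always predicts the mean."""
    mean_y = sum(y_train) / len(y_train)
    return lambda x: mean_y


def repeated_cv_ensemble(x, y, x_test, fit, n_folds=5, n_repeats=20, seed=0):
    """Train one model per fold per repeat and average their test predictions.

    Returns (averaged test predictions, pooled CV RMSE, number of models).
    """
    rng = random.Random(seed)
    models = []
    sq_errors = []
    for _ in range(n_repeats):
        for held_out in kfold_indices(len(x), n_folds, rng):
            held = set(held_out)
            train = [i for i in range(len(x)) if i not in held]
            model = fit([x[i] for i in train], [y[i] for i in train])
            models.append(model)
            # Score each model only on the samples it did not see.
            sq_errors.extend((model(x[i]) - y[i]) ** 2 for i in held_out)
    test_pred = [sum(m(xi) for m in models) / len(models) for xi in x_test]
    cv_rmse = (sum(sq_errors) / len(sq_errors)) ** 0.5
    return test_pred, cv_rmse, len(models)
```

With the default arguments, the function trains 5 x 20 = 100 models. Each repeat reshuffles the folds, so pooling the held-out errors over repeats smooths the fold-to-fold noise described above.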

Dependencies

  • The scripts require MATLAB 2014 with the Neural Network Toolbox and the Statistics Toolbox.

Code

The code is a MATLAB script, provided in the repository [2]. The main script, soil.m, includes model building, training, and submission.

References

  1. http://www.kaggle.com/c/afsis-soil-properties "Africa Soil Property Prediction Challenge"
  2. https://github.com/CharlyBi/Soil-Prediction
