
paper-hdrda's People

Contributors

philyoung007, ramhiser


paper-hdrda's Issues

Revamp discussion

Currently, the discussion is weak. Rewrite it. Items that should be discussed:

  • With appropriate choices of alpha and gamma, our shrinkage framework includes several existing methods as special cases
  • Allowing the covariance matrices to differ relaxes the linearity assumption commonly imposed on high-dimensional data (i.e., LDA)
  • We do not require the restrictive assumption that features are uncorrelated/independent

Fine-tune intro

The CSDA reviewers made a few comments that led me to believe the intro should be clearer. With that in mind, we need to fine-tune the intro and ensure we communicate the following:

  • We modify RDA to gain interpretability
  • HDRDA inherently yields dimension reduction
  • HDRDA is much faster than RDA
    • Mention savings in terms of Big Oh
    • Timing comparison (from #26)
  • HDRDA should be used instead of RDA (for p > N?)
    • Add RDA to classification study?
  • HDRDA is competitive in terms of classification performance

Add timing comparison

Previously, I coded up a timing comparison of simdiag::hdrda vs. klaR::rda. We'll bring that back with a more exhaustive comparison and also include PenalizedLDA::PenalizedLDA from Witten and Tibshirani (2011).

I'll also add the diagonal classifiers but will likely forgo including them in the paper. We will need a remark that loosely justifies this omission. Frankly, the diagonal classifiers will be much faster because they are simpler, but at the cost of classification accuracy.

In the Witten and Tibshirani (2011) paper, they also perform a timing comparison with 4 populations with varying feature dimensions:

  • p=20
  • p=200
  • p=2000
  • p=20000

They perform the timing comparison over 25 repetitions and report the mean and standard deviation of the runtimes. We'll do something similar but with a lot more repetitions.
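The measurement protocol above (repeated fits, then mean and standard deviation of runtimes) can be sketched as follows. This is a Python sketch of the harness only: `dummy_fit` is a stand-in for the actual R routines (`simdiag::hdrda`, `klaR::rda`, `PenalizedLDA::PenalizedLDA`), and the sample sizes are placeholders.

```python
import random
import statistics
import time

def time_classifier(fit, X, y, reps=25):
    """Time fit(X, y) over reps repetitions; return (mean, sd) of runtimes in seconds."""
    runtimes = []
    for _ in range(reps):
        start = time.perf_counter()
        fit(X, y)
        runtimes.append(time.perf_counter() - start)
    return statistics.mean(runtimes), statistics.stdev(runtimes)

def dummy_fit(X, y):
    # Stand-in for an actual training routine (e.g., HDRDA or RDA).
    return [sum(row) for row in X]

# Vary the feature dimension as in Witten and Tibshirani (2011).
for p in (20, 200, 2000):
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(40)]
    y = [i % 4 for i in range(40)]  # 4 populations
    mean_t, sd_t = time_classifier(dummy_fit, X, y)
    print(f"p={p}: mean {mean_t:.2e}s, sd {sd_t:.2e}s")
```

Swapping the dummy for each real classifier gives one (mean, sd) row per (classifier, p) combination, which is the table we want in the paper.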

Revamp two computational complexity paragraphs

These two paragraphs immediately follow Proposition 1 and are found on pages 9-10 of the AoAS submission.

  • Make more concise
  • Incorporate computational complexity
  • (Optional) Write algorithm for training/model selection (JCGS papers do this)
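If we do write out the training/model-selection algorithm, its core is a cross-validated grid search over the tuning parameters. A minimal sketch, assuming a hypothetical `train_fn(X, y, lam, gam)` that returns a predictor (the name and signature are placeholders, not the paper's notation):

```python
import itertools
import numpy as np

def cv_error(train_fn, X, y, lam, gam, n_folds=5, seed=0):
    """Cross-validation error of one (lambda, gamma) pair."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        predict = train_fn(X[train], y[train], lam, gam)
        errors.append(np.mean(predict(X[fold]) != y[fold]))
    return float(np.mean(errors))

def select_model(train_fn, X, y, lambdas, gammas):
    """Pick the (lambda, gamma) pair with the smallest CV error."""
    return min(itertools.product(lambdas, gammas),
               key=lambda pair: cv_error(train_fn, X, y, *pair))
```

A pseudocode version of exactly this loop would satisfy the JCGS-style algorithm box.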

Consider classification study with simulated data

Although I find classification studies with simulated data largely pointless, I'd rather add roughly two simulation configurations to ensure the paper is published. If a referee desires more than two, so be it. But two should be good enough.
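For reference, each configuration amounts to drawing labeled samples from a handful of Gaussian populations. A minimal sketch of two such configurations (the dimensions, mean shifts, and covariance scalings here are placeholders, not the settings we would publish):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(n_per_class, p, k=4, unequal_cov=False):
    """Draw k Gaussian populations; mean shifts separate the classes, and
    unequal_cov rescales each class covariance (all settings are placeholders)."""
    X_parts, y_parts = [], []
    for j in range(k):
        mean = np.zeros(p)
        mean[: min(10, p)] = 0.5 * j          # shift the first few coordinates
        sd = np.sqrt(1.0 + j) if unequal_cov else 1.0
        X_parts.append(rng.normal(mean, sd, size=(n_per_class, p)))
        y_parts.append(np.full(n_per_class, j))
    return np.vstack(X_parts), np.concatenate(y_parts)

# Configuration 1: common covariance. Configuration 2: unequal covariances.
X1, y1 = simulate(n_per_class=25, p=500)
X2, y2 = simulate(n_per_class=25, p=500, unequal_cov=True)
```

The common-covariance case favors LDA-type classifiers; the unequal-covariance case is where relaxing the linearity assumption should pay off.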

Change title and name of classifier

Our proposed classifier clearly does not generalize the RDA classifier but instead improves and modernizes it for high-dimensional data. With this in mind, we need a slick name. The title of the paper should reflect the name somehow.

Add coauthor organizations to paper

  • JAR

    uStudio, Inc.
    1806 Rio Grande St
    Austin, TX 78701

  • CKS

    Myeloma Institute
    University of Arkansas for Medical Sciences
    4301 West Markham # 816
    Little Rock, Arkansas 72205

  • PDY

    Department of Management and Information Systems
    Baylor University
    One Bear Place #98005
    Waco, Texas 76798-7140

  • DMY

    Department of Statistical Science
    Baylor University
    One Bear Place #97140
    Waco, Texas 76798-7140

Update \oplus notation to 2x2 block-diagonal

The \oplus notation used in the paper is less conventional and may be a bit confusing. Instead, we'll switch to 2x2 block-diagonal matrices.

For example, in equation 10, we use the notation W_k \oplus \gamma I_{p-q}. Instead, we should replace this with:

\begin{bmatrix}
W_k & 0 \\
0 & \gamma I_{p-q}
\end{bmatrix}

The results will be more intelligible. It will take a bit of effort though to ensure that no orphan notation is introduced.
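As a sanity check on the replacement, W_k \oplus \gamma I_{p-q} and the 2x2 block-diagonal matrix are the same object. A quick numerical check (the sizes, gamma, and the entries of W_k below are illustrative, not values from the paper):

```python
import numpy as np

# Illustrative sizes only: q = 3, p = 5, gamma = 0.5.
q, p, gamma = 3, 5, 0.5
W_k = np.array([[2.0, 0.3, 0.1],
                [0.3, 1.5, 0.2],
                [0.1, 0.2, 1.0]])   # stand-in for the q x q block W_k

# W_k \oplus gamma * I_{p-q} written as a 2x2 block-diagonal matrix.
Sigma = np.block([
    [W_k,                  np.zeros((q, p - q))],
    [np.zeros((p - q, q)), gamma * np.eye(p - q)],
])
```

The off-diagonal blocks are exactly zero, which is the point the bmatrix display makes visually.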

Rewrite salespitch of HDRDA classifier in introduction

After the regularization literature review in the introduction, we begin with "Here, we propose the high-dimensional RDA classifier..." I am now of the opinion that this is not the route we want to go. It does not reference Friedman's (1989) RDA classifier, and at least one reviewer was critical of this.

The wording then needs to change but should still maintain a strong presence. Here are working blurbs to add to the paragraph or to replace the original sentences.

(After brief discussion of Friedman's classifier) We reparameterize Friedman's (1989) RDA classifier so that the resulting covariance-matrix estimator is a convex combination of ... Our parameterization improves the interpretation of the contribution of each observation weighted by the pooling parameter. We show that our parameterization yields an equivalent, dual decision function that can be computed efficiently for p >> N.

Create figure showing contours as a function of lambda

The idea here is to demonstrate the effect of the tuning parameter lambda. One emphasis in the paper that we are leaning towards is stressing the benefits of relaxing the linearity assumptions of the LDA classifier.

The figure should display the following:

  • Contours for approximately 4-5 populations
  • The covariance matrices should be obviously different when lambda = 0
  • The contours should be identical for lambda = 1
  • Display 5 subfigures, one each for lambda = 0, 0.25, 0.5, 0.75, 1

When lambda is introduced in the paper, add one sentence that says we demonstrate the effect of lambda in the figure.
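Assuming lambda enters as the usual convex combination of each class covariance with the pooled covariance (the exact form in the paper may differ), the collapse of the contours can be checked numerically before plotting:

```python
import numpy as np

# Class covariances that clearly differ (stand-ins for the 4-5 populations).
Sigmas = [np.diag([1.0, 4.0]),
          np.diag([4.0, 1.0]),
          np.array([[2.0, 1.2],
                    [1.2, 2.0]])]
Sigma_pool = sum(Sigmas) / len(Sigmas)

def shrunk(Sigma_k, lam):
    # Assumed convex-combination form of the lambda shrinkage.
    return (1 - lam) * Sigma_k + lam * Sigma_pool

for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    mats = [shrunk(S, lam) for S in Sigmas]
    spread = max(np.abs(m - mats[0]).max() for m in mats)
    print(f"lambda={lam}: max pairwise difference = {spread:.3f}")
# The spread reaches 0 at lambda = 1: every class shares Sigma_pool,
# so the contours in the final subfigure are identical.
```

Drawing the ellipse contours of `shrunk(S, lam)` at the five lambda values gives exactly the subfigure grid described above.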

Rewrite abstract

I am not happy with the current abstract. It does not sell the paper well enough.

Upload final draft to arXiv

After Mrs. Young proofreads the paper and I've applied the edits, upload the paper to arXiv before submitting it to CSDA.

Discuss computational complexity

We have stressed that our proposed classifier is much faster. We need to add more details to back up our claim.

  • Provide computational complexity for proposed classifier
  • Contrast with computational complexity of the original RDA classifier
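One concrete way to make the contrast: for p >> N, the nonzero eigenvalues of the p x p scatter matrix can be recovered from the N x N Gram matrix, replacing an O(p^3) decomposition with an O(N^3) one. This is a standard identity, not the paper's derivation; a quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 20, 1000                       # the p >> N regime
X = rng.standard_normal((N, p))

# Decomposing the N x N Gram matrix costs O(N^3) instead of the O(p^3)
# needed for the p x p matrix, yet yields the same nonzero eigenvalues.
gram_eigs = np.linalg.eigvalsh(X @ X.T)   # N x N
cov_eigs = np.linalg.eigvalsh(X.T @ X)    # p x p, computed only to verify

assert np.allclose(np.sort(gram_eigs), np.sort(cov_eigs)[-N:])
```

A sentence with this flavor, plus the actual per-step costs of the proposed classifier vs. Friedman's RDA, should be enough to back the speed claim.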
