Issue discussions for physmap_manuscript, from erickenjilee

erickenjilee / physmap_manuscript

This repo is provided for the reproduction of figures and analyses found in the manuscript Lee et al. 2024, bioRxiv. It should be useful for both reviewers and interested researchers.

R 95.99% MATLAB 4.01%

physmap_manuscript's Issues

Question about CellExplorerV2 classification paper vs. code

Hello! First of all, congrats on the amazing manuscript. I have a question about some differences between the paper and the codebase. In the paper, you write:

We conducted an 15-85% test-train split and trained a gradient boosted tree model (GBM) with five-fold cross-validation on this dataset to identify one of five cell types that had over 25 examples each and this process was repeated 10 times with different random seeds

However, the codebase contains the following lines:

i <- createDataPartition(E$origCells, times = 1, p = 0.7, list = FALSE)
training = E[i[,1],]
testingset = E[-i[,1],]
   
ctrl <- trainControl(method = "repeatedcv", number=numreps)
#fit a regression model and use k-fold CV to evaluate performance
model <- train(origCells~., data = training, method = "rf", trControl = ctrl, verbose=FALSE)

where numreps=2. It seems the train-test split is 70-30 (not 85-15) and that you are using 2-fold cross-validation (not 5-fold). Could you help us understand these differences?
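
For comparison, here is a minimal sketch of what a caret configuration following the paper's description (85-15 split, GBM, five-fold cross-validation, 10 seeds) might look like. This is not the authors' code: the built-in iris data and its Species column stand in for E and origCells, and run_once is a hypothetical helper name. In caret, the number argument of trainControl sets the number of folds (repeats sets the repetitions for "repeatedcv"), and method = "rf" fits a random forest while method = "gbm" fits a gradient boosted tree model.

```r
# Sketch only: iris stands in for the feature table E, Species for E$origCells.
library(caret)  # method = "gbm" additionally requires the gbm package

# Hypothetical helper: one split/train/evaluate cycle for a given seed.
run_once <- function(data, label, seed) {
  set.seed(seed)
  # p = 0.85 keeps 85% of rows for training, stratified by class label
  idx <- createDataPartition(data[[label]], times = 1, p = 0.85, list = FALSE)
  training <- data[idx[, 1], ]
  testing  <- data[-idx[, 1], ]

  # method = "cv" with number = 5 requests plain five-fold cross-validation
  ctrl <- trainControl(method = "cv", number = 5)

  # method = "gbm" selects gradient boosted trees ("rf" would be a random forest)
  model <- train(as.formula(paste(label, "~ .")), data = training,
                 method = "gbm", trControl = ctrl, verbose = FALSE)

  # held-out accuracy for this seed
  mean(predict(model, newdata = testing) == testing[[label]])
}

seeds <- 1:10  # ten repetitions with different random seeds
accuracies <- sapply(seeds, function(s) run_once(iris, "Species", s))
print(summary(accuracies))
```

Setting p = 0.7, method = "repeatedcv" with number = 2, and method = "rf" in this sketch would reproduce the settings quoted from the repository above.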
