
Comments (5)

ayushpatnaikgit commented on June 15, 2024

I am making a bare minimum multistage survey design.

from survey.jl.

smishr commented on June 15, 2024

I'm not sure the approximations for the mean and variance would work out this way. You may be mathematically correct, but it should be checked beforehand. The current attempt at SurveyDesign tries to infer whether the given inputs/data are SRS or stratified.

Will look more into this.

smishr commented on June 15, 2024

"In the single-stage approximation the PSUs are treated as strata and the second-stage sampling units are treated as PSUs."

Rereading Chapter 3 of Lumley.

smishr commented on June 15, 2024

I am attempting to make a structure for cluster sampling, which accommodates multiple stages. My work in progress is in #134.

smishr commented on June 15, 2024

@ayushpatnaikgit @iuliadmtru
I have been going through R's svydesign.default function for the following two-stage cluster sample command:

dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)

I ran trace(svydesign, browser) in R and stepped through each call.

We could take some cues from R for the generalised survey design.

Observations

  • They have a lot of checks for NULL and for argument types, plus fallback logic for arguments that are not given. In Julia this could/should be handled through multiple dispatch, which would improve legibility and code quality.
  • Multiple clusters/strata are stored together in string-concatenated form, separated by '.'. E.g. in the above command the two cluster columns (dnum and snum) appear to be stored in the design object as dnum and dnum+'.'+snum!
  • Strata are created and filled even if no strata are provided. See the appendix below.
  • There is an allprobs matrix and a probs vector. allprobs has the probabilities for each stage, while probs seems to be the net sampling probability (the row-wise product across the stage columns of allprobs): rval$prob <- apply(probs, 1, prod); rval$allprob <- probs
  • Weights are not stored in the design object, only probs. If they ever need weights, they just do as.matrix(1/probs).
  • There is a neat function called as.fpc which calculates and returns the popsize and sampsize for the design, given all arguments: fpc <- as.fpc(fpc, strata, ids, pps = pps)
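
To make the allprobs/probs relationship concrete, here is a minimal Python sketch of the same arithmetic. It is not the R package's code: the function names and the three-row example matrix are made up for illustration, and it only assumes that each row is one sampled unit and each column is one stage.

```python
import math

def combine_stage_probs(allprobs):
    """Row-wise product of per-stage probabilities (mirrors apply(probs, 1, prod))."""
    return [math.prod(row) for row in allprobs]

def weights_from_probs(probs):
    """Weights are not stored; they are recovered on demand as 1/probs."""
    return [1.0 / p for p in probs]

# Hypothetical two-stage design: column 0 is the stage-1 inclusion
# probability, column 1 is the stage-2 probability within the chosen PSU.
allprobs = [
    [0.5, 0.25],
    [0.5, 0.5],
    [0.1, 1.0],
]

probs = combine_stage_probs(allprobs)   # net probability per row
weights = weights_from_probs(probs)     # sampling weight per row
```

Storing only the probabilities and deriving weights when needed keeps a single source of truth, which is one possible design for the Julia struct as well.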

fpc correction

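This is the R logic for when neither probs nor weights are given: if popsize could not be inferred either, equal probabilities are assumed with a warning; otherwise the probabilities are derived from the fpc. Related to #110 and #93. So it is not that different from what is currently implemented for SRS and stratified designs?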

if (is.null(probs) && is.null(weights)) {
  if (is.null(fpc$popsize)) {
    if (missing(probs) && missing(weights)) 
      warning("No weights or probabilities supplied, assuming equal probability")
    probs <- rep(1, nrow(ids))
  }
  else {
    probs <- 1/weights(fpc, final = FALSE)
  }
}
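
The fallback order can be sketched in Python as below. This is a simplified single-stage illustration, not a port of svydesign: resolve_probs and its parameters are hypothetical names, the branches for explicitly supplied probs or weights are my assumption about the surrounding logic, and sampsize/popsize stands in for R's 1/weights(fpc, final = FALSE).

```python
import warnings

def resolve_probs(n_rows, probs=None, weights=None, popsize=None, sampsize=None):
    """Pick sampling probabilities from whatever the caller supplied."""
    if probs is not None:
        return list(probs)                      # assumed: explicit probs win
    if weights is not None:
        return [1.0 / w for w in weights]       # assumed: weights invert to probs
    if popsize is None:
        # Nothing usable was given: same warning text as the R snippet above.
        warnings.warn("No weights or probabilities supplied, assuming equal probability")
        return [1.0] * n_rows
    # Population size known: probability = sample size / population size.
    return [n / N for n, N in zip(sampsize, popsize)]

# Usage with made-up numbers: 10 units sampled out of 100 in each row's stratum.
from_fpc = resolve_probs(2, popsize=[100, 100], sampsize=[10, 10])
```

In Julia, each of these branches could become its own method or constructor instead of a chain of nothing checks.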

Appendix

Observe how they store strata even when the design is not stratified. dclus2[["strata"]][["V2"]] is a factor with 40 levels, each of which looks like a constant 1 joined by '.' to the first-stage cluster id:

> dclus2[["strata"]][["V2"]]
  [1] 1.15  1.63  1.83  1.83  1.83  1.117 1.132 1.132 1.132 1.152 1.152
 [12] 1.152 1.173 1.173 1.173 1.173 1.176 1.198 1.198 1.198 1.198 1.200
 [23] 1.200 1.200 1.200 1.200 1.228 1.228 1.264 1.295 1.295 1.295 1.295
 [34] 1.295 1.302 1.302 1.302 1.302 1.403 1.403 1.403 1.403 1.403 1.452
 [45] 1.452 1.452 1.452 1.456 1.480 1.480 1.480 1.480 1.480 1.523 1.523
 [56] 1.534 1.534 1.534 1.534 1.534 1.549 1.549 1.549 1.549 1.549 1.552
 [67] 1.552 1.570 1.570 1.570 1.570 1.570 1.574 1.575 1.575 1.575 1.575
 [78] 1.575 1.596 1.596 1.596 1.596 1.596 1.620 1.620 1.620 1.620 1.620
 [89] 1.638 1.638 1.638 1.638 1.638 1.639 1.639 1.639 1.639 1.639 1.674
[100] 1.674 1.679 1.679 1.679 1.679 1.687 1.687 1.687 1.701 1.701 1.711
[111] 1.711 1.719 1.731 1.731 1.731 1.731 1.731 1.742 1.768 1.768 1.781
[122] 1.781 1.781 1.781 1.781 1.795
40 Levels: 1.117 1.132 1.15 1.152 1.173 1.176 1.198 1.200 ... 1.83
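
One way to read these labels is as a constant first-stage stratum 1 joined by '.' to the first-stage cluster id, which would match the single-stage approximation quoted earlier (PSUs treated as strata). The Python sketch below reproduces that labelling under this assumption; nested_labels is a hypothetical helper, and the ids 15, 63, 83 are simply read off the first entries of the output above.

```python
def nested_labels(*columns, sep="."):
    """Join the values of several id columns row by row, e.g. 1 and 15 -> '1.15'."""
    return [sep.join(str(value) for value in row) for row in zip(*columns)]

# Assumed inputs: one dummy stratum for every row, plus the first-stage
# cluster id of each row (first four rows of the output above).
stage1_strata = [1, 1, 1, 1]
stage1_cluster = [15, 63, 83, 83]

stage2_strata = nested_labels(stage1_strata, stage1_cluster)
levels = sorted(set(stage2_strata))  # distinct second-stage strata
```

Because the labels are strings, '1.200' and '1.2' stay distinct, which a numeric encoding would not guarantee.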
