mocktablegenerators.jl's People

Contributors

ararslan, beacon-infra[bot], ericphanson, glennmoy, kleinschmidt, omus

mocktablegenerators.jl's Issues

use of `GLOBAL_RNG` makes generation non-reproducible

The global RNG's stream may change between Julia versions, and if other tasks or threads draw random numbers at the same time, the draws become non-reproducible. We should provide a paved path for users to bring their own RNG at generation time: either by updating the docs/examples to show how you could implement this yourself in your own generators, or by changing the API methods to accept an RNG as the first argument (à la `rand` and friends) that defaults to the global RNG.
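A minimal sketch of the RNG-first convention described above, using only the `Random` standard library. The function `draw_names` and the name pool are hypothetical and are not part of MockTableGenerators; the point is only the method pair, where the one-argument-fewer method falls back to `Random.default_rng()`.

```julia
using Random

# Hypothetical generator helper: the RNG is the first positional argument,
# mirroring the `rand(rng, ...)` convention.
function draw_names(rng::AbstractRNG, pool::Vector{String}, n::Integer)
    return [rand(rng, pool) for _ in 1:n]
end

# Convenience method that falls back to the default (task-local) RNG.
draw_names(pool::Vector{String}, n::Integer) =
    draw_names(Random.default_rng(), pool, n)

pool = ["Alice", "Bob", "Carol", "David"]

# Two independently seeded RNGs with the same seed yield the same draws
# within one Julia session.
a = draw_names(MersenneTwister(11), pool, 5)
b = draw_names(MersenneTwister(11), pool, 5)
```

Note that `MersenneTwister` streams are only guaranteed to be stable within a Julia version; a package such as StableRNGs would be needed for reproducibility across versions.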

Public vs Private API is unclear

Besides TableGenerator, it's not clear which functions are public vs. private API.
Empirically, one would think generate is the only public-facing function, but it's not documented one way or the other.

add utility function for generating actual mock tables, not just a network of mocked records

I want to use MockTableGenerators to generate mock tables, but what it really supports out of the box is generating a network of mocked records; e.g. if I run the example code in the README the output is this data structure:

julia> # pass RNG for reproducible generation:
       collect(MockTableGenerators.generate(StableRNG(11), DAG))
41-element Vector{Any}:
  :person => (id = UUID("5a3d3d5e-ff13-417a-8b79-7c9e0c9cfb56"), first_name = "David", last_name = "Brown")
   :visit => (id = UUID("78899e60-eb7d-462d-9ed9-ef4e2046b221"), person_id = UUID("5a3d3d5e-ff13-417a-8b79-7c9e0c9cfb56"), index = 1, date = Date("1991-09-12"))
 :symptom => (visit_id = UUID("78899e60-eb7d-462d-9ed9-ef4e2046b221"), symptom = "Chills")
 :symptom => (visit_id = UUID("78899e60-eb7d-462d-9ed9-ef4e2046b221"), symptom = "Fever")
   :visit => (id = UUID("ac7d2f48-b1ae-4461-b3d7-85c527da1949"), person_id = UUID("5a3d3d5e-ff13-417a-8b79-7c9e0c9cfb56"), index = 2, date = Date("1995-11-04"))
 :symptom => (visit_id = UUID("ac7d2f48-b1ae-4461-b3d7-85c527da1949"), symptom = "Runny nose")
 :symptom => (visit_id = UUID("ac7d2f48-b1ae-4461-b3d7-85c527da1949"), symptom = "Runny nose")
  :person => (id = UUID("3c65b575-6628-4583-824e-a5340d3a02a8"), first_name = "Carol", last_name = "Johnson")
   :visit => (id = UUID("a8378ede-0d8b-4037-a6a6-aaa001ad216b"), person_id = UUID("3c65b575-6628-4583-824e-a5340d3a02a8"), index = 1, date = Date("1984-05-31"))
 :symptom => (visit_id = UUID("a8378ede-0d8b-4037-a6a6-aaa001ad216b"), symptom = "Fever")
 :symptom => (visit_id = UUID("a8378ede-0d8b-4037-a6a6-aaa001ad216b"), symptom = "Fever")
   :visit => (id = UUID("3375b4e2-da8d-4c63-987b-1beafa8088e4"), person_id = UUID("3c65b575-6628-4583-824e-a5340d3a02a8"), index = 2, date = Date("1991-12-26"))
 :symptom => (visit_id = UUID("3375b4e2-da8d-4c63-987b-1beafa8088e4"), symptom = "Chills")
 :symptom => (visit_id = UUID("3375b4e2-da8d-4c63-987b-1beafa8088e4"), symptom = "Chills")
   :visit => (id = UUID("515f665b-3f9e-458a-9a83-fdf8f3afdf66"), person_id = UUID("3c65b575-6628-4583-824e-a5340d3a02a8"), index = 3, date = Date("1995-01-10"))
 :symptom => (visit_id = UUID("515f665b-3f9e-458a-9a83-fdf8f3afdf66"), symptom = "Muscle Loss")
 :symptom => (visit_id = UUID("515f665b-3f9e-458a-9a83-fdf8f3afdf66"), symptom = "Fainting")
  :person => (id = UUID("7aef3538-782c-42b3-affb-ab188f985ce1"), first_name = "David", last_name = "Williams")
          ⋮
 :symptom => (visit_id = UUID("fe57c21d-fa08-4f2e-ab58-1c01ea7a2625"), symptom = "Muscle Loss")
 :symptom => (visit_id = UUID("fe57c21d-fa08-4f2e-ab58-1c01ea7a2625"), symptom = "Fainting")
  :person => (id = UUID("35131b5b-e0c7-4cb7-9ffc-cd5524c5617e"), first_name = "David", last_name = "Smith")
   :visit => (id = UUID("c8534847-0d27-441a-8c33-416586da8a9b"), person_id = UUID("35131b5b-e0c7-4cb7-9ffc-cd5524c5617e"), index = 1, date = Date("1983-11-22"))
 :symptom => (visit_id = UUID("c8534847-0d27-441a-8c33-416586da8a9b"), symptom = "Runny nose")
   :visit => (id = UUID("12cea550-51ac-4215-b58d-a2ec5fc14f9f"), person_id = UUID("35131b5b-e0c7-4cb7-9ffc-cd5524c5617e"), index = 2, date = Date("1996-03-27"))
 :symptom => (visit_id = UUID("12cea550-51ac-4215-b58d-a2ec5fc14f9f"), symptom = "Fatigue")
 :symptom => (visit_id = UUID("12cea550-51ac-4215-b58d-a2ec5fc14f9f"), symptom = "Fatigue")
   :visit => (id = UUID("0a01520e-1f7c-47f4-88d4-899bef6f0524"), person_id = UUID("35131b5b-e0c7-4cb7-9ffc-cd5524c5617e"), index = 3, date = Date("1998-04-29"))
 :symptom => (visit_id = UUID("0a01520e-1f7c-47f4-88d4-899bef6f0524"), symptom = "Muscle Loss")
 :symptom => (visit_id = UUID("0a01520e-1f7c-47f4-88d4-899bef6f0524"), symptom = "Fainting")
  :person => (id = UUID("5593e69f-b17b-41f9-a773-c2889a31cf14"), first_name = "Alice", last_name = "Johnson")
   :visit => (id = UUID("7cb41823-d331-41b9-82f1-90e940436b0c"), person_id = UUID("5593e69f-b17b-41f9-a773-c2889a31cf14"), index = 1, date = Date("1982-05-03"))
 :symptom => (visit_id = UUID("7cb41823-d331-41b9-82f1-90e940436b0c"), symptom = "Fatigue")
   :visit => (id = UUID("d27666c8-5183-4d7d-8c0e-23ff3dfb66df"), person_id = UUID("5593e69f-b17b-41f9-a773-c2889a31cf14"), index = 2, date = Date("1990-01-09"))
 :symptom => (visit_id = UUID("d27666c8-5183-4d7d-8c0e-23ff3dfb66df"), symptom = "Runny nose")
 :symptom => (visit_id = UUID("d27666c8-5183-4d7d-8c0e-23ff3dfb66df"), symptom = "Fatigue")

This is useful if I'm trying to upload this stream of dependent records in the intended order to an external storage system, but for simple flat-file-generation purposes I'd love it if I could do this:

julia> MockTableGenerators.generate_tables(StableRNG(11), DAG)

and immediately get a Vector where each element is a Tables.jl-compliant table, with the whole bundle generated in accordance with the given DAG.

Alternatively, we could just add a tablify function to this package that converts the current MockTableGenerators.generate output to this form.
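A rough sketch of what such a conversion could look like, assuming the input is any iterable of `key => row` pairs like the output shown earlier. The `tablify` below is hypothetical and not an existing MockTableGenerators function. It returns a `Dict` keyed by table name rather than a plain Vector, so each table stays associated with its key; each value is a vector of NamedTuples, which Tables.jl accepts as a row table.

```julia
# Hypothetical helper, not part of MockTableGenerators: group `key => row`
# pairs by key, preserving the order in which rows were generated.
function tablify(records)
    tables = Dict{Symbol,Vector{Any}}()
    for (key, row) in records
        push!(get!(() -> Any[], tables, key), row)
    end
    # Broadcasting `identity` narrows each vector's element type from `Any`
    # to the type of the rows it actually holds.
    return Dict(key => identity.(rows) for (key, rows) in tables)
end

# Small hand-written stand-in for the generated record stream.
records = [
    :person => (id = 1, first_name = "David"),
    :visit => (id = 10, person_id = 1),
    :visit => (id = 11, person_id = 1),
]

tables = tablify(records)
```

With real output this might be called as `tablify(MockTableGenerators.generate(StableRNG(11), DAG))`, assuming the stream yields `Symbol => NamedTuple` pairs as in the example above.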

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

errors are not surfaced unless channel is immediately `collect`ed

MWE:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.5 (2023-01-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> import Pkg; Pkg.activate("/Users/dkleinschmidt/.julia/dev/MockTableGenerators/")
  Activating project at `~/.julia/dev/MockTableGenerators`

julia> using MockTableGenerators

julia> struct Gen <: TableGenerator end

julia> g = Gen()
Gen()

julia> c = MockTableGenerators.generate(g)
Channel{Any}(10) (closed)

julia> collect(c)
Any[]

julia> c = collect(MockTableGenerators.generate(g))
ERROR: TaskFailedException
Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
    @ Base ./task.jl:871
  [2] wait()
    @ Base ./task.jl:931
  [3] wait(c::Base.GenericCondition{ReentrantLock})
    @ Base ./condition.jl:124
  [4] take_buffered(c::Channel{Any})
    @ Base ./channels.jl:416
  [5] take!
    @ ./channels.jl:410 [inlined]
  [6] iterate(c::Channel{Any}, state::Nothing)
    @ Base ./channels.jl:498
  [7] iterate
    @ ./channels.jl:496 [inlined]
  [8] _collect(cont::UnitRange{Int64}, itr::Channel{Any}, #unused#::Base.HasEltype, isz::Base.SizeUnknown)
    @ Base ./array.jl:723
  [9] collect(itr::Channel{Any})
    @ Base ./array.jl:712
 [10] top-level scope
    @ REPL[9]:1

    nested task error: MethodError: no method matching table_key(::Gen)
    Stacktrace:
     [1] _generate!(callback::MockTableGenerators.var"#4#6"{Channel{Any}}, rng::Random._GLOBAL_RNG, gen::Gen, deps::Dict{Any, Any})
       @ MockTableGenerators ~/.julia/dev/MockTableGenerators/src/generate.jl:105
     [2] generate(callback::Function, rng::Random._GLOBAL_RNG, dag::Gen)
       @ MockTableGenerators ~/.julia/dev/MockTableGenerators/src/generate.jl:71
     [3] (::MockTableGenerators.var"#3#5"{Random._GLOBAL_RNG, Gen})(ch::Channel{Any})
       @ MockTableGenerators ~/.julia/dev/MockTableGenerators/src/generate.jl:62
     [4] (::Base.var"#591#592"{MockTableGenerators.var"#3#5"{Random._GLOBAL_RNG, Gen}, Channel{Any}})()
       @ Base ./channels.jl:134
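One way a caller might check for a failed producer regardless of when the channel is consumed is to keep a handle on the producer task and wait on it. The sketch below uses only Base: the `taskref` keyword of the `Channel` constructor and `wait`/`istaskfailed`. Whether MockTableGenerators exposes the task behind its channel is not stated in this issue, so the channel here is a stand-in and `producer_failed` is a hypothetical helper.

```julia
# Stand-in producer that fails immediately, like the missing `table_key`
# method in the MWE above.
taskref = Ref{Task}()
ch = Channel{Any}(10; taskref=taskref) do out
    error("generator failed")
end

# Hypothetical helper: block until the producer finishes and report
# whether it failed, instead of relying on iteration to raise the error.
function producer_failed(task::Task)
    try
        wait(task)
    catch err
        # `wait` on a failed task throws a TaskFailedException.
        err isa TaskFailedException || rethrow()
    end
    return istaskfailed(task)
end

failed = producer_failed(taskref[])
```

This only suits producers that finish without a consumer (for example, ones that fail early or fit in the channel buffer); waiting on a producer blocked on a full buffer would hang.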
