
oea-testdatagenkit's Introduction

Note: This kit is currently unreleased and depends on OEA framework v0.7.

Module Test Data Generation Kit

This module test data generation kit lets users generate randomized test data that can be used across all modules, schemas, and packages within the OEA framework. The tool creates temporary data for experimenting with any module or package. Generated test data also connects across modules, so users can build robust dashboards on semi-realistic data with no risk to the privacy of an education system.

OEA Module Test Data Generator Overview

Test Data Generation: Base-Truth Table Structures

The OEA test data generation kit uses five base-truth tables to artificially generate data for any module: it first creates the general data and then assigns the column names of the target data source. The base-truth tables are defined in the test data generation class notebook and are described below.
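
The renaming step can be pictured with a minimal Python sketch. It is not the kit's actual code: the mapping `HYPOTHETICAL_SOURCE_COLUMNS`, the helper `to_source_row`, and the target column names are placeholders invented for illustration, and a real module defines its own schema.

```python
# Hypothetical mapping from base-truth column names to the column names of
# a target data source. The target names are placeholders, not the schema
# of any real OEA module.
HYPOTHETICAL_SOURCE_COLUMNS = {
    "StudentID": "student_sis_id",
    "FirstName": "given_name",
    "LastName": "family_name",
    "SchoolID": "school_sis_id",
}


def to_source_row(base_row, column_map):
    """Keep only the mapped columns and rename them for the data source."""
    return {
        column_map[name]: value
        for name, value in base_row.items()
        if name in column_map
    }


# A made-up base-truth row; the IDs are illustrative UUID-shaped strings.
base_row = {
    "StudentID": "3f2b0c9e-0000-4000-8000-000000000001",
    "FirstName": "Ada",
    "LastName": "Lopez",
    "SchoolID": "3f2b0c9e-0000-4000-8000-0000000000aa",
    "Grade": 9,
}

source_row = to_source_row(base_row, HYPOTHETICAL_SOURCE_COLUMNS)
print(source_row)
```

Columns without an entry in the mapping (here `Grade`) are dropped, which is one possible design choice; a module that needs every base-truth column would list them all in its mapping.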

Abbreviations

  • SIS: Student Information System
  • UUID: Universally Unique Identifier

Students

| Column Name | Description |
| --- | --- |
| Gender | Student gender: M (male), F (female), or O (other) |
| FirstName | Student first name |
| MiddleName | Student middle name |
| LastName | Student last name |
| StudentID | SIS ID: UUID |
| Birthday | Student birth date: YYYY-MM-DD |
| School | School name |
| SchoolID | SIS ID: UUID |
| Grade | Student grade level (numerical) |
| Performance | Student academic performance: high, avg (average), or low |
| HispanicLatino | Student ethnicity: True or False |
| Race | white (White), blackafricanamerican (Black or African American), americanindianalaskanative (American Indian or Alaska Native), asian (Asian), nativehawaiianpacificislander (Native Hawaiian or Other Pacific Islander), or twoormoreraces (Two or More Races) |
| Flag | (Blank), FreeLunch, ReducedLunch, Homeless, or GiftedOrTalented |
| Email | Student school email address: (FirstName)(LastName)@contoso.edu |
| Phone | Student phone number |
| Address | Student street address |
| City | Student city |
| State | Student state: CA |
| Zipcode | Student zipcode: ##### |
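
As an illustration of the Students schema, the following Python sketch builds one randomized row with the standard library. It is a sketch under stated assumptions, not the kit's implementation: the function `make_student`, the name pools, the grade range 9 to 12, the age rule used for `Birthday`, and the phone, address, and city values are all hypothetical. Only the column names and allowed values come from the table above.

```python
import random
import uuid
from datetime import date, timedelta

# Allowed values taken from the Students table.
GENDERS = ["M", "F", "O"]
PERFORMANCE = ["high", "avg", "low"]
RACES = [
    "white", "blackafricanamerican", "americanindianalaskanative",
    "asian", "nativehawaiianpacificislander", "twoormoreraces",
]
FLAGS = ["", "FreeLunch", "ReducedLunch", "Homeless", "GiftedOrTalented"]

# Placeholder name pools (assumption: the real kit uses its own name source).
FIRST_NAMES = ["Ada", "Ben", "Chloe", "Diego", "Eva", "Femi"]
LAST_NAMES = ["Lopez", "Nguyen", "Patel", "Smith", "Young"]


def random_uuid(rng):
    """Return a version-4 UUID string drawn from a seedable generator."""
    return str(uuid.UUID(int=rng.getrandbits(128), version=4))


def make_student(rng, school_name, school_id):
    """Build one row with the Students columns."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    grade = rng.randint(9, 12)  # assumption: high-school grades only
    # Assumption: a student in grade g is roughly g + 5 years old on a
    # fixed reference date, which keeps the output reproducible.
    age_in_days = 365 * (grade + 5) + rng.randint(0, 364)
    birthday = date(2022, 9, 1) - timedelta(days=age_in_days)
    return {
        "Gender": rng.choice(GENDERS),
        "FirstName": first,
        "MiddleName": rng.choice(FIRST_NAMES),
        "LastName": last,
        "StudentID": random_uuid(rng),
        "Birthday": birthday.isoformat(),  # YYYY-MM-DD
        "School": school_name,
        "SchoolID": school_id,
        "Grade": grade,
        "Performance": rng.choice(PERFORMANCE),
        "HispanicLatino": rng.choice([True, False]),
        "Race": rng.choice(RACES),
        "Flag": rng.choice(FLAGS),
        "Email": f"{first}{last}@contoso.edu",
        "Phone": f"555-01{rng.randint(0, 99):02d}",       # placeholder format
        "Address": f"{rng.randint(100, 9999)} Main St",   # placeholder street
        "City": "Sampletown",                             # placeholder city
        "State": "CA",
        "Zipcode": f"{rng.randint(0, 99999):05d}",
    }


rng = random.Random(42)  # seeded so repeated runs give the same row
school_id = random_uuid(rng)
student = make_student(rng, "Contoso High School", school_id)
print(student)
```

Drawing the UUIDs from the seeded `random.Random` instance rather than `uuid.uuid4()` makes the generated IDs reproducible between runs, which helps when the same base-truth rows need to line up across several modules.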

Schools

| Column Name | Description |
| --- | --- |
| SchoolName | School name |
| SchoolID | SIS ID: UUID |

Courses

| Column Name | Description |
| --- | --- |
| CourseName | Course name |
| CourseID | Course information system ID: UUID |
| SchoolName | Name of the school where the course is hosted |
| SchoolID | SIS ID of the school where the course is hosted |
| CourseSubject | English Language and Literature, Mathematics, Life and Physical Sciences, Social Sciences and History, Visual and Performing Arts, Physical Health and Safety Education, Information Technology, Communication and Audio Video Technology, Business and Marketing, Health Care Sciences, Architecture and Construction, Human Services, Engineering and Technology, World Language, Miscellaneous, or Non-Subject-Specific |
| CourseGradeLevel | Numeric grade level (e.g., 9, 10, 11, 12) |

Sections

| Column Name | Description |
| --- | --- |
| SectionName | Section name: (CourseName) ### |
| SectionID | SIS ID: UUID |
| CourseName | CourseName associated with section |
| CourseID | CourseID associated with section |
| SchoolName | SchoolName where section is hosted |
| SchoolID | SchoolID of SchoolName |
| SectionSubject | CourseSubject of related course |
| SectionGradeLevel | CourseGradeLevel of related course |
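
Because every Sections column except SectionName and SectionID is copied from the parent course, sections can be derived from a course row. The sketch below shows one way to do that; the helper `make_sections` and the sample course are hypothetical, and reading `###` as a zero-padded three-digit number is an assumption.

```python
import uuid


def make_sections(course, count):
    """Derive `count` section rows from one row of the Courses table."""
    return [
        {
            # Assumption: "###" means a zero-padded three-digit number.
            "SectionName": f"{course['CourseName']} {number:03d}",
            "SectionID": str(uuid.uuid4()),
            "CourseName": course["CourseName"],
            "CourseID": course["CourseID"],
            "SchoolName": course["SchoolName"],
            "SchoolID": course["SchoolID"],
            "SectionSubject": course["CourseSubject"],
            "SectionGradeLevel": course["CourseGradeLevel"],
        }
        for number in range(1, count + 1)
    ]


# A made-up course row for illustration.
course = {
    "CourseName": "Algebra I",
    "CourseID": str(uuid.uuid4()),
    "SchoolName": "Contoso High School",
    "SchoolID": str(uuid.uuid4()),
    "CourseSubject": "Mathematics",
    "CourseGradeLevel": 9,
}

sections = make_sections(course, 3)
for section in sections:
    print(section["SectionName"], section["SectionID"])
```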

Enrollment

| Column Name | Description |
| --- | --- |
| StudentName | Student first and last name |
| StudentID | StudentID of StudentName |
| SectionName | SectionName of the section the student is enrolled in |
| SectionID | SectionID of SectionName |
| CourseName | CourseName associated with section |
| CourseID | CourseID associated with section |
| CourseGradeLevel | CourseGradeLevel associated with CourseName/CourseID |
| SchoolName | School hosting the section the student is enrolled in |
| SchoolID | SchoolID of SchoolName |
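
The Enrollment table links Students to Sections, which is what lets generated data connect across modules. The sketch below shows one way such rows could be assembled. The helpers `enroll` and `build_enrollment`, the sample rows, and the eligibility rule (same school and same grade level) are assumptions made for illustration, not the kit's documented logic.

```python
def enroll(student, section):
    """Combine one student row and one section row into an Enrollment row."""
    return {
        "StudentName": f"{student['FirstName']} {student['LastName']}",
        "StudentID": student["StudentID"],
        "SectionName": section["SectionName"],
        "SectionID": section["SectionID"],
        "CourseName": section["CourseName"],
        "CourseID": section["CourseID"],
        "CourseGradeLevel": section["SectionGradeLevel"],
        "SchoolName": section["SchoolName"],
        "SchoolID": section["SchoolID"],
    }


def build_enrollment(students, sections):
    """Enroll each student in every eligible section.

    Assumption: a student is eligible for a section hosted at their own
    school and taught at their own grade level.
    """
    return [
        enroll(student, section)
        for student in students
        for section in sections
        if student["SchoolID"] == section["SchoolID"]
        and student["Grade"] == section["SectionGradeLevel"]
    ]


# Made-up rows; the short IDs stand in for UUIDs to keep the example readable.
students = [
    {"FirstName": "Ada", "LastName": "Lopez", "StudentID": "stu-1",
     "SchoolID": "sch-1", "Grade": 9},
    {"FirstName": "Ben", "LastName": "Patel", "StudentID": "stu-2",
     "SchoolID": "sch-2", "Grade": 10},
]
sections = [
    {"SectionName": "Algebra I 001", "SectionID": "sec-1",
     "CourseName": "Algebra I", "CourseID": "crs-1",
     "SchoolName": "Contoso High School", "SchoolID": "sch-1",
     "SectionGradeLevel": 9},
    {"SectionName": "Biology 001", "SectionID": "sec-2",
     "CourseName": "Biology", "CourseID": "crs-2",
     "SchoolName": "Contoso High School", "SchoolID": "sch-1",
     "SectionGradeLevel": 10},
]

enrollment = build_enrollment(students, sections)
print(enrollment)
```

In this sample only Ada matches a section, because Ben attends a different school; every enrollment row carries both the student's and the section's IDs, so the row can be joined back to either table.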

Test Data Generation Setup Instructions

OEA Module Test Data Generator Setup Instructions

Preparation: This module currently depends on v0.7 of the OEA framework. Ensure you have an Azure subscription with the proper credentials and a working setup of the OEA framework, including v0.7 of the OEA Python class.

Notes:

  • Review the modules and data sources that are currently compatible (listed under Data Source Compatibility below), and choose which ones to apply this test data generator to.
  • If the data source you want test data for is not listed, you will need to develop assets similar to the Insights module test data generator example.

  1. Import the general module test data generation class and demo notebooks, then run the demo notebook to create the base-truth tables. See the notebook folder in this kit for more details and instructions.
  2. Run the desired module-specific test data generation demo notebook.
  3. Verify that the test data was created and stored in stage1.
  4. Ingest the test data within the scope of that module or package. You can then use the generated test data in the Power BI dashboard for the relevant module, package, or use case.

Data Source Compatibility

This test data generation kit can currently be applied to the following OEA modules:

| Module | Applicable Tables |
| --- | --- |
| Clever Module | Daily Participation and Resource Usage tables |
| Microsoft Education Insights Module | M365 roster and activity tables |

See the Insights module test data generator assets under the Notebook resource for an example of a compatible module for this test data generation kit.

Test Data Generation Kit Components

Out-of-the-box assets for this OEA test data generation kit include:

  1. Base-truth table generation notebooks:
    • test_data_generation_py: Main class for test data generation. Used by test_data_gen_demo to create the base-truth table files that support module test data generation.
    • test_data_gen_demo: Run this notebook in your OEA Synapse environment to generate the base-truth table files, which can then be used to create test data for any module.
  2. Module-specific table generation notebooks:

This Test Data Generation Kit welcomes contributions.

This module was developed by Kwantum Analytics. The architecture and reference implementation for all modules is built on Azure Synapse Analytics - with Azure Data Lake Storage as the storage backbone, and Azure Active Directory providing the role-based access control.

Legal Notices

Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.

Privacy information can be found at https://privacy.microsoft.com/en-us/

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.

oea-testdatagenkit's People

Contributors

cstohlmann
