Engineering Exercise – Clover Health

Instructions

Part of the onsite includes a portion where we want to evaluate your code. Please complete this exercise and get it back to us within 48 hours of your visit.

Problem definition

You have directories containing data files and specification files. The specification files describe the structure of the data files. Write an application in the language of your choice that reads format definitions from specification files. Use these definitions to load the data files into a database.

Problem details

Data files exist in a data/ directory relative to your application and specification files exist in a specs/ directory relative to your application.

Specification files will have filenames equal to the file type they specify and extension of .csv. So fileformat1.csv would be the specification for files of type fileformat1.

Data files will have filenames based on their specification file name, followed by an underscore, followed by the drop date and an extension of .txt. For example, fileformat1_2007-10-15.txt would be a data file to be parsed using specs/fileformat1.csv, which arrived on 10/15/2007.

Format files will be csv formated with columns "column name", "width", and "datatype".

"column name" will be the name of that column in the database table
"width" is the number of characters taken up by the column in the data file
"datatype" is the SQL data type that should be used to store the value in the database table.

Data files will be flat text files with lines matching single records for the database. Lines are formatted as specified by their associated specification file.

Example

This is an example file pair; other files may vary in structure while still fitting the structure of the problem details (above):

specs/testformat1.csv:

"column name",width,datatype

name,10,TEXT

valid,1,BOOLEAN

count,3,INTEGER

Here we have a specification that describes 3 columns:

The first 10 characters labeled "name" of type TEXT
The next 1 character labeled "valid" of type BOOLEAN ('1' = True, '0' = False)
The last 3 characters labeled "count" of type INTEGER

data/testformat1_2015-06-28.txt:

Foonyor 1 1

Barzane 0-12

Quuxitude 1103

Processing this data file results in the following table:

	name	valid	count
1	Foonyor	TRUE	1
2	Barzane	FALSE	-12
3	Quuxitude	TRUE	103

Expectations

Expectations

Clover primarily uses Python, but your application can be written with language/libraries of your choosing. Take this opportunity to best demonstrate your talents!
Database type and connection mechanism is left to your discretion.
You are expected to write unit tests for different cases of the core logics. You do not have to use any specific unit test frameworks but should include instructions to run your tests should you use any. You may even write a simple driver program for us to run your unit test cases with mockup data.
You should implement the conversions for the SQL data types: TEXT, BOOLEAN, and INTEGER
You should be able to handle any specification file that matches the problem description (not just the given example)
Files can be assumed to use UTF-8 encoding
You should be prepared to discuss implementation decisions and possible extensions to your application.

Solution

The solution is using a Spring Boot powered application to run a REST service with H2 in-memory database for demo purpose. The application supports calling the REST service to import a data file by name. The application then processes the data file and inserts the data into the H2 database. The data files can then subsequently be queried using the REST service by ID.

Database schema design

The database is designed so support multiple data files with different type of columns and many rows.

Build

# build using maven
> mvn clean install

Run

# run, this will start the server running on port 8080
> java -jar target/clover-health-demo.jar

Import a data file

# POST filename to import the datafile. The application knows how to access the file
curl -X POST http://localhost:8080/data-file/testformat1_2015-06-28.txt

Get all imported files

# Call GET all to retrieve all imported data files
curl  http://localhost:8080/data-files | json_pp

Response

[
   {
      "id" : 1,
      "name" : "testformat1_2015-06-28.txt",
      "importedAt" : "2021-03-21T21:18:54.552+0000",
      "columns" : [
         {
            "id" : 1,
            "width" : 10,
            "type" : "TEXT",
            "rows" : [
               {
                  "valueTxt" : "Foonyor",
                  "id" : 1
               },
               {
                  "id" : 4,
                  "valueTxt" : "Barzane"
               },
               {
                  "id" : 7,
                  "valueTxt" : "Quuxitude"
               }
            ],
            "name" : "name"
         },
         {
            "rows" : [
               {
                  "valueBool" : true,
                  "id" : 2
               },
               {
                  "id" : 5,
                  "valueBool" : false
               },
               {
                  "valueBool" : true,
                  "id" : 8
               }
            ],
            "type" : "BOOLEAN",
            "name" : "valid",
            "id" : 2,
            "width" : 1
         },
         {
            "name" : "count",
            "type" : "INTEGER",
            "rows" : [
               {
                  "id" : 3,
                  "valueInt" : 1
               },
               {
                  "valueInt" : -12,
                  "id" : 6
               },
               {
                  "valueInt" : 103,
                  "id" : 9
               }
            ],
            "width" : 3,
            "id" : 3
         }
      ]
   }
]

Get one file by ID

# Call GET by ID to retrieve one imported data file
curl http://localhost:8080/data-file/1 | json_pp

Response

{
   "id" : 1,
   "columns" : [
      {
         "width" : 10,
         "rows" : [
            {
               "valueTxt" : "Foonyor",
               "id" : 1
            },
            {
               "id" : 4,
               "valueTxt" : "Barzane"
            },
            {
               "id" : 7,
               "valueTxt" : "Quuxitude"
            }
         ],
         "name" : "name",
         "type" : "TEXT",
         "id" : 1
      },
      {
         "type" : "BOOLEAN",
         "id" : 2,
         "width" : 1,
         "rows" : [
            {
               "id" : 2,
               "valueBool" : true
            },
            {
               "id" : 5,
               "valueBool" : false
            },
            {
               "valueBool" : true,
               "id" : 8
            }
         ],
         "name" : "valid"
      },
      {
         "id" : 3,
         "type" : "INTEGER",
         "rows" : [
            {
               "valueInt" : 1,
               "id" : 3
            },
            {
               "id" : 6,
               "valueInt" : -12
            },
            {
               "valueInt" : 103,
               "id" : 9
            }
         ],
         "width" : 3,
         "name" : "count"
      }
   ],
   "name" : "testformat1_2015-06-28.txt",
   "importedAt" : "2021-03-22T17:23:08.415+0000"
}

Delete one file by ID

# Call DELETE by ID to delete one imported data file from DB
curl -X DELETE http://localhost:8080/data-file?id=1

Response

{"Status":"File deleted successfully"}

Delete all files

# Call DELETE to delete all files from DB
curl -X DELETE http://localhost:8080/data-files

Response

SUCCESS

Query the DB by going to H2 Console

Navigate in your browser to http://localhost:8080/h2-console

Press Connect

mtuzhilov / clover-health-exercise Goto Github PK

clover-health-exercise's Introduction

Engineering Exercise – Clover Health

Instructions

Problem definition

Problem details

Example

Expectations

Solution

Database schema design

Build

Run

Import a data file

Get all imported files

Get one file by ID

Delete one file by ID

Delete all files

Query the DB by going to H2 Console

clover-health-exercise's People

Watchers

Recommend Projects

Recommend Topics

Recommend Org