Data Engineer Takehome Test

Please clone this repo into your own GitHub account and send us a link to your solution together with your application.

Problem 1:

Build a Python script that detects faces in an image using OpenCV and saves a headshot of each detected face to a specified directory. The script should take two inputs, the file path of an image and the path of a directory in which to save the headshots, and should output the number of faces detected in the image.

Instructions:

  • Use OpenCV's Haar Cascade classifier for face detection
  • The script should be written in Python and use the following libraries: OpenCV, NumPy, and PIL (Python Imaging Library)
  • The script should be well commented and easy to understand
  • The script should be able to handle a variety of image types (e.g. JPEG, PNG)
  • The script should be able to handle images with multiple faces
  • The script should save the headshots in the specified directory with the file name in the format "face_1.jpg", "face_2.jpg", etc.
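
The detection and saving steps above could be sketched roughly as follows. This is a minimal illustration, not a reference solution. It assumes the `opencv-python` package, which ships the Haar cascade files under `cv2.data.haarcascades`, and it assumes the frontal-face cascade is the one wanted. The names `headshot_path` and `save_headshots` and the `scaleFactor` and `minNeighbors` values are illustrative choices. The sketch uses OpenCV only, so the NumPy and PIL requirements are left to the full solution.

```python
from pathlib import Path


def headshot_path(output_dir, index):
    """Return the path for the index-th face (1-based): face_1.jpg, face_2.jpg, ..."""
    return Path(output_dir) / f"face_{index}.jpg"


def save_headshots(image_path, output_dir):
    """Detect faces in image_path, save each crop to output_dir, return the face count."""
    # Imported here so the filename helper above works without OpenCV installed.
    import cv2

    image = cv2.imread(str(image_path))
    if image is None:
        # cv2.imread returns None for missing or unreadable files.
        raise ValueError(f"Could not read image: {image_path}")

    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    # Tuning values below are assumptions, not requirements from the task.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for index, (x, y, w, h) in enumerate(faces, start=1):
        # Crop the bounding box from the original colour image and save it.
        cv2.imwrite(str(headshot_path(output_dir, index)), image[y:y + h, x:x + w])
    return len(faces)
```

Calling `save_headshots("group.jpg", "headshots")` (both paths hypothetical) would write one file per detected face and return the count, which a command-line wrapper could then print.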

Problem 2:

Move all image files from one S3 bucket to another S3 bucket, but only if the image has no transparent pixels.

Objective: Write a Python script that uses the Boto3 library to accomplish the following:

  • List all the image files in a given S3 bucket
  • Check if each image file has transparent pixels
  • If an image file has no transparent pixels, copy it to a different S3 bucket
  • If an image file has transparent pixels, log it in a separate file
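
The transparency check in the second step could look something like the sketch below, which uses Pillow. The function name `has_transparency` is hypothetical. It treats any pixel with alpha below 255 as transparent, which is one reasonable reading of "transparent pixels", and it inspects only the first frame of multi-frame images.

```python
from PIL import Image


def has_transparency(image):
    """Return True if the Pillow image has at least one non-opaque pixel."""
    # Images with no alpha band and no palette/tRNS transparency entry
    # cannot contain transparent pixels.
    if "A" not in image.getbands() and "transparency" not in image.info:
        return False
    # Converting to RGBA applies palette or single-colour transparency,
    # so the alpha channel then reflects every kind of transparency.
    alpha = image.convert("RGBA").getchannel("A")
    lowest_alpha, _ = alpha.getextrema()
    return lowest_alpha < 255


# Small in-memory examples; real use would call Image.open(...) on a file.
opaque = Image.new("RGBA", (2, 2), (255, 0, 0, 255))
see_through = Image.new("RGBA", (2, 2), (255, 0, 0, 0))
print(has_transparency(opaque), has_transparency(see_through))
```

Checking the alpha extrema instead of only the image mode matters because an RGBA file can carry an alpha channel that is fully opaque.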

Guidelines:

  • Your script should take the name of the source and destination buckets as input
  • You should use the Boto3 library to interact with S3
  • You should use the Pillow library to check for transparent pixels in an image
  • Your script should handle any errors that may occur while opening an image file, during the copy process, and anywhere else you deem necessary
  • Your script should be well commented and easy to understand
  • Your script should be executed from the command line
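
Put together, the listing, copying, logging, and command-line pieces might be sketched as below. This is one possible shape under stated assumptions, not the expected answer. The names `copy_opaque_images`, `parse_args`, `IMAGE_SUFFIXES`, and the log file name are hypothetical. "Image file" is assumed to mean a key with a common image extension. `s3` is expected to be a Boto3 S3 client such as `boto3.client("s3")`, and `is_transparent` is any callable that takes the raw image bytes and returns a bool. The sketch copies objects and does not delete them from the source bucket.

```python
import argparse
import logging

# Assumption: an "image file" is identified by its key's extension.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp")


def copy_opaque_images(s3, source, destination, is_transparent,
                       log_path="transparent_images.log"):
    """Copy images without transparency to destination; log the others.

    Returns the number of objects copied.
    """
    copied = 0
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    paginator = s3.get_paginator("list_objects_v2")
    with open(log_path, "a", encoding="utf-8") as log_file:
        for page in paginator.paginate(Bucket=source):
            for obj in page.get("Contents", []):
                key = obj["Key"]
                if not key.lower().endswith(IMAGE_SUFFIXES):
                    continue
                try:
                    body = s3.get_object(Bucket=source, Key=key)["Body"].read()
                    if is_transparent(body):
                        log_file.write(f"{key}\n")
                        continue
                    s3.copy_object(
                        Bucket=destination,
                        Key=key,
                        CopySource={"Bucket": source, "Key": key},
                    )
                    copied += 1
                except Exception:
                    # Broad on purpose in this sketch: record the failure
                    # and keep going with the remaining objects.
                    logging.exception("Failed to process %s", key)
    return copied


def parse_args(argv=None):
    """Read the source and destination bucket names from the command line."""
    parser = argparse.ArgumentParser(
        description="Copy images without transparent pixels between S3 buckets."
    )
    parser.add_argument("source_bucket")
    parser.add_argument("destination_bucket")
    return parser.parse_args(argv)
```

Passing the client and the transparency check in as arguments keeps the loop easy to exercise with a stub client, without touching real buckets. A full script would build the client, wrap a Pillow-based check around `io.BytesIO(body)`, and call these two functions from a `main()` entry point.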
