raju009f
Name: RAJU
Type: User
Bio: Learnin
Twitter: Raj009f
Blog: Dataengist.com
Introduction to Data Engineering workshop, learn to build a data pipeline with Luigi!
GCP CI/CD pipeline using Cloud Build and Cloud Run etc ...
GCPDLP
simple repository
MyProfile
Sample Python Flask application for testing OpenShift 3 deployment using OpenShift default Python S2I builder and gunicorn.
Placement coding problem practice
Common solutions and tools developed by Google Cloud's Professional Services team
retail_db
retail_db_json
The Luigi pipeline should contain the following tasks:
Parse/Cleanup - This step should parse the documents.txt file and remove any punctuation.
Compute TF - This step should compute the term frequency for each term in each document.
Compute IDF - This step should compute the inverse document frequency for each term.
Compute TF-IDF - This step should compute the TF-IDF weight of each term in each document.
Compute Similarity - This step should determine the similarity between all documents by calculating the Euclidean distance between each pair of TF-IDF vectors.
Expected output: The final output of the pipeline should be a CSV with columns corresponding to Document 1 ID, Document 2 ID, and Similarity. The document ID is the index of the document according to the order defined by the input file. Output is ordered from most similar to least. Similarity is computed for each pair of documents in the input where document_id_1 < document_id_2.
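The computation behind these tasks can be sketched in plain Python, without Luigi, to make the data flow concrete. This is a minimal sketch under stated assumptions, not the repository's implementation: every function name is hypothetical, documents.txt is assumed to hold one document per line, TF is taken as the term count divided by document length, IDF as log(N / document frequency), and "most similar" is read as the smallest Euclidean distance. In a Luigi pipeline, each function would roughly correspond to the body of one task's run() method.

```python
import math
import string
from collections import Counter
from itertools import combinations


def parse(lines):
    """Parse/Cleanup: lowercase, strip ASCII punctuation, split into terms."""
    table = str.maketrans("", "", string.punctuation)
    return [line.lower().translate(table).split() for line in lines]


def compute_tf(docs):
    """Compute TF (assumed definition): term count / document length."""
    result = []
    for terms in docs:
        counts = Counter(terms)
        total = len(terms)
        result.append({t: c / total for t, c in counts.items()})
    return result


def compute_idf(docs):
    """Compute IDF (assumed definition): log(N / documents containing the term)."""
    n = len(docs)
    doc_freq = Counter(t for terms in docs for t in set(terms))
    return {t: math.log(n / d) for t, d in doc_freq.items()}


def compute_tfidf(tf, idf):
    """Compute TF-IDF: weight each term frequency by the term's IDF."""
    return [{t: w * idf[t] for t, w in doc.items()} for doc in tf]


def euclidean(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def similarity_rows(lines):
    """Return (doc_id_1, doc_id_2, distance) rows with doc_id_1 < doc_id_2."""
    docs = parse(lines)
    tfidf = compute_tfidf(compute_tf(docs), compute_idf(docs))
    rows = [
        (i, j, euclidean(tfidf[i], tfidf[j]))
        for i, j in combinations(range(len(tfidf)), 2)
    ]
    # Assumption: a smaller distance means more similar, so sort ascending.
    rows.sort(key=lambda row: row[2])
    return rows
```

Calling similarity_rows on the lines of documents.txt yields rows that can be written out with csv.writer from the standard library. Two documents that differ only in punctuation or letter case end up with identical TF-IDF vectors, so their distance is 0.0 and they sort first.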