
Hello There! I am Sanyam Gujral.

Linkedin Badge Gmail Badge

A passionate frontend developer and data science enthusiast

gujralsanyam22

A Boy trying to Figure-Out Stuff!!


gujralsanyam

khushboogoel01

Coding

Connect with me:

gujralsanyam https://www.kaggle.com/sanyamgujral https://www.hackerrank.com/gujralsanyam

Languages and Tools:

arduino aws c css3 docker gcp git heroku html5 mysql opencv python scikit_learn sqlite tensorflow


Sanyam Gujral's Projects

alien-war-game icon alien-war-game

War Games was a cartoon series proposed around 2007 and developed by Bluefields Creative. The notion was to generate a series of short episodes, in the vein of the series Star Wars: Clone Wars. Newbie is a rookie Colonial Marine in a future where the Alien threat is widespread and feared by all. Marine training now involves HOT "Bug Hunt" scenarios, HOT meaning this is not a simulation: rookies are dropped into a semi-controlled environment with live ammunition and real aliens (though tethered with remote stun collars). Each episode of the animated series would start with a mini-episode of Newbie and his training exercises, which get more elaborate each week. The animated piece at bluefieldscreative.com represents the first training exercise, pitting Newbie, with very limited ammo, against a single alien in the middle of an abandoned terraforming station: a very dangerous game of one-on-one hide-and-seek. By the end of the first season, Newbie would be integrated into the main storyline, getting his first taste of real action in the war against the bugs.

awesome-streamlit icon awesome-streamlit

The purpose of this project is to share knowledge on how awesome Streamlit is and can be

book-recomendation-system icon book-recomendation-system

This repository contains all the files for the final project of the Master in Data Science, VI Edition, held by Kschool Madrid. The objective of this project is the implementation of a book recommender system using the data made available by Goodreads, a website where users can register and rate the books they have read, sharing their ratings and opinions with other readers.

The chosen approach generates recommendations from the information inherent in users' ratings. Rather than predicting the rating each user would give to every book included in this analysis that he hasn't read yet (and then trying to reduce the error in those predictions), I have chosen to build a system that gives each user relevant recommendations based on a measure of similarity between the books he has already read and rated and the books he hasn't read but other users have.

In order to successfully run the code, please download all the .csv files included in this repository, place them in your own working directory, and change the path at the beginning of each script (os.chdir("/Users/678094/Desktop/Goodreads")) so it points at the working directory where you saved the files. This step is replicated at the beginning of each notebook because running all three notebooks is not compulsory to build the recommender system: notebooks 01 and 02 are optional, serving to scrape additional data from the Goodreads website, but the same data is saved as .csv files and loaded again in notebook 03. The logical sequence is to run notebooks 01, 02 and 03 in order, but if you want to skip the scraping part and go directly to the recommender system, you can run notebook 03 independently, changing the working directory as indicated above.
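Conceptually, the similarity-based approach described above can be sketched in a few lines. This is a toy illustration only, assuming an item-item cosine similarity over a small hand-made ratings dictionary; the book and user names are invented, not from the Goodreads data:

```python
from math import sqrt

# Hypothetical ratings: user -> {book: rating}. Purely illustrative data.
ratings = {
    "alice": {"dune": 5, "hobbit": 4, "emma": 1},
    "bob":   {"dune": 4, "hobbit": 5},
    "carol": {"emma": 5, "hobbit": 2},
}

def cosine(book_a, book_b):
    """Cosine similarity between two books over users who rated both."""
    common = [u for u in ratings if book_a in ratings[u] and book_b in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][book_a] * ratings[u][book_b] for u in common)
    na = sqrt(sum(ratings[u][book_a] ** 2 for u in common))
    nb = sqrt(sum(ratings[u][book_b] ** 2 for u in common))
    return dot / (na * nb)

def recommend(user):
    """Rank the user's unread books by rating-weighted similarity
    to the books the user has already rated."""
    read = ratings[user]
    all_books = {b for r in ratings.values() for b in r}
    scores = {}
    for book in all_books - set(read):
        scores[book] = sum(cosine(book, rated) * r for rated, r in read.items())
    return sorted(scores, key=scores.get, reverse=True)
```

The real project works on the Goodreads ratings matrix instead of a dictionary, but the principle — score unread items by their similarity to already-rated ones — is the same.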

book-recomendation-system1 icon book-recomendation-system1

Duplicate of the book-recomendation-system repository above; the description is identical.

bundler icon bundler

Manage your Ruby application's gem dependencies

cab-fair-price-prediction icon cab-fair-price-prediction

Predicting the quality of red wine using machine learning algorithms for regression analysis, data visualization and data analysis.

Context: The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (input) and sensory (output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.). These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). This dataset is also available from the UCI machine learning repository, https://archive.ics.uci.edu/ml/datasets/wine+quality ; I just shared it to Kaggle for convenience. (If I am mistaken and the public license type disallowed me from doing so, I will take this down if requested.)

Content: For more information, read [Cortez et al., 2009]. Input variables (based on physicochemical tests): 1 - fixed acidity, 2 - volatile acidity, 3 - citric acid, 4 - residual sugar, 5 - chlorides, 6 - free sulfur dioxide, 7 - total sulfur dioxide, 8 - density, 9 - pH, 10 - sulphates, 11 - alcohol. Output variable (based on sensory data): 12 - quality (score between 0 and 10).

Tips: An interesting thing to do, aside from regression modelling, is to set an arbitrary cutoff for your dependent variable (wine quality), e.g. 7 or higher classified as 'good/1' and the remainder as 'not good/0'. This allows you to practice hyperparameter tuning on e.g. decision tree algorithms, looking at the ROC curve and the AUC value. Without doing any kind of feature engineering or overfitting you should be able to get an AUC of .88 (without even using a random forest algorithm). KNIME is a great tool (GUI) that can be used for this:
1 - File Reader (for csv) to a linear correlation node and to an interactive histogram for basic EDA.
2 - File Reader to a Rule Engine node to turn the 10-point scale into a dichotomous variable (good wine vs. the rest); the rule to put in the Rule Engine is something like: $quality$ > 6.5 => "good" TRUE => "bad"
3 - Rule Engine node output to the input of a Column Filter node to filter out your original 10-point feature (this prevents leaking).
4 - Column Filter node output to the input of a Partitioning node (your standard train/test split, e.g. 75%/25%; choose 'random' or 'stratified').
5 - Partitioning node train split output to the input of a Decision Tree Learner node.
6 - Partitioning node test split output to the input of a Decision Tree Predictor node.
7 - Decision Tree Learner node output to the Decision Tree Predictor node input.
8 - Decision Tree Predictor output to the input of a ROC node (here you can evaluate your model based on the AUC value).

Inspiration: Use machine learning to determine which physicochemical properties make a wine 'good'!

Acknowledgements: I am not the owner of this dataset. Please include this citation if you plan to use this database: P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
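The KNIME cutoff rule above ($quality$ > 6.5 => "good", TRUE => "bad") translates to a one-line Python function. A minimal sketch with made-up quality scores (not rows from the actual dataset):

```python
# Reproduce the Rule Engine cutoff in plain Python: quality above 6.5 is
# labelled "good", everything else "bad". The scores below are invented
# for illustration; the real column holds 0-10 sensory ratings.
def binarize(quality, cutoff=6.5):
    return "good" if quality > cutoff else "bad"

qualities = [5, 6, 7, 8, 4]
labels = [binarize(q) for q in qualities]
```

Remember to drop the original 10-point column afterwards, as step 3 above notes, so the label does not leak into the features.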

classyvision icon classyvision

An end-to-end PyTorch framework for image and video classification

credit-card-fraud-detection icon credit-card-fraud-detection

Data source: https://www.kaggle.com/dalpozz/creditcardfraud/data . It is a CSV file containing 31 features; the last feature classifies whether a transaction is a fraud or not.

About the dataset: The dataset contains transactions made by credit cards in September 2013 by European cardholders. It presents transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions. It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, the original features and more background information about the data are not available. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. 'Class' is the response variable and takes value 1 in case of fraud and 0 otherwise. Given the class imbalance ratio, we recommend measuring accuracy using the Area Under the Precision-Recall Curve (AUPRC); confusion-matrix accuracy is not meaningful for unbalanced classification. The dataset was collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection.

Flow of the project: We performed exploratory data analysis on the full data, removed outliers using LocalOutlierFactor, and finally used the KNN technique to train on the data and predict whether a transaction is fraud or not. We also applied t-SNE to visualize the fraudulent and genuine transactions in 2-D.

How to run the project: Download the data from the source mentioned above, then run any file.

Prerequisites: You need the following software and libraries installed on your machine before running this project: Python 3 ( https://www.python.org/downloads/ ) and Anaconda ( https://www.anaconda.com/download ), which installs the IPython notebook and most of the needed libraries such as sklearn, pandas, seaborn, matplotlib, numpy and scipy.
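The KNN step described above can be sketched in plain Python. This is a toy example, with synthetic 2-D points standing in for the PCA features (label 1 = fraud); it is not the project's actual sklearn-based code:

```python
from collections import Counter
from math import dist

# Synthetic training points for illustration only: three "genuine"
# transactions clustered near the origin, two "fraud" points far away.
train = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
         ((5.0, 5.0), 1), ((5.1, 4.9), 1)]

def knn_predict(x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda p: dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

On the real, highly unbalanced data, plain accuracy of such a classifier is misleading for the reason given above; evaluating with AUPRC is the sensible choice.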

credit-card-fraud-detection-1 icon credit-card-fraud-detection-1

Duplicate of the credit-card-fraud-detection repository above; the description is identical.

data-science-portfolio icon data-science-portfolio

Portfolio of data science projects completed by me for academic, self-learning, and hobby purposes.

deep-surveillance-monitor-facial-emotion-age-gender-recognition-system icon deep-surveillance-monitor-facial-emotion-age-gender-recognition-system

Computer Vision module for detecting the emotion, age and gender of a person in any given image, video or real-time webcam feed. A custom VGG16 model was developed and trained on open-source facial datasets downloaded from Kaggle and IMDB. OpenCV, dlib and Keras were used to aid facial detection and video processing.

denoiser icon denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noises as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform which further improve model performance and its generalization abilities.

deploying-a-sentiment-analysis-model-on-amazon-sagemaker icon deploying-a-sentiment-analysis-model-on-amazon-sagemaker

Deploying a Sentiment Analysis Model on Amazon SageMaker: a sentiment analysis model using recurrent neural networks is deployed with the Amazon AWS SageMaker tool. The notebook and Python files provided here result in a simple web application that interacts with the deployed recurrent neural network, performing sentiment analysis on movie reviews. In the final architecture, AWS API Gateway and AWS Lambda functions are used as well.
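The API Gateway / Lambda glue in that architecture can be sketched as a small handler. This is an illustrative sketch, not the project's actual Lambda code: the `invoke` callable is an assumption standing in for boto3's sagemaker-runtime `invoke_endpoint` call, injected here so the sketch runs without AWS credentials:

```python
import json

def handler(event, invoke):
    """Hypothetical Lambda handler: API Gateway delivers the review text
    in the event body; we forward it to the model endpoint (via `invoke`)
    and map the returned probability to a sentiment label."""
    review = event["body"]
    score = invoke(review)  # assumed to return a 0..1 positive-sentiment probability
    label = "positive" if score > 0.5 else "negative"
    return {"statusCode": 200, "body": json.dumps({"sentiment": label})}
```

In the deployed version, API Gateway maps the HTTP POST to this handler and the response body is returned to the web page, so the browser never talks to SageMaker directly.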

detectron2 icon detectron2

Detectron2 is FAIR's next-generation platform for object detection and segmentation.

diabities-classification-project icon diabities-classification-project

Predict whether a patient has diabetes. This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict, based on diagnostic measurements, whether a patient has diabetes. I conducted a brief exploratory data analysis, then ran three models: logistic regression, linear SVC and gradient boosting classification. Links: [1] https://www.kaggle.com/uciml/pima-indians-diabetes-database
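As a rough illustration of the logistic-regression model mentioned above, here is a tiny from-scratch version on one made-up feature; the real project uses the full Pima feature set and library implementations, so treat this as a sketch of the idea only:

```python
from math import exp

# One invented, rescaled feature (think of it as normalized glucose) with
# a binary diabetes label. Six points, linearly separable around 0.5.
X = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [0, 0, 0, 1, 1, 1]

w, b = 0.0, 0.0
lr = 1.0
for _ in range(2000):            # plain stochastic gradient descent on log-loss
    for xi, yi in zip(X, y):
        p = 1 / (1 + exp(-(w * xi + b)))   # sigmoid prediction
        w -= lr * (p - yi) * xi            # gradient of log-loss w.r.t. w
        b -= lr * (p - yi)                 # gradient of log-loss w.r.t. b

def predict(x):
    """Classify as diabetic (True) when the predicted probability > 0.5."""
    return 1 / (1 + exp(-(w * x + b))) > 0.5
```

A linear SVC or gradient boosting classifier would replace only the model-fitting part; the EDA and train/test discipline stay the same.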
