  • 👋 Hi, I’m @vaitybharati
  • 👀 I’m interested in Data Science, Machine Learning and Artificial Intelligence
  • 🌱 I’m currently mastering Python, Tableau, R, MySQL, Azure, Apache, Spark, Hadoop, SAS, Artificial Intelligence and Deep Learning
  • 💞️ I’m looking to collaborate on all topics related to Data Science, Machine Learning and Artificial Intelligence.
  • 📫 You can reach me on my email id [email protected]

Vaitybharati's Projects

assignment-07-k-means-clustering-airlines-

Assignment-07-K-Means-Clustering-Airlines. Perform K-means clustering on the airlines data to obtain the optimum number of clusters. Draw inferences from the clusters obtained. The file EastWestAirlines contains information on passengers who belong to an airline’s frequent flier program. For each passenger the data include information on their mileage history and on the different ways they accrued or spent miles in the last year. The goal is to identify clusters of passengers with similar characteristics so that different segments can be targeted with different types of mileage offers.
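As a rough illustration of the elbow idea, the sketch below runs a plain-Python k-means on made-up 2-D points (not the EastWestAirlines data) and records the within-cluster sum of squares for several values of k. The function name `kmeans`, the toy points, and the choice to start the centroids at the first k points are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm; centroids start at the first k points (an assumption)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Inertia: sum of squared distances from each point to its closest centroid.
    inertia = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)
    return centroids, inertia

# Toy data with two obvious groups (hypothetical, for illustration only).
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
inertia_by_k = {k: kmeans(points, k)[1] for k in (1, 2, 3)}
for k, inertia in inertia_by_k.items():
    print(f"k={k}: inertia={inertia:.2f}")
```

The sharp drop in inertia from k=1 to k=2, followed by a much smaller drop, is the "elbow" that suggests two clusters for this toy data.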

assignment-08-pca-data-mining-wine-

Assignment-08-PCA-Data-Mining-Wine data. Perform principal component analysis, then cluster using the first 3 principal component scores (both hierarchical and k-means clustering, with a scree plot or elbow curve) to obtain the optimum number of clusters. Check whether this matches the number of clusters in the original data (the class column, which was ignored at the beginning, shows 3 clusters).

assignment-09-association-rules-data-mining-books-

Association-Rules-Data-Mining-Books. Apriori Algorithm, Association rules with 10% Support and 70% confidence, Association rules with 20% Support and 60% confidence, Association rules with 5% Support and 80% confidence, visualization of obtained rule.

assignment-09-association-rules-data-mining-groceries-

Association Rules Data Mining (Groceries). Converting the data frame into a list of lists, using TransactionEncoder to transform this dataset into a logical data frame, building the data frame (rows are logical and columns are the items that have been purchased), printing column names, dropping the nan column from the data frame, most popular items, top 10 popular items, barplot visualization of popular items, Apriori Algorithm: association rules with 5% support and 70% confidence, association rules with 1% support and 80% confidence, visualization of obtained rules.

assignment-09-association-rules-data-mining-my_movies-

Assignment-09-Association-Rules-Data-Mining-my_movies. Apriori Algorithm. Association rules with 10% Support and 70% confidence. Association rules with 5% Support and 90% confidence. Lift Ratio > 1 is a good influential rule in selecting the associated transactions. Visualization of obtained rule.
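To make the support, confidence, and lift terms concrete, here is a minimal sketch that computes them by hand for one rule. The transactions and item names are invented for illustration and are not taken from the my_movies data; the helper names are also assumptions.

```python
# Hypothetical toy transactions (each set is one basket of items).
transactions = [
    {"A", "B"},
    {"A", "B"},
    {"A", "B", "C"},
    {"C"},
    {"B", "C"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)

rule_support = support({"A", "B"})
rule_confidence = confidence({"A"}, {"B"})
rule_lift = lift({"A"}, {"B"})
print(f"A -> B: support={rule_support:.2f}, "
      f"confidence={rule_confidence:.2f}, lift={rule_lift:.2f}")
```

A lift above 1, as in this toy rule, means the antecedent and consequent occur together more often than they would if they were independent.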

assignment-1-q11-basic-statistics-level-1-

Q11) Suppose we want to estimate the average weight of an adult male in Mexico. We draw a random sample of 2,000 men from a population of 3,000,000 men and weigh them. We find that the average person in our sample weighs 200 pounds, and the standard deviation of the sample is 30 pounds. Calculate the 94%, 98%, and 96% confidence intervals.
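One possible way to work this out is with a normal (z) interval, which is a reasonable approximation for a sample of 2,000. The sketch below assumes that approach and ignores any finite population correction; the variable and function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, sample_sd, n = 200, 30, 2000
standard_error = sample_sd / sqrt(n)

def z_interval(confidence):
    """Two-sided z interval around the sample mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

intervals = {c: z_interval(c) for c in (0.94, 0.98, 0.96)}
for c, (low, high) in intervals.items():
    print(f"{c:.0%} CI: ({low:.2f}, {high:.2f})")
```

Higher confidence levels give wider intervals because the critical z value grows while the standard error stays fixed.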

assignment-1-q12-basic-statistics-level-1-

Below are the scores obtained by a student in tests: 34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56. Find the mean, median, variance, and standard deviation. What can we say about the student's marks?
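These summary statistics can be checked with Python's standard `statistics` module. The sketch below uses the sample (n - 1) versions of variance and standard deviation, which is an assumption since the question does not say which version is intended.

```python
import statistics

scores = [34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56]

mean = statistics.mean(scores)
median = statistics.median(scores)
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation

print(f"mean={mean}, median={median}, variance={variance:.2f}, std dev={std_dev:.2f}")
```

The mean sits slightly above the median, which is consistent with the few high scores (49 and 56) pulling the distribution to the right.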

assignment-1-q22-basic-statistics-level-1-

Q 22) Calculate the Z scores for the 90%, 94%, and 60% confidence intervals for Adipose Tissue (AT) and Waist Circumference (Waist) from the wc-at data set.
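The critical z values themselves depend only on the confidence level, not on the data. A minimal sketch, assuming two-sided intervals and using only the standard library (the wc-at data set is not loaded here):

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical value: leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

z_scores = {c: critical_z(c) for c in (0.90, 0.94, 0.60)}
for c, z in z_scores.items():
    print(f"{c:.0%} confidence: z = {z:.4f}")
```

Each value would then be multiplied by the standard error of the AT or Waist mean to get the interval's margin of error.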

assignment-1-q24-basic-statistics-level-1-

Q 24) A Government company claims that an average light bulb lasts 270 days. A researcher randomly selects 18 bulbs for testing. The sampled bulbs last an average of 260 days, with a standard deviation of 90 days. If the company's claim were true, what is the probability that 18 randomly selected bulbs would have an average life of no more than 260 days?
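With a small sample and a sample standard deviation, a Student's t distribution with 17 degrees of freedom is the usual model. The sketch below assumes that model and, to stay within the standard library, integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_cdf` are illustrative (a statistics library's t CDF would normally be used instead).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

claimed_mean, sample_mean, sample_sd, n = 270, 260, 90, 18
t_stat = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
p_value = t_cdf(t_stat, df=n - 1)
print(f"t = {t_stat:.4f}, P(mean <= 260) = {p_value:.4f}")
```

A probability of roughly one in three means a sample average of 260 days is quite plausible under the 270-day claim.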

assignment-10-recommendation-system-data-mining-books-

Assignment-10-Recommendation-System-Data-Mining-books. Recommend the best book based on the ratings: sort by user IDs, count the unique users in the dataset, count the unique books in the dataset, convert long data into wide data using a pivot table, replace the index values with unique user IDs, impute NaNs with 0 values, calculate cosine similarity between users on array data, store the results in a dataframe, set the index and column names to user IDs, nullify diagonal values, find the most similar users, extract the books which userId 162107 & 276726 have read, extract the books which userId 276729 & 276726 have read.
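The core of this workflow, user-to-user cosine similarity with the diagonal zeroed out, can be sketched in plain Python. The user IDs, book columns, and ratings below are invented for illustration (unrated books are already filled with 0), and the function names are assumptions.

```python
import math

# Hypothetical wide table: one row of ratings per user, 0 where a book is unrated.
ratings = {
    "u1": [8, 0, 6],
    "u2": [4, 0, 3],
    "u3": [0, 7, 0],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pairwise similarity matrix, with self-similarity set to 0 so it is never picked.
similarity = {
    u: {v: 0.0 if u == v else cosine_similarity(ratings[u], ratings[v]) for v in ratings}
    for u in ratings
}

# For each user, the other user with the highest similarity score.
most_similar = {u: max(row, key=row.get) for u, row in similarity.items()}
print(most_similar)
```

Books rated by the most similar user but not yet by the target user are then natural candidates to recommend.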

assignment-11-text-mining-01-elon-musk

Assignment-11-Text-Mining-01-Elon-Musk. Perform sentiment analysis on the Elon-musk tweets (Exlon-musk.csv). Text Preprocessing: remove both the leading and the trailing characters, remove empty strings (Python treats them as False), join the list into one string/text, remove Twitter username handles (@usernames) from the text, join the list into one string/text again, remove punctuation, remove https or URLs within the text, convert into text tokens (tokenization), remove stopwords, normalize the data, stemming (optional), lemmatization. Feature Extraction: BoW CountVectorizer, CountVectorizer with N-grams (bigrams & trigrams), TF-IDF Vectorizer. Generate Word Cloud, Named Entity Recognition (NER), Emotion Mining - Sentiment Analysis.
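A few of the preprocessing steps listed above can be sketched with the standard library alone. The sample tweet, the tiny stopword list, and the function name are made up for illustration; in this sketch URLs are stripped before punctuation so that the URL pattern still matches.

```python
import re
import string

# Tiny illustrative stopword list (a real pipeline would use a fuller one).
STOPWORDS = {"the", "is", "a", "to", "and", "of", "in", "at"}

def preprocess(tweet):
    text = tweet.strip()                          # drop leading/trailing whitespace
    text = re.sub(r"@\w+", "", text)              # remove @username handles
    text = re.sub(r"https?://\S+", "", text)      # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.lower().split()                 # normalize case and tokenize
    return [t for t in tokens if t not in STOPWORDS]

sample = "  @user The rocket is landing at https://example.com today!  "
tokens = preprocess(sample)
print(tokens)
```

The resulting token list is the kind of input that a bag-of-words or TF-IDF vectorizer would consume in the feature extraction step.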

assignment-11-text-mining-02-amazon-product-reviews

NLP: Sentiment Analysis or Emotion Mining on Amazon Product Reviews - Part-1. Let’s learn the NLP techniques to perform Sentiment Analysis or Emotion Mining on extracted Product Reviews from Amazon. Part-1 covers Text preprocessing and Feature extraction, the next part covers Sentiment Analysis or Emotion Mining on text corpus. https://medium.com/@vaitybharati/nlp-sentiment-analysis-or-emotion-mining-on-amazon-product-reviews-part-1-428d43112027

assignment-11-text-mining-amazon-reviews-using-scrapy

Text-Mining-Amazon-Reviews-using-Scrapy. Ever wondered whether life would be easier if there were ways to know how well your product performs and what people feel about it? The solution: text mining techniques. https://medium.com/@vaitybharati/text-mining-how-to-extract-amazon-reviews-using-scrapy-5bd709cb826c

assignment-2-set2-q1-basic-statistic-level-2-

The time required for servicing transmissions is normally distributed with mean = 45 minutes and SD = 8 minutes. The service manager plans to have work begin on the transmission of a customer’s car 10 minutes after the car is dropped off and the customer is told that the car will be ready within 1 hour from drop-off. What is the probability that the service manager cannot meet his commitment?
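One way to read this problem: work starts 10 minutes after drop-off and the car is promised within 60 minutes, so the service itself must finish within 50 minutes. Under that reading, a minimal sketch with the standard library is:

```python
from statistics import NormalDist

service_time = NormalDist(mu=45, sigma=8)   # minutes
time_available = 60 - 10                    # promised window minus the start delay

# The commitment is missed when servicing takes longer than the time available.
p_miss = 1 - service_time.cdf(time_available)
print(f"P(cannot meet commitment) = {p_miss:.4f}")
```

The result is a little over one in four, since 50 minutes is only 0.625 standard deviations above the mean service time.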

assignment-2-set2-q2-basic-statistic-level-2-

The current age (in years) of 400 clerical employees at an insurance claims processing center is normally distributed with mean = 38 and standard deviation = 6. For each statement below, please specify True/False. If false, briefly explain why. A. More employees at the processing center are older than 44 than between 38 and 44. B. A training program for employees under the age of 30 at the center would be expected to attract about 36 employees.
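Both statements can be checked directly from the normal CDF. The sketch below is one way to do that with the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

ages = NormalDist(mu=38, sigma=6)
employees = 400

# Statement A: compare P(age > 44) with P(38 < age < 44).
p_older_than_44 = 1 - ages.cdf(44)
p_between_38_and_44 = ages.cdf(44) - ages.cdf(38)
statement_a = p_older_than_44 > p_between_38_and_44

# Statement B: expected number of employees younger than 30.
expected_under_30 = employees * ages.cdf(30)
statement_b = round(expected_under_30) == 36

print(f"A: P(>44)={p_older_than_44:.4f}, P(38-44)={p_between_38_and_44:.4f} -> {statement_a}")
print(f"B: expected under 30 = {expected_under_30:.1f} -> {statement_b}")
```

Statement A comes out false because 44 is one standard deviation above the mean, so about 34% of employees fall between 38 and 44 while only about 16% are older than 44.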

assignment-2-set2-q5-basic-statistic-level-2-

Consider a company that has two different divisions. The annual profits from the two divisions are independent and have distributions Profit1 ~ N(5, 3^2) and Profit2 ~ N(7, 4^2) respectively. Both the profits are in $ Million. Answer the following questions about the total profit of the company in Rupees. Assume that $1 = Rs. 45. A. Specify a Rupee range (centered on the mean) such that it contains 95% probability for the annual profit of the company. B. Specify the 5th percentile of profit (in Rupees) for the company. C. Which of the two divisions has a larger probability of making a loss in a given year?
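Because the two profits are independent normals, their sum is also normal with the means and variances added. The sketch below relies on `statistics.NormalDist`, which supports adding independent distributions and scaling by a constant; results are in millions of Rupees and the variable names are illustrative.

```python
from statistics import NormalDist

profit1 = NormalDist(mu=5, sigma=3)   # $ million
profit2 = NormalDist(mu=7, sigma=4)   # $ million
rupees_per_dollar = 45

# Sum of independent normals, then convert to Rs. million.
total_rs = (profit1 + profit2) * rupees_per_dollar

# A. Central 95% range.
low, high = total_rs.inv_cdf(0.025), total_rs.inv_cdf(0.975)

# B. 5th percentile of total profit.
fifth_percentile = total_rs.inv_cdf(0.05)

# C. Probability of a loss (profit below zero) for each division.
p_loss_1 = profit1.cdf(0)
p_loss_2 = profit2.cdf(0)

print(f"Total profit ~ N({total_rs.mean:.0f}, {total_rs.stdev:.0f}^2) Rs. million")
print(f"A. 95% range: ({low:.1f}, {high:.1f})")
print(f"B. 5th percentile: {fifth_percentile:.1f}")
print(f"C. P(loss): division 1 = {p_loss_1:.4f}, division 2 = {p_loss_2:.4f}")
```

Division 1 has the larger loss probability: its mean is only 1.67 standard deviations above zero, compared with 1.75 for division 2.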

assignment-2-set3-q5-basic-statistic-level-2-

In January 2005, a company that monitors Internet traffic (WebSideStory) reported that its sampling revealed that the Mozilla Firefox browser launched in 2004 had grabbed a 4.6% share of the market. I. If the sample were based on 2,000 users, could Microsoft conclude that Mozilla has a less than 5% share of the market? II. WebSideStory claims that its sample includes all the daily Internet users. If that’s the case, then can Microsoft conclude that Mozilla has a less than 5% share of the market?
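Part I can be framed as a one-sided, one-sample z test for a proportion, with the null hypothesis that the share is 5%. The sketch below assumes that framing and a 5% significance level, neither of which is stated in the question.

```python
from math import sqrt
from statistics import NormalDist

observed_share = 0.046
null_share = 0.05
n = 2000
alpha = 0.05  # assumed significance level

# Standard error under the null hypothesis that the true share is 5%.
standard_error = sqrt(null_share * (1 - null_share) / n)
z_stat = (observed_share - null_share) / standard_error

# One-sided p-value for the alternative "share < 5%".
p_value = NormalDist().cdf(z_stat)
reject_null = p_value < alpha

print(f"z = {z_stat:.4f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Under these assumptions the sample of 2,000 does not give enough evidence to conclude that the share is below 5%. For Part II, if the sample really covers every daily Internet user, the 4.6% figure is the population value itself and no test is needed.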

assignment-2-set4-q3-basic-statistic-level-2-

Auditors at a small community bank randomly sample 100 withdrawal transactions made during the week at an ATM machine located near the bank’s main branch. Over the past 2 years, the average withdrawal amount has been $50 with a standard deviation of $40. Since audit investigations are typically expensive, the auditors decide to not initiate further investigations if the mean transaction amount of the sample is between $45 and $55. What is the probability that in any given week, there will be an investigation?
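With 100 transactions, the sample mean is approximately normal with standard error 40 / sqrt(100) = 4. A minimal sketch of the calculation under that central limit theorem assumption:

```python
from math import sqrt
from statistics import NormalDist

population_mean, population_sd, n = 50, 40, 100
sample_mean_dist = NormalDist(mu=population_mean, sigma=population_sd / sqrt(n))

# No investigation when the sample mean falls between $45 and $55.
p_no_investigation = sample_mean_dist.cdf(55) - sample_mean_dist.cdf(45)
p_investigation = 1 - p_no_investigation

print(f"P(investigation) = {p_investigation:.4f}")
```

The thresholds sit 1.25 standard errors either side of the mean, so an investigation is triggered in roughly one week out of five.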
