The test_challenge from owlphi

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ML Traveller\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 Euler\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "https://projecteuler.net/problem=1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.\n",
    "\n",
    "Find the sum of all the multiples of 3 or 5 below 1000."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "__author__ = 'cripton'\n",
    "\"\"\"\n",
    "https://projecteuler.net/problem=1\n",
    "If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.\n",
    "Find the sum of all the multiples of 3 or 5 below 1000.\n",
    "\"\"\"\n",
    "\n",
    "def sum_multiple(a, b, c):\n",
    "    \"\"\"Print the sum of the natural numbers below c that are multiples of a or b.\"\"\"\n",
    "    total = 0\n",
    "    for i in range(1, c):\n",
    "        if i % a == 0 or i % b == 0:\n",
    "            total += i\n",
    "    print(total)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "233168\n"
     ]
    }
   ],
   "source": [
    "sum_multiple(3,5,1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2 Machine Learning in HackerRank"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "https://www.hackerrank.com/challenges/correlation-and-regression-lines-7"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the test scores of 10 students in physics and history:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Physics Scores | History Scores\n",
    "--- | ---\n",
    "15 | 10\n",
    "12 | 25\n",
    "8 | 17\n",
    "8 | 11\n",
    "7 | 13\n",
    "7 | 17\n",
    "7 | 20\n",
    "6 | 13\n",
    "5 | 9\n",
    "3 | 15"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compute the slope of the line of regression obtained while treating Physics as the independent variable. Compute the answer correct to three decimal places."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Output Format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: 0.255\n",
    "\n",
    "This is **NOT** the actual answer - just the format in which you should provide your answer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('Coefficients: \\n', array([ 0.20833333]))\n"
     ]
    }
   ],
   "source": [
    "from sklearn import linear_model\n",
    "\n",
    "# Physics scores (independent variable), one feature per sample\n",
    "X = [[15], [12], [8], [8], [7], [7], [7], [6], [5], [3]]\n",
    "# History scores (dependent variable)\n",
    "y = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]\n",
    "\n",
    "# Create linear regression object\n",
    "regr = linear_model.LinearRegression()\n",
    "\n",
    "# Train the model using the training sets\n",
    "regr.fit(X, y)\n",
    "\n",
    "# The coefficients\n",
    "print('Coefficients: \\n', regr.coef_)\n",
    "# Slope rounded to three decimal places, as the challenge asks:\n",
    "# print(round(regr.coef_[0], 3))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3 NLP in HackerRank"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "https://www.hackerrank.com/challenges/byte-the-correct-apple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The word \"Apple\" could generally refer to one of these two:<br />\n",
    "(a) Apple Inc., the great Computer giant. <br />(b) Apple, the fruit "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You are provided a text file, with a number of lines. Each line contains either a sentence or a paragraph or a text snippet which could either be related to Apple, the computer company, or the apple, the fruit. Your task is to perform disambiguation between these two groups and identify which one is being referred to. It is possible that the plural or the possessive form of Apple might exist in some of the tests (apples, Apple's). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Training Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You are provided with two text files, which contain near-complete text from the Wikipedia for Apple Inc. as well as apple the fruit. For offline inspection and access, you could access these two files here: "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Text from Wikipedia entry on Apple-Computers](https://s3.amazonaws.com/hr-testcases/1053/assets/apple-computers.txt)<br />\n",
    "[Text from Wikipedia entry on Apple the fruit](https://s3.amazonaws.com/hr-testcases/1053/assets/apple-fruit.txt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also, when you submit your program, you can assume that these two text files are available in the directory where your program is run, and their names are \"apple-computers.txt\" and \"apple-fruit.txt\". "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Input Format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An Integer N, no more than 100.  \n",
    "\n",
    "line_1<br />\n",
    "line_2<br />\n",
    "line_3<br />\n",
    "line_4<br />\n",
    "...<br />\n",
    "line_N  <br />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Constraints"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "N <= 100 <br />\n",
    "Each line will have not more than 1000 characters in it.<br />\n",
    "Assume that the encoding is UTF-8.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Output Format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "computer-company<br/>\n",
    "fruit<br/>\n",
    "computer-company<br/>\n",
    "fruit<br/>\n",
    "..<br/>\n",
    "..<br/>\n",
    "..<br/>\n",
    "N lines of output  <br/>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solution:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('apple-fruit.txt', <httplib.HTTPMessage instance at 0x7fa7c375fb00>)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import urllib\n",
    "testfile = urllib.URLopener()\n",
    "testfile.retrieve(\"https://s3.amazonaws.com/hr-testcases/1053/assets/apple-computers.txt\", \"apple-computers.txt\")\n",
    "testfile.retrieve(\"https://s3.amazonaws.com/hr-testcases/1053/assets/apple-fruit.txt\", \"apple-fruit.txt\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import CountVectorizer\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.naive_bayes import MultinomialNB\n",
    "\n",
    "# Training set: every line of each Wikipedia text file is one sample.\n",
    "# Class 0 = Apple the computer company, class 1 = apple the fruit.\n",
    "train = {'text': [], 'class': []}\n",
    "\n",
    "with open('apple-computers.txt', 'r') as f:\n",
    "    for line in f:\n",
    "        train['text'].append(line)\n",
    "        train['class'].append(0)\n",
    "\n",
    "with open('apple-fruit.txt', 'r') as f:\n",
    "    for line in f:\n",
    "        train['text'].append(line)\n",
    "        train['class'].append(1)\n",
    "\n",
    "# Bag of words (unigrams and bigrams, English stop words removed) fed to multinomial naive Bayes\n",
    "pipeline = Pipeline([\n",
    "    ('vectorizer', CountVectorizer(ngram_range=(1, 2), stop_words='english')),\n",
    "    ('classifier', MultinomialNB())])\n",
    "\n",
    "text_clf = pipeline.fit(train['text'], train['class'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Predict examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "computer-company\n",
      "fruit\n"
     ]
    }
   ],
   "source": [
    "test_example=['Apple is a famous company','Apple is a delicious fruit']\n",
    "\n",
    "test_label=text_clf.predict(test_example)\n",
    "for e in test_label: print('computer-company' if e==0 else 'fruit')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Predict from an input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5\n",
      "apple is a company\n",
      "apple is a fruit\n",
      "\n",
      "\n",
      "\n",
      "computer-company\n",
      "fruit\n",
      "computer-company\n",
      "computer-company\n",
      "computer-company\n"
     ]
    }
   ],
   "source": [
    "# Read N, then the N lines to classify\n",
    "test = []\n",
    "for i in range(int(raw_input())):\n",
    "    test.append(raw_input())\n",
    "\n",
    "test_label = text_clf.predict(test)\n",
    "for e in test_label:\n",
    "    print('computer-company' if e == 0 else 'fruit')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
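
As a cross-check on section 2, the slope can also be computed by hand from the least-squares formula, slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2), and formatted to the three decimal places the challenge asks for. This is only a minimal sketch, not part of the original notebook: the helper name `regression_slope` and the variable names `physics` and `history` are assumptions introduced for illustration, and the data is copied from the score table.

```python
# Physics (independent variable) and History scores from the table in section 2.
physics = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]


def regression_slope(x, y):
    """Hypothetical helper: least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


slope = regression_slope(physics, history)
print("%.3f" % slope)  # prints 0.208
```

The result agrees with the coefficient 0.20833333 reported by `LinearRegression` in the notebook, and the sketch avoids a scikit-learn dependency for a one-variable fit.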