g-searcher's Introduction

GSearcher

Table of Contents

  1. About The Project
  2. Getting Started

About The Project

G-Searcher is a Google Web Scraper. You upload a CSV with all the keywords you want searched, and let G-Searcher handle the rest!

  • Upload CSV files for massive queries
  • Google Authentication
  • Dedicated API endpoints
  • Advanced filters to search through all your queries
  • Sleek modern dashboard for your data needs

Built With

Getting Started

Prerequisites

You will require

Installation

To start G-Searcher:

  • Clone the repo
    git clone https://github.com/bterone/g-searcher
  • Install dependencies
    mix deps.get
  • Start the Docker container
    docker compose -f docker-compose.dev.yml up -d
  • Create and migrate your database
    mix ecto.setup
  • Install Node.js dependencies with npm install inside the assets directory
  • Fill in the OAuth credentials in dev.exs
    config :ueberauth, Ueberauth.Strategy.Google.OAuth,
      client_id: "<GOOGLE CLIENT ID>",
      client_secret: "<GOOGLE CLIENT SECRET>"
  • Start Phoenix endpoint with
    mix phx.server

Now you can visit localhost:4000 from your browser.

g-searcher's People

Contributors

bterone, dependabot[bot]

g-searcher's Issues

Push deployments via Docker

Why

In order to complete an objective.

Acceptance Criteria

1/ Write a deployment workflow that, when a branch is pushed to develop, pushes to Heroku and deploys to staging.
2/ Write a deployment workflow that, when a branch is pushed to main, pushes to Heroku and deploys to production.

Add SCSS linter

Why

Add an SCSS linter to improve code quality.

Acceptance Criteria

  1. Add scss-linter
  2. Add it to CI

As a user, I can make cross-origin requests

Why

For the API to be functional, it needs to handle cross-origin requests from different domains.

Acceptance Criteria

  • Allow localhost to connect to the Staging API
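
The criterion above amounts to an origin allow-list. The following is a minimal sketch in plain Elixir, not the project's implementation: the module name `OriginCheck` and the list of allowed hosts are assumptions made for illustration. In a Phoenix app this check would more likely be delegated to a CORS plug than written by hand.

```elixir
defmodule OriginCheck do
  # Hosts allowed to call the API from a browser (illustrative assumption).
  @allowed_hosts ["localhost", "127.0.0.1"]

  # Returns true when the Origin header value is an http(s) URL whose host
  # is on the allow-list, regardless of port.
  def allowed?(origin) when is_binary(origin) do
    case URI.parse(origin) do
      %URI{scheme: scheme, host: host}
      when scheme in ["http", "https"] and is_binary(host) ->
        host in @allowed_hosts

      _ ->
        false
    end
  end

  # Missing or non-string origins are rejected.
  def allowed?(_origin), do: false
end
```

Matching on the host rather than the full origin string lets any local port through, which suits local development against a remote API.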

Update project to follow our current Compass conventions

Why

Most of the project conventions have been established here and here. It would be nice to update the project to match these conventions.

Acceptance Criteria

1/ Move all the schemas into their individual folders, such as g_searcher/search_results/schema/search_result.ex.
2/ Move business-domain-level files into their own contexts, such as g_searcher/search_results.ex → g_searcher/search_results/search_results.ex

As a user, I can see report parsing progress on my dashboard

Why

Frontend for #28
Related to #9

The user needs to be directed to a page where they can see the status of their report and which results have been parsed.

Acceptance Criteria

  1. The dashboard should have all the user's reports rendered.
  2. Each report should have its status on whether it's complete, failed, or still running.

As a user, I can see updated Error pages

Why

Currently, the Error pages are basic HTML. Restyling them to match the rest of the application would be nice.

Acceptance Criteria

  1. Use the styling of the base app to create an error view for 404, 500.

Rollback only keywords and fail report when searching keywords

Feature request

Currently, when a keyword fails to save, everything is rolled back, including the report.

  • We need to fail the report instead.
  • Any keyword that fails to save should be treated as a sign that the CSV is malformed, and all saved keywords should be rolled back.
  • No keyword should be searched until all keywords have been saved.
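
The intended outcome can be sketched as an all-or-nothing import. This is a simplified model under stated assumptions, not the project's code: it validates every keyword up front instead of rolling back a database transaction, and the module name `KeywordImport`, the status atoms, and the 255-character limit are all hypothetical. A real implementation would probably wrap the keyword inserts in a database transaction and update the report separately.

```elixir
defmodule KeywordImport do
  # Hypothetical limit, used only to give the sketch a failure case.
  @max_length 255

  # Splits raw CSV content into trimmed keywords, one per line.
  # Blank lines are skipped.
  def parse(csv) when is_binary(csv) do
    csv
    |> String.split(["\r\n", "\n"])
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
  end

  # Checks every keyword before anything is kept.
  # Returns {:ok, keywords} or {:error, :malformed_csv, first_bad_keyword}.
  def validate_all(keywords) do
    case Enum.find(keywords, &(not valid?(&1))) do
      nil -> {:ok, keywords}
      bad -> {:error, :malformed_csv, bad}
    end
  end

  # Either every keyword is kept and searching may begin,
  # or the report is marked failed and no keyword is kept.
  def run(csv) do
    case csv |> parse() |> validate_all() do
      {:ok, keywords} -> %{status: :saved, keywords: keywords}
      {:error, :malformed_csv, _bad} -> %{status: :failed, keywords: []}
    end
  end

  defp valid?(keyword), do: String.length(keyword) <= @max_length
end
```

The report itself survives in both branches. Only its status and keyword list differ, which is the behavior the bullets above ask for.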

Fix: Incorrect value of results on page in Report Show Page

Issue

The value of the Results on Page in the Report Show Page does not match the Search Result Show Page

Expected

Show the correct value.

There are fewer values than displayed in the Results on Page column

Steps to reproduce

  1. Upload CSV report
  2. Navigate to Report Show Page. Observe the values
  3. Navigate to one of the Search Result Show Pages. Check whether the values match.

As a user, I can search across my reports

Why

A user should be able to search across reports using a search bar.

Acceptance Criteria

  1. Make queries to the table from #35
  2. A user should be able to query for all information across their stored keywords.

As a user, I can see a keyword index of search results

Why

The user would need to see a link on their dashboard to an index page of all their stored keywords

Acceptance Criteria

  1. Have a link on the dashboard that takes them to an index of their keywords
  2. Have information such as number of results, top advertiser count, regular advertiser count, total results
  3. Include parsing status on each keyword

[Frontend] As a web scraper, I can search Google and save results

Feature request

Modify the dashboard to list all reports with their status on completing the search.

Clicking on a report will show all the search results associated with the report.

Clicking on a search result will show the details of the search result and display the HTML cache.

As a user, I can see a helpful tooltip that displays the advanced filter options

Why

Users would have no prior knowledge of the advanced search features of GSearcher. Seeing a helpful tooltip that displays all the available options would provide better UX.

Acceptance Criteria

1/ Have a tooltip next to the search bar.
2/ The tooltip needs to display all the available options for advanced search filters.

As a user, I can see the Search result URL titles alongside the URLs

Why

The search result URLs page is bland. There isn't much information to present there other than the URLs in 3 separate URL tables.

With the Search result URL titles, we would have one extra column and better UX

Acceptance Criteria

  1. Extract URLs out of the search_result schema into its own search_result_urls schema.
  2. One search result would have many search result URLs.
  3. Redesign the search_result URLs page to include the extra column.
  4. In the search worker, we need to save this extra information by targeting the specific class and fetching the title

As a user, I can view the new landing page

Why

Update the landing page to Tailwind, and create base styles so we can have an easier time developing in the long run.

Acceptance Criteria

  1. Update the landing page to use Tailwind
  2. Update the dashboard page to use Tailwind
  3. Create some base styling for buttons, headings, and links for consistency across the site.

Redesign Dashboard

Why

Currently, the form takes up too much space on the dashboard. Freeing up that space would be nice to have.

Acceptance Criteria

  1. Make the upload report form hidden from view until the user clicks the upload button.
  2. Move the upload button to the right of the report index table.

As a user, I can search across my reports

Why

A user should be able to search across reports using a search bar.

Acceptance Criteria

  1. Have the dashboard pages display a search bar
  2. Typing a query and searching should take them to a page with the search results

As a user, I can see individual search results and their progress

Why

On the dashboard, there will be an index of all the user's reports. The user should be able to click on one of these reports to get a list of all the associated keywords and whether they're parsed or not.

Acceptance Criteria

  1. Display an index of keywords associated to the report
  2. Display a status depending on whether the keyword is searched or not.
  3. Clicking on a completed keyword should take the user to the search result with its details and HTML view of the search result.

As a user, I can see my report status be updated live in real time

Why

Live updates give better UX than constantly refreshing the page, and this is a chance to take a deeper look into Phoenix LiveView.

Acceptance Criteria

  1. Integrate Phoenix LiveView into project
  2. Convert dashboard route to a Live route.
  3. Create a channel for report updates so we can broadcast updates directly from the worker.

As a user, I can click a button to toggle display on the create report form

Why

The dashboard is going to contain the report index, so we need a create report button that, when clicked, will display the create report form.

Acceptance Criteria

  1. Add a button to create the report that when clicked will display the create report form
  2. Animate with fade down / fade up when toggling the form.

As a user, I can request paginated search results from the API

Why

Currently, the search_result index in the API returns all the search results associated with the user. We need to add pagination so that clients can query a subset of the data, which also leaves room to introduce the "relationships" and "included" members from the JSON:API spec.

Acceptance Criteria

1/ Introduce pagination to the API
2/ Add the "relationships" and "included" members to the search result index.
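
As a rough illustration of the first criterion, the sketch below paginates an in-memory list and returns a map with top-level `data` and `meta` keys, echoing the JSON:API document shape. The module name `Pagination`, the argument names, and the `meta` field names are assumptions made for this example. A real endpoint would more likely paginate in the database query than slice a fully loaded list.

```elixir
defmodule Pagination do
  # Returns one page of `items` plus pagination metadata.
  # `number` is the 1-based page number and `size` is the page size.
  def paginate(items, number, size)
      when is_list(items) and is_integer(number) and number >= 1 and
             is_integer(size) and size >= 1 do
    total = length(items)
    # Ceiling division, with at least one (possibly empty) page.
    total_pages = max(div(total + size - 1, size), 1)

    %{
      data: Enum.slice(items, (number - 1) * size, size),
      meta: %{
        page: number,
        page_size: size,
        total_count: total,
        total_pages: total_pages
      }
    }
  end
end
```

Requesting a page past the end returns an empty `data` list rather than raising, so clients can detect the end of the collection from `meta.total_pages`.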

As a user, I can see report parsing progress on my dashboard

Why

Related to #9
The user needs to be directed to a page where they can see the status of their report and which results have been parsed.

Acceptance Criteria

  1. The dashboard should have all the user's reports rendered.
  2. Each report should have its status on whether it's complete, failed, or still running.

As a user, I can see an updated README

Why

The current README is using the default generated by Phoenix. We'd rather have a custom README.

Acceptance Criteria

  • Update the README to be a bit more stylish and relevant.

As a user, I can make an API call to sign in to my account

Why

We need to encode an access token that we can use to represent the session. This token should time out after a period of time.

Acceptance Criteria

  1. Encode the user session into an access token using Guardian.
  2. Have the token time out after 1 day.
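
The one-day timeout could be expressed through Guardian's `ttl` setting. The fragment below is a sketch under assumptions: the app name `:g_searcher`, the module name `GSearcher.Guardian`, the issuer, and the placeholder secret are illustrative and not taken from the project.

```elixir
import Config

# Hypothetical module name and placeholder secret; adjust to the project's own.
config :g_searcher, GSearcher.Guardian,
  issuer: "g_searcher",
  # Tokens expire one day after they are issued.
  ttl: {1, :day},
  secret_key: "<GUARDIAN SECRET KEY>"
```

Setting the time-to-live in configuration keeps the expiry in one place, so every token issued at sign-in gets the same lifetime unless a call overrides it.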
