arxiv_web_scraper's Introduction

This project was bootstrapped with Create React App, using the Redux and Redux Toolkit template.

Available Scripts

In the project directory, you can run:

npm install

npm start

Runs the app in development mode.
Open http://localhost:3000 to view it in the browser.

The page will reload if you make edits.
You will also see any lint errors in the console.

npm run build

Builds the app for production to the build folder.
It correctly bundles React in production mode and optimizes the build for the best performance.

Discussion

This project was timeboxed to approximately 4 hours, so several design decisions reflect that constraint and some technical sacrifices were made. In general, user friendliness was prioritized in order to meet the main project criterion, which was to create a friendlier interface for using arXiv.

The app has two pages, and each makes one fetch call when it first loads. The fetch call retrieves the hundred most recent articles published on arXiv relating to psychiatry, therapy, data science or machine learning. The response is XML, so it is parsed to JSON before being saved in the Redux store. The Articles component then selects this data and renders a list of articles.
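The fetch-and-parse flow described above could look roughly like the sketch below. It is not the project's actual code: the endpoint constant, the function names (`buildQueryUrl`, `parseFeed`, `fetchArticles`), the exact search query, and the shape of the parsed article objects are all assumptions made for illustration. The query parameters (`search_query`, `sortBy`, `sortOrder`, `max_results`) are those of the public arXiv API.

```javascript
// Assumed endpoint of the public arXiv API; the project's exact URL is not shown.
const ARXIV_API = "https://export.arxiv.org/api/query";

// Build a query URL for the most recent articles matching any of the terms.
function buildQueryUrl(terms, maxResults = 100) {
  const params = new URLSearchParams({
    search_query: terms.map((t) => `all:"${t}"`).join(" OR "),
    sortBy: "submittedDate",
    sortOrder: "descending",
    max_results: String(maxResults),
  });
  return `${ARXIV_API}?${params}`;
}

// Convert the Atom XML response into plain objects that can go in the store.
// Browser-only: DOMParser is not available in plain Node.
function parseFeed(xmlText) {
  const doc = new DOMParser().parseFromString(xmlText, "application/xml");
  return [...doc.getElementsByTagName("entry")].map((entry) => ({
    title: entry.getElementsByTagName("title")[0]?.textContent.trim() ?? "",
    published: entry.getElementsByTagName("published")[0]?.textContent ?? "",
    authors: [...entry.getElementsByTagName("author")].map(
      (a) => a.getElementsByTagName("name")[0]?.textContent ?? ""
    ),
  }));
}

// One request per page load, as described above.
async function fetchArticles(terms) {
  const response = await fetch(buildQueryUrl(terms));
  if (!response.ok) throw new Error(`arXiv request failed: ${response.status}`);
  return parseFeed(await response.text());
}
```

Calling `fetchArticles(["psychiatry", "therapy", "data science", "machine learning"])` from the page's load handler and dispatching the result to the store would match the flow described here.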

Future work would add pagination to the Articles component by fetching ten or so results at a time as the user clicks through the pages. It would also be useful to allow filtering by category, or entering new search parameters manually, to find the most relevant results.
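As a rough illustration of the pagination idea, the arXiv API accepts `start` and `max_results` parameters, so a zero-based page number can be mapped to an offset. The helper name `pageParams` and the default page size of ten are hypothetical and not part of the project.

```javascript
// Map a zero-based page index to arXiv API paging parameters.
// `pageParams` is a hypothetical helper, not code from this project.
function pageParams(page, pageSize = 10) {
  return new URLSearchParams({
    start: String(page * pageSize),
    max_results: String(pageSize),
  });
}
```

For example, `pageParams(2).toString()` gives `start=20&max_results=10`, which could be appended to the existing search query.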

A loading page, or some other indication that the app is fetching articles in the background, would also improve the UX. Similarly, messaging that helps the user recover from an error state would be helpful. Neither was added due to time constraints.
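One possible way to support loading and error messaging is to track a request status in the store. The sketch below is a plain reducer written without Redux Toolkit so that it stands alone. The action type strings and the state shape are assumptions, not the project's actual slice.

```javascript
// Hypothetical state shape: status is "idle", "loading", "succeeded" or "failed".
const initialState = { status: "idle", articles: [], error: null };

function articlesReducer(state = initialState, action) {
  switch (action.type) {
    case "articles/fetchStarted":
      // The UI can show a spinner while status is "loading".
      return { ...state, status: "loading", error: null };
    case "articles/fetchSucceeded":
      return { status: "succeeded", articles: action.payload, error: null };
    case "articles/fetchFailed":
      // The UI can show the message and offer a retry.
      return { ...state, status: "failed", error: action.error };
    default:
      return state;
  }
}
```

A component could then render a loading indicator, the article list, or an error message depending on `status`.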

The Authors page makes the same fetch call, then loops through each article and keeps a tally of the authors it encounters to see how many articles each has published in the last six months. The tally is, however, limited to articles in the categories searched for above. To find every author who has published in any category on arXiv, we might have to make multiple requests (one for every available category) or do a wildcard search if the API allows it.
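The author tally described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `tallyAuthors` and the article shape (`published` as an ISO date string, `authors` as an array of names) are assumptions.

```javascript
// Count articles per author among articles published in the last `months` months.
// Hypothetical article shape: { published: ISO date string, authors: string[] }.
function tallyAuthors(articles, now = new Date(), months = 6) {
  // Approximate cutoff: same day of month, `months` months earlier (UTC).
  const cutoff = new Date(now);
  cutoff.setUTCMonth(cutoff.getUTCMonth() - months);

  const counts = new Map();
  for (const article of articles) {
    if (new Date(article.published) < cutoff) continue; // too old, skip
    for (const name of article.authors) {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    }
  }

  // Most prolific authors first; ties broken alphabetically.
  return [...counts.entries()]
    .map(([name, count]) => ({ name, count }))
    .sort((a, b) => b.count - a.count || a.name.localeCompare(b.name));
}
```

Because the input is whatever the single fetch returned, the counts only cover the fetched articles, which is the limitation noted above.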
