metascraper-lambda

This project is a small API that resolves meta information from a website. It is meant to be deployed to AWS Lambda + API Gateway.

It is powered by the ClaudiaJS CLI and API Builder and by ianstormtaylor's metascraper library for NodeJS. Shout-out to both projects!

Why?

The project was created because we had trouble using the freely available metadata scraper APIs. Both opengraph.io and linkpreview.net had problems with their CORS settings, making it effectively impossible to call them from browser-side JavaScript.

It is meant to be deployed and forgotten. Beware, though, that the current version has no access control. You may want to secure your installation with custom CORS settings and/or an API key.
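One way to add the access control mentioned above is sketched here. It assumes the route is registered through a claudia-api-builder instance; the function name registerSecuredRoute and the handler argument are hypothetical and do not come from this project's code. It relies on API Builder's corsOrigin setting and on the apiKeyRequired route option.

```javascript
// Sketch only: `api` is expected to be a claudia-api-builder instance and
// `handler` stands in for the project's existing route handler (hypothetical).
function registerSecuredRoute(api, handler, allowedOrigin) {
  // Limit the Access-Control-Allow-Origin header to a single origin.
  api.corsOrigin(allowedOrigin);
  // Ask API Gateway to require an API key for this route.
  api.post('/metascraper', handler, { apiKeyRequired: true });
  return api;
}
```

With apiKeyRequired set, the key itself still has to be created in API Gateway and attached to a usage plan, and clients have to send it in the x-api-key header.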

Dependencies

All command line tools require NodeJS and npm. I tested it with NodeJS v6 LTS and npm v3.10.10.

For deployment, you need the package claudia:

npm install -g claudia

For development, you also need to install all dependencies:

npm install

For testing, you need the jasmine CLI:

npm install -g jasmine

First deployment

The deployment is done using the ClaudiaJS CLI. I created an npm run script for that purpose, so you can either run that (npm run deploy) or, if you want to set more options, run claudia create --region eu-central-1 --api-module api with your custom options.

Note that in either case, you need to have your AWS credentials configured.

After the deployment, ClaudiaJS will output a URL similar to this:

https://something.execute-api.eu-central-1.amazonaws.com/latest

Using the deployed API

You can then use your newly created API by appending /metascraper?url=https://google.com to that URL and sending a POST request (https://google.com stands for the address whose metadata you want to scrape).

CURL:

curl -X POST "https://something.execute-api.eu-central-1.amazonaws.com/latest/metascraper?url=https://github.com"

jQuery:

See the full example in this Codepen.

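// Address whose metadata should be scraped
var url = "https://github.com";
// encodeURIComponent keeps characters such as & or # in the target from breaking the query string
$.post("https://something.execute-api.eu-central-1.amazonaws.com/latest/metascraper?url=" + encodeURIComponent(url), function(data, status){
    alert("Data: " + JSON.stringify(data) + "\nStatus: " + status);
});

The URL you want to scrape should be URL-encoded, especially if it contains a query string of its own.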
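If you do not use jQuery, the same request can be sketched with the standard URLSearchParams and fetch APIs. This is a minimal sketch: the helper names buildMetascraperUrl and scrapeMetadata are made up for illustration, and it assumes the API answers with JSON, as the jQuery example suggests.

```javascript
// Build the request URL; URLSearchParams percent-encodes the target address.
function buildMetascraperUrl(apiBase, targetUrl) {
  const query = new URLSearchParams({ url: targetUrl });
  // Drop trailing slashes from the base so the path is joined cleanly.
  return apiBase.replace(/\/+$/, '') + '/metascraper?' + query.toString();
}

// POST to the deployed API and parse the body (assumed to be JSON).
async function scrapeMetadata(apiBase, targetUrl) {
  const response = await fetch(buildMetascraperUrl(apiBase, targetUrl), { method: 'POST' });
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response.json();
}
```

Calling scrapeMetadata("https://something.execute-api.eu-central-1.amazonaws.com/latest", "https://github.com") returns a promise for the parsed response.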

Redeploy

If you modify app.js or update this package, you may want to redeploy by running either

npm run update or

claudia update

Test

There is a minimal set of test cases included in this project's spec directory. The test runner is jasmine, which is also the command used to run the tests. Useful documentation for testing ClaudiaJS API Builder apps can be found here.

Other stuff

This package is deliberately minimal, but ClaudiaJS supports many more customization options. See the ClaudiaJS site for details on what else you can configure.
