

Chrome Extension manifest.json Dataset (>100k Extensions)

This repository contains >100k manifest.json files for extensions hosted in the Chrome Web Store. They were collected by scraping the Chrome Web Store. Some metadata has been added to the manifests as front matter to provide context, e.g. extension name, publisher, rating and user count.

Note that the scraping approach changed with the 2023-06-01 and 2023-11-29 snapshots. With the 2023-06-01 snapshot, the number of manifests increased from 10k to >50k, and with the 2023-11-29 snapshot to >100k. The latter also changed the metadata in several ways: user counts above 10,000,000 are possible, release dates use ISO format, the slug field is gone, and the category is indicated only by the category_slug field, without the human-readable category field.

This repository was inspired by a similar one created by @IAmMandatory. Snapshots exist for several points in time, but I cannot promise any future updates. The dataset is meant to be useful for analyzing the Chrome extension ecosystem, for example which permissions are requested or which Content Security Policies are common.

Querying the dataset

The repository contains a query.js script for running queries against the dataset. To run the script you will need Node.js 16 or higher. Before using the script for the first time, run the npm install command in this directory to install dependencies.

The script uses matchme queries and lists matching extensions. You can pass a manifest query and optionally a metadata query on the command line:

query.js [-m metadata-query] manifest-query

Examples:

# List all Manifest V3 extensions
query.js "manifest_version == 3"
# List all Manifest V3 extensions with at least 10,000 users
query.js -m "user_count >= 10000" "manifest_version == 3"
# List all extensions whose Content Security Policy contains 'unsafe-eval'
query.js "content_security_policy =? /unsafe-eval/i || content_security_policy.extension_pages =? /unsafe-eval/i"
# List all extensions with fewer than 1,000 users using activeTab permission
query.js -m "user_count < 1000" "permissions =? /activeTab/i"
# List all extensions requesting permissions for all websites (<all_urls>,
# *://*/* or https://*/* permissions)
query.js "permissions =? /(<all_urls>|\*:\/\/\*\/\*|https:\/\/\*\/\*)/i || host_permissions =? /(<all_urls>|\*:\/\/\*\/\*|https:\/\/\*\/\*)/i"
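To make the last query more concrete, here is a minimal sketch of the same check in plain JavaScript, applied to a manifest that has already been parsed into an object. The `requestsAllSites` helper is hypothetical and not part of query.js; it only mirrors the regular expression from the query and assumes that `permissions` and `host_permissions`, when present, are arrays of strings.

```javascript
// Hypothetical helper, not part of query.js: checks whether a parsed
// manifest object requests access to all websites.
const ALL_SITES = /(<all_urls>|\*:\/\/\*\/\*|https:\/\/\*\/\*)/i;

function requestsAllSites(manifest) {
  // Collect entries from both fields, mirroring the two halves of the query
  // (permissions || host_permissions). Missing or non-array fields are skipped.
  const entries = [
    ...(Array.isArray(manifest.permissions) ? manifest.permissions : []),
    ...(Array.isArray(manifest.host_permissions) ? manifest.host_permissions : []),
  ];
  // The regular expression has no "g" flag, so test() keeps no state between calls.
  return entries.some(
    (entry) => typeof entry === "string" && ALL_SITES.test(entry)
  );
}
```

Like the query, the pattern is not anchored, so it also matches entries that merely contain one of the three patterns.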

Results example:

$ query.js -m "user_count >= 10000000" "content_security_policy =? /unsafe-eval/i"
aapbdbdomjkkjkaonfhkkikfgjllcleb Google Translate 34000000
hdokiejnpimakedhajhdlcegeplioahd LastPass: Free Password Manager 10000000
nkbihfbeogaeaoehlefnkodbefgpgknn MetaMask 15000000
Matched 3 out of 30 manifests (10.00%).
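If you want to post-process these results, the following sketch parses one output line into an object. It is a hypothetical helper, not part of query.js, and it assumes the format shown above: a 32-character extension ID, the extension name, then the user count, separated by whitespace.

```javascript
// Hypothetical parser for the result format shown above; not part of query.js.
function parseResultLine(line) {
  // Chrome extension IDs consist of 32 letters in the range a-p.
  // The name may contain spaces, so it is everything between the ID
  // and the trailing number.
  const match = /^([a-p]{32})\s+(.+?)\s+(\d+)$/.exec(line.trim());
  if (!match) {
    return null; // e.g. the "Matched ... manifests" summary line
  }
  return { id: match[1], name: match[2], userCount: Number(match[3]) };
}
```

Lines that do not fit this shape, such as the final summary line, yield `null` and can be filtered out.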

