Summer of Code @ Software Heritage

Report for Google Summer of Code '22 Project @ Software Heritage

Project Details
Initial Proposal: Mine Information from Archived Content
Repository: swh-indexer
Mentors: Stefano Zacchiroli, Valentin Lorentz, and Kumar Shivendu
Contributions: swh-indexer
Duration: 3 months (13-06-2022 to 12-09-2022)

About the SWH Project

Software Heritage is a far-reaching open-source and research project that collects and preserves software source code. As part of this effort, the Software Heritage indexer extracts metadata, that is, information that describes the source code, from source code repositories. This metadata ranges from simple details (e.g. the project name or hosting place) to more substantial information such as the entity behind the project and its license.

Contributions

The search feature of Software Heritage's universal archive of software source code supports searching by URL or by package metadata. As part of GSoC '22, I added metadata mappings for Packagist (composer.json), NuGet (.nuspec), and Pub (pubspec.yaml, the Dart package manager) packages. I am also currently working on a mapping for CocoaPods (.podspec) packages. Please find all my contributions here. Here is a summary:

| Title | Diff | Related Task | No. of Packages |
| --- | --- | --- | --- |
| Indexer for Packagist (composer.json) | D8047 | T4357 | 386k |
| Metadata Indexer for Pub (pubspec.yaml) | D8079 | T4376 | 34.6k |
| Add NuGet Mapping (*.nuspec) | D8144 | T4392 | 397k |

In total, these three mappings cover more than 800k packages (roughly 818k).
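
As a rough illustration of what such a mapping involves, the sketch below translates a few composer.json fields into a flat metadata dictionary. It is not the code from swh-indexer: the field table `COMPOSER_TO_CODEMETA`, the function `translate_composer`, and the CodeMeta-style target keys are all assumptions made for this example.

```python
import json

# Hypothetical field table: composer.json key -> CodeMeta-style term.
# Illustrative only; this is not the table used by swh-indexer.
COMPOSER_TO_CODEMETA = {
    "name": "name",
    "description": "description",
    "version": "version",
    "keywords": "keywords",
    "homepage": "url",
    "license": "license",
}


def translate_composer(raw: str) -> dict:
    """Translate the text of a composer.json file into a flat metadata dict."""
    data = json.loads(raw)
    result = {}
    # Copy over every known field that is present in the file.
    for source_key, target_key in COMPOSER_TO_CODEMETA.items():
        if source_key in data:
            result[target_key] = data[source_key]
    # composer.json lists authors as objects with an optional "name" key.
    authors = [
        author["name"]
        for author in data.get("authors", [])
        if isinstance(author, dict) and "name" in author
    ]
    if authors:
        result["author"] = authors
    return result


example = """{
  "name": "acme/demo",
  "description": "Example package",
  "license": "MIT",
  "homepage": "https://example.org",
  "authors": [{"name": "Jane Doe", "email": "jane@example.org"}]
}"""
metadata = translate_composer(example)
print(metadata)
```

Fields missing from the file (here, `version` and `keywords`) are simply left out of the result, so a sparse composer.json still yields a valid, smaller record.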

Future Work

I am excited to keep contributing to Software Heritage and to open source. Software Heritage is on an important mission, and I am privileged to be a part of it. Here are some directions for future work on this project:

  • Write a metadata indexer for CocoaPods packages (*.podspec) (Related Task: T4437)
  • Extend the coverage of supported metadata to all Libraries.io-indexed package managers
  • Possibly use Bibliothecary to extract package metadata

Alongside the coding work, I wrote two blog posts summarizing what I learned and the state of my project at the time. My mentors were kind enough to review them before they were published.

Overall, it was a wonderful experience working with knowledgeable mentors and learning from them. I look forward to continuing to learn with them.
