An application for searching the General Comments/Recommendations adopted by the UN Treaty Bodies (by keyword, concerned group, and custom label) and analyzing the search results.

Home Page: https://lszoszk.pythonanywhere.com/

License: MIT License


UN Treaty Bodies Search and Analysis App

This Flask application, also available at lszoszk.pythonanywhere.com, searches and analyzes a collection of the General Comments/Recommendations adopted by the UN Treaty Bodies. It offers keyword search, filtering by concerned group, collocation analysis, and export of search results to Excel.

Description

The app processes JSON data and supports paragraph-level search through the General Comments/Recommendations by keyword, concerned groups/persons label, and Treaty Body. Its text analysis pipeline uses NLTK for tokenization, term frequencies, bigram extraction, and custom stopword processing. The application also provides a search-within-search function, which allows further filtering of search results.
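The analysis steps named above (tokenization, stopword removal, term frequencies, bigram extraction) can be illustrated with a minimal sketch. This is not the app's code: the app uses NLTK, while this stand-in uses only the Python standard library so that it runs without downloads. The names `CUSTOM_STOPWORDS`, `tokenize`, and `analyze` are hypothetical, and the stopword list is a made-up sample rather than the app's UN-related list.

```python
import re
from collections import Counter

# Hypothetical sample list; the app's actual custom stopword list is not reproduced here.
CUSTOM_STOPWORDS = {
    "the", "of", "to", "and", "a", "in",
    "shall", "states", "parties", "committee",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def analyze(paragraphs, stopwords=CUSTOM_STOPWORDS, top_n=5):
    """Return the top_n most frequent terms and bigrams across the paragraphs."""
    term_counts = Counter()
    bigram_counts = Counter()
    for paragraph in paragraphs:
        # Stopwords are dropped first, so bigrams pair neighbouring content words.
        tokens = [t for t in tokenize(paragraph) if t not in stopwords]
        term_counts.update(tokens)
        # Pair each token with its successor within the same paragraph.
        bigram_counts.update(zip(tokens, tokens[1:]))
    return term_counts.most_common(top_n), bigram_counts.most_common(top_n)


if __name__ == "__main__":
    sample = [
        "States parties shall ensure the best interests of the child.",
        "The best interests of the child shall be a primary consideration.",
    ]
    terms, bigrams = analyze(sample)
    print(terms)
    print(bigrams)
```

Counting bigrams per paragraph, rather than over one concatenated token stream, avoids pairing the last word of one paragraph with the first word of the next.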

Getting Started

Dependencies

  • Python 3.6+
  • Flask
  • Pandas
  • NLTK
  • BeautifulSoup
  • GC-info.json file for the app's document metadata

Installation

  1. Clone the repository:
    git clone [URL of this repository]
    
  2. Navigate to the project directory:
    cd [project_name]
    

Executing Program

  1. Install the required Python packages:
    pip install -r requirements.txt
    
  2. Run the Flask application:
    python app.py
    
  3. Access the application through a web browser at localhost:5000.

Features

  • Advanced Search: Filter relevant paragraphs from the documents by keyword, by concerned groups/persons (e.g., children, women, indigenous peoples), and by UN Treaty Body (e.g., Committee on the Rights of the Child, Committee on Economic, Social and Cultural Rights).
  • Text Analysis: Text processing with NLTK, covering word frequencies, bigram analysis, a custom UN-related stopword list, and search within search results.
  • Custom Labels and Stopwords: Define and use custom labels (e.g., concerned groups, human rights issues) and custom stopwords for text analysis.
  • Interactive Results: Highlights search terms and displays results interactively.
  • Data Export: Export search results to Excel format for further analysis.
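The search, search-within-search, and highlighting features can be sketched in a few lines. This is an illustration under stated assumptions, not the app's implementation: the record fields (`committee`, `labels`, `text`), the sample data, and the functions `search` and `highlight` are all hypothetical, and the label filter is assumed to match when a paragraph carries any of the requested labels.

```python
import html
import re

# Hypothetical records; the field names in the app's JSON data may differ.
PARAGRAPHS = [
    {"committee": "CRC", "labels": ["children"],
     "text": "Every child has the right to education."},
    {"committee": "CESCR", "labels": ["women", "children"],
     "text": "Education shall be accessible to all."},
    {"committee": "CESCR", "labels": ["indigenous peoples"],
     "text": "Indigenous peoples have the right to adequate housing."},
]


def search(records, keyword="", labels=(), committees=()):
    """Keep records that match the keyword and every non-empty filter."""
    needle = keyword.lower()
    results = []
    for record in records:
        if needle and needle not in record["text"].lower():
            continue
        # Assumption: a record passes if it has at least one requested label.
        if labels and not set(labels) & set(record["labels"]):
            continue
        if committees and record["committee"] not in committees:
            continue
        results.append(record)
    return results


def highlight(text, keyword):
    """Escape text for HTML and wrap case-insensitive keyword matches in <mark>."""
    escaped = html.escape(text)
    if not keyword:
        return escaped
    # Escape the keyword the same way so it can match inside the escaped text.
    pattern = re.compile(re.escape(html.escape(keyword)), re.IGNORECASE)
    return pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", escaped)


if __name__ == "__main__":
    first_pass = search(PARAGRAPHS, keyword="education")
    # Search within search: run a second query over the first result set.
    narrowed = search(first_pass, keyword="child")
    for record in narrowed:
        print(record["committee"], highlight(record["text"], "education"))
```

Because `search` takes any list of records, narrowing a result set needs no extra machinery: the second query simply receives the output of the first.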

Screenshots

search.png Main page with search functionality.

search_results.png Search results. You can visit the source document (OHCHR website) and copy a result to the clipboard with an automatically generated reference.

analytical_dashboard.png Analytical dashboard. Insert a query in "Narrow your search" to run an additional, dynamic search within your search results.

dark_mode.png Dark mode of the application.

Help

If you encounter any issues, please check if all dependencies are correctly installed and the GC-info.json file is properly formatted and located in the root directory of the project.
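A quick way to rule out a missing or malformed GC-info.json is to try parsing it. The sketch below checks only that the file exists and is valid JSON. It does not validate the fields the app expects, since that schema is not described here. The function name `check_json_file` is hypothetical, and UTF-8 encoding is an assumption.

```python
import json
from pathlib import Path


def check_json_file(path="GC-info.json"):
    """Return (ok, message) saying whether the file exists and parses as JSON."""
    file_path = Path(path)
    if not file_path.is_file():
        return False, f"{file_path} not found"
    try:
        # Assumption: the file is UTF-8 encoded.
        with file_path.open(encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as error:
        return False, (
            f"{file_path} is not valid JSON: "
            f"line {error.lineno}, column {error.colno}: {error.msg}"
        )
    return True, f"{file_path} parsed successfully"


if __name__ == "__main__":
    ok, message = check_json_file()
    print(message)
```

Run it from the project root so that the default relative path resolves to the file the app reads.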

Authors

Łukasz Szoszkiewicz

E-mail: [email protected]

Zuzanna Kowalska

E-mail: [email protected]

Version History

  • 0.1
    • Initial Release (8 January 2024) - includes General Comments adopted by the Committee on the Rights of the Child and the Committee on Economic, Social and Cultural Rights.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
