
Universal Semantic Annotator (USeA)


This is the official repository for the paper Universal Semantic Annotator: the First Unified API for WSD, SRL and Semantic Parsing, which will be presented at LREC 2022 by Riccardo Orlando, Simone Conia, Stefano Faralli, and Roberto Navigli.

Cite this work

If you use USeA or any part of this work, please consider citing the paper as follows:

@inproceedings{orlando-etal-2022-usea,
    title      = "{U}niversal {S}emantic {A}nnotator: the First Unified {API} for {WSD}, {SRL} and {S}emantic {P}arsing",
    author     = "Orlando, Riccardo and Conia, Simone and Faralli, Stefano and Navigli, Roberto",
    booktitle  = "Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022)",
    month      = jun,
    year       = "2022",
    address    = "Marseille, France",
    publisher  = "European Language Resources Association"
}

Abstract

In this paper, we present the Universal Semantic Annotator (USeA), which offers the first unified API for high-quality automatic annotations of texts in 100 languages through state-of-the-art systems for Word Sense Disambiguation, Semantic Role Labeling and Semantic Parsing. Together, such annotations can be used to provide users with rich and diverse semantic information, help second-language learners, and allow researchers to integrate explicit semantic knowledge into downstream tasks and real-world applications.


Description

Universal Semantic Annotator (USeA) is the first unified API for three primary tasks in Natural Language Understanding (NLU):

  • Word Sense Disambiguation (WSD): the task of assigning the most appropriate sense to a word in context;
  • Semantic Role Labeling (SRL): the task of extracting the predicate-argument structures within a sentence;
  • Semantic Parsing (Abstract Meaning Representation, AMR): the task of representing a text in a structured semantic graph.

The main motivations behind USeA are manifold: i) the creation of an easy-to-use tool and service for the automatic annotation of explicit semantic knowledge in 100 languages, ii) enabling the use of explicit semantics in multilingual and cross-lingual real-world applications, iii) the democratization of state-of-the-art systems that would otherwise require expert knowledge of the field for their development and implementation, and last but not least iv) fostering further research in NLU and other fields on the interplay between semantics and other modalities, e.g., computer vision, speech recognition, video understanding.


Image Description

This repo is for the Docker image of the USeA service proxy. This image takes care of:

  1. Receiving an HTTP/S request to process an input text;
  2. Sending task-specific requests to task-specific endpoints (preprocessing, WSD, SRL, AMR parsing);
  3. Processing and merging the results from the task-specific endpoints;
  4. Returning all the annotations.

NOTE: This image does not perform any preprocessing or annotation. For these tasks, please refer to usea-preprocessing, usea-wsd, usea-srl, and usea-amr (soon to be released).
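The request flow above can be sketched in Python. The endpoint URLs, payload shapes, and the `post_json` helper below are hypothetical illustrations of the four steps, not the actual USeA internals:

```python
import json
from urllib import request

# Hypothetical module endpoints -- the real proxy reads these from
# environment variables (see "Specifying your own endpoints").
ENDPOINTS = {
    "preprocessing": "http://localhost:22001/process",
    "wsd": "http://localhost:22002/process",
    "srl": "http://localhost:22003/process",
    "amr": "http://localhost:22004/process",
}

def post_json(url, payload):
    # Minimal JSON POST using only the standard library.
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def annotate(text, post=post_json):
    # Step 2: run preprocessing first; the other modules consume its output.
    preprocessed = post(ENDPOINTS["preprocessing"], {"type": "text", "content": text})
    result = {"preprocessing": preprocessed}
    # Steps 2-3: query each task-specific endpoint and merge the results.
    for task in ("wsd", "srl", "amr"):
        result[task] = post(ENDPOINTS[task], preprocessed)
    # Step 4: return all the annotations in a single response.
    return result
```

The `post` parameter is injected only to make the sketch testable without live servers; the real image wires the endpoints through environment variables instead.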

How to use

How to start a usea-service container

Make sure you have installed Docker before proceeding, then run the following command:

docker run --name usea-service -p 22000:8000 sapienzanlp/usea-service

If you want to run the container in the background, simply use the flag -d as follows:

docker run -d --name usea-service -p 22000:8000 sapienzanlp/usea-service

If everything went well, the service will become available at localhost:22000/process. You can check that everything is fine with the following Python script:

import requests
import json

text = "La volpe veloce salta sopra il cane pigro."
response = requests.post(
    "http://localhost:22000/process", json={"type": "text", "content": text}
)
print(json.dumps(response.json(), indent=2))

Specifying your own endpoints

By default, the proxy image sends requests to our online servers. You can point the preprocessing, WSD, SRL, and AMR parsing endpoints to your own (local) instances by setting the corresponding environment variables, as follows:

BASE_HOST=https://nlp.uniroma1.it/usea
PREPROCESSING_ENDPOINT="$BASE_HOST"/preprocessing
WSD_ENDPOINT="$BASE_HOST"/wsd
SRL_ENDPOINT="$BASE_HOST"/srl
AMR_ENDPOINT="$BASE_HOST"/amr

docker run --name usea-service -p 22000:8000 \
    -e PREPROCESSING_ENDPOINT=$PREPROCESSING_ENDPOINT \
    -e WSD_ENDPOINT=$WSD_ENDPOINT \
    -e SRL_ENDPOINT=$SRL_ENDPOINT \
    -e AMR_ENDPOINT=$AMR_ENDPOINT \
    sapienzanlp/usea-service

Stopping the container

Simply run:

docker stop usea-service

If you also want to remove the container:

docker stop usea-service
docker rm usea-service

Deploy USeA locally

Using Docker Compose

USeA can be started using Docker Compose by simply running the following command:

docker-compose up -d

It will be available at localhost:22000/process. By default, it will run the CPU version of the images. Ports and individual module endpoints can be changed using the .env file.
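As an illustration, a .env file along these lines could remap the module endpoints; the values shown here are assumptions, so check the repository's .env for the actual keys and defaults:

```shell
# Hypothetical .env -- the actual variable names and defaults
# are defined in the repository's .env file.
PREPROCESSING_ENDPOINT=http://usea-preprocessing:8000/process
WSD_ENDPOINT=http://usea-wsd:8000/process
SRL_ENDPOINT=http://usea-srl:8000/process
AMR_ENDPOINT=http://usea-amr:8000/process
```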

GPU Support

If you want to use the GPU, you can combine one (or more) of the configuration files inside the docker-compose-files folder. For example, to deploy USeA with GPU support for usea-amr, run the following command:

docker-compose -f docker-compose.yaml -f docker-compose-files/docker-compose.amr.cuda.yaml up -d

See the Docker Compose documentation for more information about using multiple configuration files.

Without Docker Compose

If you don't want to use Docker Compose, you can manually pull the images from our Docker Hub and run them as usual.

Acknowledgements

The authors gratefully acknowledge the support of the European Language Grid project No. 825627 (Universal Semantic Annotator, USeA) under the European Union’s Horizon 2020 research and innovation programme.

License

This work is under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Contributors

c-simone, dfuchss, riccorl


Issues

Usea Preprocessing Container Crashes

I'm currently trying to run usea locally.
Unfortunately, the preprocessing container crashes:

                               __  __   _____          ___
                              / / / /  / ___/  ___    /   |
                             / / / /   \__ \  / _ \  / /| |
                            / /_/ /   ___/ / /  __/ / ___ |
                            \____/   /____/  \___/ /_/  |_|

Universal Semantic Annotator: A Unified API for Multilingual WSD, SRL and AMR annotations

                                Preprocessing Module

Killed

                                 Sapienza NLP group

Downloading resources...

[the container restarts and prints the banner again, then:]

Downloading resources...

Do you have any idea how to fix this?

Docker Image is missing exposed port

After looking at your Dockerfile for sapienzanlp/usea-service:1.0.2, I noticed that the image does not expose the API port (at least it is not declared).

I would suggest adding EXPOSE 8000 to your Dockerfile; then the port to use is clear.

Exception in WSD container

Dear developers,

I encountered the following error with a fully local deployment through docker-compose:
File "/elg/usea_service/app.py", line 165, in wsd_response_handler
for word in data["tokens"]:
└ {'detail': "Server Error: 'Word' object has no attribute 'start_char'"}

By entering the wsd container and looking through the Python code, the issue is located in /app/sapienzanlp/predictors/wsd.py, starting at line 305. The WsdWord dataclass is instantiated for each Word instance, and it appears that Word (and thus WsdWord, its subclass) no longer has the start_char/end_char attributes.

After removing the two parameters and restarting the container, the issue went away. This would be easy to fix in the next release of the WSD image on Docker Hub. The issue does not occur in the online demo API, so it must have been fixed there.
