[VLDB2024] Jointly Learning Representations for Heterogeneous Spatial-temporal Entities via Graph Contrastive Learning

This is a PyTorch implementation of HeterOgeneous Map Entity Graph Contrastive Learning (HOME-GCL) for generic road segment and land parcel representation learning, submitted to VLDB2024.

Abstract

The electronic map plays a crucial role in geographic information systems, serving various urban managerial scenarios and daily life services. Developing effective Spatial-Temporal Representation Learning (STRL) methods is crucial to extracting embedding information from electronic maps and converting map entities into representation vectors for downstream applications. However, existing STRL methods typically focus on one specific category of map entities, such as POIs, road segments, or land parcels, which is insufficient for real-world diverse map-based applications and might lose latent structural and semantic information interacting between entities of different types. Moreover, using representations generated by separate models for different map entities can introduce inconsistencies. Motivated by this, we propose a novel method named HOME-GCL for learning representations of multiple categories of map entities. Our approach utilizes a heterogeneous map entity graph (HOME graph) that integrates both road segments and land parcels into a unified framework. A HOME encoder with parcel-segment joint feature encoding and heterogeneous graph transformer is then deliberately designed to convert segments and parcels into representation vectors. Moreover, we introduce two types of contrastive learning tasks, namely intra-entity and inter-entity tasks, to train the encoder in a self-supervised manner. Extensive experiments on three large-scale datasets covering road segment-based, land parcel-based, and trajectory-based tasks demonstrate the superiority of our approach. To the best of our knowledge, HOME-GCL is the first attempt to jointly learn representations for road segments and land parcels using a unified model.

Requirements

Our code is based on Python version 3.9, PyTorch version 2.0.1, and torch-geometric version 2.3.1.

Please make sure you have installed Python, PyTorch, and torch-geometric correctly.

Then install the remaining dependencies with pip:

pip install -r requirements.txt
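To confirm that the environment matches the versions listed above, a small check along these lines may help. It is only a sketch: the distribution names `torch` and `torch-geometric` are assumed to be the names under which the packages were installed, and `check_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions stated in this README; the distribution names are an assumption.
EXPECTED = {"torch": "2.0.1", "torch-geometric": "2.3.1"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # A local build tag such as "2.0.1+cu118" still counts as a match.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={want} {status}")
```

A mismatch does not necessarily mean the code will fail; it only signals that the environment differs from the one the authors describe.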

Data

We conduct our experiments on three datasets: BJ (Beijing), CD (Chengdu), and XA (Xi'an).

You can download the pre-processed Chengdu and Xi'an datasets from this link or this link. The downloaded data need to be placed in the ./raw_data directory, forming the following structure: raw_data/cd/* and raw_data/xa/*.
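Before training, it can be useful to verify that the data landed in the expected place. The sketch below only checks that `raw_data/cd` and `raw_data/xa` exist and are non-empty; it does not know which files each dataset should contain, and `missing_datasets` is a hypothetical helper rather than something shipped with the code.

```python
from pathlib import Path


def missing_datasets(root="raw_data", names=("cd", "xa")):
    """Return the dataset names whose directory is absent or empty."""
    root = Path(root)
    missing = []
    for name in names:
        folder = root / name
        # A dataset counts as present only if its folder holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_datasets()
    if absent:
        print("Missing or empty dataset folders:", ", ".join(absent))
    else:
        print("raw_data/cd and raw_data/xa are in place.")
```

Run it from the repository root so that the relative path `raw_data` resolves correctly.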

Because the Beijing dataset is too large, it is provided separately; you can get it from this link here.

Train & Test

You can train HOME-GCL with the following command:

python run_model.py --dataset ${name}

Replace ${name} with cd or xa.
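If you want to launch training from a script rather than by hand, the command can be assembled as an argument list. This is a minimal sketch: `train_command` is a hypothetical helper, and it only accepts the two dataset keys mentioned above.

```python
import sys

# Dataset keys that this README names for run_model.py.
DATASETS = ("cd", "xa")


def train_command(dataset):
    """Build the argument list for `python run_model.py --dataset <dataset>`."""
    if dataset not in DATASETS:
        raise ValueError(f"expected one of {DATASETS}, got {dataset!r}")
    # sys.executable keeps the run inside the current Python environment.
    return [sys.executable, "run_model.py", "--dataset", dataset]
```

From the repository root, `subprocess.run(train_command("cd"), check=True)` would then start a Chengdu run and raise an error if the process exits with a non-zero status.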

An exp_id field is generated during the run to identify the experiment. Once training is complete, the representations generated by the model are automatically evaluated on the road segment-based and land parcel-based downstream tasks.

  • The pre-trained model will be stored at libcity/cache/{exp_id}/model_cache/{exp_id}_{model_name}_{dataset}.pt.
  • The road segment representations generated by the model will be stored at libcity/cache/{exp_id}/evaluate_cache/road_embedding_{model_name}_{dataset}_{dim}.npy.
  • The land parcel representations generated by the model will be stored at libcity/cache/{exp_id}/evaluate_cache/region_embedding_{model_name}_{dataset}_{dim}.npy.
  • The model's evaluation results for the road segment-based downstream task and the land parcel-based downstream task will be stored at libcity/cache/{exp_id}/evaluate_cache/{exp_id}_evaluate_{model_name}_{dataset}_{dim}.csv.
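The four path templates above can be filled in programmatically once the exp_id is known. The sketch below simply mirrors those templates; `output_paths` is a hypothetical helper, and the values passed for `model_name` and `dim` must match whatever the training run actually used, which this README does not specify.

```python
from pathlib import Path


def output_paths(exp_id, model_name, dataset, dim, cache_root="libcity/cache"):
    """Fill in the output path templates listed in this README."""
    base = Path(cache_root) / str(exp_id)
    eval_dir = base / "evaluate_cache"
    suffix = f"{model_name}_{dataset}_{dim}"
    return {
        "model": base / "model_cache" / f"{exp_id}_{model_name}_{dataset}.pt",
        "road_embedding": eval_dir / f"road_embedding_{suffix}.npy",
        "region_embedding": eval_dir / f"region_embedding_{suffix}.npy",
        "evaluation": eval_dir / f"{exp_id}_evaluate_{suffix}.csv",
    }
```

Since the embeddings are saved as .npy files, they can presumably be read with `numpy.load(paths["road_embedding"])`; the shape and row order of the resulting arrays are not documented here, so check them before using the vectors downstream.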

For trajectory-based downstream tasks, you can run the following command:

python traj_task.py --dataset ${name} --emb_id ${exp_id}

Replace ${name} with cd or xa.

Replace ${exp_id} with the exp_id generated by the training run above.

This run generates a new experiment ID, exp_id2, and the test results of the two trajectory tasks are saved at libcity/cache/{exp_id2}/evaluate_cache/*.csv.
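To inspect the saved results without opening each file, the CSVs in that directory can be collected in one pass. This sketch assumes each file starts with a header row, which the README does not state; `read_result_tables` is a hypothetical helper.

```python
import csv
from pathlib import Path


def read_result_tables(exp_id, cache_root="libcity/cache"):
    """Map each CSV file name under evaluate_cache to its rows (list of dicts)."""
    eval_dir = Path(cache_root) / str(exp_id) / "evaluate_cache"
    tables = {}
    for path in sorted(eval_dir.glob("*.csv")):
        # DictReader treats the first row as column names (an assumption here).
        with path.open(newline="") as handle:
            tables[path.name] = list(csv.DictReader(handle))
    return tables
```

Calling `read_result_tables(exp_id2)` from the repository root returns one entry per result file, with all values left as strings.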
