m-sena-backend's Introduction

Python 3.6 Torch 1.2 Flask 1.1.2 License

This project is the backend of the M-SENA Platform.

Installation

Docker

We provide a Docker image of the platform. See the main repo for instructions.

From Source

1. Clone this Repository

$ git clone https://github.com/iyuge2/M-SENA-Backend.git
$ cd M-SENA-Backend

2. Install Requirements

  • Install system requirements
$ apt install mysql-server default-libmysqlclient-dev libsndfile1 ffmpeg
  • Install Python requirements
$ conda create --name sena python=3.8
$ conda activate sena
$ pip install -r requirements.txt

3. Configure MySQL

  • Log in to MySQL as root
$ mysql -u root -p
  • Create a database for M-SENA
mysql> CREATE DATABASE sena;
  • Create a user for M-SENA and grant privileges
mysql> CREATE USER sena IDENTIFIED BY 'MyPassword';
mysql> GRANT ALL PRIVILEGES ON sena.* TO sena@`%`;
mysql> FLUSH PRIVILEGES;

4. Configs

  • Edit constants.py. Change DATASET_ROOT_DIR, DATASET_SERVER_IP, OPENFACE_FEATURE_PATH, MM_CODES_PATH, MODEL_TMP_SAVE, AL_CODES_PATH and LIVE_TMP_PATH to fit your setup.
  • Edit config.sh. Look for DATABASE_URL and change it to fit your database settings.
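
As a rough illustration of the DATABASE_URL setting, the sketch below assembles a connection URL from the MySQL user and database created in step 3. The `mysql://user:password@host:port/database` form is the common SQLAlchemy-style convention and is an assumption here; check config.sh for the exact scheme the project expects. `build_database_url` is a hypothetical helper, not part of this repository.

```python
from urllib.parse import quote_plus


def build_database_url(user, password, host, database, port=3306, scheme="mysql"):
    """Assemble a SQLAlchemy-style URL (hypothetical helper; the scheme is an assumption)."""
    # Percent-encode the credentials so characters such as '@' or '/' in a
    # password do not break URL parsing.
    return f"{scheme}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


if __name__ == "__main__":
    # Values mirror the example user and database from the MySQL step.
    print(build_database_url("sena", "MyPassword", "localhost", "sena"))
```

Encoding the credentials matters mostly for passwords with special characters; for the plain example password the URL is unchanged.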

5. Datasets

  • Download the datasets and place them under the DATASET_ROOT_DIR specified in constants.py.
  • Add an entry to the DATASET_ROOT_DIR/config.json file to register each new dataset.
  • Format the datasets with MM-Codes/data/DataPre.py:
$ python MM-Codes/data/DataPre.py --working_dir $PATH_TO_DATASET --openface2Path $PATH_TO_OPENFACE2_FeatureExtraction_TOOL --language cn/en
  • For datasets that need labeling, the config file is located in the AL-Codes directory.
  • The structure of the DATASET_ROOT_DIR directory is described in the Dataset Structure section below.

6. Run

$ source config.sh
$ flask run --host=0.0.0.0

Reference

Dataset Structure

The structure of the root dataset directory should look like this:

.
├── config.json
├── MOSEI
│   ├── label.csv
│   ├── Processed
│   └── Raw
├── MOSI
│   ├── label.csv
│   ├── Processed
│   └── Raw
└── SIMS
    ├── label.csv
    ├── Processed
    └── Raw
  • config.json: states the necessary information for all datasets, for example language, label_path and features. It is only read when scanning and updating datasets.
  • **/label.csv: stores detailed information for each video clip in the ** dataset, including video_id, clip_id, normal text, label value (Float), annotation (String) and mode (training attributes). In addition, a label_by field indicates the label type, which is required for labeling based on active learning.

  • **/Processed: holds the feature files. Processed features are stored with pickle in the following structure. These files are used by MM-Codes.
{
    "train": {
        "raw_text": [],
        "audio": [],
        "vision": [],
        "id": [], # [video_id$_$clip_id, ..., ...]
        "text": [],
        "text_bert": [],
        "audio_lengths": [],
        "vision_lengths": [],
        "annotations": [],
        "classification_labels": [], # Negative(< 0), Neutral(0), Positive(> 0)
        "regression_labels": []
    },
    "valid": {***}, # same as the "train"
    "test": {***}, # same as the "train"
}
  • **/Raw: holds the raw videos. The path of each clip should be consistent with label.csv.
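
The feature-file layout described under **/Processed can be sanity-checked with a few lines of Python. The sketch below is only an illustration: the helper names, the `features.pkl` file name and the sample id are hypothetical, and the string polarity names are an assumption, since the README gives only the sign rule and not how classification labels are encoded.

```python
import pickle
import tempfile
from pathlib import Path

# Keys listed for each split in the structure above.
EXPECTED_KEYS = {
    "raw_text", "audio", "vision", "id", "text", "text_bert",
    "audio_lengths", "vision_lengths", "annotations",
    "classification_labels", "regression_labels",
}
SPLITS = ("train", "valid", "test")


def missing_keys(features):
    """Return {split: sorted missing keys}; an absent split reports every key."""
    problems = {}
    for split in SPLITS:
        missing = EXPECTED_KEYS - set(features.get(split, {}))
        if missing:
            problems[split] = sorted(missing)
    return problems


def split_sample_id(sample_id):
    """Split 'video_id$_$clip_id' into (video_id, clip_id)."""
    video_id, clip_id = sample_id.rsplit("$_$", 1)
    return video_id, clip_id


def polarity(regression_label):
    """Apply the sign rule: Negative (< 0), Neutral (0), Positive (> 0)."""
    if regression_label < 0:
        return "Negative"
    if regression_label > 0:
        return "Positive"
    return "Neutral"


if __name__ == "__main__":
    # Build a tiny placeholder file with made-up values and read it back.
    sample = {split: {key: [] for key in EXPECTED_KEYS} for split in SPLITS}
    sample["train"]["id"].append("video_0001$_$0003")
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "features.pkl"  # hypothetical file name
        path.write_bytes(pickle.dumps(sample))
        loaded = pickle.loads(path.read_bytes())
    print(missing_keys(loaded))
    print(split_sample_id(loaded["train"]["id"][0]))
    print(polarity(-0.4))
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.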

We provide a download link for the preprocessed SIMS dataset (code: 4aa6, md5: 3befed5d2f6ea63a8402f5875ecb220d), which follows the above requirements. More datasets are available from CMU-MultimodalSDK.
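
Since an MD5 checksum is published for the SIMS download, the downloaded file can be verified before use. The following is a minimal sketch using Python's standard hashlib; the function names are illustrative, and the archive file name is not stated in this README, so pass whatever path the download produced.

```python
import hashlib

# Checksum quoted above for the preprocessed SIMS download.
SIMS_MD5 = "3befed5d2f6ea63a8402f5875ecb220d"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=SIMS_MD5):
    """Compare a file's MD5 digest with the expected hex string, ignoring case."""
    return file_md5(path) == expected.lower()
```

A mismatch usually means an incomplete or corrupted download, in which case downloading the file again is the simplest fix.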

Code Structure

The source code is organized as follows:

.
├── AL-Codes                # Active learning codes
├── MM-Codes                # MSA algorithm codes
├── app.py                  # Flask main codes
├── config.py               # Basic config
├── config.sh               # Basic config
├── constants.py            # Global variable definition
├── database.py             # Database definition & initialization
├── httpServer.py           # Dataset server (for video previews)
└── requirements.txt        # Python requirements
  • MM-Codes

[Figure: MSA code framework]

Based on MMSA, all model and dataset parameters are saved in MM-Codes/config.json.

  • AL-Codes

[Figure: code framework for labeling based on active learning]

Based on MMSA, all model and dataset parameters are saved in AL-Codes/config.json.

Contributors

flamesky-s, iyuge2

m-sena-backend's Issues

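Runtime problem

The db.create_all() statement in app_init_.py fails with "RuntimeError: Working outside of application context." Is the latest version runnable? I get this error when running it. Thanks for your answer.

Adding a dataset that contains only audio and text

   Hello, I am trying to use this platform to analyze my own dataset, but my dataset contains no video data. The configuration files include some audio-format settings, so I assume the platform can support this. However, from the examples in the current dataset demo alone I cannot work out how to import an audio dataset successfully. Does the platform support importing this kind of data? If so, are there examples of the relevant configuration files and data layout?
   I hope to hear back from you. Thank you very much.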
