Issue board | Neptune.ai
The basics of how to set up and run this project
Make sure Python 3.8 is installed. Then, from the project root:
Set up a Python virtual environment:
virtualenv --python=/usr/bin/python3.8 env && source env/bin/activate
Install the project and its dependencies:
pip install -r requirements.txt
./scripts/install-git-hooks.sh
Run `python src/main.py --help` to get a help message.
Run an experiment:
python src/main.py -e <experiment-title-without-spaces> "<description of the experiment>"
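As a rough illustration, the `-e` flag could be wired up with argparse along the lines below. This is a sketch only: the real `src/main.py` may parse its arguments differently, and the flag semantics are inferred from the usage line above.

```python
# Sketch only: hypothetical argparse wiring matching the usage shown above.
# The actual src/main.py may differ.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run project experiments")
    parser.add_argument(
        "-e", "--experiment",
        nargs=2,
        metavar=("TITLE", "DESCRIPTION"),
        help="experiment title (no spaces) followed by a quoted description",
    )
    return parser

args = build_parser().parse_args(
    ["-e", "lstm-baseline", "First LSTM baseline run"]
)
print(args.experiment)  # ['lstm-baseline', 'First LSTM baseline run']
```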
To run the tests, run from the project root:
mamba
The external API tests take a long time to run. To run only the fast local tests:
mamba -t unit,integration
Format the code with: black src spec
Commits should follow a consistent, structured convention. In addition to tagging commits with the issue number from Linear, a commit should always state its type (feature, experiment, test, and so forth).
This is done by prefixing the commit message. The following prefixes are used in this project:
| Prefix | Content |
|---|---|
| feat/ | A new feature has been added |
| test/ | The commit adds tests |
| exp/ | Identifies an experiment |
| ref/ | Refactoring / rewriting code |
| paper/ | Updates to the paper |
A commit message should start with the type prefix, followed by the MAS issue code (if any), and end with a description of what the commit does.
| Examples |
|---|
| feat/MAS-30/Add-functionality-for-saving-figures |
| test/Add-missing-tests-with-saving-figures |
Branches should follow the same naming convention as commits. The branch name is mostly for clarification; the commit messages matter more, since they form the log that will be read later.
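For illustration, the convention can be expressed as a small regular-expression check. This is a hypothetical helper, not part of the repository; the pattern is inferred from the examples above.

```python
import re

# Hypothetical commit-message check based on the convention above:
# <type>/<optional MAS-code>/<Dash-delimited-description>
COMMIT_RE = re.compile(
    r"^(feat|test|exp|ref|paper)/"   # type prefix
    r"(MAS-\d+/)?"                   # optional Linear issue code
    r"[A-Za-z0-9][A-Za-z0-9-]*$"     # dash-delimited description
)

def is_valid_commit_message(msg: str) -> bool:
    return COMMIT_RE.match(msg) is not None

print(is_valid_commit_message("feat/MAS-30/Add-functionality-for-saving-figures"))  # True
print(is_valid_commit_message("fixed stuff"))  # False
```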
Prerequisites:
- Set up an alias in your SSH config for your NTNU user (<username>@login.stud.ntnu.no):

```
Host idun
    HostName idun-login1.hpc.ntnu.no
    User <username>
```

- Clone this repo into <username>@login.stud.ntnu.no:~/Masteroppgave
- Set up SSH keys from IDUN to GitHub to allow pushing back to the repo
- Configure git config on IDUN to allow commits
- Create a .env file at <username>@login.stud.ntnu.no:~/Masteroppgave/.env with the contents described below
Running experiments:
- Write the code for the experiment
- Add or tune a .slurm file in batch_jobs/
- Run:
./scripts/execute_experiment_hpc.sh <job-file.slurm> <Job Name> <Job description>
## Basic batch job commands
- Start job:
sbatch <slurm_file>
- See jobs:
squeue -u <username>
- Cancel jobs:
scancel <job-id>
To execute code on the HPC cluster, the cluster expects a job configuration in the form of a ".slurm" file. Such files have already been created and are located in the "batch_jobs/" folder.
Before a job is submitted to the HPC cluster, update the values in the relevant *.slurm file, such as job_name, the mail address for receiving status information, etc.
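For reference, a minimal .slurm file could look roughly like the sketch below. Every value here (module name, time limit, paths) is a placeholder or assumption, not the actual contents of the files in batch_jobs/:

```
#!/bin/sh
# Hypothetical minimal batch job file; all values are placeholders.
#SBATCH --job-name=<job_name>
#SBATCH --mail-user=<email>
#SBATCH --mail-type=ALL
#SBATCH --time=02:00:00
#SBATCH --output=batch_jobs/output/sbatch_job.txt

module purge
module load <python-module>   # e.g. a Python 3.8 module available on IDUN
source env/bin/activate
python src/main.py -e <experiment-title> "<description of the experiment>"
```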
Create a .env file in the root folder to configure project-related environment variables:

```
# Email used to send HPC batch job status emails
EMAIL=<email>
USERNAME=<NTNU-username>
# Default: batch_jobs/output/sbatch_job.txt
LOG_FILE=<batch_job_log_file>
NEPTUNE_API_TOKEN=<api-token>
```
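If you need to read these values without extra dependencies, a tiny parser along the following lines works. This is a hypothetical helper shown for illustration; the project itself may load .env differently, for example with a library such as python-dotenv.

```python
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

# Example with a throwaway file:
Path("example.env").write_text("# comment\nEMAIL=user@example.com\nUSERNAME=user\n")
env = load_env("example.env")
print(env["EMAIL"])  # user@example.com
```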
```
├── .pylintrc          <- Python style guidance
├── README.md          <- The top-level README for developers using this project.
├── .env               <- Environment variables
│
├── batch_jobs         <- Batch .slurm files for executing workloads on the HPC cluster
│   └── output         <- Log output from batch jobs
│
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- Makes project pip installable (`pip install -e .`) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   ├── main.py        <- Main file for the project
│   ├── experiment.py  <- Experiment class
│   │
│   ├── data_types     <- Data classes, enums and data types.
│   │
│   ├── model_structures <- Scripts to train models and then use trained models to make
│   │   │                   predictions
│   │   ├── ...
│   │   └── ...
│   │
│   ├── pipelines      <- Data processing pipelines
│   │
│   ├── utils          <- Utilities and helper functions
│   │   ├── config_parser.py
│   │   ├── ...
│   │   └── logger.py
│   │
│   └── visualization  <- Scripts to create exploratory and results-oriented visualizations
│       └── ...
│
├── spec/              <- Specification tests for the project
└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org
```