OpenTargetInternship

Six-month internship at EMBL-EBI, in the data team of the Open Targets Platform.

Create Google Compute Engine instance

# Set parameters.
export INSTANCE_NAME=plip-interaction
export INSTANCE_ZONE=europe-west1-d
export INSTANCE_TYPE=n1-highcpu-64

# Create the instance.
gcloud compute instances create \
  ${INSTANCE_NAME} \
  --project=open-targets-eu-dev \
  --zone=${INSTANCE_ZONE} \
  --machine-type=${INSTANCE_TYPE} \
  --service-account=426265110888-compute@developer.gserviceaccount.com \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --create-disk=auto-delete=yes,boot=yes,device-name=${INSTANCE_NAME},image=projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20210927,mode=rw,size=2000,type=projects/open-targets-eu-dev/zones/${INSTANCE_ZONE}/diskTypes/pd-balanced

# The SSH command may fail for a short while after creation, until the instance is provisioned and configured.
gcloud compute ssh --zone ${INSTANCE_ZONE} ${INSTANCE_NAME}

# Use screen to avoid losing output if the connection drops. After reconnecting, restore the session with `screen -d -r`.
screen
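
Because the SSH command can fail while the instance is still being provisioned, a small retry loop may save some manual re-running. The following is a minimal sketch, not part of the original workflow: the `retry` function name and its arguments are hypothetical, and the attempt count and delay in the example are arbitrary.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
    attempt=$((attempt + 1))
    sleep "${delay}"
  done
}

# Example (not run here): probe the instance with a no-op remote command,
# then open the interactive session once the probe succeeds.
# retry 10 15 gcloud compute ssh --zone "${INSTANCE_ZONE}" "${INSTANCE_NAME}" --command=true
# gcloud compute ssh --zone "${INSTANCE_ZONE}" "${INSTANCE_NAME}"
```

Probing with `--command=true` rather than retrying the interactive session itself avoids reconnecting by accident when an interactive shell exits with a non-zero status.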

Configure environment and install dependencies (first time)

sudo apt update
# python3-openbabel has to be installed system-wide because of several errors in the current pip packaging:
# https://github.com/openbabel/openbabel/issues/2408
sudo apt install -y \
  openbabel \
  openjdk-11-jre-headless \
  python3-openbabel \
  python3-pip \
  python3-testresources \
  python3.8-venv

git clone -q https://github.com/MarineGirardey/OpenTargetInternship
cd OpenTargetInternship
python3 -m venv --system-site-packages venv
source venv/bin/activate
pip install --quiet --upgrade pip setuptools
# Manually mark openbabel as installed: it was installed system-wide above, and pip must not try to install the broken PyPI version.
# https://stackoverflow.com/questions/39403002/manually-set-package-as-installed-in-python-pip
touch venv/lib/python3.8/site-packages/openbabel-3.0.0-py3.8.egg-info
pip install --quiet --upgrade \
  dask \
  distributed \
  matplotlib \
  numpy \
  pandarallel \
  pandas \
  plip \
  pyspark \
  requests \
  git+https://github.com/PDBeurope/arpeggio
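
Since openbabel comes from the system packages while everything else comes from pip, it can be worth confirming that the key modules import cleanly before starting a long run. The following is a rough sketch under assumptions: `check_python_modules` and the `PYTHON_BIN` variable are hypothetical names, and the import names in the example are assumed to match the package names.

```shell
# Hypothetical helper: report which Python modules can be imported.
# PYTHON_BIN selects the interpreter (defaults to python3).
check_python_modules() {
  local python_bin="${PYTHON_BIN:-python3}"
  local failed=0
  local module
  for module in "$@"; do
    if "${python_bin}" -c "import ${module}" 2>/dev/null; then
      echo "ok       ${module}"
    else
      echo "MISSING  ${module}"
      failed=1
    fi
  done
  return "${failed}"
}

# Example (not run here), with the virtual environment active:
# check_python_modules openbabel plip pandas dask pyspark
```

The function returns a non-zero status if any module fails to import, so it can gate later steps with `&&`.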

Commands to reconnect to the machine and/or reactivate the environment

  • Reconnect: gcloud compute ssh --zone ${INSTANCE_ZONE} ${INSTANCE_NAME}
  • Restore previously created screen session: screen -d -r
  • Reactivate the environment: cd ~/OpenTargetInternship && source venv/bin/activate
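
The reconnect command relies on `INSTANCE_ZONE` and `INSTANCE_NAME`, which are only set in the shell where they were first exported. One possible way to keep them across local terminal sessions is to write them to a file and source it later. This is a sketch only: the `save_instance_env` function and the `~/.plip_instance.env` path are hypothetical, and the values are assumed to contain no quotes or whitespace.

```shell
# Hypothetical helper: write the instance parameters to a file that can be sourced later.
# Usage: save_instance_env PATH_TO_ENV_FILE
save_instance_env() {
  local env_file="$1"
  {
    printf "export INSTANCE_NAME='%s'\n" "${INSTANCE_NAME}"
    printf "export INSTANCE_ZONE='%s'\n" "${INSTANCE_ZONE}"
    printf "export INSTANCE_TYPE='%s'\n" "${INSTANCE_TYPE}"
  } > "${env_file}"
}

# Example (not run here): save once after setting the parameters...
# save_instance_env ~/.plip_instance.env
# ...then, in any new local shell:
# . ~/.plip_instance.env
# gcloud compute ssh --zone "${INSTANCE_ZONE}" "${INSTANCE_NAME}"
```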

Run the analysis

cd scripts
mkdir -p pdb
time python script_plip_interaction_mapping.py \
  --input_file structure_for_plip_human_structures.csv \
  --output_file output.csv \
  --log_file log.txt \
  --pdb_folder pdb
mkdir -p residue_gen_pos_output
time python residue_genomic_position_script.py \
  --plip_input gene_mapped_structures.json \
  --plip_output output.csv \
  --output_folder residue_gen_pos_output \
  --log_file genomic_position_log.txt
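
Both steps read input files from the `scripts` directory, so a quick pre-flight check can catch a missing file before a long run starts. The sketch below is an optional addition, not part of the original scripts; the `require_files` name is hypothetical.

```shell
# Hypothetical helper: fail if any of the given files is missing or empty.
# Usage: require_files FILE [FILE...]
require_files() {
  local missing=0
  local path
  for path in "$@"; do
    if [ ! -s "${path}" ]; then
      echo "Missing or empty input: ${path}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example (not run here), from inside the scripts directory:
# require_files structure_for_plip_human_structures.csv gene_mapped_structures.json
```

The same check could be repeated on `output.csv` between the two steps, since the second command takes it as `--plip_output`.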

Contributors

marinegirardey, tskir
