
About me

Intellectually curious Health Data Scientist passionate about the data revolution in the healthcare sector. I am particularly interested in creating innovative AI/ML solutions that deliver high value in the pharmaceutical/healthcare sector. After 6+ years of consultancy experience delivering complex data products and assisting large client organisations, I wish to expand my Data Science skills and knowledge. My goal is to pivot my career to continue as a Data Scientist in the healthcare industry.

To learn more about me, feel free to contact me or visit my LinkedIn.

Project summary

Main Portfolio Project

As part of my Health Data Science MSc dissertation at UCL, I built a Knowledge Graph Retrieval-Augmented Generation (KG-RAG) system that uses Large Language Models to efficiently interrogate and analyse a large collection of clinical trial protocols from ClinicalTrials.gov.

Key learnings:

  • Deploy open-source Large Language Models (LLMs), such as Llama 3 or Mixtral 8x7B, on a High-Performance Computing (HPC) cluster using vLLM.
  • Process semi-structured clinical trial protocols using a NoSQL database (MongoDB).
  • Create and host a Knowledge Graph using BioCypher and Neo4j AuraDB.
  • Implement a ReAct design using DSPy, creating custom tools that an LLM can use to query Knowledge Graphs and SQL databases.
  • Use high-level frameworks such as LlamaIndex and LangChain for text-to-SQL and text-to-Cypher.
  • How to evaluate Large Language Models.
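As a rough illustration of the ReAct pattern listed above, the sketch below alternates between a model choosing an action and a tool returning an observation until the model decides to answer. It is not the dissertation code and does not use DSPy. The `trials` table, the `NCT-DEMO-*` identifiers, the `scripted_llm` stand-in and the `react_loop` helper are all hypothetical, chosen so the example runs with only the Python standard library.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a hypothetical trials table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trials (nct_id TEXT, condition TEXT, phase TEXT)")
    conn.executemany(
        "INSERT INTO trials VALUES (?, ?, ?)",
        [
            ("NCT-DEMO-1", "asthma", "Phase 2"),   # made-up rows, not real trials
            ("NCT-DEMO-2", "asthma", "Phase 3"),
            ("NCT-DEMO-3", "diabetes", "Phase 3"),
        ],
    )
    return conn


def make_sql_tool(conn):
    """Return a tool that runs a SELECT query and returns the rows."""
    def run_sql(query):
        # Refuse anything that is not a SELECT statement.
        if not query.lstrip().lower().startswith("select"):
            return "error: only SELECT statements are allowed"
        return conn.execute(query).fetchall()
    return run_sql


def scripted_llm(history):
    """Stand-in for a real LLM: chooses the next step from the history."""
    if not any(step["kind"] == "observation" for step in history):
        # No tool result yet, so ask for the SQL tool to be called.
        return {
            "kind": "action",
            "tool": "run_sql",
            "input": "SELECT COUNT(*) FROM trials WHERE condition = 'asthma'",
        }
    rows = history[-1]["content"]  # the latest observation
    return {"kind": "finish", "content": f"There are {rows[0][0]} asthma trials."}


def react_loop(question, llm, tools, max_steps=5):
    """Alternate between model actions and tool observations until finished."""
    history = [{"kind": "question", "content": question}]
    for _ in range(max_steps):
        step = llm(history)
        if step["kind"] == "finish":
            return step["content"]
        observation = tools[step["tool"]](step["input"])
        history.append({"kind": "action", "content": step})
        history.append({"kind": "observation", "content": observation})
    return "No answer within the step limit."


conn = build_demo_db()
answer = react_loop(
    "How many asthma trials are there?",
    scripted_llm,
    {"run_sql": make_sql_tool(conn)},
)
print(answer)  # There are 2 asthma trials.
```

In a real system the scripted function would be replaced by a call to a hosted model, and the tool dictionary could hold a Cypher query tool alongside the SQL one. The step limit is there so the loop stops if the model never produces a final answer.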

Do you want to know more about this project?

Full Portfolio

Below is a summary of a few projects showcasing my Data Science skills.

Skill \ Technology | UCI Heart Disease | Card Fraud | Disaster Tweets | Causal Impact
Business question | Diagnose which patients have heart disease | Detect likely fraudulent transactions | Identify disaster events mentioned in text/tweets | Quantify the effect of COVID lockdown on stock price
Language | Python | Python | Python | Python / R
ML type | Classifier | Classifier | NLP Classifier | Time Series Regression
Data Engineering: PySpark
Feature Engineering: Time Series Features, Word Embedding
Over / Under sampling: SMOTE
Traditional ML: Sklearn, Sklearn, Causal Impact
Gradient Boosting: XGBoost, CatBoost
Deep Learning: LSTM, GRU, DistilBERT
Hyperparameter tuning: Optuna
Explainable ML: SHAP Values
User Interface: Streamlit
ML Ops: MLflow, MLflow

Volunteering

I worked with the NHS Pycom on the development of nhspy-plothedots, a package for Statistical Process Control analysis and plotting. My main contribution was writing unit test scripts. This gave me an opportunity to (a) learn more about the package so I can contribute in other areas in the future and (b) practise software development skills (e.g. unit testing, raising pull requests) that I have used in my professional career but that may not show up in my Data Science portfolio.
