
Comments (11)

GemmaTuron avatar GemmaTuron commented on July 22, 2024 2

Hi @IshitaPathak

Thanks for the explanations, much clearer now, and good job on doing a PCA as well!
Please move on to preparing your final application, many thanks!

from ersilia.

GemmaTuron avatar GemmaTuron commented on July 22, 2024 1

Hi @IshitaPathak
Please post an update here on the Week 2 tasks that you have marked as done, so we can provide feedback.


IshitaPathak avatar IshitaPathak commented on July 22, 2024 1

MY PROGRESS AND LEARNINGS

So far, I've learned valuable skills for contributing to Ersilia. It's been an exciting journey:

  • Learned Docker by Dockerizing a simple app (GitHub repo here), which covered:
    • Dockerfile
    • Caching layers
    • Publishing to Docker Hub.
  • Explored Docker Compose, understanding port mapping and managing environment variables.

I have a strong foundation in Python, but my exposure to its libraries was limited. To address this, I've invested some time in learning libraries such as Pandas and NumPy (GitHub repo here). By today, I aim to complete my understanding of Matplotlib and the other libraries essential for my current task. After that, I will move on to the next part of the Week 2 tasks.


GemmaTuron avatar GemmaTuron commented on July 22, 2024 1

Hi @IshitaPathak

Thanks for the explanation. I suggest the following timeline:

  • Finish week 2 tasks, including a good explanation of what you have done and your conclusions
  • Start working on your final application

As the application period is coming to an end and we want to ensure applicants have time to prepare strong applications, please do not tackle the Week 3 tasks and focus on the final application instead. Thanks!


IshitaPathak avatar IshitaPathak commented on July 22, 2024

Motivation Letter

Hi, I am Ishita Pathak, currently a first-year student pursuing a Master of Computer Application at Indira Gandhi Delhi Technical University For Women, Delhi, India. I am writing to express my genuine excitement about the opportunity to contribute to Ersilia's goal of ensuring that laboratories in less affluent countries have access to cutting-edge AI and ML tools for discovering drugs to treat infectious and neglected diseases.

As a computer science student, I have worked across various tech stacks. However, my current aspiration is to delve deeper into AI/ML, which is also part of my coursework, and Ersilia's project provides a chance to apply my skills and knowledge to real-world challenges. Being a quick learner, I'm ready to dedicate the time and effort needed to achieve these goals and to learn new things along the way.

Six years back, I went through a tough time when someone very close to me passed away because they couldn't get the medical help they needed in time. It really affected me and sparked a strong desire to make a difference in healthcare. I believe that contributing to Ersilia with my technical skills is the best way for me to do that. I am confident that I can contribute positively to advancing healthcare solutions and ultimately saving lives.

Why me?
My passion for open source and my never-give-up attitude set me apart from others. I’ve always felt that working in open source and helping others is my way of doing good for society, but through this project I’ll not only be able to give back to the community but also potentially help save lives. I am excited about the opportunity to work on this project and will work as hard as I need to in order to make it a grand success.

Thanks and Regards
Ishita Pathak


IshitaPathak avatar IshitaPathak commented on July 22, 2024

Week 1 TASK ✅

After installing the Ersilia Model Hub, I tested it with a simple model:

ersilia -v fetch eos3b5e
ersilia serve eos3b5e
ersilia -v api run -i "CCCC"

Output

Screenshot 2024-03-12 213405

Testing Ersilia with Docker

docker pull ersiliaos/eos4wt0:latest
ersilia serve eos4wt0 
ersilia -v api run -i "CCCC"

Output

Screenshot 2024-03-22 154436

While completing the task, I got stuck when testing the Ersilia model eos3b5e: the container was always in the exited status. I asked about this in the Slack channel, where a mentor helped me resolve the issue.

Screenshot 2024-03-12 210909

I truly appreciate the supportive environment within the community, where both mentors and peers are always ready to lend a helping hand.


IshitaPathak avatar IshitaPathak commented on July 22, 2024

Week 2 TASK ✅

  • Chose the hERG model "eos30gr" from the list of suggested models in GitBook
  • Read the publication to better understand the model.

Model Overview

The hERG channel is responsible for regulating the electrical signals in the heart. When certain drugs block this channel, it can cause a condition known as long QT syndrome, which can lead to dangerous heart rhythm abnormalities.

To identify which drugs might have this effect, a computer-based model called deephERG was developed; it is the model described in the publication and available in the Ersilia Model Hub as eos30gr. This model uses a type of artificial intelligence called deep neural networks to analyze large datasets containing information on thousands of chemicals. By studying the chemical structures and properties of these compounds, deephERG can predict their likelihood of blocking the hERG channel.

  • Ensured model functionality on my system by downloading, serving, and running it using the following commands:
ersilia -v fetch eos30gr
ersilia serve eos30gr
ersilia -v api run -i "CCCC" 

After fetching the eos30gr model, I consistently got null output for the SMILES prediction. Since the models are regularly updated, I ran ersilia -v fetch eos30gr --from_github to fetch the latest code from GitHub, which resolved the issue.

Output

Screenshot 2024-03-22 140337

  • Next, I studied the repository structure in the provided example and created a GitHub repository containing all the necessary files.


IshitaPathak avatar IshitaPathak commented on July 22, 2024

Thank you so much, @GemmaTuron, for the guidance and timeline. I'm committed to finishing the Week 2 tasks and starting work on my final application right away.


IshitaPathak avatar IshitaPathak commented on July 22, 2024
  • Selected the list of 1000 molecules, reference_library.csv, shared in Slack (data channel). To make sure the data was consistent, I standardized the SMILES representations using the function from src. RDKit could not parse three of the SMILES, which resulted in NaN values, so I removed those invalid entries from the dataset.

  • Next, I obtained the InChIKey representation for all the standardized SMILES. This information was used to create a DataFrame containing the processed SMILES and their corresponding InChIKeys. The DataFrame had two columns, "smiles" and "InChI_key". I then saved this processed data as a CSV file named processed_input.csv.
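The cleaning steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual code: `clean_smiles`, `write_processed`, `toy_standardise` and `toy_inchikey` are hypothetical names, and the two toy functions are stand-ins for the real standardisation function from src and for an InChIKey generator.

```python
import csv
from typing import Callable, Iterable, Optional


def clean_smiles(
    raw_smiles: Iterable[str],
    standardise: Callable[[str], Optional[str]],
    to_inchikey: Callable[[str], Optional[str]],
) -> list:
    """Standardise each SMILES, drop invalid entries, and attach an InChIKey."""
    rows = []
    for smi in raw_smiles:
        std = standardise(smi)  # assumed to return None for an invalid SMILES
        if std is None:
            continue
        key = to_inchikey(std)  # assumed to return None if no key can be built
        if key is None:
            continue
        rows.append({"smiles": std, "InChI_key": key})
    return rows


def write_processed(rows: list, path: str) -> None:
    """Save the cleaned rows with the two columns described above."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["smiles", "InChI_key"])
        writer.writeheader()
        writer.writerows(rows)


# Toy stand-ins so the sketch runs without any chemistry library (hypothetical).
def toy_standardise(smi: str) -> Optional[str]:
    smi = smi.strip()
    return smi if smi and smi.isalpha() else None


def toy_inchikey(smi: str) -> Optional[str]:
    return f"KEY-{smi}"


rows = clean_smiles(["CCCC", "not_a_smiles", " CCO "], toy_standardise, toy_inchikey)
print(rows)
```

With RDKit installed, one option for the real callables is `Chem.MolFromSmiles`, which returns `None` when a SMILES cannot be parsed, together with `Chem.MolToInchiKey` for the key.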

After cleaning the data and obtaining the corresponding InChIKeys, I ran the model on the processed dataset using the following commands:

ersilia -v fetch eos30gr --from_github
ersilia serve eos30gr
ersilia -v api run -i processed_input.csv -o output.csv

The output generated by the model is saved in the file output.csv.

  • I used the predictions from the Ersilia Model Hub to create plots showing how they are distributed:
[Figures: histogram, scatter plot]

The scatter plot shows significant overlap between the two classes, which makes it challenging to distinguish between them. This overlap suggests that the features used for classification may not be distinct enough. Without a clear separation between the classes, the model may struggle to differentiate hERG blockers from non-blockers accurately.
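A small numeric summary can complement the plots. The sketch below assumes the predictions have already been read from output.csv into a list of probabilities between 0 and 1; the function names, the ten equal-width bins and the 0.5 cutoff are illustrative assumptions, not taken from the model documentation.

```python
def bin_probabilities(probs, n_bins=10):
    """Count how many predictions fall into each equal-width bin on [0, 1]."""
    counts = [0] * n_bins
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # A value of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def split_by_threshold(probs, threshold=0.5):
    """Return (predicted blockers, predicted non-blockers) for a given cutoff."""
    blockers = sum(1 for p in probs if p >= threshold)
    return blockers, len(probs) - blockers


# Made-up example values, only to show the calls.
example = [0.05, 0.45, 0.55, 0.95, 1.0]
print(bin_probabilities(example))
print(split_by_threshold(example))
```

Many predictions in the bins close to the cutoff would be consistent with the overlap seen in the scatter plot.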

Completed Week 2 Task 1. The notebook for this task is 00_model_bias.ipynb.

WEEK2 TASK2

  • Selected Table6 from this repo (provided in the publication on page 32), where the authors took 1,824 FDA-approved small-molecule drugs from the DrugBank database. I standardised the SMILES and removed null and duplicate values.

  • I ran the model on the dataset using the following commands:

ersilia -v fetch eos30gr 
ersilia serve eos30gr
ersilia -v api run -i input_week2_task2.csv -o output_week2_task2.csv

  • Then I compared the results in the publication with those generated by the eos30gr model, to determine whether both sources produce similar results.
[Figures: LineChart_-vePredictiveProbability, BarChart_-vePredictiveProbability, LineChart_+vePredictiveProbability, BarChart_+vePredictiveProbability]

From the above graphs, it is clear that the results reported in the publication differ from those obtained from the Ersilia Model Hub. This inconsistency suggests that the eos30gr model may not be reproducible.
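The difference visible in the graphs can also be quantified per molecule. A minimal sketch, assuming the published and reproduced probabilities have been aligned into two lists in the same molecule order; `compare_predictions` and the 0.5 cutoff are hypothetical choices for illustration.

```python
def compare_predictions(published, reproduced, threshold=0.5):
    """Summarise how closely two aligned lists of probabilities agree."""
    if not published or len(published) != len(reproduced):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(published, reproduced))
    abs_diffs = [abs(p - r) for p, r in pairs]
    # A molecule "agrees" when both sources put it on the same side of the cutoff.
    same_label = sum((p >= threshold) == (r >= threshold) for p, r in pairs)
    return {
        "mean_abs_diff": sum(abs_diffs) / len(abs_diffs),
        "max_abs_diff": max(abs_diffs),
        "label_agreement": same_label / len(pairs),
    }


# Made-up values, only to show the call.
summary = compare_predictions([0.9, 0.2, 0.6, 0.4], [0.8, 0.1, 0.4, 0.3])
print(summary)
```

A label agreement close to 1 together with a small mean absolute difference would support reproducibility; lower values would back up the conclusion drawn from the graphs.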

Percentage of hERG Blockers and Non-Blockers in Publication Result:

Blockers             | Number | Percentage
Yes (hERG blockers)  | 513    | 29.79%
No (non-blockers)    | 1209   | 70.21%

Percentage of hERG Blockers and Non-Blockers After Testing from the Model:

Blockers             | Number | Percentage
Yes (hERG blockers)  | 411    | 23.87%
No (non-blockers)    | 1311   | 76.13%

These percentages also show a discrepancy between the publication results and those obtained from testing the model: the share of hERG blockers drops from 29.79% to 23.87%. This points to a problem with the reproducibility of the model, so I conclude that model eos30gr is not reproducible.
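For reference, the percentages in the two tables follow directly from the counts. The short sketch below reproduces the arithmetic; `class_percentages` is a hypothetical helper name.

```python
def class_percentages(n_blockers, n_non_blockers):
    """Return (blocker %, non-blocker %) rounded to two decimals."""
    total = n_blockers + n_non_blockers
    return (
        round(100 * n_blockers / total, 2),
        round(100 * n_non_blockers / total, 2),
    )


publication = class_percentages(513, 1209)  # counts from the publication table
model = class_percentages(411, 1311)        # counts from the model table

print(publication)
print(model)

# Net change in the number of molecules labelled as blockers.
net_change = 513 - 411
print(net_change, round(100 * net_change / (513 + 1209), 2))
```

Both tables cover 1722 molecules. The net change of 102 blockers means that at least 102 molecules received a different label, about 5.92% of the dataset; the true number of changed labels can be higher, because changes in opposite directions cancel out in the totals.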

Here is the link to the GitHub repository.

WEEK3 TASK

Selected a suitable dataset with sufficient experimental results, named external_dataset_Xaio_Li.csv, in the data folder.

Here is the reference for the data; I have taken the Li 1092 test data.

Screenshot 2024-04-03 003723


IshitaPathak avatar IshitaPathak commented on July 22, 2024

Thank you so much, @GemmaTuron. I really appreciate your time and feedback. I have started working on the final application.


IshitaPathak avatar IshitaPathak commented on July 22, 2024

WEEK 4 TASK ✅

  • Created the final application and received feedback from a mentor.
  • Submitted the final application on the Outreachy website.

