
crabnet's Introduction

Hi, I'm Anthony 👋

A passionate Data Scientist from Germany

  • 🔭 I'm currently working on my PhD at the Technische Universität Berlin

    • Topic keywords: deep learning, machine learning, data science and visualization, materials science
  • 🛠 I use the following Python tools:

    • AI/ML: PyTorch, TensorFlow, scikit-learn
    • Data wrangling: Pandas, NumPy, SQLite
    • Data visualization: Matplotlib, Seaborn, Plotly, Datashader
  • ☁️ I also have experience with cloud computing and HPC:

    • Colab / Jupyter / Kaggle notebooks
    • Grid.ai
    • Model training and deployment on SageMaker
    • High-performance computing at the TU Berlin
  • 🌱 I'm currently learning reinforcement learning and code golfing

  • 📄 More about me & contact info: https://anthonywang.de/

  • 💬 Ask me about our latest work, CrabNet, which makes accurate and inspectable predictions of materials properties based on the Transformer architecture!

crabnet's People

Contributors

anthony-wang · mahamadsalah74 · sgbaird


crabnet's Issues

about the predictor

I'm a beginner. May I ask whether your code can predict the yield strength of a material?

Reason to use pe_scaler and ple_scaler

Could you explain why you use the pe_scaler and ple_scaler in the forward pass of the Encoder class in kingcrab.py?
In particular, why did you choose the forms
pe_scaler = 2**(1-self.pos_scaler)**2
and
ple_scaler = 2**(1-self.pos_scaler_log)**2?
I don't really understand why one needs these two scalers (and also self.emb_scaler) in the first place, or why you chose the above exponential forms for them.
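
For context, here is a minimal sketch of how such learnable scalers could enter the forward pass. The parameter names come from this issue, but the surrounding module and the way the scaled terms are combined are assumptions, not the actual kingcrab.py code:

import torch
import torch.nn as nn

class EncoderSketch(nn.Module):
    # Hypothetical reconstruction for illustration only.
    def __init__(self):
        super().__init__()
        # Learnable scalars; at s = 1 each scaler is 2**((1-1)**2) = 1.
        self.emb_scaler = nn.Parameter(torch.tensor(1.0))
        self.pos_scaler = nn.Parameter(torch.tensor(1.0))
        self.pos_scaler_log = nn.Parameter(torch.tensor(1.0))

    def forward(self, x_emb, pe, ple):
        # ** is right-associative, so 2**(1-s)**2 parses as 2**((1-s)**2):
        # a smooth, always-positive gate that equals 1 when s == 1 and
        # grows as s moves away from 1 in either direction.
        pe_scaler = 2 ** (1 - self.pos_scaler) ** 2
        ple_scaler = 2 ** (1 - self.pos_scaler_log) ** 2
        # Assumed combination: scaled element embedding plus scaled
        # linear and log-scaled fractional encodings.
        return self.emb_scaler * x_emb + pe_scaler * pe + ple_scaler * ple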

fit() and predict() methods

Not necessarily critical, but it struck me that it might be good to have mdl.fit() and mdl.predict() methods wrapping what happens in train_crabnet.py, both to make it easier to "train once" and then reuse the model several times, and for consistency with similar packages. Then I realized there is a model.fit() call inside train_crabnet.py on model = Model(CrabNet(...), ...). If I want to "train once" and then reuse the model, how would you suggest going about that?
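
For reference, a rough sketch of the train-once / predict-many pattern I have in mind. The class names come from train_crabnet.py, but the import paths, constructor arguments, and helper calls below are guesses, not the real API:

# Hypothetical usage sketch -- names mirror train_crabnet.py,
# but the signatures here are assumptions.
from crabnet.kingcrab import CrabNet  # assumed import path
from crabnet.model import Model       # assumed import path

model = Model(CrabNet())              # constructor arguments elided

model.load_data("train.csv", train=True)  # assumed data-loading helper
model.fit()                               # train once...

for csv in ["batch1.csv", "batch2.csv"]:  # ...then reuse many times
    model.load_data(csv)
    preds = model.predict(model.data_loader)  # assumed signature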

saved models are not displayed well.

use_crabnet.py has a function "list_saved_models".

This function lists models, but the listed names aren't usable as-is when you copy them. We should clean this up so you can double-click a model name and copy it directly into the predict_crabnet() function.
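
One possible shape for the cleanup, sketched under the assumption that saved models live as .pth files in a single directory (the path below is hypothetical):

# Hypothetical sketch of a copy-paste-friendly listing.
from pathlib import Path

def list_saved_models(model_dir="models/trained_models"):
    # Print bare model names, one per line, with no decoration, so a
    # name can be double-clicked and pasted into predict_crabnet().
    names = sorted(p.stem for p in Path(model_dir).glob("*.pth"))
    for name in names:
        print(name)
    return names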

Parameters used for the results in the published work

Please correct me if I am wrong, but right now, when I run the sample dataset in example_materials_property, it runs for 40 epochs with three checks for early stopping. Was the same setting used to produce the results in the paper, or was it different? I could not find the exact details of the run parameters. Thank you!

CrabNet matbench data, possible mismatch between submission notebook and results

@MahamadSalah74 I ran the matbench notebook posted on the matbench GitHub page (materialsproject/matbench#23), and got somewhat higher MAEs for a few repeat runs (see below) compared to what was reported.

0.3683
0.3661
0.3658

The reported matbench result (0.3463) seems more in line with matbench_crabnet.py, which uses 300 epochs and the full train/val dataset for training. It's not a huge difference, but I'm trying to figure out where the discrepancy comes from, if there is one.

Maybe I'm missing something basic. Could you comment on this?

AttributeError: 'SWA' object has no attribute '_optimizer_step_pre_hooks'. Did you mean: '_optimizer_step_code'?

The PyTorch Optimizer class has changed in recent releases, which leads to the following error:

[...]
stepping every 16 training passes, cycling lr every 1 epochs
checkin at 2 epochs to match lr scheduler
Traceback (most recent call last):
  File "/home/pbenner/Source/pycoordinationnet-results/model_comparison/crabnet/eval.py", line 141, in <module>
    run_cv(X, y, f'eval-{task}-{target}.txt', n_splits)
  File "/home/pbenner/Source/pycoordinationnet-results/model_comparison/crabnet/eval.py", line 95, in run_cv
    model = train_model()
  File "/home/pbenner/Source/pycoordinationnet-results/model_comparison/crabnet/eval.py", line 62, in train_model
    model.fit(epochs=1000, losscurve=False)
  File "/home/pbenner/Source/pycoordinationnet-results/model_comparison/crabnet/model.py", line 228, in fit
    self.train()
  File "/home/pbenner/Source/pycoordinationnet-results/model_comparison/crabnet/model.py", line 140, in train
    self.optimizer.step()
  File "/home/pbenner/.local/opt/anaconda3/envs/crysfeat/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 69, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/pbenner/.local/opt/anaconda3/envs/crysfeat/lib/python3.10/site-packages/torch/optim/optimizer.py", line 271, in wrapper
    for pre_hook in chain(_global_optimizer_pre_hooks.values(), self._optimizer_step_pre_hooks.values()):
AttributeError: 'SWA' object has no attribute '_optimizer_step_pre_hooks'. Did you mean: '_optimizer_step_code'?

The following patch fixed the issue; it pre-populates the hook registries that the newer Optimizer.step() wrapper expects to find on the wrapped optimizer:

diff --git a/utils/optim.py b/utils/optim.py
index 33008dd..18224ea 100644
--- a/utils/optim.py
+++ b/utils/optim.py
@@ -1,6 +1,7 @@
-from collections import defaultdict
+from collections import defaultdict, OrderedDict
 from itertools import chain
 from torch.optim import Optimizer
+from typing import Callable, Dict
 import torch
 import warnings
 import numpy as np
@@ -116,6 +117,8 @@ class SWA(Optimizer):
         self.optimizer = optimizer
 
         self.defaults = self.optimizer.defaults
+        self._optimizer_step_pre_hooks: Dict[int, Callable] = OrderedDict()
+        self._optimizer_step_post_hooks: Dict[int, Callable] = OrderedDict()
         self.param_groups = self.optimizer.param_groups
         self.state = defaultdict(dict)
         self.opt_state = self.optimizer.state

attention-heads as samples from posterior distribution in a Bayesian sense

https://aclanthology.org/2020.emnlp-main.17.pdf

Though I think CrabNet might need to be refitted for new samples (i.e., if you specify N=10, then you only get 10 samples from the posterior; getting more would probably require refitting, and I'm not sure those would be directly comparable to the 10 from the first run). I'm also not exactly sure how this could be converted to individual predictions. Maybe just some basic plumbing in and after:

CrabNet/crabnet/kingcrab.py

Lines 151 to 157 in 9e0d79c

if self.attention:
    encoder_layer = nn.TransformerEncoderLayer(self.d_model,
                                               nhead=self.heads,
                                               dim_feedforward=2048,
                                               dropout=0.1)
    self.transformer_encoder = nn.TransformerEncoder(encoder_layer,
                                                     num_layers=self.N)
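
As a starting point for that plumbing, here is a sketch of pulling per-head attention maps out of a MultiheadAttention call directly (nn.TransformerEncoder discards the weights by default, and average_attn_weights needs a reasonably recent PyTorch; the dimensions below are placeholders):

import torch
import torch.nn as nn

# Hypothetical sketch: query a self-attention layer directly so each of
# the `heads` attention maps can be treated as one posterior "sample".
d_model, heads, seq_len, batch = 512, 4, 9, 2
attn = nn.MultiheadAttention(d_model, num_heads=heads, batch_first=True)
x = torch.randn(batch, seq_len, d_model)

_, attn_weights = attn(x, x, x, need_weights=True,
                       average_attn_weights=False)
# attn_weights: (batch, heads, seq_len, seq_len) -- one map per head.
print(attn_weights.shape)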

pip or conda install

How tough do you think it would be to publish CrabNet on one of the package managers?
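
For what it's worth, the Python side of a PyPI release could be as small as something like this (all metadata below is illustrative, not CrabNet's actual packaging):

# setup.py -- hypothetical packaging sketch, not CrabNet's real metadata.
from setuptools import setup, find_packages

setup(
    name="crabnet",          # placeholder; the name must be free on PyPI
    version="0.1.0",         # placeholder version
    packages=find_packages(),
    install_requires=[       # guessed from the README's tool list
        "torch",
        "numpy",
        "pandas",
        "scikit-learn",
    ],
    python_requires=">=3.7",
)

Once a PyPI package exists, a conda-forge recipe can typically be generated from it (e.g., with grayskull), so the pip side is the main lift.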

CrabNet matbench results - possibly neglecting 25% of the training data it could have used

@anthony-wang,

In the CrabNet matbench notebook, it does train/val/test splits. However, if #15 (comment) is correct and the validation data (i.e., val.csv) doesn't contribute to hyperparameter tuning, then that 25% of the training data is essentially getting thrown away, correct?

In other words, the CrabNet results are based on only 75% of the training data compared to what the other matbench models use for training. From what I understand, the train/val/test split in the context of matbench only really makes sense if you're doing hyperparameter optimization in a nested CV scheme, as follows:

[Figure: nested CV schematic. Source: https://hackingmaterials.lbl.gov/automatminer/advanced.html]

To correct this, I think all that needs to be done is change:

# split_train_val splits the training data into two sets: training and validation
def split_train_val(df):
    df = df.sample(frac=1.0, random_state=7)
    val_df = df.sample(frac=0.25, random_state=7)
    train_df = df.drop(val_df.index)

    return train_df, val_df

to

# split_train_val splits the training data into two sets: training and validation
def split_train_val(df):
    train_df = df.sample(frac=1.0, random_state=7)
    val_df = df.sample(frac=0.25, random_state=7)

    return train_df, val_df

This introduces data bleed between train_df and val_df, but val_df ends up being essentially just a dummy dataset so that CrabNet doesn't error out when a val.csv isn't available.

Sterling

Multiclass classification

Is there an easy way to modify CrabNet for a multi-class classification problem? I can see there is a built-in function for binary classification. Is there any way other than introducing my own custom function (say, one that uses CrossEntropyLoss) for the task?
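
In case it helps frame the question, here is a minimal sketch of what such a custom criterion could look like, keeping the same (output, log_std, target) signature as the repo's binary criterion; everything else here is an assumption:

import torch.nn as nn

# Hypothetical sketch: a drop-in multiclass criterion.
# output: (batch, n_classes) raw logits; target: (batch,) class indices.
def CrossEntropyLossCriterion(output, log_std, target):
    # log_std is accepted only for interface compatibility and is
    # ignored, mirroring the existing binary classification criterion.
    loss = nn.functional.cross_entropy(output, target.long())
    return loss

The model's output head would presumably also need to emit n_classes logits rather than a single value.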

the classification criterion doesn't factor in the uncertainty - does this mean the uncertainty should be ignored for classification?

i.e. log_std is an unused parameter in the classification criterion:

CrabNet/utils/utils.py

Lines 263 to 265 in a5be06f

def BCEWithLogitsLoss(output, log_std, target):
    loss = nn.functional.binary_cross_entropy_with_logits(output, target)
    return loss

If this is the case, should the uncertainty output from CrabNet be ignored by the user during classification? In other words, are the uncertainty values essentially just a bunch of random numbers for classification?
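
For contrast, this is roughly what a criterion that did use log_std could look like, following the loss-attenuation idea of Kendall & Gal (2017). This is a sketch of that general technique, not anything that exists in CrabNet:

import torch
import torch.nn as nn

def AttenuatedBCEWithLogitsLoss(output, log_std, target, n_samples=10):
    # Treat `output` as the mean of a Gaussian over logits and
    # exp(log_std) as its standard deviation, then average the BCE over
    # Monte Carlo samples so confident-but-wrong logits can be damped.
    # target is expected as float 0/1 labels, as with the BCE above.
    std = torch.exp(log_std)
    noise = torch.randn(n_samples, *output.shape, device=output.device)
    noisy_logits = output.unsqueeze(0) + std.unsqueeze(0) * noise
    losses = nn.functional.binary_cross_entropy_with_logits(
        noisy_logits,
        target.unsqueeze(0).expand_as(noisy_logits),
        reduction="none",
    )
    return losses.mean()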
