vectorinstitute / cyclops
Toolkit for evaluating and monitoring AI models in clinical settings
Home Page: https://vectorinstitute.github.io/cyclops/
License: Apache License 2.0
The scikit-learn model wrapper (SKModel) currently supports this behaviour using the decorator pattern: if an attribute or method is not found on the wrapper, the wrapped model is checked. This should be implemented for the PTModel class as well, so that an instance of the wrapper behaves like an instance of torch.nn.Module with additional functionality.
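A minimal sketch of that delegation, assuming a hypothetical wrapper that stores the module in self.model (not the actual PTModel internals):

import torch.nn as nn

class PTModel:
    """Hypothetical wrapper sketch: delegate unknown attributes to the wrapped module."""

    def __init__(self, model: nn.Module):
        self.model = model

    def __getattr__(self, name):
        # Only called when normal lookup fails on the wrapper,
        # so wrapper-specific attributes still take precedence.
        return getattr(self.model, name)

wrapper = PTModel(nn.Linear(4, 2))
print(wrapper.weight.shape)  # resolved on the wrapped nn.Module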
I'm currently attempting to build API docs using Sphinx for the evaluate package and running into some issues.
To reproduce the error:
git checkout add_api_docs_evaluate
cd docs
make html
The autosummary plugin generates two warnings indicating that API doc generation failed. The issue appears to be an ImportError tied to the medical_imagefolder module.
Since fairness metrics are a parity/ratio, it might be better to visualize them as a scatter plot instead of a bar plot. I think we did this before in https://github.com/VectorInstitute/cyclops/blob/main/nbs/explore_huggingface_datasets.ipynb.
To Do:
Update the tutorial to visualize fairness metrics with scatter plots, as sketched below.
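A minimal sketch of the idea, with made-up slice names and parity values (not actual cyclops output):

import plotly.graph_objects as go

# Hypothetical parity metrics per data slice, shown as a scatter plot
# with a reference line at 1.0 (perfect parity).
slices = ["age<40", "age>=40", "sex=F", "sex=M"]
parity = [0.92, 1.05, 0.88, 1.10]
fig = go.Figure(go.Scatter(x=slices, y=parity, mode="markers", name="parity"))
fig.add_hline(y=1.0, line_dash="dot")  # reference: perfect parity
fig.update_yaxes(title_text="Metric parity (ratio)")
fig.show()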
During training, models are currently being saved in current_working_directory/output/ModelClassName. It would be better to save them under versioned directories, such as path/to/model_directory/1 and path/to/model_directory/2.
Even though installing Cyclops in Kaggle Notebooks succeeds, importing it fails. This is probably due to a change in the PyArrow API that makes the version installed in the Kaggle environment incompatible with the version Cyclops requires.
After installing Cyclops, try the following in a Kaggle Notebook:
from cyclops.query import MIMICIIIQuerier
Currently, some of the drift_detection use cases import models from the https://github.com/VectorInstitute/cyclops/tree/main/cyclops/monitor/baseline_models directory. These imports should be removed and replaced with the model implementations from https://github.com/VectorInstitute/cyclops/tree/main/cyclops/models.
The data sub-package has important utility functions that need unit tests.
Task: Write good unit tests to increase coverage for the module to 100%.
(https://app.codecov.io/gh/VectorInstitute/cyclops/blob/main/cyclops/data/utils.py)
When logging model parameters using report.log_model_parameters(), it seems that only string values are captured in the report.
Running https://vectorinstitute.github.io/cyclops/api/tutorials/mimiciii/mortality_prediction.html exposes this bug.
I think we should be able to record numeric and boolean values as well.
So, for example, all the hyperparameters in the dict seen below should be captured when logging.
[Optional]
Some of these tend to be model hyperparameters, for example learning_rate, so I wonder if we should make a distinction.
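A minimal sketch of the idea (generic Python, not the actual report logging code): keep JSON-native types when serializing parameters and stringify only the rest.

import json

def serialize_params(params: dict) -> dict:
    """Hypothetical helper: preserve numeric/boolean values when logging."""
    out = {}
    for name, value in params.items():
        if isinstance(value, (bool, int, float, str)) or value is None:
            out[name] = value  # keep native JSON-compatible types
        else:
            out[name] = str(value)  # fall back to string for everything else
    return out

print(json.dumps(serialize_params({"learning_rate": 0.1, "verbose": True, "n_jobs": None})))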
Most hard metrics for classifier performance (F1, precision, recall, sensitivity, specificity, NPV) require a pre-specified classification threshold (usually 0.5). For some use cases, however, it is useful to see how the model performs across thresholds and what tradeoffs are available, for example to prioritize true positives or to minimize false positives.
One option is a runway plot, like the one shown below:
We can currently do this in the model transparency report by building a plotly object and logging it to the report with log_plotly_figure().
Below is a working function that will create the required runway plot in plotly:
import numpy as np
import plotly.graph_objects as go
import plotly.subplots as sp
from typing import List
from sklearn.metrics import confusion_matrix


def runway_plot(true_labels: List[int], pred_probs: List[float]) -> go.Figure:
    """
    Plot threshold-dependent performance metrics with a histogram of predicted probabilities.

    The plot uses Plotly with a clean aesthetic. Gridlines are kept, but the background
    color is removed. Y-axis ticks and labels are shown. The legend is placed at the bottom.
    Tooltips show values with 3 decimal places. X-axis labels are only shown on the
    bottommost metric subplot. The histogram uses narrow bins and has no borders.

    Args:
    - true_labels (List[int]): True binary class labels (0 or 1).
    - pred_probs (List[float]): Predicted probabilities for the positive class (1).

    Returns:
    - A Plotly figure containing the diagnostic performance plots and histogram.

    Example:
    ```
    # Generate synthetic data for demonstration
    true_labels = np.random.binomial(1, 0.5, 1000)
    pred_probs = np.random.uniform(0, 1, 1000)
    # Generate and show the runway plot
    faceted_fig = runway_plot(true_labels, pred_probs)
    faceted_fig.show()
    ```
    """
    true_labels = np.asarray(true_labels)
    pred_probs = np.asarray(pred_probs)
    # Evaluate all four metrics on a common grid of thresholds
    thresholds = np.linspace(0, 1, 100)
    sensitivity = np.zeros_like(thresholds)
    specificity = np.zeros_like(thresholds)
    ppv = np.zeros_like(thresholds)
    npv = np.zeros_like(thresholds)
    for i, threshold in enumerate(thresholds):
        # Binarize predictions based on threshold
        binarized_predictions = pred_probs >= threshold
        # labels=[0, 1] keeps the matrix 2x2 even when predictions are all one class
        tn, fp, fn, tp = confusion_matrix(true_labels, binarized_predictions, labels=[0, 1]).ravel()
        # Calculate sensitivity, specificity, PPV and NPV, guarding against empty denominators
        sensitivity[i] = tp / (tp + fn) if (tp + fn) != 0 else 0
        specificity[i] = tn / (tn + fp) if (tn + fp) != 0 else 0
        ppv[i] = tp / (tp + fp) if (tp + fp) != 0 else 0
        npv[i] = tn / (tn + fn) if (tn + fn) != 0 else 0
    # Define hover template to show three decimal places
    hover_template = 'Threshold: %{x:.3f}<br>Metric Value: %{y:.3f}<extra></extra>'
    # Create a subplot for each metric, plus one for the histogram
    fig = sp.make_subplots(rows=5, cols=1, shared_xaxes=True, vertical_spacing=0.02)
    # Sensitivity plot (true positive rate)
    fig.add_trace(go.Scatter(x=thresholds, y=sensitivity, mode='lines', name='Sensitivity', hovertemplate=hover_template), row=1, col=1)
    # Specificity plot (true negative rate)
    fig.add_trace(go.Scatter(x=thresholds, y=specificity, mode='lines', name='Specificity', hovertemplate=hover_template), row=2, col=1)
    # PPV plot (positive predictive value)
    fig.add_trace(go.Scatter(x=thresholds, y=ppv, mode='lines', name='PPV', hovertemplate=hover_template), row=3, col=1)
    # NPV plot (negative predictive value)
    fig.add_trace(go.Scatter(x=thresholds, y=npv, mode='lines', name='NPV', hovertemplate=hover_template), row=4, col=1)
    # Add histogram of predicted probabilities
    fig.add_trace(go.Histogram(x=pred_probs, nbinsx=80, name='Predicted Probabilities'), row=5, col=1)
    # Update layout
    fig.update_layout(
        height=1000,
        width=700,
        title_text="Diagnostic Performance Metrics by Threshold",
        legend=dict(orientation="h", yanchor="bottom", y=-0.2, xanchor="center", x=0.5),
    )
    # Remove the plot background color, keep gridlines, show y-axis ticks and labels
    fig.update_xaxes(showgrid=True)
    fig.update_yaxes(showgrid=True, showticklabels=True)
    # Only show the x-axis line and tick labels on the bottommost metric plot
    fig.update_xaxes(showline=True, linewidth=1, linecolor='black', mirror=True)
    fig.update_xaxes(showticklabels=True, row=4, col=1)
    fig.update_yaxes(showline=True, linewidth=1, linecolor='black', mirror=True)
    fig.update_xaxes(showline=False, row=5, col=1, showticklabels=False)
    fig.update_yaxes(showline=False, row=5, col=1)
    # Set the background to white
    fig.update_layout(plot_bgcolor='white')
    return fig


# Generate synthetic data for demonstration
true_labels = np.random.binomial(1, 0.5, 1000)
pred_probs = np.random.uniform(0, 1, 1000)
# Generate and show the runway plot
faceted_fig_clean = runway_plot(true_labels, pred_probs)
faceted_fig_clean.show()
This will generate a plot like so:
This issue has 2 parts:
1. The codebase needs a review to see if it adheres to the Google Python style guide (https://google.github.io/styleguide/pyguide.html).
2. For docstrings, the parameter types are already documented using type hints, so repeating them in the docstring is redundant. These can be removed to keep docstrings clean and reduce redundancy.
In the PyTorch model wrapper class (PTModel) there is a function originally intended to re-weight the loss for unbalanced datasets. This function is currently not implemented and can either be removed - because some PyTorch loss functions already support this - or refactored into a more general mechanism for handling imbalanced datasets. A minimal sketch of the built-in re-weighting follows.
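A short sketch of the existing PyTorch support, using inverse class-frequency weights on toy labels (the weighting scheme is just one common choice):

import torch
import torch.nn as nn

labels = torch.tensor([0, 0, 0, 1])              # imbalanced toy labels
counts = torch.bincount(labels).float()
weights = counts.sum() / (len(counts) * counts)  # inverse-frequency class weights
# CrossEntropyLoss re-weights per-class loss terms via its `weight` argument
criterion = nn.CrossEntropyLoss(weight=weights)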
The model wrappers have a method that is intended for tuning the model hyperparameters and returning the best model.
The method has the following signature:
find_best(
    self,
    X: ArrayLike,
    y: ArrayLike,
    parameters: Union[Dict, List[Dict]],
    metric: Union[str, Callable, Sequence, Dict] = None,
    method: Literal["grid", "random"] = "grid",
    **kwargs,
)
Currently, only the scikit-learn model wrapper SKModel implements this method, and that implementation would benefit from the following improvements:
- Support metrics from cyclops.evaluate.metrics in the hyperparameter search, potentially using the sklearn.metrics.make_scorer method (see the sketch after this list).
- Support passing the group and fit_params arguments when calling clf.fit.
The PyTorch model wrapper (PTModel) should implement this method as well, with the same behaviour as the sklearn model wrapper.
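A minimal sketch of the make_scorer route; toy_metric is a placeholder standing in for a metric from cyclops.evaluate.metrics:

import numpy as np
from sklearn.metrics import make_scorer

def toy_metric(y_true, y_pred):
    # Placeholder metric: plain accuracy
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# make_scorer turns a (y_true, y_pred) callable into a scorer that
# GridSearchCV / RandomizedSearchCV can consume.
scorer = make_scorer(toy_metric, greater_is_better=True)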
When evaluate/evaluate_fairness is called with slice_spec set, and the number of examples matching a specific slice is one, type_target and type_preds become unknown and a ValueError is raised by _binary_stat_scores_format.
For particular problems such as prevalence estimation or risk score prediction, well-calibrated models are necessary. It would be great for data scientist end users to be able to generate calibration plots in their transparency reports to examine this aspect of model performance.
One option for calibration plots looks like this:
We can currently do this in the model transparency report by building a plotly object and logging it to the report with log_plotly_figure().
Below is a working function that will create the required calibration plot in plotly:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from sklearn.calibration import calibration_curve


def generate_calibration_plot(df, y_true_col, y_prob_col, grouping_var=None):
    """
    Generate a calibration plot with a histogram of the predicted probabilities below it.

    Parameters:
        df (DataFrame): The dataframe containing the true labels and predicted probabilities.
        y_true_col (str): The name of the column with true labels (0 or 1).
        y_prob_col (str): The name of the column with predicted probabilities.
        grouping_var (str, optional): The name of the column to group data by. If provided,
            the plot will include one calibration curve per level of grouping_var.
    """
    # Create subplots: one for the calibration curve, one for the histogram
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.02, row_heights=[0.8, 0.2])
    if grouping_var:
        # Plot a calibration curve for each level of the grouping variable
        unique_groups = df[grouping_var].unique()
        for group in unique_groups:
            group_df = df[df[grouping_var] == group]
            prob_true, prob_pred = calibration_curve(group_df[y_true_col], group_df[y_prob_col], n_bins=10)
            fig.add_trace(go.Scatter(x=prob_pred, y=prob_true, mode='markers+lines', name=f'{group}'), row=1, col=1)
    else:
        # Plot a single calibration curve
        prob_true, prob_pred = calibration_curve(df[y_true_col], df[y_prob_col], n_bins=10)
        fig.add_trace(go.Scatter(x=prob_pred, y=prob_true, mode='markers+lines', name='Model'), row=1, col=1)
    # Add the perfectly calibrated reference line
    fig.add_trace(go.Scatter(x=[0, 1], y=[0, 1], mode='lines', name='Perfectly calibrated', line=dict(dash='dot')), row=1, col=1)
    # Histogram of predicted probabilities
    fig.add_trace(go.Histogram(x=df[y_prob_col], nbinsx=100, name='Probabilities', showlegend=False), row=2, col=1)
    # Update layout
    legend_title = grouping_var if grouping_var else None
    fig.update_layout(title='Calibration Plot', yaxis_title='Fraction of Positives', legend_title=legend_title)
    fig.update_xaxes(title_text='Mean Predicted Probability', row=2, col=1)
    fig.update_yaxes(title_text='Count', row=2, col=1)
    fig.show()
Run on sample data, such as this:
import pandas as pd
import numpy as np

np.random.seed(42)
n_samples = 100
data = {
    'y_true': np.random.binomial(1, 0.5, n_samples),   # Random true labels (0s and 1s)
    'y_prob': np.random.rand(n_samples),               # Random probabilities between 0 and 1
    'group': np.random.choice(['A', 'B'], n_samples),  # Random group labels ('A' or 'B')
}
sample_df = pd.DataFrame(data)
generate_calibration_plot(sample_df, 'y_true', 'y_prob')
Generating the following plot:
Or calibration by group:
generate_calibration_plot(sample_df, 'y_true', 'y_prob', 'group')
Generating the following plot:
There are two methods in ClassificationPlotter - roc_curve() and precision_recall_curve() - that take a tuple of ndarrays whose order is important, which may introduce bugs. Also, the last element of the tuple is never used inside the functions.
Refactor the arguments so that it is clear which element is which. Using namedtuple is preferred; a sketch follows.
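A minimal sketch of the idea, with hypothetical field names rather than the actual ClassificationPlotter signature:

from typing import NamedTuple
import numpy.typing as npt

class ROCCurve(NamedTuple):
    # Named fields make the ordering explicit, and the unused
    # thresholds element can simply be dropped from the type.
    fpr: npt.NDArray
    tpr: npt.NDArray

def roc_curve_plot(curve: ROCCurve):
    # Fields are accessed by name, so swapped arguments cannot go unnoticed.
    return curve.fpr, curve.tpr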
The current API documentation (https://vectorinstitute.github.io/cyclops/api/index.html#) needs improvement, with clearer descriptions of the sub-packages. It also needs sections giving a high-level overview of the modules.
The API docs also need tutorials that demonstrate the tasks and evaluate functionalities.
The query dataset API design has some problems and could be improved. This issue is to hold a design review and suggest a proposal for improving the API. One proposed direction:
querier.table(join=(join_table, keys, join_type), filter=(col, val), apply=(col, func), ...)
where join, filter and apply are more generic and flexible methods. We would, of course, have to allow conditions using timestamps and a few other cases as well, but the above three functions should cover a lot of ground.

Installing Cyclops in a new conda environment on Windows fails because no Python version can be found that satisfies all requirements.
Upon creation of a new conda environment, enter:
py -m pip install 'pycyclops[query,models]'
The bug does not occur when installing Cyclops on Kaggle Notebooks.
It is a rather long and complicated method; it would be nice if we could refactor it.
To improve the documentation, I would suggest adding a link to "Contributing to cyclops" under the Getting Started "Developing" subsection. This way it's clear that this isn't the only development documentation, and where to find the further info.
I have the following SQL code which I'd like to port to my CyclOps pipeline:
SELECT * FROM my_table WHERE my_field LIKE '%my_pattern%' OR my_field IN (my_values)
Unfortunately with the current query operations, this does not seem possible.
One potential design would be:
query_ops = qo.Sequential([
    qo.Or([
        qo.Like(my_pattern, my_field),
        qo.ConditionIn(my_values, my_field),
    ])
])
There may be better designs, though.
This may be possible by joining many tables, each with a different qo.ConditionEquals, but the performance would suffer greatly. It also doesn't seem possible to perform a LIKE with any of the existing features.
Currently, the Apply function in cyclops.query.ops supports only the application of a function to one or more columns, column-wise. A desirable addition would be the ability to apply a function that takes multiple column inputs and gives a single column output, i.e. F(col1, col2, ...). A hypothetical usage sketch follows.
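A minimal sketch of what such a call could look like; the argument names, the output-column parameter, and the multi-column form are all hypothetical, not the existing cyclops.query.ops.Apply API:

import cyclops.query.ops as qo

# Hypothetical extension: combine two input columns into one output column.
op = qo.Apply(
    cols=["systolic_bp", "diastolic_bp"],       # multiple input columns (hypothetical)
    func=lambda sys, dia: (sys + 2 * dia) / 3,  # F(col1, col2) -> single output
    new_col="mean_arterial_pressure",           # hypothetical output-column argument
)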
Using the public NIH dataset as a use case, support will be implemented for Chest X-Ray (CXR) images, and for dimensionality reduction and two-sample tests on large CXR image datasets.
This will be a placeholder for a task list of features (other issues) that will be incrementally implemented. I'll create branches specifically to work on these issues to track development.
cyclops/cyclops/process/feature/split.py
Line 21 in f5b2046
I was thinking the function could probably be both simplified and made easier to use if the fractions were normalized within the function, instead of forcing the user to ensure normalization before calling it. A minimal sketch follows.
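A short sketch of the idea as a hypothetical helper (not the actual split.py code):

import numpy as np

def normalize_fractions(fractions):
    """Scale the given split fractions so they sum to 1 (hypothetical helper)."""
    fractions = np.asarray(fractions, dtype=float)
    if np.any(fractions < 0) or fractions.sum() == 0:
        raise ValueError("Fractions must be non-negative and not all zero.")
    return fractions / fractions.sum()

print(normalize_fractions([8, 1, 1]))  # -> [0.8, 0.1, 0.1]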
I just came upon your framework and am very interested in its development.
I'm working on a research project in which we (currently) use MIMIC-IV.
The cyclops project looks quite interesting to use, especially for the ETL task.
However, since it's clearly still in the alpha stage, we are not sure if it makes sense to base our work on this library, since it's prone to change in the future.
It would be very helpful to have an outlook on the short- and long-term plans for this project.
Maybe we could also contribute, if your plans line up with our goals.
Therefore, I would highly appreciate a development roadmap :)
The current batching while querying is a bit finicky and the implementation is untested. While it works and has been used for the MIMIC use case, replacing it with dask would be a better choice; a sketch follows.
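A minimal sketch of the dask route, with a hypothetical connection URI and table name; dask handles the batching by partitioning the table on an indexed column:

import dask.dataframe as dd

ddf = dd.read_sql_table(
    "admissions",                         # hypothetical table name
    "postgresql://user:pass@host/mimic",  # hypothetical connection URI
    index_col="row_id",                   # indexed numeric column to partition on
    npartitions=16,
)
result = ddf.compute()  # each partition is read as a separate query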
TODO:
Either create a validation split inside the _train_loop method, or accept a predefined split when the fit method is called.

Currently, when instantiating an SKModel wrapper (https://github.com/VectorInstitute/cyclops/blob/main/cyclops/models/wrappers/sk_model.py#L56), model params in the form of a dict need to be specified. This is done by loading default configs from https://github.com/VectorInstitute/cyclops/tree/main/cyclops/models/configs and then passing them to the create_model factory method.
This places additional work on the user; default configs could instead be loaded in the wrapper directly, with an option for the user to override them. This pattern is used in the DatasetQuerier class (https://github.com/VectorInstitute/cyclops/blob/main/cyclops/query/base.py#L90), which uses hydra to load YAML configs. A sketch of that pattern follows.
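A minimal sketch of the hydra compose pattern; the config directory, config name, and override key are hypothetical:

from hydra import compose, initialize

# Load a packaged default config and let the caller override individual fields.
with initialize(version_base=None, config_path="configs"):
    cfg = compose(config_name="sgd_classifier", overrides=["params.alpha=0.01"])
print(cfg)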
Add an op using sqlalchemy to apply a regex and filter rows that match (including a NOT condition). A sketch of the underlying sqlalchemy call follows.
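A minimal sketch as a hypothetical helper, not the proposed cyclops op itself; regexp_match is available on SQLAlchemy column elements (1.4+), though backend regex support varies:

from sqlalchemy import select

def filter_regex(table, col, pattern, negate=False):
    """Hypothetical helper: keep rows where `col` matches `pattern`."""
    cond = table.c[col].regexp_match(pattern)
    if negate:
        cond = ~cond  # the NOT condition
    return select(table).where(cond)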
Many of the pandas operations used by the process API functions, such as groupby for aggregation, imputation, and normalization, are supported by dask. So a good feature enhancement would be the ability to load data as dask dataframes and call the process API functions, which would then call dask's compute method to process in batches.
Currently, aggregating over the same column using multiple aggregation functions is not supported. (See https://github.com/VectorInstitute/cyclops/blob/main/cyclops/query/ops.py#L1751)
We wish to support something like:
GroupByAggregate("person_id",
{
"lab_name": [
("string_agg", "lab_name_agg"),
("median", "lab_name_median")
},
{"lab_name": ", "}
)(table)
Considerations:
The separator could instead be included in the aggregation tuple, e.g. ("string_agg", "lab_name_agg", ", ").
The mypy configuration is currently a little lenient and doesn't complain when type annotations are missing from functions. A sketch of a stricter configuration follows.
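A minimal sketch of stricter settings, assuming the configuration lives in pyproject.toml (the project may keep it elsewhere):

[tool.mypy]
# Require type annotations on all function definitions
disallow_untyped_defs = true
# Flag functions whose annotations are only partially specified
disallow_incomplete_defs = true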