timelovercc / caf-gnn
[CIKM 2023] Towards Fair Graph Neural Networks via Graph Counterfactual.
Home Page: https://arxiv.org/abs/2307.04937
License: MIT License
Hi, thanks for the great work! Yet two issues occurred while I was reproducing it.
To run the code, the modules pandas and rich are also needed; however, they are not in setup.sh.
In this line, the dataset_name should be 'bail', since the line above says so.
Thanks again for the great work!
Thanks for your timely reply!
In GEAR's evaluation part, they generate subgraphs for each node to calculate the mean cf. Do you also generate subgraphs when deriving the cf metric, or do you simply follow the definition with something like `cf = 1 - (np.sum(y_pred_cf == y_pred) / n)`?
```python
# For convenience, attached is the code for how GEAR evaluates cf with subgraphs
def evaluate(model, data, subgraph, cf_subgraph_list, labels, sens, idx_select, type='all'):
    loss_result = compute_loss(model, subgraph, cf_subgraph_list, labels, idx_select)
    if type == 'easy':
        eval_results = {'loss': loss_result['loss'], 'loss_c': loss_result['loss_c'],
                        'loss_s': loss_result['loss_s']}
    elif type == 'all':
        n = len(labels)
        idx_select_mask = (torch.zeros(n).scatter_(0, idx_select, 1) > 0)  # size = n, bool
        # performance
        emb = get_all_node_emb(model, idx_select_mask, subgraph, n)
        output = model.predict(emb)
        output_preds = (output.squeeze() > 0).type_as(labels)
        auc_roc = roc_auc_score(labels.cpu().numpy()[idx_select], output.detach().cpu().numpy())
        f1_s = f1_score(labels[idx_select].cpu().numpy(), output_preds.cpu().numpy())
        acc = accuracy_score(labels[idx_select].cpu().numpy(), output_preds.cpu().numpy())
        # fairness
        parity, equality = fair_metric(output_preds.cpu().numpy(), labels[idx_select].cpu().numpy(),
                                       sens[idx_select].numpy())
        # counterfactual fairness
        cf = 0.0
        for si in range(len(cf_subgraph_list)):
            cf_subgraph = cf_subgraph_list[si]
            emb_cf = get_all_node_emb(model, idx_select_mask, cf_subgraph, n)
            output_cf = model.predict(emb_cf)
            output_preds_cf = (output_cf.squeeze() > 0).type_as(labels)
            cf_si = 1 - (output_preds.eq(output_preds_cf).sum().item() / idx_select.shape[0])
            cf += cf_si
        cf /= len(cf_subgraph_list)
        eval_results = {'acc': acc, 'auc': auc_roc, 'f1': f1_s, 'parity': parity,
                        'equality': equality, 'cf': cf,  # counterfactual fairness
                        'loss': loss_result['loss'], 'loss_c': loss_result['loss_c'],
                        'loss_s': loss_result['loss_s']}
    return eval_results
```
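For comparison, the "simple definition" mentioned above can be sketched as a self-contained function (a sketch only; `y_pred` and `y_pred_cf` are assumed to be 1-D arrays of hard label predictions on the original and counterfactual graphs):

```python
import numpy as np

def cf_unfairness(y_pred, y_pred_cf):
    """Fraction of nodes whose prediction flips under the counterfactual:
    cf = 1 - (number of agreeing predictions / n)."""
    y_pred = np.asarray(y_pred)
    y_pred_cf = np.asarray(y_pred_cf)
    n = y_pred.shape[0]
    return 1 - (np.sum(y_pred_cf == y_pred) / n)

print(cf_unfairness([1, 0, 1, 1], [1, 1, 1, 0]))  # 2 of 4 predictions flip -> 0.5
```

GEAR's subgraph-based loop above is this same quantity averaged over the list of counterfactual subgraphs, restricted to the selected indices.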
Hi.
An error occurs when I reproduce CAF, at the line `trainer.fit(model, datamodule=data_module)`:
RuntimeError: It looks like your LightningModule has parameters that were not used in producing the loss returned by training_step. If this is intentional, you must enable the detection of unused parameters in DDP, either by setting the string value `strategy='ddp_find_unused_parameters_true'` or by setting the flag in the strategy with `strategy=DDPStrategy(find_unused_parameters=True)`.
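For anyone hitting the same error, one way to apply the fix the message suggests is a small change to the Trainer construction (a sketch, assuming PyTorch Lightning 2.x; the accelerator/devices arguments are placeholders, not the repo's actual settings):

```python
# Sketch: enable unused-parameter detection in DDP, as the RuntimeError suggests.
# Assumes PyTorch Lightning 2.x; accelerator/devices values are placeholders.
import pytorch_lightning as pl
from pytorch_lightning.strategies import DDPStrategy

trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,
    # either the string shorthand:
    # strategy="ddp_find_unused_parameters_true",
    # or the explicit strategy object:
    strategy=DDPStrategy(find_unused_parameters=True),
)
```

Note that `find_unused_parameters=True` adds per-step overhead, so it is usually better to remove the genuinely unused parameters from the model if they are not intentional.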
And I think the line `self.data.x[:, sens_idx] = self.data.sens` does not reassign the sensitive values.
To be specific, in the implementation of torch_geometric.data.Dataset, the data object is transformed on every access, which means that:
when executing `dataset = Bail(..., transform=NormalizeFeatures())`, the features `dataset[0].x` are not yet normalized;
when executing `data = dataset[0]` (i.e. accessing the data object), the features `data.x` are implicitly normalized.
Yet the line `self.data.x[:, sens_idx] = self.data.sens` is executed in `dataset = Bail(..., transform=NormalizeFeatures())`; in other words, the reassignment of the sensitive values happens before feature normalization. And since you use row normalization (`torch_geometric.transforms.NormalizeFeatures`) in your code, this results in a variety of values in the sensitive column after normalization.
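To illustrate the concern, here is a minimal sketch in plain Python (no PyG dependency) of what `NormalizeFeatures` does: each feature row is divided by its sum. If the binary sensitive column is written before this transform runs, it gets rescaled per row and is no longer binary:

```python
# Sketch of torch_geometric.transforms.NormalizeFeatures: row-sum normalization.
def normalize_features(x):
    """Divide each feature row by its sum (rows summing to zero are left as-is)."""
    return [[v / sum(row) if sum(row) != 0 else v for v in row] for row in x]

# Suppose column 0 holds the binary sensitive attribute, written BEFORE the transform.
x = [[1.0, 3.0],
     [1.0, 1.0]]
print(normalize_features(x))  # column 0 becomes [0.25, 0.5]: no longer binary
```

This matches the report above: a reassignment done at dataset-construction time is normalized away on the first access to `dataset[0]`.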
Hi, I noticed some issues in all three files './src/datasets/bail.py', 'credit.py', and 'german.py'. Take 'bail.py' as an example:
Line 13: `self.load(self.processed_paths[0])`. Maybe you mean `self.data = torch.load(self.processed_paths[0])[0]`?
Line 14: `sens_idx = 1`. I think the `sens_idx` for bail is supposed to be 0.
Line 15: `self.data.x[:, sens_idx] = self.data.sens`. I don't understand what this line does, since `self.data.x[:, sens_idx]` always equals `self.data.sens`.
Line 31: `self.save([data], self.processed_paths[0])`. Maybe you mean `torch.save([data], self.processed_paths[0])`?