
caf-gnn's Introduction

Hi there 👋

  • 🌱 I'm learning machine learning.

caf-gnn's People

Contributors

timelovercc


caf-gnn's Issues

Two Modules Missing in setup.sh & A Small Mistake in README.md

Hi, thanks for the great work! I ran into two issues while reproducing it.

1. Two Modules Missing in setup.sh

To run the code, the pandas and rich modules are also needed, but setup.sh does not install them.

2. A Little Mistake in README.md

In this line, the dataset_name should be 'bail', since the line above says so.

Thanks again for the great work!

Request for details on the counterfactual fairness calculation

Thanks for your timely reply!
In GEAR's evaluation, they generate subgraphs for each node to calculate the mean cf. Do you also generate subgraphs when deriving the cf metric, or do you simply follow the definition, with something like cf = 1 - (np.sum(y_pred_cf == y_pred) / n)?

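# For convenience, here is how GEAR evaluates cf with subgraphs.
# compute_loss, get_all_node_emb, and fair_metric are helpers defined elsewhere in GEAR's code.
import torch
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
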
def evaluate(model, data, subgraph, cf_subgraph_list, labels, sens, idx_select, type='all'):
    loss_result = compute_loss(model, subgraph, cf_subgraph_list, labels, idx_select)
    if type == 'easy':
        eval_results = {'loss': loss_result['loss'], 'loss_c': loss_result['loss_c'], 'loss_s': loss_result['loss_s']}

    elif type == 'all':
        n = len(labels)
        idx_select_mask = (torch.zeros(n).scatter_(0, idx_select, 1) > 0)  # size = n, bool

        # performance
        emb = get_all_node_emb(model, idx_select_mask, subgraph, n)
        output = model.predict(emb)
        output_preds = (output.squeeze() > 0).type_as(labels)

        auc_roc = roc_auc_score(labels.cpu().numpy()[idx_select], output.detach().cpu().numpy())
        f1_s = f1_score(labels[idx_select].cpu().numpy(), output_preds.cpu().numpy())
        acc = accuracy_score(labels[idx_select].cpu().numpy(), output_preds.cpu().numpy())

        # fairness
        parity, equality = fair_metric(output_preds.cpu().numpy(), labels[idx_select].cpu().numpy(),
                                       sens[idx_select].numpy())
        # counterfactual fairness
        cf = 0.0
        for si in range(len(cf_subgraph_list)):
            cf_subgraph = cf_subgraph_list[si]
            emb_cf = get_all_node_emb(model, idx_select_mask, cf_subgraph, n)
            output_cf = model.predict(emb_cf)
            output_preds_cf = (output_cf.squeeze() > 0).type_as(labels)
            cf_si = 1 - (output_preds.eq(output_preds_cf).sum().item() / idx_select.shape[0])
            cf += cf_si
        cf /= len(cf_subgraph_list)

        eval_results = {'acc': acc, 'auc': auc_roc, 'f1': f1_s, 'parity': parity, 'equality': equality, 'cf': cf,
                        'loss': loss_result['loss'], 'loss_c': loss_result['loss_c'], 'loss_s': loss_result['loss_s']}  # counterfactual_fairness
    return eval_results
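
To make the two options in the question concrete, here is a minimal NumPy sketch. It is not GEAR's or CAF's implementation: the function names `cf_rate` and `mean_cf_rate` and the toy prediction arrays are hypothetical. It only contrasts the single-counterfactual definition quoted in the question with the average over several counterfactual predictions that the GEAR snippet above computes.

```python
import numpy as np


def cf_rate(y_pred, y_pred_cf):
    """Fraction of nodes whose predicted label flips under one counterfactual."""
    y_pred = np.asarray(y_pred)
    y_pred_cf = np.asarray(y_pred_cf)
    return 1.0 - np.sum(y_pred_cf == y_pred) / y_pred.shape[0]


def mean_cf_rate(y_pred, y_pred_cf_list):
    """Average flip rate over several counterfactual predictions."""
    return float(np.mean([cf_rate(y_pred, y_cf) for y_cf in y_pred_cf_list]))


# Toy predictions (hypothetical data, not taken from any dataset).
y_pred = np.array([1, 0, 1, 1])
y_cf_a = np.array([1, 0, 0, 1])  # one of four predictions flips
y_cf_b = np.array([0, 0, 0, 1])  # two of four predictions flip

single = cf_rate(y_pred, y_cf_a)
averaged = mean_cf_rate(y_pred, [y_cf_a, y_cf_b])
print(single, averaged)
```

With a single counterfactual the two functions agree, so the difference between the approaches lies only in how many counterfactual inputs are generated and averaged over.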

An Error Occurs when reproducing CAF & A question about Data Normalization

Hi.

1. An Error Occurs when reproducing CAF

An error occurs when I reproduce CAF, at the line trainer.fit(model, datamodule=data_module):

RuntimeError: It looks like your LightningModule has parameters that were not used in producing the loss returned by training_step. If this is intentional, you must enable the detection of unused parameters in DDP, either by setting the string value strategy='ddp_find_unused_parameters_true' or by setting the flag in the strategy with strategy=DDPStrategy(find_unused_parameters=True).
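
Besides the two strategy settings named in the error message, one way to find out which parameters trigger it is to run a single backward pass without DDP and list the parameters whose gradient is still None. The following is a sketch in plain PyTorch; `TwoHeads` is a hypothetical stand-in and not the CAF model.

```python
import torch
from torch import nn


class TwoHeads(nn.Module):
    """Toy module (hypothetical): only `used` contributes to the output."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 1)
        self.unused = nn.Linear(4, 1)  # never called in forward()

    def forward(self, x):
        return self.used(x)


model = TwoHeads()
loss = model(torch.randn(8, 4)).sum()
loss.backward()

# Parameters that received no gradient took no part in computing the loss.
unused_names = [name for name, p in model.named_parameters() if p.grad is None]
print(unused_names)
```

Running the same check on the real model after one training step would show whether the unused parameters are intentional or point to a branch that is skipped by mistake.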

2. A question about Data Normalization

I think the line self.data.x[:, sens_idx] = self.data.sens does not re-assign the sensitive value $s$ to $s\in \lbrace0,1 \rbrace$, because it is executed before the normalization.
To be specific, in the implementation of torch_geometric.data.Dataset,

The data object will be transformed before every access

which means that:

  1. when executing dataset = Bail(...,transform=NormalizeFeatures()), the features stored in the dataset (self.data.x) are not normalized.

  2. when executing data = dataset[0] (i.e. accessing data object), features data.x are implicitly normalized.

Yet the line self.data.x[:, sens_idx] = self.data.sens is executed inside dataset = Bail(...,transform=NormalizeFeatures()). In other words, the sensitive values are re-assigned before feature normalization.
Because your code uses row normalization (torch_geometric.transforms.NormalizeFeatures), $s$ ends up taking a variety of values in $(0,1)$, e.g. $0.18$ or $0.23$, depending on the other feature values of that individual.
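
The effect can be illustrated with a small NumPy sketch. It uses a hypothetical `row_normalize` helper that divides each row by its sum as a stand-in for row normalization, not the torch_geometric transform itself, and the toy feature matrix is made up. Overwriting the sensitive column after normalization is shown only as one possible remedy, assuming a binary $s$ is what the model should see.

```python
import numpy as np


def row_normalize(x):
    """Divide each row by its sum (stand-in for row normalization)."""
    row_sums = x.sum(axis=1, keepdims=True)
    return x / np.clip(row_sums, 1e-12, None)  # guard against empty rows


# Toy features (hypothetical); column 0 holds the binary sensitive attribute.
x = np.array([[1.0, 3.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 2.0]])
sens_idx = 0
sens = x[:, sens_idx].copy()

# Normalizing after the assignment rescales the sensitive column too.
x_norm = row_normalize(x)
sens_after_norm = x_norm[:, sens_idx].copy()
print(sens_after_norm)

# Re-assigning after normalization keeps s binary.
x_fixed = x_norm.copy()
x_fixed[:, sens_idx] = sens
print(x_fixed[:, sens_idx])
```

In the toy matrix the first two individuals share the same $s$ but end up with different normalized values, because the result depends on the other entries in each row.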

Some Issues in Data Processing Modules

Hi, I noticed some issues in all three files under './src/datasets/': bail.py, credit.py, and german.py.
Taking 'bail.py' as an example:

  1. Line13 self.load(self.processed_paths[0])
    Maybe you mean self.data = torch.load(self.processed_paths[0])[0]?

  2. Line14 sens_idx = 1
    I think the sens_idx for bail is supposed to be 0.

  3. Line15 self.data.x[:, sens_idx] = self.data.sens
    I don't understand what this line does, since self.data.x[:, sens_idx] always equals self.data.sens.

  4. Line31 self.save([data], self.processed_paths[0])
    Maybe you mean torch.save([data], self.processed_paths[0])?
