The cognitive_nlp from athn-nik

cognitive_nlp's Issues

Extract preprocessed datasets

For every experiment's data, extract the preprocessed data. Experiment with size, store it, and fix the input pipeline for the new structure.

Voxel selection

Explore voxel selection implementation.
Current code:

trainVoxelwiseTargetPredictionModels.txt
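
The attached file is not reproduced here, so the following is only a rough sketch of one common way to do voxel selection, not the current code. It assumes that each voxel is scored by how well a ridge regression from word embeddings predicts its activation on held-out samples, and that the best-scoring voxels are kept. The function name `select_voxels` and the parameters `n_select`, `alpha` and `n_folds` are hypothetical.

```python
import numpy as np


def select_voxels(embeddings, activations, n_select, alpha=1.0, n_folds=5):
    """Rank voxels by cross-validated prediction correlation (hypothetical helper).

    embeddings  : (n_samples, n_dims) word vectors
    activations : (n_samples, n_voxels) brain activations
    Returns the indices of the n_select best voxels and all voxel scores.
    """
    n_samples = embeddings.shape[0]
    all_idx = np.arange(n_samples)
    predictions = np.zeros(activations.shape, dtype=float)

    for test_idx in np.array_split(all_idx, n_folds):
        train_idx = np.setdiff1d(all_idx, test_idx)
        x_train, y_train = embeddings[train_idx], activations[train_idx]
        # Closed-form ridge regression, solved for every voxel at once:
        # W = (X^T X + alpha * I)^-1 X^T Y
        gram = x_train.T @ x_train + alpha * np.eye(x_train.shape[1])
        weights = np.linalg.solve(gram, x_train.T @ y_train)
        predictions[test_idx] = embeddings[test_idx] @ weights

    # Pearson correlation between held-out prediction and truth, per voxel
    pred_c = predictions - predictions.mean(axis=0)
    act_c = activations - activations.mean(axis=0)
    denom = np.linalg.norm(pred_c, axis=0) * np.linalg.norm(act_c, axis=0)
    scores = (pred_c * act_c).sum(axis=0) / np.maximum(denom, 1e-12)

    selected = np.argsort(scores)[::-1][:n_select]
    return selected, scores
```

Scoring on held-out folds rather than on the training samples avoids favouring voxels that the regression merely overfits.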

Voxel Selection from all Data, Normalizations

Run and debug voxel selection from the data. Parametrize the normalization so the decoder can choose which one to use.
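
A minimal sketch of what a parametrized normalization could look like, assuming the decoder receives a 2-D array of samples by voxels. The function name `normalize` and the mode names `"none"`, `"zscore"` and `"l2"` are assumptions for illustration, not options that exist in the repository.

```python
import numpy as np


def normalize(data, mode="zscore", axis=0):
    """Normalize a (n_samples, n_voxels) array along the given axis.

    mode "none"   : return the data unchanged
    mode "zscore" : zero mean and unit variance along `axis`
    mode "l2"     : unit Euclidean norm along `axis`
    """
    data = np.asarray(data, dtype=float)
    if mode == "none":
        return data
    if mode == "zscore":
        mean = data.mean(axis=axis, keepdims=True)
        std = data.std(axis=axis, keepdims=True)
        # Constant slices have std 0; divide by 1 instead to avoid NaNs
        return (data - mean) / np.where(std == 0, 1.0, std)
    if mode == "l2":
        norm = np.linalg.norm(data, axis=axis, keepdims=True)
        return data / np.where(norm == 0, 1.0, norm)
    raise ValueError("Unknown normalization mode: %r" % (mode,))
```

With `axis=0` each voxel is normalized across samples; with `axis=1` each sample is normalized across voxels. The decoder could then take `mode` and `axis` as configuration values.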

Ensemble the 2 datasets

Currently we have the Mitchell and Pereira datasets. Ensembling proposed --> same voxel selection dimension.
This may reduce performance in the case of Mitchell.
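
One way to read "same voxel selection dimension" is sketched below: keep the same number of top-scoring voxels from each dataset so that the rows can be stacked into one training matrix. This is an assumption about the proposal, and the names `ensemble_datasets` and `k` are hypothetical. The voxel scores are taken as given, from whatever selection criterion is in use.

```python
import numpy as np


def ensemble_datasets(datasets, k):
    """Stack several datasets after reducing each to its k best voxels.

    datasets : list of (activations, voxel_scores, targets) tuples, where
               activations is (n_samples, n_voxels), voxel_scores is
               (n_voxels,) and targets is (n_samples, n_dims)
    k        : number of voxels to keep from every dataset
    """
    features, targets = [], []
    for activations, voxel_scores, target in datasets:
        if activations.shape[1] < k:
            raise ValueError("Dataset has fewer than k voxels")
        # Column indices of the k highest-scoring voxels, best first
        keep = np.argsort(voxel_scores)[::-1][:k]
        features.append(activations[:, keep])
        targets.append(target)
    return np.vstack(features), np.vstack(targets)
```

Note that column i of the stacked matrix is only the i-th best voxel of each dataset, not the same anatomical location, which may be one reason the combined model does worse on a single dataset.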

Compositionality Evaluation

Again, the weights are up. Do the same text preprocessing I do for vocabulary extraction. You know the rest, or ask me.

Compositionality Computational Model

Methods in this paper
Implementation not currently available but easy

Task definition to start

Ideas are:

Hard-coded brain parts for certain words
Text generation
Compositionality in brain
Use brain's advantage to learn from few examples

Evaluation MEN whole dataset

@georgepar follow the script below to extract the dataset, and I will fix a function that, given a word, returns its embedding. In the weights folder you will find the trained weights for each of the three experiments. The embeddings that will be used are glove42B.300d.

import numpy as np

# sklearn.datasets.base is gone in recent scikit-learn releases; Bunch lives in sklearn.utils
from sklearn.utils import Bunch
from .utils import _get_as_pd

def fetch_MEN(which="all", form="natural"):
    """
    Fetch MEN dataset for testing similarity and relatedness

    Parameters
    ----------
    which : "all", "test" or "dev"
    form : "lem" or "natural"

    Returns
    -------
    data : sklearn.utils.Bunch
        dictionary-like object. Keys of interest:
        'X': matrix with one pair of words per row,
        'y': column vector with scores

    Published at http://clic.cimec.unitn.it/~elia.bruni/MEN.html.
    """
    if which == "dev":
        data = _get_as_pd('https://www.dropbox.com/s/c0hm5dd95xapenf/EN-MEN-LEM-DEV.txt?dl=1',
                          'similarity', header=None, sep=" ")
    elif which == "test":
        data = _get_as_pd('https://www.dropbox.com/s/vdmqgvn65smm2ah/EN-MEN-LEM-TEST.txt?dl=1',
                          'similarity/EN-MEN-LEM-TEST', header=None, sep=" ")
    elif which == "all":
        data = _get_as_pd('https://www.dropbox.com/s/b9rv8s7l32ni274/EN-MEN-LEM.txt?dl=1',
                          'similarity', header=None, sep=" ")
    else:
        raise RuntimeError("Not recognized which parameter")

    if form == "natural":
        # Remove last two chars (the part-of-speech suffix) from the two word columns;
        # the float score column is left untouched
        data = data.apply(lambda x: [y if isinstance(y, float) else y[0:-2] for y in x])
    elif form != "lem":
        raise RuntimeError("Not recognized form argument")

    # np.float was removed in NumPy 1.24; the builtin float is equivalent.
    # MEN ratings range from 0 to 50, so dividing by 5 rescales them to 0-10.
    return Bunch(X=data.values[:, 0:2].astype("object"), y=data.values[:, 2:].astype(float) / 5.0)
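
A rough sketch of how the word-to-embedding function and the MEN evaluation could fit together. It assumes the GloVe vectors are in the usual text format (one word per line followed by its space-separated values) and that the evaluation metric is the Spearman correlation between cosine similarity and the human scores, which is a common choice for MEN but is not stated in this issue. The names `load_glove`, `average_ranks` and `evaluate_similarity` are hypothetical, and the toy vectors at the end stand in for the real glove42B.300d file.

```python
import io
import numpy as np


def load_glove(handle, vocabulary=None):
    """Read GloVe text lines ("word v1 v2 ...") into a dict of vectors."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue
        word = parts[0]
        # Optionally keep only the words we need, to save memory
        if vocabulary is not None and word not in vocabulary:
            continue
        vectors[word] = np.array([float(v) for v in parts[1:]], dtype=np.float32)
    return vectors


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    ranks = np.empty(len(values), dtype=float)
    ranks[np.argsort(values, kind="mergesort")] = np.arange(1, len(values) + 1)
    for value in np.unique(values):
        tied = values == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def evaluate_similarity(vectors, pairs, scores):
    """Spearman correlation between cosine similarity and human scores.

    Pairs with a word missing from `vectors` are skipped.
    Returns (correlation, number of pairs actually used).
    """
    predicted, gold = [], []
    flat_scores = np.ravel(np.asarray(scores, dtype=float))
    for (w1, w2), score in zip(pairs, flat_scores):
        if w1 not in vectors or w2 not in vectors:
            continue
        v1, v2 = vectors[w1], vectors[w2]
        cosine = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        predicted.append(float(cosine))
        gold.append(float(score))
    if len(predicted) < 2:
        raise ValueError("Need at least two covered pairs")
    # Spearman correlation is the Pearson correlation of the ranks
    rho = np.corrcoef(average_ranks(predicted), average_ranks(gold))[0, 1]
    return float(rho), len(predicted)


# Toy demo with made-up 2-d vectors in place of the real embedding file
sample = io.StringIO("sun 1.0 0.0\nsunlight 0.9 0.1\nmoon 0.0 1.0\ncar -1.0 0.2\n")
vectors = load_glove(sample)
pairs = [("sun", "sunlight"), ("sun", "moon"), ("sun", "car"), ("sun", "unknown")]
rho, used = evaluate_similarity(vectors, pairs, [[9.0], [5.0], [1.0], [7.0]])
```

With the script above, the same call would take `data.X` and `data.y` from `fetch_MEN()` as `pairs` and `scores`. Reporting the number of pairs used alongside the correlation makes it visible how many MEN pairs were dropped for missing vocabulary.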
