
dmae's Introduction

Hey there! I'm Juan Lara

Passionate about programming, teaching, learning, thinking, writing, music, and video games.

  • MSc in computer science and biomedical engineer.
  • My research interests include: machine learning, neural networks, Bayesian modeling, natural language processing and computer vision.
  • My professional interests are: data science, machine learning engineering, DevOps and MLOps, software development, and big data.
  • bash is love, vim is life :D.

dmae's People

Contributors

juselara1, volodyaco


dmae's Issues

Strange covariance matrices

So, I understand that you are learning the inverse covariance matrix: you define a fully trainable matrix X and obtain the inverse covariance as S = X @ X.T. In other words, you are using a factorisation of the inverse of the covariance matrix.
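For concreteness, here is a minimal numpy sketch of that parameterisation (the variable names are mine, not DMAE's internals): the trainable factor gives the precision matrix, and the actual covariance is its (pseudo-)inverse.

import numpy as np

# Trainable factor (the matrix called X above); DMAE learns this directly.
F = np.array([[2.0, 0.0],
              [0.5, 1.0]])

S = F @ F.T                # learned quantity: inverse covariance (precision matrix)
cov = np.linalg.pinv(S)    # actual covariance of the Gaussian component

# The eigenvalues of cov are the reciprocals of those of S, so plotting
# ellipses from S instead of cov swaps the long and short axes.
print(np.linalg.eigvalsh(S), np.linalg.eigvalsh(cov))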

Now, I am using the following code to plot the ellipses learnt by the DMAE method on the dataset presented in issue #1:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

def plot_results(X, means, covariances):
    for i, (mean, covar) in enumerate(zip(means, covariances)):
        # covar = np.linalg.pinv(np.matmul(covar, covar.T))  # <<<< note this! I'm commenting it out!
        v, w = np.linalg.eigh(covar)
        v = 2. * np.sqrt(2.) * np.sqrt(np.abs(v))  # ellipse axis lengths
        u = w[:, 0]

        # Plot an ellipse to show the Gaussian component
        angle = np.arctan(u[1] / u[0])
        angle = 180. * angle / np.pi  # convert to degrees
        ell = mpl.patches.Ellipse(mean, v[0], v[1], angle=180. + angle, color='green')
        ell.set_clip_box(plt.gca().bbox)
        ell.set_alpha(0.5)
        plt.gca().add_artist(ell)

    # Plot the data (and its shifted copies)
    plt.scatter(*X.T, .2, color='black')
    plt.scatter(*(X + np.array([1, 0])).T, .2, color='blue')
    plt.scatter(*(X + np.array([0, 1])).T, .2, color='blue')
    plt.scatter(*(X + np.array([-1, 0])).T, .2, color='blue')
    plt.scatter(*(X + np.array([0, -1])).T, .2, color='blue')
    plt.xlim([-1, 2])
    plt.ylim([-1, 2])
    plt.show()

The plot generated when passing in the means and covariance parameters with plot_results(X, model2.layers[1].get_weights()[0], model2.layers[1].get_weights()[1]) yields a very good result!

[image: ellipses plotted from the raw covar matrices, overlaid on the data]

However, covar IS NOT the covariance matrix that defines the Gaussian, as it should be. The correct way of doing this is to uncomment the line that I point out in the code. When I do that, things go wrong!

Also, the visualisations using vis_utils look like this:

[images: vis_utils visualisations]

which is nonsense. I wonder what I'm doing wrong, or if the visualisation utils are somehow mistaken.

High probability density where there is no data

Hi @larajuse.

I've been trying to use optuna to optimise the hyperparameters of a simple pipeline in which the bare DMAE Keras layer (no autoencoder) is used on this dataset (2 clusters with periodic boundary conditions):

[image: the dataset, two clusters on the unit square with periodic boundary conditions]
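The generation code is not shown here; a hypothetical reconstruction of a comparable dataset (values and names are illustrative only) could look like this:

import numpy as np

# Two Gaussian blobs wrapped onto the unit square, so one cluster crosses
# the periodic boundary (hypothetical reconstruction, not the original data).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.07, size=(500, 2))  # sits on the corner
blob_b = rng.normal(loc=(0.5, 0.5), scale=0.07, size=(500, 2))  # sits in the middle
X = np.mod(np.vstack([blob_a, blob_b]), 1.0).astype("float32")  # wrap into [0, 1)^2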

Also, the best parameters found through optuna were:

{'alpha': 95.11757184163116,
 'batch_size': 32,
 'epochs': 64,
 'lr': 7.009510754706714e-05}
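The code below uses these values as plain variables. A minimal setup would be the following; note that n_clusters is an assumption (the dataset has two clusters) and is not part of the optuna output:

alpha = 95.11757184163116
batch_size = 32
epochs = 64
lr = 7.009510754706714e-05
n_clusters = 2  # assumed: two clusters in the dataset above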

The rest of the code is the following:

import itertools
import tensorflow as tf
# DMAE and vis_utils come from this repository; X is the dataset shown above.

tf.random.set_seed(0)

def toroidal_dis(x_i, Y, interval=tf.constant((1.0, 1.0))):
    d = tf.reduce_sum((x_i-Y)**2, axis=1)
    for val in itertools.product([0.0, 1.0, -1.0], repeat=2):
        delta = tf.constant(val)*interval
        d = tf.minimum(tf.reduce_sum((x_i-Y+delta)**2, axis=1), d)
    return d

def toroidal_pairwise(X, Y, interval=tf.constant((1.0, 1.0))):
    func = lambda x_i: toroidal_dis(x_i, Y, interval)
    Z = tf.vectorized_map(func, X)
    return Z

def toroidal_loss(X, mu_tilde, interval=tf.constant((1.0, 1.0))):
    d = tf.reduce_sum((X-mu_tilde)**2, axis=1)
    for val in itertools.product([0.0, 1.0, -1.0], repeat=2):
        delta = tf.constant(val)*interval
        d = tf.minimum(tf.reduce_sum((X-mu_tilde+delta)**2, axis=1), d)
    return d

interval = tf.constant((1.0, 1.0))
dis = lambda X, Y: toroidal_pairwise(X, Y, interval)
dmae_loss = lambda X, mu_tilde: toroidal_loss(X, mu_tilde, interval)

inp = tf.keras.layers.Input(shape=(2, ))
# DMM layer
theta_tilde = DMAE.Layers.DissimilarityMixtureAutoencoder(alpha=alpha, n_clusters=n_clusters,
                                                          initializers={"centers": DMAE.Initializers.InitPlusPlus(X, n_clusters, dis, 1),
                                                                        "mixers": tf.keras.initializers.Constant(1.0)},
                                                          trainable = {"centers": True, "mixers": False},
                                                          dissimilarity=dis)(inp)
# DMAE model
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
model = tf.keras.Model(inputs=[inp], outputs=theta_tilde)
model.compile(loss=dmae_loss, optimizer=tf.optimizers.Adam(lr=lr))
history = model.fit(X, X, epochs=epochs, batch_size=batch_size, callbacks=[callback], verbose=0)
loss_circular = history.history['loss'][-1]

def toroidal_mahalanobis(x_i, Y, cov, interval=tf.constant((1.0, 1.0))):
    d = tf.reduce_sum((x_i-Y)**2, axis=1)
    for val in itertools.product([0.0, 1.0, -1.0], repeat=2):
        delta = tf.constant(val)*interval
        diff = tf.expand_dims(x_i-Y+delta, axis=-1)
        d = tf.minimum(tf.squeeze(tf.reduce_sum(tf.matmul(cov, diff)*diff, axis=1)), d)
    return d

def toroidal_mahalanobis_pairwise(X, Y, cov, interval=tf.constant((1.0, 1.0))):
    func = lambda x_i: toroidal_mahalanobis(x_i, Y, cov, interval)
    Z = tf.vectorized_map(func, X)
    return Z

def toroidal_mahalanobis_loss(X, mu_tilde, Cov_tilde, interval=tf.constant((1.0, 1.0))):
    d = tf.reduce_sum((X-mu_tilde)**2, axis=1)
    for val in itertools.product([0.0, 1.0, -1.0], repeat=2):
        delta = tf.constant(val)*interval
        diff = tf.expand_dims(X-mu_tilde+delta, axis=1)
        d = tf.minimum(tf.squeeze(tf.matmul(tf.matmul(diff, Cov_tilde), tf.transpose(diff, perm = [0, 2, 1]))), d)
    return d

interval = tf.constant((1.0, 1.0))
dis = lambda X, Y, cov: toroidal_mahalanobis_pairwise(X, Y, cov, interval)
dmae_loss = lambda X, mu_tilde, Cov_tilde: toroidal_mahalanobis_loss(X, mu_tilde, Cov_tilde, interval)

inp = tf.keras.layers.Input(shape=(2, ))
# DMM layer
theta_tilde = DMAE.Layers.DissimilarityMixtureAutoencoderCov(alpha=alpha, n_clusters=n_clusters,
                                                             initializers={"centers": tf.keras.initializers.RandomUniform(0, 1),
                                                                           "cov": DMAE.Initializers.InitIdentityCov(X, n_clusters),
                                                                           "mixers": tf.keras.initializers.Constant(1.0)},
                                                             trainable = {"centers": True, "mixers": False, "cov": True},
                                                             dissimilarity=dis)(inp)
# DMAE model
model2 = tf.keras.Model(inputs=[inp], outputs=theta_tilde)

# The covariance loss takes (X, mu_tilde, Cov_tilde), so it is attached with add_loss
# instead of being passed to compile.
loss = dmae_loss(inp, *theta_tilde)
model2.add_loss(loss)
model2.compile(optimizer=tf.optimizers.Adam(lr=lr))

# Warm-start the covariance model with the centers learnt by the first (Euclidean) model.
init_means = model.layers[-1].get_weights()[0]
original_params = model2.layers[1].get_weights()
model2.layers[1].set_weights([init_means, *original_params[1:]])

history = model2.fit(X, epochs=epochs, batch_size=batch_size, callbacks=[callback])

# Encoder that outputs the soft cluster assignments, reusing the trained parameters.
inp = tf.keras.layers.Input(shape=(2,))
assigns = DMAE.Layers.DissimilarityMixtureEncoderCov(alpha=alpha, n_clusters=n_clusters,
                                                     dissimilarity=dis,
                                                     trainable={"centers": False, "mixers": False, "cov": False})(inp)
DMAE_encoder = tf.keras.Model(inputs=[inp], outputs=[assigns])
DMAE_encoder.layers[-1].set_weights(model2.layers[1].get_weights())

fig, ax = vis_utils.visualize_distribution(model2, dmae_loss, 50, X, figsize=(15, 15), cov=True)
ax.set_xlim([0, 1])
ax.set_ylim([0, 1])
plt.show()

fig, ax = vis_utils.visualize_probas(DMAE_encoder, X, n_clusters, rows=1, cols=2, figsize=(20, 8))
for axi in ax:
    axi.set_xlim([0, 1])
    axi.set_ylim([0, 1])
plt.show()

[images: visualize_distribution and visualize_probas outputs]

The resulting figures show that the model wrongly learnt a high probability density in a region where there is no data. I wonder why that would be...

It seems that the only thing DMAE worries about is getting the points inside the estimated density of each component, so nothing penalises density placed where there is no data. Maybe an approach like negative sampling could fix this issue? I will try embedding the DMAE layer into a deep autoencoder to see if things get any better (though I doubt that an autoencoder will help).
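To make the negative-sampling idea concrete, here is a rough, hypothetical sketch (not part of DMAE; the function name and the margin heuristic are made up): uniformly sampled background points that lie far from the real data should not be close to any cluster center.

import tensorflow as tf

def negative_sampling_penalty(X, centers, n_neg=256, margin=0.1):
    """Hypothetical extra loss term: discourage centers that sit in empty regions.

    X:       (N, 2) tensor with the real data points.
    centers: (K, 2) tensor with the cluster centers.
    """
    X = tf.cast(X, tf.float32)
    centers = tf.cast(centers, tf.float32)
    # Uniform "negative" samples on the unit square.
    neg = tf.random.uniform((n_neg, 2))
    # Squared distance from each negative sample to its nearest data point...
    d_data = tf.reduce_min(
        tf.reduce_sum((neg[:, None, :] - X[None, :, :])**2, axis=-1), axis=1)
    # ...and to its nearest center.
    d_cent = tf.reduce_min(
        tf.reduce_sum((neg[:, None, :] - centers[None, :, :])**2, axis=-1), axis=1)
    # Negatives that are far from all data but close to some center add a penalty.
    empty = tf.cast(d_data > margin, tf.float32)
    return tf.reduce_mean(empty * tf.nn.relu(margin - d_cent))

In principle this term could be attached to model2 with add_loss next to the reconstruction loss; whether it would actually remove the spurious density is an open question.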
