
transformerx's People

Contributors

danielemurgolo · sigma1326 · soran-ghaderi · valanm22


transformerx's Issues

AddNorm advanced features

  • Normalization options: batch normalization, instance normalization, and layer normalization.
  • A configurable epsilon value for numerical stability in the normalization layer.
  • Residual connection with dropout.
  • Different activation functions.
  • Regularization of the kernel and bias weights.
  • Normalization techniques that help stabilize training and improve the performance of deep neural networks.
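
A minimal sketch of what a configurable AddNorm could look like, assuming a TensorFlow/Keras implementation (the constructor arguments and supported options shown here are illustrative, not the current API):

    import tensorflow as tf

    class AddNorm(tf.keras.layers.Layer):
        """Residual connection with dropout, followed by normalization (sketch)."""

        def __init__(self, norm_type="layer", epsilon=1e-6, dropout_rate=0.1, **kwargs):
            super().__init__(**kwargs)
            self.dropout = tf.keras.layers.Dropout(dropout_rate)
            if norm_type == "layer":
                self.norm = tf.keras.layers.LayerNormalization(epsilon=epsilon)
            elif norm_type == "batch":
                self.norm = tf.keras.layers.BatchNormalization(epsilon=epsilon)
            else:
                # Instance normalization could be added here (e.g. via tensorflow_addons).
                raise ValueError(f"Unsupported norm_type: {norm_type}")

        def call(self, x, sublayer_output, training=None):
            # Apply dropout to the sublayer output, add the residual input, then normalize.
            return self.norm(x + self.dropout(sublayer_output, training=training))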

New residual and residual gate layers

A list of new residual and residual gate layers to be added

To contribute, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number). Alternatively, hover over a subtask and click “Convert to issue”.
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link
  • Residual
  • GRU Gating
  • Inverted Residual Block
  • Bottleneck Residual Block
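
As a starting point, a minimal sketch of the plain Residual wrapper from this list, assuming a TensorFlow/Keras implementation (the class name and signature are illustrative):

    import tensorflow as tf

    class Residual(tf.keras.layers.Layer):
        """Wraps a sublayer and adds its input to its output (sketch)."""

        def __init__(self, sublayer, **kwargs):
            super().__init__(**kwargs)
            self.sublayer = sublayer

        def call(self, x, *args, **kwargs):
            # y = x + f(x): the skip connection lets gradients bypass the sublayer.
            return x + self.sublayer(x, *args, **kwargs)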

Fix bug in `test_main` module

When running the test_main module, I encountered this error:

    test_main.py:42: in <module>
        encoder_blk = TransformerEncoderBlock(24, 24, 24, 24, norm_shape, 48, 8, 0.5)
    E   TypeError: TransformerEncoderBlock.__init__() takes from 6 to 7 positional arguments but 9 were given

It seems invalid and needs to be fixed.

Readme typo

In the roadmap, JAX is misspelled as JAZ!

Refactor softmax_attention

Separate the masked_softmax and sequence_mask functions. Also, add support for calibration using temperature scaling.
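
A minimal sketch of what temperature scaling could look like here, assuming a TensorFlow implementation (the function name and signature are illustrative):

    import tensorflow as tf

    def temperature_scaled_softmax(logits, temperature=1.0, axis=-1):
        # A temperature > 1 flattens the distribution (useful for calibration);
        # a temperature < 1 sharpens it; temperature == 1 recovers plain softmax.
        return tf.nn.softmax(logits / temperature, axis=axis)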

New attention masking layers

Description: We need to implement several new attention masking layers in our model to improve its performance on specific tasks. The following masking layers need to be implemented:

  • Global
  • Block local
  • Band
  • #87
  • Random
  • Compound
  • Axial

It is important to carefully consider the design and implementation of these masking layers to ensure they are effective and efficient.

Deadline for each layer: 2 weeks after opening the issue. After the deadline, the opened issue will be closed to make the subtask available for other contributors.
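
As an example of the kind of layer this covers, here is a minimal sketch of a band (local) attention mask, assuming a TensorFlow implementation (names are illustrative):

    import tensorflow as tf

    def band_mask(seq_len, num_lower, num_upper):
        # Keep only positions within `num_lower` steps behind and `num_upper` steps
        # ahead of each query position; everything else is masked out.
        ones = tf.ones((seq_len, seq_len))
        band = tf.linalg.band_part(ones, num_lower, num_upper)
        return tf.cast(band, tf.bool)

    # Example: a causal mask restricted to a window of the 3 previous tokens.
    mask = band_mask(seq_len=6, num_lower=3, num_upper=0)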

Fix `__call__` method overriding in layer package

Most of the base classes in the layers package define a method like def call(self, X, state, **kwargs):. These classes are designed to be callable; however, the current syntax is incorrect and needs to be fixed.
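
The right fix depends on the base class: if these layers subclass tf.keras.layers.Layer, implementing call() is already the supported pattern (the base Layer.__call__ dispatches to it); for a plain Python base class, an instance becomes callable only by defining __call__. A minimal sketch of the plain-Python variant (class and argument names are illustrative):

    class BaseLayer:
        """Plain Python base class made callable by delegating __call__ to call()."""

        def __call__(self, X, state, **kwargs):
            # Make instances callable; subclasses only override call().
            return self.call(X, state, **kwargs)

        def call(self, X, state, **kwargs):
            raise NotImplementedError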

Documentation

Document the layers API

For contributing, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number).
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link
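
Since the project asks for NumPy-style docstrings (see the contribution notes under “New attention layers” below), a documented layer could look roughly like this; the class, parameter, and example are illustrative rather than the current API:

    class DotProductAttention:
        """Scaled dot-product attention layer.

        Parameters
        ----------
        dropout_rate : float, optional
            Fraction of the attention weights to drop during training.

        Examples
        --------
        >>> attention = DotProductAttention(dropout_rate=0.1)
        """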

pyproject.toml license file

The license field in pyproject.toml should be derived from the project's license file rather than being specified manually.

PositionWiseFFN advanced features

The current PositionWiseFFN class is a simple implementation of a feed-forward neural network that operates on the feature dimension of the input tensor. Here are more advanced options and features to improve or extend the implementation:

  • activation functions
  • initialization
  • non-linear projection
  • contextualized embeddings
  • dropout
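
A minimal sketch of a PositionWiseFFN with a configurable activation, kernel initializer, and dropout, assuming TensorFlow/Keras (the constructor arguments are illustrative, not the current class's API):

    import tensorflow as tf

    class PositionWiseFFN(tf.keras.layers.Layer):
        """Two dense layers applied independently at every position (sketch)."""

        def __init__(self, hidden_dim, output_dim, activation="relu",
                     kernel_initializer="glorot_uniform", dropout_rate=0.0, **kwargs):
            super().__init__(**kwargs)
            self.dense1 = tf.keras.layers.Dense(
                hidden_dim, activation=activation,
                kernel_initializer=kernel_initializer)
            self.dropout = tf.keras.layers.Dropout(dropout_rate)
            self.dense2 = tf.keras.layers.Dense(
                output_dim, kernel_initializer=kernel_initializer)

        def call(self, x, training=None):
            # Operates on the last (feature) dimension of x.
            return self.dense2(self.dropout(self.dense1(x), training=training))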

Test cases for the layers

Test cases for the following layers:


For contributing to the project, please follow these steps:

  1. First, check if the subtask you want to work on is available by looking at the open issues in the repository. If it is already assigned to someone else or has been closed, please choose a different subtask.
  2. Once you have selected a subtask, create a new issue with the number of the issue as its title (e.g. "#1234").
  3. Copy the link to your newly created issue and paste it into the comments section below.
  4. Next, fork the repository and make the necessary changes to complete the subtask.
  5. Once you are satisfied with your changes, create a pull request and mention the issue link in the description so that your changes can be reviewed and merged.
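
A minimal sketch of what a Pytest case for one of these layers could look like, assuming a TensorFlow/Keras layer (the import path, constructor arguments, and expected shape are illustrative):

    import numpy as np
    import tensorflow as tf

    from transformerx.layers import PositionWiseFFN  # illustrative import path

    def test_positionwise_ffn_output_shape():
        # The layer should preserve the batch and sequence dimensions and
        # project the feature dimension to the configured output size.
        layer = PositionWiseFFN(hidden_dim=32, output_dim=16)  # illustrative args
        x = tf.random.uniform((4, 10, 8))
        y = layer(x)
        assert y.shape == (4, 10, 16)
        assert not np.any(np.isnan(y.numpy()))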

Refactor implementation - AbsolutePositionalEncoding class

  • The sin and cos functions are being called from both NumPy and TensorFlow. This can cause errors if the types are not properly converted between the two libraries; it is better to use only the tf.sin and tf.cos functions. Fewer conversions also reduce time and memory overhead.

  • The P matrix is created using NumPy and then converted to a TensorFlow tensor using tf.convert_to_tensor. This is unnecessary, as the P matrix can be created directly as a TensorFlow tensor using tf.sin and tf.cos.

  • The tf.keras.layers.Dropout layer is used to apply dropout to the entire X tensor, including the positional encoding. This is not ideal, as the positional encoding is meant to be fixed and not subject to dropout. Instead, the dropout layer should be applied only to the input features, as in the sketch below.
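
A minimal sketch of this refactor, building P directly with TensorFlow ops and applying dropout only to the input features (the class and argument names are illustrative, not the current implementation):

    import tensorflow as tf

    class AbsolutePositionalEncoding(tf.keras.layers.Layer):
        """Sinusoidal positional encoding built entirely with TensorFlow ops (sketch)."""

        def __init__(self, depth, dropout_rate=0.0, max_len=1000, **kwargs):
            super().__init__(**kwargs)
            self.dropout = tf.keras.layers.Dropout(dropout_rate)
            positions = tf.range(max_len, dtype=tf.float32)[:, tf.newaxis]
            dims = tf.range(0, depth, 2, dtype=tf.float32)[tf.newaxis, :]
            angles = positions / tf.pow(10000.0, dims / depth)
            # Concatenate the sin and cos halves (an interleaved layout is also common).
            P = tf.concat([tf.sin(angles), tf.cos(angles)], axis=-1)
            self.P = P[tf.newaxis, ...]  # shape: (1, max_len, depth)

        def call(self, X, training=None):
            # Dropout is applied to the input features only, not to the fixed encoding.
            X = self.dropout(X, training=training)
            return X + self.P[:, :tf.shape(X)[1], :]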

AddNorm

Test cases for this layer.

Handle input arguments and raise exceptions

Handle the inputs and raise exceptions in case inappropriate data is provided (see the sketch after the layer list below).

For contributing, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number).
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link
  • #34
  • #37
  • DotProductAttention
  • TransformerEncoder
  • TransformerEncoderBlock
  • TransformerDecoder
  • TransformerDecoderBlock
  • PositionalEncoding
  • PositionwiseFFN
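
A minimal sketch of the kind of validation this asks for, using an illustrative layer and parameter (not the current API):

    import tensorflow as tf

    class DotProductAttention(tf.keras.layers.Layer):
        """Illustrative argument checking for a layer constructor and call."""

        def __init__(self, dropout_rate=0.0, **kwargs):
            super().__init__(**kwargs)
            if not 0.0 <= dropout_rate < 1.0:
                raise ValueError(
                    f"dropout_rate must be in [0, 1), got {dropout_rate}")
            self.dropout_rate = dropout_rate

        def call(self, queries, keys, values):
            if queries.shape.rank != 3:
                raise ValueError(
                    f"queries must be a rank-3 tensor, got rank {queries.shape.rank}")
            # ... the attention computation would follow here ...
            return values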

New feedforward layers

A list of new feedforward layers to be added

To contribute, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number). Alternatively, hover over a subtask and click “Convert to issue”.
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link
  • GLU
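
For reference, a minimal sketch of a GLU (gated linear unit) layer, assuming TensorFlow/Keras (the class name and signature are illustrative):

    import tensorflow as tf

    class GLU(tf.keras.layers.Layer):
        """Gated linear unit: GLU(x) = (x W + b) * sigmoid(x V + c) (sketch)."""

        def __init__(self, units, **kwargs):
            super().__init__(**kwargs)
            self.linear = tf.keras.layers.Dense(units)
            self.gate = tf.keras.layers.Dense(units, activation="sigmoid")

        def call(self, x):
            # The sigmoid gate controls how much of the linear projection passes through.
            return self.linear(x) * self.gate(x)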

New embedding layers

A list of new embedding layers to be added

To contribute, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number). Alternatively, hover over a subtask and click “Convert to issue”.
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link

Embedding:

  • TokenEmbedding

Positional embedding:

  • #47
  • Fixed Positional Embedding
  • Relative Positional Embedding
  • Dynamic Position Bias
  • ALiBi Positional Bias https://arxiv.org/pdf/2108.12409v2.pdf (a sketch follows this list)
  • Learned Alibi PositionalBias
  • Rotary Embedding
  • Conditional Positional Encoding
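
A minimal sketch of an ALiBi-style positional bias, per the paper linked above, assuming TensorFlow (the slope schedule and function name are illustrative simplifications):

    import tensorflow as tf

    def alibi_bias(seq_len, num_heads):
        # ALiBi adds a head-specific linear penalty proportional to the query-key
        # distance; the per-head slopes follow a geometric sequence.
        slopes = tf.constant(
            [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)],
            dtype=tf.float32)                                     # (num_heads,)
        positions = tf.range(seq_len, dtype=tf.float32)
        distance = positions[tf.newaxis, :] - positions[:, tf.newaxis]  # (L, L)
        # Bias of shape (num_heads, L, L), added to attention scores before softmax.
        return slopes[:, tf.newaxis, tf.newaxis] * -tf.abs(distance)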

Please refer to the following useful links:

New attention layers

A list of new attention layers to be added

You can pick any of the following frameworks and implement your layers:
  • TensorFlow
  • PyTorch
  • JAX
  • NumPy

To contribute, please:

  1. Create a new issue: copy an available subtask name (one that is not closed and not already opened by someone else) and paste it into the title, followed by a reference to this task (e.g. subtask_name #source_issue_number). Alternatively, hover over a subtask and click “Convert to issue”.
  2. Copy and paste your new issue link here in the comments.
  3. Fork the repository
  4. Add your changes
  5. Create a pull request and mention your issue link

Please note that we use NumPy style for the docstrings and provide examples as much as possible!
Also, please try to provide unit tests using Pytest.

Whenever you need help with the implementation or any other related issues, please reach out via the Discussions page or the Discord community server: @soran-ghaderi or @sigma1326.

  • Strided Attention
  • Fixed Factorized Attention
  • Additive Attention
  • RAN
  • RAM
  • STN
  • Temporal Attention
  • Channel Attention
  • Axial Attention
  • Sliding Window Attention
  • Global And Sliding Window Attention
  • Dilated Sliding Window Attention
  • Dynamic Convolution
  • Content-Based Attention
  • Global-Local Attention
  • Attention Gate
  • Class Attention
  • Location-Based Attention
  • Channel-Wise Soft Attention
  • FAVOR+
  • Disentangled Attention Mechanism
  • Location Sensitive Attention
  • LSH Attention
  • TAM
  • SRM
  • BAM
  • Set Transformer
  • Coordinate Attention
  • BigBird
  • Rendezvous
  • Adaptive Masking
  • DANet
  • Bi-Attention
  • RGA
  • SEAM
  • SPNet
  • DMA
  • GALA
  • Neighborhood Attention
  • Channel Squeeze And Spatial Excitation
  • GCT
  • Routing Attention
  • Cross-Covariance Attention
  • 3D SA
  • Sparse Sinkhorn Attention
  • Concurrent Spatial And Channel Squeeze And Excitation
  • Deformable ConvNets
  • SCA-CNN
  • Channel And Spatial Attention
  • Locally-Grouped Self-Attention
  • Class Activation Guided Attention Mechanism
  • Factorized Dense Synthesized Attention
  • HyperHyperNetwork
  • ProCAN
  • scSE
  • MHMA
  • Branch Attention
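
As an example of one entry from this list, a minimal sketch of Additive (Bahdanau-style) Attention, assuming TensorFlow/Keras (the class name and signature are illustrative):

    import tensorflow as tf

    class AdditiveAttention(tf.keras.layers.Layer):
        """Additive attention: score(q, k) = v^T tanh(W_q q + W_k k) (sketch)."""

        def __init__(self, hidden_units, **kwargs):
            super().__init__(**kwargs)
            self.W_q = tf.keras.layers.Dense(hidden_units, use_bias=False)
            self.W_k = tf.keras.layers.Dense(hidden_units, use_bias=False)
            self.v = tf.keras.layers.Dense(1, use_bias=False)

        def call(self, queries, keys, values):
            # queries: (batch, num_q, d_q); keys/values: (batch, num_kv, d_k / d_v)
            features = (tf.expand_dims(self.W_q(queries), 2)
                        + tf.expand_dims(self.W_k(keys), 1))
            scores = tf.squeeze(self.v(tf.tanh(features)), axis=-1)  # (batch, num_q, num_kv)
            weights = tf.nn.softmax(scores, axis=-1)
            return tf.matmul(weights, values)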
