
✋ Hello World

Hi there! I'm moon, and I focus on solving problems with artificial intelligence. Of AI's many subfields, I find Natural Language Processing the most interesting, and I work as an NLP Machine Learning Engineer. My goal is to build models that can communicate naturally with people, and the code in my repositories tracks my progress toward that goal. Alongside the code, my paper reviews and personal research notes on AI techniques are recorded on my Notion page. If you would like to get in touch, please use the email address in the profile information on the left.



🤖 Model Architecture

Model architecture is a crucial element of machine learning engineering, and the choice of architecture can significantly impact performance. The projects below concentrate on model architecture and aim to establish reference structures for three NLG tasks: Translation, Dialogue Generation, and Summarization.

  • RNN Seq2Seq
  • RNN Seq2Seq with Attention
  • Transformer
  • Transformer Variants
  • Encoder Decoder Balance
  • PLM Fusion
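
As a concrete reference point, here is a minimal sketch of the simplest architecture in the list, a GRU-based Seq2Seq encoder-decoder in PyTorch. The class name, hyperparameters, and toy data are illustrative assumptions, not the structure used in the actual repositories.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal GRU encoder-decoder; hyperparameters are illustrative only."""
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.generator = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_ids, trg_ids):
        # Encode the source sequence into a final hidden state.
        _, hidden = self.encoder(self.embedding(src_ids))
        # Decode the (teacher-forced) target sequence from that state.
        dec_out, _ = self.decoder(self.embedding(trg_ids), hidden)
        return self.generator(dec_out)            # [batch, trg_len, vocab_size]

# Toy forward pass with random token ids.
model = Seq2Seq()
src = torch.randint(0, 1000, (2, 10))             # [batch, src_len]
trg = torch.randint(0, 1000, (2, 12))             # [batch, trg_len]
print(model(src, trg).shape)                      # torch.Size([2, 12, 1000])
```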



πŸƒβ€β™‚οΈ Training Strategy

In the typical training process of a Seq2Seq model for Natural Language Generation, the issue of 'Exposure Bias', a discrepancy between training and inference, inevitably arises. The ideal solution is to train the model on a large and diverse dataset, but in practice this is difficult. To overcome these constraints and improve training effectiveness, several training strategies are explored below. Among them, Auxiliary Training and Scheduled Sampling aim to make the most of GPU parallelism while enabling complementary learning. Generative Training and SeqGAN Training, on the other hand, are less efficient to train but serve as strategies for extracting maximum performance in severely data-restricted environments.

  • Auxiliary Training
  • Scheduled Sampling
  • Pre Training
  • Generative Training
  • SeqGAN
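
To make the Scheduled Sampling idea concrete, the sketch below shows a single decoding pass in which each step's input is either the gold token (teacher forcing) or the model's own previous prediction. The module names (`decoder_cell`, `generator`, `embedding`) and the sampling probability are placeholder assumptions rather than the repositories' actual interfaces.

```python
import random
import torch

def decode_with_scheduled_sampling(decoder_cell, generator, embedding,
                                   trg_ids, hidden, sampling_prob=0.25):
    """One decoding pass mixing teacher forcing with the model's own predictions.
    decoder_cell is assumed to be a recurrent module such as nn.GRU(batch_first=True)."""
    batch_size, trg_len = trg_ids.shape
    inp = trg_ids[:, 0]                       # start from the BOS token
    logits_per_step = []
    for t in range(1, trg_len):
        out, hidden = decoder_cell(embedding(inp).unsqueeze(1), hidden)
        step_logits = generator(out.squeeze(1))
        logits_per_step.append(step_logits)
        if random.random() < sampling_prob:
            inp = step_logits.argmax(dim=-1)   # feed the model's own prediction
        else:
            inp = trg_ids[:, t]                # feed the gold token
    return torch.stack(logits_per_step, dim=1)  # [batch, trg_len - 1, vocab]
```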



⏰ Toward Efficiency

Large-scale models with many parameters tend to deliver better performance, and much recent research focuses on training ever larger models on extensive datasets to achieve superior results. However, deploying such large-scale models in typical computing environments can be prohibitive. To address this issue, the projects below explore efficient approaches that maintain a reasonable level of performance while reducing computational demands.

  • Efficient Training
  • Efficient PreTrained Language Models
  • Param Efficient Fine-Tuning
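
As one example of parameter-efficient fine-tuning, the sketch below wraps a frozen linear layer with trainable low-rank adapters in the style of LoRA. It is a minimal illustration under assumed shapes and rank, not the exact implementation used in the repositories.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus trainable low-rank update (LoRA-style sketch)."""
    def __init__(self, base: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)       # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)           # start as a zero update
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

# Only the low-rank matrices (a tiny fraction of the parameters) receive gradients.
layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 8192 trainable vs. 262,656 frozen parameters
```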



🔄 Neural Machine Translation

Machine translation is the task of converting text in a source language into a target language by computer. The field was first dominated by rule-based systems, then by Statistical Machine Translation (SMT), and is now led by Neural Machine Translation (NMT), which uses neural networks to produce more accurate and natural translations. Below are experiments with various approaches toward this goal.

  • Back Translation
  • Multi-Lingual Translation
  • Code Translation
  • Machine Translation Blend
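
Back Translation, for instance, augments parallel data by running monolingual target-side text through a reverse-direction (target-to-source) model and pairing the synthetic source with the original, human-written target. The sketch below assumes a hypothetical `trg_to_src_model.translate` interface purely for illustration.

```python
def build_back_translation_pairs(trg_sentences, trg_to_src_model):
    """Create synthetic (source, target) pairs from monolingual target-side text.
    `trg_to_src_model.translate` is a hypothetical interface, not a real API."""
    synthetic_pairs = []
    for trg in trg_sentences:
        synthetic_src = trg_to_src_model.translate(trg)   # reverse-direction model
        synthetic_pairs.append((synthetic_src, trg))      # target side stays human-written
    return synthetic_pairs
```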



🗣️ Dialogue Generation

Dialogue Generation is the task of generating a response to a previous utterance, just as humans do in conversation. It is very difficult for a model to follow the flow of a conversation and return appropriate answers. Below is a set of experiments aimed at generating more natural, human-like responses.

  • Characteristic Dialogue
  • Utilize SimEnt
  • Multi-Turn Dialogue
  • Dialogue Generation Blend
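
For Multi-Turn Dialogue, one common and simple way to give the model conversational context is to concatenate the most recent turns with a separator token before encoding. The sketch below is a minimal illustration; the separator token and turn window are assumptions, not the repositories' actual preprocessing.

```python
def build_dialogue_input(history, sep_token="<sep>", max_turns=3):
    """Flatten the last few turns of a conversation into one model input,
    separated by a special token (token name and window size are illustrative)."""
    recent = history[-max_turns:]
    return f" {sep_token} ".join(recent)

history = [
    "Hi, how are you?",
    "Pretty good, just got back from a run.",
    "Nice! How far did you go?",
]
print(build_dialogue_input(history))
# Hi, how are you? <sep> Pretty good, just got back from a run. <sep> Nice! How far did you go?
```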



πŸ“ Abstract Text Summarization

The summarization task condenses long text into short sentences with neural networks, and it can be divided into extractive and abstractive approaches. Extractive summarization selects key sentences from the original text to form the summary, whereas abstractive summarization generates new summary sentences through the model's decoder. The experiments below mainly deal with abstractive summarization.

  β€’ Hierarchical Encoder           β€’ Sparse Attention           β€’ Summarization Blend

