Term frequency estimates the probability of finding a word in a document. For example, to estimate the probability of encountering the word wi in document dj, we divide the number of occurrences of wi in dj by the total number of words in dj.
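This relative-frequency definition can be sketched in a few lines of Python (a minimal illustration using whitespace tokenization; real pipelines would use a proper tokenizer):

```python
from collections import Counter

def term_frequency(word, document):
    """Relative frequency of `word` among the tokens of `document`."""
    tokens = document.lower().split()
    counts = Counter(tokens)
    return counts[word] / len(tokens)

doc = "the cat sat on the mat"
print(term_frequency("the", doc))  # "the" occurs 2 times out of 6 tokens
```

Here "the" appears in 2 of the 6 tokens, so its term frequency is 2/6.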


Text analysis and Big Data

In addition to these three major architectures, neural machine translation has seen many improvements over the past few years. Some notable developments are presented below.

Sequence to Sequence Learning with Neural Networks demonstrated the effectiveness of LSTMs for neural machine translation. The paper presents a general approach to sequence learning that makes minimal assumptions about the structure of the sequence: one multilayer LSTM maps the input sequence to a fixed-length vector, and a second LSTM decodes the target sequence from that vector.

Neural Machine Translation by Jointly Learning to Align and Translate introduced the attention mechanism to NMT (which will be covered in the next part). Recognizing that the fixed-length vector is a bottleneck for NMT performance, the authors let the model automatically search for the parts of the source sentence that are relevant to predicting each target word, without having to form those parts explicitly.
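The core of the attention idea can be sketched in plain Python. This is a simplified illustration using dot-product scoring (the cited paper uses a learned additive scoring function, and real systems operate on learned vectors, not hand-written lists): for each encoder state, compute a relevance score against the current decoder state, normalize the scores with a softmax, and take the weighted sum as the context vector.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the decoder
    state, normalize with softmax, return (weights, weighted-sum context)."""
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(len(encoder_states[0]))]
    return weights, context

# Toy example: the first encoder state aligns with the decoder state,
# so it receives the larger attention weight.
weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The key point is that the context vector is recomputed at every decoding step, so the model is no longer forced to squeeze the whole source sentence into one fixed-length vector.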

Convolutional over Recurrent Encoder for Neural Machine Translation augments the standard RNN encoder in NMT with an additional convolutional layer to capture wider context in the encoder output.

Google built its own NMT system, Google's Neural Machine Translation (GNMT), to address the challenges of accuracy and ease of use. The model is a deep LSTM network with 8 encoder and 8 decoder layers, and it uses both residual connections between layers and attention connections from the decoder network to the encoder network.
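The residual connections mentioned above follow the standard pattern of adding each layer's input to its output, which eases gradient flow through deep stacks. A minimal sketch (the layer here is a toy stand-in, not a real LSTM):

```python
def toy_layer(x):
    # Stand-in for a real LSTM layer; here just halves each value.
    return [v * 0.5 for v in x]

def residual_stack(x, layers):
    """Apply each layer and add its input back to its output
    (the residual-connection pattern used in deep encoder stacks)."""
    for layer in layers:
        x = [out + inp for out, inp in zip(layer(x), x)]
    return x

print(residual_stack([1.0, 2.0], [toy_layer]))  # [1.5, 3.0]
```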

Instead of recurrent neural networks, Facebook AI researchers used convolutional neural networks for sequence-to-sequence learning tasks in NMT.