Applying context-aware spell checking in Spark NLP, which is scalable, extensible, and highly accurate! You can also extend it with your own training to add support for more languages or specific domains.
|JohnSnowLabs/spark-nlp: State of the Art Natural Language Processing – GitHub
Spark NLP: State of the Art Natural Language Processing. Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 200+ pretrained pipelines and models in 45+ languages.
Models for Spark NLP https://nlp.johnsnowlabs.com/docs/en/models
|Models – Spark NLP
High Performance NLP with Apache Spark. Offline: if you have any trouble using online pipelines or models in your environment (maybe it's air-gapped), you can download them directly for offline use. After downloading the offline models/pipelines and extracting them, you can load them in your code from a path (the path could be shared storage such as HDFS in a cluster).
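A minimal sketch of loading an extracted pipeline from disk in Python. It assumes the `spark-nlp` and `pyspark` packages are installed and that the extracted folder is a saved Spark ML pipeline; the helper name `load_offline_pipeline` and the path in the usage comment are hypothetical, not taken from the Spark NLP docs.

```python
def load_offline_pipeline(path):
    """Load an extracted, pre-downloaded pipeline from local or shared storage.

    `path` is whatever folder the archive was extracted to (hypothetical here);
    it may be a local path or a shared location such as an HDFS URI.
    """
    # Imports are deferred so this file can be imported without Spark installed.
    import sparknlp
    from pyspark.ml import PipelineModel

    # Start (or reuse) a Spark session that has Spark NLP available.
    sparknlp.start()

    # A saved pipeline is a regular Spark ML PipelineModel on disk.
    return PipelineModel.load(path)


# Usage (hypothetical path, not executed here):
#   pipeline = load_offline_pipeline("hdfs:///models/my_extracted_pipeline")
#   result = pipeline.transform(some_dataframe_with_a_text_column)
```

No network access is needed at load time under these assumptions, which is the point of the offline route.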
Spark NLP in action https://www.johnsnowlabs.com/spark-nlp-in-action
|Spark NLP in Action | John Snow Labs
Recognize Persons, Locations, Organizations and Misc entities using out of the box pretrained Deep Learning models based on GloVe (glove_100d) and BERT (ner_dl_bert) word embeddings.
On accuracy: the pretrained contextual spell checker model delivers a word error rate of 8.09% for fully automatic correction on the Holbrook benchmark, compared to the 20.24% error rate that JamSpell attains on the same benchmark.
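To make the metric concrete, here is a small sketch of one way a word error rate can be computed. It assumes the corrected output and the gold text are tokenized identically, so words can be compared position by position; the actual Holbrook evaluation script may differ, and the function name and sample sentences are illustrative only.

```python
def word_error_rate(predicted, gold):
    """Fraction of word positions where the corrected output differs from gold.

    Assumes both sequences are aligned token-for-token (same length).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of words")
    if not gold:
        return 0.0
    wrong = sum(1 for p, g in zip(predicted, gold) if p != g)
    return wrong / len(gold)


# Illustrative sentences (not from the benchmark): one real-word error remains.
gold = "I have seen their new house".split()
pred = "I have seen there new house".split()
rate = word_error_rate(pred, gold)  # 1 wrong word out of 6
```

Under this definition, lower is better, and a rate of 8.09% means roughly 8 of every 100 words are still wrong after automatic correction.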
The Transformer is the most critical algorithmic innovation in the NLP field in recent years. It brings higher model accuracy but introduces more computation, so deploying Transformer-based online services efficiently is a major challenge. To make costly Transformer online services more efficient, WeChat AI open-sourced a Transformer inference acceleration tool called TurboTransformers, which has the following characteristics.
|An overview of gradient descent optimization algorithms
Gradient descent is the preferred way to optimize neural networks and many other machine learning algorithms but is often used as a black box. This post explores how many of the most popular gradient-based optimization algorithms such as Momentum, Adagrad, and Adam actually work.
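As a rough illustration of the update rules that post covers, the sketch below applies Momentum, Adagrad, and Adam to the toy objective f(x) = x². The objective, the hyperparameter values, and the function names are assumptions chosen for the example, not taken from the post.

```python
import math


def grad(x):
    # Gradient of the toy objective f(x) = x**2.
    return 2.0 * x


def momentum(x, lr=0.1, gamma=0.9, steps=200):
    # Keep a velocity that accumulates a decaying sum of past gradients.
    v = 0.0
    for _ in range(steps):
        v = gamma * v + lr * grad(x)
        x -= v
    return x


def adagrad(x, lr=0.5, eps=1e-8, steps=200):
    # Scale each step by the root of the accumulated squared gradients.
    g2_sum = 0.0
    for _ in range(steps):
        g = grad(x)
        g2_sum += g * g
        x -= lr * g / (math.sqrt(g2_sum) + eps)
    return x


def adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    # Track decaying averages of the gradient (m) and squared gradient (v),
    # with bias correction for their zero initialization.
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

Starting from x = 5.0, all three move toward the minimum at 0; how quickly they get there depends heavily on the learning rate chosen for each.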