Applying Context Aware Spell Checking in Spark NLP

Context-aware spell checking in Spark NLP is scalable, extensible, and highly accurate. You can also extend it with your own training to add support for more languages or specific domains.

Blogpost https://medium.com/spark-nlp/applying-context-aware-spell-checking-in-spark-nlp-3c29c46963bc
GitHub https://github.com/JohnSnowLabs/spark-nlp

JohnSnowLabs/spark-nlp: State of the Art Natural Language Processing – GitHub
Spark NLP: State of the Art Natural Language Processing. Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 200+ pretrained pipelines and models in 45+ languages.
github.com

Models for Spark NLP https://nlp.johnsnowlabs.com/docs/en/models

Models – Spark NLP
High Performance NLP with Apache Spark. Offline: if you have any trouble using online pipelines or models in your environment (maybe it's air-gapped), you can directly download them for offline use. After downloading offline models/pipelines and extracting them, here is how you can use them inside your code (the path could be a shared storage like HDFS in a cluster):
nlp.johnsnowlabs.com
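
The excerpt above stops before its code, so here is a minimal sketch of what loading an extracted offline model might look like for the spell checker discussed in this post. It assumes Spark NLP and PySpark are installed; the function name `load_offline_spell_checker`, the column names, and the `model_dir` argument are illustrative choices, not something taken from the linked page.

```python
def load_offline_spell_checker(model_dir):
    """Build a small pipeline around a spell checker model loaded from disk.

    model_dir is a hypothetical path to an extracted model directory; it can
    be local or on shared storage such as HDFS.
    """
    # Imported lazily so that defining this function does not require Spark.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer, ContextSpellCheckerModel

    # Turn raw text into Spark NLP documents, then into tokens.
    document = DocumentAssembler().setInputCol("text").setOutputCol("document")
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")

    # load() reads the extracted directory instead of downloading the model.
    spell_checker = (
        ContextSpellCheckerModel.load(model_dir)
        .setInputCols(["token"])
        .setOutputCol("checked")
    )
    return Pipeline(stages=[document, tokenizer, spell_checker])
```

A Spark session has to exist before the function is called, for example one created with `sparknlp.start()`. The returned pipeline can then be fitted on a DataFrame with a `text` column and used to transform it, with the corrected tokens ending up in the `checked` column.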

Spark NLP in action https://www.johnsnowlabs.com/spark-nlp-in-action

Spark NLP in Action | John Snow Labs
Recognize Persons, Locations, Organizations and Misc entities using out of the box pretrained Deep Learning models based on GloVe (glove_100d) and BERT (ner_dl_bert) word embeddings.
www.johnsnowlabs.com

On accuracy: the pre-trained contextual spell checker model achieves a word error rate of 8.09% for fully automatic correction on the Holbrook benchmark, compared to the 20.24% that JamSpell attains on the same benchmark.
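
To put those two figures side by side, here is a small sketch in plain Python. It assumes one simple reading of word error rate, namely the share of tokens in the corrected output that still differ from the gold tokens under a one-to-one alignment; the benchmark's exact scoring procedure is not described here, and the toy sentence is made up, not Holbrook data.

```python
def word_error_rate(predicted, gold):
    """Share of positions where the predicted token differs from the gold token."""
    if len(predicted) != len(gold):
        raise ValueError("token sequences must be aligned one-to-one")
    if not gold:
        return 0.0
    errors = sum(p != g for p, g in zip(predicted, gold))
    return errors / len(gold)


def relative_reduction(baseline_rate, new_rate):
    """Fraction of the baseline error that the new system removes."""
    return (baseline_rate - new_rate) / baseline_rate


# Toy, made-up sentence (not benchmark data): one of five tokens is still wrong.
gold = "I went to the shop".split()
predicted = "I went too the shop".split()
toy_rate = word_error_rate(predicted, gold)

# Figures quoted in the text: 20.24% (JamSpell) vs 8.09% (Spark NLP).
reduction = relative_reduction(20.24, 8.09)
print(f"toy WER: {toy_rate:.2f}, relative reduction: {reduction:.1%}")
```

Under this reading, going from 20.24% to 8.09% removes roughly 60% of the remaining word errors.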

By Philip Vollet https://www.linkedin.com/posts/philipvollet_opensource-linkedin-artificialintelligence-activity-6695043270982139904-XruJ

