In recent years, natural language processing (NLP) has seen rapid growth in quality and usability, which has helped drive business adoption of artificial intelligence (AI) solutions. Researchers have been applying newer deep learning methods to NLP, and data scientists have started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms that use language models pretrained on large text corpora.

This repository contains examples and best practices for building NLP systems, provided as Jupyter notebooks and utility functions. It focuses on state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on problems involving text and language. The goal is to build a comprehensive set of tools and examples that leverage recent advances in NLP algorithms, neural architectures, and distributed machine learning systems.
Features
- We aim to have end-to-end examples of common tasks and scenarios such as text classification and named entity recognition
- Text Analytics is a set of pre-trained REST APIs
- QnA Maker is a cloud-based API service that lets you create a conversational question-and-answer layer
- Language Understanding is a SaaS offering for training and deploying a model as a REST API
- The target audience for this repository includes data scientists and machine learning engineers
- The repository aims to expand NLP capabilities along three separate dimensions
- We aim to support multiple models for each of the supported scenarios