Stanza is a Python natural language analysis package: a collection of accurate and efficient tools for the linguistic analysis of many human languages. From raw text through syntactic analysis and entity recognition, it brings state-of-the-art NLP models to the languages of your choosing. Its tools can be combined in a pipeline that converts a string of human-language text into lists of sentences and words, generates the base forms of those words along with their parts of speech and morphological features, produces a syntactic dependency parse, and recognizes named entities. The toolkit is designed to work in parallel across more than 70 languages using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation on your own annotated data.
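
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not an official example: it assumes Stanza is installed, that the default English models can be downloaded, and that the default pipeline for the chosen language includes tagging, parsing, and entity recognition. The helper names (word_rows, entity_rows, analyze) are hypothetical and exist only for this sketch.

```python
def word_rows(doc):
    """Flatten a parsed document into (text, lemma, upos, head, deprel) tuples."""
    rows = []
    for sentence in doc.sentences:
        for word in sentence.words:
            # head is the 1-based index of the governing word; 0 marks the root
            rows.append((word.text, word.lemma, word.upos, word.head, word.deprel))
    return rows


def entity_rows(doc):
    """Collect (text, type) pairs for the named entities in the document."""
    return [(ent.text, ent.type) for ent in doc.ents]


def analyze(text, lang="en"):
    """Run Stanza's default pipeline for `lang` over `text`.

    Stanza is imported inside the function so the helpers above can be
    used without it. The first call downloads models and needs network access.
    """
    import stanza

    stanza.download(lang)        # fetch pretrained models (cached afterwards)
    nlp = stanza.Pipeline(lang)  # default processors for the language
    return nlp(text)
```

Calling analyze("Barack Obama was born in Hawaii.") and passing the result to word_rows and entity_rows would list one tuple per word and one per recognized entity; the exact labels depend on the models that are downloaded.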

Features

  • The modules are built on top of the PyTorch library
  • Stanza includes a Python interface to the CoreNLP Java package and inherits additional functionality from it, such as constituency parsing, coreference resolution, and linguistic pattern matching
  • Native Python implementation requiring minimal effort to set up
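  • Full neural network pipeline for robust text analytics, including tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition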
  • Pretrained neural models supporting 66 (human) languages
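
The CoreNLP interface listed above can be used along these lines. This is a hedged sketch: it assumes a local CoreNLP installation is available to Stanza (for example via the CORENLP_HOME environment variable) and that Java is installed. The annotator list, timeout, and memory values are illustrative choices, and the helper names are hypothetical.

```python
def corenlp_token_rows(annotation):
    """Flatten a CoreNLP annotation into (word, pos, ner) tuples."""
    return [
        (token.word, token.pos, token.ner)
        for sentence in annotation.sentence
        for token in sentence.token
    ]


def annotate_with_corenlp(text):
    """Start a CoreNLP server through Stanza's client and annotate `text`.

    The import is local so corenlp_token_rows works without Stanza installed.
    """
    from stanza.server import CoreNLPClient

    # The context manager starts the Java server and shuts it down on exit.
    with CoreNLPClient(
        annotators=["tokenize", "ssplit", "pos", "lemma", "ner"],
        timeout=30000,  # milliseconds
        memory="4G",
    ) as client:
        return client.annotate(text)
```

Annotators such as parse or coref could be appended to the list to reach the constituency parsing and coreference functionality, at the cost of more memory and a slower start-up.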

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Library Management Software, Python Languages Software, Python Neural Network Libraries, Python Natural Language Processing (NLP) Tool

Registered

2021-10-05