
CH-2 Natural Language Processing Models and Algorithms
What is a Language Model?
• A language model in Natural Language Processing (NLP) is a statistical tool that helps understand and generate human language. It predicts the probability of a sequence of words, aiding various NLP tasks like text generation, translation, and sentiment analysis.
• Language models assign a probability to a sequence of words. This helps in predicting the next word in a sentence or the likelihood of a given sentence.
Types of Language Model:
• N-gram Models: These are simple statistical models that predict the next word in a sequence based on
the previous 'n-1' words.

• Neural Language Models: These use neural networks to predict word sequences and are more
powerful than traditional statistical models.

• Recurrent Neural Networks (RNNs): These models handle sequential data by maintaining a 'memory'
of previous words in the sequence.

• Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU): These are advanced types of RNNs
designed to better capture long-range dependencies.

• Transformers: These models, like BERT and GPT, use self-attention mechanisms to process the entire sequence of words simultaneously, leading to improved performance on many NLP tasks.
Applications of Language Models:
• Text Generation: Creating coherent and contextually relevant text.
• Machine Translation: Translating text from one language to another.
• Speech Recognition: Converting spoken language into text.
• Chatbots and Conversational Agents: Facilitating human-like interactions.
• Sentiment Analysis: Understanding and categorizing emotions and opinions in text.
• Autocompletion: Predicting the next words or phrases to assist in typing.
Examples of Language Models:
• GPT (Generative Pre-trained Transformer): Developed by OpenAI, models like GPT-3 and GPT-4 generate human-like text based on the input they receive.
• BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, BERT is designed to understand the context of words in a sentence by looking at the words that come before and after.
• T5 (Text-to-Text Transfer Transformer): Also developed by Google, T5 treats every NLP problem as a text-to-text problem, making it highly versatile.
Training Language Models:
• Language models are trained on large corpora of text data. During training,
they learn to capture linguistic patterns, grammar, context, and even world
knowledge. The training process involves:
• Tokenization: Breaking down text into individual words or subwords.
• Learning Weights: Adjusting parameters in the neural network to minimize
prediction errors.
• Fine-Tuning: Adjusting a pre-trained model to better perform on specific tasks.
N-Gram Language Models and Log Probabilities:
• We always represent and compute language model probabilities in log format, as log probabilities.
• Since probabilities are (by definition) less than or equal to 1, the more probabilities we multiply together, the smaller the product becomes. Multiplying enough n-grams together would result in numerical underflow.
• By using log probabilities instead of raw probabilities, we get numbers that are not as small. Adding in log space is equivalent to multiplying in linear space, so we combine log probabilities by adding them.
• The result of doing all computation and storage in log space is that we only need to convert back into probabilities if we need to report them at the end; then we can just take the exp of the log probability:
• p1 × p2 × p3 × p4 = exp(log p1 + log p2 + log p3 + log p4)
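A minimal Python sketch (with made-up probability values, purely for illustration) showing why summing log probabilities is preferred over multiplying raw probabilities:

```python
import math

# Hypothetical n-gram probabilities for a short sentence (illustrative values only).
probs = [0.2, 0.05, 0.1, 0.01]

# Multiplying raw probabilities shrinks quickly and can underflow for long sequences.
product = math.prod(probs)

# Summing log probabilities is numerically stable and equivalent.
log_sum = sum(math.log(p) for p in probs)

print(product)            # 1e-05
print(math.exp(log_sum))  # same value, recovered from log space
```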


What is Smoothing in NLP?
• In NLP, we have statistical models to perform tasks like auto-completion of sentences, where we use a probabilistic model. We predict the next words based on training data that contains complete sentences, so that the model can learn the patterns needed for prediction. Naturally, there is a huge number of possible word combinations, and it is next to impossible to include all of them in the training data so that the model can predict accurately on unseen data. This is where smoothing comes to the rescue.
Why do we need smoothing in NLP?
• To improve the accuracy of our model.
• To handle data sparsity and out-of-vocabulary words, i.e., words that are absent from the training set.
• Example - Training set: ["I like coding", "Prakriti likes mathematics", "She likes coding"]
• Let's consider bigrams, a group of two words.
• P(wi | w(i-1)) = count(w(i-1) wi) / count(w(i-1))
• So, let's find the probability of "I like mathematics" (a worked sketch follows below).
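A small Python sketch of this worked example, computing unsmoothed and add-one (Laplace) smoothed bigram probabilities from the three training sentences above. Sentence-boundary markers are left out to keep the sketch short, which introduces a couple of spurious cross-sentence bigrams:

```python
from collections import Counter

corpus = ["I like coding", "Prakriti likes mathematics", "She likes coding"]
tokens = [w for sentence in corpus for w in sentence.split()]
vocab = set(tokens)

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))

def p_mle(prev, word):
    """Unsmoothed bigram probability P(word | prev) = count(prev word) / count(prev)."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

def p_laplace(prev, word):
    """Add-one (Laplace) smoothed bigram probability."""
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + len(vocab))

# "like mathematics" never occurs in training, so the unsmoothed estimate is 0,
# which would zero out the whole sentence probability for "I like mathematics".
print(p_mle("I", "like") * p_mle("like", "mathematics"))          # 0.0
print(p_laplace("I", "like") * p_laplace("like", "mathematics"))  # 0.03125 (non-zero)
```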
Natural Language Generation:
Part of Speech Tagging (POS)
• In Natural Language Processing (NLP), part-of-speech tagging is a linguistic task in which each word in a document is assigned to a specific part of speech (verb, adjective, adverb, etc.) or grammatical category. This process helps to clarify the meaning and structure of the sentence by adding a layer of syntactic and semantic information to the words.
Part of Speech Tagging (POS):
• POS tagging is typically performed using machine learning algorithms,
which are trained on a large annotated corpus of text. The algorithm
learns to predict the correct POS tag for a given word based on the
context in which it appears.
• Text: “The cat sat on the mat.”
• POS tags:
• The: determiner
• cat: noun
• sat: verb
• on: preposition
• the: determiner
• mat: noun
• POS tagging is a useful tool in natural language processing (NLP) as it allows algorithms to understand the grammatical structure of a sentence and to disambiguate words that have multiple meanings.
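A quick sketch of POS tagging with NLTK's default perceptron tagger (assuming the required NLTK resources have been downloaded; resource names can vary slightly between NLTK versions):

```python
import nltk

# One-time downloads (uncomment on first run).
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The cat sat on the mat.")
print(nltk.pos_tag(tokens))
# Roughly: [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'),
#           ('on', 'IN'), ('the', 'DT'), ('mat', 'NN'), ('.', '.')]
```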
Use of Parts of Speech Tagging in NLP:
1. To understand the grammatical structure of a sentence: By labeling each word with its POS, we can better understand the syntax and structure of a sentence. This is useful for tasks such as machine translation and information extraction, where it is important to know how words relate to each other in the sentence.
2. To disambiguate words with multiple meanings: Some words, such as "bank," can have multiple meanings depending on the context in which they are used. By labeling each word with its POS, we can disambiguate these words and better understand their intended meaning.
3. To improve the accuracy of NLP tasks: POS tagging can help improve the performance of various NLP tasks, such as named entity recognition and text classification. By providing additional context and information about the words in a text, we can build more accurate and sophisticated algorithms.
4. To facilitate research in linguistics: POS tagging can also be used to study the patterns and characteristics of language use and to gain insights into the structure and function of different parts of speech.
Steps Involved in the POS tagging:
1. Collect a dataset of annotated text: This dataset will be used to train and test the POS
tagger. The text should be annotated with the correct POS tags for each word.

2. Preprocess the text: This may include tasks such as tokenization (splitting the text into
individual words), lowercasing, and removing punctuation.

3. Divide the dataset into training and testing sets: The training set will be used to train
the POS tagger, and the testing set will be used to evaluate its performance.

4. Train the POS tagger: This may involve building a statistical model, such as a hidden
Markov model (HMM), or defining a set of rules for a rule-based or transformation-
based tagger. The model or rules will be trained on the annotated text in the training
set.
5. Test the POS tagger: Use the trained model or rules to predict the POS tags of the
words in the testing set. Compare the predicted tags to the true tags and calculate
metrics such as precision and recall to evaluate the performance of the tagger.

6. Fine-tune the POS tagger: If the performance of the tagger is not satisfactory,
adjust the model or rules and repeat the training and testing process until the
desired level of accuracy is achieved.

7. Use the POS tagger: Once the tagger is trained and tested, it can be used to
perform POS tagging on new, unseen text. This may involve preprocessing the text
and inputting it into the trained model or applying the rules to the text. The output
will be the predicted POS tags for each word in the text.
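A compact sketch of steps 3 to 5 using NLTK: a simple unigram tagger trained on the tagged Penn Treebank sample shipped with NLTK (the 'treebank' resource is assumed to be downloaded); a real system would use a stronger model such as an HMM or perceptron tagger:

```python
import nltk
from nltk.corpus import treebank

# nltk.download("treebank")  # one-time download

tagged_sents = list(treebank.tagged_sents())
split = int(len(tagged_sents) * 0.9)
train_sents, test_sents = tagged_sents[:split], tagged_sents[split:]

# Unigram tagger with a default 'NN' backoff for unseen words.
backoff = nltk.DefaultTagger("NN")
tagger = nltk.UnigramTagger(train_sents, backoff=backoff)

print("accuracy:", tagger.accuracy(test_sents))  # use .evaluate() on older NLTK
print(tagger.tag(["The", "cat", "sat", "on", "the", "mat"]))
```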
Application of POS Tagging:
• Information extraction: POS tagging can be used to identify specific types of
information in a text, such as names, locations, and organizations. This is useful for
tasks such as extracting data from news articles or building knowledge bases for
artificial intelligence systems.

• Named entity recognition: POS tagging can be used to identify and classify named
entities in a text, such as people, places, and organizations. This is useful for tasks such
as building customer profiles or identifying key figures in a news story.

• Text classification: POS tagging can be used to help classify texts into different
categories, such as spam emails or sentiment analysis. By analyzing the POS tags of the
words in a text, algorithms can better understand the content and tone of the text.
• Machine translation: POS tagging can be used to help translate texts
from one language to another by identifying the grammatical
structure and relationships between words in the source language
and mapping them to the target language.

• Natural language generation: POS tagging can be used to generate natural-sounding text by selecting appropriate words and constructing grammatically correct sentences. This is useful for tasks such as chatbots and virtual assistants.
Types of POS Tagging in NLP:
Rule-based POS Tagging:
• Rule-based part-of-speech (POS) tagging is a method of labeling words with their corresponding parts of speech using a set of pre-defined rules. This is in contrast to machine learning-based POS tagging, which relies on training a model on a large annotated corpus of text.
• Rule-based POS taggers can be relatively simple to implement and are often used as a starting point for more complex machine learning-based taggers. However, they can be less accurate and less efficient than machine learning-based taggers, especially for tasks with large or complex datasets.
Statistical POS Tagging:
• In statistical POS tagging, a model is trained on a large annotated
corpus of text to learn the patterns and characteristics of different
parts of speech. The model uses this training data to predict the POS
tag of a given word based on the context in which it appears and the
probability of different POS tags occurring in that context.
• Statistical POS taggers can be more accurate and efficient than rule-
based taggers, especially for tasks with large or complex datasets.

• Collect a large annotated corpus of text and divide it into training and
testing sets.

• Train a statistical model on the training data, using techniques such as maximum likelihood estimation or hidden Markov models.

• Use the trained model to predict the POS tags of the words in the
testing data.
• Evaluate the performance of the model by comparing the predicted
tags to the true tags in the testing data and calculating metrics such
as precision and recall.

• Fine-tune the model and repeat the process until the desired level of
accuracy is achieved.

• Use the trained model to perform POS tagging on new, unseen text.
Transformation-based tagging (TBT):
• A set of rules is defined to transform the tags of words in a text based
on the context in which they appear. For example, a rule might
change the tag of a verb to a noun if it appears after a determiner
such as “the.” The rules are applied to the text in a specific order, and
the tags are updated after each transformation.
• TBT can be more accurate than rule-based tagging, especially for tasks with
complex grammatical structures. However, it can be more computationally
intensive and requires a larger set of rules to achieve good performance.

• Define a set of rules for transforming the tags of words in the text. For
example:

• If the word is a verb and appears after a determiner, change the tag to
“noun.”

• If the word is a noun and appears after an adjective, change the tag to
“adjective.”
• Iterate through the words in the text and apply the rules in a specific
order. For example:

• In the sentence “The cat sat on the mat,” the word “sat” would be
changed from a verb to a noun based on the first rule.

• In the sentence "The red cat sat on the mat," the word "cat" would be changed from a noun to an adjective based on the second rule.

• Output the transformed tags for each word in the text.


Hidden Markov Model POS Tagging:
• Hidden Markov models (HMMs) are a type of statistical model that
can be used for part-of-speech (POS) tagging in natural language
processing (NLP). In an HMM-based POS tagger, a model is trained
on a large annotated corpus of text to learn the patterns and
characteristics of different parts of speech. The model uses this
training data to predict the POS tag of a given word based on the
probability of different tags occurring in the context of the word.
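An illustrative (not production) sketch of HMM decoding with the Viterbi algorithm; the transition and emission probabilities below are tiny made-up numbers, whereas a real tagger would estimate them from an annotated corpus:

```python
import math

states = ["DT", "NN", "VB"]
# Toy parameters, assumed purely for illustration.
start_p = {"DT": 0.6, "NN": 0.3, "VB": 0.1}
trans_p = {"DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
           "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
           "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1}}
emit_p = {"DT": {"the": 0.9, "cat": 0.05, "sat": 0.05},
          "NN": {"the": 0.05, "cat": 0.8, "sat": 0.15},
          "VB": {"the": 0.05, "cat": 0.05, "sat": 0.9}}

def viterbi(words):
    # V[t][s] = best log-probability of any tag sequence ending in state s at time t.
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][words[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(words)):
        V.append({})
        back.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t-1][p] + math.log(trans_p[p][s]))
            V[t][s] = (V[t-1][best_prev] + math.log(trans_p[best_prev][s])
                       + math.log(emit_p[s][words[t]]))
            back[t][s] = best_prev
    # Trace back the most probable tag sequence.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path

print(viterbi(["the", "cat", "sat"]))  # expected: ['DT', 'NN', 'VB']
```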
Challenges of POS Tagging:

• Ambiguity: Some words can have multiple POS tags depending on the
context in which they appear, making it difficult to determine their
correct tag. For example, the word “bass” can be a noun (a type of
fish) or an adjective (having a low frequency or pitch).
• Out-of-vocabulary (OOV) words: Words that are not present in the
training data of a POS tagger can be difficult to tag accurately,
especially if they are rare or specific to a particular domain.

• Complex grammatical structures: Languages with complex grammatical structures, such as languages with many inflections or free word order, can be more challenging to tag accurately.
• Lack of annotated training data: Some languages or domains may
have limited annotated training data, making it difficult to train a
high-performing POS tagger.

• Inconsistencies in annotated data: Annotated data can sometimes contain errors or inconsistencies, which can negatively impact the performance of a POS tagger.
DefaultTagger:
• Default tagging is a fundamental step in part-of-speech labelling. It is done with the DefaultTagger class, which takes a single parameter, 'tag'. NN is the tag for a singular noun. DefaultTagger is most beneficial when it comes to dealing with the most frequent part-of-speech tags. This is why a noun tag is recommended.
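A brief NLTK sketch of the DefaultTagger just described:

```python
from nltk.tag import DefaultTagger

# Assign 'NN' (singular noun) to every token, regardless of context.
tagger = DefaultTagger("NN")
print(tagger.tag(["The", "quick", "brown", "fox"]))
# [('The', 'NN'), ('quick', 'NN'), ('brown', 'NN'), ('fox', 'NN')]
```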
Example of POS Tagging:
• Consider the sentence: “The quick brown fox jumps over the lazy dog.”
• After performing POS Tagging:
• “The” is tagged as determiner (DT)
• “quick” is tagged as adjective (JJ)
• “brown” is tagged as adjective (JJ)
• “fox” is tagged as noun (NN)
• “jumps” is tagged as verb (VBZ)
• “over” is tagged as preposition (IN)
• “the” is tagged as determiner (DT)
• “lazy” is tagged as adjective (JJ)
• “dog” is tagged as noun (NN)
Morphology
• Morphology is the study of the way words are built from smaller meaningful units called
morphemes.
• We can divide morphemes into two broad classes.
• Stems – the core meaningful units, the root of the word.
• Affixes – add additional meanings and grammatical functions to words.
• Affixes are further divided into:
• Prefixes – precede the stem: do / undo
• Suffixes – follow the stem: eat / eats
• Infixes – are inserted inside the stem
• Circumfixes – precede and follow the stem
• English doesn't tend to stack many affixes.
• But Turkish can have words with a lot of suffixes.
• Languages, such as Turkish, that tend to string affixes together are called agglutinative languages.



Surface and Lexical Forms
• The surface level of a word represents the actual spelling
of that word.
• geliyorum eats cats kitabım
• The lexical level of a word represents a simple concatenation
of morphemes making up that word.
• gel +PROG +1SG
• eat +AOR
• cat +PLU
• kitap +P1SG
• Morphological processors try to find correspondences between lexical and
surface forms of words.
• Morphological recognition – surface to lexical
• Morphological generation – lexical to surface



Inflectional and Derivational Morphology
• There are two broad classes of morphology:
• Inflectional morphology
• Derivational morphology
• After a combination with an inflectional morpheme,
the meaning and class of the actual stem usually do not change.
• eat / eats pencil / pencils
• gel / geliyorum masa / masam
• After a combination with a derivational morpheme, the meaning and the class of the actual stem usually change.
• compute / computer    do / undo    friend / friendly
• Uygar / uygarlaş    kapı / kapıcı
• Irregular changes may happen with derivational affixes.



English Inflectional Morphology
• Nouns have simple inflectional morphology.
• plural -- cat / cats
• possessive -- John / John’s
• Verbs have slightly more complex, but still relatively simple, inflectional morphology.
• past form -- walk / walked
• past participle form -- walk / walked
• gerund -- walk / walking
• singular third person -- walk / walks
• Verbs can be categorized as:
• main verbs
• modal verbs -- can, will, should
• primary verbs -- be, have, do
• Regular and irregular verbs: walk / walked -- go / went



English Derivational Morphology
• Some English derivational affixes
• -ation : transport / transportation
• -er : kill / killer
• -ness : fuzzy / fuzziness
• -al : computation / computational
• -able : break / breakable
• -less : help / helpless
• un- : do / undo
• re- : try / retry



Morphological Parsing
• Morphological parsing is finding the lexical form of a word from its surface form.
• cats -- cat +N +PLU
• cat -- cat +N +SG
• goose -- goose +N +SG or goose +V
• geese -- goose +N +PLU
• gooses -- goose +V +3SG
• catch -- catch +V
• caught -- catch +V +PAST or catch +V +PP
• geliyorum -- gel +V +PROG +1SG
• masalardan -- masa +N +PLU +ABL
• There can be more than one lexical level representation
for a given word. (ambiguity)
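A toy Python sketch of dictionary-based morphological parsing for a few of the nouns above; the tiny lexicon and tag strings are illustrative assumptions, not a general-purpose parser:

```python
# Toy lexicon: irregular forms listed explicitly, regular nouns handled by a +s rule.
IRREGULAR = {"geese": ["goose +N +PLU"], "mice": ["mouse +N +PLU"]}
STEMS = {"cat", "goose", "fox"}

def parse(surface):
    """Return all lexical-level analyses of a surface form (may be ambiguous)."""
    analyses = list(IRREGULAR.get(surface, []))
    if surface in STEMS:
        analyses.append(f"{surface} +N +SG")
    if surface.endswith("s") and surface[:-1] in STEMS:
        analyses.append(f"{surface[:-1]} +N +PLU")
    return analyses

print(parse("cats"))   # ['cat +N +PLU']
print(parse("geese"))  # ['goose +N +PLU']
print(parse("goose"))  # ['goose +N +SG']
```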
Parts of A Morphological Processor
• For a morphological processor, we need at least the following:

• Lexicon : The list of stems and affixes together with basic information
about them such as their main categories (noun, verb, adjective, …) and
their sub-categories (regular noun, irregular noun, …).
• Morphotactics : The model of morpheme ordering that explains which
classes of morphemes can follow other classes of morphemes inside a
word.
• Orthographic Rules (Spelling Rules) : These spelling rules are used to
model changes that occur in a word (normally when two morphemes
combine).



Lexicon
• A lexicon is a repository for words (stems).
• They are grouped according to their main categories.
• noun, verb, adjective, adverb, …
• They may be also divided into sub-categories.
• regular-nouns, irregular-singular nouns, irregular-plural nouns, …
• The simplest way to create a morphological parser is to put all possible words (together with their inflections) into a lexicon.



Combine Lexicon and Morphotactics
[Figure: an FSA built by plugging the lexicon into the morphotactics; its arcs spell out the stems fox, cat, dog, sheep, goose/geese and mouse/mice letter by letter, with a final plural s arc.]
• This only says yes or no; it does not give a lexical representation.
• It accepts a wrong word (foxs).


Formal Definition of FST (Mealy Machine)
• A FST is a 5-tuple (Q, Σ, q0, F, δ).
• Q : a finite set of N states q0, q1, …, qN
• Σ : a finite input alphabet of complex symbols.
• Each complex symbol is a pair of an input and an output symbol i:o,
• where i is a member of I (an input alphabet),
• and o is a member of O (an output alphabet).
• I and O may contain the empty string.
• So, Σ is a subset of I×O.
• q0 : the start state
• F : the set of final states -- F is a subset of Q
• δ(q, i:o) : the transition function
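A minimal Python sketch that instantiates this definition for a tiny, made-up transducer mapping a lexical form such as cat +PL (given as a list of symbols) to a surface form; the states, alphabet and transitions here are assumptions for illustration only:

```python
# A tiny FST sketch following the (Q, Sigma, q0, F, delta) definition.
# It transduces a lexical symbol sequence like c a t +PL into the surface string "cats".
Q = {"q0", "q1", "q2"}
q0, F = "q0", {"q1", "q2"}
# delta maps (state, input_symbol) -> (next_state, output_symbol);
# "" plays the role of the empty string epsilon on the output side.
delta = {
    ("q0", "c"): ("q0", "c"),
    ("q0", "a"): ("q0", "a"),
    ("q0", "t"): ("q1", "t"),    # end of the stem "cat"
    ("q1", "+SG"): ("q2", ""),   # singular: realized as nothing
    ("q1", "+PL"): ("q2", "s"),  # plural: realized as "s"
}

def transduce(symbols):
    state, output = q0, []
    for sym in symbols:
        state, out = delta[(state, sym)]
        output.append(out)
    assert state in F, "input not accepted"
    return "".join(output)

print(transduce(["c", "a", "t", "+PL"]))  # cats
print(transduce(["c", "a", "t", "+SG"]))  # cat
```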


FST (cont.)
• Σ may not contain all possible pairs from I×O.
• For example:
• I = {a, b, c}   O = {a, b, c, ε}
• Σ = {a:a, b:b, c:c, a:ε, b:ε, c:ε}
• feasible pairs – In two-level morphology terminology, the pairs in Σ are called feasible pairs.
• default pair – Instead of a:a we can use a single character a for this default pair.
• FSAs are isomorphic to regular languages, and FSTs are isomorphic to regular relations (pairs of strings of regular languages).


FST Properties
• FSTs are closed under: union, inversion, and composition.
• union : The union of two regular relations is also a regular relation.
• inversion : The inversion of a FST simply switches the input and output labels.
• This means that the same FST can be used for both directions of a morphological processor.
• composition : If T1 is a FST from I1 to O1 and T2 is a FST from O1 to O2, then the composition of T1 and T2 (T1 o T2) maps from I1 to O2.
• We use these properties of FSTs in the creation of the FST for a morphological processor.


A FST for Simple English Nominals
[Figure: a transducer over three stem classes. After a reg-noun stem come the arcs +N:ε and then either +SG:# or +PL:^s#; after an irreg-sg-noun stem come +N:ε and +SG:#; after an irreg-pl-noun stem come +N:ε and +PL:#.]


FST for stems
• A FST for stems which maps roots to their root class:
  reg-noun: fox, cat, dog
  irreg-pl-noun: g o:e o:e s e ("geese"), sheep, m o:i u:ε s:c e ("mice")
  irreg-sg-noun: goose, sheep, mouse
• fox stands for f:f o:o x:x
• When these two transducers are composed, we have a FST which maps lexical forms to intermediate forms of words for simple English noun inflections.
• The next thing that we should handle is to design the FSTs for orthographic rules, and combine all these transducers.


Lexical, intermediate and surface tapes for "dogs":
lexical:       d o g +N +PL
intermediate:  d o g ^ s #
surface:       d o g s


Lexical to Intermediate FST



Orthographic Rules
• We need FSTs to map the intermediate level to the surface level.
• For each spelling rule we will have a FST, and these FSTs run in parallel.
• Some English spelling rules:
• consonant doubling -- a 1-letter consonant is doubled before -ing/-ed -- beg/begging
• E deletion -- silent e is dropped before -ing and -ed -- make/making
• E insertion -- e is added after s, z, x, ch, sh before s -- watch/watches
• Y replacement -- y changes to ie before -s, and to i before -ed -- try/tries
• K insertion -- for verbs ending with vowel + c, we add k -- panic/panicked
• We represent these rules using two-level morphology rules:
• a => b / c __ d (rewrite a as b when it occurs between c and d)
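A rough Python approximation of the E-insertion rule above, written as a plain regular expression over intermediate forms rather than as a compiled FST; the '^' morpheme boundary and '#' end-of-word marker follow the earlier intermediate-level notation:

```python
import re

def e_insertion(intermediate):
    """Map an intermediate form like 'watch^s#' to a surface form."""
    # Insert 'e' when the morpheme boundary '^' separates s/z/x/ch/sh from a final 's'.
    s = re.sub(r"(s|z|x|ch|sh)\^(?=s#)", r"\1e", intermediate)
    # Remove the remaining boundary and end-of-word markers.
    return s.replace("^", "").replace("#", "")

print(e_insertion("watch^s#"))  # watches
print(e_insertion("fox^s#"))    # foxes
print(e_insertion("cat^s#"))    # cats
```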


Generating or Parsing with FST Lexicon and Rules


Accepting Foxes



Intersection
• We can intersect all rule FSTs to create a single FST.
• The intersection algorithm just takes the Cartesian product of states.
• For each state qi of the first machine and qj of the second machine, we create a new state qij.
• For input symbol a, if the first machine would transition to state qn and the second machine would transition to qm, the new machine would transition to qnm.


What is named entity recognition?
• Named entity recognition (NER), also called entity chunking or entity extraction, is a component of natural language processing (NLP) that identifies predefined categories of objects in a body of text.
• These categories include organizations, locations, expressions of time, quantities, medical codes, monetary values and percentages, among others. Essentially, NER is the process of taking a string of text (i.e., a sentence, paragraph or entire document) and identifying and classifying the entities that refer to each category.
NER techniques:
• The organizations that do utilize NER for unstructured data extraction rely on a range of approaches, but most fall into three broad categories: rule-based approaches, machine learning approaches and hybrid approaches.
• Rule-based approaches involve creating a set of rules for the grammar of a language. The rules are then used to identify entities in the text based on their structural and grammatical features. These methods can be time-consuming and may not generalize well to unseen data. (A small sketch follows this list.)
• Machine learning approaches involve training an AI-driven machine learning model on a labeled dataset using algorithms like conditional random fields and maximum entropy (two types of complex statistical language models).
• Techniques can range from traditional machine learning methods (e.g., decision trees and support vector machines) to more complex deep learning approaches, like recurrent neural networks (RNNs) and transformers.
• Hybrid approaches combine rule-based and machine learning methods to leverage the strengths of both. They can use a rule-based system to quickly identify easy-to-recognize entities and a machine learning system to identify more complex entities.
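A toy illustration of the rule-based approach mentioned above, using a few simple regular-expression patterns; the patterns and label names are deliberately naive, made-up examples rather than a real rule set:

```python
import re

# Hypothetical, deliberately naive patterns for a few entity types.
PATTERNS = {
    "MONEY": r"\$\d+(?:\.\d+)?(?:\s?(?:million|billion))?",
    "DATE": r"\b(?:January|February|March|April|May|June|July|August|"
            r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b",
    "ORG": r"\b[A-Z][a-zA-Z]+\s+(?:Inc|Corp|Ltd)\b\.?",
}

def rule_based_ner(text):
    entities = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            entities.append((match.group(), label))
    return entities

text = "Acme Corp. was acquired for $2.5 billion on March 3, 2021."
print(rule_based_ner(text))
# [('$2.5 billion', 'MONEY'), ('March 3, 2021', 'DATE'), ('Acme Corp.', 'ORG')]
```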
NER methodologies:
• Recurrent neural networks (RNNs) and long short-term memory (LSTM). RNNs are a type of neural network designed for sequence prediction problems.
• LSTMs, a special kind of RNN, can learn to recognize patterns over time and maintain information in "memory" over long sequences, making them particularly useful for understanding context and identifying entities.
• Conditional random fields (CRFs). CRFs are often used in combination with LSTMs for NER tasks. They can model the conditional probability of an entire sequence of labels, rather than just individual labels, making them useful for tasks where the label of a word depends on the labels of surrounding words.
• Transformers and BERT. Transformer networks, particularly the BERT (Bidirectional Encoder Representations from Transformers) model, have had a significant impact on NER. Using a self-attention mechanism that weighs the importance of different words, BERT accounts for the full context of a word by looking at the words that come before and after it.
NER process:
• Step 1. Data collection
• The first step of NER is to aggregate a dataset of annotated text. The
dataset should contain examples of text where named entities are
labeled or marked, indicating their types. The annotations can be
done manually or using automated methods.
• Step 2. Data preprocessing
• Once the dataset is collected, the text should be cleaned and formatted.
You may need to remove unnecessary characters, normalize the text
and/or split text into sentences or tokens.
• Step 3. Feature extraction
• During this stage, relevant features are extracted from the preprocessed
text. These features can include part-of-speech tagging (POS tagging), word
embeddings and contextual information, among others. The choice of
features will depend on the specific NER model the organization uses.
• Step 4. Model training
• The next step is to train a machine learning or deep learning model
using the annotated dataset and the extracted features. The model
learns to identify patterns and relationships between words in the
text, as well as their corresponding named entity labels.
• Step 5. Model evaluation
• After you have trained the NER model, it should be evaluated to
assess its performance. You can measure metrics like precision, recall
and F1 score, which indicate how well the model correctly identifies
and classifies named entities.
• Step 6. Model fine-tuning
• Based on the evaluation results, you will refine the model to improve
its performance. This can include adjusting hyperparameters,
modifying the training data and/or using more advanced techniques
(e.g., ensembling or domain adaptation).
• Step 7. Inference
• At this stage, you can start using the model for inference on new,
unseen text. The model will take the input text, apply the
preprocessing steps, extract relevant features and ultimately predict
the named entity labels for each token or span of text.
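A brief inference sketch (step 7) using spaCy's pretrained pipeline, assuming the en_core_web_sm model has been installed with `python -m spacy download en_core_web_sm`; a custom-trained model would be used the same way:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
for ent in doc.ents:
    print(ent.text, ent.label_)
# Typical output:
#   Apple ORG
#   U.K. GPE
#   $1 billion MONEY
```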
Application of NER:
Challenges of NER:
• Ambiguity in Entity Names: Certain words or phrases can have multiple possible meanings or interpretations.
• Misspelled Entity Names: Text data often contains spelling errors or variations, making it difficult to recognize named entities accurately.
• Ambiguity in Entity Types: Some words or phrases can be classified into multiple entity types, leading to uncertainty in classification.
• Variations in Entity References: Entities can be referred to using different expressions or synonyms, making their identification challenging.
• Contextual Challenges: Understanding the context of a word or phrase within a sentence or document is essential for accurate entity recognition.
