
A PROJECT

ON

VOICE-BASED EMAIL SYSTEM FOR THE VISUALLY IMPAIRED

(TEXT-TO-SPEECH/SPEECH-TO-TEXT)

BY

SALAMI OMOLOLA OLAYINKA

MATRIC NO: FPA/CS/21/3-0179

SUBMITTED TO

THE DEPARTMENT OF COMPUTER SCIENCE,

SCHOOL OF SCIENCE AND COMPUTER STUDIES,

THE FEDERAL POLYTECHNIC ADO-EKITI, EKITI STATE, NIGERIA.

BEING A RESEARCH WORK SUBMITTED IN PARTIAL FULFILMENT OF THE

REQUIREMENTS FOR THE AWARD OF HIGHER NATIONAL DIPLOMA (HND) IN

COMPUTER SCIENCE

NOVEMBER, 2023

CERTIFICATION

This is to certify that this project work was carried out by SALAMI OMOLOLA OLAYINKA, with Matriculation Number FPA/CS/21/3-0179, under the supervision of Mr. OGUNLOLA O. O., in partial fulfillment of the requirements for the award of the Higher National Diploma (HND) in COMPUTER SCIENCE.

……………………….. ………………………
Salami Omolola Olayinka Signature/Date

Project Student

….…………………….. ……………………

Mr. OGUNLOLA O. O. Signature/Date

PROJECT SUPERVISOR

………………….. ….…………………

Dr. Mrs. Faluyi Signature/Date

HEAD OF DEPARTMENT

DEDICATION

This project is dedicated to Almighty GOD, who has been the beginning and the end, the Alpha and the Omega. I also dedicate it to my beloved parents, MR. & MRS. SALAMI, who have contributed morally, financially and spiritually towards the completion of my education. I pray that you eat the fruit of your labor.

ACKNOWLEDGEMENTS

I give thanks to Almighty God, the Alpha and Omega, the beginning and the end, the first and the last, the creator of all things.

My foremost appreciation goes to my parents, Mr. and Mrs. SALAMI, for their support and attention, and for inspiring me to find my own way. May God continue to be with you, and may you live long to eat the fruit of your labor.

My appreciation also goes to my sister, Mrs. AKOSILE, and to Mr. FEMI FATILE for their understanding and moral support throughout my course in this programme; may God continue to help you in all your endeavours.

My profound gratitude goes to my able supervisor, Mr. OGUNLOLA O. O., for taking the time to go through the manuscript of this project work. I remain ever grateful to you for the knowledge imparted to me.

Without these able people my project would be incomplete. They have shown me love, care and advice, and supported me morally and spiritually throughout my HIGHER NATIONAL DIPLOMA programme at the Federal Polytechnic Ado-Ekiti. I appreciate you all.

TABLE OF CONTENTS

Title page …………………………………………………………………………………… i

Certification …………………………………………………………………………..… ii

Dedication………………………………………………………………………….……... iii

Acknowledgement ……………………………………………………………….……… iv

Table of contents ………………………………………………………….……………. v

Abstract ……………………………………………………………….…………………. vii

CHAPTER ONE: INTRODUCTION

1.1 Background To The Study …………………………………..……………………. 1

1.2 Statement Of Research Problem ………………………………..………………… 3

1.3 Research Objectives ………………………………….………………………….. 5

1.4 Project Methodology …………………………………..………………………….. 5

1.5 Scope Of The Study…………..………………..………………………….……… 6

1.6 Contribution To Knowledge………………………..……………………………..6

Chapter Two: Literature Review

2.1 Voice Based System In Desktop And Mobile For Blind People ………………...8

2.2 Voice Based Search Engine And Web Page Reader …………………………… 9

2.3 Voice Based Service For Blind People………………………….……………... 11

2.4 Voice Based E-Mail System For Blind ……………………….…………………..15

2.5 Accessibility Tools For Visually Impaired Users…………………………………......16

2.6 Voice Based Interfaces And Assistive Technologies…………………………….…..17

2.7 Existing Voice Based Email System ………………………………………………..18

2.8 Impact Of Assistive Technologies On Visually Impaired Users ………...………...20

2.9 Text-To-Speech (TTS) And Speech-To-Text (STT) Technologies…………...………..23

Chapter Three: System Analysis And Design

3.1 Analysis Of The Existing System …………………………….………………….. 31

3.2 Challenges Of Existing System ………………………………..………………… 31

3.3 Analysis Of The New System ……….………………….……………………….. 32

3.4 Justification Of The New System ………..……………………………………… 32

3.5 Requirements For The New Voice-Based Email system For Visually Impaired

Users…………………………………………………………………………………….. 32

3.6 System Design…………………….……………………………………………..34

Chapter Four: System Implementation And Performance Evaluation


4.1 System Implementation ………………….…………………………………… 39

4.2 System Requirement: Hardware And Software …………………..……..….... 43

4.3 Performance Evaluation ……………………………………………………….. 43

Chapter Five: Conclusion And Recommendation

5.1 Conclusion ………………………………………………………………… 47

5.2 Recommendations ………………………………………………………. 48

References

Appendix

ABSTRACT

The rapid advancement of technology has the potential to significantly improve the lives of
individuals with disabilities, particularly those who are visually impaired. This project presents the
development and implementation of a Voice-Based Email System tailored specifically for the visually
impaired community. Leveraging the power of the C# programming language, this system seamlessly
integrates Text-to-Speech (TTS) and Speech-to-Text (STT) technologies, enabling users to interact
with their email accounts using natural language voice commands.
The project begins with a comprehensive exploration of assistive technologies, existing email systems,
and accessibility considerations. It then delves into the system's design, covering its architecture, user
interface, and database structure. Implementation details, including technology stack selection and
security measures, are discussed in depth.
One of the project's primary goals is to ensure accessibility and user-friendliness for visually impaired
individuals. User interface features were meticulously designed to facilitate email composition,
reading, and management, while also complying with accessibility standards. Extensive usability
testing with target users provided valuable feedback and insights.
The Voice-Based Email System offers functionalities for composing, sending, receiving, and
managing emails entirely through voice commands. It empowers visually impaired users to
independently navigate their email correspondence, enhancing their digital communication and
productivity. The project also addresses privacy and ethical considerations regarding user data and
system usage.
The system's evaluation involved rigorous testing scenarios, performance metrics, and user feedback,
all of which underscored its effectiveness and user satisfaction. Future recommendations include
continuous user engagement, integration with multiple email services, advanced natural language
processing, cross-platform compatibility, and collaboration with accessibility experts.

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

In today's digital era, email has become an indispensable means of communication, allowing

individuals to exchange information, collaborate on projects, and maintain professional and personal

connections. However, the benefits of email are not equally accessible to everyone. For the visually

impaired community, traditional email interfaces pose significant challenges, as these interfaces

heavily rely on visual cues. According to the World Health Organization (WHO), an estimated 2.2

billion people globally suffer from some form of visual impairment, ranging from mild to severe. This

highlights the pressing need for innovative solutions that empower visually impaired individuals to

independently access and utilize email services (Garcia et al., 2020).

Voice-based interfaces have emerged as a promising avenue for enhancing digital accessibility.

Leveraging advancements in Text-to-Speech (TTS) and Speech-to-Text (STT) technologies, a voice-

based email system can bridge the accessibility gap by enabling visually impaired users to interact

with their emails through natural spoken language. This project seeks to develop and implement a

Voice-Based Email System tailored to the needs of visually impaired users, encompassing both TTS

and STT functionalities.

Text-to-Speech (TTS) Technology converts text-based email content into audible speech,

allowing users to listen to their messages. On the other hand, Speech-to-Text (STT) Technology

transforms spoken language into text, enabling users to compose and reply to emails using voice

commands. By integrating these technologies, the proposed system aims to create a seamless and

inclusive email experience that empowers visually impaired users to efficiently manage their email

correspondence.

The significance of this project extends beyond its technical implementation. Enabling visually

impaired individuals to independently access email not only enhances their productivity and

connectivity but also fosters their social inclusion and autonomy. As Ramalho et al. (2019)

underscore, assistive technologies that focus on empowering individuals with disabilities align with

the principles of universal design, promoting a more equitable and inclusive digital environment.

Through an in-depth exploration of existing solutions, careful design and implementation of

the voice-based email system's architecture, and rigorous user testing and evaluation, this project

aspires to contribute to the broader efforts aimed at making digital communication channels accessible

to all. By providing visually impaired users with a means to engage with email content and services

using their natural voice, this system endeavours to mitigate the challenges posed by traditional visual

interfaces, fostering a more inclusive digital landscape (Smith et al., 2020).

1.2 Statement of the Problem

The integration of technology into various aspects of daily life has undoubtedly improved

accessibility and convenience for many. However, this progress has not been evenly distributed across

all demographics. For individuals with visual impairments, the digital landscape still presents

significant challenges. Traditional email systems heavily rely on visual interfaces, making them

inherently inaccessible to visually impaired users. According to the World Health Organization

(WHO), approximately 2.2 billion people globally live with visual impairments of varying degrees,

necessitating solutions that prioritize inclusivity and accessibility (WHO, 2020).

This project addresses the pressing issue of digital exclusion faced by visually impaired

individuals concerning email communication. While several assistive technologies exist to support

these users, their effectiveness in the context of email interaction remains limited. The complexities of

reading, composing, and managing emails demand innovative approaches that combine ease of use,

natural interaction, and reliable accuracy. As highlighted by Thompson et al. (2018), while there have

been advancements in the field of accessibility technology, many existing solutions lack the

seamlessness required for practical, everyday applications.

Moreover, the disparities between the visually impaired and sighted individuals in terms of email

access and communication are evident in various studies. For instance, research conducted by Johnson

et al. (2019) emphasizes that without adequate accessible email solutions, visually impaired

individuals often face challenges in keeping up with professional communication, education-related

emails, and even personal correspondences, which can have cascading effects on their social and

professional lives.

In light of these issues, the project seeks to develop a comprehensive Voice-Based Email

System that integrates Text-to-Speech (TTS) and Speech-to-Text (STT) technologies. By doing so, it

aims to empower visually impaired users to independently manage their email communications,

bridging the gap between traditional visual interfaces and the needs of this community.

1.3 Aim and Objectives of the Study

The aim of this study is to develop and implement a Voice-Based Email System tailored to the

needs of visually impaired individuals, integrating Text-to-Speech (TTS) and Speech-to-Text (STT)

technologies.

The objectives of the study are to:

i. Gather diverse email samples and analyse them for accurate text-to-speech (TTS) and speech-

to-text (STT) conversion.

ii. Design an integrated and accessible email system architecture with an intuitive voice-based

user interface.

iii. Implement Text-to-Speech and Speech-to-Text modules, and refine the system based on user

feedback for optimal user experience.

1.4 Project Methodology

The project methodology consists of a structured approach to achieving the outlined

objectives. For the first objective of data collection and analysis, a diverse set of email samples will be

acquired and pre-processed using Python. This step aims to gain insights into email content variations,

essential for accurate text-to-speech (TTS) and speech-to-text (STT) conversions. For the second

objective, the system's design and architecture will be visualized using UML tools like Visio. PHP will

be utilized to script the backend, MySQL for database integration, and HTML/CSS for an accessible

user interface. The third objective involves implementing TTS and STT modules using appropriate

PHP libraries.

The user interface will be developed to support email interactions and integrate seamlessly

with MySQL for data storage. Feedback from visually impaired users will guide the iterative

refinement of the system's features, ensuring user satisfaction and accessibility. Rigorous testing will

validate the system's reliability and integration. Documentation will encompass system architecture,

design rationale, implementation details, and user guidelines.

In this manner, the project methodology encompasses data analysis, system design, iterative

refinement based on user feedback, rigorous testing, and comprehensive documentation, ensuring the

successful development of a voice-based email system for visually impaired users using PHP,

MySQL, and UML tools.
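
As a rough illustration of the Python pre-processing step described above, the sketch below cleans a raw e-mail body before it is handed to the TTS/STT modules. It is a minimal sketch under stated assumptions: the function name normalize_for_tts and the specific cleaning rules are illustrative, not the project's actual pre-processing code.

# Illustrative sketch of the Python pre-processing step mentioned in the methodology.
# Assumes e-mail bodies are available as plain-text/HTML strings; names are hypothetical.
import html
import re

def normalize_for_tts(raw_body: str) -> str:
    """Clean an e-mail body so it can be passed to a TTS/STT pipeline."""
    text = re.sub(r"<[^>]+>", " ", raw_body)   # strip any HTML tags
    text = html.unescape(text)                 # decode entities such as &amp;
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace and newlines
    return text

if __name__ == "__main__":
    sample = "<p>Dear student,&nbsp;your <b>HND</b> result is ready.</p>"
    print(normalize_for_tts(sample))           # -> Dear student, your HND result is ready.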

1.5 Scope of the Study

The study focuses on developing a Voice-Based Email System to enhance email

communication for visually impaired users. This system integrates Text-to-Speech (TTS) and Speech-

to-Text (STT) technologies for natural voice interactions. Using PHP and MySQL, the system

includes features for composing, reading, and managing emails. User testing and iterative refinement

will optimize usability. The study aims to provide comprehensive documentation of the system's

architecture and design. The scope is limited to email interaction and excludes broader assistive

technology integration.

1.6 Contribution to Knowledge

This study contributes to the advancement of accessible technology by developing a Voice-

Based Email System tailored for visually impaired users. By integrating Text-to-Speech (TTS) and

Speech-to-Text (STT) technologies, the project demonstrates a practical solution that empowers this

user group to independently manage their email communication. The implementation using PHP and

MySQL showcases the feasibility of creating inclusive interfaces for visually impaired individuals.

This study's contribution extends to the iterative refinement process guided by user feedback, which

enhances the usability and effectiveness of the developed system. The comprehensive documentation

generated as part of this study not only serves as a reference for future developments but also adds to

the body of knowledge in the field of assistive technology.

CHAPTER TWO

LITERATURE REVIEW

2.1 Voice Based System in Desktop and Mobile Devices for Blind People

This work deals with a voice-based system on desktop and mobile devices for blind people. The voice-mail architecture helps blind users access e-mail and other multimedia functions of the operating system (songs, text); in the mobile application, SMS messages can also be read aloud by the system itself. The advances made in computer technology have opened new platforms for visually impaired people across the world, and it has been observed that nearly 60% of the total blind population of the world lives in India. The paper describes a voice-mail architecture that blind users can employ to access e-mail and the multimedia functions of the operating system easily and efficiently (Smith et al., 2020). This architecture also reduces the cognitive load on blind users of remembering and typing characters on a keyboard. A large body of information is available on technological advances for visually impaired people, including the development of text-to-Braille systems, screen magnifiers and screen readers.

Recently, attempts have been made to develop tools and technologies that help blind people access internet technologies. Among the early attempts, voice input and output for surfing the web were adopted for blind users. IBM's Home Page Reader provides an easy-to-use interface that converts text to speech, using different voices for reading text and links; its disadvantage, however, is that the developer has to design a complex new interface for graphical web pages to be browsed and recognized by the screen reader. A simpler browsing solution divides a web page into two dimensions, which greatly simplifies the page's structure and makes it easier to browse. Another web browser generated a tree structure from the HTML document by analysing its links; although it attempted to structure linked pages to enhance navigability, it did not prove very efficient for surfing, and it did not address the navigability and usability of the current page itself. Another browser developed for visually handicapped people was eGuideDog, which had an integrated TTS engine and applied advanced text-extraction algorithms to present the page in a user-friendly manner; still, it did not meet the standards required for commercial use. In the Indian context, Shruti Drishti and Web Browser for Blind are two web browser frameworks used by blind people to access the internet, including e-mail. Both systems are integrated with Indian-language ASR and TTS systems, but they are not portable to small devices such as mobile phones (Smith et al., 2020).

2.2 Voice Based Search Engine and Web page Reader

A novel voice-based search engine and web-page reader, which allows users to command and control the web browser through their voice, has been introduced. Existing search engines receive requests from the user in the form of text and respond by retrieving the relevant documents from the server and displaying them as text (Thakur & Shinde, 2019). Even though existing web browsers are capable of playing audio and video, the user has to make a request by typing text into the search box and can then play the desired audio/video through the graphical user interface (GUI). The proposed voice-based search engine aspires to serve users, especially the blind, in browsing the internet: the user speaks to the computer, the computer responds with voice, and it can also assist the user in reading documents aloud. A voice-enabled interface, with additional support for gesture-based input and output, has been developed for the "Social Robot Maggie", turning it into an aloud reader. Its voice recognition and synthesis can be affected by a number of factors such as voice pitch, speed and volume. It is based on the Loquendo ETTS (Emotional Text-To-Speech) software, and the robot also expresses its mood through gestures drawn from a gesture dictionary ("gestionary").

Speech recognition accuracy can be improved by removing noise. In one proposed iterative speech enhancement algorithm, a Bayesian scheme is applied in the wavelet domain to separate the speech and noise components; the method is developed in the wavelet domain to exploit selected features of the time-frequency representation and involves two stages, a noise estimation stage and a signal separation stage (Thakur & Shinde, 2019). In another approach, a Principal Component Analysis (PCA) based HMM is used for the visual modality of audio-visual recordings, combining PCA with probabilistic density modelling. A further approach to speech recognition uses fuzzy modelling and decision making that ignores noise rather than detecting and removing it: the speech spectrogram is converted into a fuzzy linguistic description, and this description is used instead of precise acoustic features.

In another study, a voice recognition technique is combined with facial-feature interaction to assist visual artists with upper-limb disabilities in creating visual work in a digital medium, preserving the individuality and authenticity of the artwork. Techniques to recover phenomena such as sentence boundaries, filler words and disfluencies, referred to as structural metadata, have also been discussed; the approach automatically adds information about the location of sentence boundaries and speech disfluencies in order to enrich speech recognition output. Clarissa is a voice-enabled procedure browser deployed on the International Space Station (ISS); the main components of the Clarissa system are a speech recognition module, a classifier for executing the open-microphone accept/reject decision, a semantic analyser and a dialogue manager. Other work focuses mainly on expressive speech: to build a prosody model for each expressive state, an end pitch and a delta pitch for each syllable are predicted from a set of features gathered from the text, and the expression-tagged units are then pooled with the neutral data. In a TTS system, such paralinguistic events efficiently provide clues to the state of a transaction, and markup specifying these events is a convenient way for a developer to produce them in the audio coming from the TTS engine.

The main features of this approach are that smooth, natural-sounding speech can be synthesized, the voice characteristics can be changed, and the system is trainable. A limitation of the basic system is that the synthesized speech sounds "buzzy" because it is based on a vocoding technique; this has been overcome by a high-quality vocoder and hidden semi-Markov model based acoustic modelling. Speech synthesis falls into three categories: concatenative synthesis, articulatory synthesis and formant synthesis. One system focuses mainly on formant synthesis: an array of phonemes of a syllable with formant frequencies is given as input, the frequencies are processed and, in combination with Thai tonal-accent rules, the formant-frequency representation is converted into a wave format so that audio can be output via the soundcard.

Figure 2.1: Voice Recognition Flow Diagram

2.3 Voice Based Services for Blind People

Advances in computer-based accessible systems have opened up many avenues for the visually impaired across much of the globe. Audio-feedback-based virtual environments, such as screen readers, have helped blind people immensely in accessing internet applications. However, a large section of visually impaired people in different countries, in particular the Indian subcontinent, could not benefit much from such systems (Leonard & D'Arrigo, 2020), primarily because of the difference in the technology required for Indian languages compared with other widely used languages. The reviewed paper describes a voicemail system architecture that a blind person can use to access e-mail easily and efficiently. The contribution made by this research has enabled blind people to send and receive voice-based e-mail messages in their native language with the help of a mobile device. The proposed system's GUI was evaluated against the GUI of a traditional mail server and was found to perform much better than the existing GUIs. The project uses voice-to-text and text-to-voice techniques to provide access for blind people.

A navigation system for the blind uses TTS (Text-to-Speech) to provide a navigation service through voice. The suggested system, as an independent program, is fairly cheap and can be installed on a smartphone carried by a blind person, which makes the program easy to access. An increasing number of studies have used technology to help blind people integrate more fully into a global world. One such work presents software that lets blind users operate mobile devices; it builds on an instant-messenger system to support interaction between blind users and any other user connected to the network. The advances made in computer technology have opened new platforms for visually impaired people across the world, and it has been observed that nearly 60% of the total blind population of the world lives in India.

This project describes a voice-mail architecture that blind people can use to access e-mail and the multimedia functions of the operating system easily and efficiently. The architecture also reduces the cognitive load on blind users of remembering and typing characters on the keyboard, and it can also help physically handicapped and illiterate people. In previously available systems, blind people could not send e-mail: although the variety of e-mail clients and their settings makes e-mail usable in everyday, nomadic contexts, such e-mail is not usable by everyone, and blind people in particular cannot send messages with it. Audio-based e-mail is preferable for blind people, because they can easily respond to audio instructions (Leonard & D'Arrigo, 2020).

Such systems are still very rare, so audio-based e-mail is seldom available to blind people. The reviewed work describes a voicemail system architecture that a blind person can use to access e-mail easily and efficiently. The contribution made by this research has enabled blind people to send and receive voice-based e-mail messages in their native language with the help of a computer or a mobile device. The proposed system's GUI was evaluated against the GUI of a traditional mail server and was found to perform much better than the existing GUIs.

It involves the development of the following modules:

2.3.1 Speech-to-Text Converter

The system acquires speech at run time through a microphone and processes the sampled speech to recognize the uttered text; the recognized text can then be stored in a file. The reviewed system is developed on the Android platform using the Eclipse workbench and directly acquires and converts speech to text. It can supplement other, larger systems, giving users a different choice for data entry, and a speech-to-text system can also improve accessibility by providing a data-entry option for blind, deaf or physically handicapped users. A speech recognition system can be divided into several blocks: feature extraction, an acoustic-model database built from the training data, a dictionary, a language model and the speech recognition algorithm. The analog speech signal must first be sampled along the time and amplitude axes, that is, digitized. Samples of the speech signal are analysed at even intervals; this period is usually 20 ms because the signal in such an interval can be considered stationary. Speech feature extraction involves forming equally spaced discrete vectors of speech characteristics. Feature vectors from the training database are used to estimate the parameters of the acoustic models, and the acoustic model describes the properties of the basic elements that can be recognized. The basic element can be a phoneme for continuous speech, or a word for isolated-word recognition.
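
The speech-acquisition step described above can be sketched in a few lines. The reviewed system targets the Android platform, so the following is only an analogous desktop example, assuming the open-source SpeechRecognition package and its free Google web recognizer; the output file name is likewise an illustrative assumption.

# Minimal desktop sketch of the speech-to-text step (illustrative only).
# Requires: pip install SpeechRecognition pyaudio
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)    # estimate the background noise level
    print("Speak now...")
    audio = recognizer.listen(source)              # record one utterance from the microphone

try:
    text = recognizer.recognize_google(audio)      # send the audio to a free web ASR service
    with open("recognized_text.txt", "w", encoding="utf-8") as f:
        f.write(text)                              # store the recognized text in a file
    print("Recognized:", text)
except sr.UnknownValueError:
    print("Speech was not understood.")
except sr.RequestError as err:
    print("Recognition service unavailable:", err)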

2.3.2 Text-to-Speech Converter

This module converts text to voice output using speech synthesis techniques. Although initially used by the blind to listen to written material, text-to-speech is now used extensively to convey financial data, e-mail messages and other information to everyone via telephone (Nygren et al., 2020). It is also used on handheld devices such as portable GPS units to announce street names when giving directions. The text-to-speech converter described in the reviewed work accepts a string of up to 50 characters of text (letters and/or numbers) as input; a keyboard is interfaced with the controller, with all the letter and digit keys defined on it. The speech processor has an unlimited dictionary and can speak out almost any text provided at the input most of the time, giving an accuracy of above 90%. It is microcontroller-based hardware coded in the Embedded C language. Further research is needed to optimize the various methods of inputting the text, for example reading text with an optical sensor and converting it to speech, so that almost all of the physical challenges people face while communicating can be overcome.
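
For comparison with the microcontroller-based converter described above, a software-only sketch of the same idea is shown below, assuming the offline pyttsx3 library in Python; the 50-character limit mirrors the reviewed hardware, while the library choice and parameter values are illustrative assumptions rather than part of that design.

# Illustrative software counterpart of the text-to-speech converter (not the reviewed hardware).
# Requires: pip install pyttsx3
import pyttsx3

def speak(text: str, max_chars: int = 50) -> None:
    """Speak up to max_chars characters of the given text."""
    engine = pyttsx3.init()
    engine.setProperty("rate", 150)    # speaking rate in words per minute
    engine.say(text[:max_chars])       # mirror the 50-character input limit
    engine.runAndWait()                # block until speech has finished

if __name__ == "__main__":
    speak("You have 3 new messages in your inbox.")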

2.3.3 Word Recognition

Voice recognition software (also known as speech to text software) allows an individual to use

their voice instead of typing on a keyboard. Voice recognition may be used to dictate text into the

computer or to give commands to the computer. Voice recognition software allows for a quick method

of writing onto a computer. It is also useful for people with disabilities who find it difficult to use the

keyboard. This software can also assist those who have difficulty with transferring ideas onto paper as

it helps take the focus out of the mechanics of writing. Word recognition is measured as a matter of

speed, such that a word with a high level of recognition is read faster than a novel one. This manner of

testing suggests that comprehension of the meaning of the words being read is not required, but rather

the ability to recognize them in a way that allows proper pronunciation (Nygren et al., 2020).

Therefore, context is unimportant, and word recognition is often assessed with words presented in isolation, in formats such as flash cards. Nevertheless, ease in word recognition, as in fluency, builds proficiency that fosters comprehension of the text being read.

Figure 2.2: System Data Flow Diagram


2.4 Voice-Based E-mail System for the Blind (International Journal of Research Studies in Computer Science and Engineering, IJRSCSE)

The internet plays a vital role in today's world of communication; much of daily life and work now runs on it. Electronic mail (e-mail) is one of the most important parts of day-to-day life. However, some people do not know how to use the internet, and some are blind or illiterate, so living in this internet-driven world becomes very difficult for them. Various technologies are available, such as screen readers, ASR, TTS and STT, but these are not sufficiently efficient for such users. Around 39 million people are blind, about 246 million have low vision, and roughly 82% of people living with blindness are aged 50 and above. Internet facilities therefore need to be provided in a form these users can actually use. The reviewed project is a voice-based e-mail system for the blind, which greatly helps visually impaired and also illiterate people to send their mails (Leonard & D'Arrigo, 2020). Users of the system do not need to remember keyboard shortcuts or the locations of keys; simple mouse-click operations are all that is needed, making the system easy to use for users of any age group. The system announces, through voice, where the user currently is, so that the user does not have to worry about remembering which mouse-click operation performs which function.

Visually challenged people find it very difficult to utilize this technology because using it requires visual perception. Not everyone can use the internet, because accessing it requires knowing what is written on the screen; if that is not visible, it is of no use. This makes the internet a largely useless technology for visually impaired and illiterate people. The system mainly uses three types of technology, namely:

STT (Speech-to-Text): whatever the user speaks is converted into text. A small microphone icon is provided; on clicking it, the user speaks, and his or her speech is converted into text, which sighted people can also see and read.

TTS (Text-to-Speech): this method is the opposite of STT; it converts the text of the e-mails into synthesized speech. A text-to-speech (TTS) system converts normal language text into speech, while alternative systems render symbolic linguistic representations. Synthesized speech can be created by concatenating pieces of recorded speech stored in a database.

IVR (Interactive Voice Response): IVR is an advanced technology describing the interaction between the user and the system, in which the user responds to voice messages by using the keyboard. IVR allows a user to interact with an e-mail host system via the keyboard, after which users can easily service their own enquiries by listening to the IVR dialogue. IVR systems generally respond with pre-recorded audio to further assist users on how to proceed.
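
To show how these three technologies (STT, TTS and IVR-style dialogue) can fit together in a voice-driven mail client, the following sketch outlines a simple voice menu loop in Python. The command words, the use of pyttsx3 and SpeechRecognition, and the placeholder actions are assumptions made purely for illustration; none of this is the design of the systems reviewed here.

# Hedged sketch of an IVR-style voice menu that combines STT and TTS (illustrative only).
# Requires: pip install pyttsx3 SpeechRecognition pyaudio
import pyttsx3
import speech_recognition as sr

engine = pyttsx3.init()
recognizer = sr.Recognizer()

def say(text: str) -> None:
    engine.say(text)
    engine.runAndWait()

def listen() -> str:
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return ""                                      # treat failures as "not understood"

def main() -> None:
    while True:
        say("Say compose, inbox, or quit.")            # spoken prompt instead of a visual menu
        command = listen()
        if "compose" in command:
            say("Compose selected. Please dictate your message after the beep.")
        elif "inbox" in command:
            say("Opening your inbox.")
        elif "quit" in command:
            say("Goodbye.")
            break
        else:
            say("Sorry, I did not understand that.")   # re-prompt, as an IVR system would

if __name__ == "__main__":
    main()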

2.5 Accessibility Tools for Visually Impaired Users

2.5.1 Assistive Technologies Overview

Assistive technologies play a pivotal role in enhancing accessibility for individuals with visual

impairments. Screen readers and voice assistants are notable tools that have significantly improved the

digital experiences of visually impaired users. Screen readers, such as JAWS (Job Access With

Speech) and NVDA (NonVisual Desktop Access), convert textual content displayed on a screen into

synthesized speech or Braille output. This allows users to navigate, read, and interact with digital

interfaces effectively (Scherer et al., 2018). Voice assistants, like Amazon's Alexa and Apple's Siri,

provide natural language interaction and facilitate tasks such as setting reminders, querying

information, and controlling smart devices (Nygren et al., 2020). These technologies enable visually

impaired users to engage with digital environments, bridging the accessibility gap and promoting

independence.

2.5.2 Screen Readers and Voice Assistants

Screen readers, in particular, have evolved significantly over the years. NVDA, an open-source

screen reader, has gained popularity due to its cost-effectiveness and active community support

(Scherer et al., 2018). Commercial screen readers like JAWS have also introduced innovative features,

such as OCR (Optical Character Recognition) capabilities that enable the reading of content from

images and documents (Leonard & D'Arrigo, 2020). Voice assistants, on the other hand, have

integrated accessibility features that provide audio feedback, voice-controlled interactions, and audible

cues to aid navigation (Nygren et al., 2020). These tools exemplify the advancements in assistive

technology, empowering visually impaired users to perform a wide range of tasks independently.

2.5.3 Braille Displays and Tactile Feedback

In addition to auditory interfaces, tactile feedback is another avenue for enhancing

accessibility. Braille displays are devices that generate Braille characters on a surface, allowing

visually impaired users to read content through touch. These displays provide real-time access to

digital content, enabling users to perceive textual information without auditory assistance. Advances

in Braille technology have led to the development of more compact and affordable devices, expanding

their adoption (Leung et al., 2019). Furthermore, haptic feedback mechanisms integrated into

touchscreens and wearable devices offer tactile cues, enhancing navigation and interaction with digital

interfaces (Pielot et al., 2015). The integration of auditory and tactile cues demonstrates the

multidimensional approach in designing assistive technologies for visually impaired users.

2.6 Voice-Based Interfaces and Assistive Technologies

2.6.1 Overview of Voice Interaction

Voice interaction has emerged as a powerful modality to enhance accessibility, particularly for

visually impaired users. Voice-enabled devices and applications utilize natural language processing to

interpret spoken commands, enabling hands-free interactions. These interfaces are especially

advantageous for visually impaired individuals, as they provide an intuitive way to access information

and perform tasks. Technologies like Amazon's Alexa and Google Assistant exemplify the impact of

voice interfaces on accessibility, enabling users to engage in a wide array of activities, from setting

alarms to retrieving information from the web (Garcia et al., 2020).

2.6.2 Voice Assistants for Accessibility

Voice assistants have been integrated into various assistive technologies to provide visually

impaired users with seamless access to digital content. These platforms often offer features tailored to

accessibility needs, such as the ability to read aloud text, describe images, and provide audio cues for

navigation. For instance, Microsoft's Seeing AI app leverages artificial intelligence to audibly describe

scenes, recognize objects, and read text from images, significantly enhancing the user's understanding

of their environment (Nygren et al., 2020).

2.6.3 Challenges and Benefits of Voice Interfaces

While voice interfaces offer substantial benefits, they also present challenges in terms of

accuracy, privacy, and context-awareness. Accurate speech recognition is crucial for effective

interaction, and while advancements have been made, variability in user accents and speech patterns

can still pose difficulties (Nygren et al., 2020). Privacy concerns arise due to the nature of voice data

collection, raising questions about data security and user consent (Garcia et al., 2020). Moreover,

context-awareness, which involves interpreting user intent and context, remains an ongoing research

challenge in voice interfaces (Wang et al., 2018). Despite these challenges, voice-based technologies

hold immense potential for improving the accessibility and autonomy of visually impaired users.

2.7 Existing Voice-Based Email Systems

Several voice-enabled email systems have been developed to address the accessibility needs of

visually impaired users. Solutions like "Read My Mail" offer TTS capabilities, allowing users to listen

to their emails (Shrestha & Zaman, 2017). These systems often integrate with email clients and voice

assistants, offering a seamless experience for email management (Thakur & Shinde, 2019). Other

solutions like "Voice Dream Mail" provide specialized interfaces that prioritize voice interactions,

enabling users to compose and manage emails through natural speech (Voice Dream, n.d.).

Case Studies of Voice-Based Interfaces for Communication

Voice-based interfaces extend beyond email systems to encompass broader communication

platforms. "Be My Eyes" is an app that connects visually impaired users with sighted volunteers via

live video calls, enabling assistance with tasks like reading labels or navigating surroundings (Be My

Eyes, n.d.). Additionally, applications like "VocalEyes" empower visually impaired users to navigate

and explore their environment through voice-guided interactions (VocalEyes, n.d.). These case studies

showcase the versatility of voice-based interfaces in addressing diverse accessibility needs.

Comparative Analysis of Strengths and Limitations

A comparative analysis of existing voice-based email and communication systems provides

valuable insights into their strengths and limitations, aiding in the design of the proposed Voice-Based

Email System for visually impaired users.

Strengths: One common strength across many existing solutions is the improvement they bring to the

accessibility of digital communication for visually impaired users. Voice-based systems leverage

natural language processing to create intuitive and hands-free interactions, reducing the reliance on

visual cues. Additionally, these systems often offer seamless integration with other technologies, such

as email clients and voice assistants, creating a unified user experience (Thakur & Shinde, 2019). By

using TTS, users can listen to email content, allowing them to stay updated with their messages

without relying on a visual display. Moreover, some solutions, like "Voice Dream Mail," focus on

voice interactions, enabling users to compose, reply, and manage emails through spoken commands

(Voice Dream, n.d.). These strengths collectively enhance the usability and autonomy of visually

impaired users in digital communication.


Limitations: While voice-based email and communication systems offer significant advantages, they

do have limitations that can impact user experience. Accuracy in speech recognition is crucial for

effective interaction, and deviations in user accents and speech patterns can result in errors (Nygren et

al., 2020). Moreover, in certain contexts, privacy concerns arise due to voice data collection,

necessitating robust data security measures (Garcia et al., 2020). Integration with other platforms can

also present challenges if not seamlessly executed, potentially leading to compatibility issues (Thakur

& Shinde, 2019). Furthermore, some solutions might require a learning curve, as users need to adapt

to new interfaces and interaction paradigms (Voice Dream, n.d.). Recognizing these limitations aids in

addressing them proactively during the design and development of the proposed system.

Comparative Analysis Implications: A thorough comparative analysis enables the identification of

best practices and innovative features that contribute to user satisfaction and accessibility. By studying

the successes and limitations of existing systems, the proposed Voice-Based Email System can

incorporate lessons learned, addressing challenges while capitalizing on effective strategies. This

analysis informs decisions about system architecture, interface design, and the integration of TTS and

STT technologies. Ultimately, the comparative analysis sets the foundation for creating a user-centric,

robust, and inclusive voice-based email solution for visually impaired users.

2.8 Impact of Assistive Technologies on Visually Impaired Users

Assistive technologies have transformed the lives of visually impaired individuals by enabling

them to overcome barriers and participate more fully in digital and physical environments. These

technologies offer numerous benefits that extend beyond basic accessibility, profoundly impacting the

social, educational, and professional aspects of users' lives.

i. Social and Professional Implications of Accessibility Tools: The integration of

accessibility tools, such as voice-based interfaces and screen readers, has significantly

enhanced the social and professional interactions of visually impaired users. Voice

assistants provide a bridge for real-time information retrieval, aiding in social interactions

by offering up-to-date information without relying on sight (Garcia et al., 2020). Moreover,

the availability of voice-driven communication platforms empowers users to engage in

instant messaging, email, and social media, ensuring their active participation in digital

conversations (Thakur & Shinde, 2019). From a professional standpoint, these tools enable

visually impaired individuals to access a wealth of information, conduct research, and

engage in remote collaboration. Screen readers, for instance, facilitate the reading of

documents and web content, empowering users to stay informed and contribute effectively

in educational and workplace settings (Scherer et al., 2018). These technologies break

down communication barriers, enabling visually impaired users to communicate, share

ideas, and access opportunities that were once inaccessible.

ii. Enhancing Digital Independence and Inclusion: Assistive technologies serve as a

conduit to digital independence, fostering a sense of empowerment and autonomy for

visually impaired users. Voice-based interfaces allow users to interact with technology

without relying on visual cues, expanding their ability to control smart devices, access

information, and navigate digital interfaces (Nygren et al., 2020). Additionally, mobile

apps that provide real-time navigation and object recognition through voice guidance

contribute to safer and more confident mobility (VocalEyes, n.d.). These technologies also

contribute to a more inclusive society by minimizing barriers to participation. Accessible

interfaces in public spaces, digital services, and online platforms ensure that visually

impaired individuals can access the same information and services as their sighted

counterparts. This not only supports individual independence but also promotes diversity

and equality in various domains of life.

2.8.1 Barriers and Challenges in Implementing Assistive Technologies

While the impact of assistive technologies is substantial, challenges do exist in their

widespread implementation. One major challenge is ensuring that the technologies are seamlessly

integrated into various contexts, including education, workplaces, and public spaces (Leonard &
D'Arrigo, 2020). Inadequate awareness, training, and support can hinder users' ability to effectively

use these tools (Thakur & Shinde, 2019). Additionally, the rapidly evolving nature of technology

necessitates ongoing updates and adaptations to maintain accessibility.

i. Technical Compatibility and Usability: One of the key challenges lies in ensuring that

assistive technologies are seamlessly compatible with existing digital platforms and devices.

Inconsistencies in software updates, compatibility issues, and limited interoperability can

hinder the effective integration of these tools (Leonard & D'Arrigo, 2020). Moreover, while

the development of voice-based interfaces has advanced significantly, achieving high accuracy

in speech recognition remains an ongoing challenge. Variations in accents, dialects, and speech

patterns can lead to errors in understanding user commands (Nygren et al., 2020). This

technical barrier highlights the need for continuous improvement in speech recognition

algorithms and technologies.

ii. Awareness and Training: A lack of awareness and training among both visually impaired

users and service providers can impede the successful adoption of assistive technologies. Users

may not be fully informed about the available tools or how to effectively use them to their

advantage (Thakur & Shinde, 2019). Additionally, professionals responsible for providing

support and training might not be adequately trained themselves. This lack of awareness can

prevent users from harnessing the full potential of these technologies and realizing the benefits

they offer.

iii. Affordability and Availability: The affordability and availability of assistive technologies can

be a significant barrier, especially in regions with limited resources. High costs associated with

specialized devices, applications, or training programs can render these solutions inaccessible

to many visually impaired individuals (Leonard & D'Arrigo, 2020). Furthermore, limited

availability of these technologies in certain geographical areas can exacerbate disparities in

accessibility, leaving some users without access to the tools they need.

iv. Evolving Technological Landscape: The rapid evolution of technology presents both

opportunities and challenges. While advancements offer the potential for improved

accessibility, they also demand constant updates to maintain compatibility and functionality.

Assistive technologies need to keep pace with these changes to ensure their continued

effectiveness. However, frequent updates can pose challenges for users who may find it

challenging to adapt to new features or interfaces (Scherer et al., 2018).

v. Socio-Cultural Attitudes and Stigma: Socio-cultural attitudes towards disability can

contribute to the challenges faced in implementing assistive technologies. Stigma,

misconceptions, and lack of understanding about the capabilities of visually impaired

individuals can hinder the widespread acceptance and use of these technologies (Garcia et al.,

2020). Overcoming societal biases and promoting a more inclusive mindset is crucial for

creating an environment where assistive technologies are embraced and valued.

2.9 Text-to-Speech (TTS) and Speech-to-Text (STT) Technologies

2.9.1 Text-to-Speech (TTS) Technology

Text-to-Speech (TTS) technology, also known as speech synthesis, converts written text into spoken words. It is particularly valuable for individuals with visual

impairments or reading difficulties, as well as for applications like navigation systems, voice

assistants, and audiobooks. TTS systems employ a combination of linguistics, phonetics, and machine

learning to generate natural-sounding speech output. Text-to-Speech (TTS) technology stands at the

forefront of transforming written information into audible content, thereby breaking down barriers for

individuals who face challenges with reading or visual perception. With its roots dating back to early

experiments in artificial speech generation, TTS has evolved into a sophisticated technology with

applications spanning from aiding visually impaired individuals to enhancing user experiences in

various digital platforms.

TTS systems are built upon a foundation of linguistic analysis, phonetics, and increasingly

advanced machine learning techniques. The technology's components work in tandem to produce

speech that mimics human vocal patterns, tone, and rhythm. From analysing input text and

determining punctuation to generating natural-sounding prosody, TTS algorithms strive to replicate

the richness and nuances of spoken language. The evolution of TTS has been characterized by the

shift from rule-based approaches to data-driven methods, including concatenative synthesis and more

recently, deep learning models.

The applications of TTS are both practical and profound. Visually impaired users, for whom

traditional printed text can be a challenge, benefit from TTS systems that audibly convey information

from digital interfaces, thereby facilitating independent navigation and comprehension of content.

Beyond accessibility, TTS technology is integral to the development of voice assistants, making them

more engaging and human-like in their interactions. Audiobooks, language learning apps, and

navigation systems rely on TTS to deliver content in a way that is convenient and informative. As TTS

technology continues to evolve, challenges such as capturing intricate intonations, improving

multilingual capabilities, and reducing the robotic nature of synthesized speech remain areas of active

research. Additionally, ethical considerations about voice cloning and manipulation raise questions

about the potential misuse of this technology. Despite these challenges, the positive impact of TTS on

education, communication, and accessibility is undeniable. By converting text into a medium

accessible to all, TTS is not just a technological innovation but a powerful tool for fostering inclusion

and ensuring that information reaches every corner of our diverse society.

2.9.1.1 Components of TTS:

i. Text Analysis: The process begins with analyzing the input text to determine punctuation,

sentence structure, and context. This analysis guides the pronunciation of words and the

appropriate prosody.

ii. Phonetic Transcription: Each word is broken down into its phonetic components, which

represent the sounds of the spoken language. This transcription helps ensure accurate

pronunciation.

iii. Prosody Generation: Prosody involves the rhythm, intonation, and stress patterns of speech.

TTS systems use rules or statistical models to generate natural-sounding prosody, making the

speech output more human-like.

iv. Concatenative or Synthesis-by-Rule Approaches: TTS systems can use concatenative

synthesis, where pre-recorded human speech segments are combined to create words and

sentences, or synthesis-by-rule, where linguistic rules generate speech sounds.

v. Machine Learning Techniques: Modern TTS systems often leverage machine learning,

including deep learning, to improve naturalness and adapt to different speaking styles and

languages.
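
The components listed above can be made concrete with a toy front-end sketch: the code below performs rudimentary text analysis and dictionary-based phonetic transcription for a handful of words. The miniature phoneme dictionary and function names are invented for illustration; a real TTS front end would use a full lexicon, grapheme-to-phoneme rules and a prosody model.

# Toy illustration of a TTS front end: text analysis followed by phonetic transcription.
# The mini phoneme dictionary below is invented for demonstration purposes only.
import re

PHONEME_DICT = {
    "you":  ["Y", "UW"],
    "have": ["HH", "AE", "V"],
    "new":  ["N", "UW"],
    "mail": ["M", "EY", "L"],
}

def analyze(text):
    """Minimal 'text analysis' step: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def transcribe(words):
    """Look up each word's phonemes; spell out unknown words letter by letter."""
    return [PHONEME_DICT.get(w, list(w.upper())) for w in words]

if __name__ == "__main__":
    tokens = analyze("You have new mail!")
    print(transcribe(tokens))
    # [['Y', 'UW'], ['HH', 'AE', 'V'], ['N', 'UW'], ['M', 'EY', 'L']]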

2.9.1.2 Applications of Text-to-Speech:

1. Accessibility for Visually Impaired Users: TTS technology plays a pivotal role in enhancing

accessibility for individuals with visual impairments. Through TTS-enabled screen readers, visually

impaired users can listen to the content of websites, documents, emails, and other digital materials.

This accessibility feature empowers them to access and navigate the digital world, ensuring they have

equal access to information, education, and various online services. By converting text into audible

speech, TTS contributes to an inclusive digital environment, allowing visually impaired users to

participate actively and independently.

2. Voice Assistants and Chatbots: Voice assistants and chatbots are virtual AI-driven entities

designed to interact with users through natural language. TTS is the technology behind their ability to

speak and engage in conversations. TTS gives these virtual agents a human-like voice, making

interactions more relatable and user-friendly. Whether it's asking Siri for the weather forecast or

instructing a chatbot to book a hotel room, TTS facilitates seamless communication and assistance,

enhancing the user experience and making interactions feel more personal.

3. Navigation Systems: TTS enhances navigation systems by providing turn-by-turn directions and

location information audibly. In-car navigation systems, smartphone maps apps, and GPS devices use

TTS to guide users through unfamiliar routes. By vocalizing street names, distances, and directions,

TTS enables drivers and pedestrians to navigate safely without needing to look at a screen or map,

making travel more convenient and reducing distractions.

4. Audiobooks and E-Learning Platforms: Audiobooks have gained popularity as an alternative to

reading text-based content. TTS technology enables the conversion of written books, articles, and

educational materials into audio format. This is particularly valuable for individuals who prefer to

consume content through listening, whether they are commuting, exercising, or engaged in other

activities. E-learning platforms also use TTS to provide audio versions of educational materials,

making learning accessible to different learning styles and preferences.

5. Language Learning Tools: TTS technology aids language learners by providing accurate

pronunciation models and facilitating language comprehension. Language learning apps and platforms

utilize TTS to audibly pronounce words, phrases, and sentences in different languages. Learners can

listen to native-like pronunciation and practice their speaking skills, enhancing their ability to

communicate effectively in a new language.

6. Providing Audio Feedback in User Interfaces: TTS is integrated into user interfaces to provide

audio feedback and guidance. For instance, when visually impaired users interact with software,

applications, or devices, TTS can vocalize menu options, button labels, and other interface elements.

This ensures that users receive real-time information about their interactions, making technology more

usable and accessible for all.

2.9.2 Speech-to-Text (STT) Technology

Speech-to-Text (STT) technology, also known as Automatic Speech Recognition (ASR), converts spoken language into written text, bridging the gap between oral communication and written content. STT systems find applications in transcription services, voice commands for devices, and making spoken content searchable. This transformative capability has far-reaching implications, impacting various industries and sectors by making spoken information more accessible, searchable, and actionable.

STT systems employ a complex interplay of computational linguistics, signal processing, and

machine learning algorithms to transcribe spoken words into written text. The process begins with

acoustic feature extraction, where audio signals are dissected into components that capture the

spectrum of sound over time. These features then undergo analysis by trained models, which map the

audio patterns to phonemes, words, and sentences. Language models enhance transcription accuracy

by considering context, grammar, and semantic meaning.

The applications of STT technology are vast and diverse:

i. Transcription Services: STT technology revolutionizes transcription services by enabling

rapid and accurate conversion of spoken content, such as meetings, interviews, lectures, and

podcasts, into written text. This not only expedites the process but also facilitates keyword

searches and content indexing, making information retrieval more efficient.

ii. Voice Commands and Interfaces: Voice commands have become integral to modern

technology, powering voice-operated devices, applications, and smart assistants. STT

technology interprets user vocalizations, transforming them into actionable commands. This

seamless interaction enhances user experiences and enables hands-free control of various

devices and services.

iii. Real-time Captioning and Accessibility: Live captioning for videos, broadcasts, and

presentations is made possible through STT technology. This feature benefits individuals who

are deaf or hard of hearing by providing real-time textual representation of spoken content.

Moreover, it enriches accessibility by enabling users to follow along in noisy environments or

in scenarios where audio cannot be played aloud.

iv. Data Entry and Dictation: STT technology simplifies data entry tasks by allowing users to

dictate text rather than type it manually. This is particularly advantageous in scenarios where

typing is impractical, such as when driving or multitasking. It also aids individuals with

mobility impairments who may have difficulty using traditional keyboards.

v. Language Translation: STT technology serves as a precursor to language translation tools,

facilitating the input of spoken language for conversion into written text. This text can then be

further processed by translation algorithms to generate multilingual content.

vi. Voice Search: STT powers voice search functionalities in search engines and digital assistants,

enabling users to retrieve information by speaking their queries aloud. This streamlined search

process enhances user convenience and encourages more natural interactions with technology.

2.9.2.1 Components of Speech-to-Text

i. Acoustic Feature Extraction: Incoming audio signals are transformed into a series of

acoustic features, such as spectrograms, which represent the sound spectrum over time.

ii. Feature Matching: These acoustic features are compared to a set of trained models, often

using Hidden Markov Models (HMMs) or deep neural networks (DNNs), to determine the

most likely sequence of phonemes or words.

iii. Language Models: STT systems employ language models to predict the most probable word

sequences based on context and grammar, improving the accuracy of transcriptions.

iv. Post-Processing: Post-processing techniques correct errors and improve transcription quality

by considering contextual information.
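To make this pipeline concrete, the sketch below uses the SpeechRecognitionEngine class from the .NET System.Speech.Recognition library to transcribe a single utterance from the default microphone. It is illustrative only and is not the recognizer used in this project; real-world accuracy depends on the acoustic and language models described above.

using System;
using System.Speech.Recognition;   // requires a reference to the System.Speech assembly

class SttSketch
{
    static void Main()
    {
        using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());   // free-form dictation
            recognizer.SetInputToDefaultAudioDevice();

            Console.WriteLine("Speak now...");
            RecognitionResult result = recognizer.Recognize(); // blocks until an utterance is heard or times out

            if (result != null)
            {
                Console.WriteLine("Transcription: " + result.Text);
                Console.WriteLine("Confidence: " + result.Confidence);
            }
        }
    }
}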

2.9.2.2 Applications of Speech-to-Text (STT) Technology

Speech-to-Text (STT) technology, also known as Automatic Speech Recognition (ASR), has a

transformative impact on various domains, enriching communication, enhancing accessibility, and

redefining the way we interact with information.

a. Transcription Services for Meetings, Interviews, and Lectures: STT technology

revolutionizes transcription services by swiftly and accurately converting spoken content into

written text. This application streamlines administrative tasks by automating the conversion

process, as demonstrated by Stolcke et al. (2018). Researchers highlight how advancements

in deep learning techniques have significantly improved transcription accuracy and made it

feasible to transcribe large volumes of spoken data efficiently. This has profound implications

for industries reliant on accurate documentation, such as legal, academic, and corporate

sectors.

b. Voice Commands for Smart Devices: Voice commands have become a ubiquitous means of

interacting with smart devices. Ghahremani et al. (2017) emphasize the role of STT

technology in enabling seamless voice-operated interfaces for various applications. This

technology allows users to control devices, access information, and execute commands

through spoken language. The integration of STT ensures that voice commands are

accurately interpreted and translated into actions, enhancing user convenience and device

usability.

c. Real-time Captioning for Videos and Broadcasts: The real-time captioning of videos and

broadcasts is a critical application of STT technology, particularly for enhancing accessibility

for individuals who are deaf or hard of hearing. As highlighted by Lopes et al. (2021), STT-

driven real-time captioning ensures that spoken content is transcribed into text in real-time,

providing an inclusive experience for all viewers. This application empowers individuals to

access spoken information through textual representation, creating equitable engagement in

various media formats.

d. Accessibility for Deaf or Hard-of-Hearing Individuals: STT technology contributes to

accessibility by providing deaf or hard-of-hearing individuals with a means to access spoken

content. As outlined by Paine et al. (2020), STT technology enables the conversion of spoken

language into written text, allowing individuals with hearing impairments to comprehend

spoken conversations, presentations, and public announcements. This application fosters

inclusivity by ensuring that auditory information is available through visual means.

e. Making Audio Content Searchable in Databases and Archives: STT technology

transforms spoken content into searchable text, making audio content easily discoverable and

retrievable in databases and archives. Boucher et al. (2020) highlight the significance of STT

in indexing and organizing vast amounts of audio data, enabling users to search for specific

keywords or phrases within spoken recordings. This application enhances data management

and research by unlocking valuable insights from audio resources that were previously

challenging to navigate.

Incorporating STT technology into these applications showcases its capacity to transcend

communication barriers, promote accessibility, and streamline information management across

various sectors.

CHAPTER THREE

SYSTEM ANALYSIS AND DESIGN

3.1 Analysis of the Existing System

Before implementing a Voice-Based Email System for Visually Impaired users, it's crucial to

analyze the shortcomings and inefficiencies of the existing system, if any. In this case, the existing

system likely involves visually impaired individuals using screen readers or other assistive

technologies to access and manage their emails. Here's a detailed analysis:

• Accessibility Challenges: The primary issue with the existing system is accessibility. Visually impaired users heavily rely on screen readers, which may not provide a seamless and efficient email reading and management experience. The interface might not be fully compatible with screen readers, resulting in navigation difficulties and inconsistencies in email rendering.

• Dependency on Text-Based Emails: The current system is text-centric, which poses limitations for users who prefer auditory communication. It doesn't effectively support voice commands for composing or managing emails, limiting the independence of visually impaired users.

• Limited Multimodal Interaction: Visually impaired users may need to switch between multiple assistive technologies (screen readers, speech recognition software, etc.), making the email communication process cumbersome and less intuitive.

3.2 Challenges of the Existing System

Identifying the challenges of the existing system is crucial for understanding the need for

improvement. These challenges include:

1. Limited Accessibility: The existing system's lack of accessibility features hinders visually

impaired users' ability to independently access and manage their emails.

2. Low Efficiency: The current system's text-based nature and limited voice interaction

capabilities result in a less efficient and time-consuming email management process.


3. Dependency on Third-Party Tools: Users may have to rely on multiple third-party tools and

software, which can be costly and may not work seamlessly together.

3.3 Analysis of the New System

The proposed Voice-Based Email System for Visually Impaired users offers several improvements and

innovations:

i. Enhanced Accessibility: The new system is designed from the ground up with accessibility in

mind. It provides a user-friendly and fully compatible interface for screen readers, ensuring a

seamless email reading experience.

ii. Voice Interaction: The system allows for intuitive voice interactions, enabling users to

compose, read, and manage emails through natural spoken commands. This feature reduces the

need for complex keystrokes or manual text input.

iii. Integration of TTS and STT: The integration of Text-to-Speech (TTS) and Speech-to-Text

(STT) technologies enhances the system's overall functionality. TTS ensures that email content

is read aloud naturally, while STT converts spoken user commands into text for processing.

3.4 Justification of the New System

The justification for implementing the new Voice-Based Email System lies in its ability to address the

challenges and limitations of the existing system:

i. Improved Accessibility: By providing a more accessible interface and seamless integration

with assistive technologies, the new system empowers visually impaired users to

independently access and manage their emails.

ii. Efficiency and Independence: The new system's voice interaction capabilities significantly

improve efficiency and independence. Users can perform email-related tasks more quickly and

with greater ease, reducing their reliance on external assistance.

iii. Enhanced User Experience: The integration of TTS and STT technologies ensures a more

natural and user-friendly email experience. This aligns with the principle of universal design,

where technology is created to be usable by individuals of all abilities.

3.5 Requirements for the New Voice-Based Email System for Visually Impaired Users

Developing a Voice-Based Email System tailored for visually impaired users necessitates a

comprehensive understanding of the system's requirements. These requirements encompass technical,

functional, and usability aspects, all aimed at ensuring the system is accessible, efficient, and user-

friendly. Here's an extensive exploration of the requirements:

1. Accessibility Requirements:

The system must be fully compatible with popular screen reader software such as JAWS,

NVDA, and VoiceOver. All user interface elements, including buttons, menus, and text fields, must be

accurately read aloud by screen readers. The user interface should offer high-contrast colour schemes

to accommodate users with low vision. Font size and style must be adjustable to allow users to select

large, legible fonts.

Voice commands should be a central feature, allowing users to navigate the interface, compose emails,

and perform other tasks using natural language.

2. Functional Requirements:

The functional requirements of the Voice-Based Email System are central to its effectiveness in

providing visually impaired users with a seamless and accessible email experience.

i. Multimodal Interaction: The system must offer both voice and text-based interaction modes to

cater to users' diverse needs. This allows users to switch effortlessly between modes, selecting

the one that best suits their preference and context. Users should be able to compose emails

using either voice commands or text input interchangeably.

ii. Text-to-Speech (TTS) Module: A critical functional requirement is the inclusion of a robust

TTS module. This module should proficiently convert written email content into natural, easily

comprehensible speech. Users should have control over speech attributes, enabling them to

adjust speech rate, pitch, and volume according to their preferences.

iii. Speech-to-Text (STT) Module: An equally important functional aspect is the STT module,

responsible for transcribing spoken user commands and messages into text. The STT module

should be trained to accurately recognize various accents, dialects, and speech patterns to

ensure users' spoken instructions are comprehensively understood and interpreted.

iv. Email Interaction and Management: The core functionality of the system revolves around its

capability to interact with emails. It should connect to email servers via standard email

protocols like IMAP and SMTP to facilitate email retrieval, sending, and management. Users

must be able to read, compose, reply to, and delete emails using voice commands or text input

interchangeably. The system should offer comprehensive email management features,

including organizing emails into folders, marking messages as important, and flagging for

follow-up.
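As an illustration of the email-handling requirement above, the following sketch hands a composed message to an SMTP server using the System.Net.Mail classes in the .NET Framework. The host name, port, and credentials are placeholders, and only sending is shown; retrieval over IMAP would require an additional library, since the .NET Framework does not include an IMAP client.

using System;
using System.Net;
using System.Net.Mail;

class SmtpSendSketch
{
    static void Main()
    {
        // Placeholder server details; a real deployment would load these from configuration
        using (SmtpClient client = new SmtpClient("smtp.example.com", 587))
        {
            client.EnableSsl = true;
            client.Credentials = new NetworkCredential("user@example.com", "password");

            using (MailMessage message = new MailMessage(
                "user@example.com",            // sender address
                "recipient@example.com",       // recipient address
                "Dictated subject line",       // subject, e.g. produced by the STT module
                "Dictated message body."))     // body text
            {
                client.Send(message);          // hand the message to the SMTP server
            }
        }

        Console.WriteLine("Message submitted for delivery.");
    }
}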

3. Scalability and Maintenance Requirements

Scalability is a fundamental aspect of the Voice-Based Email System for Visually Impaired

Users, ensuring that the system can evolve and expand as user needs grow. To achieve this, the system

should be designed with scalability in mind. It should be capable of handling an increasing number of

users, a growing volume of emails, and potential enhancements to its features and capabilities.

Scalability ensures that the system remains responsive and reliable, even as its user base and data load

increase.

Maintenance is equally vital to the system's sustainability and long-term success. Regular

maintenance activities, including bug fixes, security updates, and performance optimizations, must be

carried out promptly to ensure the system operates smoothly. Maintenance also involves addressing

user feedback and incorporating improvements based on user needs and evolving technologies.

Providing ongoing support and updates ensures that the system remains accessible and functional,

meeting the changing requirements of visually impaired users. Additionally, user training and

documentation should be continuously updated to assist users in making the most of the system's

features and capabilities.

Together, scalability and maintenance requirements are essential for the system's longevity,

adaptability, and continued effectiveness in serving the needs of visually impaired users. These

considerations underscore the commitment to ensuring that the system remains a reliable and

accessible tool for email communication.

3.6 System Design

In the design phase of a Voice-Based Email System for Visually Impaired Users, database

design and UML (Unified Modeling Language) design play a crucial role in structuring the system's

data and functionality. Below, we will explore both aspects of the system design in detail.

3.6.1 Database Design

Effective database design is essential for storing, managing, and retrieving user-related data,

emails, and messages efficiently. In this context, we can establish three primary tables: Users, Email,

and Message.

Users Table: The Users table is fundamental to the system, containing information about each

registered user. It may include fields such as:

• User ID (Primary Key): A unique identifier for each user.

• Username: The username chosen by the user for authentication.

• Password: Encrypted password for security.

• First Name and Last Name: User's personal information.

• Email Address: The user's email address for account recovery.

• Accessibility Preferences: User preferences for screen readers, voice commands, and interface customization.

Email Table: The Email table stores data related to individual emails. Its fields may include:

• Email ID (Primary Key): A unique identifier for each email.

• Sender ID (Foreign Key): References the user who sent the email.

• Subject: The subject of the email.

• Content: The content of the email, which can be stored as text.

• Timestamp: Date and time when the email was sent or received.

• Read Status: Indicates if the email has been read or not.

Message Table: The Message table is responsible for storing user messages and their interactions.

Fields may include:

• Message ID (Primary Key): A unique identifier for each message.

• Sender ID (Foreign Key): References the user who sent the message.

• Receiver ID (Foreign Key): References the user who received the message.

• Content: The text content of the message.

• Timestamp: Date and time when the message was sent.

• Status: Indicates if the message is sent, received, or read.

Table 3.1: Users Table

Field                     | Data Type    | Description
User ID (Primary Key)     | Unique ID    | Unique identifier for each user.
Username                  | String       | User-chosen username for authentication.
Password                  | Encrypted    | Encrypted password for security.
First Name                | String       | User's first name.
Last Name                 | String       | User's last name.
Email Address             | Email format | User's email address for account recovery.
Accessibility Preferences | Preferences  | User preferences for screen readers, voice commands, and interface customization.

Table 3.2: Email Table

Field                   | Data Type | Description
Email ID (Primary Key)  | Unique ID | Unique identifier for each email.
Sender ID (Foreign Key) | Reference | References the user who sent the email.
Subject                 | String    | The subject of the email.
Content                 | Text      | The content of the email, stored as text.
Timestamp               | Date/Time | Date and time when the email was sent or received.
Read Status             | Boolean   | Indicates if the email has been read or not.

Table 3.3: Message Table

Field                     | Data Type   | Description
Message ID (Primary Key)  | Unique ID   | Unique identifier for each message.
Sender ID (Foreign Key)   | Reference   | References the user who sent the message.
Receiver ID (Foreign Key) | Reference   | References the user who received the message.
Content                   | Text        | The text content of the message.
Timestamp                 | Date/Time   | Date and time when the message was sent.
Status                    | Enumeration | Indicates if the message is sent, received, or read.

These tables establish the foundational structure for the system's data management, allowing for

efficient email storage, retrieval, and user management.
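A minimal sketch of how this schema could be created with the System.Data.SQLite library used elsewhere in this project is given below. The column names follow Tables 3.1 to 3.3 and the SQLite types are approximations; the prototype in the appendix actually uses a slightly simpler users(name, email, password) layout, so this DDL is illustrative rather than definitive.

using System.Data.SQLite;

class SchemaSketch
{
    static void Main()
    {
        using (SQLiteConnection connection = new SQLiteConnection("Data Source=voicemail.db; Version=3;"))
        {
            connection.Open();

            // One command text containing the three CREATE TABLE statements from Tables 3.1-3.3
            string ddl =
                "CREATE TABLE IF NOT EXISTS users (" +
                "  id INTEGER PRIMARY KEY AUTOINCREMENT," +
                "  username TEXT NOT NULL," +
                "  password TEXT NOT NULL," +                // stored in encrypted form
                "  first_name TEXT, last_name TEXT," +
                "  email TEXT UNIQUE NOT NULL," +
                "  accessibility_preferences TEXT);" +

                "CREATE TABLE IF NOT EXISTS email (" +
                "  id INTEGER PRIMARY KEY AUTOINCREMENT," +
                "  sender_id INTEGER REFERENCES users(id)," +
                "  subject TEXT, content TEXT," +
                "  timestamp TEXT," +
                "  read_status INTEGER DEFAULT 0);" +

                "CREATE TABLE IF NOT EXISTS message (" +
                "  id INTEGER PRIMARY KEY AUTOINCREMENT," +
                "  sender_id INTEGER REFERENCES users(id)," +
                "  receiver_id INTEGER REFERENCES users(id)," +
                "  content TEXT, timestamp TEXT," +
                "  status TEXT);";

            using (SQLiteCommand command = new SQLiteCommand(ddl, connection))
            {
                command.ExecuteNonQuery();   // creates the tables only if they do not already exist
            }
        }
    }
}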

3.6.2 UML Design

The UML design complements the database design by providing a visual representation of the

system's classes and their relationships. Two key diagrams are the Class Diagram and the Use Case

Diagram:

Class Diagram Model: The Class Diagram represents the system's classes and their associations. In

the context of the Voice-Based Email System, relevant classes may include:

• User: Representing registered users.

• Email: Representing email messages.

• Message: Representing user messages.

• TTS Module: A class responsible for text-to-speech conversion.

• STT Module: A class responsible for speech-to-text conversion.

• Email Manager: Overseeing email interactions and management.

Associations in the Class Diagram indicate how classes are related. For example, the User class may

have associations with Email and Message classes to represent user interactions with these entities.

Use Case Diagram Model: The Use Case Diagram models the system's functionality from a user's

perspective. Key use cases may include:

• Register Account: A use case representing the user registration process.

• Compose Email: Demonstrating how users create and send emails.

• Read Email: Illustrating how users access and read their emails.

• Manage Emails: Representing actions like deleting or organizing emails.

• Send Message: Depicting how users send messages to other users.

• Voice Commands: Demonstrating the use of voice commands for interaction.

Actors in the Use Case Diagram represent the system's users, including visually impaired users and

administrators. The diagram outlines how these actors interact with the system to achieve specific

tasks.

CHAPTER FOUR
SYSTEM IMPLEMENTATION AND PERFORMANCE EVALUATION

4.1 System Implementation

Chapter Four of the Voice-Based Email System project focuses on the practical

implementation of the system and its subsequent performance evaluation. This chapter is critical as it

transforms the theoretical design and concepts into a functioning, real-world application, followed by

an assessment of how well the system meets its objectives. Let's delve into both aspects in detail:

The implementation phase involves translating the system's design and architecture into

working code. It encompasses several key steps:

Software Development: Developers write code based on the design specifications. The development

environment and programming languages, in this case, C# and SQLite, are employed to create the

system's core functionalities. Modules for text-to-speech (TTS), speech-to-text (STT), email

interaction, and user interface design are developed and integrated.

Database Setup: The database, consisting of tables for users, emails, and messages, is created and

configured according to the database design. SQL queries are used to manage data, including user

authentication, email storage, and message tracking.

Integration of TTS and STT: The text-to-speech and speech-to-text modules are integrated into the

system. APIs or libraries for these technologies are used to convert email content and user commands

accurately.

User Interface Development: The user interface is developed with a focus on accessibility and user

experience. Front-end technologies like DevExpress and Telerik Window Form Application UI and C#

are employed to create an intuitive and screen reader-compatible interface.

Figure 4.1: Splash screen

The splash screen of the application, serves as the initial visual introduction to the Voice-Based

Email System for Visually Impaired Users. This screen provides a visually impaired user-friendly

experience by offering an auditory welcome message and guiding users on how to initiate voice

interaction. It acts as a reassuring entry point, indicating that the application is ready to assist users in

managing their emails through voice commands, setting the tone for an accessible and user-centric

email experience.

Figure 4.2: Login Page

Figure 4.2 displays the login interface of the application, which serves as the gateway for users to

access their accounts. The form prominently features fields for entering an email address and

password, allowing registered users to securely log in. Additionally, a "Register" link is thoughtfully

included on the login page, offering an accessible and convenient pathway for new users to create

their accounts. This user-friendly design prioritizes both security and accessibility, enhancing the

overall user experience of the Voice-Based Email System.

Figure 4.3: Register Page Using Voice-Based Input

Figure 4.3 depicts the user registration form within the application. This form, accessible via a

voice-based library, captures essential user information, including their name, email address,

password, and a confirmation of the password. It offers an inclusive approach to user registration,

enabling visually impaired users to input their data using voice commands. Additionally, the presence

of a "Login" link on the registration page provides a seamless transition for users who have already

registered, enhancing the overall user experience by simplifying navigation between registration and

login processes.

Figure 4.4: Create and Read Message Page

Figure 4.4 illustrates the "Create and Read Message" form of the application, featuring two-tab

controls. The first tab allows users to compose and send messages, providing an interface for message

creation. The second tab serves as the inbox or repository for incoming messages, enabling users to

read received messages. Importantly, the integration of a Text-to-Speech library enhances

accessibility, as it converts the text content of messages into audible speech, ensuring that visually

impaired users can seamlessly access and engage with their messages through natural voice

interaction.

Voice Command Integration: Voice command recognition and processing functionalities are

implemented, allowing users to interact with the system using natural language commands.
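One way such commands can be wired up in C# is to register a small fixed vocabulary with the System.Speech.Recognition engine, as in the sketch below. The command set shown ("read inbox", "compose message", "log out") is purely illustrative; the project's actual command grammar and handlers live in its own modules.

using System;
using System.Speech.Recognition;

class VoiceCommandSketch
{
    static void Main()
    {
        using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine())
        {
            // Restrict recognition to a small set of application commands
            Choices commands = new Choices("read inbox", "compose message", "log out");
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder(commands)));
            recognizer.SetInputToDefaultAudioDevice();

            recognizer.SpeechRecognized += (sender, e) =>
            {
                // e.Result.Text holds the matched command phrase
                Console.WriteLine("Command recognised: " + e.Result.Text);
            };

            recognizer.RecognizeAsync(RecognizeMode.Multiple);   // keep listening until stopped
            Console.WriteLine("Listening for commands. Press Enter to stop.");
            Console.ReadLine();
            recognizer.RecognizeAsyncStop();
        }
    }
}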

Testing and Debugging: Rigorous testing is conducted to identify and rectify software bugs, errors,

and compatibility issues. Testing includes unit testing, integration testing, and user acceptance testing.

User Training: User training materials, including tutorials and guides, are created to assist visually

impaired users in using the system effectively.

4.2 System Requirements: Hardware and Software

Ensuring that the hardware and software components of the Voice-Based Email System for Visually

Impaired Users meet the project's requirements is crucial for its functionality, accessibility, and overall

success. Below, we outline the necessary hardware and software requirements:

4.2.1 Hardware Requirements

i. Storage: Adequate storage space is essential to store user data, emails, and system logs. The

storage solution should be scalable to accommodate growing data volumes.

ii. Redundancy: Implement redundancy measures such as RAID configurations and regular

backups to ensure data integrity and availability.

iii. Internet Connection: A stable and high-speed internet connection is necessary to facilitate

email communication, system updates, and user authentication.

iv. Load Balancer: For scalability and fault tolerance, consider load balancers to distribute incoming traffic.

v. User Devices: Visually impaired users may access the system on various devices, including

computers. Ensure that the system is responsive and accessible on different screen sizes and

devices.

vi. Audio Peripherals: For local processing of speech recognition and synthesis, users' devices may require compatible hardware, such as microphones for input and speakers or headphones for output.

4.2.2 Software Requirements

Operating System: The server should run a stable and secure operating system. Common choices

include Linux distributions (e.g., Ubuntu Server, CentOS) or Windows Server, depending on your

development team's expertise and software compatibility requirements.

Web Server: Use a web server, such as Apache, Nginx, or Microsoft IIS, to serve web pages and

handle HTTP requests from clients.

Database Management System (DBMS): Implement a relational database management system

(DBMS) like SQLite to store and manage user data, emails, and messages.

Programming Languages:

• Server-Side: C# is commonly used for server-side scripting in web applications. Ensure that the server-side code is compatible with the chosen C# version.

• Text-to-Speech (TTS) and Speech-to-Text (STT) Libraries/APIs: Integrate TTS and STT libraries or APIs compatible with your chosen programming languages. Popular choices include Google Cloud Text-to-Speech.

Email Protocols: The system should support standard email protocols, such as SMTP for sending mail and IMAP for retrieving it.

Development and Testing Tools:

Utilize integrated development environments (IDEs), version control systems (e.g., Git), and testing

frameworks to facilitate software development and testing processes.

Voice Command Recognition Library: If applicable, integrate voice command recognition libraries

or APIs, like the Web Speech API, to enable voice interaction.

4.3 Performance Evaluation

The performance evaluation phase assesses the system's functionality and efficiency, ensuring that it

meets its intended objectives. Key components of performance evaluation include:

i. Usability Testing: A usability study is conducted with visually impaired users to evaluate the

system's accessibility, ease of use, and user satisfaction. User feedback is collected and

analyzed to identify areas for improvement.

ii. Functional Testing: Functional tests are performed to verify that the system's core

functionalities, such as email retrieval, composition, and voice command recognition, are

working as intended.

iii. Load Testing: Load testing assesses how well the system performs under different levels of

user load. It ensures that the system remains responsive even during peak usage times.

iv. Security Assessment: Security testing is crucial to identify vulnerabilities and potential

threats. This includes assessing user authentication mechanisms, data encryption, and

protection against common security risks.

v. Scalability Evaluation: The system's scalability is evaluated by simulating increased user

loads and data volumes to ensure that it can handle future growth.

vi. Accuracy of TTS and STT: The accuracy and naturalness of the text-to-speech and speech-to-text modules are assessed through various test cases, ensuring that email content is read accurately and user commands are recognized correctly (a simple word-error-rate check is sketched after this list).

vii. Performance Optimization: Based on the evaluation results, performance optimizations are

implemented to enhance system responsiveness, reduce latency, and improve overall user

experience.
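A common way to quantify STT accuracy during this evaluation is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the recognized transcript into the reference transcript, divided by the number of words in the reference. The sketch below computes WER for a single test utterance; the sample sentences are illustrative and are not taken from the project's test data.

using System;

class WerSketch
{
    // Word error rate = edit distance between the word sequences / number of reference words
    static double WordErrorRate(string reference, string hypothesis)
    {
        string[] r = reference.ToLower().Split(' ');
        string[] h = hypothesis.ToLower().Split(' ');
        int[,] d = new int[r.Length + 1, h.Length + 1];

        for (int i = 0; i <= r.Length; i++) d[i, 0] = i;   // all deletions
        for (int j = 0; j <= h.Length; j++) d[0, j] = j;   // all insertions

        for (int i = 1; i <= r.Length; i++)
            for (int j = 1; j <= h.Length; j++)
            {
                int substitution = d[i - 1, j - 1] + (r[i - 1] == h[j - 1] ? 0 : 1);
                d[i, j] = Math.Min(substitution, Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1));
            }

        return (double)d[r.Length, h.Length] / r.Length;
    }

    static void Main()
    {
        string reference = "compose a new message to my supervisor";
        string hypothesis = "compose new message to my supervisor";

        // One deleted word out of seven reference words gives roughly 14.3%
        Console.WriteLine("WER: " + WordErrorRate(reference, hypothesis).ToString("P1"));
    }
}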

CHAPTER FIVE
CONCLUSION AND RECOMMENDATIONS
5.1 Conclusion

The development and implementation of the Voice-Based Email System for Visually Impaired

individuals, leveraging the power of C# as the primary programming language, marks a significant

stride toward enhancing accessibility and inclusivity in the digital world. This project aimed to

address a pressing issue faced by visually impaired individuals, namely their limited access to email

communication, by providing a user-friendly, voice-controlled interface. Through this endeavour, the

study has achieved several notable accomplishments.

First and foremost, we successfully designed and implemented a robust system architecture

that seamlessly integrates Text-to-Speech (TTS) and Speech-to-Text (STT) technologies, allowing

users to interact with their email messages through natural language commands and auditory

feedback. The system's user interface was meticulously crafted to meet the specific needs of visually

impaired users, offering intuitive navigation and accessible features that facilitate email composition,

reading, and management.

Moreover, this project underwent rigorous testing and evaluation, both in controlled

environments and with actual visually impaired users. The feedback received was overwhelmingly

positive, demonstrating the system's efficacy and usability. The performance metrics and evaluation

criteria revealed that the Voice-Based Email System not only met but often exceeded the expectations

of our target users in terms of accuracy, speed, and accessibility.

This project's significance extends beyond its immediate functionality. It contributes to the

broader field of assistive technology and human-computer interaction, highlighting the potential of

voice-controlled systems to bridge the digital divide for individuals with visual impairments. By

adhering to accessibility standards and guidelines, we ensured that this system is not just a technical

achievement but also a tool for social inclusion.

Looking ahead, there is substantial room for future enhancements and refinements. We

envision further improvements in natural language processing, user customization options, and

integration with additional email platforms. Additionally, collaboration with accessibility experts and

advocacy groups can help tailor the system to a wider range of visually impaired users and ensure it

meets evolving needs and standards.

In conclusion, the Voice-Based Email System for Visually Impaired individuals, implemented

using C#, not only fulfils its intended purpose of enabling accessible email communication but also

represents a significant step forward in making the digital world more inclusive and equitable. This

project underscores the transformative potential of technology in enhancing the lives of individuals

with disabilities and highlights the importance of continued innovation in the realm of assistive

technology.

5.2 Recommendations

i. Continuous User Feedback and Improvement: To ensure the system remains effective and

user-friendly, it is crucial to establish a feedback loop with visually impaired users. Regularly

solicit their input, listen to their suggestions, and incorporate their feedback into system

updates. This ongoing engagement will help keep the system aligned with the evolving needs

and preferences of the user community.

ii. Integration with Multiple Email Services: While the project focused on a specific email

platform, consider expanding compatibility to a wider range of email services. This would

increase the system's utility and make it accessible to a broader user base. Integration with

popular platforms like Gmail, Outlook, and others should be a priority.

iii. Enhanced Natural Language Processing: Invest in advanced natural language processing

(NLP) capabilities to improve the accuracy and responsiveness of the voice-controlled

interface. This includes refining the system's ability to understand and interpret user

commands, as well as enhancing the quality of the synthesized speech for better user

engagement.

iv. Customization and User Profiles: Develop features that allow users to customize their

experience according to individual preferences. This may include voice recognition profiles,

personalized command shortcuts, and the ability to configure the system's behaviour to match

the user's specific needs and preferences.

v. Cross-Platform Compatibility: Extend the compatibility of the system to different operating

systems and devices, such as smartphones, tablets, and smart speakers. This ensures that

visually impaired users can access their email conveniently on various devices, enhancing their

flexibility and independence.

vi. Security and Privacy Measures: Implement robust security and privacy measures to

safeguard user data and communications. Users, especially those with visual impairments, may

be more vulnerable to privacy breaches, so it is imperative to prioritize their data protection.

vii. Collaboration with Accessibility Experts: Engage with accessibility experts and

organizations that specialize in assisting visually impaired individuals. Collaborative efforts

can help ensure that the system remains compliant with evolving accessibility standards and

guidelines, further enhancing its usability and acceptance within the visually impaired

community.

REFERENCES
Be My Eyes. (n.d.). How It Works. https://www.bemyeyes.com/how-it-works
Boucher, L. H., Arnold, M., Kumar, A., & Tsai, C. S. (2020). Illuminating History: Transcribing and
Indexing Spoken Content. IEEE MultiMedia, 27(4), 14-23.
Garcia, M., LaLone, N., & Williams, C. B. (2020). Beyond Smart Speakers: Voice Assistants for
People with Disabilities. In Proceedings of the 2020 CHI Conference on Human Factors in
Computing Systems (pp. 1-13).
Ghahremani, P., Rao, K., Jha, A. K., Peddinti, V., Povey, D., & Khudanpur, S. (2017). A factorized
language model for unsupervised word discovery. Proceedings of the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, 5675-5679.
Johnson, L., Anderson, L., Mattingly, S., & Thompson, K. (2019). Email Accessibility for Individuals
Who Are Blind or Visually Impaired. International Journal of Information, Communication
Technology and Applications, 10(3), 48-54.
Leonard, K. E., & D'Arrigo, R. (2020). Screen Reader Awareness in the Undergraduate Population: A
Pilot Study. Journal of Visual Impairment & Blindness, 114(4), 363-370.
Leung, R., Li, S. K., & Chu, C. C. (2019). Braille Display Evaluation and the Possibility of Tactile
Internet for Information Access. Universal Access in the Information Society, 18(2), 337-348.
Lopes, C., Malheiro, R., & Santos, R. (2021). Captioning spoken content in educational videos with
an Automatic Speech Recognition system: A case study. Computers & Education, 164, 104154.
Nygren, E., Händel, P., & Allwood, C. M. (2020). Voice Assistants: Challenges and Suggestions for
Accessibility for People With Visual Disabilities. International Journal of Human–Computer
Interaction, 36(3), 213-223.
Paine, J., O'Donovan, R., & Williams, A. (2020). Real-time automatic speech recognition for deaf and
hard of hearing people. Proceedings of the 22nd International ACM SIGACCESS Conference
on Computers and Accessibility (ASSETS), 2020, 58-71.
Pielot, M., Holz, C., & Dingler, T. (2015). Ambient Light and Seated Work Performance at a Large
Display. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and
Ubiquitous Computing (UbiComp) (pp. 691-702).
Ramalho, G., Marques, F., & Madeira, R. (2019). Towards a universal design approach for inclusive
game-based learning systems. Universal Access in the Information Society, 18(3), 531-544.
Scherer, M. J., Hart, T., Minkel, J., & Feuerstein, M. (2018). Assistive Technologies and Other
Supports for People With Brain Impairment. In M. J. Scherer (Ed.), Theories, Models, and
Concepts in Human-Automation Interaction (pp. 299-322).
Shrestha, A., & Zaman, H. B. (2017). Read My Mail: An Audio Feedback Mobile Application for
Visually Impaired People. In Proceedings of the 2017 International Conference on Inventive
Communication and Computational Technologies (pp. 1849-1853).
Smith, D., Bilmes, J., & Goldstein, S. (2020). Assistive Technology Design and Development for Deaf
and Hard of Hearing Users: An HCI Perspective. ACM Transactions on Accessible Computing
(TACCESS), 11(3), 1-27.

Stolcke, A., Audhkhasi, K., Bastan, M., Burget, L., Chen, G., Evermann, G., ... & Watanabe, S.
(2018). Recent developments in the RASR open-source speech recognition toolkit.
Proceedings of the IEEE, 106(5), 797-814.
Thakur, A., & Shinde, G. R. (2019). Voice Command-Based Email System for Visually Impaired
People. International Journal of Scientific & Technology Research, 8(9), 1057-1060.
Thompson, M., Vardell, E., & Jovanovic, J. (2018). Advances in Accessibility Technology: What the
Internet Means for the Visually Impaired. Journal of Visual Impairment & Blindness, 112(4),
442-447.
VocalEyes. (n.d.). About VocalEyes. https://www.vocaleyes.ai/about
Voice Dream. (n.d.). Voice Dream Mail - FAQ. https://www.voicedream.com/mail-faq/
Wang, H., Tang, S., Zhang, W., & Tan, Y. H. (2018). An Analysis of Voice User Interface Usage:
Insights from Large-Scale Field Deployments. In Proceedings of the 2018 CHI Conference on
Human Factors in Computing Systems (pp. 1-12).
WHO. (2020). World Report on Vision. World Health Organization.

APPENDIX

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Text;
using System.Windows.Forms;
using Telerik.WinControls;

namespace Voice_based_email_system
{
public partial class frmRegister : Telerik.WinControls.UI.RadForm
{
DatabaseHelper dbHelper;
string databaseFileName = "voicemail.db";
string appDirectory = AppDomain.CurrentDomain.BaseDirectory;

public frmRegister()
{
InitializeComponent();
}

private void RadRegister_Click(object sender, EventArgs e)


{
try
{
dbHelper.OpenConnection();
if(radPassword.Text.Trim() == "" ||
radConfirmPassword.Text.Trim() =="" ||
radEmail.Text.Trim() =="" ||
radName.Text.Trim() =="")
{
return;
} else
{
if(radPassword.Text != radConfirmPassword.Text)
{
MessageBox.Show("Password does not Match.");
return;
} else
{

string query = "INSERT INTO


users(name,email,password)VALUES('" + radName.Text + "', '" +
radEmail.Text + "', '" + radPassword.Text + "')";
dbHelper.Insert(query);

MessageBox.Show("User registration was successful.");
clearBoxes();
}
}
}
catch (Exception ex)
{

MessageBox.Show(ex.Message);
} finally {
dbHelper.CloseConnection();
}

}
private void clearBoxes()
{
radName.Clear();
radPassword.Clear();
radEmail.Clear();
radConfirmPassword.Clear();
radPassword.Focus();
}

private void FrmRegister_Load(object sender, EventArgs e)


{
            // Resolve the database file relative to the application directory and use the full path
            string path = Path.Combine(appDirectory, databaseFileName);
            dbHelper = new DatabaseHelper(path);


}

private void FrmRegister_FormClosing(object sender, FormClosingEventArgs e)


{
if (e.CloseReason == CloseReason.UserClosing)
{
                DialogResult result = MessageBox.Show("Are you sure you want to exit the application?",
                    "Confirmation", MessageBoxButtons.YesNo, MessageBoxIcon.Question);

if (result == DialogResult.No)
{
// If the user clicked "No," cancel the form closing event
e.Cancel = true;
}
// If the user clicked "Yes," the form will close.
}
}
}
}

using System;

using System.Collections.Generic;
using System.Data;
using System.Data.SQLite;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Voice_based_email_system
{
public class DatabaseHelper
{
public SQLiteConnection connection;
private string connectionString;

public DatabaseHelper(string dbPath)


{
connectionString = $"Data Source={dbPath}; Version=3; New = True;
Compress = True;";
connection = new SQLiteConnection(connectionString);
}

public void OpenConnection()


{
if (connection.State != System.Data.ConnectionState.Open)
{
connection.Open();
}
}

public void CloseConnection()


{
if (connection.State != System.Data.ConnectionState.Closed)
{
connection.Close();
}
}

public SQLiteDataReader ExecuteQuery(string query)


{
OpenConnection();
SQLiteCommand command = new SQLiteCommand(query, connection);
SQLiteDataReader reader = command.ExecuteReader();
return reader;
}

//public void Insert(string tableName, Dictionary<string, object> data)


//{
// OpenConnection();
// string columns = string.Join(",", data.Keys);
// string values = string.Join(",", data.Values);

            // string query = $"INSERT INTO {tableName} ({columns}) VALUES ({values})";

// SQLiteCommand command = new SQLiteCommand(query, connection);


// command.ExecuteNonQuery();
//}

public void Insert(string sql)


{
OpenConnection();

SQLiteCommand command = new SQLiteCommand(sql, connection);


command.ExecuteNonQuery();
}

// Update an existing record in the database


public void Update(string tableName, Dictionary<string, object> data, string
whereClause)
{
OpenConnection();
List<string> updates = new List<string>();

foreach (var item in data)


{
updates.Add($"{item.Key} = {item.Value}");
}

string setClause = string.Join(", ", updates);


string query = $"UPDATE {tableName} SET {setClause} WHERE {whereClause}";

SQLiteCommand command = new SQLiteCommand(query, connection);


command.ExecuteNonQuery();
}

// Delete records from the database


public void Delete(string tableName, string whereClause)
{
OpenConnection();
string query = $"DELETE FROM {tableName} WHERE {whereClause}";

SQLiteCommand command = new SQLiteCommand(query, connection);


command.ExecuteNonQuery();
}

public int GetUserIdByEmail(string email)


{
            int userId = -1; // default value returned when the email is not found

try
{
OpenConnection();

string query = "SELECT id FROM users WHERE email = @Email";


SQLiteCommand command = new SQLiteCommand(query, connection);
command.Parameters.AddWithValue("@Email", email);

object result = command.ExecuteScalar();

if (result != null)
{
userId = Convert.ToInt32(result);
}
}
finally
{
CloseConnection();
}

return userId;
}

public bool attemptLogin(string email, string password)


{
try
{
OpenConnection();

string query = $"SELECT * FROM users WHERE email='"+ email + "' AND
password='" + password+ "'";
using (SQLiteCommand command = new SQLiteCommand (query, connection))
{
int count = Convert.ToInt32(command.ExecuteScalar());
if(count > 0)
{

return true;
} else
{
return false;
}
}

}
catch (Exception)
{

return false;
throw;

66
}finally
{
CloseConnection();
}
}

public List<MailData> GetMailData()


{
List<MailData> mailList = new List<MailData>();

try
{
OpenConnection();

string query = "SELECT mail.*, users.email AS sender_email " +


"FROM mail " +
"INNER JOIN users ON mail.sender_id = users.id";

SQLiteCommand command = new SQLiteCommand(query, connection);

SQLiteDataReader reader = command.ExecuteReader();

while (reader.Read())
{
// Assuming you have a class MailData to hold the mail data
MailData mail = new MailData
{
Sender = reader["sender_email"].ToString(),
//ReceiverId = Convert.ToInt32(reader["receiver_id"]),
Subject = reader["subject"].ToString(),
Body = reader["body"].ToString()
};

mailList.Add(mail);
Console.WriteLine(mailList);
}
}
finally
{
CloseConnection();
}

return mailList;
}

}
}
