Big data analytics involves examining large datasets to uncover insights that aid in informed business decisions, such as identifying market trends and customer preferences. It is crucial for data-driven decision-making, leading to improved marketing, operational efficiency, and new revenue opportunities. The process includes steps like data collection, cleansing, analysis, and the application of various analytical techniques such as machine learning and predictive analytics.
Big Data Analytics Introduction-lect 1
Big data analytics
• A complex process of examining big data to uncover information, such as hidden patterns, correlations, market trends and customer preferences, that can help organizations make informed business decisions.
• Data analytics technologies and techniques give organizations a way to analyze data sets and gather new information.

Why is big data analytics important?
• To make data-driven decisions that can improve business-related outcomes.
• More effective marketing, new revenue opportunities, customer personalization and improved operational efficiency.

How does big data analytics work?
• Collect the data.
• Process: organize, configure and partition the data properly for analytical queries.
• Clean: fix any errors or inconsistencies, such as duplications or formatting mistakes, and organize and tidy the data.
• Analyze.

Analyze
• Data mining, which sifts through data sets in search of patterns and relationships
• Predictive analytics, which builds models to forecast customer behavior and other future actions, scenarios and trends
• Machine learning, which taps various algorithms to analyze large data sets
• Deep learning, which is a more advanced offshoot of machine learning
• Text mining and statistical analysis software
• Artificial intelligence (AI)
• Mainstream business intelligence software
• Data visualization tools

Steps of Data Analytics
• Step 1, goal setting: vital, understandable, simple, short and measurable goals
• Step 2, setting priorities for measurements: decide what to measure and what methods to use to measure it
• Step 3, data gathering: available datasets, recording/generating data
• Step 4, data cleansing: outlier rejection, missing-value interpolation, data structuring
• Step 5, data analysis: data mining, business intelligence, data visualization, exploratory data analysis
• Step 6, precise results' interpretation: checking whether the results are helpful in meeting the initial objectives, limiting, or inconclusive

1. Goal Setting
• The business unit has to decide on objectives for the data analytics.
• These objectives might be set out in question format.
• For example, if a business is struggling to sell its products, some relevant questions may be:
  – Are we overpricing our goods?
  – How is the competition’s product different from ours?
• To answer the question “Are we overpricing our goods?”, the business has to gather data on:
  – Production costs
  – The price of similar goods on the market

2. Setting Priorities for Measurements
• Determine what type of data is needed to answer the questions regarding the objectives.
• Decide how much time to take for the analysis of the project.
• Decide which units of measurement to use.

3. Data Gathering
• Data can come from already available datasets.
• Data can be generated by:
  – The direct or interview method: a company would interview “shoppers” regarding their favorite brand of toothpaste.
  – The indirect or questionnaire method: questionnaires are distributed to the respondents either by personal delivery or by mail/email.
  – The registration method: registration records kept by government organizations, e.g., NADRA.
  – The experimental method: experimentation, simulation.

4. Data Cleansing
• The data cleansing process identifies parts of the data that are:
  – Incomplete
  – Incorrect
  – Inaccurate
  – Irrelevant
• The dirty or coarse data is then:
  – Replaced
  – Modified
  – Or deleted
[Figure: Data Cleansing Cycle]

5. Data Analysis
• Data analysis is the process of evaluating data using analytical and logical reasoning to examine each component of the data provided.
[Figure: Steps of Data Analysis]
[Figure: Data Analytics Capabilities]

Feature Engineering (FE)
• “Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved accuracy on unseen data.” Jason Brownlee, Machine Learning Mastery.
• As models get better and better, the focus shifts to what is put into them.
• Feature engineering means transforming data to create the model’s inputs.
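The cleansing actions from step 4 (replace, modify or delete dirty data) and the idea of transforming raw data into model inputs can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the order records, their field names, the median fill and the two derived features (`unit_price`, `is_weekend`) are all hypothetical and do not come from the lecture.

```python
from datetime import date
from statistics import median

# Hypothetical raw records; every field name and value is made up for illustration.
raw_rows = [
    {"order_id": 1, "order_date": "2024-03-01", "units": 2, "total": 10.0},
    {"order_id": 1, "order_date": "2024-03-01", "units": 2, "total": 10.0},  # duplicate
    {"order_id": 2, "order_date": "2024-03-02", "units": 4, "total": None},  # incomplete
    {"order_id": 3, "order_date": "2024-03-09", "units": 1, "total": 6.0},
    {"order_id": 4, "order_date": "2024-03-10", "units": 0, "total": 5.0},   # incorrect
]


def cleanse(rows):
    """Delete duplicate and invalid rows, then replace missing totals with the median."""
    seen, kept = set(), []
    for row in rows:
        if row["order_id"] in seen or row["units"] <= 0:
            continue  # deleted: duplicate order or impossible unit count
        seen.add(row["order_id"])
        kept.append(dict(row))  # copy so the raw data stays untouched
    fill = median(r["total"] for r in kept if r["total"] is not None)
    for row in kept:
        if row["total"] is None:
            row["total"] = fill  # replaced: missing value filled in
    return kept


def engineer_features(row):
    """Turn one cleaned record into inputs a model could use."""
    weekday = date.fromisoformat(row["order_date"]).weekday()  # Monday is 0
    return {
        "unit_price": row["total"] / row["units"],
        "is_weekend": int(weekday >= 5),
    }


clean_rows = cleanse(raw_rows)
features = [engineer_features(r) for r in clean_rows]
for row, feats in zip(clean_rows, features):
    print(row["order_id"], feats)
```

Filling a missing value with the median is one simple choice among many; the interpolation mentioned in the step overview would be another.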
Feature Extraction
• Dimension reduction
  – Principal component analysis (PCA)
  – Non-negative matrix factorization (NMF)
  – Kernel PCA
  – Graph-based kernel PCA
  – Generalized discriminant analysis (GDA)
• Data smoothing
  – Wavelet transform
  – Ramer–Douglas–Peucker algorithm
  – Kernel smoother
  – Laplacian smoothing
  – Local regression, …

Feature Selection
• Identifying features that are redundant or irrelevant
• Improved model interpretability

Models for Analysis
• Approaches
  – Classification
  – Regression
• Techniques
  – Data mining
  – Machine learning
  – Artificial intelligence (AI)

Introduction to Computational Data Analytics
• Computational data analytics is an interdisciplinary field that provides depth and specialization in data science, ML, deep learning, natural language, AI, visualization, databases, high-performance computing, etc.
• Some examples of computational thinking include developing a chess strategy, making and reading maps, and organizing a long to-do list into manageable daily tasks.

Computational Data Analytics
• Steps of Computational Thinking:
  – Abstraction: problem formulation
  – Automation: solution expression
  – Analysis: solution execution and evaluation
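As a rough illustration, the three steps of computational thinking can be walked through on the feature selection task listed earlier in these notes. Abstraction: "redundant" is formulated as "almost perfectly correlated with a feature already kept". Automation: that rule is expressed as a function. Analysis: the function is run on data and the result is checked. The column names, the sample values and the 0.95 threshold are assumptions made for this sketch, not part of the lecture.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def select_features(columns, threshold=0.95):
    """Automation: keep a feature only if it is not redundant with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns; price_cents merely restates price_usd.
columns = {
    "price_usd": [10.0, 12.0, 9.0, 15.0, 11.0],
    "price_cents": [1000.0, 1200.0, 900.0, 1500.0, 1100.0],
    "units_sold": [30.0, 22.0, 35.0, 30.0, 20.0],
}

# Analysis: execute the solution and inspect which features survive.
selected = select_features(columns)
print(selected)
```

The sketch only detects redundancy between pairs of features. Judging whether a feature is irrelevant would need a target variable, which is left out here.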
• Principles of Computational Thinking:
  – This broad problem-solving technique includes four elements: decomposition, pattern recognition, abstraction and algorithms.
• There are a variety of ways that students can practice their computational thinking, well before they try computer programming.

Computational Data Analytics
• Computational skills are defined as the abilities to calculate basic addition, subtraction, multiplication, and division problems quickly and accurately using mental methods, paper-and-pencil, and other tools, such as a calculator.
• The biggest benefit of computational thinking is how it enables real-world problem solving. For kids, knowing how to take large problems and break them into simpler steps can help with everything from solving math problems to writing a book report.

Computational Data Analytics
• Types of computation: models of computation can be classified into three categories: sequential models, functional models, and concurrent models.
• Purpose of computational models: they intelligently gather, filter, analyze and present information, e.g. presenting health information to guide doctors in disease treatment based on the detailed characteristics of each patient.

Classification
• Definition: a classification is a division or category in a system which divides things into groups or types.
• Model: predicts categorical class labels (discrete or nominal).
• Methods:
  – Linear classifier
  – LDA
  – SVM
  – Decision trees
  – Bayesian classifier
  – Artificial neural network
  – Kernel estimation
  – k-nearest neighbor
• Applications:
  – Email spam filtering
  – Cancer diagnosis
  – Voice classification (for Siri-type applications)
  – Video classification (for uploaded videos on YouTube, etc.)

Prediction
• Definition: a prediction is a statement made about the future, forecasting unknown or future figures.
• Model: models continuous-valued functions, i.e., predicts unknown or missing values.
• Methods:
  – Linear regression
  – Non-linear regression
  – Poisson regression
  – Generalized linear model
  – Log-linear models
  – Regression trees
• Applications:
  – Credit approval
  – Target marketing
  – Fault avoidance
  – Medical diagnosis
  – Fraud detection
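The contrast between the two model types can be made concrete with one method from each list: k-nearest neighbor for classification (the output is a categorical label) and linear regression for prediction (the output is a continuous value). The sketch below uses only the Python standard library. The toy email features, labels and the points on the line are invented for illustration, and the features are left unscaled for brevity.

```python
from collections import Counter
from math import dist


def knn_classify(train, query, k=3):
    """Classification: majority vote among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_line(xs, ys):
    """Prediction: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical emails described by (share of capital letters, number of links).
emails = [
    ((0.9, 8.0), "spam"), ((0.8, 6.0), "spam"), ((0.7, 7.0), "spam"),
    ((0.1, 1.0), "ham"), ((0.2, 0.0), "ham"), ((0.1, 2.0), "ham"),
]
label = knn_classify(emails, (0.75, 5.0))  # a discrete class label

# Hypothetical (x, y) observations used to forecast y at x = 5.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
forecast = slope * 5.0 + intercept  # a continuous value

print(label, forecast)
```

In practice the features fed to a distance-based classifier would normally be scaled first, since here the link count dominates the distance.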