Title: Data Mining: Concepts and Techniques (Slides for Textbook, Chapter 1)
Author: Slobodan Vucetic
CIS527: Data Warehousing, Filtering, and
“Necessity is the Mother of Invention”
• Data explosion problem
– Automated data collection tools and mature database technology
lead to tremendous amounts of data stored in databases, data
warehouses and other information repositories
• We are drowning in data, but starving for knowledge!
• Solution: Data warehousing and data mining
– Data warehousing and on-line analytical processing
– Extraction of interesting knowledge (rules, regularities, patterns,
constraints) from data in large databases
Why Mine Data? Commercial Viewpoint
• Lots of data is being collected
– Web data, e-commerce
– purchases at department/
– Bank/Credit Card
• Computers have become cheaper and more powerful
• Competitive Pressure is Strong
– Provide better, customized services for an edge (e.g. in
Customer Relationship Management)
Why Mine Data? Scientific Viewpoint
• Data collected and stored at
enormous speeds (GB/hour)
– remote sensors on a satellite
– telescopes scanning the skies
– microarrays generating gene
– scientific simulations
generating terabytes of data
• Traditional techniques infeasible for raw data
• Data mining may help scientists
– in classifying and segmenting data
– in Hypothesis Formation
What Is Data Mining?
• Data mining (knowledge discovery in databases):
– Extraction of interesting (non-trivial, implicit, previously
unknown and potentially useful) information or patterns
from data in large databases
• Alternative names and their “inside stories”:
– Data mining: a misnomer?
– Knowledge discovery (mining) in databases (KDD),
knowledge extraction, data/pattern analysis, data
archeology, business intelligence, etc.
Examples: What is (not) Data Mining?
What is not Data Mining?
– Look up phone number in phone …
– Query a Web search engine for …
What is Data Mining?
– Certain names are more prevalent in certain US locations
(O’Brien, O’Rourke, O’Reilly… in …)
– Group together similar documents returned by search
engine according to their context (e.g. Amazon rainforest, …)
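The contrast above can be sketched in a few lines of Python. This is only an illustration: the records, the names in them, and the helper functions `lookup_phone` and `most_common_surname_by_city` are made up for this example and do not come from the slides. The first function retrieves a fact that is already stored; the second derives a regularity that no single record states.

```python
from collections import Counter, defaultdict

# Toy records, invented for illustration: (first name, surname, city, phone).
records = [
    ("Ann", "O'Brien", "Boston", "555-0101"),
    ("Liam", "O'Reilly", "Boston", "555-0102"),
    ("Ana", "Garcia", "Miami", "555-0103"),
    ("Sean", "O'Brien", "Boston", "555-0104"),
    ("Luis", "Garcia", "Miami", "555-0105"),
    ("Mai", "Nguyen", "Miami", "555-0106"),
]


def lookup_phone(first, surname):
    """Not data mining: return a fact that is stored explicitly."""
    for f, s, _city, phone in records:
        if (f, s) == (first, surname):
            return phone
    return None


def most_common_surname_by_city(rows):
    """Closer to data mining: summarize a pattern across many records."""
    counts = defaultdict(Counter)
    for _first, surname, city, _phone in rows:
        counts[city][surname] += 1
    # Keep the single most frequent surname for each city.
    return {city: c.most_common(1)[0][0] for city, c in counts.items()}


print(lookup_phone("Ana", "Garcia"))
print(most_common_surname_by_city(records))
```

A real study of name prevalence would also need a test of whether the differences between locations are larger than chance; the sketch only counts.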
Data Mining: Classification Schemes
• Decisions in data mining
– Kinds of databases to be mined
– Kinds of knowledge to be discovered
– Kinds of techniques utilized
– Kinds of applications adapted
• Data mining tasks
– Descriptive data mining
– Predictive data mining
Decisions in Data Mining
• Databases to be mined
– Relational, transactional, object-oriented, object-relational,
active, spatial, time-series, text, multi-media, heterogeneous,
legacy, WWW, etc.
• Knowledge to be mined
– Characterization, discrimination, association, classification,
clustering, trend, deviation and outlier analysis, etc.
– Multiple/integrated functions and mining at multiple levels
• Techniques utilized
– Database-oriented, data warehouse (OLAP), machine learning,
statistics, visualization, neural network, etc.
• Applications adapted
– Retail, telecommunication, banking, fraud analysis, DNA mining, stock
market analysis, Web mining, Weblog analysis, etc.
Data Mining Tasks
• Prediction Tasks
– Use some variables to predict unknown or future values of other variables.
• Description Tasks
– Find human-interpretable patterns that describe the data.
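A minimal sketch of the two task types, assuming a tiny invented data set of (advertising spend, sales) pairs. The data and the function names `fit_line`, `predict`, and `describe` are hypothetical. Least-squares line fitting stands in for a prediction task and a plain summary stands in for a description task; the slides do not prescribe either technique.

```python
from statistics import mean

# Hypothetical data: (advertising spend, sales), arbitrary units.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]


def fit_line(points):
    """Ordinary least squares for y = a*x + b with one input variable."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


def predict(model, x):
    """Prediction task: estimate an unseen value of y from x."""
    a, b = model
    return a * x + b


def describe(points):
    """Description task: a human-readable summary of the observed data."""
    ys = [y for _, y in points]
    return {"n": len(ys), "mean_y": mean(ys), "min_y": min(ys), "max_y": max(ys)}


model = fit_line(data)
print(predict(model, 5.0))  # estimate for a spend value not in the data
print(describe(data))
```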
Common data mining tasks
Association Rule Discovery [Descriptive]
Sequential Pattern Discovery [Descriptive]
Deviation Detection [Predictive]
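As an illustration of association rule discovery, the sketch below scores one candidate rule with support and confidence. These are the standard measures for such rules, although the slide itself does not name them. The transactions and the functions `support` and `confidence` are invented for this example.

```python
# Toy market-basket transactions, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing the antecedent, the fraction that also
    contain the consequent (0.0 if the antecedent never occurs)."""
    ante = support(antecedent, baskets)
    both = support(set(antecedent) | set(consequent), baskets)
    return both / ante if ante else 0.0


# Candidate rule: {diapers} -> {beer}
print(support({"diapers", "beer"}, transactions))
print(confidence({"diapers"}, {"beer"}, transactions))
```

Here the rule holds in 3 of the 5 baskets, and in 3 of the 4 baskets that contain diapers. A full miner such as Apriori would enumerate candidate itemsets and keep those above chosen thresholds; this sketch only scores a rule that is given.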