Please send us by e-mail (1) any errors you have found in the book that are
not yet listed in the errata, and (2) your suggestions and comments on the
revision of the book. Thanks!
Chapter 1. Introduction
Chapter 2. Data Warehouse and OLAP Technology for Data Mining
P. 67, second paragraph, "3. The top tier is a client, which contains ..."
should be "3. The top tier is a front-end client layer, which contains ..."
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Aug. 28, 2001)
P. 72, paragraph 3, line 1, "sum" should not be in sans serif font but
in DMQL keyword font.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Aug. 30, 2001)
P. 73, line 6 of paragraph 4, "day < week < month < quarter < year" should
be "day < month < quarter < year" (pointed out by Ming Fan, on Feb. 17, 2001)
P. 81, Example 2.14, line 2, "sum" should not be in sans serif font but
in DMQL keyword font.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Aug. 28, 2001)
P. 82, Example 2.15, line 1, "[time, item, location]" should be "[item,
location, time]" (to be consistent with the concrete examples following it).
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Aug. 28, 2001)
P. 82, Example 2.15, line 2, "sum" should not be in sans serif font but
in DMQL keyword font.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Aug. 28, 2001)
P. 98, paragraph 2 from the bottom, line 3, "the top tier is a client"
should be changed to "the top tier is a front-end client layer"
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
P. 146, in the footnote, "relevant dimensions" should be printed in bold,
not just "dimensions".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
P. 158, Figure 4.4, the pie chart is disproportional, because the sum of
class A's count and class B's count is 2440, which is larger than class
C's count, 2160. Thus the pie of C should be smaller (less than half).
(pointed out by Ming Fan on Feb. 16, 2001)
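The arithmetic behind this correction can be checked with a short sketch. It
uses only the two counts quoted above; the variable names are illustrative and
the separate counts for classes A and B are not given in this erratum, so only
their sum is used. Under these numbers class C accounts for about 47% of the
total.

```python
# Counts quoted in the erratum: classes A and B together, and class C alone.
count_a_plus_b = 2440
count_c = 2160

total = count_a_plus_b + count_c
share_c = count_c / total   # fraction of the pie that belongs to class C
angle_c = 360 * share_c     # corresponding slice angle in degrees

print(f"class C share: {share_c:.1%}, slice angle: {angle_c:.1f} degrees")
```

Since the share is below 50%, the slice for C should span less than 180 degrees.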
Chapter 5. Concept Description: Characterization and Comparison
P. 186, in Example 5.3, item 4, line 3, "city < province_or_state < country"
should be changed to "birth_city < birth_province_or_state < birth_country".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
P. 187, in Table 5.2, the third column in the table should be called
"birth_region" instead of "birth_country".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
P. 203, in Tables 5.7 and 5.8, "Que" and "Alt" should be changed to
"QC" and "AB" respectively (for consistency of abbreviations).
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
Chapter 6. Mining Association Rules in Large Databases
P. 231, 4th paragraph, use bold font for the words "Apriori property" and
italicize the remaining sentence: "All nonempty subsets of a frequent itemset must also be frequent."
This sentence should be a paragraph on its own.
That is, make the following as the beginning of a new paragraph (paragraph 5):
"The Apriori property is based on the following observation. ..."
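For readers who want to see the quoted sentence in action, the following is a
minimal sketch that checks the property on a tiny data set. The transactions,
the threshold min_sup, and the helper names support and nonempty_subsets are
all hypothetical and chosen only for illustration; they are not taken from the
book.

```python
from itertools import combinations

# Hypothetical toy transactions, for illustration only (not from the book).
transactions = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
]
min_sup = 2  # hypothetical minimum support count


def support(itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def nonempty_subsets(itemset):
    """Yield every nonempty subset of `itemset`, including the itemset itself."""
    items = sorted(itemset)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# {I1, I2} is frequent here, so each of its nonempty subsets must be too.
frequent = frozenset({"I1", "I2"})
assert support(frequent) >= min_sup
subsets_all_frequent = all(
    support(s) >= min_sup for s in nonempty_subsets(frequent)
)
print(subsets_all_frequent)
```

The check succeeds because any transaction containing the whole itemset also
contains each of its subsets, so a subset's support can never be smaller.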
P. 241, paragraph 2, Line 3, "I2 I1: 2" should be "I2 I4: 2"
(pointed out by Ming Fan on Feb. 16, 2001)
P. 241, paragraph 3, line 3, "I1 I3:2" should be "I1 I3:4".
(pointed out by Ming Fan on Feb. 16, 2001)
P. 242 under Algorithm Part 1. (a), "frequent items F" should be "frequent
items F (where an item is frequent if its support is no less than min_sup)"
(pointed out by Ming Fan on Feb. 16, 2001)
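The added definition ("no less than min_sup") can be illustrated with a small
sketch. The transactions and the threshold below are hypothetical, and the code
shows only the selection of frequent items, not the rest of the algorithm on
that page.

```python
from collections import Counter

# Hypothetical toy transactions and threshold, for illustration only.
transactions = [
    ["I1", "I2", "I5"],
    ["I2", "I4"],
    ["I2", "I3"],
    ["I1", "I2", "I4"],
    ["I1", "I3"],
]
min_sup = 2

# One scan of the database: count each item once per transaction containing it.
item_counts = Counter(item for t in transactions for item in set(t))

# F keeps exactly the items whose support is no less than min_sup.
F = {item: count for item, count in item_counts.items() if count >= min_sup}
print(sorted(F.items()))
```

Note that an item whose support equals min_sup exactly is kept, which is the
point of the clarified wording.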
On P. 266, line 2, the line should be broken in front of "^ S". Similarly,
line 4, the line should be broken in front of "^ T". That is, it should
look like,
lives(C, _, "Vancouver")
^ sales(C, ?I1, S1) ^ ... ^ sales(C, ?Ik, Sk) ^ I = {I1, ..., Ik}
^ S = {S1, ..., Sk}
=> sales(C, ?J1, T1) ^ ... ^ sales(C, ?Jm, Tm) ^ J = {J1, ..., Jm}
^ T={T1, ..., Tm}
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Oct. 16, 2001)
P. 268, parag. 2, from line 8 on, "Specifically, such a set must contain at
least one item whose price is no less than $500. It is of the form S1 U S2,
where S1 != 0 is a subset of the set of all those items with prices no less
than $500, and S2 possibly empty, is a subset of the set of all those items
with prices no greater than $500." should be changed to "Specifically, the
price of every item in such a set must be no less than $500." (pointed out
by Anthony K. H. Tung "atung@comp.nus.edu.sg", on Sept. 26, 2002)
P. 274, Exercise 6.7 (b) "multilevel association rules"
should be changed to "multilevel (but not cross-level) association rules"
P. 274, Exercise 6.7 (c) "multilevel association rules"
should be changed to "multilevel (but not cross-level) association rules"
P. 304, Fig 7.8. Change the weight w subscripted by kj (i.e., $w_{kj}$)
to jk (i.e., $w_{jk}$). (suggested by Ming Fan on Feb. 16, 2001)
P. 304, Fig 7.8. Labels Oj and Ok should be relocated much closer to the
last circle in the second and third columns of circles---now it looks like
Oj and Ok point to the whole columns, not just the last circles.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 331, line 3, change "Zytko" to "Zytkow" (by author on Aug. 29, 2001)
P. 352. Figure 8.3. To be consistent with the text, the labels used in
the figure (O_i, O_j, O_random) should be o_i, o_j, and o_random (with
lowercase "o"). And the "p" in the second box should be written in boldface
like the other p's in the other boxes.
(suggested by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 368. parag. 3, starting with "From the density function ...", and
ending with "...for a 2-D data set."
This whole paragraph should be removed (discovered by Steven Y. Lee
(sleep@sfu.ca), CMPT459 student, on Dec. 19, 2000)
P. 373, line 6,
"around each data point The ..." should be
"around each data point. The ..."
(pointed out by Jian Pei "peijian@cs.sfu.ca", on July 22, 2002)
P. 374, the 2nd sentence in the 2nd paragraph, "it conforms ..." should
be "It conforms ...". Also, in the same sentence,
"... a good clustering algorithm: It handles ...".
should be changed to: "... a good clustering algorithm: it handles ...".
(suggested by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 375, Figure 8.17. Dense areas shown in the third graph do not match well
with the ones shown in the first two graphs. Also, it is better to show the
third plane and project the 3-plane intersection in the 3-D graph.
(suggested by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 407, 2nd line of Example 9.6.
"It consists of four dimensions: region temperature, ...", should be
"It consists of four dimensions: region, temperature, ...".
(suggested by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 407 Figure 9.2,
in region dimension table, "region_name" should be changed to "region",
in BC_weather fact table, "region_name" should be changed to "probe_location".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 408 Figure 9.3, line 1, "region_name" should be changed to "region".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Nov. 20, 2001)
P. 413, line 6, "image-" should be changed to "image".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 425, the 3rd paragraph from the bottom, boldface for "time interval",
not just for "interval".
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 466. This is too obvious, but text or another figure is needed for
the blank space.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 488, line 1, "(Minimum_size = 3)" should be written in normal font.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 488, 6th line from the bottom, "INTO CLAUSE" is written entirely in
bold font; "CLAUSE" should be written in lowercase, normal font.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 490 line 9,
"OLE DB for DM provides a number of functions that can used ..."
should be "OLE DB for DM provides a number of functions that can be used ..."
(pointed out by Kalman Balogh (KBalogh@matavnet.hu), on March 25, 2001).
PP. 490-491. For the various functions shown on these pages (e.g., Cluster(),
ClusterProbability(), and PredictHistogram()), the main text uses bold
font but the illustrative example uses a different font. The font used
in the text should be changed to that used in the example to make them
consistent.
(pointed out by Jonghyun Lee "jlee17@cs.uiuc.edu", on Dec. 11, 2001)
P. 520, the last line, "http://www.microsoft.com/data/oledb/dm.html"
should be "http://www.microsoft.com/data/oledb/dm.htm".
(pointed out by Illhoi Yoo "potence@drexel.edu", on Aug. 11, 2002)
Appendix B. An Introduction to DBMiner
Index
P. 534, column 1, line 25 (i.e., before "maxpattern"), add an index entry:
"lift 261" (suggested by Steven Y. Lee (sleep@sfu.ca), CMPT459 student,
on Dec. 19, 2000)
P. 542, column 2, line 7 (before "Google"), add an index entry:
"Gini index, 292" (suggested by Steven Y. Lee (sleep@sfu.ca),
CMPT459 student, on Dec. 19, 2000)
P. 535. Columns 1 and 3, C4.5 add: "C4.5 291", (suggested by Jian Pei
(jianpei@cse.buffalo.edu) on Oct. 18, 2002)