Data Mining Assignment

Answer each of the questions below, then submit your work. Use proper APA citations for any content that is not your own.

 

Question 1

Suppose that you are employed as a data mining consultant for an Internet search engine company. Describe how data mining can help the company, giving specific examples of how techniques such as clustering, classification, association rule mining, and anomaly detection can be applied.

Question 2

Identify at least two advantages and two disadvantages of using color to visually represent information.

Question 3

Consider the XOR problem where there are four training points: (1, 1, −), (1, 0, +), (0, 1, +), (0, 0, −). Transform the data into the following feature space:

$$\Phi(x_1, x_2) = \left(1,\ \sqrt{2}\,x_1,\ \sqrt{2}\,x_2,\ \sqrt{2}\,x_1 x_2,\ x_1^2,\ x_2^2\right)$$

Find the maximum margin linear decision boundary in the transformed space.
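
A small sketch may help when working through the transformation. It assumes the points are encoded exactly as listed (0/1 coordinates, with + and − mapped to +1 and −1), and the names `phi`, `dot`, and `points` are illustrative only. It maps each training point into the feature space and checks numerically that the inner product of two mapped points equals (1 + x·z)², the degree-2 polynomial kernel. It does not solve for the decision boundary.

```python
import math
from itertools import product


def phi(x1, x2):
    """Map a 2-D point into the 6-D feature space given in the question."""
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, r2 * x1 * x2, x1 ** 2, x2 ** 2)


def dot(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


# Training points as (x1, x2) with labels; +1 / -1 encoding is an assumption.
points = [((1, 1), -1), ((1, 0), +1), ((0, 1), +1), ((0, 0), -1)]

for (x, label) in points:
    print(x, label, phi(*x))

# Check the kernel identity phi(x) . phi(z) == (1 + x . z)^2 on every pair.
kernel_ok = all(
    math.isclose(dot(phi(*x), phi(*z)), (1 + dot(x, z)) ** 2)
    for (x, _), (z, _) in product(points, repeat=2)
)
print("kernel identity holds:", kernel_ok)
```

Because the identity holds, the dual form of the maximum margin problem can be written with kernel values alone, without forming the six-dimensional vectors explicitly.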

Question 4

Consider the following set of candidate 3-itemsets: {1, 2, 3}, {1, 2, 6}, {1, 3, 4}, {2, 3, 4}, {2, 4, 5}, {3, 4, 6}, {4, 5, 6}

Construct a hash tree for the above candidate 3-itemsets. Assume the tree uses a hash function where all odd-numbered items are hashed to the left child of a node, while the even-numbered items are hashed to the right child. A candidate k-itemset is inserted into the tree by hashing on each successive item in the candidate and then following the appropriate branch of the tree according to the hash value. Once a leaf node is reached, the candidate is inserted based on one of the following conditions:

Condition 1: If the depth of the leaf node is equal to k (the root is assumed to be at depth 0), then the candidate is inserted regardless of the number of itemsets already stored at the node.

Condition 2: If the depth of the leaf node is less than k, then the candidate can be inserted as long as the number of itemsets stored at the node is less than max size. Assume max size = 2 for this question.

Condition 3: If the depth of the leaf node is less than k and the number of itemsets stored at the node is equal to max size, then the leaf node is converted into an internal node. New leaf nodes are created as children of the old leaf node. Candidate itemsets previously stored in the old leaf node are distributed to the children based on their hash values. The new candidate is also hashed to its appropriate leaf node.
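
The three insertion conditions can be sketched in code as follows. This is one possible reading, not a reference implementation: it assumes the tree starts as a single empty leaf at the root, that both children are created when a leaf is split, and that candidates are inserted in the order listed. The names `Node`, `branch`, `insert`, `count_nodes`, and `leaves` are hypothetical.

```python
class Node:
    def __init__(self, depth):
        self.depth = depth
        self.itemsets = []    # candidates stored here while the node is a leaf
        self.children = None  # {"L": Node, "R": Node} once the node is internal

    def is_leaf(self):
        return self.children is None


def branch(item):
    """Hash function from the question: odd items go left, even items go right."""
    return "L" if item % 2 == 1 else "R"


def insert(node, itemset, k, max_size):
    if not node.is_leaf():
        # Internal node at depth d: hash on the d-th item (0-based) and descend.
        insert(node.children[branch(itemset[node.depth])], itemset, k, max_size)
        return
    if node.depth == k or len(node.itemsets) < max_size:
        node.itemsets.append(itemset)  # Conditions 1 and 2
        return
    # Condition 3: split the full leaf and redistribute old and new candidates.
    pending = node.itemsets + [itemset]
    node.itemsets = []
    node.children = {"L": Node(node.depth + 1), "R": Node(node.depth + 1)}
    for s in pending:
        insert(node, s, k, max_size)


def count_nodes(node):
    """Return (number of leaves, number of internal nodes) under this node."""
    if node.is_leaf():
        return (1, 0)
    left = count_nodes(node.children["L"])
    right = count_nodes(node.children["R"])
    return (left[0] + right[0], 1 + left[1] + right[1])


def leaves(node, path=""):
    """Yield (hash path, stored candidates) for every leaf."""
    if node.is_leaf():
        yield path, node.itemsets
    else:
        for side in ("L", "R"):
            yield from leaves(node.children[side], path + side)


candidates = [(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)]
root = Node(depth=0)
for c in candidates:
    insert(root, c, k=3, max_size=2)

print("leaves, internal nodes:", count_nodes(root))
for path, stored in leaves(root):
    print(path, stored)
```

Each leaf is printed with its hash path from the root (for example `LR` means left, then right), which makes it easy to compare the result against a tree drawn by hand.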

(a) How many leaf nodes are there in the candidate hash tree? How many internal nodes are there?

(b) Consider a transaction that contains the following items: {1, 2, 3, 5, 6}. Using the hash tree constructed in part (a), which leaf nodes will be checked against the transaction? What are the candidate 3-itemsets contained in the transaction?
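
A brute-force cross-check for the transaction question can be written without building the tree. The sketch below is not the hash tree traversal itself: it simply tests each candidate for containment in the transaction, and it lists the hash paths (odd to the left, even to the right) followed by every 3-item subset of the transaction. Under the usual traversal, a leaf is checked only if its path from the root is a prefix of one of these paths. The variable names are illustrative.

```python
from itertools import combinations

candidates = [(1, 2, 3), (1, 2, 6), (1, 3, 4), (2, 3, 4),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)]
transaction = (1, 2, 3, 5, 6)


def branch(item):
    """Hash function from the question: odd items go left, even items go right."""
    return "L" if item % 2 == 1 else "R"


# Candidates whose items all appear in the transaction.
contained = [c for c in candidates if set(c) <= set(transaction)]
print("contained candidates:", contained)

# Distinct hash paths taken by the 3-item subsets of the transaction.
paths = sorted({"".join(branch(i) for i in subset)
                for subset in combinations(transaction, 3)})
print("hash paths reachable from the transaction:", paths)
```

Comparing the printed paths with the leaf positions in a hand-drawn tree gives a quick way to confirm which leaves need to be examined.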

Question 5

Consider a group of documents that have been selected from a much larger set of diverse documents so that the selected documents are as dissimilar from one another as possible. If we consider documents that are not highly related (connected, similar) to one another as being anomalous, then all of the documents that we have selected might be classified as anomalies. Is it possible for a data set to consist only of anomalous objects or is this an abuse of the terminology?

 
