R or Weka lab

  

Laboratory I:

         

To download additional .arff data sets go to:

http://www.hakank.org/weka/

or search the Internet for the required .arff files.

· What’s the difference between a “training set” and a “test set”?

· Why might a pruned decision tree that doesn’t fit the data so well be better than an un-pruned one?

· What’s the first thing that 1R does when making a rule based on a numeric attribute?

· How does 1R avoid overfitting when making a rule based on an enumerated and/or numeric attribute?

· What is the difference between an attribute, an instance and a training set?

· What is the difference between ID3 and C4.5?
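
As background for the 1R questions above, here is a minimal Python sketch of 1R restricted to nominal attributes. It is not Weka's OneR implementation: it ignores numeric attributes (so it does not show the discretization step the questions ask about) and missing values. The 14 rows are the weather data with nominal attributes, transcribed by hand, so treat them as an assumption and check them against the .arff file.

```python
from collections import Counter, defaultdict

# Weather data with nominal attributes, transcribed by hand (assumption:
# verify against the weather.nominal .arff file). Last column is the class.
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
DATA = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def one_r(rows, attribute_names):
    """Return (attribute, rule, training_errors) for a simplified 1R."""
    best = None
    for col, name in enumerate(attribute_names):
        # Count the class labels seen with each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[col]][row[-1]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the instances that do not belong to that majority class.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1]
                     for c in counts.values())
        # Strict "<" keeps the earlier attribute when error counts tie.
        if best is None or errors < best[2]:
            best = (name, rule, errors)
    return best


best_attribute, rule, errors = one_r(DATA, ATTRIBUTES)
print(best_attribute, rule, f"{errors}/{len(DATA)} training errors")
```

The error count here is measured on the training data only, which is worth keeping in mind when comparing it with scores on unseen instances.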
  1. Use the following learning schemes to analyze the iris data (in iris.arff):

  

OneR

– weka.classifiers.OneR

 

Decision table

– weka.classifiers.DecisionTable -R

 

C4.5

– weka.classifiers.j48.J48

· Do the decisions made by the classifiers make sense to you? Why?

· What can you say about the accuracy of these classifiers when classifying iris instances that have not been used for training?

· How did each one of the methods perform?

  2. Use the following learning schemes to analyze the bolts data (bolts.arff without the TIME attribute):

  

Decision Tree

– weka.classifiers.j48.J48

 

Decision table

– weka.classifiers.DecisionTable -R

 

Linear regression

– weka.classifiers.LinearRegression

 

M5′ 

– weka.classifiers.m5.M5Prime

· The dataset describes the time needed by a machine to produce and count 20 bolts. (More details can be found in the file containing the dataset.) 

· Analyze the data. What adjustments have the greatest effect on the time to count 20 bolts? 

· According to each classifier, how would you adjust the machine to get the shortest time to count 20 bolts?

  3. Produce a model for both the Weather and Weather.nominal data sets. Which method(s) did you use? What did the tree(s) look like?

Laboratory II:

 

To download additional .arff data sets go to:

the Weka data folder for BreastTumor.arff

http://www.hakank.org/weka/ for zoo.arff, wine.arff, bodyfat.arff, sleep.arff, pollution.arff

  4. Use the following learning schemes to analyze the zoo data (in zoo.arff):

  

OneR

– weka.classifiers.OneR

 

Decision table

– weka.classifiers.DecisionTable -R

 

C4.5

– weka.classifiers.j48.J48

 

K-means

– weka.clusterers.SimpleKMeans

Try using reduced error pruning for C4.5. Did it change the produced model? Why?

For K-means, set k=10 for the first run and adjust as needed. What was the final value of k? Why?
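
To see how the choice of k and the random seed interact, the following is a minimal pure-Python sketch of the basic k-means loop (Lloyd's algorithm). It is not Weka's SimpleKMeans: the function names, the way initial centroids are sampled, and the toy 2-D points are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iterations=100):
    """Basic k-means; returns (centroids, assignments, within-cluster SSE)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k instances drawn at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    sse = sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))
    return centroids, assignments, sse


# Made-up toy points forming three loose groups (not taken from any .arff file).
POINTS = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.8),
]

for k in (1, 2, 3, 4):
    for seed in (1, 2, 3):
        _, _, sse = kmeans(POINTS, k, seed)
        print(f"k={k} seed={seed} SSE={sse:.3f}")
```

Because the starting centroids depend on the seed, the same k can end in different local optima, so it is worth comparing the within-cluster sum of squared errors across several seeds before settling on a value of k.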

  5. Use the following learning schemes to analyze the breast tumor data.

  

Linear regression

– weka.classifiers.LinearRegression

 

M5′ 

– weka.classifiers.m5.M5Prime

 

Regression Tree

– weka.classifiers.m5.M5Prime

 

K-means clustering

– weka.clusterers.SimpleKMeans

A) How many leaves did the model tree produce? The regression tree? What happens if you change the pruning factor?

How many clusters did you choose for the K-means method? Was that a good choice? Did you try a different value for k?

B) Now perform the same analysis on the bodyfat.arff data set.

  6. Use a k-means clustering technique to analyze the iris data set. What did you set the k value to be? Try several different values. What was the random seed value? Experiment with different random seed values. How did changing these values influence the produced models?
  7. Produce a hierarchical clustering (COBWEB) model for the iris data. How many clusters did it produce? Why? Does it make sense? What did you expect?

Change the acuity and cutoff parameters to produce a model similar to the one obtained in the book. Use the classes-to-clusters evaluation. What does that tell you?
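
The idea behind a classes-to-clusters evaluation can be sketched in a few lines of Python: ignore the class while clustering, then ask how well the clusters line up with the known classes. The sketch below assumes a one-to-one mapping from clusters to classes chosen to minimize errors, with surplus clusters left without a class. Whether this matches Weka's exact procedure should be checked against its documentation, and the cluster ids and labels are made up.

```python
from collections import Counter
from itertools import permutations


def classes_to_clusters(cluster_ids, class_labels):
    """Return (cluster -> class mapping, number of misassigned instances)."""
    # Contingency table: how many instances of each class fall in each cluster.
    table = Counter(zip(cluster_ids, class_labels))
    clusters = sorted(set(cluster_ids))
    classes = sorted(set(class_labels))
    # Pad with None so that surplus clusters can remain without a class.
    slots = classes + [None] * max(0, len(clusters) - len(classes))
    best_mapping, best_correct = None, -1
    # Brute force over mappings: fine for a handful of clusters only.
    for perm in permutations(slots, len(clusters)):
        correct = sum(table[(cluster, label)]
                      for cluster, label in zip(clusters, perm)
                      if label is not None)
        if correct > best_correct:
            best_mapping, best_correct = dict(zip(clusters, perm)), correct
    return best_mapping, len(class_labels) - best_correct


# Hypothetical cluster assignments and class labels for ten instances.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
class_labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "B"]

mapping, errors = classes_to_clusters(cluster_ids, class_labels)
print(mapping, f"{errors}/{len(class_labels)} instances in the wrong cluster")
```

A low error count suggests the clusters recover the class structure without having seen the class attribute; a high one suggests the clustering groups the data along some other dimension.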

Laboratory III:

 

To download additional .arff data sets go to:

http://www.hakank.org/weka/

zoo.arff, wine.arff, soybean.arff, zoo2_x.arff, sunburn.arff, disease.arff

8. Use the following learning schemes to compare the training set and 10-fold stratified cross-validation scores of the disease data (in disease.arff): 

  

Decision table

– weka.classifiers.DecisionTable -R

 

C4.5

– weka.classifiers.j48.J48

 

Id3

– weka.classifiers.Id3

A) What does the training set evaluation score tell you? 

B) What does the cross-validation score evaluate? 

C) Which one of these models would you say is the best? Why?
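
The gap between the two scores can be illustrated with a small Python sketch. It uses a deliberately overfitting "learner" (a lookup table that memorizes every training row) in place of any of the Weka schemes listed above, a simple round-robin way of building stratified folds, and made-up data. All names here are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=1):
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the instances of each class round-robin across the folds.
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


def train_lookup(rows):
    """Memorize the training rows; unseen inputs get the majority class."""
    table = {features: label for features, label in rows}
    labels = [label for _, label in rows]
    default = max(sorted(set(labels)), key=labels.count)
    return lambda features: table.get(features, default)


def training_accuracy(rows):
    """Evaluate on the same rows that were used for training."""
    model = train_lookup(rows)
    return sum(model(f) == y for f, y in rows) / len(rows)


def cross_validated_accuracy(rows, k, seed=1):
    """Evaluate each fold with a model trained on the other k-1 folds."""
    folds = stratified_folds([y for _, y in rows], k, seed)
    correct = 0
    for test_indices in folds:
        held_out = set(test_indices)
        train = [row for i, row in enumerate(rows) if i not in held_out]
        model = train_lookup(train)
        correct += sum(model(rows[i][0]) == rows[i][1] for i in test_indices)
    return correct / len(rows)


# Made-up data: 20 instances with unique feature values, 12 "yes" and 8 "no".
ROWS = [((i,), "yes" if i < 12 else "no") for i in range(20)]

print("training set:", training_accuracy(ROWS))
print("10-fold CV:  ", cross_validated_accuracy(ROWS, 10))
```

The memorizer looks perfect on the training set because it has already seen every answer, while cross-validation only scores instances the model was not trained on.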

9. Use the following learning schemes to analyze the wine data (in wine.arff). 

  

C4.5

– weka.classifiers.j48.J48

 

Decision List

– weka.classifiers.PART

A) What is the most important descriptor (attribute) in wine.arff?

B) How well were these two schemes able to learn the patterns in the dataset? How would you quantify your answer?

C) Compare the training set and 10-fold cross-validation scores of the two schemes.

D) Would you trust these two models? Did they really learn what is important for proper classification of wine?

E) Which one would you trust more, even if just very slightly?

10. Perform the same analysis on sunburn.arff as in question 9, but use 5-fold instead of 10-fold cross-validation.

A)-E) Same as in question 9.

F) Why could we not use 10-fold evaluation in this example?

11. Choose one of the following three files: soybean.arff, zoo.arff or zoo2_x.arff, and use any two schemes of your choice to build and compare the models.
