DEPARTMENT OF COMPUTING


CS 4320: Machine Learning

Assignment: Support Vector Classification

Use the heart_failure_clinical_records_dataset data set from Kaggle. You will need to identify which features are categorical and which are numerical. Use hyperparameter search with cross-validation to build a decision tree classification model and a support vector classification (SVC) model, aiming for the best F1 score you can obtain from each.
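As a rough illustration of the workflow described above, the sketch below runs a cross-validated grid search over an SVC with F1 as the scoring metric. It is not the course's Titanic starter code. The data is synthetic, and the column names (num_a, num_b, cat_a), the grid values, the 80/20 split, and the choice of 5 folds are all assumptions made for the example. You still need to decide for yourself which features of the real data set are categorical and which are numerical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data. The column names are hypothetical and are
# NOT the columns of the heart failure data set.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "cat_a": rng.integers(0, 2, size=n),
})
noise = rng.normal(scale=0.5, size=n)
label = ((data["num_a"] + data["cat_a"] + noise) > 0.5).astype(int)

numerical = ["num_a", "num_b"]   # assumed numerical features
categorical = ["cat_a"]          # assumed categorical features

# Hold out a test set before any searching is done.
X_train, X_test, y_train, y_test = train_test_split(
    data, label, test_size=0.2, stratify=label, random_state=0
)

# Scale numerical features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numerical),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
pipeline = Pipeline([("preprocess", preprocess), ("model", SVC())])

# Example grid only; choose and justify your own ranges.
param_grid = {
    "model__C": [0.1, 1.0, 10.0],
    "model__kernel": ["linear", "rbf"],
    "model__gamma": ["scale", "auto"],
}

# 5-fold cross-validation, selecting the combination with the best mean F1.
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
best_cv_f1 = search.best_score_
print("Selected hyperparameters:", best_params)
print("Mean cross-validation F1:", round(best_cv_f1, 3))
```

Putting the preprocessing inside the pipeline means the scaler and encoder are refit on each training fold, so no information from the validation fold leaks into the search. Swapping SVC() for DecisionTreeClassifier() with a matching grid gives the decision tree half of the comparison.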

You are expected to use the Titanic decision tree source code (hyperparameter search with cross-validation) as the starting point for your code development.

Create a report that includes the data exploration plots and your analysis of them. For each type of model (decision tree and SVC), the report must also state which hyperparameters were included in the search, the range or set of values searched for each hyperparameter, the hyperparameter values selected, the number of cross-validation folds, the cross-validation F1 score obtained, the training F1 score of the model when trained on all of the training data, and the F1 score of the model on the testing data.
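The per-model numbers listed above can all be read from a single fitted search object. The sketch below shows one way to collect them for a decision tree. It assumes scikit-learn's GridSearchCV with its default refit behavior and uses synthetic data from make_classification. The grid values, fold count, and the report dictionary are illustrative choices, not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the real data set.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters searched and their candidate values (example grid only).
param_grid = {
    "max_depth": [2, 4, 8, None],
    "min_samples_leaf": [1, 5, 10],
}
n_folds = 5  # assumed number of cross-validation folds

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=n_folds,
)
search.fit(X_train, y_train)

# With the default refit=True, the best model is retrained on all of the
# training data, so search.predict uses that final model.
report = {
    "searched": param_grid,
    "selected": search.best_params_,
    "cv_folds": n_folds,
    "cv_f1": search.best_score_,
    "train_f1": f1_score(y_train, search.predict(X_train)),
    "test_f1": f1_score(y_test, search.predict(X_test)),
}

for name, value in report.items():
    print(f"{name}: {value}")
```

A training F1 well above the cross-validation F1 is a sign of overfitting, which is worth noting when you compare the two models.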

Include a comparison of the cross-validation and full-training F1 scores of the two models, and state which model you would select based only on those scores. Finally, discuss whether the F1 scores on the testing data support that decision.

Required Steps

Last Updated 01/16/2023