| Title | Semi-analytical method for analyzing models and model selection measures |
| Abstract | Large amounts of data are collected every day in domains such as health care, financial services, and astrophysics, and there is a pressing need to convert this information into knowledge. Machine learning and data mining are both concerned with achieving this goal in a scalable fashion. The main theme of my work has been to analyze and better understand prevalent classification techniques and paradigms, which are an integral part of machine learning and data mining research, with the aim of narrowing the gap between theory and practice. Researchers in these fields have developed a plethora of classification algorithms. Unfortunately, no single algorithm is superior to the others in all scenarios, nor is it clear which algorithm should be preferred under specific circumstances. Hence an important question is: what is the best choice of classification algorithm for a particular application? This problem, termed classification model selection, is a central problem in machine learning and data mining. The primary focus of my research has been to propose a novel methodology for studying classification algorithms accurately and efficiently in the non-asymptotic regime. In particular, we propose a moment-based method: by focusing on the probability space of classifiers induced by the classification algorithm and by datasets of size N drawn independently and identically distributed (i.i.d.) from a joint distribution, we obtain efficient characterizations for computing the moments of the generalization error. We can also study model selection techniques such as cross-validation, leave-one-out, and hold-out-set validation in the proposed framework, because we have established general relationships between the moments of the generalization error and the moments of the hold-out-set, cross-validation, and leave-one-out errors. Using this methodology, we were able to provide interesting explanations for the behavior of cross-validation. The methodology aims to close the gap between results predicted by theory and the behavior observed in practice. |
| Category | Pure Sciences |
| Subject | Computer Science |
| Pages | 157 |
| Language | English |
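The quantities named in the abstract can be illustrated with a small simulation. The sketch below is not the semi-analytical method of the thesis: it swaps in plain Monte Carlo sampling to estimate the first two moments of the generalization error and of the hold-out-set error for datasets of size N drawn i.i.d. from a known joint distribution. The distribution `JOINT`, the majority-vote classifier, and all function names are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, pvariance

# Hypothetical joint distribution P(x, y): 3 feature values, 2 class labels.
JOINT = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.10, (1, 1): 0.20,
    (2, 0): 0.12, (2, 1): 0.18,
}

def draw_dataset(n, rng):
    """Draw n (x, y) pairs i.i.d. from JOINT."""
    pairs = list(JOINT)
    weights = [JOINT[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=n)

def train_majority(dataset):
    """Toy classifier: predict the majority label seen for each x."""
    counts = {}
    for x, y in dataset:
        counts.setdefault(x, [0, 0])[y] += 1
    # Ties resolve to label 0.
    return {x: int(c[1] > c[0]) for x, c in counts.items()}

def predict(model, x):
    # Feature values never seen in training default to label 0.
    return model.get(x, 0)

def generalization_error(model):
    """Exact misclassification probability under JOINT."""
    return sum(p for (x, y), p in JOINT.items() if predict(model, x) != y)

def holdout_error(model, test_set):
    """Fraction of misclassified points in an independent test set."""
    return sum(predict(model, x) != y for x, y in test_set) / len(test_set)

def simulate(n_train, n_test, trials, seed=0):
    """Sample many training sets; record both errors for each classifier."""
    rng = random.Random(seed)
    ge, he = [], []
    for _ in range(trials):
        model = train_majority(draw_dataset(n_train, rng))
        ge.append(generalization_error(model))
        he.append(holdout_error(model, draw_dataset(n_test, rng)))
    return ge, he

ge, he = simulate(n_train=20, n_test=10, trials=20000)
print("generalization error: mean", mean(ge), "variance", pvariance(ge))
print("hold-out-set error:   mean", mean(he), "variance", pvariance(he))
```

Because the hold-out set is drawn independently of the training set, the two estimated means should agree up to sampling noise, while the hold-out-set error should show the larger variance, since it adds test-set sampling noise on top of the variation across training sets. A simulation of this kind needs many repeated draws to pin down each moment, which is the cost that a direct characterization of the moments would avoid.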