Research and development - Seminars
The thesis focuses on dimensionality reduction for binary classification, aiming to improve classifier accuracy in projected spaces through linear projection methods. In binary classification, each observation belongs to one of two classes (e.g., diabetic or non-diabetic), and the goal is for the classifier to be consistent, meaning that its error converges to the minimum achievable error as more data is gathered. Classical methods such as K-nearest neighbors, however, suffer from the "curse of dimensionality" in high-dimensional data, which limits their practical usability. To address this challenge, the thesis first evaluates standard dimensionality reduction techniques such as Principal Component Analysis (PCA); PCA, however, maximizes retained variance and does not necessarily optimize classification performance. As an alternative, the work introduces a strategy that selects the linear projection maximizing the Wasserstein distance, and related metrics such as the Sorn distance, between the projected class distributions, so as to optimize class separation in the reduced space. Using linear programming formulations and the Lipschitz continuity properties of these distances, the method constructs more effective classifiers with lower error rates in the projected space. The results show that projections based on the Wasserstein and Sorn distances outperform PCA, especially on large datasets, highlighting their potential to replace traditional methods in high-dimensional classification tasks thanks to their greater accuracy and computational efficiency.
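The core idea, choosing a linear projection that maximizes the Wasserstein distance between the two projected class distributions, can be illustrated with a minimal sketch. The synthetic data, the naive random search over unit directions (a stand-in for the thesis's linear-programming formulation, which is not reproduced here), and all variable names below are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Synthetic binary-classification data: two Gaussian classes in 10 dimensions
# whose separation lives almost entirely along the first coordinate axis.
d = 10
X0 = rng.normal(0.0, 1.0, size=(200, d))
X1 = rng.normal(0.0, 1.0, size=(200, d))
X1[:, 0] += 3.0  # shift class 1 along the first axis

def projected_wasserstein(w, A, B):
    """1D Wasserstein distance between the two classes projected onto w."""
    w = w / np.linalg.norm(w)
    return wasserstein_distance(A @ w, B @ w)

# Naive random search over unit directions; the thesis instead exploits
# linear programming and Lipschitz continuity to find good projections.
best_w, best_val = None, -np.inf
for _ in range(2000):
    w = rng.normal(size=d)
    val = projected_wasserstein(w, X0, X1)
    if val > best_val:
        best_w, best_val = w / np.linalg.norm(w), val

# The best direction found should roughly align with the first axis,
# where the classes actually differ.
print(best_val, abs(best_w[0]))
```

A classifier (e.g., K-nearest neighbors) can then be trained on the 1D projections `X0 @ best_w` and `X1 @ best_w`, avoiding the curse of dimensionality while preserving class separation.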
YouTube – Quantil Matemáticas Aplicadas