DeustoTeka Browsing by Author "Figuera, Pau"

Showing 1 - 4 of 4

Clustering validation inference
(Multidisciplinary Digital Publishing Institute (MDPI), 2024-08) Figuera, Pau; Cuzzocrea, Alfredo; García Bringas, Pablo
Clustering validation evaluates the quality of classifications, a crucial step in unsupervised machine learning. A plethora of methods exist for this purpose; however, a common drawback is that statistical inference is not possible. In this study, we construct a density function for the number of clusters. For this purpose, we use smoothing techniques and then apply non-negative matrix factorization with the Kullback–Leibler divergence. Employing a unique linearly independent uncorrelated observational variable hypothesis, and using only analytical techniques, we construct a sequence by varying the dimension of the space spanned by the factorization. The expectation of the limit of this sequence follows a gamma probability density function. Then, identifying the dimension of the space spanned by the factorization with the number of clusters, we transform the estimation of a suitable factorization dimension into a probabilistic estimate of the number of clusters. This approach is an internal validation method that is suitable for numerical and categorical multivariate data and is independent of the clustering technique. Our main achievement is a predictive clustering validation model with graphical capabilities. It expresses results in terms of credibility, making it possible to compare results, such as expert judgments, on a quantitative basis.
Explainability for machine learning
(Universidad de Deusto, 2024-05-15) Figuera, Pau; García Bringas, Pablo; Facultad de Ingeniería, Programa de Doctorado en Ingeniería para la Sociedad de la Información y Desarrollo Sostenible por la Universidad de Deusto
The leitmotif of this thesis is the search for interpretations with explainable content for machine learning. We understand explainability as grounding the methods we develop in sound algebraic and statistical techniques. The starting point is the relation between Probabilistic Latent Semantic Analysis and the Singular Value Decomposition theorem. The work rests on interpreting the structure of the dimensionality of the factorization space. Under these conditions, the search for the meaning of the diagonal matrix is related to the Fisher kernel. The algebra of matrices with non-negative entries supports these structures naturally. The result we derive is that this kernel can be obtained in this way. Using the Bregman divergence, we show that the classification error is arbitrarily small while consistency is preserved. One consequence we examine is the asymptotic behavior of the traces obtained from these matrices. Their expectation is a statistic modeled by a gamma density. The estimation is efficient. We apply this result to the clustering problem, which allows us to construct a validation criterion. The novel result is that it enables inference (in the statistical sense) in clustering validation. We present the theoretical developments that lead to each conclusion. In addition, we provide application examples for the results we have derived.
On the probabilistic latent semantic analysis generalization as the singular value decomposition probabilistic image
(Atlantis Press, 2020-06-19) Figuera, Pau; García Bringas, Pablo
Probabilistic Latent Semantic Analysis has been related to the Singular Value Decomposition. Several problems arise when this comparison is made. Data class restrictions and the existence of several local optima mask the relation, reducing it to a formal analogy without any real significance. Moreover, the computational cost in terms of time and memory limits the applicability of the technique. In this work, we use Nonnegative Matrix Factorization with the Kullback-Leibler divergence to prove that, when the number of model components is sufficient and a limit condition is reached, the empirical distributions of the Singular Value Decomposition and of Probabilistic Latent Semantic Analysis are arbitrarily close. Under such conditions, the equality of Nonnegative Matrix Factorization and Probabilistic Latent Semantic Analysis is obtained. With this result, the Singular Value Decomposition of every matrix with nonnegative entries converges to the general-case Probabilistic Latent Semantic Analysis results and constitutes the unique probabilistic image. Moreover, a faster algorithm for Probabilistic Latent Semantic Analysis is provided.
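As a reader's aid, the sketch below illustrates Nonnegative Matrix Factorization under the generalized Kullback-Leibler divergence, using the textbook Lee-Seung multiplicative updates. It is not the faster algorithm announced in the abstract; the function names `nmf_kl` and `kl_divergence`, the random initialization, and the small constant `eps` are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) for nonnegative arrays."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, k, n_iter=200, seed=0, eps=1e-12):
    """Approximate V (n x m, nonnegative) by W @ H with k components.

    Uses Lee-Seung multiplicative updates, which keep W and H
    nonnegative and do not increase the generalized KL divergence.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Update W using the elementwise ratio V / (W H).
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1) + eps)
        # Recompute the ratio with the new W, then update H.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((6, 5))          # toy nonnegative data matrix
    W, H = nmf_kl(V, k=2)
    print("KL divergence:", kl_divergence(V, W @ H))
```

Normalizing the columns of `W` and the rows of `H` to sum to one gives the factors a probabilistic reading, which is the usual bridge from this factorization to PLSA.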
Revisiting probabilistic latent semantic analysis: extensions, challenges and insights
(Multidisciplinary Digital Publishing Institute (MDPI), 2024-01-03) Figuera, Pau; García Bringas, Pablo
This manuscript provides a comprehensive exploration of Probabilistic latent semantic analysis (PLSA), highlighting its strengths, drawbacks, and challenges. PLSA, originally a tool for information retrieval, gives a probabilistic interpretation to a table of co-occurrences as a mixture of multinomial distributions spanned over a latent class variable and fitted with the expectation–maximization algorithm. The distributional assumptions and the iterative nature lead to a rigid model, dividing enthusiasts and detractors. Those drawbacks have led to several reformulations: the extension of the method to normal data distributions and a non-parametric formulation obtained with the help of Non-negative matrix factorization (NMF) techniques. Furthermore, the combination of theoretical studies and programming techniques alleviates the computational problem, thus making the potential of the method explicit: its relation with the Singular value decomposition (SVD), which means that PLSA can be used to satisfactorily support other techniques, such as the construction of Fisher kernels, the probabilistic interpretation of Principal component analysis (PCA), Transfer learning (TL), and the training of neural networks, among others. We also present open questions as a practical and theoretical research window.
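The mixture described in this abstract can be made concrete with a minimal sketch of classical PLSA fitted by expectation-maximization, assuming the symmetric formulation P(d, w) = Σ_z P(z) P(d|z) P(w|z). The function name `plsa_em`, the toy count table, and the constant `eps` are illustrative assumptions, and none of the extensions surveyed in the manuscript are included.

```python
import numpy as np


def plsa_em(N, k, n_iter=100, seed=0, eps=1e-12):
    """Fit symmetric PLSA to a co-occurrence count table N (docs x words).

    Returns p_z (k,), p_d_z (n_docs x k) and p_w_z (n_words x k), where
    each column of p_d_z and p_w_z is a multinomial distribution.
    """
    rng = np.random.default_rng(seed)
    n_d, n_w = N.shape
    p_z = np.full(k, 1.0 / k)
    p_d_z = rng.random((n_d, k))
    p_d_z /= p_d_z.sum(axis=0)
    p_w_z = rng.random((n_w, k))
    p_w_z /= p_w_z.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (n_d, n_w, k).
        joint = p_z[None, None, :] * p_d_z[:, None, :] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # M-step: re-estimate parameters from expected counts.
        weighted = N[:, :, None] * post
        mass = weighted.sum(axis=(0, 1))            # expected count per class
        p_d_z = weighted.sum(axis=1) / (mass + eps)
        p_w_z = weighted.sum(axis=0) / (mass + eps)
        p_z = mass / mass.sum()
    return p_z, p_d_z, p_w_z


if __name__ == "__main__":
    # Toy table: two groups of documents using mostly disjoint vocabularies.
    N = np.array([[4, 3, 0, 0],
                  [5, 2, 1, 0],
                  [0, 0, 3, 4],
                  [0, 1, 2, 5]], dtype=float)
    p_z, p_d_z, p_w_z = plsa_em(N, k=2)
    # Reconstructed joint distribution P(d, w) under the fitted model.
    P = (p_d_z * p_z) @ p_w_z.T
    print(np.round(P, 3))
```

The dense `(n_d, n_w, k)` posterior array keeps the sketch readable but scales poorly in memory, which reflects the computational burden the abstract mentions.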
