Progress and future directions in machine learning through control theory
| dc.contributor.author | Zuazua, Enrique | |
| dc.date.accessioned | 2026-03-12T14:06:54Z | |
| dc.date.available | 2026-03-12T14:06:54Z | |
| dc.date.issued | 2024 | |
| dc.date.updated | 2026-03-12T14:06:54Z | |
| dc.description | Paper presented at the 21st French-German-Spanish Conference on Optimization, held in Gijón from 18 to 21 June 2024 | en |
| dc.description.abstract | This paper presents our recent advancements at the intersection of machine learning and control theory. We focus specifically on utilizing control theoretical tools to elucidate the underlying mechanisms driving the success of machine learning algorithms. By enhancing the explainability of these algorithms, we aim to contribute to their ongoing improvement and more effective application. Our research explores several critical areas: Firstly, we investigate the memorization, representation, classification, and approximation properties of residual neural networks (ResNets). By framing these tasks as simultaneous or ensemble control problems, we have developed nonlinear and constructive algorithms for training. Our work provides insights into the parameter complexity and computational requirements of ResNets. Similarly, we delve into the properties of neural ODEs (NODEs). We demonstrate that autonomous NODEs of sufficient width can ensure approximate memorization properties. Furthermore, we prove that by allowing biases to be time-dependent, NODEs can track dynamic data. This showcases their potential for synthetic model generation and helps elucidate the success of methodologies such as Reservoir Computing. Next, we analyze the optimal architectures of multilayer perceptrons (MLPs). Our findings offer guidelines for designing MLPs with minimal complexity, ensuring efficiency and effectiveness for supervised learning tasks. The generalization and prediction capacity of trained networks plays a crucial role. To address these properties, we present two nonconvex optimization problems related to shallow neural networks, capturing the “sparsity” of parameters and robustness of representation. We introduce a “mean-field” model, proving, via representer theorems, the absence of a relaxation gap. This aids in designing an optimal tolerance strategy for robustness and, through convexification, efficient algorithms for training. In the context of large language models (LLMs), we explore the integration of residual networks with self-attention layers for context capture. We treat “attention” as a dynamical system acting on a collection of points and characterize their asymptotic dynamics, identifying convergence towards special points called leaders. These theoretical insights have led to the development of an interpretable model for sentiment analysis of movie reviews, among other possible applications. Lastly, we address federated learning, which enables multiple clients to collaboratively train models without sharing private data, thus addressing data collection and privacy challenges. We examine training efficiency, incentive mechanisms, and privacy concerns within this framework, proposing solutions to enhance the effectiveness and security of federated learning methods. Our work underscores the potential of applying control theory principles to improve machine learning models, resulting in more interpretable and efficient algorithms. This interdisciplinary approach opens up a fertile ground for future research, raising profound mathematical questions and application-oriented challenges and opportunities. | en |
| dc.description.sponsorship | Our research has been funded by the Alexander von Humboldt-Professorship program, ModConFlex Marie Curie Action, HORIZON-MSCA-2021-DN-01, COST Action MAT-DYN-NET, Transregio 154 Project “Mathematical Modelling, Simulation and Optimization Using the Example of Gas Networks” of the DFG, AFOSR-24IOE027, grants PID2020-112617GB-C22 and TED2021-131390B-I00 of MICINN (Spain) | en |
| dc.identifier.citation | Zuazua, E. (2024). Progress and future directions in machine learning through control theory. FGS 2024 (French-German-Spanish Conference on Optimization) Proceedings, 116-123. | |
| dc.identifier.isbn | 9788410135307 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14454/5424 | |
| dc.language.iso | eng | |
| dc.publisher | Universidad de Oviedo, Servicio de Publicaciones | |
| dc.rights | © 2024 Universidad de Oviedo | |
| dc.rights | © Los autores | |
| dc.title | Progress and future directions in machine learning through control theory | en |
| dc.type | conference paper | |
| dcterms.accessRights | open access | |
| oaire.citation.endPage | 123 | |
| oaire.citation.startPage | 116 | |
| oaire.citation.title | FGS 2024 (French-German-Spanish Conference on Optimization) Proceedings | |
| oaire.licenseCondition | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| oaire.version | VoR | |
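
The abstract's central framing treats a ResNet as a simultaneous control problem: each residual layer is one forward-Euler step of a neural ODE whose weights act as controls steering the data. The snippet below is a minimal illustrative sketch of that viewpoint, not the paper's implementation; the specific dynamics x'(t) = W(t) σ(A(t)x(t) + b(t)), the ReLU activation, and the random toy weights are assumptions made here for illustration only.

```python
import numpy as np

def sigma(z):
    # ReLU activation (an illustrative choice)
    return np.maximum(z, 0.0)

def resnet_forward(x, weights, dt=0.1):
    """One Euler step per layer: x_{k+1} = x_k + dt * W_k sigma(A_k x_k + b_k).

    The triples (W_k, A_k, b_k) play the role of time-dependent controls
    in the underlying neural ODE x'(t) = W(t) sigma(A(t) x(t) + b(t)).
    """
    for W, A, b in weights:
        x = x + dt * W @ sigma(A @ x + b)
    return x

# Hypothetical toy setup: steer a 2-D state with random (untrained) controls.
rng = np.random.default_rng(0)
d, depth = 2, 10
weights = [(rng.normal(size=(d, d)) / np.sqrt(d),
            rng.normal(size=(d, d)) / np.sqrt(d),
            rng.normal(size=d)) for _ in range(depth)]
x0 = rng.normal(size=d)
print(resnet_forward(x0, weights))
```

In this picture, training the network amounts to choosing the controls (W_k, A_k, b_k) so that the flow maps each input to its target, which is the simultaneous/ensemble control problem the abstract refers to.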
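The abstract also describes treating “attention” as a dynamical system acting on a collection of points whose trajectories converge toward special points called leaders. The following toy sketch makes that framing concrete under assumptions of this note, not the paper's model: identity query/key/value matrices, an inverse temperature beta, and explicit Euler steps.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_step(X, beta=5.0, dt=0.1):
    """Euler step of the toy dynamic x_i' = sum_j P_ij x_j - x_i, where
    P = softmax(beta * <x_i, x_j>). Identity query/key/value matrices are
    an illustrative simplification."""
    P = softmax(beta * X @ X.T)   # n x n attention matrix
    return X + dt * (P @ X - X)   # convex step toward the attention average

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))      # 20 points ("tokens") in the plane
for _ in range(500):
    X = attention_step(X)
# After many steps the points typically concentrate near a few "leaders".
print(np.round(X, 3))
```

Because each step is a convex combination of the current points, the collection stays bounded while the attention weights progressively concentrate, which is the clustering-toward-leaders behavior the abstract characterizes.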
Files
- zuazua_progress_2024.pdf (2.39 MB, Adobe Portable Document Format)