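Comparison of regularization methods in the training of artificial neural networks applied to a water cooling tower (original title: Comparação de métodos de regularização no treinamento de redes neurais artificiais aplicado a uma torre de resfriamento de água)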

Costa, Artur Tiburski Vaz

Date

2022

Author

Costa, Artur Tiburski Vaz

Advisor

Fernandes, Pedro Rafael Bolognese

Academic level

Undergraduate

Abstract

The growing development and application of artificial intelligence techniques to processes across a wide range of industries creates an increasing need to study and adapt these tools for processes in the chemical industry. Among these tools are Artificial Neural Networks, whose generalization capacity has the potential, given suitable data and training conditions, to deliver results very close to reality. This study therefore tests the regularization techniques commonly employed in the training of neural networks, which aim to avoid the overfitting problems associated with the model, using an industrial water cooling tower as a case study. The techniques tested were L1, L2 and Dropout regularization, applied to networks trained in various configurations and with various hyperparameter values. By analyzing the performance metrics mean absolute error (MAE) and root mean square error (RMSE), together with the behaviour of the learning curves obtained from training the networks under various conditions, it was possible to identify which network architectures achieved the best prediction results without regularization, for the cases of one and two hidden layers. L2 regularization with λ = 0.01, applied to the neural network with 5 internal nodes, was the only case in which regularization produced MAE and RMSE values lower than those of the unregularized network. Starting from this result, an exploratory analysis searched for the value of λ (L2) that gave the lowest MAE and RMSE for this network, arriving at λ = 0.01597. In general, however, the regularization techniques did not deliver significant improvements in the generalization capacity of the neural networks, and the learning curves did not show a reduction in the effects of overfitting, even though these effects were present only to a small degree. ...
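For readers unfamiliar with the quantities named in the abstract, the following is a minimal illustrative sketch of the two error metrics (MAE and RMSE) and of the L1 and L2 penalty terms. It is not the author's implementation: the function names, the sample values, and the weight vector are hypothetical, and the penalty is written in its plain form (λ times the sum of absolute or squared weights), whereas some libraries apply an extra scaling factor. Dropout is not covered here.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def l1_penalty(weights, lam):
    """L1 term added to the training loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 term added to the training loss: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)

# Hypothetical values, chosen only to exercise the functions.
y_true = [30.0, 28.0, 26.0, 25.0]   # e.g. measured outlet temperatures
y_pred = [29.0, 28.0, 27.0, 23.0]   # e.g. network predictions
weights = [0.5, -1.0, 2.0]          # made-up network weights
lam = 0.01                          # same order as the lambda in the abstract

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("L1 penalty:", l1_penalty(weights, lam))
print("L2 penalty:", l2_penalty(weights, lam))
```

During training, one of the penalty terms is added to the data-fit loss, so a larger λ pushes the weights toward zero more strongly; MAE and RMSE are then computed on held-out data to judge generalization.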

Institution

Universidade Federal do Rio Grande do Sul. Escola de Engenharia. Curso de Engenharia Química.

Collections

Undergraduate Theses (38626)

Engineering Undergraduate Theses (5995)

This item is licensed under a Creative Commons License