
dc.contributor.advisor: Geyer, Claudio Fernando Resin [pt_BR]
dc.contributor.author: Ribeiro, Giovane Dutra [pt_BR]
dc.date.accessioned: 2026-01-16T08:02:21Z [pt_BR]
dc.date.issued: 2023 [pt_BR]
dc.identifier.uri: http://hdl.handle.net/10183/300254 [pt_BR]
dc.description.abstract: This work introduces a novel architecture that integrates the Proximal Policy Optimization (PPO) algorithm, supported by TensorFlow Agents (TF-Agents), to intelligently orchestrate in-memory data for Spark Streaming applications. Apache Spark Streaming is known for its reliability, but it can suffer from performance issues because the Java Virtual Machine does not manage the heap efficiently for data caching. As a result, out-of-memory (OOM) problems can occur, which can compromise data integrity, increase latency, and decrease throughput. Additionally, memory-borrowing operations can be costly, leading to OOM exceptions, high latency, and system crashes. The proposed architecture introduces a novel framework that enhances the existing policy module of SparkStreaming++ (SS++). It enables the adaptation of various algorithms implemented by TF-Agents, as well as custom implementations, thereby offering a compelling alternative to the current heuristic-based and previous reinforcement learning-based policies. The PPO algorithm is implemented on this new architecture to showcase its efficacy. This framework operates independently of Spark's native backpressure mechanisms and presents an improved approach to policy implementation. The findings indicate that PPO consistently achieved higher throughput, surpassing SAQN and Adaptive by up to 24.6% and 25.5%, respectively. However, PPO's processing times and scheduling delays were longer than those of SAQN and Adaptive. Although PPO was robust and efficient in high-data-intensity situations, there are still areas for improvement. PPO exhibited excellent stability, especially in high-data-intensity scenarios, and maintained application performance with an average response time of 0.04 seconds. The results suggest that PPO is best suited for high-throughput applications, but minimizing its scheduling delays and processing times is still a challenge. [en]
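As an illustration of the kind of policy the abstract describes, the sketch below wires a TF-Agents PPO agent to a hypothetical memory-orchestration decision loop. The observation shape, the three-way action space (keep in memory, spill to disk, evict), and all hyperparameters are assumptions made here for illustration; they are not the thesis's actual state/action design or the SS++ policy interface.

    # Hypothetical sketch: a TF-Agents PPO agent for memory-orchestration
    # decisions, under assumed observation/action specs (not from the thesis).
    import tensorflow as tf
    from tf_agents.agents.ppo import ppo_agent
    from tf_agents.networks import actor_distribution_network, value_network
    from tf_agents.specs import tensor_spec
    from tf_agents.trajectories import time_step as ts

    # Assumed state: e.g. heap usage, cache size, input rate, scheduling delay.
    observation_spec = tensor_spec.TensorSpec(shape=(4,), dtype=tf.float32)
    # Assumed action: a discrete orchestration choice
    # (0 = keep in memory, 1 = spill to disk, 2 = evict).
    action_spec = tensor_spec.BoundedTensorSpec(
        shape=(), dtype=tf.int32, minimum=0, maximum=2)
    time_step_spec = ts.time_step_spec(observation_spec)

    # Actor and critic networks for PPO.
    actor_net = actor_distribution_network.ActorDistributionNetwork(
        observation_spec, action_spec, fc_layer_params=(64, 64))
    value_net = value_network.ValueNetwork(
        observation_spec, fc_layer_params=(64, 64))

    agent = ppo_agent.PPOAgent(
        time_step_spec,
        action_spec,
        optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4),
        actor_net=actor_net,
        value_net=value_net,
        num_epochs=10)
    agent.initialize()

    # agent.collect_policy.action(time_step) would then yield an orchestration
    # decision per micro-batch; agent.train(experience) updates the policy
    # from collected trajectories.

In a setup like SS++'s pluggable policy module, such an agent would sit behind the policy interface, so swapping PPO for another TF-Agents algorithm (or a custom one) only changes how the agent is constructed, not how decisions are consumed.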
dc.format.mimetype: application/pdf [pt_BR]
dc.language.iso: eng [pt_BR]
dc.rights: Open Access [en]
dc.subject: Aprendizagem por reforço (Reinforcement learning) [pt_BR]
dc.subject: PPO [en]
dc.subject: Gestão de recursos (Resource management) [pt_BR]
dc.subject: Resource management [en]
dc.subject: Spark [en]
dc.subject: Algoritmos (Algorithms) [pt_BR]
dc.subject: Otimização de política proximal (Proximal policy optimization) [pt_BR]
dc.subject: Streaming [en]
dc.subject: Model-free [en]
dc.subject: TF-Agents [en]
dc.title: A reinforcement learning for data orchestration in Spark Streaming framework with TensorFlow Agents [pt_BR]
dc.type: Trabalho de conclusão de graduação (Undergraduate final project) [pt_BR]
dc.contributor.advisor-co: Matteussi, Kassiano José [pt_BR]
dc.identifier.nrb: 001195957 [pt_BR]
dc.degree.grantor: Universidade Federal do Rio Grande do Sul [pt_BR]
dc.degree.department: Instituto de Informática [pt_BR]
dc.degree.local: Porto Alegre, BR-RS [pt_BR]
dc.degree.date: 2023 [pt_BR]
dc.degree.graduation: Ciência da Computação: Ênfase em Engenharia da Computação: Bacharelado (Computer Science: Emphasis in Computer Engineering, Bachelor's degree) [pt_BR]
dc.degree.level: graduação (undergraduate) [pt_BR]


This item is licensed under a Creative Commons License
