<div dir="ltr"><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">Dear all,<div><br></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div style="text-align:justify"><div dir="ltr"><div>Next Friday (November 10, at 1:30 p.m.), the seminar series of the Department of Statistics at UFMG will feature a talk by Prof. Abhirup Datta of Johns Hopkins University, USA.<br></div></div><div dir="ltr"><br><div><div><div dir="ltr"><div><div><div><span style="color:rgb(33,33,33)">Dr. Datta is an Associate Professor in the Department of Biostatistics at Johns Hopkins University. He completed his PhD in Biostatistics at the University of Minnesota. Dr. Datta’s research focuses on developing spatial models for geographically indexed data. His work on Nearest Neighbor Gaussian Processes (NNGP) has become one of the most widely used methods for scalable analysis of massive geospatial data. His recent work focuses on developing theory and methodology for combining machine learning algorithms with traditional spatial modeling, and on applying that methodology to air pollution and infectious disease modeling. He also works on developing Bayesian hierarchical models for multi-source data with applications in global health. His research as Principal Investigator is funded by grants from the National Science Foundation (NSF), the National Institute of Environmental Health Sciences (NIEHS), and the Bill and Melinda Gates Foundation.
He has received the Early Career Investigator award from the American Statistical Association Section of Environmental Health, the Young Statistical Scientist Award (YSSA) from the International Indian Statistical Association (IISA), and the Abdel El-Shaarawi Early Investigator's Award from The International Environmetrics Society (TIES).</span><br></div><div><br></div></div><div>Title: <span style="color:rgb(33,33,33);font-family:Calibri,sans-serif;font-size:14.6667px">Combining machine learning with Gaussian processes for geospatial data</span><br></div></div><div><br></div></div><div dir="ltr">Abstract: Spatial generalized linear mixed models, consisting of a linear covariate effect and a Gaussian Process (GP) distributed spatial random effect, are widely used for analyses of geospatial data. We consider the setting where the covariate effect is non-linear and propose modeling it using a flexible machine learning algorithm such as random forests or deep neural networks. We propose well-principled extensions of these methods for estimating non-linear covariate effects in spatial mixed models where the spatial correlation is still modeled using a GP. The basic principle is guided by how ordinary least squares extends to generalized least squares for linear models to account for dependence. We demonstrate how the same extension can be made for machine learning approaches such as random forests and neural networks. We provide extensive theoretical and empirical support for the methods and show how they fare better than naïve or brute-force approaches to using machine learning algorithms for spatially correlated data.
We demonstrate the RandomForestsGLS R package that implements this extension for random forests.</div></div></div></div></div></div><div dir="ltr"><span style="font-family:arial,sans-serif"><br></span></div><div dir="ltr">The seminar will be streamed live on the YouTube channel "<a href="https://www.youtube.com/@seminariosdest-ufmg" target="_blank">Seminários DEST - UFMG</a>".<br><br><a href="https://www.youtube.com/@seminariosdest-ufmg" target="_blank">https://www.youtube.com/@seminariosdest-ufmg</a><br></div><div><br></div><div dir="ltr">Best regards,<br>Marcos Prates</div></div></div></div></div>