[ABE-L] Fwd: Statistics, Data Science and Decision Seminar - Robert Tibshirani

Hedibert Lopes hedibert em gmail.com
Mon May 10 15:53:11 -03 2021


Today's seminar by Robert Tibshirani: https://youtu.be/ZP3WLxzaf_g

On Mon, May 10, 2021 at 1:19 PM Marco Inácio <m em marcoinacio.com> wrote:

> Good morning, Hedibert. I would like to ask whether the recording of the
> seminar will be available in a repository.
> On 05/05/2021 21:48, Rafael Izbicki wrote:
>
> Rafael
>
> --
> Rafael Izbicki
> Assistant Professor
> Department of Statistics
> Federal University of São Carlos (UFSCar)
> www.rizbicki.ufscar.br/
> www.small.ufscar.br/
>
> ---------- Forwarded message ---------
> From: Hedibert Lopes <hedibert em gmail.com>
> Date: Fri, Apr 16, 2021, 13:39
> Subject: [ABE-L] Statistics, Data Science and Decision Seminar -
> Robert Tibshirani
> To: abe-l <abe-l em ime.usp.br>
>
>
>
> *Data Science*
>
> Academic Seminar
>
> *Cross-validation: what does it estimate and how well does it do it?*
>
> *Speaker:*  Robert Tibshirani <https://statweb.stanford.edu/~tibs/>
> *University:* Stanford University <https://www.stanford.edu/>
>
> Cross-validation is a widely used technique to estimate prediction error,
> but its behavior is complex and not fully understood. Ideally, one would
> like to think that cross-validation estimates the prediction
> error for the model at hand, fit to the training data. We prove that this
> is not the case for the linear model fit by ordinary least squares; rather
> it estimates the average prediction error of models fit on other unseen
> training sets drawn from the same population. We further show that this
> phenomenon occurs for most popular estimates of prediction error, including
> data splitting, bootstrapping, and Mallows' Cp. Next, the standard
> confidence intervals for prediction error derived from cross-validation may
> have coverage far below the desired level. Because each data point is used
> for both training and testing, there are correlations among the measured
> accuracies for each fold, and so the usual estimate of variance is too
> small. We introduce a nested cross-validation scheme to estimate this
> variance more accurately, and show empirically that this modification leads
> to intervals with approximately correct coverage in many examples where
> traditional cross-validation intervals fail. Lastly, our analysis also
> shows that when producing confidence intervals for prediction accuracy with
> simple data splitting, one should not re-fit the model on the combined
> data, since this invalidates the confidence intervals.
>
> May 10, 2021
>
> *12 pm, São Paulo, Brazil (UTC/GMT -03:00)*
>
> *Click here to join <https://zoom.us/j/97971536999>*
>
> --
> Hedibert Freitas Lopes, PhD
> Professor of Statistics and Econometrics
> INSPER - Institute of Education and Research
> Rua Quatá, 300 - São Paulo, SP 04546-042 Brazil
> Phone: +55 11 4504-2343
> www.hedibert.org
>
>

-- 
Hedibert Freitas Lopes, PhD
Professor of Statistics and Econometrics
INSPER - Institute of Education and Research
Rua Quatá, 300 - São Paulo, SP 04546-042 Brazil
Phone: +55 11 4504-2343
www.hedibert.org