[ABE-L] Fw: [External] PhD position, Lancaster University, with international funding

Wed Mar 2 12:44:34 -03 2022

Dear all,

Below is an announcement of a PhD position in statistics in England; please share it with anyone who may be interested.

Best regards,
Cristine

________________________________
From: A UK-based worldwide e-mail broadcast system mailing list <allstat em JISCMAIL.AC.UK> on behalf of Leslie, David <d.leslie em LANCASTER.AC.UK>
Sent: 02 March 2022 12:06 PM
To: allstat em JISCMAIL.AC.UK <allstat em JISCMAIL.AC.UK>
Subject: [External] PhD position, Lancaster University, with international funding

Topic: Reinforcement learning with structured action spaces
Supervisors: David Leslie (Lancaster) and Raphael Clifford (Bristol)
Closing date: When we have found a suitable candidate
Start date: October 2022

Reinforcement learning (RL) is a machine learning technique in which computers experiment with an environment and learn effective behaviour. RL is particularly effective in sequential decision-making problems, where the task consists of observing the state of the environment, selecting an action, incurring some cost, and moving to a new state, with the cost and the successor state depending on both the current state and the action selected; the canonical mathematical formulation of these problems is a Markov decision process. The most famous recent example of RL success is the game of Go, addressed by DeepMind, which builds on a rich history in both games and individual decision-making examples (see Sutton and Barto (2018) for a survey).

In common reinforcement learning approaches, the set of actions available to the decision maker at each time instant is very regular. In many examples it is either fixed, finite and small (e.g. move North, South, East or West), or a simple continuous space (e.g. an angle and speed at which to move). However, in many problems the action space is more complex. For example, in robotic soccer-playing environments, the player can choose whether to run, turn or kick, and each of these choices is then parameterised by the strength and/or direction; this type of action space, with a finite number of action families each of which is indexed by a parameter, is called a parameterised action space.

In contrast to image processing, for which standard deep learning methods and libraries now exist, action sets with complex structure have generally required custom solutions. This custom approach severely hinders the ability of non-specialists to deploy RL methods on their own problems. Thus the focus of the PhD topic is to formulate and code modular reinforcement learning components for general structured action spaces.
The project could take several directions, including:
• devising policy optimisation analogues of existing value learning approaches for structured action spaces
• extending parameterised action space approaches to more general structured action spaces
• deriving exploration strategies for parameterised action spaces to ensure efficient experimentation
A successful candidate will have skills in both mathematics and computer science - you will formulate methods for awkward action spaces, implement methods in modular code, and run computer experiments to compare methods on various problems.

To start the application process, please send your undergraduate transcript to d.leslie em lancaster.ac.uk with a brief note about why this project interests you.

--

David Leslie (he/him/his), Professor of Statistical Learning,
Head of Statistics, Lancaster University
