Semi-Automated experiment design

The project will focus on mechanism design for human and artificially intelligent agents. The aim is to develop mechanisms that enable their cooperation. The broader context is the role of mechanism design, multi-agent dynamical models, and privacy-preserving algorithms in promoting the emergence of beneficial AI (for example, social-welfare-maximizing AI) in multi-agent systems, especially those in which the AIs are built through reinforcement learning.

This project focuses on experiment design, typically formalized as a multi-armed bandit process, which we intend to study in a multi-agent, privacy-preserving setting. The work done so far has focused on the problem faced by an AI that needs to collaborate with a single human. Because the AI's and the human's views of reality differ, the AI must take the human's beliefs into account.
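
To make the bandit formalization concrete, the following is a minimal single-agent sketch, not the project's method. It assumes each candidate experiment (an "arm") has an unknown success probability and uses Thompson sampling with Beta posteriors to decide which experiment to run next. The function name `thompson_sampling`, the success rates, and the Bernoulli outcome model are all illustrative assumptions.

```python
import random


def thompson_sampling(true_success_rates, n_experiments, seed=0):
    """Allocate experiments across arms with Beta-Bernoulli Thompson sampling.

    true_success_rates is a hypothetical ground truth used only to simulate
    outcomes; a real planner would observe outcomes from actual experiments.
    """
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(n_experiments):
        # Draw one plausible success rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior plus the observed counts).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Run the experiment that looks best under the sampled rates.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate the experiment outcome (assumption: Bernoulli outcomes).
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return successes, failures


if __name__ == "__main__":
    wins, losses = thompson_sampling([0.1, 0.2, 0.8], n_experiments=2000)
    for arm, (w, l) in enumerate(zip(wins, losses)):
        print(f"arm {arm}: {w + l} experiments, {w} successes")
```

Over time the sampler concentrates experiments on the arm that appears most promising while still occasionally testing the others. The multi-agent, privacy-preserving setting described above adds incentives and information sharing that this sketch does not model.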

One classic problem area of interest is experiment design. As an example, drug companies want to design a drug with certain properties, and each company has an AI to plan experiments on the efficacy of drugs. There is a plethora of compounds that may be useful, and not all can be tested. However, there exist large-scale databases of drug toxicity and activity. Each AI is able to use data from previous clients to plan better at a lower cost. The AIs can post drug descriptions, in vitro results, simulations for in silico experiments, and the results of clinical trials.

An important question in this context is how to align incentives so that the joint plan is beneficial to society (in terms of access to useful therapies), while simultaneously balancing the computational and human cost associated with designing, performing, and analyzing experiments. An additional challenge relates to privacy: not only individuals' concerns about their own data, but also the concerns of the AIs themselves. For example, in this competitive, market-based mechanism, an AI may seek to minimize the information it reveals in order to avoid a “ratchet effect”, where other AIs can take advantage of that information in the future.
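
As one illustration of limiting what a posted result reveals, the sketch below releases a noisy count of successful experiments using the Laplace mechanism from differential privacy. This is an assumption for illustration only: the description above does not commit to differential privacy, and the names `laplace_noise` and `private_success_count`, as well as the choice of `epsilon`, are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_success_count(outcomes, epsilon, rng):
    """Release the number of successful experiments with Laplace noise.

    Changing a single outcome changes the count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for outcome in outcomes if outcome)
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    outcomes = [True] * 30 + [False] * 70  # hypothetical in vitro results
    print(private_success_count(outcomes, epsilon=1.0, rng=rng))
```

A smaller `epsilon` adds more noise, so the posted count reveals less about any single experiment but is also less useful to the other agents. That trade-off is one way to think about the tension between sharing data and avoiding the ratchet effect.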

Keywords: experiment design, bandit problems, reinforcement learning

Published Sep. 25, 2019 16:33 - Last modified Sep. 25, 2019 16:33

Supervisor(s)

Christos Dimitrakakis Universitetet i Oslo
