TESTING OF DEPLOYABLE LEARNING AND
DECISION SYSTEMS
A workshop in conjunction with NIPS-2006
Aside from being important for adaptive and autonomous systems (for
example, in adapting to changing environments), learning and statistical
inference methods can, we believe, be successfully applied to the meta-problem of evaluating the above list of properties of a learning or decision system before it is deployed.
The first goal of this workshop is to explore the requirements and
risks for deployable learning and decision systems. We must consider a
large range of possible tasks such as autonomous navigation, supplier
and health management, flight control, decision support (interacting
with human experts), etc.
Next, we would like to understand how (possibly novel) learning methods can contribute
to the process of testing and evaluating learning and decision systems and what
techniques are needed to evaluate the decisions computed by complex (even human-
computer) systems.
Finally, this workshop will bring together researchers and users of learning and adaptive
systems to create a forum for discussing recent advances in validation and testing of
learning systems, to better understand the practical requirements for developing and
deploying learning systems, and to inspire research on new methods and techniques for
the testing and evaluation of learning.
Topics of interest include, but are not limited to:
If you have an issue or contribution that is not covered by the topics above, please contact Dragos Margineantu
by e-mail to discuss your idea prior to submitting a position paper.
The organizers will review the submissions with the goal of assembling
a stimulating and exciting workshop. Attendance will be limited to 40
people, with preference given to people who are presenting position
papers.
Whistler, British Columbia - December 8, 2006
Workshop Motivation and Description
Although testing has been an important consideration for some time, our
community has devoted little effort towards developing principled approaches for
assessing the efficacy of deployed systems. We need new approaches,
analysis tools, and metrics for: the quality of the learning model in the context of the
actual problem, the safety of decision systems in safety-critical tasks,
the effectiveness of online learning methods over an extended period
of time, the stability of online learning tasks, and the tradeoffs between robustness and risk needed in making complex decisions.
For reliable deployment and operation, learning and decision systems need to be
trustworthy to users who have little or no knowledge about learning (e.g., engineers,
designers, quality control specialists).
System failures can occur and will occur, regardless of whether those systems contain
online learning components, offline trained components, or hard-coded decision
components. Therefore, questions such as "what are the tradeoffs for improving the
quality of the outputs of a learning system in a certain region of the space?" or "what can
be inferred (regarding future decisions) from observing the operation of a learning system
in a simulated environment?" have deep ramifications and, if answered, can give
learning technology a greater impact on newly developed systems.
- statistical testing and validation of learned models
- metrics for the performance of learning systems
- definitions and metrics for stability in learning
- statistical and logical inference for validation purposes
- learning for safety-critical applications
- evaluation of online learning algorithms
- new approaches for trustable machine learning software development
- analysis of the robustness vs. risk tradeoff
- algorithms and tools for monitoring learning and adaptive systems
- deployed active and online learning tools
- novel problems and applications that require principled assessment of learning
- testbeds for the analysis and evaluation of learning systems
- design of experiments for the evaluation of learning.
Workshop Format
The workshop will have two sessions. Each session will start with an invited talk and will continue as a mix of
position paper presentations and discussions.
Participation and Submissions
To participate in the workshop, please send an e-mail message to Dragos Margineantu (dragos.d.margineantu@boeing.com) giving your name, affiliation, address, e-mail address,
and a brief description of your reasons for wanting to attend.
In addition, if you wish to present a position paper on one or more of
the topics listed above, please see the instructions on the
submissions page.
Important dates
Organizers