AAAI 2019 Notes

7 Friday February 1st
7.1 Reinforcement Learning
7.1.1 Diversity-Driven Hierarchical RL [36]
7.1.2 Towards Better Interpretability in DQN
7.1.3 On RL for Full-Length Game of StarCraft [28]
7.2 Reasoning under Uncertainty
7.2.1 Collective Online Learning of GPs in Multi-Agent Systems
7.2.2 Weighted Model Integration using Knowledge Compilation
7.2.3 Off-Policy Deep RL by Bootstrapping the Covariate Shift
7.2.4 Compiling Bayes Net Classifiers into Decision Graphs [35]
This document contains notes I took during the events I managed to make it to at AAAI in Honolulu, Hawaii, USA, including sessions of the Doctoral Consortium. Please feel free to distribute it and shoot me an email at david_abel@brown.edu if you find any typos or other items that need correcting.
1 Conference Highlights
AAAI was fantastic: the invited talks offered impressive videos, inspiring visions of the future, and excellent coverage of many areas, spanning game playing, learning, human-robot interaction, data management, and exciting applications. I also enjoyed the two evening events: 1) the 20 year roadmap for AI research in the US, and 2) the debate on the future of AI. Both events raised compelling questions for researchers and practitioners of AI alike.
I also want to highlight the doctoral consortium (DC). This was the first DC I have participated in; in short, I strongly encourage grad students to do at least one DC during their program. You will get exposure to some fantastic work being done by peers all around the world and receive tailored mentorship on your presentation skills, how you write, and your research objectives and methods more generally.
AAAI really struck me as doing a great job of mixing together many subfields that don't often spend as much time talking to one another: I met plenty of folks working in planning, constraint satisfaction, automated theorem proving, AI and society, and lots of ML/RL researchers.
A final point that was raised at the roadmap: naturally, a huge fraction of research/industry is concentrated on ML at the moment. But it's important that we continue to push the frontiers of knowledge forward across many different areas. So, if you're considering going into grad school soon, do consider pursuing other topics/areas beyond ML that offer fundamental and important questions (of which there are many!).
And that's that! Let's dive in.
2 Sunday January 27th: Doctoral Consortium
It begins! Today I'll be at the Doctoral Consortium (DC). My goal with the notes is both to give folks a sense of what a DC entails, and to share the exciting research of some great grad students.
2.1 Overview of the DC
I highly recommend doing a doctoral consortium at some point during grad school. I learned a huge amount from the experience.
For those that don't know, a DC involves preparing a short abstract summarizing your work, and giving a 10-20 minute presentation to your peers and their mentors. Each student participating is assigned a mentor (from their area) that helps with preparing your presentation and gives you more general advice on your research.
It was a great experience! I had the pleasure of meeting many wonderful grad students and hearing about their work.
2.2 Neeti Pokhriyal: Multi-View Learning From Disparate Sources for Poverty Mapping
Focus: Learning from multiple disparate data sources, applied to sustainability and biometrics.
Specific Application: Poverty mapping, a spatial representation of economic deprivations for a country, and a major tool for policy planners.
The current method is a household survey, which is 1) costly, 2) time consuming, and 3) only available for small samples.
Research Goal: Get accurate, spatially detailed, and diagnostic poverty maps for a country.
Lots of data is available via weather, street maps, economic data, mobile phones, and satellite imagery. But! Each of these data sources is structured very differently.
Definition 1 (Multi-View Learning): A style of learning that takes as input separate, semantically distinct kinds of data, and brings them together into a factorized representation for use in predictive models.
Method: learn a Gaussian Process (GP) Regression model combined with elastic net regularization. Using this model yields the map pictured in Figure 1. They then perform a quantitative analysis and validate that their model is making high quality predictions by comparison.
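To make the modeling concrete, here is a minimal sketch of one plausible reading of "GP regression combined with elastic net regularization": the elastic net selects informative features from the disparate sources, and a GP is then fit on those features. The pipeline, toy data, and hyperparameters below are my own assumptions, not details from the talk.

```python
# Hedged sketch: elastic-net feature selection followed by GP regression.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy stand-in for region-level features drawn from many sources (illustrative only).
X, y = make_regression(n_samples=200, n_features=30, n_informative=8,
                       noise=5.0, random_state=0)

# Elastic net for sparse feature selection.
enet = ElasticNetCV(l1_ratio=0.5, cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(np.abs(enet.coef_) > 1e-6)

# GP regression on the selected features; the predictive std gives the
# "diagnostic" uncertainty mentioned in the research goal.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X[:150, selected], y[:150])
mean, std = gp.predict(X[150:, selected], return_std=True)
print(f"selected {selected.size} features; mean predictive std: {std.mean():.2f}")
```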
Objective 2: learn a factorized representation from multiple data sources. The hope is that we can disentangle explanatory factors that are unique to each data source.
Figure 1: Higher fidelity poverty prediction (poverty map of Senegal).
Sort of an EM-like approach:
1. Learning Step: Map views $y$ and $z$ to shared subspaces $x_i$.
2. Inference Step: Perform inference on these subspaces.
Q: Main question, then: how do we learn the shared subspace?
A: Separation of data belonging to different classes across different views is maximized, while ensuring alignment of projections from each view onto the shared space. This can be solved as a generalized eigenvalue problem, or by using the kernel trick.
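Here is a small sketch of the generalized eigenvalue route to a shared subspace for two views, in the spirit of CCA; the toy data, regularization, and variable names are illustrative assumptions rather than the presenter's formulation.

```python
# Hedged sketch: align two views in a shared subspace via A w = lambda B w.
import numpy as np
from scipy.linalg import eigh, block_diag

rng = np.random.default_rng(0)
n, p, q, k = 500, 10, 8, 3            # samples, view dims, shared-subspace dim
z = rng.normal(size=(n, k))           # latent factors shared by both views
X = z @ rng.normal(size=(k, p)) + 0.1 * rng.normal(size=(n, p))  # view 1
Y = z @ rng.normal(size=(k, q)) + 0.1 * rng.normal(size=(n, q))  # view 2

Xc, Yc = X - X.mean(0), Y - Y.mean(0)
Cxx = Xc.T @ Xc / n + 1e-3 * np.eye(p)  # regularized covariances
Cyy = Yc.T @ Yc / n + 1e-3 * np.eye(q)
Cxy = Xc.T @ Yc / n

# Generalized eigenproblem whose top eigenvectors give projection directions
# aligning the two views in the shared subspace.
A = np.block([[np.zeros((p, p)), Cxy], [Cxy.T, np.zeros((q, q))]])
B = block_diag(Cxx, Cyy)
vals, vecs = eigh(A, B)                  # eigenvalues in ascending order
top = vecs[:, -k:]                       # directions for the k shared components
Wx, Wy = top[:p], top[p:]

shared_x, shared_y = Xc @ Wx, Yc @ Wy    # each view projected to the shared space
corr = [np.corrcoef(shared_x[:, i], shared_y[:, i])[0, 1] for i in range(k)]
print("per-component correlation between views:", np.round(corr, 3))
```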
2.3 Negar Hassanpour: Counterfactual Reasoning for Causal Effect Estimation
Problem: Consider Mr. Smith, who has a disease and some known properties (age, BMI, etc.). The doctor provides treatment X and observes the effect of treatment X, but does not get data about the counterfactual: what would have happened if the doc had applied treatment Y?
Goal: Estimate the "Individual Treatment Effect" (ITE): how does treatment X compare to Y?
Datasets:
• Randomized Controlled Trial (RCT): See lots of both X and Y. But, it's expensive (lots of trials) and unethical (giving placebos when you know the right treatment).
• Observational Study: provide the preferred treatment. But, sample selection bias.
Example: Treating heart disease, a doc prescribes surgery to younger patients and medication to older patients. Comparing survival time shows a clear bias in who gets what treatment.
This is a really fundamental problem called "sample selection bias": rich patients receiving expensive treatment vs. poor patients receiving cheap treatment, and so on.
Overview of this work:
• Generate realistic synthetic datasets for evaluating these methods (since good data is hard to come by).
  → Take an RCT and augment it with synthetic data.
• Use representation learning to reduce sample selection bias.
  → Want $\Pr(\Phi(x) \mid t = 0) \approx \Pr(\Phi(x) \mid t = 1)$, with $\Phi$ the learned representation and $t$ the treatment (a minimal sketch of this balancing idea appears after this list).
• Learn the underlying causal mechanism with generative models.
  → Learn causal relationships between treatments and outcomes by using generative models. Can we identify the latent sources of outcome from an observational dataset?
• Perform survival predictions.
  → Can we predict outcomes that are censored or take place after studies end?
• Going beyond binary treatments.
  → Many, but not all, treatments are binary. Can we go beyond this to categorical or real-valued treatments?
• Providing a course of treatment.
  → Call on reinforcement learning.
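To make the balancing objective above concrete, here is a minimal sketch that measures the mismatch between treated and control representations with a Gaussian-kernel MMD; the estimator choice and all names are my own assumptions, not the author's implementation.

```python
# Hedged sketch: penalize the distance between Phi(x) | t=0 and Phi(x) | t=1.
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Pairwise Gaussian kernel between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(phi_control: np.ndarray, phi_treated: np.ndarray, sigma: float = 1.0) -> float:
    """Squared maximum mean discrepancy between the two representation samples."""
    k_cc = rbf_kernel(phi_control, phi_control, sigma).mean()
    k_tt = rbf_kernel(phi_treated, phi_treated, sigma).mean()
    k_ct = rbf_kernel(phi_control, phi_treated, sigma).mean()
    return k_cc + k_tt - 2 * k_ct

# Toy usage: phi would be the representation network's output on a batch,
# split by the observed treatment indicator t.
rng = np.random.default_rng(0)
phi = rng.normal(size=(64, 16))
t = rng.integers(0, 2, size=64)
imbalance = mmd2(phi[t == 0], phi[t == 1])
# In training, this term would be added to the factual outcome loss, e.g.
# total_loss = outcome_loss + alpha * imbalance
print(f"representation imbalance (MMD^2): {imbalance:.4f}")
```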
2.4 Khimya Khetarpal: Learning Temporal Abstraction Across Action and Perception
Q: How should an AI agent efficiently represent, learn, and use knowledge of the world?
A: Let's use temporal abstractions!
Example: preparing breakfast. Lots of subtasks/activities involved, like (high level) choose eggs and type of toast, (mid level) chop vegetables and get butter, and (low level) wrist and arm movements.
Definition 2 (Options [37]): An option formalizes a skill/temporally extended action as a triple $(I, \pi, \beta)$, where $I \subseteq S$ is an initiation set, $\beta : S \to [0, 1]$ is a termination probability, and $\pi : S \to A$ is a policy.
Example: A robot navigates through a house between two rooms. To do so, it has to open a door. We let $I$ denote the states where the door is closed, $\beta$ is 1 when the door is open and 0 otherwise, and $\pi$ opens the door. Then, this option defines the "open the door" skill.
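As a concrete reading of Definition 2, here is a small sketch of the option tuple as a data structure together with the "open the door" example; the state representation and names are illustrative assumptions.

```python
# Hedged sketch: the option tuple (I, pi, beta) as a data structure.
from dataclasses import dataclass
from typing import Callable, Hashable

State = Hashable
Action = Hashable

@dataclass
class Option:
    initiation: Callable[[State], bool]     # I: can the option start in s?
    policy: Callable[[State], Action]       # pi: which action to take in s
    termination: Callable[[State], float]   # beta: probability of stopping in s

# "Open the door" skill: here a state is a dict with a 'door_open' flag.
open_door = Option(
    initiation=lambda s: not s["door_open"],               # I: door is closed
    policy=lambda s: "open_door",                          # pi: open the door
    termination=lambda s: 1.0 if s["door_open"] else 0.0,  # beta: stop once open
)

s = {"door_open": False}
if open_door.initiation(s):
    print(open_door.policy(s))                      # -> "open_door"
print(open_door.termination({"door_open": True}))   # -> 1.0
```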
Main Question: Can we learn useful temporal abstractions?
Hypothesis: Learning options which are specialized in situations of specific interest can be used to get the right temporal abstractions.
Motivation: AI agents should be able to learn and develop skills continually, hierarchically, and incrementally over time.
So, imagine we had a house decomposed into different rooms. Then we would like to learn skills that take the agent between each room. Further, these skills should be able to transfer from one agent to another.
Objective 1: Learn options and interest functions simultaneously.
New idea: break the option-critic assumption [2] that $I = S$. Instead, consider an interest function:
Definition 3 (Interest Function): An interest function is an indication of the extent to which an option is interested in state $s$.
Now learn a policy over options and an interest function; we can jointly optimize over both things.
Derive the policy gradient theorem for interest functions, the intra-option policy, and the termination function.
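One way to picture how an interest function enters is as a reweighting of the policy over options; the sketch below is an assumption about the form of that reweighting, not code from the talk.

```python
# Hedged sketch: interest-weighted policy over options in a given state s.
import numpy as np

def interest_weighted_policy(pi_omega: np.ndarray, interest: np.ndarray) -> np.ndarray:
    """pi_omega[o]: base policy-over-options probability in state s;
    interest[o]: interest of option o in state s (in [0, 1])."""
    weighted = interest * pi_omega
    return weighted / weighted.sum()

# Toy usage: 3 options, where option 2 is barely interested in this state.
pi_omega = np.array([0.3, 0.3, 0.4])
interest = np.array([0.9, 0.8, 0.05])
print(interest_weighted_policy(pi_omega, interest))
# Learning would then optimize the interest functions jointly with the options,
# e.g. via policy-gradient updates through this reweighting.
```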
Theme I: Learning options with interest functions
Figure 2: Learned interest functions
They also explore learning interest functions in continuous control tasks, showing nice separation between the learned options.
Objective 2: Consider a never-ending stream of perceptual data. We'd like to learn a stream of percepts and behavior over time.
Challenges:
• How can the agent automatically learn features which are meaningful pseudo-rewards?
• Where do task descriptions come from?
• How can we achieve the most general options without hand designing tasks/rewards?
• Evaluation in a lifelong learning task? Benchmarks?
2.5 Ana Valeria Gonzalez-Garduño: RL for Low Resource Dialogue Systems
Goal 1: Create more informed approaches to dialogue generation.
Goal 2: Use RL for domain adaptation in goal oriented dialogue.
And: can we do this in a language agnostic way? So, introduce models that can work with any/many language(s).
Dialogue systems are divided into two subfields:
1. Open ended dialogue generation: typically uses encoder-decoder architectures.
2. Goal oriented dialogue: predominantly tackled using "pipeline" methods. So, an automatic speech recognition unit, then an understanding unit, and so on.
Current Focus: "state tracking". That is, state tracking deals with inferring the user intent or belief state during the conversation.
But, a limitation: intents usually rely on a particular ontology that defines which intents are valid.
Current Status of the Project: Bridge the gap in goal oriented dialogue. Main goal: can we get rid of the need for annotations?
General idea: given a bot's utterance ("How can I help?") and a user response ("I want to change payment methods"), we want to find a relevant query from prior conversations to identify what the user said. Or really, use it to condition the decoder.
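As a concrete (and assumed) stand-in for this retrieval step, here is a minimal sketch that picks the most relevant prior query via TF-IDF similarity; the example conversations and the similarity choice are illustrative only, not the author's system.

```python
# Hedged sketch: retrieve a relevant prior query to condition the decoder on.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

prior_queries = [
    "I would like to update my billing details",
    "Can you help me track my order",
    "I want to cancel my subscription",
]

vectorizer = TfidfVectorizer().fit(prior_queries)
prior_vecs = vectorizer.transform(prior_queries)

user_response = "I want to change payment methods"
sims = cosine_similarity(vectorizer.transform([user_response]), prior_vecs)[0]
retrieved = prior_queries[sims.argmax()]
print(retrieved)  # the retrieved query would then condition the decoder
```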
Result: this model works very well! On BLEU their model performs favorably, but more importantly, on a human evaluation, their responses were consistently chosen over the baseline.
Q: But, what if our domain is not in the pool of relevant conversations?
A: Work in progress! Idea: use RL.
1. Phase 1: Use existing data for state tracking; pretrain models in a supervised manner.
   → Turn level supervision, with slots and values represented using word embeddings.
2. Phase 2: Use RL to finetune the pretrained model.
   → Rely on dialogue level supervision (joint goal accuracy) as reward. So, how many slot values (e.g., "Food=Mexican, Price=Cheap") are correct determines the reward (a small sketch of such a reward follows below).
Challenges in using RL for state tracking: dialogue is long (credit assignment is hard!), sample efficiency; might be able to leverage curriculum learning.
Main Future Direction: Enable the dialogue state transition model to generate new unseen slots.
2.6 AAAI Tutorial: Eugene Freuder on How to Give a Talk
Start with an example! Or a counterexample.
These are just his conclusions! So decide for yourself, of course.
This talk is not intended to be mean spirited: he'll be talking about mistakes people make.
Meta-message: presenting a talk is a skill that can be studied and practiced! And it's worth doing: we spend years researching and 10 minutes presenting. The 10 minutes should be polished.
Six points:
1. Convey enthusiasm
2. Make it easy to follow
3. Employ examples
4. Be expressive
5. Enhance your presentation with visuals/dynamic material
6. Engage the audience
2.6.1 Enthusiasm
The secret of a good talk: Enthusiasm!
If you're not enthusiastic about your work, how do you expect anyone else to be?
Fear of public speaking: glossophobia, ranked as the most common fear in the USA (more so than spiders/death).