RomanLyapin MAB18 Abstract (PDF)

Deep architectures for Bayesian optimization
Roman Lyapin
February 14, 2018
One of the main challenges involved in running an e-commerce website lies in developing
and maintaining the relevance of its content. The typical approach to this
problem involves A/B testing each new feature or deployed model to prove that it improves
the customer experience. However, such a setting may not be optimal for non-binary features or
interactions between different website components, because we have to split the traffic between
all considered variants, which hurts the power of the statistical tests. An alternative
would be a more efficient dynamic traffic allocation, where we focus more on the variants
that initially perform better.
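As a rough illustration of dynamic traffic allocation, the sketch below uses Beta-Bernoulli Thompson sampling, a standard bandit technique that the abstract itself does not name. The function name `thompson_allocate`, the conversion rates, and the visitor count are all hypothetical and serve only to show how traffic drifts toward the variants that perform better.

```python
import random

def thompson_allocate(true_rates, n_visitors, seed=0):
    """Assign visitors to variants with Beta-Bernoulli Thompson sampling.

    true_rates are hypothetical ground-truth conversion rates, used here
    only to simulate visitor responses. Returns visitors per variant.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_visitors):
        # Draw one plausible conversion rate per variant from its Beta posterior
        # (uniform Beta(1, 1) prior), then show the variant with the best draw.
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        arm = max(range(k), key=lambda i: draws[i])
        # Simulate the visitor's binary response.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [successes[i] + failures[i] for i in range(k)]

# Three hypothetical variants; the third one converts best.
counts = thompson_allocate([0.05, 0.10, 0.30], n_visitors=5000)
print(counts)
```

Unlike a fixed equal split, the allocation here is adaptive: weak variants receive only as much traffic as is needed to rule them out.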
This formulation fits nicely within the exploration-exploitation paradigm, and we can try to
apply methods from Bayesian optimization to approach it. The benefits we get from that
include an essentially non-parametric treatment of our data and analytical expressions
for our posteriors and decision rules (see Srinivas, 2010 or Snoek, 2012). The
downsides involve the high computational complexity of the algorithms and the overall difficulty of
choosing a reasonable prior over functions, i.e. the kernel and hyperparameter choices in the
underlying Gaussian process (GP), especially for multidimensional domains.
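The analytical posterior and decision rule mentioned above can be sketched as follows, assuming a squared-exponential kernel with fixed hyperparameters and an upper-confidence-bound rule of the form mean + sqrt(beta) * std. The kernel choice, the values of `lengthscale`, `noise` and `beta`, and the toy observations are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Closed-form GP posterior mean and standard deviation at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    # Cholesky factorization: the O(n^3) step behind the computational cost.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def ucb_next_point(x_train, y_train, candidates, beta=4.0):
    """Pick the candidate that maximizes mean + sqrt(beta) * std."""
    mean, std = gp_posterior(x_train, y_train, candidates)
    return candidates[np.argmax(mean + np.sqrt(beta) * std)]

# Toy observations (hypothetical values).
x_train = np.array([0.1, 0.5, 0.9])
y_train = np.array([0.2, 1.0, 0.3])
candidates = np.linspace(0.0, 1.0, 101)
mean, std = gp_posterior(x_train, y_train, candidates)
x_next = ucb_next_point(x_train, y_train, candidates)
print(x_next)
```

The lengthscale and variance are fixed by hand here, which is exactly the prior-selection difficulty the paragraph refers to: different values can change which point the rule proposes next.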
Several recent papers approach the latter issue by proposing new ways to infer from data
wider classes of kernels underlying GPs, or the GPs themselves. One such example involves stacking
several basic GPs on top of each other (Damianou and Lawrence, 2012). Others follow
the seminal result (Neal, 1994) linking GPs with infinite-width neural networks and examine
potential kernels for more practical networks with less width and more depth (Lee,
2017), while some (Wilson, 2016) combine the two approaches. The presented work examines
whether these new GP inference methods help to improve our performance during Bayesian
optimization.
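To make the link between neural networks and GP kernels more tangible, here is a minimal sketch of a kernel induced by a deep, fully connected ReLU network in the infinite-width limit, computed layer by layer with the arc-cosine recursion. The ReLU nonlinearity, the weight and bias variances `sigma_w2` and `sigma_b2`, and the depth are assumptions; the abstract does not say which architecture or parametrization was used.

```python
import math

def relu_nngp_kernel(x, y, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    """Kernel of an infinitely wide ReLU network with `depth` hidden layers.

    Assumes x and y are non-zero vectors of equal length (or sigma_b2 > 0),
    so that the normalization below never divides by zero.
    """
    d = len(x)

    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    # Input-layer covariances.
    kxx = sigma_b2 + sigma_w2 * dot(x, x) / d
    kyy = sigma_b2 + sigma_w2 * dot(y, y) / d
    kxy = sigma_b2 + sigma_w2 * dot(x, y) / d
    for _ in range(depth):
        norm = math.sqrt(kxx * kyy)
        cos_t = max(-1.0, min(1.0, kxy / norm))  # clamp against rounding error
        theta = math.acos(cos_t)
        # Expected product of ReLU activations under the previous layer's GP.
        kxy = sigma_b2 + sigma_w2 / (2 * math.pi) * norm * (
            math.sin(theta) + (math.pi - theta) * cos_t
        )
        # Diagonal terms: E[relu(z)^2] = K / 2 for z ~ N(0, K).
        kxx = sigma_b2 + sigma_w2 * kxx / 2
        kyy = sigma_b2 + sigma_w2 * kyy / 2
    return kxy

k_same = relu_nngp_kernel([1.0, 0.0], [1.0, 0.0])
k_orth = relu_nngp_kernel([1.0, 0.0], [0.0, 1.0])
print(k_same, k_orth)
```

In this sketch depth enters only through the number of recursion steps, so the same function yields a family of kernels indexed by depth that could be plugged into a standard GP.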
To keep it concrete and manageable, we reduce the scope of the original, bigger problem
to optimizing with respect to a 1D signal with a moving seasonal component. Without
any further refinements, such a setting breaks the assumptions underlying standard 1D Bayesian
optimization and forces us to operate in a 2D domain. We compare the performance we get on
synthetic problems using deep architectures against baselines that fit a separate 1D signal
for each season using standard kernels, and consider possible applications using
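The 2D setting and the per-season baseline can be contrasted with a small sketch. Everything in it is an assumption made for illustration: the synthetic objective, the number of seasons `PERIOD`, and the use of a product of a squared-exponential kernel in x and a periodic kernel in the season. It stands in for a standard-kernel treatment of the 2D domain and is not one of the deep architectures the abstract evaluates.

```python
import numpy as np

PERIOD = 4  # hypothetical number of seasons per cycle

def objective(x, season):
    """Synthetic 1D signal whose optimum drifts with the season."""
    center = 0.5 + 0.3 * np.sin(2 * np.pi * season / PERIOD)
    return np.exp(-0.5 * ((x - center) / 0.1) ** 2)

def joint_kernel(p, q, ls_x=0.2, ls_s=1.0):
    """Product kernel on (x, season): RBF in x times periodic in season."""
    dx = p[:, None, 0] - q[None, :, 0]
    ds = p[:, None, 1] - q[None, :, 1]
    k_x = np.exp(-0.5 * (dx / ls_x) ** 2)
    k_s = np.exp(-2.0 * np.sin(np.pi * ds / PERIOD) ** 2 / ls_s ** 2)
    return k_x * k_s

def per_season_kernel(p, q, ls_x=0.2):
    """Baseline: independent 1D RBF models, so no covariance across seasons."""
    dx = p[:, None, 0] - q[None, :, 0]
    same = (p[:, None, 1] % PERIOD) == (q[None, :, 1] % PERIOD)
    return np.exp(-0.5 * (dx / ls_x) ** 2) * same

# Same x, observed in season 0, season 1, and season 4 (= season 0 again).
points = np.array([[0.5, 0.0], [0.5, 1.0], [0.5, 4.0]])
K_joint = joint_kernel(points, points)
K_base = per_season_kernel(points, points)
print(K_joint.round(3))
print(K_base.round(3))
```

The joint kernel lets an observation in one season inform neighbouring seasons, while the baseline shares information only between observations from the same season.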

