Katie Han Thesis Final.pdf

Preview of PDF document katiehanthesisfinal.pdf

Page 1 2 3 4 5 6 7 8 9 10 11 12 13

Text preview

Related Work
Many browser extensions and tools have emerged in the past to allow users to annotate, clip,
and save content from the web. Some examples include Diigo [3], Hypothesis [4], Evernote
Web Clipper [5], and SurfMark [6], all of which serve as bookmarking tools and focus on
facilitating education and research. In particular, many of these platforms provide the ability to
highlight and annotate web pages, followed by a way to organize and share the clipped
resources. Pinterest’s browser button [7] encourages users to clip web content and share them
with the Pinterest community in a social context. However, all of these extensions rely on the
web browser’s basic selection capabilities to extract information or directly save the entire URL
to a web page instead of arranging more intuitive interactions for the users. On the other hand,
Microsoft Edge [8] contains a built-in feature for drawing and writing notes on the web page,
catered towards consumers of pen and touch screen devices. However, this functionality simply
grabs a screenshot of the selection and does not keep any references to the original HTML
structure of the web page.
PDF documents are also often opened and viewed on standard web browsers, facing the same
shortcomings when a user attempts to select text, tables, or images in the document.
Alternatively, other standalone software provides more flexibility in the tasks that can be carried
out on PDF documents. For instance, Adobe Acrobat DC [9] offers a variety of features,
including editing, signing, converting, and sharing documents. However, fine-grained selections
based on natural gestures, especially those from pen and touch interactions, are not optimized.
Extraction of content occurs mostly on the grid level or by clicking and dragging lines of text.
Several applications cater more to highlighting, annotating, and even drawing on PDF
documents—namely Drawboard PDF [10] on Windows and Preview [11] on Mac OS. Yet, little
work has been done to support the workflow of selecting specific regions of a document
intuitively and extracting the information from those regions immediately.

Before the start of this project, cTed presented a proof of concept for three intuitive selection
gestures: line, bracket, and marquee. Line selection, the simplest gesture of all, finds the
underlying elements in a straight line from the start point to the end point. Bracket gesture is a
vertical line along the left side of the targeted fragment, much like how one would mark a
paragraph on paper with pen. The software intelligently identifies the elements that lie within the
given “paragraph” and capture the section that the user intended to select. Lastly, marquee
selection allows the users to draw an arbitrary rectangular box by dragging diagonally from top
left to bottom right. All elements within the bounds are selected.
Lasso is another selection gesture that has been explored in the past by the cTed team.
Commonly used in pen and touch devices, the lasso mechanism allows users to draw freehand
selection borders to extract elements from. Based on the findings of my user study, I decided to
pursue the implementation of a lasso selection in addition to the three existing selection
gestures. Below I illustrate in depth how these four selections were implemented and modified