Expressive Time Series Querying with Hand-Drawn Scale-Free Sketches

CHI, pp. 1-13, 2018.

Keywords:
Time series querying by sketching; shape definition language; normalized discounted cumulative gain; dynamic time warping; world counterpart

Abstract:

We present Qetch, a tool where users freely sketch patterns on a scale-less canvas to query time series data without specifying query length or amplitude. We study how humans sketch time series patterns (humans preserve visually salient perceptual features but often non-uniformly scale and locally distort a pattern) and we develop a...

Introduction
  • The human ability to describe complex objects with hand-drawn sketches and recognize them predates the ability to do so with language.
  • Many search interfaces have capitalized on this ability, providing users with intuitive sketching interfaces: users sketch their object of interest, be it an image, a 3D-model or a chart pattern, and a matching algorithm finds similar objects in an image, model or time series database [11, 37, 28]
  • Such querying by sketching systems assume that in a “well-engineered feature space, sketched objects resemble their real-world counterparts [10]”.
  • With Qetch, the authors adopt a top-down design approach in which the interface design choices lead them to develop a novel matching algorithm that tolerates the absence of time and amplitude scales on a sketch.
Highlights
  • Our ability to describe complex objects with hand-drawn sketches and easily recognize them predates our ability to do so with language
  • This fundamental assumption is often violated: “most humans are not faithful artists [10].” While simple stick-figures or cartoon-like sketches drastically differ from their real-world counterparts, differences between a sketched chart pattern and actual time series data are generally perceived as less drastic
  • After various experiments with different effectiveness settings for dynamic time warping [33], we find that dynamic time warping with time and amplitude scaling, offset shifting and an unlimited warping window performs best for sketch to time-series matching
  • Qetch’s Performance on Search Tasks: We evaluate Qetch’s performance on two types of search tasks: (1) targeted search, where users search the data set for a specific region, and (2) exploratory search, where users search for several regions in the data set that match their sketch
  • We evaluated the effectiveness of dynamic time warping and Qetch with the popular normalized discounted cumulative gain (NDCG) measure [7, 21]
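The NDCG measure mentioned in the last highlight can be illustrated with a minimal sketch. It assumes the common formulation in which the gain at rank r is discounted by log2(r + 1), and it assumes the ideal ordering is obtained by sorting the same list of relevance grades; the function names `dcg` and `ndcg` are illustrative and not taken from the paper or its source code.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """DCG of the top-k results divided by the DCG of an ideal ordering.

    Assumption: the ideal ordering is the same list of grades sorted in
    descending order, i.e. every relevant item appears in `relevances`.
    """
    ranked = relevances[:k] if k is not None else relevances
    ideal = sorted(relevances, reverse=True)[: len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical relevance grades for the results returned by one sketch.
    print(ndcg([3, 2, 1]))  # results already in ideal order
    print(ndcg([1, 2, 3]))  # best match ranked last
```

A score of 1 means the results are in the best possible order; lower scores mean relevant regions appear further down the result list.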
Methods
  • Participants & Methods: The authors recruited 20 university students and researchers (10 male, 10 female; 17 students, 3 researchers) to evaluate the effectiveness of Qetch’s time series querying features.
  • Six of the subjects had used tools to query or explore time series data (e.g. Google Analytics, matplotlib, Google Charts, Python, STATA).
  • Users mostly directed the hands-on play time themselves, but the authors asked them to attempt at least two sample query tasks.
  • During this play time, the authors answered any questions users had about the tool.
Results
  • A User Study of Qetch’s Interaction Features: To evaluate Qetch’s user interface, the authors conducted a within-subjects comparative user study of Qetch’s novel time series querying features: (i) regular expressions for querying repeated patterns and for anomaly detection versus no regular expressions, and (ii) relative positioning of sketches for querying across multiple data sets versus specifying order constraints over sketches.
  • The authors' goal was to observe query completion times on assigned querying tasks to objectively determine whether Qetch’s features improved querying.
  • The authors elicited user preferences for the degree of smoothing at which queries should be evaluated and results presented.
Conclusion
  • The authors introduced Qetch, a query-by-sketch tool for time series data.
  • The authors conducted a crowd study to learn how humans sketch time series.
  • The authors observed that participants often preserve and exaggerate the visually salient features of the reference time series they are sketching.
  • The authors designed Qetch’s matching algorithm to consider and tolerate such distortions.
  • Qetch’s sketch-centric design is powerful and expressive: through sketch annotations, users can effectively construct complex regular-expression queries and queries over multiple time-aligned series.
  • The authors publicly release the crowd-sourced data set of sketches and source code [34].
Tables
  • Table 1: Samples of sketches drawn by crowd workers for each marked region in Figure 1. Most sketches resemble the first three rows. A few sketches are similar to those from the last row: these valid sketches are seemingly poor; they miss or add extraneous features, are incomplete or slightly disagree with the reference region. Note that DTW ranks the reference region in its top 15 results when queried with sketches from the last row.
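The DTW ranking referred to in the caption can be illustrated roughly as follows. This is a minimal sketch of a generic DTW baseline, not Qetch's matching algorithm or the authors' exact DTW configuration. It assumes z-normalization as a stand-in for amplitude scaling and offset shifting, a set of candidate window lengths as a stand-in for time scaling, and no warping-window constraint; the names `znorm`, `dtw_distance` and `rank_windows` are hypothetical.

```python
import math


def znorm(series):
    """Rescale to zero mean and unit variance (offset and amplitude invariance)."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with no warping-window constraint."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[len(b)])


def rank_windows(sketch, series, window_lengths, top_k=5):
    """Rank sliding windows of several lengths by DTW distance to the sketch."""
    query = znorm(sketch)
    scored = []
    for length in window_lengths:
        for start in range(len(series) - length + 1):
            window = znorm(series[start:start + length])
            scored.append((dtw_distance(query, window), start, length))
    scored.sort()
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical data: a small peak sketched at half the amplitude of the data.
    sketch = [0, 1, 2, 1, 0]
    series = [0, 0, 0, 0, 0, 2, 4, 2, 0, 0, 0, 0]
    for distance, start, length in rank_windows(sketch, series, [5, 7], top_k=3):
        print(f"start={start} length={length} distance={distance:.3f}")
```

Each result is a `(distance, start, length)` tuple, so the position of a known reference region in the returned list gives its rank for a given sketch.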
Related work
  • Our work draws from time series research on UI design, query specification techniques and query languages, as well as matching algorithms from the database community. We describe these prior works and how they influence or differ from Qetch.

    Time Series Query Specification: Qetch allows users to query time series data with freeform sketches on an empty canvas. We broadly classify prior query specification techniques into sketch-less querying and constrained sketching where users have to (a) overlay sketches over visualizations of time series data, (b) draw shape-restricted query segments, or (c) annotate sketches with amplitude or time scales. As we explain later, such constrained sketching is usually an artifact of the underlying matching algorithm that requires a specification of time or amplitude ranges. Motivated by user experience, Qetch’s interface and matching algorithm support free-form sketching without constraints.
Funding
  • This work was partially supported by NSF IIS-1420941
Reference
  • Rakesh Agrawal, Giuseppe Psaila, Edward L. Wimmers, and Mohamed Zaït. 1995. Querying Shapes of Histories. In Proceedings of the 21th International Conference on Very Large Data Bases (VLDB ’95). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 502–514.
  • Gustavo E.A.P.A. Batista, Xiaoyue Wang, and Eamonn J. Keogh. 2011. A Complexity-Invariant Distance Measure for Time Series. In Proceedings of the 2011 SIAM International Conference on Data Mining. 699–710. DOI: http://dx.doi.org/10.1137/1.9781611972818.60
  • Donald J. Berndt and James Clifford. 1994. Using Dynamic Time Warping to Find Patterns in Time Series. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining (AAAIWS’94). AAAI Press, 359–370.
  • Y. Chen, M. A. Nascimento, B. C. Ooi, and A. K. H. Tung. 2007. SpADe: On Shape-based Pattern Detection in Streaming Time Series. In 2007 IEEE 23rd International Conference on Data Engineering. 786–795. DOI:http://dx.doi.org/10.1109/ICDE.2007.367924
  • P. Cortez, M. Rio, M. Rocha, and P. Sousa. 2006. Internet Traffic Forecasting using Neural Networks. In The 2006 IEEE International Joint Conference on Neural Network Proceedings. 2635–2642. DOI: http://dx.doi.org/10.1109/IJCNN.2006.247142
  • Nick Craswell. 2009. Mean Reciprocal Rank. Springer US, Boston, MA, 1703–1703. DOI: http://dx.doi.org/10.1007/978-0-387-39940-9_488
  • W Bruce Croft, Donald Metzler, and Trevor Strohman. 2010. Search engines: Information retrieval in practice. Vol. 283. Addison-Wesley Reading.
  • Hui Ding, Goce Trajcevski, Peter Scheuermann, Xiaoyue Wang, and Eamonn Keogh. 2008. Querying and Mining of Time Series Data: Experimental Comparison of Representations and Distance Measures. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1542–1552. DOI: http://dx.doi.org/10.14778/1454159.1454226
  • Philipp Eichmann and Emanuel Zgraggen. 2015. Evaluating Subjective Accuracy in Time Series Pattern-Matching Using Human-Annotated Rankings. In Proceedings of the 20th International Conference on Intelligent User Interfaces (IUI ’15). ACM, New York, NY, USA, 28–37. DOI: http://dx.doi.org/10.1145/2678025.2701379
  • Mathias Eitz, James Hays, and Marc Alexa. 2012. How Do Humans Sketch Objects? ACM Trans. Graph. 31, 4, Article 44 (July 2012), 10 pages. DOI: http://dx.doi.org/10.1145/2185520.2185540
  • Mathias Eitz, Kristian Hildebrand, Tamy Boubekeur, and Marc Alexa. 2009. PhotoSketch: A Sketch Based Image Query and Compositing System. In SIGGRAPH 2009: Talks (SIGGRAPH ’09). ACM, New York, NY, USA, Article 60, 1 pages. DOI: http://dx.doi.org/10.1145/1597990.1598050
  • Philippe Esling and Carlos Agon. 2012. Time-series Data Mining. ACM Comput. Surv. 45, 1, Article 12 (Dec. 2012), 34 pages. DOI: http://dx.doi.org/10.1145/2379776.2379788
  • Ada Wai-Chee Fu, Eamonn Keogh, Leo Yung Lau, Chotirat Ann Ratanamahatana, and Raymond Chi-Wing Wong. 2008. Scaling and Time Warping in Time Series Querying. The VLDB Journal 17, 4 (July 2008), 899–921. DOI:http://dx.doi.org/10.1007/s00778-006-0040-z
  • Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley. 2000. Physiobank, Physiotoolkit, and Physionet Components of a New Research Resource for Complex Physiologic Signals. Circulation 101, 23 (2000), e215–e220.
  • Keith W Hipel and A Ian McLeod. 1994. Time series modelling of water resources and environmental systems. Vol. 45. Elsevier.
  • Harry Hochheiser. 2002. Interactive Querying of Time Series Data. In CHI ’02 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’02). ACM, New York, NY, USA, 552–553. DOI: http://dx.doi.org/10.1145/506443.506477
  • Harry Hochheiser and Ben Shneiderman. 2001. Interactive Exploration of Time Series Data. In Proceedings of the 4th International Conference on Discovery Science (DS ’01). Springer-Verlag, London, UK, UK, 441–446.
  • Harry Hochheiser and Ben Shneiderman. 2004. Dynamic Query Tools for Time Series Data Sets: Timebox Widgets for Interactive Exploration. Information Visualization 3, 1 (March 2004), 1–18. DOI: http://dx.doi.org/10.1145/993176.993177
  • Donald D Hoffman and Manish Singh. 1997. Salience of visual parts. Cognition 63, 1 (1997), 29 – 78. DOI: http://dx.doi.org/10.1016/S0010-0277(96)00791-3
  • Christian Holz and Steven Feiner. 2009. Relaxed Selection Techniques for Querying Time-series Graphs. In Proceedings of the 22Nd Annual ACM Symposium on User Interface Software and Technology (UIST ’09). ACM, New York, NY, USA, 213–222. DOI: http://dx.doi.org/10.1145/1622176.1622217
  • Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated Gain-based Evaluation of IR Techniques. ACM Trans. Inf. Syst. 20, 4 (Oct. 2002), 422–446. DOI: http://dx.doi.org/10.1145/582415.582418
  • Eamonn Keogh. 2003. Efficiently Finding Arbitrarily Scaled Patterns in Massive Time Series Databases. Springer Berlin Heidelberg, Berlin, Heidelberg, 253–265. DOI:http://dx.doi.org/10.1007/978-3-540-39804-2_24
  • Eamonn Keogh. 2008. Indexing and Mining Time Series Data. Springer US, Boston, MA, 493–497. DOI: http://dx.doi.org/10.1007/978-0-387-35973-1_598
  • Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani. 2004. Segmenting time series: A survey and novel approach. Data mining in time series databases 57 (2004), 1–22.
  • Eamonn Keogh and Shruti Kasetty. 2002. On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’02). ACM, New York, NY, USA, 102–111. DOI: http://dx.doi.org/10.1145/775047.775062
  • Eamonn Keogh and Padhraic Smyth. 1997. A Probabilistic Approach to Fast Pattern Matching in Time Series Databases. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD’97). AAAI Press, 24–30.
  • Eamonn Keogh, Li Wei, Xiaopeng Xi, Michail Vlachos, Sang-Hee Lee, and Pavlos Protopapas. 2009. Supporting Exact Indexing of Arbitrarily Rotated Shapes and Periodic Time Series Under Euclidean and Warping Distance Measures. The VLDB Journal 18, 3 (June 2009), 611–630. DOI: http://dx.doi.org/10.1007/s00778-008-0111-4
  • Vladimir G. Kim, Wilmot Li, Niloy J. Mitra, Stephen DiVerdi, and Thomas Funkhouser. 2012. Exploring Collections of 3D Models Using Fuzzy Correspondences. ACM Trans. Graph. 31, 4, Article 54 (July 2012), 11 pages. DOI:http://dx.doi.org/10.1145/2185520.2185550
  • Nicholas Kong and Maneesh Agrawala. 2009. Perceptual Interpretation of Ink Annotations on Line Charts. In Proceedings of the 22Nd Annual ACM Symposium on User Interface Software and Technology (UIST ’09). ACM, New York, NY, USA, 233–236. DOI: http://dx.doi.org/10.1145/1622176.1622219
  • Jefrey Lijffijt, Panagiotis Papapetrou, Jaakko Hollmén, and Vassilis Athitsos. 2010. Benchmarking Dynamic Time Warping for Music Retrieval. In Proceedings of the 3rd International Conference on PErvasive Technologies Related to Assistive Environments (PETRA ’10). ACM, New York, NY, USA, Article 59, 7 pages. DOI: http://dx.doi.org/10.1145/1839294.1839365
  • George B. Moody and M.D. Ary L. Goldberger. 2017. MIT-BIH Heart Rate Time Series. (July 2017). http://ecg.mit.edu/time-series/
  • Jeffrey P. Morrill. 1998. Distributed Recognition of Patterns in Time Series Data. Commun. ACM 41, 5 (May 1998), 45–51. DOI: http://dx.doi.org/10.1145/274946.274955
  • Abdullah Mueen and Eamonn Keogh. 2016. Extracting Optimal Performance from Dynamic Time Warping. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). ACM, New York, NY, USA, 2129–2130. DOI:http://dx.doi.org/10.1145/2939672.2945383
  • New York University Abu Dhabi, Design Technology Lab. 2018. Qetch source code repository. (January 2018). https://github.com/dtl-nyuad/qetch
  • Donald A. Norman. 2002. The Design of Everyday Things. Basic Books, Inc., New York, NY, USA.
  • Central Bank of Iceland. 2016. Insurance companies statistics. (Jun 2016). http://www.cb.is/statistics/statistics/2016/08/15/Insurance-companies/
  • Maks Ovsjanikov, Wilmot Li, Leonidas Guibas, and Niloy J. Mitra. 2011. Exploration of Continuous Variability in Collections of 3D Shapes. ACM Trans. Graph. 30, 4, Article 33 (July 2011), 10 pages. DOI: http://dx.doi.org/10.1145/2010324.1964928
  • Sanghyun Park, Sang-Wook Kim, and Wesley W. Chu. 2001. Segment-based Approach for Subsequence Searches in Sequence Databases. In Proceedings of the 2001 ACM Symposium on Applied Computing (SAC ’01). ACM, New York, NY, USA, 248–252. DOI: http://dx.doi.org/10.1145/372202.372334
  • Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, and Eamonn Keogh. 2012. Searching and Mining Trillions of Time Series Subsequences Under Dynamic Time Warping. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12). ACM, New York, NY, USA, 262–270. DOI: http://dx.doi.org/10.1145/2339530.2339576
  • Kathy Ryall, Neal Lesh, Tom Lanning, Darren Leigh, Hiroaki Miyashita, and Shigeru Makino. 2005. QueryLines: Approximate Query for Visual Browsing. In CHI ’05 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’05). ACM, New York, NY, USA, 1765–1768. DOI: http://dx.doi.org/10.1145/1056808.1057017
  • Martin Wattenberg. 2001. Sketching a Graph to Query a Time-series Database. In CHI ’01 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’01). ACM, New York, NY, USA, 381–382. DOI: http://dx.doi.org/10.1145/634067.634292
  • M Wertheimer. 1938. Laws of organization in perceptual forms (partial translation). A Sourcebook of Gestalt Psychology (1938), 71–88.
  • Yunyue Zhu and Dennis Shasha. 2003. Warping Indexes with Envelope Transforms for Query by Humming. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD ’03). ACM, New York, NY, USA, 181–192. DOI: http://dx.doi.org/10.1145/872757.872780
  • Kostas Zoumpatianos, Stratos Idreos, and Themis Palpanas. 2015. RINSE: Interactive Data Series Exploration with ADS+. Proc. VLDB Endow. 8, 12 (Aug. 2015), 1912–1915. DOI: http://dx.doi.org/10.14778/2824032.2824099
Best Paper of CHI, 2018