Learning Centre

Advances in Analysing Temporal Data

 |  Login

Show simple item record

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Kostakis, Orestis
dc.date.accessioned 2017-05-04T09:01:16Z
dc.date.available 2017-05-04T09:01:16Z
dc.date.issued 2017
dc.identifier.isbn 978-952-60-7370-5 (electronic)
dc.identifier.isbn 978-952-60-7371-2 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/25402
dc.description.abstract Modern technical capabilities and systems have resulted in an abundance of data. A significant portion of all that data is of temporal nature. Hence, it becomes imperative to design effective and efficient algorithms, and solutions that enable searching and analysing large databases of temporal data. This thesis contains several contributions related to the broad scientific field of temporal-data analysis. First, we present a distance function for pairs of event-interval sequences, together with proofs of important properties, such as that the function is a metric, and a lower-bounding function. An embedding-based indexing method is proposed for searching through large databases of event-interval sequences, under this distance function. Second, we study the problem of subsequence search for event-interval sequences. This includes hardness results, an exact worst-case exponential-time algorithm, two upper bounds and a scheme for approximation algorithms. In addition, an equivalence is established between graphs and event-interval sequences. This equivalence allows to derive hardness results for several problems of event-interval sequences. Most importantly, it raises the question which techniques, results, and methods from each of the fields of graph mining and temporal data mining can be applied to the other that would advance the current state of the art. Third, for the problem of subsequence search, we propose an indexing method based on decomposing event-interval sequences into 2-interval patterns. The proposed indexing method is benchmarked against other approaches. In addition, we examine different variations of the problem and propose exact algorithms for solving them. Fourth, we describe a complete system that enables the clustering of a stream of graphs. The graphs are clustered into groups based on their distances, via approximating the graph edit distance. The proposed clustering algorithm achieves a good clustering with few graph comparisons. The effectiveness and usefulness of the systems is demonstrated by clustering function call-graphs of binary executable files for the purpose of malware detection. Finally, we solve the problem of summarising temporal networks. We assume that networks operate in certain modes and that the total sequence of interactions can be modelled as a series of transitions between these modes. We prove hardness results and provide heuristic procedures for finding approximate solutions. We demonstrate the quality of our methods via benchmarking and performing case-studies on datasets taken from sports and social networks. en
dc.format.extent 99 + app. 199
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 63/2017
dc.relation.haspart [Publication 1]: O. Kostakis. Classy: fast clustering streams of call-graphs. Data Mining and Knowledge Discovery, Volume 28, Issue 5, pp 1554–1585, September 2014. DOI: 10.1007/s10618-014-0367-9
dc.relation.haspart [Publication 2]: O. Kostakis, A. Gionis. Subsequence Search in Event-Interval Sequences. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, Pages 851–854, August 2015. DOI: 10.1145/2766462.2767778
dc.relation.haspart [Publication 3]: O. Kostakis, P. Papapetrou. Finding the longest common sub-pattern in sequences of temporal intervals. Data Mining and Knowledge Discovery, Volume 29, Issue 5, pp 1178-1210, September 2015. DOI: 10.1007/s10618-015-0404-3
dc.relation.haspart [Publication 4]: O. Kostakis, N. Tatti, A. Gionis. Discovering recurring activity in temporal networks. Accepted for publication in Data Mining and Knowledge Discovery, 31 pages, Submission date 2016
dc.relation.haspart [Publication 5]: O. Kostakis, P. Papapetrou. On Searching and Indexing Sequences of Temporal Intervals. Accepted for publication in Data Mining and Knowledge Discovery, 44 pages, Submission date 2016. DOI: 10.1007/s10618-016-0489-3
dc.subject.other Computer science en
dc.title Advances in Analysing Temporal Data en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietotekniikan laitos fi
dc.contributor.department Department of Computer Science en
dc.subject.keyword temporal data en
dc.subject.keyword event-interval sequences en
dc.subject.keyword temporal networks en
dc.subject.keyword streams en
dc.subject.keyword graphs en
dc.identifier.urn URN:ISBN:978-952-60-7370-5
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Gionis, Aristides, Associate Prof., Aalto University, Department of Computer Science, Finland
dc.opn Gama, João, Associate Prof., University of Porto, Portugal
dc.contributor.lab Data Mining Group en
dc.rev Keogh, Eamonn, Professor, University of California-Riverside, USA
dc.rev Moskovitch, Robert, Ben Gurion University of the Negev, Israel
dc.date.defence 2017-05-26
local.aalto.formfolder 2017_05_03_klo_14_04
local.aalto.archive yes


Files in this item

This item appears in the following Collection(s)

Show simple item record

Search archive


Advanced Search

article-iconSubmit a publication

Browse

Statistics