Linguistic summarization is a data mining and knowledge discovery approach that extracts patterns and condenses large volumes of data into simple sentences. A large body of research addresses the generation of linguistic summaries, which can be used to better understand and communicate patterns, evolution and long-term trends in numerical, time series or labelled data. The objective of this work is to develop a computational system capable of automatically generating linguistic descriptions of time series data of septic shock patients that contain labelled data, covering not only the whole series but also the differences between subsets of the data. This is of particular interest in septic shock, as the differences between patients are not well understood. For this purpose we propose a new type of differential summary, based on a numerical criterion that assesses the characteristics of the summary on each subset of interest. Furthermore, this paper proposes an extension of linguistic summaries to provide temporal and categorical contextualization. This is of particular interest in healthcare, where it can help detect differences related to a condition or illness as well as the effectiveness of the administered treatment.
|Published - 2013