《计算机应用研究》|Application Research of Computers

基于短序列分组和拼接策略的子序列快速查询算法

Fast subsequence query algorithm based on short sequence grouping and assembling strategy

免费全文下载 (已被下载 次)  
获取PDF全文
作者 范纯龙,王靖云,滕一平,丁国辉
机构 沈阳航空航天大学 计算机学院 辽宁省大规模分布式系统实验室,沈阳 110136
统计 摘要被查看 次,已被下载
文章编号 1001-3695(2020)06-020-1702-05
DOI 10.19734/j.issn.1001-3695.2018.11.0866
摘要 子序列查询技术在金融、商业、医疗等领域均有重要应用,但因DTW等相似性比对算法的时间复杂度较高,子序列长度对检索时间影响很大,限制了数据集上长子序列检索的效率。针对这一问题提出一种子序列快速查询算法。首先对数据集中特定长度下所有子序列进行分组并标记出代表性子序列;然后在查询时将查询序列切分成定长的小段序列,并用DTW算法确定与小段序列相似的代表子序列候选集;最后对候选集进行序列拼接,获取到查询结果序列。实验表明新算法效率较典型算法提高约10倍。
关键词 序列数据查询; 动态时间规整; 子序列; 序列分组
基金项目 国家自然科学基金资助项目(61303016)
本文URL http://www.arocmag.com/article/01-2020-06-020.html
英文标题 Fast subsequence query algorithm based on short sequence grouping and assembling strategy
作者英文名 Fan Chunlong, Wang Jingyun, Teng Yiping, Ding Guohui
机构英文名 Large-scale Distributed System Laboratory in Liaoning Province,School of Computer,Shenyang Aerospace University,Shenyang 110136,China
英文摘要 Subsequence query technique has important applications in several fields such as finance, commerce and healthcare. However, due to the high time complexity of similarity comparison algorithms such as dynamic time warping(DTW), the length of subsequence has a great influence on the retrieval time, which limits the efficiency of long subsequence retrieval on data sets. This paper proposed a fast subsequence query algorithm based on short sequence grouping and assembling strategy. This algorithm first separated all subsequences which were in a given length in the data set into groups and marked out the representative subsequence for each group. Then it cut the query sequence into small query sequences with a fixed length during the query processing, and used DTW algorithm to compute the candidate sets of subsequences. The representative subsequence had a high similarity with the small query sequences. Finally, it assembled all the sequences in the candidate sets to derive the query result sequences. Experiments show that the efficiency of the new algorithm is about 10 times higher than that of the typical algorithm.
英文关键词 sequential data query; dynamic time warping; subsequence; sequence grouping
参考文献 查看稿件参考文献
 
收稿日期 2018/11/12
修回日期 2019/2/21
页码 1702-1706,1749
中图分类号 TP391
文献标志码 A