
Springer Science and Business Media LLC

SCOPUS (1993-2023), SCIE-ISI

  0926-8782

  1573-7578

 

Publisher: SPRINGER, Springer Netherlands

Subject areas:
Hardware and Architecture, Information Systems, Software, Information Systems and Management

Featured articles

Stratified random sampling from streaming and stored data
Volume 39 - Pages 665-710 - 2020
Trong Duc Nguyen, Ming-Hung Shih, Divesh Srivastava, Srikanta Tirthapura, Bojian Xu
Stratified random sampling (SRS) is a widely used sampling technique for approximate query processing. We consider SRS on continuously arriving data streams and statically stored data sets. We present a tight lower bound showing that any streaming algorithm for SRS over the entire stream must have, in the worst case, a variance that is a factor of $$\varOmega (r)$$ away from the optimal, where r is the number of strata. We present S-VOILA, a practical streaming algorithm for SRS over the entire stream that is locally variance-optimal. We prove that any sliding window-based streaming SRS needs a workspace of $$\varOmega (rM\log W)$$ in the worst case to maintain a variance-optimal SRS of size M, where W is the number of elements in the sliding window. Because of the inherently high workspace needs of sliding window-based SRS, we present SW-VOILA, a practical multi-layer sampling algorithm that uses only O(M) workspace but can maintain an SRS of size close to M in practice over a sliding window. Experiments show that both S-VOILA and SW-VOILA result in a variance that is typically close to that of their optimal offline counterparts, which are given the entire input beforehand. We also present VOILA, a variance-optimal offline algorithm for stratified random sampling. VOILA is a strict generalization of the well-known Neyman allocation, which is optimal only under the assumption that each stratum is abundant. Experiments show that VOILA can have significantly smaller variance (1.4x to 50x) than Neyman allocation on real-world data.
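The abstract's remark about Neyman allocation can be illustrated with a small sketch. This is the textbook Neyman rule only, not VOILA, whose details are not given on this page. The function name `neyman_allocation`, the equal-split fallback, and the example numbers are assumptions made for illustration. The rule gives stratum i a share of the sample budget M proportional to (stratum size) x (stratum standard deviation), and it does not check whether a stratum actually holds that many elements.

```python
def neyman_allocation(sizes, stddevs, budget):
    """Textbook Neyman allocation (illustrative sketch, not VOILA).

    sizes   -- number of elements in each stratum
    stddevs -- standard deviation of the values in each stratum
    budget  -- total sample size M to split across strata

    Returns real-valued (unrounded) sample sizes, one per stratum.
    """
    weights = [n * s for n, s in zip(sizes, stddevs)]
    total = sum(weights)
    if total == 0:
        # Assumption for this sketch: with zero variance everywhere,
        # split the budget equally.
        return [budget / len(sizes)] * len(sizes)
    return [budget * w / total for w in weights]


# Hypothetical data: a large low-variance stratum and a tiny high-variance one.
sizes = [1000, 10]
stddevs = [1.0, 500.0]
alloc = neyman_allocation(sizes, stddevs, budget=100)

# The second stratum is assigned more samples than it has elements,
# which is the case the "abundant stratum" assumption rules out.
for size, share in zip(sizes, alloc):
    print(f"stratum size={size:5d}  allocated={share:6.2f}  feasible={share <= size}")
```

In this made-up example the small stratum is assigned more samples than it contains, so the allocation cannot be realized as written. This is the kind of bounded stratum that, according to the abstract, Neyman allocation does not handle optimally.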
Two-phase commit optimizations in a commercial distributed environment
Volume 3, Issue 4 - Pages 325-360 - 1995
George Samaras, Kathryn Britton, Andrew Citron, C. Mohan
Scalable graph-based OLAP analytics over process execution data
- 2016
Amin Beheshti, Boualem Benatallah, Hamid Reza Motahari-Nezhad
Coordinator log transaction execution protocol
Volume 1, Issue 4 - Pages 383-408 - 1993
James W. Stamos, Flaviu Cristian
Quantifying the trustworthiness of social media content
Volume 29, Issue 3 - Pages 239-260 - 2011
Sai T. Moturu, Huan Liu
An efficient privacy-preserving multi-keyword search over encrypted cloud data with ranking
Volume 32, Issue 1 - Pages 119-160 - 2014
Cengiz Örencik, Erkay Savaş
A novel approach to resource scheduling for parallel query processing on computational grids
- 2006
Anastasios Gounaris, Rizos Sakellariou, Norman W. Paton, Alvaro A. A. Fernandes
Sharing hierarchical context for mobile web services
Volume 21, Issue 1 - Pages 85-111 - 2006
Christoph Dorn, Schahram Dustdar
Enhancing ebXML Registries to Make them OWL Aware
Volume 18, Issue 1 - Pages 9-36 - 2005
Asuman Doğaç, Yildiray Kabak, Gokce B. Laleci, Carl Mattocks, Farrukh Najmi, Jeff Pollock