Efficient Query Grouping Strategies in Cloud Computing

Springer Science and Business Media LLC - Volume 32 - Pages 1231-1249 - 2017

Qin Liu¹,², Yuhong Guo³, Jie Wu⁴, Guojun Wang⁵

¹College of Computer Science and Electronic Engineering, Hunan University, Changsha, China

²State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China

³School of Computer Science, Carleton University, Ottawa, Canada

⁴Department of Computer and Information Sciences, Temple University, Philadelphia, U.S.A.

⁵School of Computer Science and Educational Software, Guangzhou University, Guangzhou, China

Abstract

As demand for cloud computing grows, many organizations have outsourced their data and query services to the cloud to save costs and gain flexibility. Consider an organization with a large number of querying users that deploys multiple proxy servers in the cloud to achieve cost efficiency and load balancing. Given n queries, each expressed as several keywords, and k proxy servers, the problem is how to partition the n queries into k groups so as to minimize both the difference between groups and the number of distinct keywords across all groups. Because this problem is NP-hard, it is addressed with both a mathematical approach and a heuristic approach. The mathematical grouping uses a local optimization method, while the heuristic grouping is based on the k-means algorithm. In addition, two extensions are provided: the first focuses on robustness, i.e., each user still receives search results even if some proxy servers fail; the second focuses on benefit, i.e., each user can retrieve as many potentially interesting files as possible without increasing the total. Extensive evaluations on both a synthetic dataset and real query traces verify the effectiveness of our strategies.
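To make the grouping problem concrete, the following is a minimal sketch under stated assumptions. It is not the paper's local-optimization or k-means method; it is a plain greedy assignment with a size cap, included only to illustrate the two quantities being traded off. The function names (`group_cost`, `size_gap`, `greedy_group`) are hypothetical, and reading "the number of distinct keywords across all groups" as the sum of per-group distinct keyword counts is an assumption.

```python
from typing import List, Set


def group_cost(groups: List[List[Set[str]]]) -> int:
    """Sum, over groups, of the number of distinct keywords in that group.

    Assumed cost model: each proxy server handles the union of the
    keywords of the queries assigned to it.
    """
    return sum(len(set().union(*g)) for g in groups)


def size_gap(groups: List[List[Set[str]]]) -> int:
    """Difference between the largest and smallest group sizes."""
    sizes = [len(g) for g in groups]
    return max(sizes) - min(sizes)


def greedy_group(queries: List[Set[str]], k: int) -> List[List[Set[str]]]:
    """Greedy heuristic (illustrative only, not the authors' algorithm).

    Each query goes to the non-full group where it introduces the fewest
    new keywords; ties are broken in favor of the smaller group.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    cap = -(-len(queries) // k)  # ceil(n / k) keeps group sizes balanced
    groups: List[List[Set[str]]] = [[] for _ in range(k)]
    vocab: List[Set[str]] = [set() for _ in range(k)]
    # Place longer queries first so they shape each group's vocabulary.
    for q in sorted(queries, key=len, reverse=True):
        open_groups = [i for i in range(k) if len(groups[i]) < cap]
        best = min(open_groups, key=lambda i: (len(q - vocab[i]), len(groups[i])))
        groups[best].append(q)
        vocab[best] |= q
    return groups


if __name__ == "__main__":
    demo = [{"a", "b"}, {"x", "y"}, {"a", "c"}, {"x", "z"}]
    result = greedy_group(demo, k=2)
    print(result, group_cost(result), size_gap(result))
```

In this toy input, queries that share a keyword end up on the same proxy, so each group needs three distinct keywords instead of four. A greedy pass like this offers no optimality guarantee, which is consistent with the problem being NP-hard.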

Keywords

#cloud computing #query #proxy server #optimization #k-means #grouping #efficient strategy

