A relatedness analysis of government regulations using domain knowledge and structural organization

Springer Science and Business Media LLC - Volume 9 - Pages 657-680 - 2006
Gloria T. Lau¹, Kincho H. Law¹, Gio Wiederhold²
¹ Department of Civil & Environmental Engineering, Stanford University, Stanford, USA
² Computer Science Department, Stanford University, Stanford, USA

Abstract

The complexity and diversity of government regulations make understanding and retrieving them a non-trivial task. One issue is the existence of multiple sources of regulations and interpretive guides that differ in format, terminology and context. This paper describes a comparative analysis scheme developed to aid the retrieval of related provisions from different regulatory documents. Specifically, the goal is to identify the most strongly related provisions between regulations. The relatedness analysis makes use of not only traditional term matching but also a combination of feature matches, and not only content comparison but also structural analysis. Regulations are first compared based on conceptual information as well as domain knowledge through feature matching. Regulations also possess specific organizational structures, such as a tree hierarchy of provisions and heavy referencing between provisions. These structures carry useful information for locating related provisions, and are therefore exploited in the comparison of regulations for completeness. System performance is evaluated by comparing a similarity ranking produced by users with the machine-predicted ranking. The ranking produced by the relatedness analysis system shows a reduction in error compared to that of Latent Semantic Indexing. Various pairs of regulations are compared, and the results are analyzed along with observations based on different feature usages. An example e-rulemaking scenario demonstrates the capabilities and limitations of the prototype relatedness analysis system.
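The idea of combining content comparison with structural analysis can be illustrated with a minimal sketch. It is not the formulation used in the paper: the bag-of-words cosine score, the parent-based adjustment, the weight `alpha`, and the provision identifiers and texts are all assumptions chosen for illustration. The sketch scores a pair of provisions by term match and then blends in the similarity of their parent provisions in the tree hierarchy.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relatedness(p: str, q: str, text: dict, parent: dict, alpha: float = 0.7) -> float:
    """Blend the content score of (p, q) with the score of their parents.

    alpha is a hypothetical weight: 1.0 means content only.
    """
    base = cosine(text[p], text[q])
    pp, pq = parent.get(p), parent.get(q)
    if pp is None or pq is None:
        return base  # no structural context available
    return alpha * base + (1 - alpha) * cosine(text[pp], text[pq])


# Hypothetical provisions from two regulations, A and B.
text = {
    "A.1": "accessible routes",
    "A.1.1": "ramp slope shall not exceed limit",
    "B.4": "accessible routes",
    "B.4.2": "ramp slope shall not exceed limit",
    "B.7": "drinking water contaminant levels",
    "B.7.1": "ramp slope shall not exceed limit",
}
parent = {"A.1.1": "A.1", "B.4.2": "B.4", "B.7.1": "B.7"}

# Rank candidate provisions in B against provision A.1.1.
candidates = ["B.4.2", "B.7.1"]
ranking = sorted(
    candidates,
    key=lambda q: relatedness("A.1.1", q, text, parent),
    reverse=True,
)
```

In this toy data both candidates have identical wording, so term match alone cannot separate them; the parent context ranks `B.4.2` above `B.7.1`, which is the kind of information the structural analysis is meant to exploit.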
