Semantic Lawyering: How the Semantic Web Will Transform the Practice of Law (Part 4)

(Links to parts 1, 2, and 3.)

What can you do with the Semantic Web that you can’t do without it?

The Semantic Web is a powerful way of structuring data and giving it a precise, machine-readable meaning. The most obvious and immediate benefit of semantic technologies is in organizing large quantities of information in a particular domain to make it easier to retrieve and analyze. This is reflected in the contexts in which these technologies have already been deployed, such as organizing large online databases of content (e.g., see here); or facilitating the exchange and analysis of research data (e.g. drug research, see here). Given the problem of legal information expansion discussed in the first post in this series, using semantic taxonomies and rules to organize the vast universe of legal data is clearly a promising area.[1]

In this post I will go beyond merely identifying the benefits of better structured data. Rather, I want to consider what really distinguishes the Semantic Web from rival technologies by asking: what can you do with the Semantic Web that you can’t do without it? In attempting to answer this question, I will focus on two kinds of Semantic Web application that promise not just to deliver enhanced performance but perhaps even to transform the nature of the legal service involved: semantic legal query systems and, in the next part, smart legal documents.

Lawyers as optimum retrieval intermediaries

One of the core tasks performed by lawyers is giving legal advice. Schematically, what lawyers do in carrying out this task is to:

  1. identify rules in a vast corpus of laws that are relevant to a given legal query;
  2. interpret their legal meaning, often by considering how different rules interact and how they have been interpreted in the past; and
  3. consider how those rules apply to the specific query.

What distinguishes lawyers from the man on the street, and what justifies both their holding a license to practice and their charging sizable fees for their services, is their (theoretically) superior ability to carry out each of these tasks. To quote the oft-repeated wisdom, the difference between a lawyer and a layman is not that the lawyer knows the law, but that he knows where to find it. I might add that the lawyer also knows whether there are legal rules for a given problem; how different rules interact (which rules preempt or modify other rules); how to check if a law is still in force or a precedent still good law; how to find an authoritative scholarly interpretation; and, perhaps most importantly, the lawyer will have wide experience of different factual situations and contexts. In this sense, in the delivery of legal advice, a lawyer acts as an intermediary who ensures optimal retrieval of legal knowledge on behalf of his client.[2]

Semantic legal queries

We have seen how lawyers use search engines and commercial databases to deal with step 1 (identify) much more efficiently than was possible in the days of hard-copy statutes and law reports. However, even though researchers started working on expert legal systems as far back as the 1970s (see here), in practice, steps 2 (interpretation) and 3 (applying the law to the query) are still largely carried out by the lawyer. This process is aided by technology only to the extent that the identification step 1 is repeated in sourcing secondary materials to guide interpretation and application of the rules. The smarter data generated on the Semantic Web will enable applications to dig deeper into steps 2 and 3.

Leveraging the higher degree of organization of legal data and the possibility of drawing inferences from the data, a semantic legal query system should be able to do more than merely retrieve information based on keywords selected by a human agent. In a world of perfect formalization, an application could carry out the interpretation and the application steps autonomously. But even in the absence of perfection, it is not unrealistic to suggest that within a few years, if enough smart legal data is available on the web, semantic legal query systems will be able to retrieve not just keyword-relevant documents but all or most of the information necessary to carry out steps 2 and 3. The application will know where to find the law (online); it will analyze the structure of the query and scour available data to determine whether there are applicable rules; it will determine what those rules are and suggest how they interact (perhaps retrieving the rules that govern the interaction); it will check whether the rules are up-to-date and retrieve any amendments or qualifications; and it will search for similar fact patterns, precedents and FAQ entries to clarify the application of the rules.

There are at least two major reasons semantic solutions have more potential than rival technologies to achieve these kinds of results. The first relates to the formal structure of Semantic Web standards: because the use of semantic metadata ensures that items of data have a precise meaning, semantic applications can make reliable inferences on the basis of the data. Chains of inference need near-certain premises, because each step compounds the uncertainty of the one before. Take this syllogism: Oracle is a Delaware Corporation; all Delaware Corporations are legal persons; therefore Oracle is a legal person. Now imagine each proposition in the syllogism is the result of a “best guess” data analysis process (e.g. through statistical analysis): there is a 90% chance that Oracle is a Delaware Corporation; there is a 90% chance that all Delaware Corporations are legal persons; therefore, treating the two estimates as independent, there is an 81% (90% of 90%) chance that Oracle is a legal person. This uncertainty compounds with each step, so beyond a few steps, any non-marginal uncertainty is fatal.
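To make the arithmetic concrete, here is a minimal Python sketch of how confidence decays along a chain of inferences. It assumes that every step is statistically independent and that each carries the same 90% confidence as in the syllogism above; the function name chain_confidence is purely illustrative.

```python
def chain_confidence(step_confidences):
    """Probability that every step in an inference chain holds.

    Assumption: the steps are independent, so the probabilities multiply.
    """
    result = 1.0
    for p in step_confidences:
        result *= p
    return result


if __name__ == "__main__":
    # The two-premise syllogism from the text: 90% of 90%.
    print(f"2 steps: {chain_confidence([0.9, 0.9]):.0%}")

    # Longer chains at the same per-step confidence.
    for steps in (5, 10):
        print(f"{steps} steps: {chain_confidence([0.9] * steps):.0%}")
```

Under these assumptions a five-step chain is right only about 59% of the time and a ten-step chain about 35% of the time, which is why statistical "best guesses" are a weak foundation for multi-step legal reasoning.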

With the Semantic Web, if your query specifies a defined entity, the application will know precisely what you are referring to. In principle all instances of that object on the Semantic Web will refer to the same (online) definition, which specifies its properties and its relation to other entities. The second reason for the superiority of semantic applications relates to the openness of Semantic Web standards: the widespread adoption of standards for tagging and organizing legal data will ensure that more structured legal information is available than could possibly be achieved by a single provider of proprietary systems.
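By contrast, when the premises are asserted facts with defined meanings, the Oracle syllogism becomes a mechanical derivation. The sketch below is a toy illustration in plain Python, not a real Semantic Web toolkit: rdf:type and rdfs:subClassOf are genuine RDF/RDFS terms, but the "ex:" identifiers and the helper functions are hypothetical names invented for this example.

```python
# Facts as (subject, predicate, object) triples.
# The "ex:" names are made up for illustration, not a real legal vocabulary.
TRIPLES = {
    ("ex:Oracle", "rdf:type", "ex:DelawareCorporation"),
    ("ex:DelawareCorporation", "rdfs:subClassOf", "ex:LegalPerson"),
}


def infer_types(triples):
    """Apply the RDFS subclass rule until no new facts appear:
    if X is of type C and C is a subclass of D, then X is of type D."""
    facts = set(triples)
    while True:
        new_facts = set()
        for subject, predicate, cls in facts:
            if predicate != "rdf:type":
                continue
            for sub, predicate2, sup in facts:
                if predicate2 == "rdfs:subClassOf" and sub == cls:
                    candidate = (subject, "rdf:type", sup)
                    if candidate not in facts:
                        new_facts.add(candidate)
        if not new_facts:
            return facts
        facts |= new_facts


def is_a(entity, cls, triples):
    """True if the entity is asserted or inferred to belong to cls."""
    return (entity, "rdf:type", cls) in infer_types(triples)


if __name__ == "__main__":
    print(is_a("ex:Oracle", "ex:LegalPerson", TRIPLES))
```

Because each triple is either present or absent, the conclusion follows with no loss of confidence however many subclass links are chained together. A production system would presumably rely on an established RDF store and reasoner rather than a hand-rolled loop like this one.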

DIY and FAQs

An application that can deliver a page full of the kind of information described above will go a long way in assisting lawyers in carrying out steps 2 and 3 of legal advice delivery. In fact, if the application is good enough, it may even make the lawyer’s input redundant. How much additional specialist knowledge do you really need if all of the relevant information is right before you? Many consumers of legal services are happy to resort to “DIY” legal advice rather than incurring the costs of professional legal services. Online FAQs and other legal resources have proven popular as means of sourcing legal information without consulting a lawyer directly (often made available by legal professionals as a kind of loss leader to attract potential clients). Individual resources are inevitably limited in content, but in the aggregate the free World Wide Web (i.e. excluding subscription websites) is a fairly comprehensive source of legal information. The problem for the untrained is in finding relevant information and distinguishing the accurate and up-to-date sources from the incorrect and out-of-date. A semantic legal query application that enables laymen to access comprehensive, up-to-date legal information in response to their queries would satisfy much of the demand for simpler legal advice, reducing the demand for competing professional advice—if priced right. Even though these applications may not rival good lawyers in the quality of the service, not all consumers of legal services are concerned with getting the best quality. Good-enough might well do.

More than machines

Of course, many, if not all, lawyers would strongly resist being described as “optimum information retrieval” machines. Most would see their role as going well beyond merely delivering statements of what the law is to their clients. Rather, they are in the business of delivering solutions, offering advice on how to deal with certain situations, how to handle particular disputes, how to structure transactions, etc. Yet it is undeniable that lawyers, especially junior lawyers, spend much of their time searching for relevant information and assimilating it into bespoke legal advice. What the technological possibilities outlined in this post suggest is that simpler legal advice can likely be significantly automated, while for more complex queries, Semantic Web-based applications could considerably enhance fee-earner productivity in producing legal advice.

(Coming soon: Part 5 – Legal Documents.)

[1] As LaVern Pritchard pointed out in a comment to Part 3 of this series, “legal information” need not include only legal texts—see his article on applying taxonomies to the domain of legal practice here; see also this account of NetCase, a semantic system designed to assist lawyers with transnational cross-referrals.

[2] See discussion of “optimum retrieval” in Part 1 of this series.