"Sink Method" Poster for Conference on Empirical Legal Studies (CELS 2009 @ USC)

Sinks Poster

As we mentioned in previous posts, Seadragon is a really cool product. Please note that load times may vary depending on your specific machine configuration as well as the strength of your internet connection. For those not familiar with how to operate it, please see below. In our view, Full Screen is the best way to go …

Law as a Seamless Web … Poster for WIN Conference @ NYU Stern

Seamless Web Poster

As we mentioned in previous posts, Seadragon is a really cool product. Please note that load times may vary depending on your specific machine configuration as well as the strength of your internet connection. For those not familiar with how to operate it, please see below. In our view, Full Screen is the best way to go …

Power Laws, Preferential Attachment and Positive Legal Theory [Part 2] [Repost]

Law as a Complex System?

As was stated in Part 1 of this thread, it is by no means a given that the statistical artifact displayed above would appear. Large-scale patterns need not take this form, as many social and physical systems feature substantially different properties.

For purposes of generating an empirically grounded theory of American Common Law development … explaining these artifacts would seem to be critical. Fortunately, with respect to the above pattern, there exists a definable set of generative processes plausibly responsible for producing what is displayed. While certainly not the only generative process responsible for a power law, the preferential attachment model, first outlined in the physics literature by Barabási & Albert, is among the likely candidates.

Confronting much of the extant literature, we query whether a closed-form, equilibrium-based analytical apparatus (punctuated or otherwise) is up to the task of describing the relevant dynamics. If anything, the distributions displayed above provide first-order evidence of a system which is likely to feature dynamics of a non-linear flavor. Indeed, while significant work still remains, the weight of available evidence indicates Law is a Complex Adaptive System. As such, we believe it would be appropriate to leverage the methods typically reserved for the study of complexity. For purposes of generating positive legal theory, we believe agent-based models, dynamic network analysis and other methods of computational social science offer great potential. We encourage scholars to consider learning more about these approaches.

Citation Analysis in Continental Jurisdictions

Citation Analysis

Anton Geist has posted Using Citation Analysis Techniques for Computer-Assisted Legal Research in Continental Jurisdictions to SSRN. While this is certainly longer than most papers, we believe it offers a good review of the broader information retrieval and law literature. In addition, it offers some empirical insight into citation patterns within continental jurisdictions. The findings in this paper are similar to those shown in important papers by Thomas Smith in The Web of the Law and by David Post & Michael Eisen in How Long is the Coastline of Law? Thoughts on the Fractal Nature of Legal Systems.

In our view, the next step for this research is to determine whether the pattern does indeed follow a power law distribution. Specifically, there exists a Maximum Likelihood based test developed in the applied physics paper Power-law Distributions in Empirical Data by Aaron Clauset, Cosma Shalizi and Mark Newman, which can help adjudicate whether the detected pattern is merely a highly skewed distribution or is indeed a power law.
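To give a rough sense of what the Maximum Likelihood step involves, here is a minimal sketch in Python. It assumes the lower cutoff xmin is already known and uses the continuous-data form of the estimator; citation counts are discrete, so this is only an approximation. The function name fit_power_law is ours, for illustration only. The full procedure in the paper also selects xmin from the data and adds goodness-of-fit and likelihood-ratio comparisons, none of which are shown here.

```python
import math


def fit_power_law(values, xmin):
    """Fit a continuous power law to the tail of `values` at or above `xmin`.

    Returns (alpha, ks), where alpha is the maximum likelihood estimate of
    the exponent and ks is the Kolmogorov-Smirnov distance between the
    empirical tail and the fitted model. `xmin` is assumed to be given.
    """
    tail = sorted(x for x in values if x >= xmin)
    n = len(tail)
    if n == 0:
        raise ValueError("no observations at or above xmin")

    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail observations equal xmin; alpha is undefined")

    # Continuous MLE: alpha = 1 + n / sum(ln(x_i / xmin))
    alpha = 1.0 + n / log_sum

    # KS distance: largest gap between the empirical CDF (checked just
    # before and just after each observation) and the fitted CDF.
    ks = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        ks = max(ks, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return alpha, ks
```

A small KS distance is consistent with a power law but does not establish one on its own; a heavily skewed alternative such as a log-normal can fit the same tail nearly as well, which is why the comparison tests matter.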

Either way, we are excited by this paper as we believe comparative research is absolutely critical to broader theory development.

Law as a Seamless Web? Part II

Semantic Network
In our paper Law as a Seamless Web, we offer a first-order method to generate case-to-case and opinion-unit-to-opinion-unit semantic networks. As constructed in the figure above, nodes represent cases decided between 1791 and 1865, while edges are drawn when two cases meet a certain threshold of semantic similarity. Except for the definition of edges, the process of constructing the semantic graph is identical to that of the citation graph we offered in the prior post. While computer science/computational linguistics offers a variety of possible semantic similarity measures, we chose to employ a commonly used measure. Here is a description from the paper:

“Semantic similarity measures are the focus of significant work in computational linguistics. Given the scope of the dataset, we have chosen a first-order method for calculating similarity. After lemmatizing the text of the case with WordNet, we store the nouns with the top N frequencies for each case or opinion unit. We define the similarity between two cases or opinion units A and B as the percentage of words that are shared between the top words of A and top words of B.

An edge exists between A and B in the set of edges if σ(A,B) exceeds some threshold. This threshold is the minimum similarity necessary for the graph to represent the presence of a semantic connection.”
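The procedure described in the quote can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the paper: it assumes each case has already been reduced to a list of lemmatized nouns (the WordNet step is not shown), and it divides the number of shared top words by the size of the larger top-word set, since the quote does not spell out the denominator. The names top_nouns, similarity and semantic_edges are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_nouns(nouns, n):
    """Return the set of the n most frequent items in a list of lemmatized nouns."""
    return {word for word, _ in Counter(nouns).most_common(n)}


def similarity(top_a, top_b):
    """Share of top words common to both sets (denominator is an assumption)."""
    if not top_a or not top_b:
        return 0.0
    return len(top_a & top_b) / max(len(top_a), len(top_b))


def semantic_edges(cases, n, threshold):
    """Return pairs of case ids whose similarity exceeds the threshold.

    `cases` maps a case id to its list of lemmatized nouns.
    """
    tops = {case_id: top_nouns(nouns, n) for case_id, nouns in cases.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(tops), 2)
        if similarity(tops[a], tops[b]) > threshold
    ]
```

For example, with n = 3, a case whose top nouns are {tax, duty, port} and another whose top nouns are {tax, duty, ship} share two of three words, so they are connected at any threshold below 2/3.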

As this is a technical paper, it is slanted toward demonstrating proof of methodological concept rather than covering significant substantive ground. With that said, we do offer a hint of our broader substantive goal of detecting the spread of legal concepts between various topical domains. Specifically, with respect to enriching positive political theory, we believe the union, intersection and complement of the semantic and citation networks are really important. More on this point is forthcoming in a subsequent post…

Tax Day! A First-Order History of the Supreme Court and Tax

[Figure: share of words that are “tax” in Supreme Court cases, by year]

In honor of Tax Day, we’ve produced a simple time series representation of the Supreme Court and tax. The above plot shows how often the word “tax” occurs in the cases of the Supreme Court for each year; that is, what proportion of all words in every case in a given year are the word “tax.” The underlying data includes non-procedural cases from 1790 to 2004. The arrows highlight important income tax legislation and cases as well.
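For readers who want to try a similar count on their own corpus, here is a minimal Python sketch. It assumes the opinions are available as (year, text) pairs and uses a simple lowercase, letters-only tokenizer; both are our assumptions, as is the function name word_share_by_year. It matches the exact token only, so “taxes” or “taxation” would not be counted.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z]+")


def word_share_by_year(opinions, target="tax"):
    """Return {year: share of all tokens in that year equal to `target`}.

    `opinions` is an iterable of (year, text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in opinions:
        tokens = WORD_RE.findall(text.lower())
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token == target)
    # Skip years with no tokens to avoid dividing by zero.
    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }
```

Pooling all words within a year before dividing, rather than averaging per-case shares, follows the description above: the proportion of all words in every case in a given year.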

Make sure to click through the image to view the full size.

Happy Tax Day!

When is the first term enough?: On approximation in social science

Research in the academic world suffers from the “hammer problem” – that is, the methods we use are often those that we have in our toolbox, not necessarily those that we should be using.  This is especially true in computational social science, where we often attempt to directly import well-developed methods from the hard sciences.

To illustrate the point, I’d like to highlight one example we’ve come across in our research. In Leicht et al.’s Large-scale structure of time evolving citation networks, the authors apply two methods to a simplified representation of the United States Supreme Court citation network. Both of these methods rely on complicated statistical algorithms and require iterative non-linear system solvers. However, the results are consistent, and they detect “events” around 1900, 1940, and 1970.

[Figure 5 from Leicht et al.]

One first-order alternative for detecting significant “events” in the Court would be to count citations. One might suspect, for instance, that the formation or destruction of law might go hand-in-hand with an acceleration or deceleration in the rate of citation. Such a method is purely conjectural, but costs much less to implement than the methods discussed above.

[Figure: citation time series]

This figure shows the number of outgoing citations per year in blue, as well as the ten-year moving average in purple. The plot shows jumps that coincide very well with the plot from Leicht et al. Thus, although only a first-order approximation to the underlying dynamics, this method would lead historians down a similar path with much less effort.
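The counting approach is simple enough to sketch in a few lines of Python. The sketch assumes the citation data is a list of (citing year, cited case) pairs and uses a trailing window for the moving average; whether the plotted average is trailing or centered is not stated here, so treat that choice, and the function names, as our assumptions.

```python
from collections import Counter


def citations_per_year(citations):
    """Count outgoing citations by the year of the citing case.

    `citations` is an iterable of (citing_year, cited_case_id) pairs.
    """
    return Counter(year for year, _ in citations)


def trailing_moving_average(counts, window=10):
    """Return {year: mean count over the last `window` years, inclusive}.

    Years with no citations count as zero; early years use a shorter window.
    """
    if not counts:
        return {}
    years = range(min(counts), max(counts) + 1)
    series = [counts.get(year, 0) for year in years]
    averages = {}
    for i, year in enumerate(years):
        chunk = series[max(0, i - window + 1): i + 1]
        averages[year] = sum(chunk) / len(chunk)
    return averages
```

Plotting the raw counts alongside the averages is then a matter of passing both series to whatever plotting tool is at hand.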

This example, though simple, is one that really hits home for me. After a week of struggling to align interpretations and methods, this plot convinced me more than any eigenvector or Lagrangian system. Perhaps more importantly, unlike the above methods, you can explain this plot to a lay audience in a fifteen-minute talk.