LexPredict Open Sources The 1910 Version of Black’s Law – The World’s Most Well Known Legal Dictionary is Now a Data Object

From the release: “At their core, many academic and commercial applications of natural language processing and machine learning can benefit from a controlled lexicon of expert-selected terms (i.e., a dictionary). This is especially true of highly technical language, such as legal text. However, after a search of the existing landscape, we were unable to find a high-quality open source or freely-available legal dictionary. Instead, the best existing versions, when available, exist under some form of restrictive licensing conditions.”

“Thus, in furtherance of both the legal profession as well as a range of legal technology providers and solutions, we are announcing another step in our broader open source plan that we outlined earlier this month. Namely, we are making available on Github the 1910 Version of Black’s Law (i.e., Black’s Law 2nd Edition) as a structured data object. This early version of arguably the premier legal dictionary is made available under the open source GPL license 3.0 which should allow both researchers and commercial providers to operate with limited restrictions.”

Click here to access the GitHub Repo.

Why We’re Open-Sourcing ContraxSuite – Product Overview, Some Use Cases and Plan for Release

Following up on our prior announcement, here is a slide deck covering the product overview, some use cases, and our plan for release.

Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and the Modern Information Economy


Today we here at LexPredict announce that we will be open sourcing our document analytics platform ContraxSuite (which works on a wide class of documents beyond just contracts).

From the Announcement – “Starting on August 1st, this code base and our public development roadmap will be hosted on Github under a permissive open-source licensing model that will allow most organizations to quickly and freely implement and customize their own contract and document analytics. Like Redhat does for Linux, we will provide support, customization, and data services to “cover the last mile” for those organizations who need it.

We believe that a very important future for law lies in its central role in facilitating and regulating the modern information economy. But unless we start treating law itself like the production of information, we’ll never get there. Before we can solve big problems with smart contracts, we need to start by structuring existing legacy contracts. We hope our actions today will help lawyers, companies, and other LegalTech providers accelerate the pace of improvement and innovation through more open collaboration.” (Click here for the full announcement, or access it via SlideShare.)

Exploring the Physical Properties of Regulatory Ecosystems – Professors Daniel Martin Katz + Michael J Bommarito

Measuring the Temperature and Diversity of the U.S. Regulatory Ecosystem (Preprint on arXiv + SSRN)

From the Abstract:  Over the last 23 years, the U.S. Securities and Exchange Commission has required over 34,000 companies to file over 165,000 annual reports. These reports, the so-called “Form 10-Ks,” contain a characterization of a company’s financial performance and its risks, including the regulatory environment in which a company operates. In this paper, we analyze over 4.5 million references to U.S. Federal Acts and Agencies contained within these reports to build a mean-field measurement of temperature and diversity in this regulatory ecosystem. While individuals across the political, economic, and academic world frequently refer to trends in this regulatory ecosystem, there has been far less attention paid to supporting such claims with large-scale, longitudinal data. In this paper, we document an increase in the regulatory energy per filing, i.e., a warming “temperature.” We also find that the diversity of the regulatory ecosystem has been increasing over the past two decades, as measured by the dimensionality of the regulatory space and distance between the “regulatory bitstrings” of companies. This measurement framework and its ongoing application contribute an important step towards improving academic and policy discussions around legal complexity and the regulation of large-scale human techno-social systems.
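To make the two measurements concrete, here is a minimal sketch in Python. It is not the authors' code, and the abstract does not define the measures precisely. The sketch assumes that "temperature" can be read as the mean number of Act/Agency references per filing, and that the distance between "regulatory bitstrings" is a Hamming distance. The data, the names, and both of those readings are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical data: reference counts per filing, invented for this sketch.
FILINGS = {
    "filing_a": {"Securities Act": 3, "EPA": 1},
    "filing_b": {"Securities Act": 2, "FDA": 4, "EPA": 1},
    "filing_c": {"FDA": 1},
}


def temperature(filings):
    """Mean number of Act/Agency references per filing (assumed reading)."""
    total = sum(sum(refs.values()) for refs in filings.values())
    return total / len(filings)


def bitstrings(filings):
    """One bit per Act/Agency: 1 if the filing references it at least once."""
    vocab = sorted({name for refs in filings.values() for name in refs})
    return {
        filing_id: tuple(int(refs.get(name, 0) > 0) for name in vocab)
        for filing_id, refs in filings.items()
    }


def hamming(a, b):
    """Number of positions at which two equal-length bitstrings differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_distance(bits):
    """Average Hamming distance over all pairs of filings (assumed metric)."""
    pairs = list(combinations(bits.values(), 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    bits = bitstrings(FILINGS)
    print("temperature:", temperature(FILINGS))
    print("dimensionality:", len(next(iter(bits.values()))))
    print("mean pairwise distance:", mean_pairwise_distance(bits))
```

In this toy setup, the length of each bitstring stands in for the dimensionality of the regulatory space, so a larger vocabulary of referenced Acts and Agencies means a higher-dimensional space.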

Available as a preprint on SSRN and on the physics arXiv.

LexPredict Hackathon Challenge – Extracting Simple Contract Metadata

Beyond the specific prize attached to the upcoming hackathon event, we welcome anyone who would like to take a crack at this challenge for fun.

Email us directly (Daniel Martin Katz or Mike Bommarito) if you would like to work on this challenge.

Our LexPredict Challenge is an opportunity to develop basic tools for processing contracts.

Specifically, you will use the sample contract data below to develop algorithms to:
(1) identify the parties to an agreement
(2) identify the effective date segment and date
(3) identify the termination clause segment(s) and date(s)
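
As a starting point, the three tasks above can be sketched with simple pattern matching. This is a minimal illustration in Python, not LexPredict's implementation. The sample text, the regular expressions, the cue words, and all function names are assumptions made up for this example, and real contracts will need far more robust handling (varied date formats, more than two parties, abbreviations that break the naive sentence splitter).

```python
import re
from datetime import datetime

# Hypothetical sample text, written for this sketch (not from the challenge data).
SAMPLE = (
    "This Services Agreement is made by and between Acme Corp., a Delaware "
    "corporation, and Widget LLC, a Michigan limited liability company. "
    "This Agreement is effective as of January 15, 2016. "
    "This Agreement shall terminate on December 31, 2018, unless renewed "
    "by both parties."
)

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Matches dates written like "January 15, 2016" only.
DATE_RE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b")
# Matches "between <party one> and <party two>." for exactly two parties.
PARTIES_RE = re.compile(r"\bbetween\s+(.+?),?\s+and\s+(.+?)\.(?:\s|$)", re.S)
EFFECTIVE_CUE = re.compile(r"\beffective\b", re.I)
TERMINATION_CUE = re.compile(r"\bterminat(?:e|es|ed|ion)\b", re.I)


def split_sentences(text):
    """Naive splitter: break after a period that precedes a capitalized word."""
    parts = re.split(r"(?<=\.)\s+(?=[A-Z])", text)
    return [part.strip() for part in parts if part.strip()]


def find_parties(text):
    """Return the two party names from the first 'between ... and ...' clause."""
    match = PARTIES_RE.search(text)
    if match is None:
        return []
    # Drop trailing descriptions such as ", a Delaware corporation".
    return [re.split(r",\s+an?\s+", group)[0].strip() for group in match.groups()]


def find_dated_segments(text, cue):
    """Return (sentence, [dates]) for every sentence that contains the cue."""
    results = []
    for sentence in split_sentences(text):
        if cue.search(sentence):
            dates = [datetime.strptime(found, "%B %d, %Y").date()
                     for found in DATE_RE.findall(sentence)]
            results.append((sentence, dates))
    return results


if __name__ == "__main__":
    print("Parties:", find_parties(SAMPLE))
    print("Effective:", find_dated_segments(SAMPLE, EFFECTIVE_CUE))
    print("Termination:", find_dated_segments(SAMPLE, TERMINATION_CUE))
```

Each function returns the matching segment together with any dates found in it, which mirrors the "segment and date" wording of tasks (2) and (3). A rule-based baseline like this is mainly useful for seeing where the rules fail on the sample contracts.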

At LexPredict, we have built this simple technology (and other, more complex technology) for use in commercial applications. This challenge is an opportunity to produce open source content that can be used by all (including in the Legal Analytics Course).