Natural Language Inference and NLP

How it can give us something we hitherto thought cobblers: a computer-you-can-ask-anything!

Imagine being able to ask a computer - perhaps a Virtual Assistant on a phone - any question at all and getting the answer. Not just “What’s the time in New York?” - but “What monotremes are native to Australia?” And it’s not going to reply with, “Here’s what I found on the Internet”, and tell you to read the Wikipedia article on the Fauna of Australia; it’s just going to start reading them out to you, “Short-beaked echidna, platypus…”. Well, that’s what Natural Logical Inference can do for us. It won’t be on the next iPhone perhaps, granted, but the potential already exists.

In this article I’ll explain this extraordinary potential of applying methods like Natural Logical Inference to text content. To do that, I’ll compare it to the better known field of Natural Language Processing, explaining that where NLP uses language to extract the facts that language states directly, NLI additionally uses syllogistic reasoning on top of those facts to generate entirely new knowledge - facts merely entailed by what the text says directly.

Then I’ll move on to how a system based on technologies like NLI, if scaled to the Internet level rather than that of an individual dataset, would effectively make much of human knowledge immediately available along with many of its entailments - like the cinematic trope of a computer-you-can-ask-anything.

If you would like to read more of my blogs, please visit Skim Technologies.

Natural Language Processing

Anyone with a passing familiarity with fields like NLP already understands one of its core motivations: getting access to the huge number of useful facts - or "insights", to give them their stage-name - that remain hidden away in free-text documents, where they can't be accessed programmatically.

To illustrate this, imagine you have in front of you a stack of hand-written documents. You ask the stack very sweetly, "Please show me a list of all the things these documents name as being mere mortals". But as you can imagine, nothing much will happen. You're just going to have to leaf through each page, squinting at a dozen or more different forms of cursive and keeping an organised list as you go - or you can pay someone as little as possible to do so for you of course.

But with an NLP-based system - perhaps one that assumes the burdens inherent in tasks like handwriting recognition, tokenisation, part-of-speech tagging, etc. - and assuming you have access to a document scanner, you can, in principle, ask it about mortals in the form of a query, and hopefully obtain a complete list. This query, greatly simplified, will say roughly this:

[*]->is-mortal

That is, extract any word (*) followed by “is” and then “mortal”. Obviously any practical query will need to be more intricate than this. It will need to avoid negations like “It’s not the case that Zeus is mortal”. It should use a Language Model to infer when a term is a synonym of “mortal” - or of “is” for that matter. Perhaps we’d need some Named Entity Recognition to determine that our subject is a Person and not an Animal, if it's just people we're interested in. And so on.
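To make the simplified query a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy sentences, the regular expression standing in for the query, and the crude list of negation cues. A practical system would use proper parsing rather than string matching.

```python
import re

# Toy corpus: each string stands in for one sentence of recognised text.
SENTENCES = [
    "Achilles is mortal.",
    "Hector is mortal.",
    "It's not the case that Zeus is mortal.",
    "Athens is a city.",
]

# Literal pattern for "<word> is mortal"; the capture group plays the role of [*].
PATTERN = re.compile(r"\b(\w+) is mortal\b")

# Crude, hypothetical negation cues; a real system would parse the sentence instead.
NEGATION_CUES = ("not the case", "is not", "isn't")


def extract_mortals(sentences):
    """Return the subjects that a sentence directly states to be mortal."""
    found = []
    for sentence in sentences:
        if any(cue in sentence.lower() for cue in NEGATION_CUES):
            continue  # skip sentences like "It's not the case that Zeus is mortal."
        found.extend(PATTERN.findall(sentence))
    return found


mortals = extract_mortals(SENTENCES)
print(mortals)  # ['Achilles', 'Hector']
```

Note that this only ever finds names that sit right next to the words "is mortal" - it works purely at the level of the text.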

In principle though, that's how the problem can be solved with NLP. You have now "unlocked your insights" as the PowerPoint slides all say, and can access your corpus much like any other data source (e.g. a database) - and further you could of course connect this data source to some sort of Virtual Assistant technology, so you can use your telephone to ask questions about who’s mortal anytime, anywhere.

So that's roughly why NLP is useful - but what about Natural Logical Inference?

Natural Logical Inference

Using NLP as above involved a roundabout way of finding what we wanted. We just wanted a list of the mortal. We wanted facts: the naming of names. We don't actually care about language, which was just the medium - a proxy - for finding that information. It's just that when working at the level of language, we need to exploit the fact that the terms we're interested in will usually be followed by others like "is mortal". And also working at the language level, we suffer all its attendant shortcomings. Imagine for example that one of our texts, one on human biology, contains the sentence - or rather, asserts the fact - that:

All men are mortal.

And a quite separate text much lower in our stack, this time a collection of Hellenic biographies, asserts another fact:

Socrates is a man.

Would our NLP system have returned the name Socrates in this case? Well, no: even if the two sentences were not so far apart, in totally separate works, the most elaborate language-level query could not return the name Socrates. Why not? Because understanding that these two premises mean we should return Socrates demands an appreciation of the concept of syllogistic entailment - it requires understanding our corpus at the semantic, rather than the linguistic, level. And that is what a system based on Natural Logical Inference would provide.

NLI derives new knowledge by taking an assertion made explicitly in the source text and modifying it to form a new assertion that may not have appeared at all. This modification just needs to happen in a very particular way - it needs, in other words, to be truth-preserving. To explain further, we need to understand a couple of concepts.

Hypernymy and hyponymy
The monotonicity of universal quantification.

Hypernymy and Hyponymy

These terms might sound obscure, but really they are fairly simple. A hypernym is just a term whose concept subsumes that of its corresponding hyponym. So for example Fruit is a hypernym of Apple - because all Apples are Fruits. And Apple is itself a hypernym of Bramley Apple - and so on. So Hypernymy represents descending the taxonomic tree towards the broader concepts near the trunk. And Hyponymy is climbing up the tree toward the finer-grained concepts at the leaves.

Looking back, we can see that both our example sentences above involve these concepts in one form or another. Taking the simplest case for now, the sentence “Socrates is a man” asserts that Socrates is a hyponym of man. That is, based on this sentence, the taxonomy representing our imaginary stack of papers can now be populated with these two terms, one the hypernym of the other. So now that we’ve generated a very simple taxonomy from our source text, how do we exploit that? And how does our second sentence involve hypernymy?
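For those who prefer code to trees, a taxonomy like this can be sketched as a simple mapping from each term to its immediate hypernym. This is only one possible representation, and the names HYPERNYM_OF, hypernyms and hyponyms are my own inventions for the example.

```python
# Each key is a hyponym of its value (hyponym -> immediate hypernym).
HYPERNYM_OF = {
    "Bramley Apple": "Apple",
    "Apple": "Fruit",
    "Socrates": "Man",  # populated from the sentence "Socrates is a man"
}


def hypernyms(term):
    """Walk towards the trunk, collecting ever broader terms."""
    chain = []
    while term in HYPERNYM_OF:
        term = HYPERNYM_OF[term]
        chain.append(term)
    return chain


def hyponyms(term):
    """Walk towards the leaves, collecting ever narrower terms."""
    result = []
    for child, parent in HYPERNYM_OF.items():
        if parent == term:
            result.append(child)
            result.extend(hyponyms(child))
    return result


print(hypernyms("Bramley Apple"))  # ['Apple', 'Fruit']
print(hyponyms("Fruit"))           # ['Apple', 'Bramley Apple']
print(hyponyms("Man"))             # ['Socrates']
```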

Universal Quantification

So a Quantifier is a concept - whose instantiations in Natural Language are terms like “most”, “some” or “many” - specifying which members of a given group we’re referring to. And the Universal Quantifier (which is actually a hyponym of Quantifier for those paying entirely too much attention) is the Quantifier that refers universally to the members of that group. In plain English, the Universal Quantifier does the same work as the term “all” or “for each” - but it’s represented formally with the ∀ symbol. And of course our second example sentence “All men are mortal” involves the word “all” and hence the concept of Universal Quantification. Expressed in first-order logic, the assertion looks like this:

∀x(Man(x)→Mortal(x))

The above is basically saying: For all the things (x), if that thing is a Man, then it follows (→) that said thing is Mortal.
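If the notation feels abstract, the same assertion can be checked mechanically over a small, finite set of things. The sketch below is an illustration only: the domain and the membership of the Man and Mortal sets are invented, and the predicates are modelled as plain Python sets.

```python
# Hypothetical finite domain; each predicate is just the set of things it holds for.
DOMAIN = {"Socrates", "Plato", "Zeus", "Bucephalus"}
MAN = {"Socrates", "Plato"}
MORTAL = {"Socrates", "Plato", "Bucephalus"}


def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q


def all_are(domain, first, second):
    """Evaluate 'for all x: first(x) -> second(x)' over the given domain."""
    return all(implies(x in first, x in second) for x in domain)


print(all_are(DOMAIN, MAN, MORTAL))             # True
print(all_are(DOMAIN, MAN | {"Zeus"}, MORTAL))  # False: Zeus would be a man but not mortal
```

Things that are not men, like Bucephalus the horse, never falsify the statement: the implication only bites when its left-hand side is true.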

Anyway, now we’ve armed ourselves with the concepts of hypernymy and hyponymy, along with a formalised representation of our sentence, we can use a truth-preserving property of Universal Quantification - namely, Monotonicity - to generate new assertions, and hence new knowledge, from our original statements.

Monotonicity, Upward and Downward

Monotonicity really means that something moves only in one direction (although not necessarily at a constant “speed” or in consistently-sized steps) and never backtracks.

The Universal Quantifier has the properties of being Downwards Monotonic with its first argument (in our sentence, the Man argument) and Upward Monotonic with its second (the Mortal argument). This is really important, because it means that we can take a Universally Quantified assertion and modify it by swapping out its first argument with hyponyms (narrower terms) and/or its second argument with hypernyms (broader terms) and the result will always be true if the original statement was.

Now, I would never insult my dear reader by implying that he or she had a familiarity with that staple of British pubs and chip shops, the Fruit Machine. But it might be helpful to imagine the leftmost Quantifier argument as a reel that one may Nudge to reveal a new hyponym (instead of, say, a Bell or a Lemon), while pressing Hold on the rightmost reel so it stays fixed. Each reel can be rotated in a constant direction, but never back the way it came.

If that’s clear, we can now generate new knowledge by climbing up the taxonomic tree - or rotating the reel - for the first argument (I’d show the opposite with the second argument, too, but I honestly can’t think of any hypernyms for “Mortal”, so for that one we’ll have to use the Hold button for now.):
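As a rough sketch of the idea in Python, assume a tiny taxonomy in which "Greek men" and "Athenian men" are hyponyms of "men". These entries, and the function names, are invented for the example. The first argument is swapped for narrower terms and the second for broader ones, although with nothing above "mortal" in this taxonomy the second reel simply stays on Hold.

```python
from itertools import product

# Hypothetical taxonomy, stored as hyponym -> hypernym. The Greek and Athenian
# entries are invented for illustration; nothing sits above "mortal".
HYPERNYM_OF = {
    "Greek men": "men",
    "Athenian men": "Greek men",
}


def narrower(term):
    """Return the term followed by all of its hyponyms."""
    terms = [term]
    for child, parent in HYPERNYM_OF.items():
        if parent == term:
            terms.extend(narrower(child))
    return terms


def broader(term):
    """Return the term followed by all of its hypernyms."""
    terms = [term]
    while term in HYPERNYM_OF:
        term = HYPERNYM_OF[term]
        terms.append(term)
    return terms


def entailments(first, second):
    """Statements entailed by 'All <first> are <second>' via monotonicity."""
    pairs = product(narrower(first), broader(second))
    return [f"All {a} are {b}." for a, b in pairs if (a, b) != (first, second)]


new_sentences = entailments("men", "mortal")
for sentence in new_sentences:
    print(sentence)
# All Greek men are mortal.
# All Athenian men are mortal.
```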

As we can see, each of our new sentences is just as true as the original. While this is progress, these new sentences aren’t exactly very interesting. But combined with the knowledge we obtained from our first sentence, that Socrates is a hyponym of Man, we can use Downwards Monotonicity to draw a more interesting entailment from our sentence:

In other words, the principles governing the Universal Quantifier allow us to swap out Man for Socrates. Once we do that, the first argument ranges over only a single individual (Socrates) instead of a group. Since we’re no longer talking about groups, the Universal Quantifier is now redundant and can be cancelled, leaving the simple assertion that Socrates is mortal. And so from two separate assertions in two unrelated works in our stack of papers, we have derived entirely new knowledge. An NLI-based system can therefore extract this particular piece of knowledge where a lexical NLP approach cannot.
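Putting the pieces together, the whole derivation can be sketched end to end. This is a toy under loud assumptions: the two sentences are parsed with hand-written regular expressions, the SINGULAR lookup that maps "men" to "man" is hypothetical, and the inference step is a simple loop rather than anything a production system would use.

```python
import re

SENTENCES = ["All men are mortal.", "Socrates is a man."]

# Hypothetical lookup standing in for real knowledge of English plurals.
SINGULAR = {"men": "man"}


def parse(sentences):
    """Split sentences into universal rules (A, B) and membership facts (x, A)."""
    rules, facts = set(), set()
    for sentence in sentences:
        match = re.fullmatch(r"All (\w+) are (\w+)\.", sentence)
        if match:
            group = SINGULAR.get(match.group(1), match.group(1))
            rules.add((group, match.group(2)))
            continue
        match = re.fullmatch(r"(\w+) is an? (\w+)\.", sentence)
        if match:
            facts.add((match.group(1), match.group(2)))
    return rules, facts


def infer(rules, facts):
    """Apply each rule 'All A are B' to every known member of A until nothing new appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for individual, category in list(derived):
            for first, second in rules:
                if category == first and (individual, second) not in derived:
                    derived.add((individual, second))
                    changed = True
    return derived - facts  # only the knowledge that was not stated directly


rules, facts = parse(SENTENCES)
new_knowledge = infer(rules, facts)
print(new_knowledge)  # {('Socrates', 'mortal')}
```

Neither input sentence contains the words "Socrates" and "mortal" together, yet the pair comes out the other end.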

This has just been one example of course. There are other kinds of quantifiers with different monotonicities - and some without clear interpretations. Plus, not all language is quantifiers. But even so NLI offers methods of extending the knowledge contained in a significant amount of natural language - and in a computationally tractable way.

The (Very) Long-Term Potential

So we’ve seen an example of how NLI can derive new knowledge from our hypothetical stack of papers. But now imagine further that our stack is in fact the whole internet. The benefit of a system that can reason in this way - and the possibility of connecting it to a Virtual Assistant - is now much more striking. Not only could you ask about who's mortal, but about practically any piece of information entailed by any combination of facts contained in our entire corpus.

Of course, I was perhaps a little optimistic in my introduction. There is plenty of technical and computational work to be done before we have our omniscient Virtual Assistant. To get NLI off the ground at all, the existing problems of NLP already mentioned need to have been solved - or largely solved. To do this sort of reasoning, our system needs to have been pre-loaded with, or to have learned, facts about the target language, like that "men" is the plural form of "man". We need context-sensitive Named Entity Recognition to disambiguate the word "Socrates" - mapping it to the concept of the Greek Philosopher as opposed, say, to the Brazilian footballer of the late 70s and early 80s (although this task might be confounded somewhat by mentions of Socrates the footballer having taken the Hippocratic oath).

Even so, with that work assumed, through NLI and similar knowledge-generating methods the required inferences can then be made programmatically with the application of some well-understood rules of logic and language - like hypernymy and hyponymy, and the monotonic properties of quantifiers and so on.

If you're interested in reading more of my blogs, visit https://www.skimtechnologies.com/