Adding some AI to just about anything (Part 1 – NLP)

I’ve been working on some fun and interesting things at work lately. One of them is that we are in the process of building out a chatbot. That is, you type in a question “some place”, and you get a rich response back. I say “some place” because several technologies allow you to write one bot, and then expose it over several “channels” or “services”. For example, you might write a chatbot and then make it available via Facebook Messenger, Slack, etc.

Generally though, pretty much every chatbot platform is geared towards the consumer market. That is, these bot frameworks support things like Facebook Messenger, Kik, Discord, etc. If you want something enterprise-facing, and if you are using Microsoft products (like Office 365), then the Microsoft Bot Framework is kind of a no-brainer. It’s relatively easy to implement, and it’s “pretty much” free to use via Azure (I think up to 10,000 requests per month). Better yet, it ties in with Microsoft Teams, Skype for Business, e-mail, SMS, and “Direct Line” (a catch-all that lets you expose the bot as a raw REST service), in addition to several other vendor-specific channels.

Natural Language Processing (NLP)

One issue we’ve been exploring is which NLP to use. An NLP takes loose, messy, potentially-invalid text from a fat-fingering user, and converts it into logical data structures for a developer to use.

For example, if a user is chatting with a pizza restaurant chatbot and says: “I want a medium cheese pizza” – as a developer, what I need to know is:

  • Intent – the core thing they want: to order a pizza. We’ll call that an “intent” of “order-pizza”.
  • Entities/Slots/Variables – the nuanced details from the request. We tell the NLP all of our best guesses of all the ways a user might ask this question, and the NLP will do its best to pull out the useful details. For example:
    • Size – the size of the pizza. Potential values are small, medium, large. We’ll call that the “pizza-size” entity.
    • Type – the type of pizza. Potential values are cheese, pepperoni, but we want to allow other arbitrary values. We’ll call that the “pizza-type” entity.

What we just came up with is a “speech model”, or an “interaction model” for our natural language processor.
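To make that concrete, here is a rough sketch of the same interaction model written down as plain Python data. This is only my own illustration of the idea: the class and field names (Entity, Intent, keywords, allow_free_text) are made up for this post and are not part of any NLP product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One detail the NLP should pull out of an utterance."""
    name: str
    # Known values we expect users to say.
    keywords: list[str] = field(default_factory=list)
    # True when values outside `keywords` are also acceptable.
    allow_free_text: bool = False


@dataclass
class Intent:
    """The core thing the user wants, plus the entities that refine it."""
    name: str
    entities: list[Entity]


# The pizza example from above: one intent, two entities.
order_pizza = Intent(
    name="order-pizza",
    entities=[
        Entity("pizza-size", keywords=["small", "medium", "large"]),
        Entity("pizza-type", keywords=["cheese", "pepperoni"], allow_free_text=True),
    ],
)
```

Writing it out like this before touching any tool is a handy way to agree on names with the rest of the team.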

Enter Wit.ai

A co-worker discovered this website:

www.wit.ai

This is Facebook’s NLP and it is free for personal AND commercial use. It is ONLY an NLP. Meaning, it will simply take human sentences, and using the training that you give it, will give you back its best “guess” on the intent and entities from that “utterance”. It doesn’t do anything beyond the translation of human speech or text to a JSON data structure.

image

To get started, I created a new project called PizzaSample (which you can see here). If I type in the “utterance” that I want it to understand, the NLP has absolutely no idea what it says:

image

What you have to do is define the intent, and entities as YOU want the NLP to process them. You need to “train” the NLP, like this:

image

When you get everything looking correct, you click “Validate” to save these changes, and to reinforce how you want the NLP to process these utterances. Next, I can modify the pizza-size entity. I change that to “keywords”, because I just want to recognize a distinct list of keywords. I then enter the other acceptable values, and synonyms for the same:

image

At this point, you just keep entering different variations, and you correct the NLP each time:

image

Training the Model:

Above is what you call “training” the model. You type in a sentence, and it gives you its best guess of what it means. If it’s correct, click “Validate” and the speech model will be more “confident” about that style of phrasing. If it was incorrect, use the interface to correct how it SHOULD process that sentence, then click “Validate” to have the model learn from this mistake.

You might have to do this a couple dozen times even for a simple model. A speech model starts off very, very dumb. But it does quickly get very smart, and in a short time, gets ridiculously accurate. Be patient!

As a practical example, the simple speech model for our app at work, which handles about 8 distinct types of questions, took 200+ training sentences/corrections. Now, it’s deadly accurate. So, it might take a couple hours, but it’s not a huge deal to train a model (depending on the complexity).

Using NLP in your app:

Within a couple of hours, you can have a speech model that can handle a handful of requests. But how do you use this in applications? Well, you’d generally interact with Wit.ai via a REST interface. Click on “Settings” of a project and you’ll get an Authorization header that you need to include, and then you just make a REST call with your question. For example:

https://api.wit.ai/message?v=20180616&q=I’d like to order a large pepperoni pizza

Except that sentence should be URL encoded. This means that almost ANY kind of app can use NLP – from a proper .NET or Java app, to a Bash or PowerShell script – or anything in-between!
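As a rough illustration, here is how that call could be put together in Python using only the standard library. Treat it as a sketch under a few assumptions: the function name build_wit_request and the placeholder token are mine, and I'm assuming the token from the Settings page is sent as a Bearer token in the Authorization header. The code only builds the request; it doesn't send it.

```python
import urllib.parse
import urllib.request

WIT_MESSAGE_URL = "https://api.wit.ai/message"


def build_wit_request(utterance: str, token: str, version: str = "20180616") -> urllib.request.Request:
    """Build (but do not send) a GET request for the Wit.ai message endpoint."""
    # urlencode takes care of the URL encoding: spaces become "+",
    # and characters like apostrophes become percent escapes.
    query = urllib.parse.urlencode({"v": version, "q": utterance})
    request = urllib.request.Request(f"{WIT_MESSAGE_URL}?{query}")
    # Assumption: the project token is passed as a Bearer token.
    request.add_header("Authorization", f"Bearer {token}")
    return request


# "YOUR_TOKEN_HERE" is a placeholder, not a real token.
request = build_wit_request("I'd like to order a large pepperoni pizza", "YOUR_TOKEN_HERE")
print(request.full_url)
```

To actually make the call, you would pass the request to urllib.request.urlopen and read the JSON from the response body.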

From that REST call, you’ll get a useful JSON result back – something like this:

{
  "_text": "I'd like to order a large pepperoni pizza",
  "entities": {
    "pizza_size": [
      {
        "confidence": 1,
        "value": "large",
        "type": "value"
      }
    ],
    "pizza_type": [
      {
        "confidence": 1,
        "value": "pepperoni",
        "type": "value"
      }
    ],
    "intent": [
      {
        "confidence": 1,
        "value": "order-pizza"
      }
    ]
  },
  "msg_id": "0ymBVXDhb2Pxq5jqA"
}

The significance here is that you took what the user said (including typos, unusual wordings, and poor or missing punctuation) and turned it into a concrete set of data structures that you, the developer, can act on.
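Here is a small sketch of what "acting on it" might look like in Python, using the sample response above (trimmed to the fields I need). The helper name first_value and the 0.8 confidence cut-off are my own choices for illustration, not anything Wit.ai prescribes.

```python
import json

# The sample response from above, trimmed to the fields used here.
raw = """
{
  "_text": "I'd like to order a large pepperoni pizza",
  "entities": {
    "pizza_size": [{"confidence": 1, "value": "large", "type": "value"}],
    "pizza_type": [{"confidence": 1, "value": "pepperoni", "type": "value"}],
    "intent": [{"confidence": 1, "value": "order-pizza"}]
  }
}
"""


def first_value(entities: dict, name: str, min_confidence: float = 0.8):
    """Return the first value for an entity, or None if it is missing or low-confidence."""
    candidates = entities.get(name, [])
    if not candidates or candidates[0].get("confidence", 0) < min_confidence:
        return None
    return candidates[0]["value"]


entities = json.loads(raw)["entities"]
intent = first_value(entities, "intent")
size = first_value(entities, "pizza_size")
kind = first_value(entities, "pizza_type")

if intent == "order-pizza":
    print(f"Ordering a {size} {kind} pizza")  # prints: Ordering a large pepperoni pizza
```

Returning None for anything missing or below the threshold gives the bot an easy hook to ask a follow-up question instead of guessing.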

Bottom line:

It’s kind of weird to talk about this like it’s normal – but adding natural speech to pretty much any app has reached the point where it’s:

  • Basically free – www.wit.ai is FREE for personal and commercial use, but also all the major vendors have an NLP offering and they are free or ridiculously cheap. For example, many have a “the first 10,000 translations each month are free” tier – so virtually free unless/until you start using this heavily.
  • Turnkey – this isn’t some obscure thing in academia, or some futuristic thing in a science-fiction movie. This is so mainstream that a “not particularly cutting edge, enormous enterprise” like where I work is working with these technologies.
  • Easy – before we knew any better, we all thought things like this would be super complex, but they are surprisingly simple. Like I said, starting from scratch, we had a useful, working speech model going within a couple of hours that was like 98% accurate. And then to use it is a simple REST call, and you get a JSON data structure back. Easy peasy.

I’m focusing on www.wit.ai in this blog post because it has a VERY low barrier-to-entry. It’s free, has a very simple UI, and is very effective. The great news is, this is a GREAT platform for proving out an idea, or for hobbyist projects. For example, if you have some project you’ve done on the Raspberry Pi in Python, why not add natural speech to it?

The bad news with Wit is that it’s not really a professional platform. They have no formal support – just an e-mail address. They respond within a day, but for mission-critical apps, that’s not OK. Last month, we experienced an outage overnight (Eastern Time), and it was down for around 9 hours, because they don’t have after-hours support. The other issue is they don’t support the concept of “environments” like dev, qa, prod. This is a big deal for a legitimate app because with each sprint you will probably be adding new intents, entities, and training data. However, with Wit, you’d need to do that with separate projects, and to go to prod, you’d have to destroy the production model and re-create it with your new model from QA. Not cool!
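If you do go the separate-projects route, the app at least needs to know which project's token to use in each environment. Here is one possible sketch in Python. The environment variable names (WIT_TOKEN_DEV and so on) and the function wit_token_for are hypothetical, and this only covers picking the right token; it does nothing to solve the problem of promoting a model from QA to prod.

```python
import os

# Hypothetical variable names: one Wit.ai project (and token) per environment.
TOKEN_VARS = {
    "dev": "WIT_TOKEN_DEV",
    "qa": "WIT_TOKEN_QA",
    "prod": "WIT_TOKEN_PROD",
}


def wit_token_for(environment: str, env=os.environ) -> str:
    """Look up the token for the Wit.ai project backing this environment."""
    try:
        var_name = TOKEN_VARS[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment!r}") from None
    token = env.get(var_name)
    if not token:
        raise RuntimeError(f"{var_name} is not set")
    return token
```

Passing the mapping in as a parameter (it defaults to os.environ) keeps the lookup easy to exercise without touching real environment variables.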

So with that said, if you are in a legit development environment, you might want to go with an enterprise-aware NLP from one of the typical, major IT vendors.

Posted in Computers and Internet, General, Infrastructure, Machine Learning, New Technology
