Why To Build Your Own Natural Language Processor

The first question to ask yourself is: why build my own when there are so many others out there? There are a few reasons:
  1. Domain Knowledge Bias: Many of the available processors are built on global shared models, and because everyone uses those models, they learn whatever everyone is training them on. For your specific domain you may want to use your own model. Some providers allow you to swap out the model, in which case you can simply build your own model rather than a full provider.
  2. Language or Culture Bias: Sometimes a model does not support your language and you cannot swap out the model, or it does support your language but some cultural nuances are not handled. If the problem is in the model, you can simply swap it out, but if it is part of the algorithm, you cannot.
  3. Algorithm Failure: The algorithm fails for given scenarios, or you need to inject additional rules to provide context and the processor does not allow for it.
  4. Cost: Using your own means you can host it wherever you like and control the cost. This can be beneficial if you have high volumes of traffic and there is no open source library that suits your purposes.
For DAIN and DIANA it was all 4. Because their domain is mentoring, we needed to swap out the models. For language, there were issues with cultural nuances where words had multiple meanings that existing algorithms did not understand. We also had the Reputation Engine, which could add extra context to the NLP, and that context could not be injected into the other processors.
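To make the difference between swapping a model and changing the algorithm a little more concrete, here is a minimal Python sketch. It is not the DAIN or DIANA code, and every name in it (`Model`, `DictModel`, `Processor`, `prefer_region_sense`, the `region` context key and the toy lexicon) is a hypothetical stand-in that I made up for illustration. It assumes a processor where the model is a pluggable object and extra context rules can be injected by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Protocol


class Model(Protocol):
    """Hypothetical model interface: maps a token to its candidate meanings."""

    def senses(self, token: str) -> List[str]: ...


@dataclass
class DictModel:
    """Toy domain model backed by a plain dictionary (illustration only)."""

    lexicon: Dict[str, List[str]]

    def senses(self, token: str) -> List[str]:
        return self.lexicon.get(token.lower(), ["unknown"])


# A context rule gets the token, its candidate senses and caller-supplied
# context, and returns a (possibly narrowed) list of senses.
ContextRule = Callable[[str, List[str], Dict[str, str]], List[str]]


@dataclass
class Processor:
    """Toy processor: the model is swappable and rules can be injected."""

    model: Model
    rules: List[ContextRule] = field(default_factory=list)

    def interpret(self, text: str, context: Dict[str, str]) -> Dict[str, List[str]]:
        result: Dict[str, List[str]] = {}
        for token in text.split():
            senses = self.model.senses(token)
            for rule in self.rules:
                senses = rule(token, senses, context)
            result[token] = senses
        return result


def prefer_region_sense(token: str, senses: List[str], context: Dict[str, str]) -> List[str]:
    """Keep only senses tagged with the caller's region, if any exist."""
    region = context.get("region")
    preferred = [s for s in senses if s.startswith(f"{region}:")]
    return preferred or senses


# Toy lexicon: one word, two regional meanings (made-up data).
model = DictModel({"chips": ["uk:fried potato strips", "us:thin potato crisps"]})
processor = Processor(model, rules=[prefer_region_sense])
print(processor.interpret("chips please", {"region": "uk"}))
```

In this sketch, reason 1 corresponds to replacing `DictModel` with your own model, while reasons 2 and 3 correspond to the `rules` list: if a provider gives you no such hook, the only way to add one is to own the processor itself.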

Now that you know why you might need to, you can read my next article, How To Build Your Own Natural Language Processor, which is coming next Monday.

If you are interested in learning more about building your own, reach out and let's discuss. We have created a ByoNlpQuickStart that can help you get started. I will be providing that QuickStart and others to SitecoreDain subscribers soon. If you cannot wait, email me at chris.williams@readwatchcreate.com and, if you are a subscriber, I can release an early version to get you started now.
