Deploying AI to production is expensive and painful. But it shouldn't be. We believe integrating AI should be predictable, easy, and freely scalable.

We built Mighty Inference Server to take the pain out of model hosting and performance tuning: it automatically gives you the highest possible throughput and lets you scale with predictable costs, on your terms and in your stack.

What is Mighty?

A fast and lean transformer model inference web server

The best things about Mighty are:

  • Works with most Transformer models
  • Is production-ready and can lower your latency by up to 80%
  • Scales linearly with the number of cores and instances you cluster
  • Has minimal dependencies and is easy to download and run
  • Works great on both CPU and GPU
  • Is highly secure and stateless

Question Answering

Faster Extractive Question-Answering

Give it a question and some context, and get back the highlighted answer! Turn any site search into a powerful question-answering product.

Just run mighty --question-answering from the command line!
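Once the server is up, clients talk to it over HTTP. The sketch below builds a question-answering request URL; the host, port, and parameter names are assumptions for illustration, so check the Mighty docs for the actual endpoint and schema.

```python
from urllib.parse import urlencode

# Hypothetical address of a locally running Mighty instance -- the
# host, port, and query-parameter names are assumptions, not the
# documented API.
MIGHTY_URL = "http://localhost:5050"

def qa_request_url(question: str, context: str) -> str:
    """Build the query URL for an extractive QA request."""
    query = urlencode({"question": question, "context": context})
    return f"{MIGHTY_URL}/?{query}"

url = qa_request_url(
    "Where is the Eiffel Tower?",
    "The Eiffel Tower is a landmark in Paris, France.",
)
# Send it with any HTTP client, e.g. urllib.request.urlopen(url)
```

Because the request is a plain GET, the same call works from curl, a browser, or any language with an HTTP client.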

Semantic Search

Faster Sentence Embeddings

Uses the best and fastest sentence-transformer and cross-encoder models to power your neural search.

Just run mighty --sentence-transformers from the command line!
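The server returns one embedding vector per input text; ranking documents for semantic search is then a client-side similarity computation. A minimal sketch, using dummy vectors as stand-ins for real model output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Dummy embeddings standing in for vectors returned by the server.
query_vec = [0.1, 0.9, 0.2]
doc_vecs = {"doc-1": [0.1, 0.8, 0.3], "doc-2": [0.9, 0.1, 0.0]}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]),
                reverse=True)
```

In production you would precompute document embeddings once, store them in a vector index, and only embed the query at search time.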


Text Classification

Faster Sentiment Analysis and Topic Labelling

Discover and classify text, or automatically label millions of documents for better search and recommendations.

Just run mighty --sequence-classification from the command line!
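A sequence-classification response maps candidate labels to scores; picking the winner is a one-liner on the client. The response shape below is an assumption for illustration, so verify the actual JSON schema against the Mighty docs.

```python
# Hypothetical response shape: parallel lists of labels and scores.
response = {
    "labels": ["positive", "negative", "neutral"],
    "scores": [0.91, 0.05, 0.04],
}

def top_label(resp: dict) -> str:
    """Return the highest-scoring label from a classification response."""
    best = max(zip(resp["labels"], resp["scores"]), key=lambda p: p[1])
    return best[0]

label = top_label(response)
```

For bulk labelling of millions of documents, batch the texts per request and keep only labels above a confidence threshold.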

Named Entity Recognition

Faster and more accurate Named Entity Recognition

Accurately extract named entities such as People and Locations at scale.

Just run mighty --token-classification from the command line!
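Token classification returns one tag per token; a common client-side post-processing step is grouping contiguous tags into entity spans. The BIO tag scheme below is an assumption for illustration, not necessarily Mighty's output format.

```python
def group_entities(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    entities, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # Start of a new entity; close any open one first.
            if current:
                entities.append(current)
            current = (tag[2:], [tok])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            # Continuation of the open entity.
            current[1].append(tok)
        else:
            # "O" tag or mismatched continuation ends the open entity.
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return [(label, " ".join(toks)) for label, toks in entities]

tokens = ["Barack", "Obama", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
entities = group_entities(tokens, tags)
# -> [("PER", "Barack Obama"), ("LOC", "Paris")]
```

This grouping is what turns raw per-token model output into the People and Locations mentioned above.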