Putting AI models into production is notoriously difficult. The challenges are many and nuanced, and covering them all in one short blog post would be overly ambitious. Instead, this post briefly summarizes the fundamental challenges of productionizing AI models and how we at 2021.AI address them on our AI platform, Grace.



Productionizing AI models is hard, and many algorithms are developed only to be stowed away in their repos and never put to good use. In most cases, the main challenge is that the existing infrastructure has grown so complex that adapting it to accommodate an AI environment becomes difficult, or sometimes infeasible.

It is often necessary to develop or adopt entirely new platforms solely to support an AI agenda. In its mature phase, this infrastructure often has two main characteristics:

  1. Firstly, it is specialized to support a flavor of software that does not necessarily adhere to standard notions of software development. AI software differs from traditional software because AI is needed precisely where the desired behavior cannot be obtained without depending on external data. As a result, the traditional practice of neatly delineated modular design and strict abstraction boundaries becomes harder to maintain, or simply erodes over time.
  2. Secondly, since AI is a relatively immature field when it comes to integration with other software, as much as 95% of the code in a mature project can end up being “glue code” (code that adapts different modules to each other), while only about 5% is actually AI-related code.

In developing the production capabilities of Grace, we try to address both of these challenges.


Grace is designed with machine learning and data science in mind. This means that the whole journey, from data ingestion through verification, feature extraction, model training, and production to scaling, has been considered.

The production process on Grace is designed around two simple principles listed below:

  1. Minimizing time-to-production
  2. Encouraging the development of common APIs to minimize glue code

Figure: Production process on the Grace AI Platform

The first point (minimizing time-to-production) is achieved by supporting a limited set of libraries and requiring that models be delivered in a certain format (see below for details). If those requirements are met, deployment to production can be done with a few simple commands. We are thus aiming for a middle ground: our “one-click production” supports the most popular model types with ease of use, while more exotic variants need tailoring, which the platform is open to.

Figure: Grace AI Platform, AI models

From the user's perspective, the second point (development of common APIs) is mainly achieved through the Docker registry. Have you built a pipeline that handles a variety of multiclass classification problems and does model optimization? Or perhaps trained a general neural network to recognize faces? Great! Submit it to the platform, and the image will be available in the registry for your colleagues to use or develop further. Having common components like this across a platform makes it easier to maintain and update code.
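To make the idea of a common API concrete, here is a minimal Python sketch. It is not Grace's actual interface: the `Model` protocol, the `ThresholdClassifier` class, and the `serve` function are all hypothetical names chosen for illustration. The point is only that when every model exposes the same `predict` method, the serving code needs no model-specific glue.

```python
from typing import Dict, List, Protocol, Sequence


class Model(Protocol):
    """Hypothetical shared interface: any object with a matching predict()."""

    def predict(self, rows: Sequence[Sequence[float]]) -> List[int]:
        ...


class ThresholdClassifier:
    """Toy model: labels a row 1 if the sum of its features exceeds a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, rows: Sequence[Sequence[float]]) -> List[int]:
        return [int(sum(row) > self.threshold) for row in rows]


def serve(model: Model, rows: Sequence[Sequence[float]]) -> Dict[str, List[int]]:
    """Generic serving function: works for any model that follows the interface."""
    return {"predictions": model.predict(rows)}


response = serve(ThresholdClassifier(1.0), [[0.2, 0.3], [0.9, 0.4]])
print(response)
```

Any other model that implements `predict` with the same shape can be passed to `serve` unchanged, which is the property that keeps glue code from accumulating.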


The architecture consists of six main components and is best illustrated with an example (see figure below).

Figure: AI model architecture

The user points to a project that contains the necessary requirements, using either the CLI or the Grace frontend. Through a RESTful API, one or several commands are sent to the message queue for execution. The available CLI commands are build, start, stop, delete, and list.
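The flow from command to message queue can be sketched roughly as follows. This is an illustration under assumptions, not Grace's implementation: only the five command names come from the text, while the `Command` dataclass, the `submit` function, the project name, and the use of Python's in-process `queue.Queue` as a stand-in for the real message queue are hypothetical.

```python
import queue
from dataclasses import dataclass

# The five CLI commands named in the text.
COMMANDS = ("build", "start", "stop", "delete", "list")


@dataclass(frozen=True)
class Command:
    """Hypothetical message format: a command name plus the target project."""

    name: str
    project: str


def submit(mq: queue.Queue, project: str, *names: str) -> None:
    """Validate all command names first, then enqueue them in order."""
    for name in names:
        if name not in COMMANDS:
            raise ValueError(f"unknown command: {name!r}")
    for name in names:
        mq.put(Command(name, project))


# In-process stand-in for the real message queue.
mq = queue.Queue()
submit(mq, "churn-model", "build", "start")
print(mq.qsize())
```

Validating every name before enqueuing anything means a request with one bad command leaves the queue untouched, rather than half-executed.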

Composite commands combine several of these; deploy, for example, consists of build followed by start. If the source is unknown, i.e. no image for it exists in the Docker registry, the image builder takes the necessary code components from the source and builds a new image. This image is saved in the registry and passed on to a component that handles deployment to the container management system.

If the image has already been built, it will be listed as part of the registry, and the user can choose to deploy that image along with model metadata. Either way, a URL for submitting requests to the model, based on the project name, is returned to the user.
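The deploy logic described above (build only if the image is missing, then start, then return a URL based on the project name) might look something like the sketch below. Everything here is assumed for illustration: the registry and the container management system are modelled as plain dictionaries, and the `:latest` tag, the `https://grace.example` base URL, and the `/predict` path are invented placeholders, not Grace's real scheme.

```python
def build(project: str, registry: dict) -> str:
    """Stand-in for the image builder: store an image tag in the registry."""
    image = f"{project}:latest"  # hypothetical tagging scheme
    registry[project] = image
    return image


def start(image: str, project: str, running: dict,
          base_url: str = "https://grace.example") -> str:
    """Stand-in for deployment: mark the image as running, return its URL."""
    running[project] = image
    return f"{base_url}/{project}/predict"  # hypothetical URL layout


def deploy(project: str, registry: dict, running: dict) -> str:
    """Composite command: build only when no image exists, then start."""
    image = registry.get(project)
    if image is None:
        image = build(project, registry)
    return start(image, project, running)


registry, running = {}, {}
url = deploy("churn-model", registry, running)
print(url)
```

Because `deploy` checks the registry first, calling it for a project whose image already exists reuses that image instead of rebuilding it.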



In summary, Grace offers viable solutions to well-known problems in the production phase of machine learning models. This is important because it gives the data scientist a set of specific requirements to live up to. If these requirements are fulfilled, deployment and subsequent scaling of any model will be easy.

Further, the flow from data scientist to the registry and to deployment in the container management system makes it easy to maintain and deploy new models. This is an important point, as any mature AI pipeline needs maintenance on a regular basis and new data scientists should be able to understand the process of older projects, even if they were not involved. Overall, we aim to make the process around productionizing transparent and open to configuration, while adhering to a certain standard which enables fast and reliable deployment.
