MLOps aka Operational AI (Part-2)

In the previous post, we saw a brief introduction to MLOps, its steps, and the challenges of running ML in production.

In this post, we will discuss two things:

  1. Tools
  2. Team composition

Popular MLOps Tools — Cloud options and Open Source

Team composition:

One of the biggest challenges in MLOps-centric projects is that they require a wide variety of skills, and most of the time that mix is difficult to find. In my experience, the bare minimum skills are as follows:

Data Engineering Team:

  1. Strong Python skills (functional as well as object-oriented) and Spark
  2. Strong ETL knowledge
  3. At least one orchestration tool
  4. Strong SQL skills, plus data lake and data warehouse (DWH) concepts
  5. A good understanding of the machine learning workflow
  6. Basic GitHub skills and familiarity with the standard DevOps culture of code versioning and CI/CD
  7. The basic concepts of Docker and containerization
  8. A mindset of stretching beyond one's boundaries and comfort zone
  9. Implementation knowledge of at least one cloud platform (expert level is not required)

Data Science Team:

  1. Data scientists with good coding skills
  2. A subject-matter expert (SME) for the use case and domain of implementation
  3. A mindset of stretching beyond one's boundaries and comfort zone

DevOps Engineer:

I strongly recommend having a DevOps expert with sound hands-on experience implementing CI/CD (tools like Jenkins, JFrog, GitHub, Azure DevOps, etc.) across environments: Dev, QA, and Production.
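
To make the idea of CI/CD across environments a little more concrete, here is a minimal sketch of a promotion gate. It is only an illustration under my own assumptions: the names ModelRelease, promote, and tests_passed are hypothetical and do not come from Jenkins, JFrog, GitHub, Azure DevOps, or any other tool. The only thing taken from the text above is the order of environments.

```python
from dataclasses import dataclass

# Environment order taken from the post: Dev, then QA, then Production.
ENVIRONMENTS = ["Dev", "QA", "Production"]


@dataclass
class ModelRelease:
    """Hypothetical record of a model build moving through environments."""
    version: str
    environment: str = "Dev"
    tests_passed: bool = False


def promote(release: ModelRelease) -> ModelRelease:
    """Move a release to the next environment only if its checks passed."""
    if not release.tests_passed:
        raise ValueError(
            f"{release.version} has failing checks in {release.environment}"
        )
    position = ENVIRONMENTS.index(release.environment)
    if position == len(ENVIRONMENTS) - 1:
        raise ValueError(f"{release.version} is already in Production")
    # Checks must be rerun in the new environment, so the flag is reset.
    return ModelRelease(release.version, ENVIRONMENTS[position + 1], tests_passed=False)
```

In a real pipeline the CI/CD tool enforces this gate, not application code. The sketch only shows the rule the DevOps expert is there to guarantee: nothing moves forward until the checks in the current environment are green.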

Professional QA Engineer:

Most of the time we ignore this role and try to cover testing through the data engineering (DE) team, but a professional QA engineer is a must-have.

Please note, I have skipped other players/teams in the project to stay focused on MLOps.

So, that's it for now.

Please leave a comment with your thoughts and any challenges you have faced, or are facing, with team composition or choosing the right toolset.

I have provided a reference where you can find the whole ecosystem of tools.

By now, you should have some idea of the tools and the team an MLOps project needs. In the next post, I will discuss various architectures.

A data enthusiast and lead architect with 17 years of experience in AI engineering, BI, data warehousing, dimensional modeling, ML, and big data