


Communications, Media & Telecom - Development and Implementation of a Fully Automated MLOps Platform
A fully automated MLOps platform significantly reduced the time and effort required of the data science teams and made deployment of new and existing ML models three times faster.
Initial situation
The customer’s machine learning models had to be trained, tuned, hosted, and deployed by hand, and experiments were tracked manually. Every model change or new ML use case therefore cost the data science teams significant time and effort.
Architecture
The architecture is built on AWS, using technologies such as Apache Airflow, Apache Spark, AWS Glue, Amazon Aurora, Amazon SageMaker, Terraform, GitHub, FastAPI, Splunk Observability, and MLflow.
Generated benefits
As a result, “moving to production” incurs no cost: all data scientist workflows already run in the production environment, governed by a special IAM authorization model. Data scientists develop new ML models in an autonomous workflow and deploy them live in production three times faster than before, with automated model training, experiment tracking, model hosting, and serving.
Services accomplished by synvert
The synvert team implemented an ML platform that is fully automated, testable, observable, and well documented. The team developed a modular, building-block design based on declarative RESTful services and managed components, which together minimize the need for DevOps or engineering support and ensure rapid deployment of new and existing models.
Both existing and new ML use cases operate in a highly available and stable environment, thanks to significant investments in testing strategies.
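To make the idea of a declarative service more concrete, the following is a minimal sketch of how a model could be described as data and expanded into the automated steps named above (training, experiment tracking, hosting, serving). It is an illustration only, not the platform's actual interface: the `ModelSpec` fields, the stage names, and the `parse_spec` and `plan` helpers are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical stage names, mirroring the automated steps listed in the case study.
STAGES = ("train", "track_experiment", "host", "serve")


@dataclass(frozen=True)
class ModelSpec:
    """Declarative description of a model; every field name is an assumption."""
    name: str
    version: int
    training_image: str
    hyperparameters: dict = field(default_factory=dict)


def parse_spec(payload: dict) -> ModelSpec:
    """Validate a request body such as a RESTful service might receive."""
    missing = {"name", "version", "training_image"} - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(payload["version"], int) or payload["version"] < 1:
        raise ValueError("version must be a positive integer")
    return ModelSpec(
        name=payload["name"],
        version=payload["version"],
        training_image=payload["training_image"],
        hyperparameters=dict(payload.get("hyperparameters", {})),
    )


def plan(spec: ModelSpec) -> list[str]:
    """Expand a spec into the ordered steps an orchestrator would run."""
    return [f"{stage}:{spec.name}:v{spec.version}" for stage in STAGES]


if __name__ == "__main__":
    # Example payload; the image name is a placeholder.
    spec = parse_spec(
        {"name": "churn", "version": 2, "training_image": "example/train:latest"}
    )
    for step in plan(spec):
        print(step)
```

In a design like this, the data scientist only states what should exist, and the platform derives how to get there, which is one way such an approach could reduce the need for engineering support.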