
Turning to data for automobile after-sales business

A case study on how Supertype engineers work with Toyota-Astra Motor’s data teams to implement an ETL pipeline that closely integrates with their Azure Blob Storage and Azure Machine Learning services, enabling machine learning experiments that are seamless and goal-oriented.

Toyota Astra Motor adopts data analytics to transform from a one-size-fits-all marketing approach to a data-driven initiative


Toyota Astra Motor (TAM), a joint venture between Toyota Motor Corporation and Astra International, is the de facto market leader in Indonesia’s automobile industry, with more than 295,000 car unit sales in 2021, significantly more than any of its peers.

TAM looks to secure and reinforce its market leadership by investing in an excellent after-sales service business that builds high customer loyalty and brand recognition. The challenge lies in the vastness of the data (~2 million cars sold over the last 10 years).

The ability to sift through the enormity of this customer database in a systematic and purposeful manner presents a challenge even for an organization such as TAM.

  • Laborious Process

    Tedious and time-consuming data preparation that averages 1,084 hours

  • Ineffective Outreach

    Manual after-sales promotions that are neither personalised nor effective

  • More than 2 million vehicle sales

    More than 2,000,000 automobile sales, which take considerable effort to process and systemise


  • A set of rigorous discovery processes that help surface insights on the automobile after-sales market landscape. This is an iterative process involving extensive hypothesis testing and exploratory analysis.
  • Implementations of two prediction algorithms that aid the after-sales business division of TAM:
    • Customer churn prediction: detect at-risk customers and the leading indicators of high churn risk
    • Product recommendation modelling: an unsupervised machine learning algorithm that produces personalised promotions for each customer segment
  • These deployments are done with the objective of greatly improving TAM’s customer retention rate through superior after-sales business over the next few years.
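As an illustration of what "leading indicators of high churn risk" can look like, here is a minimal sketch that derives recency and frequency features from a customer's workshop visits and applies a simple rule on top of them. The data, the function names (`churn_features`, `is_at_risk`) and the `overdue_factor` threshold are all hypothetical; the case study does not describe TAM's actual features or model, and in practice a trained classifier would replace the rule.

```python
from datetime import date

# Hypothetical service history: customer_id -> dates of past workshop visits.
service_history = {
    "C001": [date(2020, 1, 10), date(2020, 7, 12), date(2021, 1, 15)],
    "C002": [date(2019, 3, 1), date(2019, 9, 5)],
}


def churn_features(visits, as_of):
    """Derive simple recency/frequency features from one customer's visits."""
    visits = sorted(visits)
    # Days between each pair of consecutive visits.
    gaps = [(later - earlier).days for earlier, later in zip(visits, visits[1:])]
    return {
        "days_since_last_visit": (as_of - visits[-1]).days,
        "visit_count": len(visits),
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else None,
    }


def is_at_risk(features, overdue_factor=2.0):
    """Rule-based baseline (an assumption, not TAM's model): flag customers
    whose absence exceeds `overdue_factor` times their usual visit gap."""
    if features["mean_gap_days"] is None:
        return False  # a single visit is not enough history to judge
    return features["days_since_last_visit"] > overdue_factor * features["mean_gap_days"]


as_of = date(2021, 6, 1)
for customer_id, visits in service_history.items():
    features = churn_features(visits, as_of)
    print(customer_id, features, "at risk:", is_at_risk(features))
```

Features of this kind are easy to explain to a business team, which matters when the goal is to surface indicators and not only a score.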


R (Programming Language)

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.

Azure Machine Learning

Offered by Microsoft, Azure Machine Learning is a cloud-based service for building, training, and deploying machine learning models.

Data Visualization Libraries

ggplot2 is a data visualization package for the statistical programming language R, and is used to create complex, publication-quality plots with a few lines of code.

Machine Learning Libraries

Supertype uses a variety of machine learning libraries, including TensorFlow, Keras, and Scikit-learn, to build and train machine learning models.


1. Automation of Data Preparation Process.

Toyota Astra Motor (TAM) works with Supertype’s data engineers to systemise more than 5 years of raw automobile service data and customer data using Azure Blob Storage. The automation focuses on data cleansing and preparation, so Supertype’s consultants architect a pipeline that automates the necessary extract-transform-load (ETL) process.

The resulting data is in turn fed into Azure Machine Learning (AML) for machine learning model experimentation, a joint effort by both teams. TAM’s big data team is involved in validating the business assumptions made throughout the data preparation and model experimentation.
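To make the cleansing step concrete, the sketch below shows one way a transform stage might validate raw service records and set aside erroneous rows before anything reaches model experimentation. It uses only the Python standard library; the column names (`customer_id`, `vehicle_model`, `service_date`), the date format and the function `clean_service_records` are assumptions for illustration, not TAM's actual schema or pipeline. Reading from and writing to Azure Blob Storage is deliberately left out.

```python
import csv
import io
from datetime import datetime

# Hypothetical schema: fields every service record must carry.
REQUIRED_FIELDS = ("customer_id", "vehicle_model", "service_date")


def clean_service_records(raw_csv):
    """Split raw CSV rows into cleaned records and rejected rows with a reason."""
    cleaned, rejected = [], []
    for raw_row in csv.DictReader(io.StringIO(raw_csv)):
        # Trim whitespace; treat absent values as empty strings.
        row = {key: (value or "").strip()
               for key, value in raw_row.items() if key is not None}
        missing = [field for field in REQUIRED_FIELDS if not row.get(field)]
        if missing:
            rejected.append((row, "missing: " + ", ".join(missing)))
            continue
        try:
            row["service_date"] = datetime.strptime(
                row["service_date"], "%Y-%m-%d").date()
        except ValueError:
            rejected.append((row, "invalid service_date"))
            continue
        # Normalise casing so "avanza" and "Avanza" land in the same group.
        row["vehicle_model"] = row["vehicle_model"].title()
        cleaned.append(row)
    return cleaned, rejected


raw = """customer_id,vehicle_model,service_date
C001, avanza ,2021-03-14
C002,innova,14/03/2021
,fortuner,2021-05-02
"""
cleaned, rejected = clean_service_records(raw)
print(len(cleaned), "clean rows;", len(rejected), "rejected")
for row, reason in rejected:
    print("rejected:", reason)
```

Keeping rejected rows together with a reason, instead of silently dropping them, gives the business team something concrete to review when validating assumptions.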

2. Predictive Analytics and Unsupervised Learning.

Once the data is prepared in Azure Machine Learning, Supertype’s data scientists conduct predictive machine learning experiments to identify customers who are at risk of churn. The nature of the problem necessitates an open-ended, iterative approach, where unsupervised machine learning experiments are run against each hypothesis in service of a broader goal: product recommendations that feel personalised, incorporating data such as the vehicle model, type of parts, and last replacement record.

These machine learning experiments are incorporated into TAM’s workflow, allowing experiments to be run and their results stored in Azure Blob Storage, ready for consumption.
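The inputs named above (vehicle model, type of parts, last replacement record) can be combined in many ways. The sketch below uses a plain popularity count within each vehicle model, filtered by how recently the customer replaced a part. This is a simple heuristic swapped in for illustration, not the unsupervised model used in the project, and the records, the `recommend` function and its `recent_days` parameter are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical records: (customer_id, vehicle_model, part_type, replaced_on)
records = [
    ("C001", "Avanza", "brake pads", date(2021, 2, 1)),
    ("C002", "Avanza", "brake pads", date(2020, 11, 3)),
    ("C002", "Avanza", "battery", date(2021, 1, 20)),
    ("C003", "Avanza", "battery", date(2021, 3, 9)),
    ("C003", "Avanza", "oil filter", date(2021, 3, 9)),
    ("C005", "Avanza", "battery", date(2020, 8, 15)),
    ("C004", "Innova", "tyres", date(2021, 4, 2)),
]


def recommend(customer_id, records, as_of, top_n=2, recent_days=365):
    """Suggest parts popular among owners of the same vehicle model,
    skipping parts this customer replaced within `recent_days`.
    Simplification: assumes each customer owns a single vehicle model."""
    popularity = defaultdict(Counter)  # vehicle_model -> part_type counts
    last_replaced = {}                 # part_type -> latest date for this customer
    model = None
    for cust, vehicle_model, part, replaced_on in records:
        popularity[vehicle_model][part] += 1
        if cust == customer_id:
            model = vehicle_model
            if part not in last_replaced or replaced_on > last_replaced[part]:
                last_replaced[part] = replaced_on
    if model is None:
        return []  # unknown customer: nothing to base a suggestion on
    suggestions = []
    for part, _count in popularity[model].most_common():
        last = last_replaced.get(part)
        if last is None or (as_of - last).days > recent_days:
            suggestions.append(part)
    return suggestions[:top_n]


print(recommend("C001", records, as_of=date(2021, 6, 1)))  # ['battery', 'oil filter']
```

Here customer C001 is not offered brake pads, because they replaced them a few months earlier, even though brake pads are popular among other Avanza owners.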


Supertype helps us find insights & use cases based on the data we have through a deep understanding of the data & needs of our company.

Anthony Christian

Data Team, Toyota Astra Motor

Key Benefits

Improved offer in customer retention programs

Being able to identify, quantify, and hence reduce customer churn risk helps TAM retain its customers in a way that is data-driven and proactive. This also has the added benefit of improving customer satisfaction and after-sales service. An algorithm-driven product recommendation system helps the after-sales division with outreach and promotional messaging.

Highly conducive model experiments in 10-minute batches.

A more standardised ETL data pipeline, designed and implemented to integrate with TAM’s existing systems, improves clarity, predictability, and data quality, since erroneous data is automatically captured early in the pipeline. The automated ETL pipeline also replaces the manual labour of data processing and labelling, which was estimated to take more than 1,084 hours and cost more than $240,000 annually. Because it integrates with TAM’s existing data and technology stack, experiments on new data can be executed in spans of 10 minutes.

Contact Us

Want more case studies?

This case study, along with others in our archive, is available in PDF format. To schedule a call with us and learn more about our past projects, leave your contact method below.
