Articles
Articles, musings, and general observations from the Supertype Collective and Foundry team
Unveiling YouTube Insights – Introduction, Data Collection, Data Processing, and Database (Part 1)
In this post, we will develop a website that integrates sentiment analysis techniques and a Large Language Model to provide a comprehensive understanding of YouTube comments, enabling users to extract...
Unveiling YouTube Insights – Django REST Framework, API, and AI Chatbot (Part 2)
In this post, we will develop a website that integrates sentiment analysis techniques and a Large Language Model to provide a comprehensive understanding of YouTube comments, enabling users to extract...
Building a Streaming Data Pipeline with Open Source Stacks | Analyzing and Visualizing Data in a Dashboard (Part 4)
In the fourth and final part of the article, we will perform some analytical queries on the MySQL database and create a dashboard using Streamlit.
Building a Streaming Data Pipeline with Open Source Stacks | Transactional Data Ingestion and Processing (Part 3)
In the third part of the article, we will demonstrate how to ingest and process transactional data using Kafka and Spark Streaming.
Building a Streaming Data Pipeline with Open Source Stacks | Project Overview and Environment Setup (Part 1)
In the first part of the article, we will discuss the overview of the project and how to set up the environment using Docker Compose.
Building a Streaming Data Pipeline with Open Source Stacks | OLTP and OLAP Databases Setup (Part 2)
In the second part of the article, we will walk through the design and implementation of OLTP and OLAP databases using Cassandra and MySQL respectively.
Using Airflow and YouTube API to Automatically Retrieve Trending Videos
This article explains how to set up a workflow using Apache Airflow to retrieve trending video data from the YouTube Data API v3 and store it in BigQuery.
Twitter Sentiment Analysis – Use Cases (Part 4)
End-to-end machine learning project on sentiment analysis. In this post, we will explore some use cases of Twitter sentiment analysis in the field of business and social science.
Building a Streaming Pipeline for Warehouse Inventory Management System
A practical demonstration of building a streaming analytics pipeline for warehouse inventory management on Google Cloud Platform
Data Warehouse Modernization with BigQuery
Detailed documentation on building a modern data warehouse in BigQuery, featuring storage and performance optimization
Optimizing full text search in Postgres and BigQuery | Part 2 of Optimized Analytics Applications
A discussion on full text search and indexing strategies in Postgres and BigQuery.
Optimizing queries in Postgres and BigQuery | Part 1 of Optimized Analytics Applications
A whirlwind tour of query optimization strategies ft. query plans, index scans, and BigQuery-specific optimizations
Calling for Quizmasters for Supertype Fellowship | The Fellowship Thesis
Quizmasters are the ones that pull the strings behind Fellowship, and they are the supply line for well-crafted, well-engineered Challenges on Supertype Fellowship
Data-driven advertising | 3 use-cases
How I use data analytics to deliver value to my advertising clients: a case study on creative decay analysis, retention rate analysis, and exploratory data analysis (Python code provided)
Technical Writing (Why & How to Write Well) for Analytics Professionals
Technical writing is the skill most overlooked by software engineers and analytics professionals. This is a set of pedagogical strategies and practical tips to improve your technical writing.
The power of data storytelling, and how to do it with Tableau
Leveraging Tableau to tell compelling stories that inspire action
Twitter Sentiment Analysis – Creating Dashboard and Deploying Model with Streamlit (Part 3)
End-to-end machine learning project on sentiment analysis. In this post, we will walk through the steps of creating a dashboard and deploying our model as a web app with Streamlit.
Twitter Sentiment Analysis – Data Preprocessing and Model Building (Part 2)
End-to-end machine learning project on sentiment analysis. In this post, we will walk through the steps of data preprocessing and model building.
Twitter Sentiment Analysis – Introduction and Data Collection (Part 1)
End-to-end machine learning project on sentiment analysis. In this post, we will walk through the data collection process using the distant supervision method.
Applying data analytics to digital advertising (A workflow recap)
How Supertype's data scientists help a digital advertising client increase response rate by more than 20% (a workflow recap)
Setting up (or transitioning to) Docker Compose v.2 on Linux
Upgrading to Docker Compose V2 from legacy V1 without Docker Desktop
Introduction to Apache Airflow
Everything you need to know to get started using Apache Airflow
Serving PyTorch Models Using TorchServe
How to use TorchServe to serve your PyTorch model (detailed TorchServe tutorial)
The Essential Guide to Docker
Demystifying Docker, Docker Compose & Docker Desktop (comprehensive Docker guide / Docker tutorial) for beginners, ft. a real Streamlit app example.
Deploying Machine Learning models with Vertex AI on Google Cloud Platform
Most data scientists prototype their ML models using a notebook. Fortunately, GCP offers Vertex AI Workbench, a Jupyter-based infrastructure where you can host and manage your notebook for development.
Software testing isn’t quality assurance
Why hiring software developers to write great software tests (unit tests) is not the same as performing quality assurance.
Diving into Google Data Studio (Part 2)
In part 2 of the Google Data Studio series, we take a deeper look into how Data Studio compares to other business intelligence tools
Django REST + custom permissions + Talend API / Postman = ❤️
Developing in real life: Django REST Framework + testing your custom permissions in an API application with Talend API Tester or Postman
River water level monitoring with Google Data Studio
In part 1 of Developing a River Water Level Monitoring System with Google Data Studio, Adi walks us through a practical introduction to Data Studio.
An introduction to RFM Analysis in R
Customer segmentation and identifying at-risk customers with RFM Analysis. A simple walkthrough with examples in R.
Automated Keywords Extraction from Job Descriptions on Indeed.com
Working with app review analytics: the data science methodologies, tools and problem-solving frameworks you need to make sense of customer reviews
Data Science for App Review Analysis
Working with app review analytics: the data science methodologies, tools and problem-solving frameworks you need to make sense of customer reviews
Decision boundaries with PCA and FAMD
Multiple approaches to dimensionality reduction for pattern discovery, visualization and drawing decision boundaries (PCA, FAMD, PCAmix)
Topic Extraction from Text
A journal entry on topic extraction from text (app reviews, e-commerce text reviews), using supervised, unsupervised and semi-supervised approaches
Hire from Indonesia
Unsure about hiring data analysts from Indonesia? Read on to see my thoughts on why this is an untapped market your business should seriously consider.