Andrew Tirto Kusumo

Senior Data Engineer

Building data infrastructure that scales.

About

Data Engineer with 7+ years of experience building scalable data pipelines, streaming architectures, and analytics platforms across fintech companies. Passionate about making teams work faster and better through reliable data infrastructure.

Languages

Python, SQL, Scala, Java

Big Data

Apache Spark, Kafka, Pub/Sub

Orchestration

Apache Airflow, Dagster, Prefect

Cloud

AWS (Glue, Redshift, S3, EMR); GCP (BigQuery, Dataflow); Azure

Data Modeling

dbt, Data Vault, Star Schema

Databases

PostgreSQL, MySQL, MongoDB, Cassandra, DynamoDB

DevOps / Infra

Docker, Kubernetes, Terraform, CI/CD

Other

Git, Linux, REST APIs

Experience

Funding Societies

Senior Data Engineer

Nov 2024 — Present

Airflow, AWS, Snowflake, QLIK, Python, Docker

• Optimized ETL pipeline layers from 4 hours to 2.5 hours per run (~37.5% improvement)
• Led the Finance & Risk DE Team to generate multiple reports for Finance Closing, FP&A Reports, ECL, and Regulatory Reports
• Handling critical pipelines shared via Snowflake SharedDB to key external partners
• Acting as sprint leader, bridging DA requirements to the DE Team
• Migrating legacy pipelines to a more sustainable approach using ECS
• Assessing SDLC changes from Product/Engineering to identify impacts on DA Dashboards
• Maintaining and optimizing Snowflake costs with plans for further reduction
• Improving team performance through hands-on guidance and streamlined documentation

Paper.id

Senior Data Engineer

May 2024 — Oct 2024

Airflow, dbt, BigQuery, ArangoDB, Datastream, Pub/Sub, Python, Docker

• Built a streaming pipeline from scratch using Google Datastream, Pub/Sub, and Dataflow to ingest data from App DB to BigQuery
• Fixed existing dbt ELT inefficiencies, improving development time by ~100%
• Reduced BigQuery costs by ~20% per month through targeted optimization
• Created a cost management dashboard tracking project-level spend daily

Flip.id

Data Engineer Manager

Dec 2022 — May 2024

Senior Data Engineer

Sep 2021 — Dec 2022

dbt, BigQuery, Dataflow, Datastream, Pub/Sub, Python, Docker, GitLab CI

• Built a streaming pipeline from scratch using Google Datastream, Pub/Sub, and Dataflow
• Created end-to-end ELT pipelines with dbt, implementing tests and query dependencies
• Reduced BigQuery costs by ~20% per month through strict partitioning and clustering
• Built and managed the Data Engineer team from zero: hiring, career framework, and processes
• Developed a credit scoring POC for a new lending product with Docker and FastAPI
• Provisioned Redash and Looker Studio dashboards for analysts and end users

JULO

Senior Data Engineer

Jan 2021 — Sep 2021

Data Engineer

Aug 2018 — Jan 2021

Airflow, AWS, GCP, Spark, PostgreSQL, Docker, CircleCI, Ansible

• Managed and maintained Airflow data pipelines running 24/7
• Created streaming replicas of the master DB for analytics
• Deployed ML models using Docker and H2O with feature implementation in Django
• Designed and implemented PostgreSQL 10 range partitioning for large tables
• Built an action log data archiver for a DB with nearly a billion rows
• Integrated CircleCI for automated testing and deployment

Projects

Real-Time Streaming Pipeline

Designed and implemented a CDC-based streaming pipeline to migrate the data platform from batch-only ingestion to real-time, enabling near-instant data availability in BigQuery.

• Enabled MySQL CDC via binlog replication in coordination with DevOps, feeding change events through Pub/Sub into Dataflow
• Architected a cost-efficient pipeline with separated staging and production layers for safe iteration and deployment
• Replaced batch-dependent workflows, unlocking real-time analytics and reporting for management

MySQL CDC, Datastream, Pub/Sub, Dataflow, BigQuery

KUACI — Open Source KYC

Built an open-source data enrichment library for Indonesian KTP (national ID) that extracts gender, date of birth, district, and city from ID numbers — enabling automated validation against user-submitted data.

• Originated from a hackathon project; designed to cross-validate KTP-derived fields against user input for fraud detection
• Improved credit scoring accuracy by enriching identity data without additional user friction
• Contributed to the GitHub Arctic Code Vault

Python, Open Source, Data Enrichment
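
To illustrate the idea behind the library, here is a minimal sketch of how fields can be derived from a 16-digit KTP number (NIK) and compared with user input. It is not KUACI's actual API: the names `KtpFields`, `parse_nik`, `matches_user_input`, and the `pivot_year` parameter are hypothetical. It relies on the standard NIK layout (six digits of region code, six digits of birth date as DDMMYY with 40 added to the day for women, four digits of serial number) and returns region codes only, since mapping codes to district and city names would require a lookup table that is not shown here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KtpFields:
    province_code: str   # first 2 digits
    city_code: str       # first 4 digits
    district_code: str   # first 6 digits
    gender: str          # "male" or "female"
    date_of_birth: date


def parse_nik(nik: str, pivot_year: int = 2025) -> KtpFields:
    """Derive region codes, gender, and date of birth from a NIK.

    pivot_year is an assumption of this sketch: a two-digit year is read
    as 20YY when that is not later than pivot_year, otherwise as 19YY.
    """
    if len(nik) != 16 or not (nik.isascii() and nik.isdigit()):
        raise ValueError("NIK must be exactly 16 digits")

    day = int(nik[6:8])
    month = int(nik[8:10])
    yy = int(nik[10:12])

    # Women have 40 added to the day of birth.
    gender = "female" if day > 40 else "male"
    if day > 40:
        day -= 40

    year = 2000 + yy if 2000 + yy <= pivot_year else 1900 + yy

    return KtpFields(
        province_code=nik[:2],
        city_code=nik[:4],
        district_code=nik[:6],
        gender=gender,
        # date() raises ValueError for impossible dates such as month 13.
        date_of_birth=date(year, month, day),
    )


def matches_user_input(nik: str, claimed_gender: str, claimed_dob: date) -> bool:
    """Cross-check the NIK-derived fields against what the user submitted."""
    fields = parse_nik(nik)
    return fields.gender == claimed_gender and fields.date_of_birth == claimed_dob
```

A mismatch between the derived and submitted values does not prove fraud on its own; in a sketch like this it would simply be a signal to flag the application for review.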

Source

BigQuery Cost Optimization

Spearheaded a company-wide initiative to reduce BigQuery costs by auditing unoptimized queries, enforcing partitioning and clustering standards, and building observability tooling.

• Implemented table partitioning and clustering strategies across key datasets to minimize scan costs
• Built a cost monitoring dashboard and automated alerts for anomalous query spend
• Conducted knowledge-sharing sessions and ongoing QC reviews to embed cost-awareness into the team culture

BigQuery, dbt, SQL, Monitoring, Cost Management

DE Team from Zero

Took on my first engineering leadership role at Flip.id, building the Data Engineering team from the ground up — from hiring and process design to establishing team and career frameworks.

• Grew the team from 1 to 4 engineers over 1.5 years, owning the full hiring pipeline end-to-end
• Defined team frameworks covering GitHub workflows, PR review standards, and weekend on-call support rotations
• Established a career framework for growth paths; team was recognized multiple times for responsiveness and reliability

Leadership, Hiring, Process Design, Mentoring

Get In Touch

Have a question or want to work together? Feel free to reach out.