Data Scientist
SQL, Python, Tableau, R, Hadoop Hive/Spark/Kafka, Rockset Database
Education
- Bachelor of Engineering (Chemical Engineering) with Honours, National University of Singapore (Jun 2020)
- Diploma in Chemical Engineering, Singapore Polytechnic (Mar 2014)
Skills
- Data & Statistical Analytics
- Data Visualization & Storytelling
- Clustering Algorithms
- Dimensionality Reduction
- Regression Models
- Deep Learning Neural Networks
- Recommendation Systems
- Big Data Architecture
- GMP Environment
- Biologics Manufacturing Operations
- Single Use Technology
- Change Control Management
Work Experience
Data Scientist | Daikai Engineering Pte Ltd (Jun 2024 - Present)
- Lead ETL processes using Python and MySQL, integrating data into Oracle Database.
- Conducted in-depth Exploratory Data Analysis (EDA) on multivariate time series, enhancing feature engineering.
- Applied time-based clustering and graph-based semi-supervised learning to refine data annotation.
- Developed LSTM Autoencoder models for anomaly detection across engines and vessels.
- Authored and implemented company-wide PDPA policy.
- Customized a Kanban project management system using Kintone to enhance team coordination and efficiency.
- Provide comprehensive IT support, addressing hardware, software, network, server, and cybersecurity challenges to ensure smooth operations.
- Initiated ad-hoc projects to aid sales efforts and improve workflow efficiency.
Process Engineer (Alarms) | Pfizer Asia Pte Ltd (Apr 2023 - Apr 2024)
- Led a cross-functional team to revamp the GMP alarm management system.
- Defined User Requirement Specifications for batch and periodic alarm reviews.
- Implemented Statistical Process Control with control limits.
- Spearheaded alarm reduction initiatives across multiple departments, achieving a 56% reduction in nuisance alarms in the initial phase.
- Participated actively in continuous improvement projects.
Manufacturing Associate | Amgen Singapore Manufacturing (May 2021 - Dec 2022)
- Specialized in single-use technology for biologic downstream purification.
- Proficient in MES, LIMS, DeltaV, DCS, Control Studio, BOI, PI Vision, and Spotfire.
- Ensured zero product loss during manufacturing operations.
- Collaborated with subject matter experts on an OPEX project, reducing monthly operational costs by over $100,000.
Research Apprentice | Duke-NUS Medical School (May 2020 - Nov 2020)
- Managed multiple cell lines for novel research on Emerging Infectious Diseases.
- Conducted a wide array of biological assays on a daily basis:
  - Modified Lowry Protein Assay
  - PBMC Extraction
  - RNA Extraction
  - ELISA
  - SDS-PAGE
  - Polymerase Chain Reactions
Technical Executive Intern | TUV SUD PSB Pte Ltd (May 2019 - Aug 2019)
- Facilitated the certification of 10+ chemical products.
- Proficient in HPLC, GCMS, GCFID, GCNCD, FTIR, UV-VIS Spectroscopy.
- Developed an Excel VBA macro, reducing data handling time by 83%.
- Authored a guidebook for incoming interns, fostering learning and autonomy.
Administrative Executive | Singapore Armed Forces (Jan 2016 - Dec 2016)
- Awarded the Outstanding Servicemen Award in recognition of significant contributions to the unit.
- Planned vehicular maintenance and manpower allocation according to dynamic demand.
- Tactfully resolved administrative hurdles on a daily basis.
Projects
Anomaly Detection in Maritime Engines (Aug 2024 - Present)
- Conducted Exploratory Data Analysis (EDA) on complex engine datasets to identify patterns and inform anomaly detection strategies.
- Developing advanced anomaly detection models using techniques such as hierarchical clustering, label propagation, and LSTM-based autoencoders.
- Pioneering innovative methodologies to improve predictive maintenance and operational efficiency.
EDA with SQL and Tableau for an Emergency Response Department (May 2024)
- Utilized SQL to explore data and provided critical business insights for the management team.
- Built interactive visualizations of the EDA findings using Tableau Public.
Hackathon - Hyperparameter Tuning in Neural Networks (Apr 2024)
- Built a deep learning neural network to predict survival rate of passengers on the Titanic.
- Utilized RandomizedSearchCV, GridSearchCV, Dask Tuner and Keras Tuner to optimize hyperparameters of a deep learning model.
- Achieved a model accuracy of 83% on unseen data.
Set up Hadoop Cloud Ecosystem (Dec 2023)
- Configured a functional Hadoop Hive/Spark/Kafka architecture using AWS EC2 through a Linux terminal.
Hackathon - Predictive Models (Nov 2023)
- Built a regression model in Python using Random Forest and Deep Learning algorithms to predict the commuting experience of passengers on the Shinkansen.
- Model achieved 80% accuracy on unseen data.
- Ranked 13th in a global cohort of 45 teams.
Recommendation System (Nov 2023)
- Built a product recommendation system in Python using collaborative filtering and SVD Matrix Factorization to recommend products to target audiences on an online retail platform.
- Model achieved 88.4% recall and 85.4% precision.
- Proposed key business insights and high-value recommendations based on the analysis.
Regression Models for Online Learning Platforms (Oct 2023)
- Built a predictive model using Random Forest and Decision Tree algorithms to estimate the conversion probability of leads (prospects who showed interest in the programs) for an online learning platform.
- Models achieved a precision score of 83%, allowing the business to focus its resources on leads with an 83% likelihood of converting.
Exploratory Data Analysis on Foodhub business review (Sep 2023)
- Conducted in-depth EDA on Foodhub data using Python.
- Delivered revenue-boosting insights through clear visualizations and compelling storytelling.
Weight Profile Prediction (Jan 2020)
- Built a predictive model that can project how a person’s weight will vary through their life, according to their personality archetype.
- Developed the model in Python using mathematical formulas from chemical engineering and key principles from behavioral psychology.
Chromatogram Data Extractor (Jul 2019)
- Automated the data extraction and transcription process from chromatograms using Excel VBA, reducing process time from 30 to 5 minutes.
Certificates
Data Science Certificates
- Data Science and Machine Learning: Making Data-Driven Decisions (MIT, 2023)
- Python Essential Training (LinkedIn Learning, 2024)
- Python for Data Science (LinkedIn Learning, 2024)
- Python for Data Science and Machine Learning Essential Learning (LinkedIn Learning, 2024)
- Building Recurrent Neural Networks with Python Keras (LinkedIn Learning, 2024)
- Advanced SQL (HackerRank, 2024)
- Intermediate SQL (HackerRank, 2024)
- Basic SQL (HackerRank, 2024)
- Data Storytelling with Tableau (WSQ, 2024)
- Tableau for Data Scientists (LinkedIn Learning, 2024)
- R for Data Science: Analysis and Visualization (LinkedIn Learning, 2024)
- Mastering Big Data Analytics (Great Learning, 2024)
- Git Essential Training (LinkedIn Learning, 2024)
- Reinforcement Learning Foundations (LinkedIn Learning, 2025)
Engineering Certificates
- Follow GMP (PACE, 2021)
- Apply Manufacturing Technologies in a Regulated Environment (PACE, 2021)
- Operation of Inoculation and Fermentation Reactors (PACE, 2021)
- Operation of Tangential Flow Filtrators (PACE, 2021)
- Operate in a Controlled Clean Room Environment (PACE, 2021)
- Operate Chromatography Process (PACE, 2021)
- Apply Safety in Process Plants (PACE, 2021)
- Apply Process Quality Control Techniques (PACE, 2021)
- Apply Process Analytical Technology (PACE, 2021)
- Apply Continuous Process Improvement Techniques (PACE, 2021)
- MATLAB (MathWorks, 2019)