EUNBIN KANG


Data Scientist
Seoul

Summary

I have built my career as a Data Scientist in the fintech industry, pursuing business optimization through data-driven decision-making. Rather than confining myself to predefined frameworks, I enjoy identifying the essence of a problem and working out the best approach to solving it.

Overview

3 years of professional experience
7 years of post-secondary education
2 languages

Work History

Data Scientist

FINDA
05.2022 - Current
  • Delinquency Prediction Modeling (Jan 2025 - Present)
    - Redefined short-/long-term delinquency labels and engineered time-series features by merging FINDA behavioral data with KCB credit profiles.
    - Prototyped and deployed an ExtraTreesClassifier default-risk model (KS 0.62, 94% precision) that cut early delinquency by 40%.
    - Led end-to-end data extraction and Python/PySpark coding, selecting KS as the primary credit-scoring metric and optimizing for high precision to protect revenue.
    - Built evaluation simulations and set up model governance to guide future CDS-pricing simulations and business scenarios.
    - Preparing production rollout via a Kubeflow (MLflow)-based MLOps pipeline for fully automated training, scoring, and monitoring.
  • AI-TFT Projects (Apr 2025 – Present)
    - Contributed to multiple internal AI initiatives as a member of the company’s AI Task Force (AI-TFT), launched to address AI demands across departments.
    - Developing a labeling model for transaction descriptions in consumer spending analysis, using a hybrid approach combining Text-to-Vector similarity search and a LightGBM classifier.
    - Collaborating on the Cashflow Projection feature of an AI-CFO system that helps assess and manage financial health for startups, including auto-generated financial reports using LLMs.
  • Anomaly Detection for Voice Phishing Prevention (May 2023 - Sep 2023)
    - Developed a real-time voice-phishing anomaly-detection model (logistic regression, formula-based for backend) to flag suspicious loan applications as reports surged from partner banks.
    - Led full pipeline—EDA on fraudulent behavior, model training, QA with backend—and optimized for high precision to minimize false positives that disrupt legitimate users.
    - Model now triggers in-app warnings at the limit-inquiry stage; enabled lenders that had halted loans due to fraud to resume service.
    - Post-launch audit: 16 of 20 newly reported fraud cases were correctly flagged; bank complaints fell >50%.
    - Prototyped an FDS PyTorch model for Tier-1 & regional banks (higher-complexity cases), ready for deployment when fraud volumes rise again.
  • Pseudonymous Data Linkage (Aug 2024 - Present)
    - Led a pseudonymous data-linkage initiative that combined FINDA behavioral data with partner banks’ CSS (AS/BS) datasets to enhance their credit-scoring strategies.
    - Enriched the joint “FINDA Score” by adding new features derived from log-ins, limit-inquiry events, and MyData financial records.
    - Created composite variables (behavioral + transactional) to boost predictive power for alternative credit assessment.
    - Managed end-to-end governance—partner agreements, suitability review, data-combination via certified intermediary, and post-merge monitoring—in close collaboration with the Legal team.
  • Repayment Logic Accuracy Improvement (Jan 2024 - Jun 2024)
    - Re-engineered repayment-date and amount logic with MyData feeds, adding lender-specific rules to lift date accuracy from 60% to 80% and exact amount match from 15% to 45% (balloon loans 1%→70%).
    - Audited repayment algorithms by loan type and institution, fixing discrepancies flagged by CX to enable reliable cash-flow planning for users.
    - Deployed Redash dashboards for ongoing accuracy monitoring and alerting.
  • Data Pipeline Development & Analytical Coding
    - Implemented user-level DSR & stressed-DSR calculators to power approval/interest/limit models and CRM triggers.
    - Built batch data marts and authored Airflow DAGs for dependency-aware scheduling and monitoring.
    - Regularly refactored unused tables and inefficient SQL, reducing warehouse costs and query latency.
    - Automated migration of 100+ Athena queries to Spark SQL with a regex-based code converter, cutting manual rewrite time from days to minutes.
  • Anomaly Monitoring for Data Reliability
    - Developed a Slack bot that detects ETL task failures (batch mart loads, CDC replication) and anomalous approval rate spikes, instantly notifying data & ops teams.
    - Replaced siloed Tableau refresh emails by leveraging the Tableau REST API to broadcast data-source failures to shared Slack channels.
    - Wrote monitoring scripts for reverse-ETL pipelines, catching missing or inconsistent user-attribute loads from analytics to production and preventing CRM misfires.
    - Enhanced monitoring by visualizing time-series trends of key KPIs (e.g., approval rate, CDC latency) and surfacing anomalies based on seasonality and dynamic thresholds.
  • Data-Driven Decision Support
    - Built company-wide Tableau / Redash dashboards for OKRs, competitive-loan metrics, retention, and delinquency, giving all teams self-serve visibility for data-driven decisions.
    - Analyzed “aha-moments” to pinpoint product lock-in points, translating them into retention KPIs and growth targets.
    - Re-defined ad spend vs. revenue at the user level, measured payback periods, and diagnosed retargeting-channel drop-offs to maximize ROI on paid marketing.

Education

Bachelor of Science - Computer Science Engineering

Seoul National University
Gwanak-gu, Seoul
03.2015 - 02.2022

Skills

  • Python

  • SQL (Athena, Spark SQL)

  • Spark (PySpark)

  • Machine Learning

  • PyTorch

  • Git/GitHub

  • Airflow

  • Tableau

  • Redash

  • Hadoop

  • Finance

Accomplishments

CFA Level II Candidate (Nov 2023 – Present)

Pursuing the CFA designation to strengthen my domain knowledge in finance, particularly relevant to working in the fintech industry.

