AI Agent for Advertising (PoC)
- Developed and optimized LangGraph-based Python APIs that automate promotional video creation for small businesses' products and services.
- Developed LLM-based APIs for video material recommendation and script generation through prompt engineering, and implemented a self-feedback refinement loop to improve script quality.
- Improved function-calling model performance through tool-information data augmentation and SFT, and researched efficient reasoning using reward shaping and model merging.
- Raised the internal content quality score from 3.1 to 4.4 and reduced reasoning path length by 31% while maintaining performance on BFCL and in-house evaluation sets.
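
The self-feedback refinement loop mentioned above can be illustrated with a minimal sketch. It is not the project's implementation: `refine_script`, `toy_critique`, and `toy_revise` are hypothetical names, the 1-5 score scale and the stopping threshold are assumptions, and in practice the critique and revise steps would be LLM calls rather than the toy functions used here.

```python
from typing import Callable, Tuple


def refine_script(
    draft: str,
    critique: Callable[[str], Tuple[float, str]],
    revise: Callable[[str, str], str],
    threshold: float = 4.0,
    max_rounds: int = 3,
) -> Tuple[str, float]:
    """Revise a script until its self-assigned score reaches `threshold`.

    `critique` returns (score, feedback) for a script; `revise` returns a new
    script given the current one and the feedback. Both are injected so the
    loop stays independent of any particular LLM client (assumption).
    """
    current = draft
    best_script = draft
    best_score, feedback = critique(draft)
    for _ in range(max_rounds):
        if best_score >= threshold:
            break  # good enough, stop spending calls
        current = revise(current, feedback)
        score, feedback = critique(current)
        if score > best_score:
            # Keep the best version seen so far, not just the latest one.
            best_script, best_score = current, score
    return best_script, best_score


# Toy stand-ins for the LLM calls (hypothetical, for illustration only).
def toy_critique(script: str) -> Tuple[float, str]:
    score = min(5.0, 2.0 + script.count("!"))
    return score, "add a stronger call to action"


def toy_revise(script: str, feedback: str) -> str:
    return script + " Order today!"


final_script, final_score = refine_script(
    "Fresh bread daily.", toy_critique, toy_revise
)
```

Tracking the best-scoring version rather than the most recent one guards against a revision that makes the script worse.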
Smart NPC Agent for Gaming
- Designed a persona-specific LLM data pipeline and optimized RLHF for character-driven NPCs that guide users and deliver immersive dialogue.
- Improved character persona fidelity through character-boundary SFT, hierarchical PPO tuning, and reward modeling with text-generation regularization to address reward hacking.
- Developed an automated LLM evaluation pipeline based on G-Eval.
- Mitigated reward hacking and raised the G-Eval persona score by 1.2 points (from 3.6 to 4.8), readying the service with 16 distinct character personas.
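
As a rough illustration of G-Eval-style scoring, the sketch below computes a probability-weighted rating: each candidate rating is weighted by the probability the judge model assigns to it, which gives a finer-grained score than taking the single most likely rating. The function name `weighted_rating` and the example probabilities are hypothetical and are not taken from the project's evaluation pipeline.

```python
from typing import Dict


def weighted_rating(rating_probs: Dict[int, float]) -> float:
    """Return the expected rating given per-rating probabilities.

    `rating_probs` maps each candidate rating (e.g. 1-5) to the probability
    mass the judge model put on that rating token. The values are normalized
    here, so they do not need to sum to exactly 1.
    """
    total = sum(rating_probs.values())
    if total <= 0:
        raise ValueError("rating probabilities must sum to a positive value")
    return sum(rating * p for rating, p in rating_probs.items()) / total


# Illustrative numbers only: 10% on "3", 30% on "4", 60% on "5".
example_score = weighted_rating({3: 0.1, 4: 0.3, 5: 0.6})
```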
RL-based Stock Trading Agent (PoC)
- Developed a distributed RL framework that aggregates experience from multiple workers to optimize trading agent performance, incorporating model segmentation and freezing of a pre-trained market encoder.
- Validated the agent on real transactions at Korea Investment & Securities, achieving +16.43 bp over TWAP and +15.45 bp over VWAP in a one-month live trading pilot.
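
For readers unfamiliar with the benchmarks, the sketch below shows one common way to express execution quality in basis points against TWAP and VWAP. The function names, the sign convention (positive means the agent beat the benchmark), and the sample prices are assumptions for illustration, not the pilot's actual measurement code or data.

```python
from typing import Sequence


def twap(prices: Sequence[float]) -> float:
    """Time-weighted average price: the simple mean over equal time slices."""
    return sum(prices) / len(prices)


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price over the same time slices."""
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


def improvement_bp(exec_price: float, benchmark: float, side: str = "buy") -> float:
    """Price improvement over a benchmark in basis points (1 bp = 0.01%).

    Assumed convention: buying below the benchmark or selling above it
    counts as a positive improvement.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1.0 if side == "buy" else -1.0
    return sign * (benchmark - exec_price) / benchmark * 1e4


# Illustrative market data, not real trades.
prices = [100.0, 101.0, 102.0]
volumes = [1.0, 1.0, 2.0]
twap_price = twap(prices)            # 101.0
vwap_price = vwap(prices, volumes)   # 101.25
```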