
Joonwon Jang

I’m interested in NLP, especially the parametric knowledge of LLMs.
In particular, I focus on understanding (benchmark evaluation, membership inference attacks), unlocking (in-context learning, SFT, alignment learning, reasoning), and expanding (continual learning) that knowledge to meet ultimate human needs.
I’m currently starting research on VLM jailbreaking and LLM coding abilities.
Check out my and my teammate’s PaperReview (e.g., Agent Survey, Recent Research Paper) and BlogPost here!
===================================== CV =====================================
Joonwon_CV.pdf
================================== LINKEDIN ==================================
================================ HF (Open Source) ===============================

Education

Master of Science, Graduate School of Artificial Intelligence, POSTECH (2023.02 – 2025.02)
Bachelor's degree in Hotel & Tourism Management, Sejong University (2017 – 2023) (GPA: 4.37/4.5, Summa Cum Laude)
(International student at the School of Hotel and Tourism Management, The Hong Kong Polytechnic University, 2018.09-2018.12)

Working Experience

ONOUT, LLM Research Engineer (Freelancer, 2024.07-2024.12)
LG AI Research @EXAONE LAB, LLM Research Intern (2025.03~)

Publications

International
DongGeon Lee*, Joonwon Jang*, Jihae Jeong, Hwanjo Yu. Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study (arXiv)
Joonwon Jang, Jaehee Kim, Wonbin Kweon, Hwanjo Yu. Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria (ACL 2025 Findings)
Seonghyeon Lee, HeeJae Chon, Joonwon Jang, Dongha Lee, Hwanjo Yu. How Diversely Can Language Models Solve Problems? Exploring the Algorithmic Diversity of Model-Generated Code (arXiv)
WooJoo Kim, Joonwon Jang, Jinyi Yu, Yunsu Jeon, and Hwanjo Yu. EPR: An Expert Behavior-enhanced Paper Ranking Framework for the Automotive Industry (EMNLP 2024 Workshop)
Seonghyeon Lee, Suyeon Kim, Joonwon Jang, Heejae Chon, Dongha Lee, Hwanjo Yu. Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation (EMNLP 2024 Findings)
Joonwon Jang, Sanghwan Jang, Wonbin Kweon, Minjin Jeon, and Hwanjo Yu. Rectifying Demonstration Shortcut in In-Context Learning, 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics. (NAACL 2024 Main)
Jaeyoung Lee, Joonwon Jang, and Misuk Kim. Hierarchical Graph Convolutional Network Approach for Detecting Low-Quality Documents, The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. (LREC-COLING 2024)
Eunbi Choi, Yongrae Jo, Joel Jang, Joonwon Jang, and Minjoon Seo. Fixed Input Parameterization for Efficient Prompting, Findings of the Association for Computational Linguistics. (ACL 2023 Findings)
Joonwon Jang, Sung Il Kwag, and Young Dae Ko. Eco-friendly platooning operation algorithm of the autonomous vehicles, Journal of Intelligent Transportation Systems: Technology, Planning, and Operations. (JITS, 2023)
Joonwon Jang and Misuk Kim. Headline Token-based Discriminative Learning for Subheading Generation in News Article, The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL) findings. (EACL 2023 Findings)
Joonwon Jang, Minju Kim, Yoonsik Cho, and Misuk Kim. Detecting incongruent news headlines with auxiliary textual information, Expert Systems with Applications. (ESWA, 2022) (IF 6.954)
Domestic
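Joonwon Jang & Misuk Kim (2022). An Extraction-Based Framework for News Subheading Generation
Joonwon Jang, 조하현, Jaeyoung Lee, & Misuk Kim (2021). Development of a Hierarchical Deep Learning Model and Construction of a Fake News Dataset for Detecting Fake News Whose Headline and Body Differ. Proceedings of the Korean Institute of Information Scientists and Engineers Conference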

Projects

Continual Learning of a Large Language Model toward a Specific Domain (2024.09-2024.12, Onout)
Domain-Specific LLM (Pre-Training, Distributed Training)
Data Crawling (General Corpus & Domain Corpus)
(Reserved) Token expansion using the domain-specific corpus
Evaluation on domain-specific tasks (e.g., knowledge probing, generation)
Achieved 2x KMMLU in-domain score
Generalized to unseen formats
Preserved general knowledge after DAPT (e.g., KMMLU, HARAE, …)
Constructing an AI Ranking Model for Promising Technology Selection (2023.08-2024.08, Hyundai)
Crawling and preprocessing data with the Scopus API
Post-training LMs via citation networks (using the SPECTER framework)
Fashion Advertisement Generation through a Quantized LLM (2023.06-2023.09, Onout)
Supervised Fine-Tuning & Instruction Tuning & Quantization
Applying QLoRA to polyglot (5.8B, 12.8B)
Designing prompts for augmentation (GPT API) / instruction tuning
Instruction tuning using a human-labeled dataset / Self-instruct using ChatGPT-pro & GPT-4 API
Prompt Injection in a Chatbot System (2022.07-2022.12)
Long Context Handling & Chatbot
Parameterizing long prompts (e.g., previous-session dialogues or personas in the MSC dataset) into a prompt-free model (i.e., a student model) for efficient inference. (Mentor: Eunbi Choi)
YoYak (Long Sequence Summarization Framework for Korean) (2021.09-2021.12)
Long Context Handling & Summarization
Domain-agnostic TAPT (Task Adaptive Pre-Training) Longformer with Pegasus Objective Function for KoBART Model
Performance comparison with vanilla KoBART
Incongruent News Detection (2020.09-2022.06, KOCCA)
Generating dataset for detecting incongruent news
Developing a method to detect incongruent news using auxiliary textual information
Developing Hierarchical Graph Convolutional Model for Incongruent Headline News Detection
Monetary Policy Board Announcement Analysis (2020.09-2020.10)
Financial Documents
Monetary Policy Statements, Financial Currency Transcript Text EDA
EDA with TF-IDF/Fasttext Embedding, LDA, and ML Models (RF, SVM)
Optimization of autonomous vehicle platooning (2019.05-2020.12)
Optimizing the platooning operation of electric autonomous vehicles with a hybrid genetic algorithm

Activities

Tobigs (2021.07-2022.07, 16th president)
BigData & AI & ML/DL
Wemajor (2018.03-2019.12)
Introducing Major to middle/high school students (volunteering)

Awards & Scholarships

2024 POSTECH Best Paper Awards (Excellence Award)
2024 Hyundai MOBIS AI (Infotainment System) Industry-Academic Cooperation Contest 3rd prize
Academic Excellence Scholarship (Spring 2017, Fall 2017, Spring 2018, Fall 2021, Spring 2022)
Coding Challenge at Sejong University - Python (Fall 2021, 4th prize)
Korean Institute of Information Scientists and Engineers Undergraduate Student Paper Award (2021, 3rd prize)

Patents

Apparatus for Generating a Dataset for Fake News Detection and Method of Operating the Same, Application No. 10-2022-0025684 (2022)

Paper Review and Additional Study

 Recent Personal Paper Review
 Recent Team Paper Review (2023~)
 Past Personal NLP Paper Review (2021~2022)
Linear Algebra
Probability & Statistics
Data Structure
CS224N (2020.09-2020.12)

Skills

Programming languages – Python, C, R
Data Mining – numpy, pandas, sklearn, MySQL, SAS, Spark
Machine Learning – tensorflow, keras, pytorch, pytorch-lightning, huggingface
Web Crawling – requests, BeautifulSoup, FastAPI
Others – Git & GitHub, Flask, Gephi, Docker

Related Course Work

Advanced Machine Learning (A+)
Linear Algebra and Programming (A+)
Introduction to Open Source (A+)
Data Problem&Solution and Practice (A+)
C Programming and Lab (A+)
Python Programming (A0)
Computer Structure (A+)
Database (A+)

Research Experience

Undergraduate intern at LKLAB (KAIST AI), supervised by Minjoon Seo. (2022.07-2022.12)
Undergraduate intern at IMLL (Sejong University), supervised by Misuk Kim. (2020.08-2022.07)