Investment Proposal: Deep Network - Expertise in GPT-3 Based LLM Foundation Model with Korean Morpheme Tokenization

1. Company Overview

Company Name: Deep Network
CEO: Seokweon Jang / sayhi7@daum.net


Mission: Development and analysis of multilingual foundation models, specializing in Korean and English, based on advanced deep learning and AI modeling techniques.
Core Expertise: Design and implementation of a large language model (LLM) based on GPT-3

Deep Network is a one-person tech startup focused on developing large language models (LLMs) with the latest AI technologies. Over two years of dedicated research and development, we have completed more than 90% of the design for Korean and English tokenization on GPT-3 based models. In particular, we have developed proprietary algorithms and design principles for morpheme-based Korean tokenization, which form the core of our technological value.


2. Purpose of Investment

Deep Network seeks to commercialize its Korean-centric LLM to provide AI solutions that meet the needs of diverse industries. Through this investment, we aim to achieve the following goals:

  • Commercialization of Morpheme-Based Korean Tokenization Model: Our advanced tokenization system accurately parses Korean’s complex grammar and diverse expressions to enable natural and precise text processing.
  • Optimization of Korean/English LLM Foundation Model: Reconstructing the GPT-3 model to provide a Korea-optimized, multilingual LLM that is competitive both domestically and globally.
  • Further R&D Investment: Continued research to maximize NLP performance for structurally complex languages like Korean.

3. Distinctiveness of Korean Tokenization Technology

Deep Network’s Korean tokenization approach is built upon morpheme analysis, tailored specifically to Korean’s unique grammatical structure. Key advantages include:

  • Reflecting Korean Grammar: The model handles postpositions and endings accurately, decomposing sentences while preserving meaning, essential to Korean’s nuanced structure.
  • Context Preservation: Ensures that meaning is retained as each morpheme is analyzed and tokenized, enabling the model to maintain context and generate accurate responses.
  • High Performance and Efficiency: A lightweight morpheme analysis algorithm keeps computational cost low, accelerating Korean text processing.

Our technology is designed to be readily applicable across various industries requiring Korean language processing and can be adapted for future expansion into global markets.
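
The handling of postpositions described above can be illustrated with a deliberately small sketch. This is not Deep Network's proprietary algorithm: the particle list, the "##" marker for attached particles, and the function names are all assumptions made for illustration, and a production analyzer would rely on a full morpheme dictionary rather than suffix matching. The sketch splits each space-delimited word into a stem and a trailing postposition, then rejoins the tokens to show that the original text can be recovered.

```python
# Toy illustration only. PARTICLES is a hypothetical, incomplete list of
# Korean postpositions; a real morpheme analyzer needs a full dictionary.
PARTICLES = ("에서", "으로", "은", "는", "이", "가", "을", "를",
             "에", "로", "와", "과", "도", "의")

# Try longer particles first so "에서" wins over "에" and "으로" over "로".
_BY_LENGTH = sorted(PARTICLES, key=len, reverse=True)


def split_word(word):
    """Split one space-delimited word into [stem, '##' + particle] if possible."""
    for particle in _BY_LENGTH:
        # Require a non-empty stem so a bare particle is not split.
        if len(word) > len(particle) and word.endswith(particle):
            return [word[:-len(particle)], "##" + particle]
    return [word]


def tokenize(sentence):
    """Tokenize a sentence into stems and '##'-marked postpositions."""
    tokens = []
    for word in sentence.split():
        tokens.extend(split_word(word))
    return tokens


def detokenize(tokens):
    """Reattach '##'-marked particles to the preceding stem."""
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            words[-1] += token[2:]
        else:
            words.append(token)
    return " ".join(words)


sentence = "나는 학교에서 책을 읽는다"
tokens = tokenize(sentence)
print(tokens)
print(detokenize(tokens) == sentence)
```

Pure suffix matching will mis-split words that merely end in a particle-like syllable, and it does not analyze verb endings at all, which is why dictionary-backed morpheme analysis is needed in practice.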


4. Core Achievements and Technical Implementation

Deep Network has achieved significant milestones in optimizing GPT-3 based LLMs for the Korean language environment:

  • Over 90% Completion of Tokenization Design: Tokenization tailored to both English and Korean, accounting for the distinct linguistic features of each.
  • Mastery of Morpheme-Based Korean Tokenization Design: Developed a methodology for decomposing Korean tokens while retaining context, providing the foundation for the LLM to understand and generate Korean text naturally.
  • Model Training with Large-Scale Datasets: Established a training pipeline to effectively apply our custom morpheme tokenization to large-scale Korean datasets.

5. Future Plans

With this investment, Deep Network has set the following goals:

  1. Multilingual Support Expansion and Performance Enhancement: Research expansion to additional languages beyond English and Korean.
  2. Development of Korean-Specific Application Models: Custom AI solutions tailored for businesses that primarily use Korean, enhancing business applicability.
  3. Commercialization and Market Entry: Commercialize the morpheme-based Korean LLM, launch products, and demonstrate Deep Network’s technological strength in both domestic and global language processing markets.

6. Investment Request and Allocation Plan

Investment Request: 2 billion KRW
Allocation Plan:

  • Infrastructure Expansion for R&D (30%)
  • Acquisition of High-Performance Korean Datasets and Further Training (30%)
  • Marketing and Operational Infrastructure for Commercialization (20%)
  • Team Expansion and Recruitment of Talent (20%)
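
As a quick arithmetic check, the percentages above can be converted into KRW amounts for the 2 billion KRW request. The sketch below uses only figures stated in this proposal; the function and variable names are illustrative.

```python
TOTAL_KRW = 2_000_000_000  # investment request stated in this proposal

# Allocation percentages as listed in the plan above.
ALLOCATION_PERCENT = {
    "Infrastructure Expansion for R&D": 30,
    "Acquisition of High-Performance Korean Datasets and Further Training": 30,
    "Marketing and Operational Infrastructure for Commercialization": 20,
    "Team Expansion and Recruitment of Talent": 20,
}


def allocate(total, percents):
    """Return the KRW amount for each item; percentages must sum to 100."""
    if sum(percents.values()) != 100:
        raise ValueError("allocation percentages must sum to 100")
    return {name: total * pct // 100 for name, pct in percents.items()}


amounts = allocate(TOTAL_KRW, ALLOCATION_PERCENT)
for name, amount in amounts.items():
    print(f"{name}: {amount:,} KRW")
```

This works out to 600 million KRW each for R&D infrastructure and for datasets and training, and 400 million KRW each for commercialization and for team expansion.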

Conclusion

Deep Network has successfully developed a GPT-3 based LLM with outstanding performance in processing structurally complex languages like Korean. Through our unique morpheme-based tokenization technology, we enable the LLM to understand and process Korean text naturally, laying the foundation for wide-ranging applications across industries. With this investment, we aim to achieve even greater results in the domestic and global AI technology markets.

 

Thank you for considering this opportunity to join Deep Network in advancing AI for Korean language innovation.

 

CEO: Seokweon Jang
