Hello, I am Seokwon Jang from DeepNetwork, and I specialize in technical consulting on the lightweight implementation of ultra-large language models. I have been following, through news articles, the status of Korean companies developing ultra-large models (mainly the AI research institutes of large corporations).

Implementing an ultra-large language model also requires preparing a massive amount of carefully curated training data, which is not easy. The infrastructure investment is so enormous that even large corporations hesitate, so attempting to build an ultra-large language model outright is out of reach for a small business like mine.

I have mainly been studying notable domestic and international papers on lightweighting ultra-large language models and on on-device AI, chiefly papers on pruning, quantization, and knowledge distillation.

The core question in the pruning papers is which weight values actually matter, that is, which weights to keep and which to remove. Among the quantization papers, the most fundamental is the BitNet paper (Scaling 1-bit Transformers), whose key idea is a BitLinear layer that replaces nn.Linear so that 1-bit weights can be trained from the start.
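To make the BitLinear idea concrete, here is a minimal sketch of a BitNet-style 1-bit linear layer in PyTorch. This is my own simplified illustration, not the paper's full recipe: the name BitLinearSketch is hypothetical, only the weight binarization with a per-tensor scale and a straight-through estimator are shown, and the paper's activation quantization and normalization are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Module):
    """Simplified sketch of a BitNet-style 1-bit linear layer (illustrative only).

    Full-precision latent weights are kept for the optimizer, but the forward
    pass uses weights binarized to {-1, +1} with a per-tensor scale. A
    straight-through estimator lets gradients flow back to the latent weights.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Zero-center, binarize to +/-1, and restore the scale with alpha.
        w_centered = w - w.mean()
        alpha = w_centered.abs().mean()
        w_bin = torch.sign(w_centered) * alpha
        # Straight-through estimator: forward uses w_bin, backward sees w.
        w_ste = w + (w_bin - w).detach()
        return F.linear(x, w_ste)

# Usage: swap nn.Linear for BitLinearSketch inside a Transformer block.
layer = BitLinearSketch(512, 512)
out = layer(torch.randn(2, 16, 512))
```

The design point is that the binarization happens inside the forward pass while the optimizer still updates full-precision weights, which is what allows training with 1-bit weights from the beginning rather than quantizing a finished model.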

I have written many detailed posts on my blog about how I have prepared to offer technical consulting on implementing ultra-large language models, but I have not yet received any consulting inquiries. A few days ago, the Startup Promotion Agency, a Korean government agency, announced that it would provide 20 million won in funding for an On-Device AI Challenge task as a PoC, so I proposed implementing OCR (Optical Character Recognition) with a Transformer model; a rough illustration of such a pipeline is sketched below.
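The proposal itself does not specify an architecture, so as one hedged illustration of Transformer-based OCR, the sketch below runs inference with the publicly available TrOCR model (a vision encoder plus a text decoder) from Hugging Face Transformers. This is only an assumption for demonstration purposes, not necessarily the approach in the actual proposal, and the input file name is hypothetical.

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Pretrained checkpoint for printed text; handwritten variants also exist.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

# Hypothetical input: a single cropped text-line image.
image = Image.open("document_line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# The decoder autoregressively generates the recognized character sequence.
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```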

These days I feel a bit listless. The detailed analysis of papers on the lightweight implementation of ultra-large language models and the preparation for implementing them in the TensorFlow environment have taken considerable effort, but the Korean economy is so difficult right now that I have not yet heard from anyone, and my feelings are complicated.

Even though I am almost 95% sure I am prepared, I still have had no inquiries, because I do not yet have a PoC result to show. Implementing LLM lightweighting is too heavy a burden for a small business like mine to carry entirely alone, so I am looking for a partner to collaborate with, but no one has contacted me yet.

 

Deep Network, a one-person startup specializing in consulting on ultra-large language models

E-mail: sayhi7@daum.net

Representative of a one-person startup / SeokWeon Jang

 

 
