
[Introduction to the Know-how of Handling Key Issues in Detail] When implementing an LLM (Large Language Model) with the Google Transformer model in a TensorFlow environment…

파란새 2024. 3. 15. 15:16

I am Seokwon Jang, a technical advisor at DeepNetwork specializing in ultra-large model technology. Three years ago, I began preparing for the commercialization of ultra-large language models such as ChatGPT with only a vague sense of direction. Indeed, many corporate officials may wonder whether a one-person company like mine can really grasp the implementation know-how behind an ultra-large language model like ChatGPT.

For more than three years, I have reviewed and analyzed two foreign papers related to LLMs (Large Language Models) every day. Through this, I learned what global companies worry about when implementing ultra-large models; in all, I have carefully studied roughly 100 key issues in the field of deep-learning implementation design. On that basis, I first analyzed the design structure of the LLM itself.

Once I understood the design structure of the LLM to some extent, I learned that Facebook, for example, reportedly carried out LLM training by designing and building a cluster of 16,000 NVIDIA A100 GPUs. Because such enormous infrastructure costs can make it difficult to secure profits, I examined a considerable number of papers on reducing an LLM's parameter count while maintaining performance. I also reviewed and analyzed papers on what preparation is needed to implement on-device AI, that is, model lightweighting.

I also came to understand that lightweighting is largely handled through quantization design and knowledge-distillation techniques. The quantization part is absolutely necessary for on-device AI design and for integration into an SoC, and NVIDIA has already built FP8 support into the H100 GPU.
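As a rough sketch of those two techniques in the TensorFlow environment this post refers to (the student model below and its layer sizes are hypothetical placeholders, not anything from DeepNetwork): knowledge distillation trains a small student to match a larger teacher's softened output distribution, and post-training quantization via the TFLite converter stores the weights as int8 for on-device deployment.

```python
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-label term used in knowledge distillation (Hinton et al.)."""
    t = tf.nn.softmax(teacher_logits / temperature)
    log_s = tf.nn.log_softmax(student_logits / temperature)
    # Cross-entropy between softened teacher and student distributions
    # (equivalent to KL divergence up to a constant), scaled by T^2.
    return -tf.reduce_mean(tf.reduce_sum(t * log_s, axis=-1)) * temperature ** 2

# Hypothetical small student model standing in for a distilled sLLM.
student = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(32000),  # vocabulary-sized output head
])

# Post-training dynamic-range quantization with the TFLite converter:
# weights are stored as int8, shrinking the model for on-device AI.
converter = tf.lite.TFLiteConverter.from_keras_model(student)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

with open("student_quantized.tflite", "wb") as f:
    f.write(tflite_bytes)
```

Full int8 quantization of activations additionally requires passing a representative dataset to the converter; the dynamic-range variant shown here is just the simplest starting point.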

Did DeepNetwork's review and analysis stop there? That is not all. When implementing an LLM or sLLM, DeepNetwork can review and analyze which issues must be considered when customizing each part of the several layers of the Google Transformer model in a TensorFlow environment, and how those issues can be solved.
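To make that kind of per-layer customization concrete, below is a minimal sketch of one Transformer encoder block written as a custom Keras layer; the pre-layer-normalization arrangement, GELU activation, and all sizes are illustrative choices, not details taken from the post.

```python
import tensorflow as tf

class TransformerBlock(tf.keras.layers.Layer):
    """Minimal pre-LN Transformer encoder block; all sizes are illustrative."""

    def __init__(self, d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(d_ff, activation="gelu"),
            tf.keras.layers.Dense(d_model),
        ])
        self.norm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.drop = tf.keras.layers.Dropout(dropout)

    def call(self, x, training=False):
        # Self-attention sub-layer with residual connection.
        h = self.norm1(x)
        x = x + self.drop(self.attn(h, h, h), training=training)
        # Position-wise feed-forward sub-layer with residual connection.
        h = self.norm2(x)
        return x + self.drop(self.ffn(h), training=training)
```

Each sub-layer (attention, feed-forward, normalization, dropout) is a separate attribute that can be swapped or re-parameterized independently, which is the level at which per-layer customization decisions are usually made.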

I cannot speak English, but I can send and receive emails in English. I ask the AI managers of domestic and foreign global companies for a careful review. I have learned how to catch fish, so please do not underestimate DeepNetwork's LLM and sLLM technology just because there are no fish caught yet.

 

Deep Network, a one-person startup specializing in consulting on ultra-large language models

E-mail: sayhi7@daum.net

Representative of the one-person startup / SeokWeon Jang