//-------------------------------------
eleuther.ai
* GPT-Neo (2021-03)
  https://github.com/EleutherAI/gpt-neo
* GPT-J (2021-06) - 6 billion (6B) parameters - roughly Curie-class among OpenAI's GPT models
  Open-source counterpart of GPT-3
  https://huggingface.co/EleutherAI/gpt-j-6B
  - Usage
    https://huggingface.co/docs/transformers/model_doc/gptj
    https://github.com/kingoflolz/mesh-transformer-jax
* GPT-NeoX (2022-01) - 20 billion (20B) parameters
  https://github.com/EleutherAI/gpt-neox
  Available as an API service from NLP Cloud
//-----------------------------------------------------------------------------
Usage example: sample source code
import torch
import transformers
from transformers import GPTJForCausalLM, AutoTokenizer

def gptj(text, dev):
    device = torch.device(dev)
    #config = transformers.GPTJConfig.from_pretrained("EleutherAI/gpt-j-6B")
    # The float16 revision + torch_dtype are needed to fit the 6B model in 16 GiB VRAM
    model = GPTJForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        revision="float16",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        use_cache=True,
        gradient_checkpointing=True,
    )
    model.to(device)
    tokenizer = AutoTokenizer.from_pretrained(
        "EleutherAI/gpt-j-6B", pad_token='<|endoftext|>',
        eos_token='<|endoftext|>', truncation_side='left')
    prompt = tokenizer(text, return_tensors='pt',
                       truncation=True, max_length=2048)
    # Move input tensors to the same device as the model
    prompt = {key: value.to(device) for key, value in prompt.items()}
    out = model.generate(**prompt,
                         # n=1,
                         min_length=16,
                         max_new_tokens=75,
                         do_sample=True,
                         top_k=35,
                         top_p=0.9,
                         # batch_size=512,
                         temperature=0.75,
                         no_repeat_ngram_size=4,
                         # clean_up_tokenization_spaces=True,
                         use_cache=True,
                         pad_token_id=tokenizer.eos_token_id
                         )
    res = tokenizer.decode(out[0])
    return res

text = "The Belgian national football team "
print("generated_text", gptj(text, "cuda"))
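The generate() call uses sampling with top_k=35 and top_p=0.9. As a rough illustration of what those two filters do (a simplified sketch, not the actual transformers implementation), here is the filtering step over a single logit vector in plain Python:

```python
import math

def top_k_top_p_filter(logits, top_k=35, top_p=0.9):
    """Return a copy of `logits` with disallowed tokens set to -inf."""
    neg_inf = float("-inf")
    # top-k: keep only the k largest logits
    kth = sorted(logits, reverse=True)[min(top_k, len(logits)) - 1]
    logits = [x if x >= kth else neg_inf for x in logits]
    # top-p (nucleus): keep the smallest prefix of the sorted
    # distribution whose probability mass reaches top_p
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    z = sum(math.exp(x) for x in logits if x > neg_inf)
    out = [neg_inf] * len(logits)
    cum = 0.0
    for i in order:
        if logits[i] == neg_inf:
            break
        out[i] = logits[i]
        cum += math.exp(logits[i]) / z
        if cum >= top_p:
            break
    return out

# With top_p=0.5 only the single most likely token survives here,
# because its probability mass alone already exceeds 0.5.
print(top_k_top_p_filter([1.0, 2.0, 3.0, 4.0], top_k=4, top_p=0.5))
```

Sampling then draws from the softmax over the surviving logits; temperature=0.75 divides the logits before this step, sharpening the distribution.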
Takes about 14 seconds on a GPU.
//-----------------------------------------------------------------------------
< Reference >
Notebook project file
https://github.com/NielsRogge/Transformers-Tutorials/blob/master/GPT-J-6B/Inference_with_GPT_J_6B.ipynb
pip install -q git+https://github.com/huggingface/transformers.git
pip install accelerate
- Works on CPU, but fails with an error on CUDA:
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16")  # , low_cpu_mem_usage=True)
model.to(device)  # <==== error
OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 15.99 GiB total capacity; 15.10 GiB already allocated; 0 bytes free; 15.10 GiB
reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF
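The error message suggests max_split_size_mb against fragmentation, but note that here the fp32 weights simply did not fit at all, so this knob alone would not have helped. For reference, it is an allocator option set through an environment variable before CUDA is first used:

```python
import os

# Allocator tuning only: reduces fragmentation, cannot shrink the model.
# Must be set before the first CUDA allocation (ideally before importing torch).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```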
Solution
GPTJForCausalLM.from_pretrained() must be called with the torch_dtype=torch.float16 option.
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    use_cache=True,
    gradient_checkpointing=True,
)
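A quick back-of-the-envelope check of why the dtype matters on the 16 GiB GPU above (assuming roughly 6.05B parameters for GPT-J-6B; weight storage only, activations and the CUDA context add more on top):

```python
def weight_gib(n_params, bytes_per_param):
    # Memory needed just to hold the model weights, in GiB
    return n_params * bytes_per_param / 2**30

N = 6_050_000_000  # approximate GPT-J-6B parameter count
fp32 = weight_gib(N, 4)  # ~22.5 GiB -> does not fit in 16 GiB
fp16 = weight_gib(N, 2)  # ~11.3 GiB -> fits
print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB")
```

Without torch_dtype=torch.float16, the checkpoint is upcast to fp32 on load, which is why model.to(device) ran out of memory.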
//-----------------------------------------------------------------------------
< Reference >
OpenAI
* GPT-3 (2020-06)
* GPT-3.5 (2022-03)
//-------------------------------------
eleuther.ai
* GPT-NeoX-20B (2022-02)
EleutherAI/gpt-neox-20b
https://huggingface.co/EleutherAI/gpt-neox-20b