imcaspar/gpt2-ml
license: Apache-2.0
Language: Python
GPT2 for Multiple Languages, including pretrained models (1.5B-parameter Chinese pretrained model)
Latest release: v1.0 (2020-05-29)
GPT2 for Multiple Languages
- Simplified GPT-2 training scripts (based on Grover, supporting TPUs)
- Ported BERT tokenizer, compatible with multilingual corpora
- 1.5B-parameter GPT-2 pretrained Chinese model (~15G corpus, 100k steps)
- Batteries-included Colab demo
- 1.5B-parameter GPT-2 pretrained Chinese model (~30G corpus, 220k steps)
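The ported BERT tokenizer's key behavior for Chinese is splitting each CJK character into its own token before WordPiece lookup. A minimal self-contained sketch of that step (illustrative only — the repo ships its own `tokenization` module and vocab files):

```python
# Sketch of BERT-style basic tokenization for Chinese text: CJK
# characters are split into individual tokens, while other runs are
# whitespace-split. This mirrors the behavior of the ported tokenizer
# but is not the repo's actual implementation.
def is_cjk(ch: str) -> bool:
    """True for characters in the main CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF

def basic_tokenize(text: str) -> list:
    """Surround each CJK character with spaces, then whitespace-split."""
    spaced = "".join(" %s " % ch if is_cjk(ch) else ch for ch in text)
    return spaced.split()

print(basic_tokenize("GPT2 多语言支持"))
# → ['GPT2', '多', '语', '言', '支', '持']
```

The real tokenizer then maps each token to a vocab id (8021 or 21128 entries, per the table below); this sketch covers only the character-splitting stage.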
Pretrained Model
| Size | Language | Corpus | Vocab | Link1 | Link2 | SHA256 |
|---|---|---|---|---|---|---|
| 1.5B params | Chinese | ~30G | CLUE (8021 tokens) | Google Drive | Baidu Pan (ffz6) | e698cc97a7f5f706f84f58bb469d614e51d3c0ce5f9ab9bf77e01e3fcb41d482 |
| 1.5B params | Chinese | ~15G | BERT (21128 tokens) | Google Drive | Baidu Pan (q9vr) | 4a6e5124df8db7ac2bdd902e6191b807a6983a7f5d09fb10ce011f9a073b183e |
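A downloaded checkpoint can be checked against the SHA256 digests published above; a minimal sketch using the standard library (the file path is hypothetical — substitute the actual checkpoint file you downloaded):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in chunks and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Digest for the ~30G / CLUE-vocab model from the table above:
expected = "e698cc97a7f5f706f84f58bb469d614e51d3c0ce5f9ab9bf77e01e3fcb41d482"
# assert sha256_of("path/to/downloaded/checkpoint") == expected  # hypothetical path
```

Streaming in chunks keeps memory flat, which matters for multi-gigabyte checkpoint files.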
Corpus from THUCNews and nlp_chinese_corpus
Trained for 220k steps on a Cloud TPU v3-256 Pod
Google Colab
With just 2 clicks (not including Colab auth process), the 1.5B pretrained Chinese model demo is ready to go:
Train
Disclaimer
The contents of this repository are for academic research purposes only, and we do not provide any conclusive remarks.
Citation
@misc{GPT2-ML,
author = {Zhibo Zhang},
title = {GPT2-ML: GPT-2 for Multiple Languages},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/imcaspar/gpt2-ml}},
}
Reference
https://github.com/google-research/bert
https://github.com/rowanz/grover
Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)
Press
Topics:
bert, chinese, colab, gpt-2, nlp, pretrained-models, tensorflow, text-generation, tpu