{"id":420,"date":"2023-07-21T07:21:54","date_gmt":"2023-07-21T07:21:54","guid":{"rendered":"http:\/\/ocean:8000\/?p=420"},"modified":"2023-07-24T03:09:29","modified_gmt":"2023-07-24T03:09:29","slug":"%e5%a4%a7%e8%a6%8f%e6%a8%a1%e8%a8%80%e8%aa%9e%e3%83%a2%e3%83%87%e3%83%ab%e5%88%86%e6%95%a3%e5%ad%a6%e7%bf%92%e3%83%8f%e3%83%83%e3%82%ab%e3%82%bd%e3%83%b3%e3%81%ab%e5%8f%82%e5%8a%a0%e3%81%97%e3%81%a6","status":"publish","type":"post","link":"http:\/\/ocean:8000\/event\/%e5%a4%a7%e8%a6%8f%e6%a8%a1%e8%a8%80%e8%aa%9e%e3%83%a2%e3%83%87%e3%83%ab%e5%88%86%e6%95%a3%e5%ad%a6%e7%bf%92%e3%83%8f%e3%83%83%e3%82%ab%e3%82%bd%e3%83%b3%e3%81%ab%e5%8f%82%e5%8a%a0%e3%81%97%e3%81%a6\/","title":{"rendered":"We Took Part in the Large Language Model Distributed Training Hackathon!"},"content":{"rendered":"\n
\"\"<\/figure>\n\n\n\n

<\/p>\n\n\n\n

Written by: \u7530\u4e2d\u5eb7\u592a\u90ce<\/p>\n\n\n\n

<\/p>\n\n\n\n

Large language models (LLMs) such as ChatGPT and Bard have recently been attracting a great deal of attention. These models have strong natural language processing capabilities, and applications are being explored in a wide range of fields. Training an LLM, however, requires a large amount of compute, together with distributed training techniques such as data parallelism and model parallelism, and careful hyperparameter tuning. To build up the technical know-how needed to train LLMs, \u65e5\u9ad9\u96c5\u4fca, \u6a4b\u672c\u667a\u6d0b, and \u7530\u4e2d\u5eb7\u592a\u90ce took part in the 1st Large Language Model Distributed Training Hackathon (https:\/\/abci.ai\/event\/2023\/06\/13\/ja_event.html<\/a>), which ran on ABCI, the supercomputer provided by the National Institute of Advanced Industrial Science and Technology (AIST). In this post, \u7530\u4e2d\u5eb7\u592a\u90ce describes what we worked on at the hackathon and the technical know-how needed to train LLMs.<\/p>\n\n\n\n

<\/p>\n\n\n\n

Our goal this time was to learn how to train an LLM from scratch efficiently across multiple nodes, so each of the three members ran experiments with a different library.<\/p>\n\n\n\n

<\/p>\n\n\n\n

Training OPT [Zhang et al., 2022] with Transformers and DeepSpeed<\/h3>\n\n\n\n

<\/p>\n\n\n\n

OPT is an open-source LLM released by Meta, available in sizes ranging from a relatively small 125 million parameters (125m) up to 175 billion parameters (175b), which is comparable to GPT-3 [Brown et al., 2020]. In this hackathon we used V100 GPUs with 16 GB of VRAM, but when training a model with many parameters, the VRAM runs out unless extra measures are taken. We therefore used DeepSpeed (https:\/\/github.com\/microsoft\/<\/a>DeepSpeed<\/a>), a library provided by Microsoft. It implements a memory-optimization technique called ZeRO [Rajbhandari et al., 2020], which makes it possible to train models with many parameters. The technique is described in detail in this blog post (https:\/\/<\/a>www.microsoft.com<\/a>\/en-us\/research\/blog\/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters\/<\/a>). In our experiments we used ZeRO stage 3.<\/p>\n\n\n\n
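To make the VRAM shortfall concrete, here is a rough sketch of the model-state accounting described in the ZeRO paper [Rajbhandari et al., 2020] for mixed-precision Adam training: about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states), with stages 1, 2, and 3 partitioning optimizer states, gradients, and parameters across GPUs in turn. The function name and the GPU count of 64 are hypothetical and chosen only for illustration. Activations and temporary buffers are ignored, so real usage will be higher.

```python
def model_state_bytes_per_gpu(num_params, num_gpus, zero_stage):
    """Approximate per-GPU bytes for model states (mixed-precision Adam).

    Assumes the ZeRO paper's accounting: 2 bytes of fp16 weights,
    2 bytes of fp16 gradients and 12 bytes of fp32 optimizer states
    per parameter. Activations and buffers are NOT included.
    """
    params = 2 * num_params   # fp16 weights
    grads = 2 * num_params    # fp16 gradients
    optim = 12 * num_params   # fp32 weights + Adam momentum + variance
    if zero_stage >= 1:       # stage 1: partition optimizer states
        optim = optim / num_gpus
    if zero_stage >= 2:       # stage 2: also partition gradients
        grads = grads / num_gpus
    if zero_stage >= 3:       # stage 3: also partition parameters
        params = params / num_gpus
    return params + grads + optim


GB = 1e9        # decimal gigabytes
NUM_GPUS = 64   # hypothetical GPU count, not taken from the hackathon setup

for name, n in [('125m', 125_000_000), ('175b', 175_000_000_000)]:
    for stage in (0, 1, 2, 3):
        per_gpu = model_state_bytes_per_gpu(n, NUM_GPUS, stage) / GB
        print(f'OPT-{name}, ZeRO stage {stage}: {per_gpu:,.2f} GB per GPU')
```

Under these assumptions, the 125m model needs about 2 GB of model states without any partitioning, while the 175b model needs about 2,800 GB, far beyond a single 16 GB V100. Even stage 3 across the hypothetical 64 GPUs leaves about 43.75 GB per GPU, so the largest model would need considerably more GPUs.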

<\/p>\n\n\n\n

This section covers the following three points.<\/p>\n\n\n\n