BloombergGPT: A Large Language Model for Finance

1 Introduction

The release of GPT-3 in 2020 (Brown et al., 2020) demonstrated the powerful benefits of training very large auto-regressive language models (LLMs). GPT-3 had 175 billion parameters, a hundredfold increase over the previous GPT-2 model, and did remarkably well across a wide range of now popular LLM tasks, including reading comprehension, open-ended question answering, and code generation. This performance has been replicated across several other models (Chowdhery et al., 2022; Scao et al., 2022; Zhang et al., 2022a). Furthermore, evidence suggests that large models exhibit emergent behaviors; growth allows them to acquire abilities not present in smaller models (Wei et al., 2022a). A notable example of emergent behavior is the ability to perform tasks via few-shot prompting, where a model can learn a task from just a few examples. This ability improves well above random as we increase the size of language models. Broadly speaking, few-shot prompting dramatically expands the range of tasks supported by models and lowers the barrier to entry for users seeking automation for new language tasks.

After GPT-3, models grew in size to 280 billion (Gopher, Rae et al., 2021), 540 billion (PaLM, Chowdhery et al., 2022), and 1 trillion parameters (Megatron, Korthikanti et al., 2022). Work also explored other important aspects of achieving a high-performing LLM, such as different training objectives (Tay et al., 2022b), multilingual models (Scao et al., 2022), more efficient and smaller models (Black et al., 2022), and finding data- and parameter-efficient training sizes (Hoffmann et al., 2022).

These efforts have almost exclusively focused on general LLMs, trained on datasets that cover a broad range of topics and domains. While these have included some datasets for specialized domains (e.g., code (Chen et al., 2021a) or biomedical articles (Gao et al., 2021)), the focus has been on building LLMs with broad capabilities. Recent efforts training models using only domain-specific data have yielded models that, while much smaller, beat general-purpose LLMs on tasks within those domains, such as science (Taylor et al., 2022) and medicine (Bolton et al., 2023; Luo et al., 2022; Lehman et al., 2023). These findings motivate further development of models focused on specific domains.

Financial Technology (FinTech) is a large and growing area with NLP technologies having an increasingly important role (Xing et al., 2018; Fisher et al., 2016; Dredze et al., 2016). Financial NLP tasks (Shah et al., 2022) include sentiment analysis (Araci, 2019), named entity recognition (Salinas Alvarado et al., 2015), news classification (Sinha and Khandait, 2020), and question answering (Chen et al., 2021b, 2022). While the range of tasks is similar to those found in general NLP benchmarks, the complexity and terminology of the financial domain warrant a domain-specific system. For all of the reasons generative LLMs are attractive in general (few-shot learning, text generation, conversational systems, etc.), it would be valuable to have an LLM focused on the financial domain. While the …
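To make the few-shot prompting idea above concrete, the sketch below assembles a prompt for financial sentiment analysis, one of the financial NLP tasks listed in this section. This is a minimal illustrative sketch, not from the paper: the labeled demonstrations, the query sentence, and the `complete` stub standing in for any autoregressive LLM's text-completion call are all assumptions.

```python
# Minimal sketch of few-shot prompting for financial sentiment analysis.
# The demonstrations and the `complete` stub are illustrative assumptions,
# not artifacts of BloombergGPT or this paper.

# A handful of labeled demonstrations; the model infers the task from them.
FEW_SHOT_EXAMPLES = [
    ("Shares of the company surged 12% after earnings beat estimates.", "positive"),
    ("The firm cut its full-year revenue guidance amid weak demand.", "negative"),
    ("The board announced the meeting will be held on May 3.", "neutral"),
]

def build_prompt(query: str) -> str:
    """Format the demonstrations and the new query as one completion prompt."""
    lines = ["Classify the sentiment of each financial sentence."]
    for sentence, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {sentence}\nSentiment: {label}")
    # The trailing 'Sentiment:' cue asks the model to fill in the label.
    lines.append(f"Sentence: {query}\nSentiment:")
    return "\n\n".join(lines)

def complete(prompt: str) -> str:
    """Placeholder for any autoregressive LLM's text-completion call."""
    raise NotImplementedError("Wire this to an LLM of your choice.")

if __name__ == "__main__":
    prompt = build_prompt("Margins contracted for the third consecutive quarter.")
    print(prompt)  # Inspect the assembled few-shot prompt.
    # label = complete(prompt).strip()  # e.g., expected to yield "negative"
```

Because the task specification lives entirely in the prompt, no gradient updates or task-specific training are needed, which is what lowers the barrier to entry for automating new language tasks.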