BloombergGPT: A Large Language Model for Finance

1 Introduction

The release of GPT-3 in 2020 (Brown et al., 2020) demonstrated the powerful benefits of training very large auto-regressive language models (LLMs). GPT-3 had 175 billion parameters, a hundredfold increase over the previous GPT-2 model, and did remarkably well across a wide range of now popular LLM tasks, including reading comprehension, open-ended question answering, and code generation. This performance has been replicated across several other models (Chowdhery et al., 2022; Scao et al., 2022; Zhang et al., 2022a). Furthermore, evidence suggests that large models exhibit emergent behaviors; growth allows them to acquire abilities not present in smaller models (Wei et al., 2022a). A notable example of emergent behavior is the ability to perform tasks via few-shot prompting, where a model can learn a task from just a few examples. This ability improves well above random as we increase the size of language models. Broadly speaking, few-shot prompting dramatically expands the range of tasks supported by models and lowers the barrier to entry for users seeking automation for new language tasks.

After GPT-3, models grew in size to 280 billion (Gopher, Rae et al., 2021), 540 billion (PaLM, Chowdhery et al., 2022), and 1 trillion parameters (Megatron, Korthikanti et al., 2022). Work also explored other important aspects of achieving a high-performing LLM, such as different training objectives (Tay et al., 2022b), multilingual models (Scao et al., 2022), more efficient and smaller models (Black et al., 2022), and finding data- and parameter-efficient training sizes (Hoffmann et al., 2022).

These efforts have almost exclusively focused on general LLMs, trained on datasets that cover a broad range of topics and domains. While these have included some datasets for specialized domains (e.g., code (Chen et al., 2021a) or biomedical articles (Gao et al., 2021)), the focus has been on building LLMs with broad capabilities. Recent efforts training models using only domain-specific data have yielded models that, while much smaller, beat general-purpose LLMs on tasks within those domains, such as science (Taylor et al., 2022) and medicine (Bolton et al., 2023; Luo et al., 2022; Lehman et al., 2023). These findings motivate further development of models focused on specific domains.

Financial Technology (FinTech) is a large and growing area with NLP technologies having an increasingly important role (Xing et al., 2018; Fisher et al., 2016; Dredze et al., 2016). Financial NLP tasks (Shah et al., 2022) include sentiment analysis (Araci, 2019), named entity recognition (Salinas Alvarado et al., 2015), news classification (Sinha and Khandait, 2020), and question answering (Chen et al., 2021b, 2022). While the range of tasks is similar to those found in general NLP benchmarks, the complexity and terminology of the financial domain warrant a domain-specific system. For all of the reasons generative LLMs are attractive in general (few-shot learning, text generation, conversational systems, etc.), it would be valuable to have an LLM focused on the financial domain. While the …
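To make the few-shot prompting idea above concrete, the sketch below assembles a prompt for financial sentiment analysis, one of the financial NLP tasks listed in this section. This is a minimal illustrative sketch, not from the paper: the labeled demonstrations, the query sentence, and the `complete` stub standing in for any autoregressive LLM's text-completion call are all assumptions.

```python
# Minimal sketch of few-shot prompting for financial sentiment analysis.
# The demonstrations and the `complete` stub are illustrative assumptions,
# not artifacts of BloombergGPT or this paper.

# A handful of labeled demonstrations; the model infers the task from them.
FEW_SHOT_EXAMPLES = [
    ("Shares of the company surged 12% after earnings beat estimates.", "positive"),
    ("The firm cut its full-year revenue guidance amid weak demand.", "negative"),
    ("The board announced the meeting will be held on May 3.", "neutral"),
]

def build_prompt(query: str) -> str:
    """Format the demonstrations and the new query as one completion prompt."""
    lines = ["Classify the sentiment of each financial sentence."]
    for sentence, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {sentence}\nSentiment: {label}")
    # The trailing 'Sentiment:' cue asks the model to fill in the label.
    lines.append(f"Sentence: {query}\nSentiment:")
    return "\n\n".join(lines)

def complete(prompt: str) -> str:
    """Placeholder for any autoregressive LLM's text-completion call."""
    raise NotImplementedError("Wire this to an LLM of your choice.")

if __name__ == "__main__":
    prompt = build_prompt("Margins contracted for the third consecutive quarter.")
    print(prompt)  # Inspect the assembled few-shot prompt.
    # label = complete(prompt).strip()  # e.g., expected to yield "negative"
```

Because the task specification lives entirely in the prompt, no gradient updates or task-specific training are needed, which is what lowers the barrier to entry for automating new language tasks.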