NLP New Papers / Books / Telegram Index

246 @nlp_new_papers

Description

Fully automatic :)

The channel's aim is mainly to introduce new work, not to substitute for reading the papers themselves.

Search keywords: [NLP, natural language processing]

Similar channel for the vision field:
https://t.me/Image_Processing_New_Papers

Admin:
@dangerous_seif

1 year, 10 months ago

ChatGPT summarized:
In this paper, the authors describe a new type of artificial intelligence called multi-step logical reasoning that leverages machine learning to perform inference and planning. They demonstrate that their system outperforms previous methods such as those of trained machine learning by incorporating explicit planning into the inference procedure. They discuss several different approaches they have used to develop this new kind of reasoning and explain how they stack up against other systems in order to predict future results. In particular, they examine two different types of reasoning: deterministic and stochastic. The deterministic approach is based on brute strength reasoning, which predicts the likelihood of reaching a given conclusion with little or no planning. The stochastic approach relies on explicit planning and involves multiple steps where the decisions are weighted according to how likely each step is to lead to the desired outcome. The latter approach has been shown to be more accurate and reliable than the former. The authors believe that this strategy can be extended to all sorts of

Abstract:
Language models have been shown to perform remarkably well on a wide range of natural language processing tasks. In this paper, we propose LEAP, a novel system that uses language models to perform multi-step logical reasoning and incorporates explicit planning into the inference procedure. Explicit planning enables the system to make more informed reasoning decisions at each step by looking ahead into their future effects. Moreover, we propose a training strategy that safeguards the planning process from being led astray by spurious features. Our full system significantly outperforms other competing methods on multiple standard datasets. When using small T5 models as its core selection and deduction components, our system performs competitively compared to GPT-3 despite having only about 1B parameters (i.e., 175 times smaller than GPT-3). When using GPT-3.5, it significantly outperforms chain-of-thought prompting on the challenging PrOntoQA dataset. We have conducted extensive empirical studies to demonstrate that explicit planning plays a crucial role in the system's performance.

Authors:
Hongyu Zhao, Kangrui Wang, Mo Yu, Hongyuan Mei

Publication date:
7 October, 2023

1 year, 10 months ago

Paper title:
[Explicit Planning Helps Language Models in Logical Reasoning](https://arxiv.org/abs/2303.15714)

1 year, 10 months ago

ChatGPT summarized:
In this chapter, Wollstonecraft explains the concept of "modify and correct intervention," which is a scientific term for an attempt to change the outcome of a scientific experiment by adding some kind of error correction. In other words, she wants to show how something that's hard to explain to a casual observer can be put into practice in a real-world situation. Here, she means that she thinks it might be possible to use machine learning to predict the future behavior of biological organisms and vice versa. Jacqueline Harding, Stanford University. Abstract: Neural models achieve high performance on many natural language processing (NLP) benchmark tasks, but their performance on these tasks is notoriously poorly understood. This paper attempts to fill that lacuna. In this paper, we introduce a framework for evaluating the representational claims made about neural NLP models, proposing three criteria with which to evaluate whether a component of a model represents a property and operationalizing these criteria using "probing classifiers." The project of

Abstract:
Despite its centrality in the philosophy of cognitive science, there has been little prior philosophical work engaging with the notion of representation in contemporary NLP practice. This paper attempts to fill that lacuna: drawing on ideas from cognitive science, I introduce a framework for evaluating the representational claims made about components of neural NLP models, proposing three criteria with which to evaluate whether a component of a model represents a property and operationalising these criteria using probing classifiers, a popular analysis technique in NLP (and deep learning more broadly). The project of operationalising a philosophically-informed notion of representation should be of interest to both philosophers of science and NLP practitioners. It affords philosophers a novel testing-ground for claims about the nature of representation, and helps NLPers organise the large literature on probing experiments, suggesting novel avenues for empirical research.

Authors:
Jacqueline Harding

Publication date:
7 October, 2023

1 year, 10 months ago

Paper title:
[Operationalising Representation in Natural Language Processing](https://arxiv.org/abs/2306.08193)

1 year, 10 months ago

Paper title:
[Exploring the Usage of Chinese Pinyin in Pretraining](https://arxiv.org/abs/2310.04960)

ChatGPT summarized:
In this paper, the authors explore the uses of pinyin in training machine learning models and propose a new method to do so. Their model uses both traditional Chinese characters and Pinyin-specific features to help it learn new words and phrases. They also perform tasks such as matchmaking and reverse transcription. They discuss the various approaches they use to train their model, including mixing and matching different types of data to different levels of difficulty. They conclude that their model is more robust and reliable than previous systems based on SOTA or machine learning algorithms.

Abstract:
Unlike alphabetic languages, Chinese spelling and pronunciation are different. Both characters and pinyin take an important role in Chinese language understanding. In Chinese NLP tasks, we almost adopt characters or words as model input, and few works study how to use pinyin. However, pinyin is essential in many scenarios, such as error correction and fault tolerance for ASR-introduced errors. Most of these errors are caused by the same or similar pronunciation words, and we refer to this type of error as SSP(the same or similar pronunciation) errors for short. In this work, we explore various ways of using pinyin in pretraining models and propose a new pretraining method called PmBERT. Our method uses characters and pinyin in parallel for pretraining. Through delicate pretraining tasks, the characters and pinyin representation are fused, which can enhance the error tolerance for SSP errors. We do comprehensive experiments and ablation tests to explore what makes a robust phonetic enhanced Chinese language model. The experimental results on both the constructed noise-added dataset and the public error-correction dataset demonstrate that our model is more robust compared to SOTA models.

Authors:
Baojun Wang, Kun Xu, Lifeng Shang

Publication date:
7 October, 2023

1 year, 10 months ago

ChatGPT summarized:
This paper discusses various machine learning techniques used to predict future weather and traffic patterns. Specifically, the authors examine two different types of forecasting: conventional time series modeling and natural language processing. They examine how well each of these approaches perform against each other in real-world scenarios and across multiple datasets. Forecasting is one of the most commonly used tools to predict business performance. This paper discusses several different approaches to predicting future value from time series data in order to streamline their forecasting efforts. They systematically perform various tasks within each dataset and report the results in Tables 10 through 45. In this chapter, the team develops a new statistical technique to predict market

Abstract:
The past decade has witnessed significant advances in time series modeling with deep learning. While achieving state-of-the-art results, the best-performing architectures vary highly across applications and domains. On the other hand, for natural language processing, Generative Pre-trained Transformer (GPT) has demonstrated impressive performance via training one general-purpose model across various textual datasets. It is intriguing to explore whether GPT-type architectures can be effective for time series, capturing the intrinsic dynamic attributes and leading to significant accuracy improvements. In this paper, we propose a novel framework, TEMPO, that can effectively learn time series representations. We focus on utilizing two essential inductive biases of the time series task for pre-trained models: (i) decomposition of the complex interaction between trend, seasonal and residual components; and (ii) introducing the selection-based prompts to facilitate distribution adaptation in non-stationary time series. TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains. Our experiments demonstrate the superior performance of TEMPO, with 20%-60% improvement over state-of-the-art methods on a number of time series benchmark datasets. This performance gain is observed not only in standard supervised learning settings but also in scenarios involving previously unseen datasets. This compelling finding highlights TEMPO's potential to constitute a foundational model building framework.

Authors:
Defu Cao, Furong Jia, Sercan O Arik, Tomas Pfister, Yixiang Zheng, Wen Ye, Yan Liu

Publication date:
7 October, 2023
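
The abstract above mentions decomposing a series into trend, seasonal and residual components but does not say how TEMPO does it. As a rough illustration only, here is a minimal sketch of a classical additive decomposition with a moving-average trend; the function name `decompose`, the `period` parameter, and the sample data are assumptions made for this example and are not taken from the paper.

```python
from statistics import mean


def decompose(series, period):
    """Additive split: series[i] = trend[i] + seasonal[i] + residual[i].

    Illustrative only; this is a textbook moving-average scheme,
    not the decomposition used by TEMPO.
    """
    n = len(series)
    half = period // 2
    # Trend: centered moving average over up to 2*half + 1 points
    # (the window shrinks near the edges of the series).
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)])
             for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonal: average detrended value at each position in the cycle.
    pattern = [mean(detrended[p::period]) for p in range(period)]
    seasonal = [pattern[i % period] for i in range(n)]
    # Residual: whatever trend and seasonal do not explain.
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


# Hypothetical data: a rising series with a repeating 4-step cycle.
series = [10, 12, 14, 11, 13, 15, 17, 14, 16, 18, 20, 17]
trend, seasonal, residual = decompose(series, period=4)
```

By construction the three components add back up to the original series, and the seasonal component repeats every `period` steps.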

1 year, 10 months ago

Paper title:
[TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting](https://arxiv.org/abs/2310.04948)

1 year, 10 months ago

ChatGPT summarized:
In this paper, the authors perform a deep dive into large language models' ability on understanding graph data. They systematically benchmark leading LLMs on diverse graph prediction tasks spanning node, edge, and graph levels. By comparing LLMs’ performance with specialized graph models, they offer insights into the strengths and limitations of employing LLMs for graph analytics. In addition, they demonstrate avenues for further exploration in applying them to graph analysis. The experiments are conducted on 5 commonly used graph benchmark datasets for node-level, edge-level and graph consolidation tasks. The results are presented in as many papers as possible. Large language models have achieved impressive performance on natural language processing tasks, but their capabilities on graph-structured data remain relatively unexplored. In this study, they conduct a series of experiments on 5 different graph task databases. The report covers sample datasets from several different research institutes including: Artificial intelligence (AI); Deep machine learning (D

Abstract:
Large language models (LLMs) have achieved impressive performance on many natural language processing tasks. However, their capabilities on graph-structured data remain relatively unexplored. In this paper, we conduct a series of experiments benchmarking leading LLMs on diverse graph prediction tasks spanning node, edge, and graph levels. We aim to assess whether LLMs can effectively process graph data and leverage topological structures to enhance performance, compared to specialized graph neural networks. Through varied prompt formatting and task/dataset selection, we analyze how well LLMs can interpret and utilize graph structures. By comparing LLMs' performance with specialized graph models, we offer insights into the strengths and limitations of employing LLMs for graph analytics. Our findings provide insights into LLMs' capabilities and suggest avenues for further exploration in applying them to graph analytics.

Authors:
Yuntong Hu, Zheng Zhang, Liang Zhao

Publication date:
7 October, 2023
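
The abstract above refers to "varied prompt formatting" for feeding graphs to LLMs without giving the formats. Purely as an illustration of the idea, the sketch below renders a small graph as an edge list plus known labels for a node-level question; the function `graph_to_prompt`, its wording, and the toy graph are hypothetical and are not the prompts used in the paper.

```python
def graph_to_prompt(edges, labels, target):
    """Render a small undirected graph as plain text for a node-label question.

    Hypothetical format for illustration; not taken from the paper.
    """
    lines = ["Graph edges (undirected):"]
    lines += [f"{u} -- {v}" for u, v in edges]
    lines.append("Known node labels:")
    # List labelled nodes in sorted order so the prompt is deterministic.
    lines += [f"{node}: {labels[node]}" for node in sorted(labels)]
    lines.append(f"Question: what is the label of node {target}?")
    return "\n".join(lines)


# Toy example: a 3-node path where the middle node is unlabelled.
prompt = graph_to_prompt(edges=[(0, 1), (1, 2)],
                         labels={0: "A", 2: "B"},
                         target=1)
print(prompt)
```

An edge list is only one possible encoding; adjacency lists or natural-language descriptions of neighbours are other formats one could compare.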

1 year, 10 months ago

Paper title:
[Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data](https://arxiv.org/abs/2310.04944)

1 year, 10 months ago

ChatGPT summarized:
This paper describes several experiments in which the authors attempt to apply machine learning to various natural language processing problems. They examine the performance of several different machine learning algorithms designed to assess the fluency of native speakers of different languages and cultures. In particular, they examine the results of a new test called the "Indonesian Multi-Task Language Understanding Benchmark." This test consists of 14,906 questions across 63 tasks and education levels, with 46% of the questions focusing on proficiency in the Indonesian language and knowledge of nine local languages and culture. Results across all subject areas are obtained from real-time data collected from teachers as well as third-party sources. The study uses machine learning to predict student progress on various subjects from primary school to university entrance exams. They note that only GPT-3.5 can pass the Indonesian primary school exam with limited knowledge of local Indonesian language and culture, but that other models such as BLOOMZ and Falcon perform at even lower

Abstract:
Large language models have made significant advancements in natural language processing (NLP), exhibiting human performance across various classic NLP tasks. These tasks, however, focus on structure and semantics, and few are designed to assess reasoning abilities and real-world knowledge, which are increasingly vital given that these models are trained on extensive textual data and information. While prior research primarily focuses on English, in this work, we gather a collection of exam problems from primary school to university entrance tests in Indonesia, and evaluate whether large language models can pass the exams. We obtain 14,906 questions across 63 tasks and levels, with 46% of the questions focusing on assessing proficiency in the Indonesian language and knowledge of nine local languages and cultures in Indonesia. Our empirical evaluations show that GPT-3.5 only manages to pass the Indonesian primary school level, with limited knowledge of the Indonesian local languages and cultures. Other smaller models such as BLOOMZ and Falcon fail the exams.

Authors:
Fajri Koto, Nurul Aisyah, Haonan Li, Timothy Baldwin

Publication date:
7 October, 2023
