Towards NLP🇺🇦

Description
All ngrams about Natural Language Processing that are of interest to @iamdddaryna

7 months ago

NLP for Positive Impact Workshop

We are thrilled to invite submissions to the Third Workshop on NLP for Positive Impact!

Workshop Website: https://sites.google.com/view/nlp4positiveimpact
Important Dates:
Submission Deadline: June 15, 2024, 11:59 PM AoE
Commitment Deadline: August 20, 2024
Notification of Acceptance: September 20, 2024
Camera-Ready Papers Due: October 3, 2024
Workshop Date: Co-located with EMNLP 2024 in November, Miami

This workshop is a platform to explore how the rapidly advancing field of NLP can address critical global issues and support the UN sustainability goals. We are looking for innovative research that focuses on the societal impact of NLP, including areas like healthcare, education, inequality, climate change, and more.

Special Theme: Tackling digital violence through NLP and AI
We encourage interdisciplinary collaborations and value submissions that connect NLP with other fields and NGOs. Submissions should include a discussion on the ethical and societal implications of the work, aiming for a positive impact.

Submission Types:
Case studies of real-world deployments
Position papers proposing new tasks or directions
Literature reviews
Philosophical discussions
Approaches to interdisciplinary collaboration
Ethical considerations
Join us in Miami and share your research with a vibrant community dedicated to using NLP for the greater good. Let's harness the power of language-oriented AI to make a positive difference in the world!

Contact: [email protected]

Looking forward to your contributions!

Organizers:
Zhijing Jin (Max Planck Institute & ETH Zurich)
Daryna Dementieva (Technical University of Munich)
Steven Wilson (Oakland University)
Oana Ignat (University of Michigan)
Jieyu Zhao (University of Maryland, College Park)
Joel Tetreault (Dataminr, Inc.)
Rada Mihalcea (University of Michigan)

8 months ago

TextDetox CLEF 2024: Test Phase

Our shared task on multilingual text detoxification (https://pan.webis.de/clef24/pan24-web/text-detoxification.html) is ongoing and reaching its final phase.

We are releasing the parallel pairs for the dev part:
https://huggingface.co/datasets/textdetox/multilingual_paradetox

and new toxic sentences for the test part:
https://huggingface.co/datasets/textdetox/multilingual_paradetox_test

We are waiting for your submissions here:
https://codalab.lisn.upsaclay.fr/competitions/18243
until May 12th.

You can submit for ANY language! There are 9 of them: English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian, Russian, and Amharic.

8 months, 1 week ago

TextDetox CLEF 2024 We are glad to invite you to participate in the first of its kind multilingual Text Detoxification shared task! https://pan.webis.de/clef24/pan24-web/text-detoxification.html TL;DR Task formulation: transfer a text style from toxic…

8 months, 2 weeks ago

A little guide to building Large Language Models in 2024

by Thomas Wolf

Video [link]
Presentation [link]


9 months, 1 week ago

Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models

PhD application season is starting. If you were afraid that the only topic you would be offered is prompting LLMs, here is some good, scientifically backed news for you: there is still plenty to do in NLP!

Amazing colleagues from the University of Michigan have prepared a list of still-open NLP research questions, 45 of them! Including:
Multilinguality
Reasoning
Knowledge Bases
Language Grounding
Computational Social Science
Online Environments
Child Language Acquisition
Non-verbal Communication
Synthetic Datasets
Interpretability
Efficient NLP
NLP in Education
NLP in Healthcare
NLP and Ethics

Yes, in some directions we have already come a long way, so other topics are becoming important and are only now becoming possible to explore ✨

Check the full text (to appear at COLING):
https://arxiv.org/abs/2305.12544

P.S. A reminder: we are running an important multilingual shared task on text detoxification for safer language. Start your first research experiments now!

9 months, 2 weeks ago

TextDetox CLEF 2024

We are glad to invite you to participate in the first-of-its-kind multilingual Text Detoxification shared task!

https://pan.webis.de/clef24/pan24-web/text-detoxification.html

TL;DR
Task formulation: transfer the text style from toxic to neutral (e.g., "what a fk is this about?" -> "what is this about?")
9 languages: English, Spanish, Chinese, Hindi, Arabic, German, Russian, Ukrainian, and Amharic
https://huggingface.co/textdetox

More details:

Identification of toxicity in user texts is an active area of research. Today, social networks such as Facebook and Instagram are trying to address the problem of toxicity. However, they usually just block such texts. We suggest a proactive reaction to toxicity from the user. Namely, we aim to present a neutral version of a user message that preserves its meaningful content. We call this task text detoxification.
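To make the task concrete, here is a minimal sketch of the simplest possible approach: deleting words found in a toxic-word list. This is only an illustration, not the organizers' system. The lexicon `TOXIC_LEXICON` and the function `delete_baseline` are hypothetical names invented for this example, and a real lexicon would be curated per language.

```python
import re

# Hypothetical toy lexicon, invented for this sketch.
# A real system would use a curated word list for each language.
TOXIC_LEXICON = {"fk", "damn", "stupid"}


def delete_baseline(text, lexicon=TOXIC_LEXICON):
    """Drop every whitespace-separated token whose word core is in the lexicon."""
    kept = []
    for token in text.split():
        # Strip punctuation so that "fk!" still matches the entry "fk".
        core = re.sub(r"\W+", "", token).lower()
        if core not in lexicon:
            kept.append(token)
    return " ".join(kept)


print(delete_baseline("what a fk is this about?"))
# prints: what a is this about?
```

Deletion alone leaves the ungrammatical "what a is this about?" rather than the desired "what is this about?", which hints at why the task calls for text generation or paraphrasing models rather than simple filtering.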

In this competition, we suggest you create detoxification systems for 9 languages from several linguistic families. However, the availability of training corpora will differ between the languages. For English and Russian, the parallel corpora of several thousand toxic-detoxified pairs (as presented above) are available. So, you can fine-tune text generation models on them. For other languages, for the dev phase, no such corpora will be provided. The main challenge of this competition will be to perform both supervised and unsupervised cross-lingual detoxification.

You are very welcome to test all modern LLMs on text detoxification and safety with our data as well as experiment with different unsupervised approaches based on MLMs or other paraphrasing methods!

The final leaderboard will be built on a manual evaluation of a subset of the test set, performed via crowdsourcing on the Toloka.ai platform.

In the end, you will have an opportunity to write and then present a paper at CLEF 2024 (https://clef2024.imag.fr/) which will take place in Grenoble, France!

Important Dates
February 1, 2024: First data available and run submission opens.
April 22, 2024: Registration closes.
May 6, 2024: Run submission deadline and results out.
May 31, 2024: Participants paper submission.
July 8, 2024: Camera-ready participant papers submission.
September 9-12, 2024: CLEF Conference in Grenoble and Touché Workshop.

10 months, 1 week ago

Ukrainian Texts Classification Corpora p2

We continue to enrich datasets for the classification of texts in the Ukrainian language. This time, we worked on the translation of English-language data into Ukrainian and obtained:

  1. Ukrainian NLI corpus: https://huggingface.co/datasets/ukr-detect/ukr-nli-dataset-translated-stanford translated from Stanford SNLI.
  2. Ukrainian Formality corpus: https://huggingface.co/datasets/ukr-detect/ukr-formality-dataset-translated-gyafc translated from English GYAFC
  3. In addition to the toxicity corpus presented previously, translated data from the English Jigsaw Toxicity Classification dataset https://huggingface.co/datasets/ukr-detect/ukr-toxicity-dataset-translated-jigsaw

You are very welcome to use and test them!

11 months, 2 weeks ago

ELLIS Winter School on Foundation Models
Amsterdam, 12-15th March
https://amsterdam-fomo.github.io/

Foundation models, and their origin, analysis, and development, have typically been associated with the US and Big Tech. Yet a critical share of important insights and novel approaches comes from Europe, both within academia and industry. Part of this winter school's goal is to highlight these fresh perspectives and give the students an in-depth look into how Europe is guiding its own research agenda with unique directions and bringing together the community. The workshop will take place at the University of Amsterdam.

Lectures from top researchers from DeepMind, Google Research, and top EU universities.

Deadline to apply: 15th February 2024 23:59 CET

11 months, 3 weeks ago

Happy New Year 2024

Thank you for being interested in NLP and my view on it!

For the new year, I have some new ideas for the community. Stay tuned!

Be professional, believe in yourself, be open to new ideas, and all other positive tokens in your texts!

1 year, 2 months ago

A Benchmark Dataset to Distinguish Human-Written and Machine-Generated Scientific Papers

SCIENTISTS ARE GOING TO SUBMIT PAPERS WRITTEN BY CHATGPT, SCIENCE IS GONNA DIE

Or not?

Our chair's work on whether we can detect machine-generated or paraphrased articles.
TL;DR: yes, we can, even with logistic regression.

For generation, we tried out: GPT-2, GPT-3, ChatGPT, Galactica, and SciGen.
Each article consists of: Abstract + Intro + Conclusion.

Dataset with ~70k rows of scientific texts generated by different models.
There, you can also find fine-tuned Galactica and RoBERTa models for detection.

The full paper with all tables of results and explainability investigations [link]
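For intuition on the "even with logistic regression" point, here is a rough sketch of a logistic regression detector on bag-of-words counts, written in plain Python. Everything here is an assumption made for illustration. The four training sentences are invented and do not come from the benchmark, and the post does not say which features the paper used. `TRAIN`, `featurize`, `train_logreg` and `predict` are hypothetical names.

```python
import math
from collections import Counter

# Invented toy sentences (NOT from the benchmark): label 0 = human, 1 = machine.
TRAIN = [
    ("we ran the experiment twice and honestly our results surprised us", 0),
    ("our annotators disagreed on several hard cases", 0),
    ("in conclusion this paper proposes a novel framework", 1),
    ("in conclusion the proposed framework achieves promising results", 1),
]

# Vocabulary built from the training texts only.
VOCAB = sorted({word for text, _ in TRAIN for word in text.split()})


def featurize(text):
    """Bag-of-words counts over VOCAB; words outside VOCAB are ignored."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in VOCAB]


def train_logreg(examples, lr=0.1, epochs=200):
    """Fit weights and bias with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(VOCAB)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = featurize(text)
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "machine"
            grad = p - label                # gradient of log loss w.r.t. z
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def predict(text, weights, bias):
    """Return 1 (machine) if the linear score is positive, else 0 (human)."""
    z = bias + sum(w * xi for w, xi in zip(weights, featurize(text)))
    return 1 if z > 0 else 0


WEIGHTS, BIAS = train_logreg(TRAIN)
print([predict(text, WEIGHTS, BIAS) for text, _ in TRAIN])
```

With real data you would train on the released dataset with held-out evaluation. The toy run only shows the mechanics of the classifier, not the paper's results.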
