low-resource-nlp
Here are 106 public repositories matching this topic...
MTEB: Massive Text Embedding Benchmark
Updated Jun 17, 2026 - Python
[EMNLP 2023] 💬 Language Identification with Support for More Than 2000 Labels
Updated Apr 15, 2026 - Python
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Updated Jun 11, 2026 - Python
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
Updated Oct 23, 2024 - Python
NLP pipelines for Tagalog using spaCy
Updated Jul 20, 2025 - Python
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African Languages: https://afrisenti-semeval.github.io/
Updated Jan 10, 2024 - Jupyter Notebook
Code and datasets for the ACL 2021 paper "OntoED: Low-resource Event Detection with Ontology Embedding"
Updated Apr 19, 2022 - Python
A Scandinavian Benchmark for sentence embeddings
Updated Dec 5, 2025 - Python
[SIGIR 2023] Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction
Updated Apr 5, 2023 - Python
Repository for NaijaSenti, a Lacuna-funded project to develop a sentiment corpus for four Nigerian languages: Igbo, Hausa, Yoruba, and Pidgin.
Updated Oct 14, 2025 - Jupyter Notebook
[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)
Updated Jan 17, 2026 - Python
[ACL'24 Findings] Teaching Large Language Models an Unseen Language on the Fly
Updated Jan 6, 2026 - Python
Materials for AACL-IJCNLP-2022 tutorial: Efficient and Robust Knowledge Graph Construction
Updated Feb 3, 2023
Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.
Updated Jul 26, 2025 - Jupyter Notebook
A curated list of awesome sentiment analysis studies in which attitude is the position that a subject conveys in the text towards another object mentioned there, such as an entity or an event.
Updated Mar 23, 2026
Awesome Lao Natural Language Processing
Updated Mar 7, 2025
Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"
Updated Mar 29, 2024 - Python
This repository contains the code, data, and associated models of the paper titled "BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset", accepted in Proceedings of the Asia-Pacific Chapter of the Association for Computational Linguistics: AACL 2022.
Updated Nov 14, 2022 - Python
Pashto Natural Language Processing Toolkit
Updated May 21, 2025
A comprehensive overview of research on Natural Language Processing (NLP) for the Manipuri language.
Updated Nov 30, 2024