This competition invites you to fine-tune Gemma 2 for a specific language or cultural context. By creating clear, easy-to-follow notebooks, you’ll empower others to learn and contribute to the development of language models for diverse communities.
With over 7,000 languages and countless cultural differences, AI has the potential to foster global understanding. In a step towards broader linguistic inclusion, we’re launching a Kaggle competition focused on adapting Gemma 2, Google’s open model family, for 73 eligible languages. These languages were selected to represent a diverse range and to align with the expertise of our judging panel for effective evaluation. Our initial focus on these languages will allow us to establish a robust foundation of techniques and resources that will later enable us to support under-resourced languages.
You’re challenged to create notebooks that demonstrate the complete process of adapting Gemma 2, including:
- Dataset Creation/Curation: Explain how you crafted or curated the dataset used for fine-tuning. This includes details about data sources, preprocessing steps, and any considerations related to data quality and cultural sensitivity.
- Fine-tuning Gemma: Provide a detailed explanation of your fine-tuning approach, including hyperparameter choices, training procedures, and any techniques used to enhance performance (e.g., few-shot prompting, retrieval-augmented generation).
- Inference and Evaluation: Demonstrate how to run inference with your fine-tuned model and discuss how you evaluated its performance.
Your notebooks should be designed to be easily understood and replicated by others, enabling them to adapt Gemma 2 for even more languages and cultural contexts. Consider exploring areas like:
- Language Fluency: Fine-tune Gemma to generate fluent and accurate text in the target language, potentially for tasks like translation, dialogue generation, or storytelling.
- Literary Traditions: Adapt Gemma for generating or analyzing poetry, folklore, or other traditional literary forms.
- Historical Texts: Fine-tune Gemma to understand and process historical documents or scripts.
Participants will also need to publish their trained models on Kaggle Models.
Ready to contribute to a more inclusive and interconnected world? Join the competition today and help us unlock the potential of language AI for everyone!
Awards:- $150,000
Deadline:- 14-01-2025