GMC
Competitions | Hackathons | Contests | scholarships

LLMs – You Can’t Please Them All

by saadithya
January 4, 2025
in Expired

This competition challenges you to identify exploits in an LLM-as-a-judge system designed to evaluate the quality of essays. You’ll be given a list of essay topics, and your goal is to submit an essay that maximizes disagreement between the LLM judges. Your work will help build a better understanding of the capabilities and limitations of using LLMs for subjective evaluation tasks at scale.
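The post doesn’t specify how disagreement between judges is measured, so the exact competition metric is an open question; one plausible proxy is the mean absolute deviation of the per-judge scores from their mean, sketched below (the `disagreement` function and all scores are illustrative assumptions, not the official scoring code):

```python
from statistics import mean

def disagreement(scores):
    """Mean absolute deviation of per-judge scores from their mean.

    Higher values mean the judges disagree more. This is an
    illustrative proxy, not the competition's actual metric.
    """
    m = mean(scores)
    return mean(abs(s - m) for s in scores)

# Three judges agree closely -> low disagreement
print(disagreement([7, 7, 8]))
# One judge is coerced into an inflated score -> high disagreement
print(disagreement([2, 3, 9]))
```

Under a metric like this, an essay that every judge rates the same scores zero, so a winning submission has to push individual judges apart rather than simply score well.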

It’s increasingly common to use LLMs for subjective evaluations such as ranking and scoring the quality of generated text. However, any automated rating system is vulnerable to exploits. Different models will have different degrees of self-bias, position-bias, length-bias, and style-bias that might negatively impact their ability to provide robust assessments (Zheng 2023, Wang 2023, Panickssery 2024). Likewise, different models will have different degrees of vulnerabilities to targeted exploits, such as universal jailbreaks, that can be used to misguide the system (Wallace 2021, Zou 2023, Li 2024, Rando 2024).

One method to improve the robustness of automated judging systems is to combine multiple LLMs into an LLM-judging committee. Each model is only distantly related to the others, decreasing the chance of shared vulnerabilities. An advantage of LLM-judging committees is that they are less sensitive to exploits that impact only a single model. This competition attempts to answer the question of whether individual LLM judges can be coerced into returning inflated scores that diverge substantially from the group consensus.
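Why a committee blunts single-model exploits can be shown with a toy example. Assuming the committee aggregates per-judge scores (the scores and the choice of mean vs. median aggregation below are hypothetical, not taken from the competition), a robust aggregator such as the median barely moves when one judge is coerced:

```python
from statistics import mean, median

# Hypothetical per-judge scores for one essay; in `hacked`, judge 2
# has been coerced by a targeted exploit into an inflated score.
honest = [6, 7, 6]
hacked = [6, 30, 6]

# Mean aggregation is shifted substantially by the single
# exploited judge (from roughly 6.3 up to 14)...
print(mean(honest), mean(hacked))

# ...while median aggregation is unchanged (6 in both cases),
# illustrating why a committee with robust aggregation is less
# sensitive to single-model exploits.
print(median(honest), median(hacked))
```

An exploit that fools only one committee member therefore has limited effect on the final score, which is exactly why this competition asks whether single judges can still be pushed far from the consensus.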

By identifying exploits used to unfairly bias an evaluation in a given direction, you will help the ML community better understand the strengths and weaknesses of using AI systems to make subjective decisions at scale.

Awards:

  • 1st Place – $12,000
  • 2nd Place – $10,000
  • 3rd Place – $10,000
  • 4th Place – $10,000
  • 5th Place – $8,000

We’re excited to launch this experimental competition and want to be completely transparent about its nature. Because LLM behavior can be unpredictable, we might encounter unexpected issues that affect scoring or the competition’s overall integrity. We plan to award points and medals, but there may be course corrections along the way, and we reserve the right to remove points and medals. We’ll keep participants informed and make any necessary adjustments with ample time remaining in the competition.

Deadline: 25 February 2025

Take this challenge

Tags: LLMs, You Can't Please Them All

© 2025 Givemechallenge

No Result
View All Result
  • Home
  • Login
  • Premium Competitions
  • Competitions
  • Idea competition
  • Internship
  • Undergraduate scholarships
  • Design competitions
  • Photography Competitions
  • Writing Competitions
  • Video competitions
  • Coding competitions
  • Medical
  • Music and art
  • Poetry Competitions
  • Premium Android App
  • Free Android App
  • Free IOS App

© 2025 Givemechallenge

Welcome Back!

Login to your account below

Forgotten Password? Sign Up
/*

Create New Account!

Fill the forms bellow to register

All fields are required. Log In
*/

Retrieve your password

Please enter your username or email address to reset your password.

Log In
loader