Grading Guidelines for Subject Matter Experts

Your expertise plays a critical role in ensuring that our model evaluations are thorough, objective, and actionable. By providing careful and consistent grading, you help shape the future of AI model selection and prompt optimization, ensuring that our evaluation platform delivers reliable and meaningful results.

Why Your Grading Matters

Your evaluations impact every stage of the model selection process:
By assessing responses against custom grading parameters, your objective grading helps identify strengths and weaknesses in model outputs.
Actionable feedback from your evaluations informs prompt adjustments and model improvements.
Consistent, unbiased grading builds a trusted evaluation framework that scales from small samples to large datasets.
Your assessments are integrated into our proprietary Jury LLM, which learns from expert inputs to provide broader evaluations across various projects.

Understanding Custom Grading Parameters

Our custom grading parameters are designed to capture key aspects of model performance, such as:
How correct and precise is the response?
Does the output align with the expected voice and manner?
Are all elements of the query adequately addressed?
Is the response clear, coherent, and well-organized?
Does the output provide innovative and relevant insights?
Each parameter comes with a detailed description and a set of expectations, and users can create additional parameters specific to their jobs. Before you begin grading:
Familiarize yourself with each parameter’s definition and criteria.
Consider the project’s context and objectives when applying the grading scale.
If any parameter seems unclear, consult the project documentation or reach out to the support team.

Do

Evaluate Objectively
Base your grading on the criteria provided, ensuring consistency across all responses.

Provide Detailed Feedback
Offer clear, constructive comments that explain your score and suggest potential improvements.

Be Consistent
Use the same standards for similar responses to maintain a reliable evaluation process.

Document Your Rationale
Record your reasoning where possible. This transparency aids in refining the process and provides valuable insights for future evaluations.

Take Your Time
Ensure you thoroughly assess each response, especially when determining nuances in tone and context.

Follow Project Guidelines
Adhere strictly to any additional instructions specific to the project.

Don’t

Let Personal Bias Influence Your Grading
Maintain neutrality by focusing solely on the criteria outlined.

Rush Through Evaluations
Hasty assessments can lead to inconsistencies and may impact overall model selection.

Use Vague Feedback
Avoid comments like “good” or “bad” without context. Specific insights are far more useful.

Mix Up Criteria
Keep each parameter distinct. Evaluate accuracy separately from tone, clarity, or creativity.

Ignore Context
Each project may have unique requirements; always consider these when grading.

Impact on Model Selection

Your grading directly feeds into the model selection process:
Your expert evaluations train our proprietary system to scale assessments across larger datasets.
Detailed feedback and scores help generate comprehensive reports and recommendations for model improvements.
By identifying trends and areas for improvement, your input drives iterative enhancements in model performance and prompt engineering.