Metrics Overview

Modified on Thu, 19 Oct 2023 at 07:46 AM

Metrics are essential for evaluating the quality and performance of AI models. They help users to compare different models, identify strengths and weaknesses, and optimize their solutions. However, there are many different types of metrics for different modalities and tasks, and it can be challenging to understand and use them correctly. 

That's why aiXplain provides this guide, which introduces the evaluation metrics available for benchmarking AI models on our platform:

Translation Metrics

  • BLEU - Measures word n-gram overlap with a reference translation. Favors fluency over adequacy.

  • chrF - Measures character n-gram overlap with the reference. Alleviates BLEU's sensitivity to morphology.

  • METEOR - Matches unigrams and stems/synonyms between generated and reference translations.

  • TER - Counts the edits required to change the hypothesis so that it matches the reference.

  • COMET-DA - Learned metric that predicts human direct assessment (DA) scores from the source, the hypothesis, and the reference. Highly reliable.
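
To make the idea of n-gram overlap behind BLEU and chrF more concrete, here is a minimal sketch in plain Python. It computes only a clipped n-gram precision, once over words and once over characters. It is not a full BLEU or chrF implementation (there is no brevity penalty, no averaging over n-gram orders, and no recall or F-score), and it is not the code the platform runs. The function names and example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Share of hypothesis n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a correct word does not inflate the score.
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


# Illustrative sentences (assumed, not taken from any benchmark).
reference = "the cat sat on the mat"
hypothesis = "the cat sits on the mat"

# Word-level overlap, in the spirit of BLEU.
word_p1 = clipped_precision(hypothesis.split(), reference.split(), 1)
word_p2 = clipped_precision(hypothesis.split(), reference.split(), 2)

# Character-level overlap, in the spirit of chrF.
char_p3 = clipped_precision(list(hypothesis), list(reference), 3)

print(f"word 1-gram precision: {word_p1:.3f}")
print(f"word 2-gram precision: {word_p2:.3f}")
print(f"char 3-gram precision: {char_p3:.3f}")
```

In this example the single changed word ("sat" vs. "sits") breaks two of the five word bigrams, while most character trigrams still match. This is the intuition behind character-level metrics being more forgiving of morphological variation.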

Transcription Metrics

  • WER - Measures the word-level insertions, deletions, and substitutions needed to match the reference transcript, relative to its length.
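
As a rough illustration of how WER is commonly computed, the sketch below divides the word-level edit distance (insertions, deletions, and substitutions) by the number of words in the reference. It assumes simple whitespace tokenization and applies no text normalization, so its numbers may differ from those reported on the platform. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j tokens of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_token in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_token == hyp_token else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# Illustrative transcripts (assumed): one substitution and one insertion.
wer = word_error_rate(
    "the quick brown fox jumps",
    "the quick brown dog jumps high",
)
print(f"WER: {wer:.2f}")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.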

Speech Quality Metrics

  • PESQ - Predicts subjective opinion scores. Range 1-5. Higher = better quality.

  • COMET-QE - Estimates speech quality without reference. Useful for model selection.

  • NORESQA-MOS - Predicts human mean opinion scores without reference.

  • DNSMOS - Non-reference speech quality score. Accounts for distortions.

  • VISQOL - Estimates speech quality from vocoder features without reference.

  • WARP-Q - Lightweight non-reference speech quality score based on priors.

  • CLSSS - Estimates human scores without reference. Less reliable than reference-based.

We hope this article has been helpful and informative. Thank you for choosing aiXplain as your AI creation and optimization partner. If you have any questions or feedback, please don't hesitate to reach out to us.

Contact us at:
