Over the weekend, I tested 3 LLMs for generating relevancy scores.
Haiku didn't follow my instructions most of the time. Btw, I used Claude to write the prompt.
Gemini Flash worked pretty well.
Perplexity worked really well a couple of times, then it started messing up the response format.
Using Gemini Flash for now.
Still trying to solve the hallucination problem.
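
For the response format issue, one option is to validate every reply before using it. Here is a rough sketch, not what I actually run: it assumes the prompt asks the model for a JSON object like {"score": 0.8} with a score between 0 and 1. The function name `parse_relevancy_score` and the "score" key are made up for illustration.

```python
import json
import re
from typing import Optional


def parse_relevancy_score(raw: str) -> Optional[float]:
    """Return a score in [0, 1], or None if the reply is not in the expected format."""
    # Assumption: the prompt asks for a JSON object such as {"score": 0.8}.
    # Models sometimes wrap the JSON in extra prose or code fences,
    # so take the first {...} span found in the reply.
    match = re.search(r"\{.*?\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    score = data.get("score") if isinstance(data, dict) else None
    # Reject booleans (bool is a subclass of int) and non-numeric values.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    # Reject values outside the assumed 0..1 range.
    if not 0.0 <= score <= 1.0:
        return None
    return float(score)
```

If this returns None, the reply can be retried or skipped instead of passing a broken score downstream. It only checks the format, so it does nothing for hallucinated scores.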