I summarise the kinds of evaluations that are needed for a structured data generation task.
How to think about creating a dataset for LLM finetuning evaluation