In our analysis of the failure cases from the IEP evaluation, we sought to identify the factors limiting LLM performance. Given the pronounced disparity between open-source models and GPT models, with some of the former consistently failing to produce coherent responses, our analysis focused on GPT-4, the most advanced model avail