Add Llama fp8 tests to the [toy tests](https://github.com/iree-org/iree-test-suites/blob/main/sharktank_models/llama3.1/test_llama.py) to expand coverage and catch compile regressions like [iree-org/iree#20528](https://github.com/iree-org/iree/issues/20528).