# LLM inference architecture