Large language models (LLMs) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. Modern LLMs with billions of parameters have grown so large that a single computing node can rarely train, fine-tune, or run inference on them. Consequently, a range of distributed computing techniques has been introduced in the literature to make effective use of LLMs. We explore the application of distributed computing techniques to LLMs from two angles.
\begin{itemize}
\item We study techniques that democratize LLMs, that is, techniques that enable large models to run on consumer-grade computers. Here, we also implement a novel metaheuristics-based modification to an existing system.
\item We perform a comparative study of three state-of-the-art LLM serving techniques.
\end{itemize}