This project is a modified version of langchain-ChatGLM, so all of the functionality implemented by the framework comes from that project. It serves only as a reference for how to add TigerBot model loading.
🤖️ A question-and-answer application over local knowledge bases, built on the ideas behind langchain. The goal is a knowledge base Q&A solution that works well for Chinese-language scenarios, supports open source models, and can run offline.
💡 Inspired by GanymedeNil's project document.ai and the ChatGLM-6B Pull Request created by Alex Zhangji, this project builds a local knowledge base Q&A application in which every step of the process can be implemented with open source models. It now supports direct access to large language models such as ChatGLM-6B, as well as access to models such as Vicuna, Alpaca, LLaMA, Koala, and RWKV through the fastchat API.
✅ In this project, the default embedding model is GanymedeNil/text2vec-large-chinese and the default LLM is TigerBot-7B-sft. With these models, the project can be deployed entirely offline and privately using only open source models.
⛓️ The implementation principle of this project is shown in the figure below. The process is: load files -> read text -> split the text into segments -> vectorize the text -> vectorize the question -> retrieve the top k text vectors most similar to the question vector -> add the matched text to the prompt as context, together with the question -> submit the prompt to the LLM to generate the answer.
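The retrieval part of this pipeline can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the function names (`split_text`, `embed`, `top_k`, `build_prompt`), the sample document, and the prompt wording are all hypothetical, and a toy bag-of-words vector stands in for the real embedding model (GanymedeNil/text2vec-large-chinese). The final step, submitting the prompt to the LLM, is left out.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=50):
    """Split text on sentence boundaries, packing sentences into chunks
    of at most roughly chunk_size characters."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts instead of a neural model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the question vector."""
    question_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(question_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the matched text (as context) and the question into one prompt."""
    context = "\n".join(context_chunks)
    return (
        f"Known information:\n{context}\n\n"
        f"Answer the question based on the information above.\n"
        f"Question: {question}"
    )


# Hypothetical sample data standing in for a loaded file.
document = (
    "TigerBot is a large language model. "
    "The default embedding model is text2vec. "
    "Bananas are yellow fruit."
)
question = "Which embedding model is the default?"

chunks = split_text(document)
matched = top_k(question, chunks, k=1)
prompt = build_prompt(question, matched)
print(prompt)  # In the real pipeline, this prompt would be submitted to the LLM.
```

In a real deployment the word-count vectors would be replaced by embeddings from the configured model and the sorted list by a vector store lookup, but the flow of data between the steps is the same.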