ChatGLM-6B

Tsinghua KEG
6.5K liked
About ChatGLM-6B

ChatGLM-6B is an open-source, Chinese-English bilingual dialogue language model based on the General Language Model (GLM) architecture, with 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (only 6GB of GPU memory is required at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese Q&A and dialogue. After training on roughly 1T tokens of Chinese and English text, supplemented by supervised fine-tuning, feedback bootstrapping, reinforcement learning from human feedback, and other techniques, the 6.2-billion-parameter ChatGLM-6B is able to generate answers that align reasonably well with human preferences. For more information, please refer to our blog.
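For local deployment, the snippet below is a minimal sketch using the Hugging Face transformers library; the quantize(4) call is a helper shipped with the checkpoint's custom modeling code (hence trust_remote_code=True), selecting the INT4 level quoted above.

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
# quantize(4) selects the INT4 quantization level; per the card, roughly
# 6GB of GPU memory then suffices for inference.
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()
model = model.eval()

# Multi-turn dialogue: `history` carries earlier turns back into the model.
response, history = model.chat(tokenizer, "Hello, what can you do?", history=[])
print(response)
```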

To make it easier for downstream developers to customize the model for their own application scenarios, we have also implemented a parameter-efficient fine-tuning method based on P-Tuning v2 (usage guide). At the INT4 quantization level, only 7GB of GPU memory is required to start fine-tuning.
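As a rough sketch of what that setup looks like: P-Tuning v2 trains only a small prefix encoder while the 6B backbone stays frozen. The pre_seq_len field and prefix_encoder module name below follow the repository's p-tuning example and should be treated as assumptions about the checkpoint's custom code.

```python
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
config.pre_seq_len = 128  # number of trainable prefix tokens (P-Tuning v2)

model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)
model = model.quantize(4).half().cuda()   # INT4: ~7GB of GPU memory to start fine-tuning
model.transformer.prefix_encoder.float()  # keep the trainable prefix in fp32, as in the example

# Freeze the backbone; only the prefix encoder receives gradients.
for name, param in model.named_parameters():
    param.requires_grad = "prefix_encoder" in name
```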

However, due to its small size, ChatGLM-6B is known to have considerable limitations, such as factual and mathematical-reasoning errors, occasional generation of harmful or biased content, weak handling of long context, confusion about its own identity, and generating answers to English instructions that contradict the ones it gives for the same instructions in Chinese. Please understand these issues before use to avoid misunderstandings. A larger ChatGLM based on the 130-billion-parameter GLM-130B is currently in internal beta testing.

Visit Official Website

https://github.com/THUDM/ChatGLM-6B

Reviews
Smarty
This is definitely trained in Chinese
Jackierun
Personally, I feel that although ChatGLM-6B is very strong compared with models of a similar parameter count, it still has little practical value.
Community Posts
Tsinghua KEG
RT @kaggle: 🤖 Now on #KaggleModels!

Discover @thukeg's ChatGLM, an open bilingual language model based on the General Language Model (GLM…
Tsinghua KEG
RT @Xianbao_QIAN: Here comes the Transformers compatible version of the CogVLM model from @thukeg

THUDM/cogvlm-ch...
Gradio demo: ht…
THUDM/cogvlm-chat-hf · Hugging Face
Tsinghua KEG
#CogVLM
-------------
From @SkalskiP: I'm super impressed with Qwen-VL and CogVLM!

I've done a few (probably very naive) tests to compare LLaVA, BakLLaVA, Qwen-VL, CogVLM, and GPT-4V.

Tests include VQA, OCR, and zero-shot detection.

Any ideas on what else I should test?
Tsinghua KEG
GitHub - THUDM/...
-------------
From @SkalskiP: Looking for GPT-4V alternatives?

- LLaVA
- BakLLaVA
- CogVLM
- Qwen-VL

different tasks:
- VQA: answering questions about images
- OCR: reading text
- zero-shot detection

link:
GitHub - THUDM/CogVLM: a state-of-the-art-level open visual language model | multimodal pretrained model
Tsinghua KEG
RT @NielsRogge: Very nice: the authors of CogVLM (one of the best open-source alternatives to GPT4-V) have added a @huggingface compatible…
Tsinghua KEG
💜💙🤗🤗🧡♥️
Thank you for all the LIKEs!
-------------
From @Omar Sanseviero: The top 15 most-liked organizations on @huggingface

1. @StabilityAI 20k likes
2. @AIatMeta 20k
3. @runwayml 11k
4. CompVis 10k
5. @thukeg 7k
6. @BigscienceW 7k
7. @TIIuae 7k
8. @Microsoft 6.5k
9. @GoogleAI 6k
10. @OpenAI 4k
11. @BigCodeProject 4k
12. @MosaicML 4k
13. @UKPLab 3k…
Tsinghua KEG
RT @karpathy: New YouTube video: 1hr general-audience introduction to Large Language Models
[1hr Talk] Intr...

Based on a 30min talk…
[1hr Talk] Intro to Large Language Models
Tsinghua KEG
Big congrats to Prof Shimin! Very well deserved!! We’ve learned so much from him!!
-------------
From @Tsinghua CS: Professor Hu Shimin from #Tsinghua DCST was elected an Academician of the Chinese Academy of Sciences for his great contributions to Computer Graphics, Geometric Computing, and Artificial Intelligence! He also developed the widely used DL framework Jittor. Congrats to Prof Hu! 🎉🎉🎉
Tsinghua KEG
wow, proud that we @thukeg (thudm) from @thudcst @Tsinghua_Uni are among the very top. Also thanks @huggingface 🤗

THUDM
-------------
From @Omar Sanseviero: I was curious about which universities are using Hugging Face. The answer: over 5,000 groups 🤯

Explore all universities at

Some groups with the most likes: @thukeg, @HelsinkiNLP, @humphrey_shi Labs, @UKPLab, @uwnlp, and @stanfordnlp 🔥
THUDM
Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University. ChatGLM, WebGLM, VisualGLM, GLM-130B, CodeGeeX, CogDL, CogView, CogVideo, CogVLM, AMiner - THUDM
Tsinghua KEG
How to #RLHF for LLMs: #PPO or #DPO?
Introducing #BPO (black-box prompt optimization) to align LLMs without model training.

1) ChatGPT + BPO > ChatGPT
2) GPT-4 + BPO > GPT-4
3) Vicuna + BPO > Vicuna + PPO/DPO
4) Vicuna + DPO + BPO > Vicuna + DPO

t.co/4BDRHjHu6N
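To illustrate the idea only (this is not the paper's released code): BPO puts a learned prompt-rewriting model in front of a frozen black-box LLM, so alignment comes from better prompts rather than weight updates. Both model IDs below are generic stand-ins.

```python
from transformers import pipeline

# Stand-in models; BPO trains a dedicated rewriter on preference data.
rewriter = pipeline("text2text-generation", model="google/flan-t5-base")
black_box_llm = pipeline("text-generation", model="gpt2")

def bpo_answer(user_prompt: str) -> str:
    # Step 1: rewrite the prompt so it elicits a better-aligned answer.
    better = rewriter(f"Improve this instruction: {user_prompt}")[0]["generated_text"]
    # Step 2: query the black-box model with the optimized prompt only;
    # its weights are never touched, hence "alignment without model training".
    return black_box_llm(better, max_new_tokens=64)[0]["generated_text"]

print(bpo_answer("Explain RLHF briefly."))
```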
Tsinghua KEG
Thanks AK!
#CogVLM: an #open visual language model. Chat w CogVLM about images @Gradio GitHub - THUDM/...

CogVLM-17B tops 10+ benchmarks: NoCaps, Flickr30k, RefCOCO, RefCOCO+, RefCOCOg, Visual7W, GQA, SciQA, VizWiz VQA & TDIUC, surpassing or matching PaLI-X 55B.
-------------
From @AK: CogVLM: Visual Expert for Pretrained Language Models

paper page:

introduce CogVLM, a powerful open-source visual language foundation model. Different from the popular shallow alignment method which maps image features into the input space of language…
GitHub - THUDM/CogVLM: a state-of-the-art-level open visual language model | multimodal pretrained model
Tsinghua KEG
RT @cenyk1230: #ChatGLM3-Turbo, the third-generation model developed by #Tsinghua & #ZhipuAI, has remarkably set a new record in the SuperC…
Tsinghua KEG
RT @_akhaliq: CogVLM: Visual Expert for Pretrained Language Models

paper page: Paper page - Co...

introduce CogVLM, a powerful open-…
Paper page - CogVLM: Visual Expert for Pretrained Language Models
Tsinghua KEG
RT @trending_repos: Trending repository of the day 📈

ChatGLM3 by @thukeg

ChatGLM3 series: Open Bilingual Chat LLMs | open bilingual dialogue language models

Last 2…
Tsinghua KEG
RT @RichardSocher: Keep AI open.
Tsinghua KEG
OPEN #ChatGLM3-6B: the 3rd gen
1) tops 44 tasks among <10B models
2) supports tool/function call, code interpreter, agent tasks, 32K
GitHub - THUDM/...
🤗THUDM/chatglm3-...

#ChatGLM-6Bs: 10M 🤗downloads, thank YOU! @huggingface @ClementDelangue @_akhaliq @osanseviero
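A minimal usage sketch for the new checkpoint, assuming it follows the same transformers chat pattern as ChatGLM-6B; the tool/function-calling and code-interpreter modes mentioned above are driven by special system prompts described in the repository's demos and are not shown here.

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).half().cuda().eval()

# Plain multi-turn chat; `history` accumulates the dialogue turns.
response, history = model.chat(tokenizer, "What can you do?", history=[])
print(response)
```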
Tsinghua KEG
RT @vanstriendaniel: AgentInstruct by @thukeg available on the @huggingface Hub is a dataset of 1,866 high-quality interactions designed to…
Tsinghua KEG
Btw, our CogVLM arXiv submission (arXiv ID 5148899) has been "on hold" for about two weeks without clear reasons. Is arXiv supposed to be a timely "publishing" model? Please help if possible @arxiv @_akhaliq 😖
-------------
From @Tsinghua KEG (THUDM): #CogVLM: open vision language models - deep fusion between the LLM & image encoder with visual experts!

CogVLM-17B tops 14 cross-modal benchs, beats #BLIP2, #PaLI-17B/X-55B, #PaLM-E-84B.

Paper:
HF🤗:

@_akhaliq @osanseviero @huggingface
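For readers wondering what the "visual experts" are: conceptually, each attention layer gives image tokens their own trainable projections while text tokens keep the frozen language model's, and all tokens attend jointly. The PyTorch sketch below is a heavy simplification under that assumption, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualExpertAttention(nn.Module):
    """Conceptual sketch of a visual-expert attention layer."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv_text = nn.Linear(dim, 3 * dim)   # frozen weights from the LM
        self.qkv_image = nn.Linear(dim, 3 * dim)  # trainable visual expert
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, image_mask: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); image_mask: (batch, seq), True for image tokens.
        # Compute both projections and select per token (wasteful but clear).
        qkv = torch.where(image_mask[..., None], self.qkv_image(x), self.qkv_text(x))
        q, k, v = qkv.chunk(3, dim=-1)

        def heads(t):  # (batch, seq, dim) -> (batch, heads, seq, head_dim)
            b, s, _ = t.shape
            return t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)

        o = F.scaled_dot_product_attention(heads(q), heads(k), heads(v))
        b, h, s, d = o.shape
        return self.out(o.transpose(1, 2).reshape(b, s, h * d))

# Tiny smoke test: the first 4 of 10 tokens are "image" tokens.
layer = VisualExpertAttention(dim=64, num_heads=4)
x = torch.randn(1, 10, 64)
mask = torch.zeros(1, 10, dtype=torch.bool)
mask[:, :4] = True
print(layer(x, mask).shape)  # torch.Size([1, 10, 64])
```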
Tsinghua KEG
RT @_akhaliq: AgentTuning: Enabling Generalized Agent Abilities for LLMs

paper page: Paper page - Ag...

Open large language models (…
Paper page - AgentTuning: Enabling Generalized Agent Abilities for LLMs
Tsinghua KEG
RT @sleepychord: @iamrobotbear @thukeg @WilliamLamkin @_akhaliq @osanseviero @huggingface License updated, in short, one can use freely whi…