Integrating GLM-4-9B-Chat-1M with a Vue3 Frontend: Building an Intelligent Chat Interface
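Want to add an intelligent chat assistant to your Vue3 project? One that can handle very long documents, hold multi-turn conversations, and even help you write code? Today we'll look at how to integrate the GLM-4-9B-Chat-1M large model into a Vue3 frontend.
GLM-4-9B-Chat-1M is an open-source large model released by Zhipu AI. Its headline feature is a context length of 1 million tokens, roughly enough to hold the contents of a thick book. That means you can hold long conversations with it, upload large documents for analysis, or hand it complex tasks.
In this article I'll walk you through the whole integration step by step, from API design to the frontend implementation to performance optimization, so you can quickly put together a fully functional chat interface. Even if you haven't done much large-model integration before, you can get it working by following the steps.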
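1. Project Setup and Environment
Before writing any code, we need the basic environment in place. It isn't complicated: mostly installing a few dependencies.
1.1 Create the Vue3 Project
If you don't have a Vue3 project yet, you can scaffold one with create-vue, the official Vite-based scaffolding tool. Open a terminal and run: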
npm create vue@latest glm-chat-frontend
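During creation you can pick whatever options suit your needs. I recommend selecting at least:
- TypeScript (type checking makes the code more reliable)
- Vue Router (makes it easy to add pages later)
- Pinia (state management; handy for chat history, user settings, and similar data)
When it finishes, enter the project directory and install the dependencies: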
cd glm-chat-frontend
npm install
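1.2 Install the Required Dependencies
Our chat interface needs a few extra libraries:
npm install axios # HTTP requests
npm install markdown-it # renders Markdown-formatted replies
npm install highlight.js # code highlighting
npm install @vueuse/core # Vue composition utilities with many handy functions
If you want a nicer-looking interface, you can install a UI component library. I'll use Element Plus as the example: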
npm install element-plus
npm install @element-plus/icons-vue
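Then register Element Plus in main.ts, alongside Pinia and the router (the store and the route defined later in this article need both):
import { createApp } from 'vue'
import { createPinia } from 'pinia'
import ElementPlus from 'element-plus'
import 'element-plus/dist/index.css'
import App from './App.vue'
import router from './router'
const app = createApp(App)
app.use(createPinia())
app.use(router)
app.use(ElementPlus)
app.mount('#app')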
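1.3 Prepare the Backend API
The frontend needs a backend service that calls the GLM-4-9B-Chat-1M model. You have three options:
- Deploy your own backend: if you have a GPU server, you can host the model service yourself
- Use a hosted API: some platforms offer GLM models as an API service
- Test locally: during development you can simulate the backend with mock data
To keep the tutorial complete, I'll assume you already have a backend running at http://localhost:8000 that exposes these endpoints:
- POST /chat/completions - send messages and get a complete reply
- POST /chat/stream - stream the reply
- GET /models - list the available models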
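To make that contract a bit more concrete, here is a minimal sketch of what a request body for POST /chat/completions could look like and how a reply might be read. The field names follow the OpenAI-style layout that the interfaces later in this article assume; buildRequest and extractReply are hypothetical helper names, and your backend's actual schema may differ.

```typescript
// Hypothetical payload shapes for POST /chat/completions (assumed OpenAI-style).
interface Message {
  role: 'user' | 'assistant' | 'system'
  content: string
}

interface CompletionRequest {
  model: string
  messages: Message[]
  temperature?: number
  max_tokens?: number
  stream?: boolean
}

// Build a request body from the earlier turns plus a new user question.
function buildRequest(previous: Message[], question: string): CompletionRequest {
  return {
    model: 'glm-4-9b-chat-1m',
    messages: [...previous, { role: 'user', content: question }],
    temperature: 0.7,
    max_tokens: 2000,
    stream: false,
  }
}

// Read the assistant text from an assumed response shape; returns '' if it is missing.
function extractReply(response: { choices?: Array<{ message?: Message }> }): string {
  return response.choices?.[0]?.message?.content ?? ''
}

const body = buildRequest(
  [{ role: 'system', content: 'You are a helpful assistant.' }],
  'Hello'
)
console.log(JSON.stringify(body, null, 2))
```

Whatever the real schema turns out to be, it is worth writing it down like this once, so the frontend and the backend agree before any UI code depends on it.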
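If you don't have a backend yet, you can start with a simple mock service to test the frontend. First install its dependencies with npm install -D express cors, then create mock-server.cjs in the project root. The .cjs extension keeps require() working even when package.json sets "type": "module", as current create-vue templates do:
const express = require('express')
const cors = require('cors')

const app = express()
app.use(cors())
app.use(express.json())

// Mock model list, in the shape the frontend's getModels() expects
app.get('/models', (req, res) => {
  res.json({ models: ['glm-4-9b-chat-1m', 'glm-4-9b-chat'] })
})

// Mock streaming response (Server-Sent Events)
app.post('/chat/stream', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream')
  res.setHeader('Cache-Control', 'no-cache')
  res.setHeader('Connection', 'keep-alive')
  const messages = req.body.messages || []
  const message = messages.length ? messages[messages.length - 1].content : ''
  const response = `This is a mock reply to "${message}".`
  // Emit one character per event, in the chunk shape the frontend parser reads
  let i = 0
  const interval = setInterval(() => {
    if (i < response.length) {
      const chunk = { choices: [{ index: 0, delta: { content: response[i] }, finish_reason: null }] }
      res.write(`data: ${JSON.stringify(chunk)}\n\n`)
      i++
    } else {
      clearInterval(interval)
      res.write('data: [DONE]\n\n')
      res.end()
    }
  }, 50)
  // Stop the timer if the client disconnects early
  res.on('close', () => clearInterval(interval))
})

app.listen(8000, () => {
  console.log('Mock server running on http://localhost:8000')
})
Run the mock service with: node mock-server.cjs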
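2. API Design and Encapsulation
A well-encapsulated API layer keeps the frontend code clearer and easier to maintain. Let's design a module dedicated to chat requests.
2.1 Create the API Service Layer
Create a services folder under src, then create chatService.ts in it: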
import axios from 'axios'
// Base API configuration
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'
// Create the axios instance
const apiClient = axios.create({
baseURL: API_BASE_URL,
timeout: 30000, // 30-second timeout; long-text requests may need more
headers: {
'Content-Type': 'application/json',
},
})
// Chat message type
export interface ChatMessage {
role: 'user' | 'assistant' | 'system'
content: string
timestamp?: number
}
// Chat request parameters
export interface ChatRequest {
messages: ChatMessage[]
model?: string
temperature?: number
max_tokens?: number
stream?: boolean
}
// Chat response
export interface ChatResponse {
id: string
object: string
created: number
model: string
choices: Array<{
index: number
message: ChatMessage
finish_reason: string
}>
usage?: {
prompt_tokens: number
completion_tokens: number
total_tokens: number
}
}
// One chunk of a streaming response
export interface StreamChunk {
id: string
object: string
created: number
model: string
choices: Array<{
index: number
delta: {
content?: string
role?: string
}
finish_reason: string | null
}>
}
class ChatService {
// Regular chat (returns the complete reply in one response)
async chatCompletion(request: ChatRequest): Promise<ChatResponse> {
try {
const response = await apiClient.post<ChatResponse>('/chat/completions', {
...request,
stream: false,
})
return response.data
} catch (error) {
console.error('Chat completion error:', error)
throw error
}
}
// Streaming chat (the reply arrives piece by piece)
async chatStream(
request: ChatRequest,
onChunk: (chunk: string) => void,
onComplete: () => void,
onError: (error: Error) => void
) {
try {
const response = await fetch(`${API_BASE_URL}/chat/stream`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
...request,
stream: true,
}),
})
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`)
}
const reader = response.body?.getReader()
if (!reader) {
throw new Error('No reader available')
}
const decoder = new TextDecoder()
let buffer = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const lines = buffer.split('\n')
buffer = lines.pop() || ''
for (const line of lines) {
if (line.trim() === '') continue
if (line.startsWith('data: ')) {
const data = line.slice(6).trim() // trim a trailing '\r' in case the server uses CRLF line endings
if (data === '[DONE]') {
onComplete()
return
}
try {
const parsed = JSON.parse(data) as StreamChunk
const content = parsed.choices[0]?.delta?.content || ''
if (content) {
onChunk(content)
}
} catch (e) {
console.error('Parse error:', e)
}
}
}
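}
// The stream ended without a [DONE] marker: still signal completion so the UI is not left in the generating state
onComplete()
} catch (error) {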
onError(error as Error)
}
}
// Fetch the list of available models
async getModels(): Promise<string[]> {
try {
const response = await apiClient.get<{ models: string[] }>('/models')
return response.data.models
} catch (error) {
console.error('Get models error:', error)
// Fall back to a default model list
return ['glm-4-9b-chat-1m', 'glm-4-9b-chat']
}
}
}
export const chatService = new ChatService()
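The trickiest part of chatStream is the buffering: a network read can end in the middle of a line, so only complete lines may be parsed and the remainder has to be carried over into the next read. Here is that idea pulled out into a small pure function as a sketch. parseSseBuffer is a hypothetical name, and the chunk shape (choices[0].delta.content) is the same assumption the service above makes.

```typescript
// Incremental parser for the "data: ..." lines of a Server-Sent Events stream.
interface SseParseResult {
  contents: string[] // text deltas found in complete lines
  done: boolean // true once the [DONE] marker was seen
  rest: string // incomplete trailing line, to be prepended to the next read
}

function parseSseBuffer(buffer: string): SseParseResult {
  const lines = buffer.split('\n')
  // The last element is either '' or a partial line: keep it for the next call
  const rest = lines.pop() ?? ''
  const contents: string[] = []
  for (const rawLine of lines) {
    const line = rawLine.trim()
    if (!line.startsWith('data:')) continue
    const data = line.slice(5).trim()
    if (data === '[DONE]') return { contents, done: true, rest: '' }
    try {
      const parsed = JSON.parse(data)
      const content = parsed?.choices?.[0]?.delta?.content
      if (typeof content === 'string' && content) contents.push(content)
    } catch {
      // Ignore a malformed event instead of aborting the whole stream
    }
  }
  return { contents, done: false, rest }
}

// Simulate two network reads that cut the first event in half
const sseEvent = (text: string) =>
  `data: ${JSON.stringify({ choices: [{ delta: { content: text } }] })}\n\n`
const stream = sseEvent('Hel') + sseEvent('lo') + 'data: [DONE]\n\n'
const first = parseSseBuffer(stream.slice(0, 30))
const second = parseSseBuffer(first.rest + stream.slice(30))
console.log([...first.contents, ...second.contents].join(''), second.done)
```

Keeping the parser free of fetch and DOM APIs like this also makes it easy to unit test without a running server.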
2.2 Environment Variables
Create a .env.development file:
VITE_API_BASE_URL=http://localhost:8000
VITE_APP_TITLE=GLM Smart Chat
Create a .env.production file:
VITE_API_BASE_URL=https://your-api-server.com
VITE_APP_TITLE=GLM Smart Chat
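Vite loads these .env files automatically and exposes every variable prefixed with VITE_ on import.meta.env, so no extra configuration is needed for them. Just make sure vite.config.ts keeps the '@' alias, because the imports in this article rely on it:
import { fileURLToPath, URL } from 'node:url'
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
export default defineConfig({
plugins: [vue()],
resolve: {
alias: {
'@': fileURLToPath(new URL('./src', import.meta.url)),
},
},
})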
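3. Building the Core Chat Components
Now let's create the core components of the chat interface. I'll take you through building a fully functional chat UI step by step.
3.1 Create the Chat State Store
First we need a place to manage the chat state. Create chatStore.ts under src/stores: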
import { defineStore } from 'pinia'
import { ref, computed } from 'vue'
import type { ChatMessage } from '@/services/chatService'
export const useChatStore = defineStore('chat', () => {
// Chat message list
const messages = ref<ChatMessage[]>([
{
role: 'assistant',
content: 'Hello! I am an assistant based on GLM-4-9B-Chat-1M. I can handle up to 1 million tokens of context. How can I help you?',
timestamp: Date.now(),
},
])
// Current user input
const userInput = ref('')
// Whether a reply is being generated
const isGenerating = ref(false)
// Whether to use streaming output
const useStreaming = ref(true)
// Currently selected model
const selectedModel = ref('glm-4-9b-chat-1m')
// Add a message
const addMessage = (message: ChatMessage) => {
messages.value.push({
...message,
timestamp: message.timestamp || Date.now(),
})
}
// Append to the last message (used for streaming output)
const updateLastMessage = (content: string) => {
if (messages.value.length > 0) {
const lastMessage = messages.value[messages.value.length - 1]
if (lastMessage.role === 'assistant') {
lastMessage.content += content
}
}
}
// Clear the chat history
const clearMessages = () => {
messages.value = [
{
role: 'assistant',
content: 'Chat history cleared. How can I help you?',
timestamp: Date.now(),
},
]
}
// Token usage (a simplified estimate, not the model's real tokenizer)
const estimatedTokens = computed(() => {
const text = messages.value.map(m => m.content).join(' ')
// Rough estimate: a Chinese character counts as 1 token, any other character as 0.25
const chineseChars = (text.match(/[\u4e00-\u9fa5]/g) || []).length
const otherChars = text.length - chineseChars
return Math.ceil(chineseChars + otherChars * 0.25)
})
// Whether the conversation is approaching the context limit (1 million tokens)
const isNearContextLimit = computed(() => {
return estimatedTokens.value > 900000
})
return {
messages,
userInput,
isGenerating,
useStreaming,
selectedModel,
addMessage,
updateLastMessage,
clearMessages,
estimatedTokens,
isNearContextLimit,
}
})
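To get a feel for the estimate the store uses, here is the same heuristic as a standalone sketch with a couple of worked inputs. Keep in mind that it is only a rule of thumb and not the model's tokenizer; contextStatus and the two constants are hypothetical names that mirror the 900,000 warning threshold used above.

```typescript
// Same heuristic as the store: a CJK character counts as 1 token, anything else as 0.25.
function estimateTokens(text: string): number {
  const cjk = (text.match(/[\u4e00-\u9fa5]/g) || []).length
  return Math.ceil(cjk + (text.length - cjk) * 0.25)
}

const CONTEXT_LIMIT = 1000000 // advertised context length of GLM-4-9B-Chat-1M
const WARN_THRESHOLD = 900000 // start warning at 90% of the limit

// Summarize how much of the context window a conversation is estimated to use.
function contextStatus(contents: string[]): { tokens: number; remaining: number; nearLimit: boolean } {
  const tokens = estimateTokens(contents.join(' '))
  return { tokens, remaining: CONTEXT_LIMIT - tokens, nearLimit: tokens > WARN_THRESHOLD }
}

// '你好 world': 2 CJK characters + 6 other characters * 0.25 = 3.5, rounded up to 4
console.log(estimateTokens('你好 world')) // 4
// 'hello': 5 * 0.25 = 1.25, rounded up to 2
console.log(estimateTokens('hello')) // 2
console.log(contextStatus(['你好', 'world']))
```

If the backend returns a usage field (as the ChatResponse interface allows), the real prompt_tokens value is a better basis for the warning than this estimate.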
3.2 Create the Chat Interface Component
Create ChatInterface.vue under src/components:
<template>
<div class="chat-container">
<!-- Top toolbar -->
<div class="chat-header">
<div class="header-left">
<h2>GLM Smart Chat</h2>
<span class="model-badge">{{ selectedModel }}</span>
</div>
<div class="header-right">
<el-tooltip content="Clear chat history">
<el-button @click="clearChat" :disabled="isGenerating" circle>
<el-icon><Delete /></el-icon>
</el-button>
</el-tooltip>
<el-switch
v-model="useStreaming"
active-text="Streaming"
inactive-text="Full reply"
:disabled="isGenerating"
/>
<el-select
v-model="selectedModel"
placeholder="Select a model"
size="small"
style="width: 180px; margin-left: 10px;"
:disabled="isGenerating"
>
<el-option
v-for="model in availableModels"
:key="model"
:label="model"
:value="model"
/>
</el-select>
</div>
</div>
<!-- Chat message area -->
<div ref="messagesContainer" class="messages-container">
<div
v-for="(message, index) in messages"
:key="index"
:class="['message-bubble', message.role]"
>
<div class="message-avatar">
<el-avatar :size="32">
<span v-if="message.role === 'user'">👤</span>
<span v-else>🤖</span>
</el-avatar>
</div>
<div class="message-content">
<div class="message-header">
<span class="message-role">{{ message.role === 'user' ? 'You' : 'AI Assistant' }}</span>
<span class="message-time">{{ formatTime(message.timestamp) }}</span>
</div>
<div class="message-body" v-html="renderMessage(message.content)"></div>
</div>
</div>
<!-- Generating indicator -->
<div v-if="isGenerating && !useStreaming" class="generating-indicator">
<el-icon class="is-loading"><Loading /></el-icon>
<span>Thinking...</span>
</div>
</div>
<!-- Input area -->
<div class="input-container">
<div class="input-tools">
<el-tooltip content="Upload a file (long text supported)">
<el-button @click="handleFileUpload" :disabled="isGenerating" circle>
<el-icon><Upload /></el-icon>
</el-button>
</el-tooltip>
<el-tooltip content="Insert an example question">
<el-button @click="insertExample" :disabled="isGenerating" circle>
<el-icon><MagicStick /></el-icon>
</el-button>
</el-tooltip>
</div>
<div class="input-area">
<el-input
v-model="userInput"
type="textarea"
:rows="3"
placeholder="Type your question... (long text supported, up to 1 million tokens)"
:disabled="isGenerating"
@keydown.enter.exact.prevent="handleSend"
resize="none"
/>
<div class="input-actions">
<div class="token-info">
<span v-if="estimatedTokens > 0">
Estimated tokens: {{ estimatedTokens.toLocaleString() }}
<el-tooltip v-if="isNearContextLimit" content="Approaching the context limit; consider clearing the history">
<el-icon color="#e6a23c"><Warning /></el-icon>
</el-tooltip>
</span>
</div>
<el-button
type="primary"
@click="handleSend"
:loading="isGenerating"
:disabled="!userInput.trim()"
>
Send
</el-button>
</div>
</div>
</div>
</div>
</template>
<script setup lang="ts">
import { ref, computed, onMounted, nextTick, watch } from 'vue'
import { useChatStore } from '@/stores/chatStore'
import { chatService, type ChatMessage } from '@/services/chatService'
import MarkdownIt from 'markdown-it'
import hljs from 'highlight.js'
import 'highlight.js/styles/github.css'
import {
Delete,
Upload,
MagicStick,
Loading,
Warning,
} from '@element-plus/icons-vue'
import { ElMessage } from 'element-plus'
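// Initialize the Markdown renderer
const md = new MarkdownIt({
html: false, // do not render raw HTML from model output, since the result is injected with v-html
linkify: true,
typographer: true,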
highlight: function (str, lang) {
if (lang && hljs.getLanguage(lang)) {
try {
return hljs.highlight(str, { language: lang }).value
} catch (__) {}
}
return ''
}
})
// Use the store
const chatStore = useChatStore()
const messages = computed(() => chatStore.messages)
const userInput = computed({
get: () => chatStore.userInput,
set: (value) => { chatStore.userInput = value }
})
const isGenerating = computed(() => chatStore.isGenerating)
const useStreaming = computed({
get: () => chatStore.useStreaming,
set: (value) => { chatStore.useStreaming = value }
})
const selectedModel = computed({
get: () => chatStore.selectedModel,
set: (value) => { chatStore.selectedModel = value }
})
const estimatedTokens = computed(() => chatStore.estimatedTokens)
const isNearContextLimit = computed(() => chatStore.isNearContextLimit)
// List of available models
const availableModels = ref<string[]>([])
// Reference to the message container (used for auto-scrolling)
const messagesContainer = ref<HTMLElement>()
// Initialization
onMounted(async () => {
await loadModels()
scrollToBottom()
})
// Load the available models
const loadModels = async () => {
try {
availableModels.value = await chatService.getModels()
} catch (error) {
console.error('Failed to load models:', error)
availableModels.value = ['glm-4-9b-chat-1m', 'glm-4-9b-chat']
}
}
// Render a message as Markdown
const renderMessage = (content: string) => {
return md.render(content)
}
// Format a timestamp as HH:mm
const formatTime = (timestamp?: number) => {
if (!timestamp) return ''
const date = new Date(timestamp)
return `${date.getHours().toString().padStart(2, '0')}:${date.getMinutes().toString().padStart(2, '0')}`
}
// Scroll to the bottom
const scrollToBottom = () => {
nextTick(() => {
if (messagesContainer.value) {
messagesContainer.value.scrollTop = messagesContainer.value.scrollHeight
}
})
}
// Watch for message changes and scroll automatically
watch(messages, () => {
scrollToBottom()
}, { deep: true })
// Send a message
const handleSend = async () => {
const input = userInput.value.trim()
if (!input || isGenerating.value) return
// Add the user message
const userMessage: ChatMessage = {
role: 'user',
content: input,
}
chatStore.addMessage(userMessage)
// Clear the input box
userInput.value = ''
// Add an empty assistant message (a placeholder for streaming output)
if (useStreaming.value) {
chatStore.addMessage({
role: 'assistant',
content: '',
})
}
// Mark generation as in progress
chatStore.isGenerating = true
try {
if (useStreaming.value) {
// Streaming output
await chatService.chatStream(
{
messages: [...messages.value.slice(0, -1)], // exclude the trailing empty assistant message
model: selectedModel.value,
temperature: 0.7,
max_tokens: 2000,
stream: true,
},
(chunk) => {
// Append the chunk to the last message
chatStore.updateLastMessage(chunk)
},
() => {
// Generation finished
chatStore.isGenerating = false
},
(error) => {
// Error handling
console.error('Stream error:', error)
chatStore.updateLastMessage('\n\n[Generation interrupted, please try again]')
chatStore.isGenerating = false
ElMessage.error('An error occurred during generation')
}
)
} else {
// Full (non-streaming) output
const response = await chatService.chatCompletion({
messages: messages.value.filter(m => m.role !== 'assistant' || m.content), // exclude empty assistant messages
model: selectedModel.value,
temperature: 0.7,
max_tokens: 2000,
stream: false,
})
// Add the assistant reply
const assistantMessage = response.choices[0].message
chatStore.addMessage(assistantMessage)
chatStore.isGenerating = false
}
} catch (error) {
console.error('Chat error:', error)
chatStore.isGenerating = false
ElMessage.error('Failed to send. Please check your network connection or the API service')
// Remove the empty assistant message (streaming mode only)
if (useStreaming.value && messages.value[messages.value.length - 1].content === '') {
chatStore.messages.pop()
}
}
}
// Clear the chat
const clearChat = () => {
if (isGenerating.value) return
chatStore.clearMessages()
ElMessage.success('Chat history cleared')
}
// Handle file upload
const handleFileUpload = () => {
// The file upload logic can be implemented here
ElMessage.info('File upload is not implemented yet')
}
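// Insert an example question
const insertExample = () => {
const examples = [
'Write a Vue3 composable that handles debounced search',
'Implement quicksort in Python with detailed comments',
'Explain what the attention mechanism is, using a simple example',
'Summarize the main character relationships in Dream of the Red Chamber',
'Write a short essay of about 300 words on the future of artificial intelligence',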
]
const randomExample = examples[Math.floor(Math.random() * examples.length)]
userInput.value = randomExample
}
</script>
<style scoped>
.chat-container {
display: flex;
flex-direction: column;
height: 100vh;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
}
.chat-header {
display: flex;
justify-content: space-between;
align-items: center;
padding: 16px 24px;
background: rgba(255, 255, 255, 0.95);
box-shadow: 0 2px 12px rgba(0, 0, 0, 0.1);
z-index: 10;
}
.header-left {
display: flex;
align-items: center;
gap: 12px;
}
.header-left h2 {
margin: 0;
color: #333;
font-size: 20px;
}
.model-badge {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 4px 12px;
border-radius: 16px;
font-size: 12px;
font-weight: 500;
}
.header-right {
display: flex;
align-items: center;
gap: 12px;
}
.messages-container {
flex: 1;
overflow-y: auto;
padding: 24px;
background: rgba(249, 250, 251, 0.95);
}
.message-bubble {
display: flex;
margin-bottom: 24px;
animation: fadeIn 0.3s ease;
}
@keyframes fadeIn {
from {
opacity: 0;
transform: translateY(10px);
}
to {
opacity: 1;
transform: translateY(0);
}
}
.message-bubble.user {
flex-direction: row-reverse;
}
.message-avatar {
flex-shrink: 0;
margin: 0 12px;
}
.message-content {
max-width: 70%;
background: white;
border-radius: 12px;
padding: 16px;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.message-bubble.user .message-content {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
}
.message-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 8px;
font-size: 12px;
opacity: 0.8;
}
.message-role {
font-weight: 600;
}
.message-time {
font-size: 11px;
}
.message-body {
line-height: 1.6;
}
.message-body :deep(pre) {
background: #f6f8fa;
border-radius: 6px;
padding: 12px;
overflow-x: auto;
margin: 8px 0;
}
.message-body :deep(code) {
background: #f6f8fa;
padding: 2px 4px;
border-radius: 4px;
font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
}
.message-bubble.user .message-body :deep(pre),
.message-bubble.user .message-body :deep(code) {
background: rgba(255, 255, 255, 0.1);
color: white;
}
.generating-indicator {
display: flex;
align-items: center;
justify-content: center;
gap: 8px;
padding: 16px;
color: #666;
font-size: 14px;
}
.input-container {
padding: 16px 24px;
background: rgba(255, 255, 255, 0.95);
border-top: 1px solid #e5e7eb;
}
.input-tools {
display: flex;
gap: 8px;
margin-bottom: 12px;
}
.input-area {
position: relative;
}
.input-actions {
display: flex;
justify-content: space-between;
align-items: center;
margin-top: 12px;
}
.token-info {
font-size: 12px;
color: #666;
display: flex;
align-items: center;
gap: 4px;
}
:deep(.el-textarea__inner) {
border-radius: 12px;
border: 1px solid #e5e7eb;
padding: 12px 16px;
font-size: 14px;
line-height: 1.6;
resize: none;
}
:deep(.el-textarea__inner:focus) {
border-color: #667eea;
box-shadow: 0 0 0 2px rgba(102, 126, 234, 0.1);
}
</style>
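One detail in handleSend is worth a second look: the component sends the store's messages to the backend as they are, including the UI-only timestamp field and, potentially, empty placeholder messages. A backend that validates its input strictly may reject fields it doesn't know. A minimal sketch of a sanitizing step, assuming the API only wants role and content (toApiMessages is a hypothetical helper, not part of the code above):

```typescript
// Shape kept in the store (with UI-only fields) versus the shape sent to the API.
interface StoredMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp?: number
}

interface ApiMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
}

// Drop empty placeholders and strip everything except role and content.
function toApiMessages(messages: StoredMessage[]): ApiMessage[] {
  return messages
    .filter((m) => m.content.trim() !== '')
    .map(({ role, content }) => ({ role, content }))
}

const stored: StoredMessage[] = [
  { role: 'assistant', content: 'Hello!', timestamp: 1700000000000 },
  { role: 'user', content: 'Summarize this file', timestamp: 1700000001000 },
  { role: 'assistant', content: '' }, // streaming placeholder
]
console.log(toApiMessages(stored))
```

With a helper like this, both branches of handleSend could pass toApiMessages(messages.value) instead of slicing or filtering by hand.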
3.3 Create the Main Page
Create ChatView.vue under src/views:
<template>
<div class="chat-view">
<ChatInterface />
</div>
</template>
<script setup lang="ts">
import ChatInterface from '@/components/ChatInterface.vue'
</script>
<style scoped>
.chat-view {
height: 100vh;
overflow: hidden;
}
</style>
Update the router configuration in src/router/index.ts:
import { createRouter, createWebHistory } from 'vue-router'
import ChatView from '@/views/ChatView.vue'
const router = createRouter({
history: createWebHistory(import.meta.env.BASE_URL),
routes: [
{
path: '/',
name: 'chat',
component: ChatView,
},
],
})
export default router
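4. Advanced Features and Optimization
With the basics working, we can add some advanced features to improve the user experience.
4.1 Long-Text Handling
GLM-4-9B-Chat-1M supports 1 million tokens, but in practice we still have to think about performance. Create src/utils/textProcessor.ts: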
/**
 * Long-text processing utilities
 */

// Rough token estimate: a Chinese character counts as 1 token, any other character as 0.25
export function estimateTokens(text: string): number {
  const chineseChars = (text.match(/[\u4e00-\u9fa5]/g) || []).length
  const otherChars = text.length - chineseChars
  return Math.ceil(chineseChars + otherChars * 0.25)
}

// Split long text into chunks of roughly maxTokens estimated tokens each
export function splitLongText(
  text: string,
  maxTokens: number = 32000, // maximum estimated tokens per chunk
  overlapTokens: number = 500 // tokens repeated between neighbouring chunks to keep context coherent
): string[] {
  const chunks: string[] = []
  let currentChunk = ''
  let currentTokens = 0
  let hasNewContent = false // false while the chunk holds nothing but overlap text

  // Append one piece, closing the current chunk first if the piece would not fit.
  // A single piece larger than maxTokens still ends up in one oversized chunk.
  const append = (piece: string) => {
    const pieceTokens = estimateTokens(piece)
    if (hasNewContent && currentTokens + pieceTokens > maxTokens) {
      chunks.push(currentChunk)
      // Start the next chunk with the tail of the previous one
      currentChunk = getOverlapText(currentChunk, overlapTokens)
      currentTokens = estimateTokens(currentChunk)
    }
    currentChunk += piece
    currentTokens += pieceTokens
    hasNewContent = true
  }

  // Split into paragraphs on blank lines
  for (const paragraph of text.split(/\n\s*\n/)) {
    if (estimateTokens(paragraph) > maxTokens) {
      // An oversized paragraph is split further into sentences, keeping the original punctuation
      const sentences = paragraph.match(/[^。!?.!?]*[。!?.!?]+|[^。!?.!?]+/g) || [paragraph]
      sentences.forEach(append)
      append('\n\n')
    } else {
      append(paragraph + '\n\n')
    }
  }

  // Add the final chunk
  if (hasNewContent && currentChunk.trim()) {
    chunks.push(currentChunk)
  }
  return chunks
}

// Take the tail of a chunk worth roughly overlapTokens estimated tokens
function getOverlapText(text: string, overlapTokens: number): string {
  let tokens = 0
  let start = text.length
  // Walk backwards, using the same per-character weights as estimateTokens
  while (start > 0 && tokens < overlapTokens) {
    start--
    tokens += /[\u4e00-\u9fa5]/.test(text[start]) ? 1 : 0.25
  }
  return text.slice(start)
}
// Read an uploaded file as text; only plain-text formats are read directly
export async function processUploadedFile(file: File): Promise<string> {
  if (file.type === 'application/pdf') {
    // PDF files need a third-party parsing library
    return `[PDF file: ${file.name}]\nThe content must be extracted with a PDF parsing library`
  }
  if (file.type.includes('text') || file.name.endsWith('.txt') || file.name.endsWith('.md')) {
    // Plain-text files are read directly (Blob.text() decodes as UTF-8)
    return file.text()
  }
  if (file.type.includes('word') || file.name.endsWith('.docx')) {
    // Word files need a DOCX parsing library
    return `[Word file: ${file.name}]\nThe content must be extracted with a DOCX parsing library`
  }
  return `[File: ${file.name}]\nThis file type cannot be read directly yet`
}
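If the purpose of the overlap isn't obvious yet, a stripped-down version makes it easier to see. The sketch below is not the paragraph-aware splitter above: it is a simplified, character-based sliding window (splitWithOverlap is a hypothetical name) that only shows how neighbouring chunks share their boundary text.

```typescript
// Simplified character-based splitter: fixed-size windows that overlap by `overlap` characters.
function splitWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('size must be positive and overlap must be in [0, size)')
  }
  const chunks: string[] = []
  const step = size - overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size))
    // Stop once the current window has reached the end of the text
    if (start + size >= text.length) break
  }
  return chunks
}

const parts = splitWithOverlap('abcdefghij', 4, 1)
console.log(parts) // [ 'abcd', 'defg', 'ghij' ]
```

Each chunk starts with the last character of the previous one ('d', then 'g'). With real documents the overlap would be a few sentences, so that a statement cut at a chunk boundary still appears in full in at least one chunk.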
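4.2 Chat History Management
Next, add saving and loading of the chat history. Update chatStore.ts:
// Add the following inside the store's setup function, on top of the existing code
// Save the chat history to local storage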
const saveChatHistory = () => {
try {
const history = {
messages: messages.value,
timestamp: Date.now(),
model: selectedModel.value,
}
localStorage.setItem('glm_chat_history', JSON.stringify(history))
} catch (error) {
console.error('Failed to save chat history:', error)
}
}
// Load the chat history from local storage
const loadChatHistory = () => {
try {
const saved = localStorage.getItem('glm_chat_history')
if (saved) {
const history = JSON.parse(saved)
messages.value = history.messages
selectedModel.value = history.model || 'glm-4-9b-chat-1m'
}
} catch (error) {
console.error('Failed to load chat history:', error)
}
}
// Export the chat history
const exportChatHistory = () => {
try {
const data = {
messages: messages.value,
metadata: {
exportedAt: new Date().toISOString(),
model: selectedModel.value,
totalMessages: messages.value.length,
estimatedTokens: estimatedTokens.value,
},
}
const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' })
const url = URL.createObjectURL(blob)
const a = document.createElement('a')
a.href = url
a.download = `glm-chat-history-${Date.now()}.json`
document.body.appendChild(a)
a.click()
document.body.removeChild(a)
URL.revokeObjectURL(url)
return true
} catch (error) {
console.error('Failed to export chat history:', error)
return false
}
}
// Import a chat history file
const importChatHistory = (file: File): Promise<boolean> => {
return new Promise((resolve, reject) => {
const reader = new FileReader()
reader.onload = (e) => {
try {
const content = e.target?.result as string
const data = JSON.parse(content)
if (data.messages && Array.isArray(data.messages)) {
messages.value = data.messages
selectedModel.value = data.metadata?.model || 'glm-4-9b-chat-1m'
saveChatHistory()
resolve(true)
} else {
reject(new Error('Invalid chat history file'))
}
} catch (error) {
reject(error)
}
}
reader.onerror = () => {
reject(new Error('Failed to read the file'))
}
reader.readAsText(file)
})
}
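// Load any saved history first, then save automatically whenever the messages change.
// Loading before watching, and without `immediate: true`, keeps the default greeting
// from overwriting the saved history on startup.
// (`watch` has to be added to the existing `import { ref, computed } from 'vue'`.)
loadChatHistory()
watch(messages, () => {
saveChatHistory()
}, { deep: true })
// Remember to add the new functions to the store's return statement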
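One caveat about this approach: localStorage quotas are small (often around 5 MB per origin, depending on the browser), and a conversation that really uses the long context can exceed that. When it does, setItem throws, and the try/catch in saveChatHistory only logs the error. One possible mitigation, sketched under the assumption that dropping the oldest messages is acceptable, is to trim the history before saving. trimToBudget and the character budget are hypothetical.

```typescript
interface Msg {
  role: string
  content: string
  timestamp?: number
}

// Drop the oldest messages until the serialized list fits within maxChars.
// The newest message is always kept, even if it alone exceeds the budget.
// Re-serializing on every step is quadratic, which is fine for a sketch but slow for huge histories.
function trimToBudget(messages: Msg[], maxChars: number): Msg[] {
  let start = 0
  while (start < messages.length - 1 && JSON.stringify(messages.slice(start)).length > maxChars) {
    start++
  }
  return messages.slice(start)
}

const sample: Msg[] = [
  { role: 'user', content: 'a'.repeat(100) },
  { role: 'user', content: 'b'.repeat(100) },
  { role: 'user', content: 'c'.repeat(100) },
]
// Each serialized message takes 128 characters, so only the two newest fit into 300
console.log(trimToBudget(sample, 300).length)
```

For histories that regularly grow past a few megabytes, IndexedDB is probably the better home than localStorage.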
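4.3 Performance Optimization Tips
In real use you may run into performance problems. Here are some suggestions:
- Virtual scrolling: if the chat history gets very long, consider virtual scrolling
- Lazy-loaded images: defer loading the images in messages
- Web Worker: move expensive work such as Markdown rendering into a Web Worker
- Request cancellation: cancel the previous request when the user sends a new message
- Local caching: cache frequently used reply templates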
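Of these suggestions, request cancellation is probably the quickest one to implement. The sketch below uses the standard AbortController API; LatestRequest is a hypothetical helper that guarantees at most one request is in flight.

```typescript
// Keeps at most one request in flight: starting a new one aborts the previous one.
class LatestRequest {
  private controller: AbortController | null = null

  // Returns the signal to pass to fetch(url, { signal })
  start(): AbortSignal {
    if (this.controller) this.controller.abort()
    this.controller = new AbortController()
    return this.controller.signal
  }

  // Explicit cancellation, e.g. from a "Stop generating" button
  cancel(): void {
    if (this.controller) this.controller.abort()
    this.controller = null
  }
}

const requests = new LatestRequest()
const firstSignal = requests.start()
const secondSignal = requests.start() // aborts the first request
console.log(firstSignal.aborted, secondSignal.aborted)
```

To wire it in, chatStream would pass the signal in its fetch options. An aborted fetch rejects with an error whose name is 'AbortError', and the error handler should treat that as a cancellation rather than show a failure message.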
Create a performance utility file, src/utils/performance.ts:
/**
* Performance utilities
*/
// Debounce
export function debounce<T extends (...args: any[]) => any>(
func: T,
wait: number
): (...args: Parameters<T>) => void {
let timeout: ReturnType<typeof setTimeout> | null = null // works in the browser without Node type definitions
return (...args: Parameters<T>) => {
if (timeout) clearTimeout(timeout)
timeout = setTimeout(() => func(...args), wait)
}
}
// Throttle
export function throttle<T extends (...args: any[]) => any>(
func: T,
limit: number
): (...args: Parameters<T>) => void {
let inThrottle: boolean = false
return (...args: Parameters<T>) => {
if (!inThrottle) {
func(...args)
inThrottle = true
setTimeout(() => (inThrottle = false), limit)
}
}
}
// Measure the execution time of a (synchronous) function
export function measurePerformance<T extends (...args: any[]) => any>(
func: T,
label: string = 'Function'
): (...args: Parameters<T>) => ReturnType<T> {
return (...args: Parameters<T>) => {
const start = performance.now()
const result = func(...args)
const end = performance.now()
console.log(`${label} took ${(end - start).toFixed(2)}ms`)
return result
}
}
// Batch updates into the next animation frame
export function batchUpdate(callback: () => void) {
if (typeof requestAnimationFrame !== 'undefined') {
requestAnimationFrame(callback)
} else {
setTimeout(callback, 0)
}
}
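// Memory usage monitor (performance.memory is a non-standard API available only in Chromium-based browsers)
export class MemoryMonitor {
private static instance: MemoryMonitor
private updateInterval: ReturnType<typeof setInterval> | null = null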
private constructor() {}
static getInstance(): MemoryMonitor {
if (!MemoryMonitor.instance) {
MemoryMonitor.instance = new MemoryMonitor()
}
return MemoryMonitor.instance
}
startMonitoring(interval: number = 30000) {
if (this.updateInterval) {
clearInterval(this.updateInterval)
}
this.updateInterval = setInterval(() => {
if ('memory' in performance) {
const memory = (performance as any).memory
console.log('Memory usage:', {
usedJSHeapSize: `${(memory.usedJSHeapSize / 1024 / 1024).toFixed(2)} MB`,
totalJSHeapSize: `${(memory.totalJSHeapSize / 1024 / 1024).toFixed(2)} MB`,
jsHeapSizeLimit: `${(memory.jsHeapSizeLimit / 1024 / 1024).toFixed(2)} MB`,
})
}
}, interval)
}
stopMonitoring() {
if (this.updateInterval) {
clearInterval(this.updateInterval)
this.updateInterval = null
}
}
}
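A place where these utilities pay off is streaming: with one store update per received character, a long reply triggers a Markdown re-render for every single character. A small buffer that collects the pieces and hands them over in batches reduces that to a handful of updates. The sketch below is one way to do it; ChunkBuffer is a hypothetical class, and the flush is triggered manually here so the example runs anywhere.

```typescript
// Collects streamed text pieces and hands them over in one batch per flush.
class ChunkBuffer {
  private pending = ''
  private readonly onFlush: (text: string) => void

  constructor(onFlush: (text: string) => void) {
    this.onFlush = onFlush
  }

  push(piece: string): void {
    this.pending += piece
  }

  // Deliver everything collected so far; does nothing if the buffer is empty
  flush(): void {
    if (!this.pending) return
    const text = this.pending
    this.pending = ''
    this.onFlush(text)
  }
}

const updates: string[] = []
const buffer = new ChunkBuffer((text) => updates.push(text))
for (const ch of 'Hello') buffer.push(ch)
buffer.flush()
buffer.flush() // nothing pending, so no second update
console.log(updates) // [ 'Hello' ]
```

In the chat component, the onChunk callback would call buffer.push(chunk), a timer or the batchUpdate helper above would call buffer.flush() periodically, and onComplete would flush one last time so that no text is lost.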
5. Deployment and Production Configuration
5.1 Build Optimization
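Update vite.config.ts with build optimizations. The bundle visualizer and Terser are separate packages, so install them first with npm install -D rollup-plugin-visualizer terser:
import { fileURLToPath, URL } from 'node:url'
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
import { visualizer } from 'rollup-plugin-visualizer'
export default defineConfig({
plugins: [
vue(),
visualizer({
open: false,
gzipSize: true,
brotliSize: true,
}),
],
resolve: {
// Keep the '@' alias used by the imports in this article
alias: { '@': fileURLToPath(new URL('./src', import.meta.url)) },
},
build: {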
rollupOptions: {
output: {
manualChunks: {
'vendor': ['vue', 'vue-router', 'pinia'],
'ui-library': ['element-plus'],
'markdown': ['markdown-it', 'highlight.js'],
},
chunkFileNames: 'assets/js/[name]-[hash].js',
entryFileNames: 'assets/js/[name]-[hash].js',
assetFileNames: 'assets/[ext]/[name]-[hash].[ext]',
},
},
chunkSizeWarningLimit: 1000,
minify: 'terser',
terserOptions: {
compress: {
drop_console: true,
drop_debugger: true,
},
},
},
server: {
proxy: {
'/api': {
target: 'http://localhost:8000',
changeOrigin: true,
rewrite: (path) => path.replace(/^\/api/, ''),
},
},
},
})
5.2 Docker Deployment
Create a Dockerfile:
# Build stage
FROM node:18-alpine AS build-stage
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build tools (Vite etc.) live in devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM nginx:alpine AS production-stage
COPY --from=build-stage /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/nginx.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Create nginx.conf:
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Enable gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_types text/plain text/css text/xml text/javascript application/javascript application/xml+rss application/json;
server {
listen 80;
server_name localhost;
root /usr/share/nginx/html;
index index.html;
# Frontend routing: fall back to index.html
location / {
try_files $uri $uri/ /index.html;
}
# API proxy (if needed)
location /api/ {
proxy_pass http://backend:8000/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Disable buffering so that streamed (SSE) replies reach the browser immediately
proxy_buffering off;
}
# Cache static assets
location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
}
}
Create docker-compose.yml:
version: '3.8'
services:
  frontend:
    build: .
    ports:
      - "8080:80"
    depends_on:
      - backend
    networks:
      - glm-network
  backend:
    image: your-glm-backend-image
    ports:
      - "8000:8000"
    environment:
      - MODEL_NAME=glm-4-9b-chat-1m
    volumes:
      - ./models:/app/models
    networks:
      - glm-network
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
networks:
  glm-network:
    driver: bridge
5.3 Environment Variables
Update .env.production so that API requests go through the nginx /api proxy:
VITE_API_BASE_URL=/api
VITE_APP_TITLE=GLM Smart Chat Assistant
VITE_ENABLE_ANALYTICS=false
VITE_SENTRY_DSN=
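It helps to trace how a request travels with this setting. With VITE_API_BASE_URL=/api, the browser requests /api/chat/stream on the frontend's own origin; the nginx location /api/ block then forwards it to the backend as /chat/stream, because proxy_pass ends with a slash. Here is a small sketch of joining a base URL and a path without doubling or losing slashes (joinUrl is a hypothetical helper; the service above simply concatenates the two strings):

```typescript
// Join a base URL and a path with exactly one slash between them.
function joinUrl(base: string, path: string): string {
  return `${base.replace(/\/+$/, '')}/${path.replace(/^\/+/, '')}`
}

console.log(joinUrl('/api', '/chat/stream')) // /api/chat/stream
console.log(joinUrl('http://localhost:8000/', 'models')) // http://localhost:8000/models
```

The same helper covers both environments: an absolute URL during development and a relative /api prefix in production.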
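6. Summary
Having gone through the whole process, you should now have GLM-4-9B-Chat-1M integrated into your Vue3 project. From the initial API design, through building the chat interface, to the optimizations and deployment configuration, we've covered the key parts of a frontend integration.
In practice this kind of integration is quite flexible. Streaming output really does feel better than returning everything at once: users see the reply being generated in real time, which feels more natural. For long-text handling, we added chunking with overlap, but you'll still need to tune the parameters for your own scenario. Not every use case needs a 1-million-token context.
On performance, there's only so much the frontend can do; most of it depends on the inference speed of the backend model. Still, the measures discussed above (virtual scrolling, debouncing and throttling, memory monitoring) do help the user experience, and their effect becomes noticeable once the chat history grows long.
If you run into problems in real use, I suggest starting with a simple configuration, getting it to run, and then adding the more complex features step by step. GLM-4-9B-Chat-1M is very capable, but use it sensibly: don't feed it too much content at once, and test its long-text handling gradually.
There's plenty of room to extend this project, for example supporting more models, implementing file upload and parsing, or adding conversation templates. You can build it out according to your own needs. The most important thing is to keep the code structure clean, which makes later maintenance and extension much easier.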