Integrating GLM-4-9B-Chat-1M with a Vue 3 Front End: Building an Intelligent Chat Interface

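Want to add an intelligent chat assistant to your Vue 3 project, one that handles very long documents, supports multi-turn conversation, and can even help you write code? This article walks through integrating the GLM-4-9B-Chat-1M model into a Vue 3 front end.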

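GLM-4-9B-Chat-1M is an open-source large language model from Zhipu AI. Its headline feature is a context window of up to 1 million tokens, roughly the content of a thick book. That means you can hold very long conversations with it, upload large documents for analysis, or hand it complex tasks.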

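I'll take you through the whole integration step by step, from API design to the front-end implementation to performance tuning, so you can quickly put together a fully functional chat interface. Even if you haven't done much LLM integration before, you can get there by following the steps.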

1. Project Preparation and Environment Setup

Before writing any code, we need to get the basic environment ready. It isn't complicated: mostly installing a few dependencies.

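1.1 Create the Vue 3 Project

If you don't have a Vue 3 project yet, you can scaffold one quickly with create-vue, which is built on Vite. Open a terminal and run: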

npm create vue@latest glm-chat-frontend

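During setup you can pick options to suit your needs. I recommend selecting at least:

  • TypeScript (type checking makes the code more reliable)
  • Vue Router (makes it easy to add pages later)
  • Pinia (state management; handy for chat history, user settings, and similar data)

Once the project is created, enter its directory and install the dependencies: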

cd glm-chat-frontend
npm install

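1.2 Install the Required Dependencies

The chat interface needs a few extra libraries:

npm install axios  # HTTP requests
npm install markdown-it  # renders Markdown-formatted replies
npm install -D @types/markdown-it  # type declarations for markdown-it (needed with TypeScript)
npm install highlight.js  # code highlighting
npm install @vueuse/core  # Vue composition utilities with many handy functions

If you want a more polished UI, install a component library. I'll use Element Plus as the example: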

npm install element-plus
npm install @element-plus/icons-vue

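Then register the plugins in main.ts. Pinia and the router must be installed here as well, because the chat store and the routes used later depend on them:

import { createApp } from 'vue'
import { createPinia } from 'pinia'
import ElementPlus from 'element-plus'
import 'element-plus/dist/index.css'
import App from './App.vue'
import router from './router'

const app = createApp(App)
app.use(createPinia()) // required by the chat store
app.use(router)
app.use(ElementPlus)
app.mount('#app')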

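1.3 Prepare the Back-End API

The front end needs a back-end service that calls the GLM-4-9B-Chat-1M model. Your options:

  1. Deploy your own back end: if you have a GPU server, you can host the model service yourself
  2. Use a hosted API: some platforms offer GLM models as an API service
  3. Test locally: during development you can simulate responses with mock data

To keep the tutorial complete, I'll assume you already have a back-end service running at http://localhost:8000 that exposes the following endpoints:

  • POST /chat/completions - send a message and get the reply
  • POST /chat/stream - stream the reply
  • GET /models - list the available models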

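If you don't have a back end yet, you can create a simple mock service to test the front end. It only implements the streaming endpoint, which is enough to exercise the streaming UI. Install its dependencies with npm install -D express cors, then create mock-server.cjs in the project root. The .cjs extension keeps require working if your package.json declares "type": "module", as recent create-vue scaffolds do:

const express = require('express')
const cors = require('cors')
const app = express()

app.use(cors())
app.use(express.json())

// Simulated streaming response
app.post('/chat/stream', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream')
  res.setHeader('Cache-Control', 'no-cache')
  res.setHeader('Connection', 'keep-alive')
  
  const messages = req.body.messages || []
  const message = messages.length ? messages[messages.length - 1].content : ''
  const response = `Mock reply to "${message}".`
  
  // Emit one character at a time, in the chunk shape the client parses
  // (choices[0].delta.content)
  let i = 0
  const interval = setInterval(() => {
    if (i < response.length) {
      const chunk = { choices: [{ index: 0, delta: { content: response[i] }, finish_reason: null }] }
      res.write(`data: ${JSON.stringify(chunk)}\n\n`)
      i++
    } else {
      clearInterval(interval)
      res.write('data: [DONE]\n\n')
      res.end()
    }
  }, 50)

  // Stop the timer if the client disconnects early
  res.on('close', () => clearInterval(interval))
})

app.listen(8000, () => {
  console.log('Mock server running on http://localhost:8000')
})

Run the mock service: node mock-server.cjs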
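The endpoint list also includes a non-streaming POST /chat/completions. As a rough sketch of what a mock for it could return, the helper below assembles a body in the shape of the ChatResponse interface defined in section 2.1. `buildMockCompletion` is a hypothetical name, and the `id` and `object` values are placeholders rather than output from a real GLM back end.

```typescript
// Hypothetical helper: builds a non-streaming reply in the ChatResponse shape.
interface MockMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
}

interface MockCompletion {
  id: string
  object: string
  created: number
  model: string
  choices: Array<{ index: number; message: MockMessage; finish_reason: string }>
}

function buildMockCompletion(
  messages: MockMessage[],
  model: string = 'glm-4-9b-chat-1m'
): MockCompletion {
  // Echo the last message so it is easy to see which request was answered
  const last = messages[messages.length - 1]
  const text = last ? last.content : ''
  return {
    id: `mock-${Date.now()}`, // placeholder id
    object: 'chat.completion', // placeholder value
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        message: { role: 'assistant', content: `Mock reply to "${text}".` },
        finish_reason: 'stop',
      },
    ],
  }
}
```

In an Express mock, the returned object could be passed straight to res.json(...) inside a POST /chat/completions handler.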

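2. API Design and Wrapping

A well-wrapped API keeps the front-end code clearer and easier to maintain. Let's design a module dedicated to chat requests.

2.1 Create the API Service Layer

Create a services folder under src, then create chatService.ts inside it: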

import axios from 'axios'

// Base API configuration
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'

// Create the axios instance
const apiClient = axios.create({
  baseURL: API_BASE_URL,
  timeout: 30000, // 30-second timeout; long-text requests may need more
  headers: {
    'Content-Type': 'application/json',
  },
})

// Message type
export interface ChatMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp?: number
}

// Chat request parameters
export interface ChatRequest {
  messages: ChatMessage[]
  model?: string
  temperature?: number
  max_tokens?: number
  stream?: boolean
}

// Chat response
export interface ChatResponse {
  id: string
  object: string
  created: number
  model: string
  choices: Array<{
    index: number
    message: ChatMessage
    finish_reason: string
  }>
  usage?: {
    prompt_tokens: number
    completion_tokens: number
    total_tokens: number
  }
}

// One chunk of a streamed response
export interface StreamChunk {
  id: string
  object: string
  created: number
  model: string
  choices: Array<{
    index: number
    delta: {
      content?: string
      role?: string
    }
    finish_reason: string | null
  }>
}

class ChatService {
  // Regular chat (returns the complete reply at once)
  async chatCompletion(request: ChatRequest): Promise<ChatResponse> {
    try {
      const response = await apiClient.post<ChatResponse>('/chat/completions', {
        ...request,
        stream: false,
      })
      return response.data
    } catch (error) {
      console.error('Chat completion error:', error)
      throw error
    }
  }

  // Streaming chat (returns the reply piece by piece)
  async chatStream(
    request: ChatRequest,
    onChunk: (chunk: string) => void,
    onComplete: () => void,
    onError: (error: Error) => void
  ) {
    try {
      const response = await fetch(`${API_BASE_URL}/chat/stream`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          ...request,
          stream: true,
        }),
      })

      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`)
      }

      const reader = response.body?.getReader()
      if (!reader) {
        throw new Error('No reader available')
      }

      const decoder = new TextDecoder()
      let buffer = ''

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        buffer += decoder.decode(value, { stream: true })
        const lines = buffer.split('\n')
        buffer = lines.pop() || ''

        for (const line of lines) {
          if (line.trim() === '') continue
          if (line.startsWith('data: ')) {
            const data = line.slice(6).trim()
            if (data === '[DONE]') {
              onComplete()
              return
            }
            try {
              const parsed = JSON.parse(data) as StreamChunk
              // Optional chaining guards against chunks without a choices array
              const content = parsed.choices?.[0]?.delta?.content || ''
              if (content) {
                onChunk(content)
              }
            } catch (e) {
              console.error('Parse error:', e)
            }
          }
        }
      }

      // The stream ended without a [DONE] marker; still signal completion
      // so the caller can reset its loading state
      onComplete()
    } catch (error) {
      onError(error as Error)
    }
  }

  // Fetch the list of available models
  async getModels(): Promise<string[]> {
    try {
      const response = await apiClient.get<{ models: string[] }>('/models')
      return response.data.models
    } catch (error) {
      console.error('Get models error:', error)
      // Fall back to a default model list
      return ['glm-4-9b-chat-1m', 'glm-4-9b-chat']
    }
  }
}

export const chatService = new ChatService()
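The buffering step inside chatStream is the part that is easiest to get wrong: a network chunk can end in the middle of a line, so the incomplete tail has to be carried over to the next read. The sketch below isolates that idea as a pure function. `parseSseBuffer` and `SseParseResult` are illustrative names, not part of any library, and the sketch handles only `data:` lines.

```typescript
// Result of feeding the accumulated text into the parser.
interface SseParseResult {
  payloads: string[] // complete `data:` payloads, in arrival order
  rest: string // incomplete trailing line, to be prepended to the next chunk
}

function parseSseBuffer(buffer: string): SseParseResult {
  const lines = buffer.split('\n')
  // The last element is '' when the buffer ended on a newline, otherwise a partial line
  const rest = lines.pop() ?? ''
  const payloads: string[] = []
  for (const rawLine of lines) {
    const line = rawLine.replace(/\r$/, '') // tolerate CRLF line endings
    if (line.startsWith('data:')) {
      payloads.push(line.slice(5).trimStart())
    }
  }
  return { payloads, rest }
}

// A JSON payload split across two network chunks is only emitted once it is complete
const first = parseSseBuffer('data: {"a":1}\n\ndata: {"b"')
const second = parseSseBuffer(first.rest + ':2}\n\ndata: [DONE]\n\n')
console.log(first.payloads) // ['{"a":1}']
console.log(second.payloads) // ['{"b":2}', '[DONE]']
```

Keeping the parsing separate from the fetch loop also makes it straightforward to unit test without a running server.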

2.2 Environment Variables

Create a .env.development file:

VITE_API_BASE_URL=http://localhost:8000
VITE_APP_TITLE=GLM Smart Chat

Create a .env.production file:

VITE_API_BASE_URL=https://your-api-server.com
VITE_APP_TITLE=GLM Smart Chat

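Vite exposes variables prefixed with VITE_ on import.meta.env automatically, so they need no extra configuration. vite.config.ts only needs the Vue plugin and the @ alias that the imports in this article rely on (create-vue generates this alias by default):

import { fileURLToPath, URL } from 'node:url'
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'

export default defineConfig({
  plugins: [vue()],
  resolve: {
    alias: {
      '@': fileURLToPath(new URL('./src', import.meta.url))
    }
  }
})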
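Because the base URL comes from an environment file, it may or may not end with a slash, and it may be a relative prefix such as /api. A small helper, sketched here under the hypothetical name `joinUrl`, keeps the final URL well formed in either case. It is an optional convenience, not something the service code above requires.

```typescript
// Hypothetical helper: joins a base URL and a path with exactly one slash between them.
function joinUrl(base: string, path: string): string {
  const trimmedBase = base.replace(/\/+$/, '') // drop trailing slashes
  const trimmedPath = path.replace(/^\/+/, '') // drop leading slashes
  return `${trimmedBase}/${trimmedPath}`
}

console.log(joinUrl('http://localhost:8000/', '/chat/stream')) // http://localhost:8000/chat/stream
console.log(joinUrl('/api', 'models')) // /api/models
```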

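3. Building the Core Chat Components

Now let's create the core components of the chat interface, building up a fully functional UI step by step.

3.1 Chat State Management

First we need a place to manage the chat state. Create chatStore.ts under src/stores: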

import { defineStore } from 'pinia'
import { ref, computed } from 'vue'
import type { ChatMessage } from '@/services/chatService'

export const useChatStore = defineStore('chat', () => {
  // Chat message list
  const messages = ref<ChatMessage[]>([
    {
      role: 'assistant',
      content: 'Hello! I am an assistant based on GLM-4-9B-Chat-1M. I can handle contexts of up to 1 million tokens. How can I help?',
      timestamp: Date.now(),
    },
  ])

  // Current user input
  const userInput = ref('')

  // Whether a reply is being generated
  const isGenerating = ref(false)

  // Whether to use streaming output
  const useStreaming = ref(true)

  // Currently selected model
  const selectedModel = ref('glm-4-9b-chat-1m')

  // Add a message
  const addMessage = (message: ChatMessage) => {
    messages.value.push({
      ...message,
      timestamp: message.timestamp || Date.now(),
    })
  }

  // Append to the last message (used for streaming output)
  const updateLastMessage = (content: string) => {
    if (messages.value.length > 0) {
      const lastMessage = messages.value[messages.value.length - 1]
      if (lastMessage.role === 'assistant') {
        lastMessage.content += content
      }
    }
  }

  // Clear the chat history
  const clearMessages = () => {
    messages.value = [
      {
        role: 'assistant',
        content: 'Chat history cleared. How can I help?',
        timestamp: Date.now(),
      },
    ]
  }

  // Estimate token usage (simplified)
  const estimatedTokens = computed(() => {
    const text = messages.value.map(m => m.content).join(' ')
    // Rough heuristic: 1 token per Chinese character, 0.25 per other character (letters, spaces, etc.)
    const chineseChars = (text.match(/[\u4e00-\u9fa5]/g) || []).length
    const otherChars = text.length - chineseChars
    return Math.ceil(chineseChars + otherChars * 0.25)
  })

  // Whether the conversation is close to the context limit (1 million tokens)
  const isNearContextLimit = computed(() => {
    return estimatedTokens.value > 900000
  })

  return {
    messages,
    userInput,
    isGenerating,
    useStreaming,
    selectedModel,
    addMessage,
    updateLastMessage,
    clearMessages,
    estimatedTokens,
    isNearContextLimit,
  }
})
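The store only flags when the conversation approaches the context limit. One possible follow-up, sketched below, is to drop the oldest messages until the estimate fits a budget. `trimToTokenBudget` is a hypothetical helper, it reuses the same rough per-character heuristic as the store, and summing per-message estimates gives a slightly different number than estimating the joined text, so treat the result as approximate.

```typescript
interface Msg {
  role: 'user' | 'assistant' | 'system'
  content: string
}

// Same rough heuristic as the store: 1 token per Chinese character, 0.25 per other character
function estimateTokens(text: string): number {
  const chinese = (text.match(/[\u4e00-\u9fa5]/g) || []).length
  return Math.ceil(chinese + (text.length - chinese) * 0.25)
}

// Drops the oldest non-system messages until the estimate fits the budget.
// The newest message is always kept, even if it alone exceeds the budget.
function trimToTokenBudget(messages: Msg[], budget: number): Msg[] {
  const kept = [...messages]
  const total = () => kept.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  while (kept.length > 1 && total() > budget) {
    const idx = kept.findIndex((m) => m.role !== 'system')
    if (idx === -1 || idx === kept.length - 1) break // nothing left that can be dropped
    kept.splice(idx, 1)
  }
  return kept
}
```

A trimmed copy like this could be what gets sent to the model, while the full history stays visible in the UI.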

3.2 The Chat Interface Component

Create ChatInterface.vue under src/components:

<template>
  <div class="chat-container">
    <!-- Top toolbar -->
    <div class="chat-header">
      <div class="header-left">
        <h2>GLM Smart Chat</h2>
        <span class="model-badge">{{ selectedModel }}</span>
      </div>
      <div class="header-right">
        <el-tooltip content="Clear chat history">
          <el-button @click="clearChat" :disabled="isGenerating" circle>
            <el-icon><Delete /></el-icon>
          </el-button>
        </el-tooltip>
        <el-switch
          v-model="useStreaming"
          active-text="Streaming"
          inactive-text="Full reply"
          :disabled="isGenerating"
        />
        <el-select
          v-model="selectedModel"
          placeholder="Select a model"
          size="small"
          style="width: 180px; margin-left: 10px;"
          :disabled="isGenerating"
        >
          <el-option
            v-for="model in availableModels"
            :key="model"
            :label="model"
            :value="model"
          />
        </el-select>
      </div>
    </div>

    <!-- Chat message area -->
    <div ref="messagesContainer" class="messages-container">
      <div
        v-for="(message, index) in messages"
        :key="index"
        :class="['message-bubble', message.role]"
      >
        <div class="message-avatar">
          <el-avatar :size="32">
            <span v-if="message.role === 'user'">👤</span>
            <span v-else>🤖</span>
          </el-avatar>
        </div>
        <div class="message-content">
          <div class="message-header">
            <span class="message-role">{{ message.role === 'user' ? 'You' : 'AI Assistant' }}</span>
            <span class="message-time">{{ formatTime(message.timestamp) }}</span>
          </div>
          <div class="message-body" v-html="renderMessage(message.content)"></div>
        </div>
      </div>
      
      <!-- Generating indicator -->
      <div v-if="isGenerating && !useStreaming" class="generating-indicator">
        <el-icon class="is-loading"><Loading /></el-icon>
        <span>Thinking...</span>
      </div>
    </div>

    <!-- Input area -->
    <div class="input-container">
      <div class="input-tools">
        <el-tooltip content="Upload a file (long text supported)">
          <el-button @click="handleFileUpload" :disabled="isGenerating" circle>
            <el-icon><Upload /></el-icon>
          </el-button>
        </el-tooltip>
        <el-tooltip content="Insert an example question">
          <el-button @click="insertExample" :disabled="isGenerating" circle>
            <el-icon><MagicStick /></el-icon>
          </el-button>
        </el-tooltip>
      </div>
      
      <div class="input-area">
        <el-input
          v-model="userInput"
          type="textarea"
          :rows="3"
          placeholder="Type your question... (long text supported, up to 1 million tokens)"
          :disabled="isGenerating"
          @keydown.enter.exact.prevent="handleSend"
          resize="none"
        />
        <div class="input-actions">
          <div class="token-info">
            <span v-if="estimatedTokens > 0">
              Estimated tokens: {{ estimatedTokens.toLocaleString() }}
              <el-tooltip v-if="isNearContextLimit" content="Approaching the context limit; consider clearing the history">
                <el-icon color="#e6a23c"><Warning /></el-icon>
              </el-tooltip>
            </span>
          </div>
          <el-button
            type="primary"
            @click="handleSend"
            :loading="isGenerating"
            :disabled="!userInput.trim()"
          >
            Send
          </el-button>
        </div>
      </div>
    </div>
  </div>
</template>

<script setup lang="ts">
import { ref, computed, onMounted, nextTick, watch } from 'vue'
import { useChatStore } from '@/stores/chatStore'
import { chatService, type ChatMessage } from '@/services/chatService'
import MarkdownIt from 'markdown-it'
import hljs from 'highlight.js'
import 'highlight.js/styles/github.css'
import {
  Delete,
  Upload,
  MagicStick,
  Loading,
  Warning,
} from '@element-plus/icons-vue'
import { ElMessage } from 'element-plus'

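// Initialize the Markdown renderer
const md = new MarkdownIt({
  html: false, // output is injected with v-html, so raw HTML in model replies must stay escaped
  linkify: true,
  typographer: true,
  highlight: function (str, lang) {
    if (lang && hljs.getLanguage(lang)) {
      try {
        return hljs.highlight(str, { language: lang }).value
      } catch (__) {}
    }
    return '' // empty string: markdown-it falls back to its default escaping
  }
})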

// Use the store
const chatStore = useChatStore()
const messages = computed(() => chatStore.messages)
const userInput = computed({
  get: () => chatStore.userInput,
  set: (value) => { chatStore.userInput = value }
})
const isGenerating = computed(() => chatStore.isGenerating)
const useStreaming = computed({
  get: () => chatStore.useStreaming,
  set: (value) => { chatStore.useStreaming = value }
})
const selectedModel = computed({
  get: () => chatStore.selectedModel,
  set: (value) => { chatStore.selectedModel = value }
})
const estimatedTokens = computed(() => chatStore.estimatedTokens)
const isNearContextLimit = computed(() => chatStore.isNearContextLimit)

// Available models
const availableModels = ref<string[]>([])

// Message container ref (used for auto-scrolling)
const messagesContainer = ref<HTMLElement>()

// Initialization
onMounted(async () => {
  await loadModels()
  scrollToBottom()
})

// Load the available models
const loadModels = async () => {
  try {
    availableModels.value = await chatService.getModels()
  } catch (error) {
    console.error('Failed to load models:', error)
    availableModels.value = ['glm-4-9b-chat-1m', 'glm-4-9b-chat']
  }
}

// Render a message as Markdown
const renderMessage = (content: string) => {
  return md.render(content)
}

// Format the timestamp as HH:mm
const formatTime = (timestamp?: number) => {
  if (!timestamp) return ''
  const date = new Date(timestamp)
  return `${date.getHours().toString().padStart(2, '0')}:${date.getMinutes().toString().padStart(2, '0')}`
}

// Scroll to the bottom
const scrollToBottom = () => {
  nextTick(() => {
    if (messagesContainer.value) {
      messagesContainer.value.scrollTop = messagesContainer.value.scrollHeight
    }
  })
}

// Auto-scroll whenever the messages change
watch(messages, () => {
  scrollToBottom()
}, { deep: true })

// Send a message
const handleSend = async () => {
  const input = userInput.value.trim()
  if (!input || isGenerating.value) return

  // Add the user's message
  const userMessage: ChatMessage = {
    role: 'user',
    content: input,
  }
  chatStore.addMessage(userMessage)
  
  // Clear the input box
  userInput.value = ''

  // Add an empty assistant message as the target for streamed output
  if (useStreaming.value) {
    chatStore.addMessage({
      role: 'assistant',
      content: '',
    })
  }

  // Mark generation as in progress
  chatStore.isGenerating = true

  try {
    if (useStreaming.value) {
      // Streaming output
      await chatService.chatStream(
        {
          messages: [...messages.value.slice(0, -1)], // exclude the trailing empty assistant message
          model: selectedModel.value,
          temperature: 0.7,
          max_tokens: 2000,
          stream: true,
        },
        (chunk) => {
          // Append the chunk to the last message
          chatStore.updateLastMessage(chunk)
        },
        () => {
          // Generation finished
          chatStore.isGenerating = false
        },
        (error) => {
          // Error handling
          console.error('Stream error:', error)
          chatStore.updateLastMessage('\n\n[Generation interrupted, please retry]')
          chatStore.isGenerating = false
          ElMessage.error('An error occurred during generation')
        }
      )
    } else {
      // Full (non-streaming) output
      const response = await chatService.chatCompletion({
        messages: messages.value.filter(m => m.role !== 'assistant' || m.content), // exclude empty assistant messages
        model: selectedModel.value,
        temperature: 0.7,
        max_tokens: 2000,
        stream: false,
      })

      // Add the assistant's reply
      const assistantMessage = response.choices[0].message
      chatStore.addMessage(assistantMessage)
      chatStore.isGenerating = false
    }
  } catch (error) {
    console.error('Chat error:', error)
    chatStore.isGenerating = false
    ElMessage.error('Send failed; check your network connection or the API service')
    
    // Remove the empty assistant message (streaming mode only)
    if (useStreaming.value && messages.value[messages.value.length - 1].content === '') {
      chatStore.messages.pop()
    }
  }
}

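// Clear the chat
const clearChat = () => {
  if (isGenerating.value) return
  chatStore.clearMessages()
  ElMessage.success('Chat history cleared')
}

// Handle file upload
const handleFileUpload = () => {
  // File upload logic can be implemented here
  ElMessage.info('File upload is not implemented yet')
}

// Insert an example question
const insertExample = () => {
  const examples = [
    'Write a Vue 3 composable for debounced search',
    'Implement quicksort in Python with detailed comments',
    'Explain the attention mechanism with a simple example',
    'Summarize the relationships among the main characters of Dream of the Red Chamber',
    'Write a short essay of about 300 words on the future of artificial intelligence',
  ]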
  const randomExample = examples[Math.floor(Math.random() * examples.length)]
  userInput.value = randomExample
}
</script>

<style scoped>
.chat-container {
  display: flex;
  flex-direction: column;
  height: 100vh;
  background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
}

.chat-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 16px 24px;
  background: rgba(255, 255, 255, 0.95);
  box-shadow: 0 2px 12px rgba(0, 0, 0, 0.1);
  z-index: 10;
}

.header-left {
  display: flex;
  align-items: center;
  gap: 12px;
}

.header-left h2 {
  margin: 0;
  color: #333;
  font-size: 20px;
}

.model-badge {
  background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
  color: white;
  padding: 4px 12px;
  border-radius: 16px;
  font-size: 12px;
  font-weight: 500;
}

.header-right {
  display: flex;
  align-items: center;
  gap: 12px;
}

.messages-container {
  flex: 1;
  overflow-y: auto;
  padding: 24px;
  background: rgba(249, 250, 251, 0.95);
}

.message-bubble {
  display: flex;
  margin-bottom: 24px;
  animation: fadeIn 0.3s ease;
}

@keyframes fadeIn {
  from {
    opacity: 0;
    transform: translateY(10px);
  }
  to {
    opacity: 1;
    transform: translateY(0);
  }
}

.message-bubble.user {
  flex-direction: row-reverse;
}

.message-avatar {
  flex-shrink: 0;
  margin: 0 12px;
}

.message-content {
  max-width: 70%;
  background: white;
  border-radius: 12px;
  padding: 16px;
  box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}

.message-bubble.user .message-content {
  background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
  color: white;
}

.message-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-bottom: 8px;
  font-size: 12px;
  opacity: 0.8;
}

.message-role {
  font-weight: 600;
}

.message-time {
  font-size: 11px;
}

.message-body {
  line-height: 1.6;
}

.message-body :deep(pre) {
  background: #f6f8fa;
  border-radius: 6px;
  padding: 12px;
  overflow-x: auto;
  margin: 8px 0;
}

.message-body :deep(code) {
  background: #f6f8fa;
  padding: 2px 4px;
  border-radius: 4px;
  font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
}

.message-bubble.user .message-body :deep(pre),
.message-bubble.user .message-body :deep(code) {
  background: rgba(255, 255, 255, 0.1);
  color: white;
}

.generating-indicator {
  display: flex;
  align-items: center;
  justify-content: center;
  gap: 8px;
  padding: 16px;
  color: #666;
  font-size: 14px;
}

.input-container {
  padding: 16px 24px;
  background: rgba(255, 255, 255, 0.95);
  border-top: 1px solid #e5e7eb;
}

.input-tools {
  display: flex;
  gap: 8px;
  margin-bottom: 12px;
}

.input-area {
  position: relative;
}

.input-actions {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-top: 12px;
}

.token-info {
  font-size: 12px;
  color: #666;
  display: flex;
  align-items: center;
  gap: 4px;
}

:deep(.el-textarea__inner) {
  border-radius: 12px;
  border: 1px solid #e5e7eb;
  padding: 12px 16px;
  font-size: 14px;
  line-height: 1.6;
  resize: none;
}

:deep(.el-textarea__inner:focus) {
  border-color: #667eea;
  box-shadow: 0 0 0 2px rgba(102, 126, 234, 0.1);
}
</style>
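The component sends the store's messages as they are, which includes the UI-only timestamp field. Whether a given back end tolerates extra fields is an assumption about that back end, so one cautious option is to map messages to a minimal wire format before sending. `toRequestMessages`, `UiMessage`, and `WireMessage` below are hypothetical names used only for this sketch.

```typescript
interface UiMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp?: number // used only for display in the chat UI
}

interface WireMessage {
  role: UiMessage['role']
  content: string
}

// Keeps only role and content, and skips empty placeholders
// such as the blank assistant message added before streaming starts
function toRequestMessages(messages: UiMessage[]): WireMessage[] {
  return messages
    .filter((m) => m.content.trim() !== '')
    .map(({ role, content }) => ({ role, content }))
}
```

With a helper like this, both the streaming and the non-streaming branch of handleSend could pass toRequestMessages(messages.value) instead of slicing or filtering the list separately.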

3.3 The Main Page

Create ChatView.vue under src/views:

<template>
  <div class="chat-view">
    <ChatInterface />
  </div>
</template>

<script setup lang="ts">
import ChatInterface from '@/components/ChatInterface.vue'
</script>

<style scoped>
.chat-view {
  height: 100vh;
  overflow: hidden;
}
</style>

Update the router configuration in src/router/index.ts:

import { createRouter, createWebHistory } from 'vue-router'
import ChatView from '@/views/ChatView.vue'

const router = createRouter({
  history: createWebHistory(import.meta.env.BASE_URL),
  routes: [
    {
      path: '/',
      name: 'chat',
      component: ChatView,
    },
  ],
})

export default router

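4. Advanced Features and Optimization

With the basics in place, we can add some advanced features to improve the user experience.

4.1 Long-Text Handling

GLM-4-9B-Chat-1M supports 1 million tokens, but in practice we still need to think about performance. Create src/utils/textProcessor.ts: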

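/**
 * Long-text processing utilities
 */

// Estimate the number of tokens in a text
export function estimateTokens(text: string): number {
  // Rough heuristic: 1 token per Chinese character, 0.25 per other character (letters, spaces, etc.)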
  const chineseChars = (text.match(/[\u4e00-\u9fa5]/g) || []).length
  const otherChars = text.length - chineseChars
  return Math.ceil(chineseChars + otherChars * 0.25)
}

// Split a long text into multiple chunks
export function splitLongText(
  text: string,
  maxTokens: number = 32000, // maximum tokens per chunk
  overlapTokens: number = 500 // tokens of overlap between chunks, to preserve context
): string[] {
  const chunks: string[] = []
  let currentChunk = ''
  let currentTokens = 0
  
  // Split on paragraph boundaries
  const paragraphs = text.split(/\n\s*\n/)
  
  for (const paragraph of paragraphs) {
    const paragraphTokens = estimateTokens(paragraph)
    
    // If a single paragraph exceeds the limit, split it further
    if (paragraphTokens > maxTokens) {
      // Split into sentences, keeping each sentence's own terminator
      const sentences = paragraph.match(/[^。!?.!?]+[。!?.!?]*/g) || [paragraph]
      for (const sentence of sentences) {
        const sentenceTokens = estimateTokens(sentence)
        
        if (currentChunk && currentTokens + sentenceTokens > maxTokens) {
          chunks.push(currentChunk)
          // Carry the overlap into the next chunk
          const overlapText = getOverlapText(currentChunk, overlapTokens)
          currentChunk = overlapText + sentence
          currentTokens = estimateTokens(currentChunk)
        } else {
          // Also taken when currentChunk is empty, so an oversized
          // sentence is kept rather than silently dropped
          currentChunk += sentence
          currentTokens += sentenceTokens
        }
      }
    } else if (currentChunk && currentTokens + paragraphTokens > maxTokens) {
      chunks.push(currentChunk)
      // Carry the overlap into the next chunk
      const overlapText = getOverlapText(currentChunk, overlapTokens)
      currentChunk = overlapText + paragraph + '\n\n'
      currentTokens = estimateTokens(currentChunk)
    } else {
      currentChunk += paragraph + '\n\n'
      currentTokens += paragraphTokens + 2 // rough allowance for the paragraph break
    }
  }
  
  // Push the final chunk
  if (currentChunk) {
    chunks.push(currentChunk)
  }
  
  return chunks
}

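// Return the tail of the text worth roughly overlapTokens tokens
function getOverlapText(text: string, overlapTokens: number): string {
  const chars = Array.from(text) // split by code point so surrogate pairs stay intact
  let overlapText = ''
  let overlapTokenCount = 0
  
  // Walk backwards using the same per-character weights as estimateTokens
  // (calling estimateTokens on a single character would round 0.25 up to 1)
  for (let i = chars.length - 1; i >= 0 && overlapTokenCount < overlapTokens; i--) {
    const char = chars[i]
    overlapText = char + overlapText
    overlapTokenCount += /[\u4e00-\u9fa5]/.test(char) ? 1 : 0.25
  }
  
  return overlapText
}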

// Handle an uploaded file
export async function processUploadedFile(file: File): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader()
    
    reader.onload = (e) => {
      try {
        const content = e.target?.result as string
        // Branch on the file type
        if (file.type === 'application/pdf') {
          // PDF handling (needs a third-party library)
          resolve(`[PDF file: ${file.name}]\nContent requires a PDF parsing library`)
        } else if (file.type.includes('text') || file.name.endsWith('.txt') || file.name.endsWith('.md')) {
          // Plain-text files can be read directly
          resolve(content)
        } else if (file.type.includes('word') || file.name.endsWith('.docx')) {
          // Word document handling
          resolve(`[Word file: ${file.name}]\nContent requires a DOCX parsing library`)
        } else {
          resolve(`[File: ${file.name}]\nThis file type cannot be read directly yet`)
        }
      } catch (error) {
        reject(error)
      }
    }
    
    reader.onerror = () => {
      reject(new Error('Failed to read file'))
    }
    
    reader.readAsText(file)
  })
}
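To see what the overlap between chunks buys you, it helps to look at the idea in its simplest form. The sketch below is a deliberately simplified, character-based version (fixed window size, no paragraph or sentence awareness), with the hypothetical name `chunkWithOverlap`. It is not a replacement for splitLongText, just an illustration of how consecutive chunks share their boundary text.

```typescript
// Simplified illustration: fixed-size character windows that share `overlap` characters
function chunkWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('expected size > 0 and 0 <= overlap < size')
  }
  const chars = Array.from(text) // iterate by code point, not UTF-16 unit
  const chunks: string[] = []
  const step = size - overlap // how far the window advances each time
  for (let start = 0; start < chars.length; start += step) {
    chunks.push(chars.slice(start, start + size).join(''))
    if (start + size >= chars.length) break // the window already reached the end
  }
  return chunks
}

console.log(chunkWithOverlap('abcdefghij', 4, 1)) // ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, so a sentence that straddles a boundary is still partly visible to the model on both sides.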

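4.2 Chat History Management

Next, add saving and loading of the chat history. Update chatStore.ts:

// Add the following inside the existing store setup function (import watch from 'vue'),
// and remember to include the new functions in the returned object

// Save the chat history to local storage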
const saveChatHistory = () => {
  try {
    const history = {
      messages: messages.value,
      timestamp: Date.now(),
      model: selectedModel.value,
    }
    localStorage.setItem('glm_chat_history', JSON.stringify(history))
  } catch (error) {
    console.error('Failed to save chat history:', error)
  }
}

// Load the chat history from local storage
const loadChatHistory = () => {
  try {
    const saved = localStorage.getItem('glm_chat_history')
    if (saved) {
      const history = JSON.parse(saved)
      messages.value = history.messages
      selectedModel.value = history.model || 'glm-4-9b-chat-1m'
    }
  } catch (error) {
    console.error('Failed to load chat history:', error)
  }
}

// Export the chat history
const exportChatHistory = () => {
  try {
    const data = {
      messages: messages.value,
      metadata: {
        exportedAt: new Date().toISOString(),
        model: selectedModel.value,
        totalMessages: messages.value.length,
        estimatedTokens: estimatedTokens.value,
      },
    }
    
    const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' })
    const url = URL.createObjectURL(blob)
    const a = document.createElement('a')
    a.href = url
    a.download = `glm-chat-history-${Date.now()}.json`
    document.body.appendChild(a)
    a.click()
    document.body.removeChild(a)
    URL.revokeObjectURL(url)
    
    return true
  } catch (error) {
    console.error('Failed to export chat history:', error)
    return false
  }
}

// Import a chat history file
const importChatHistory = (file: File): Promise<boolean> => {
  return new Promise((resolve, reject) => {
    const reader = new FileReader()
    
    reader.onload = (e) => {
      try {
        const content = e.target?.result as string
        const data = JSON.parse(content)
        
        if (data.messages && Array.isArray(data.messages)) {
          messages.value = data.messages
          selectedModel.value = data.metadata?.model || 'glm-4-9b-chat-1m'
          saveChatHistory()
          resolve(true)
        } else {
          reject(new Error('Invalid chat history file'))
        }
      } catch (error) {
        reject(error)
      }
    }
    
    reader.onerror = () => {
      reject(new Error('Failed to read file'))
    }
    
    reader.readAsText(file)
  })
}

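// Load any saved history first, so the auto-save below cannot overwrite it.
// This is called directly rather than in onMounted: onMounted is a component
// lifecycle hook and is unreliable inside a store
loadChatHistory()

// Auto-save whenever the messages change (import watch from 'vue')
watch(messages, () => {
  saveChatHistory()
}, { deep: true })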
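Data read back from localStorage or from an imported file is untrusted: it may be stale, hand-edited, or from another application. A possible hardening step, sketched below, is to validate the shape before assigning it to the store. `isStoredMessage` and `parseStoredMessages` are hypothetical helpers, and the checks mirror the ChatMessage fields used in this article.

```typescript
interface StoredMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp?: number
}

// Type guard: checks one entry against the expected message shape
function isStoredMessage(value: unknown): value is StoredMessage {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  const roleOk = v.role === 'user' || v.role === 'assistant' || v.role === 'system'
  const timestampOk = v.timestamp === undefined || typeof v.timestamp === 'number'
  return roleOk && typeof v.content === 'string' && timestampOk
}

// Returns the validated messages, or null if the JSON is malformed or has the wrong shape
function parseStoredMessages(json: string): StoredMessage[] | null {
  try {
    const data: unknown = JSON.parse(json)
    const list = (data as { messages?: unknown } | null)?.messages
    return Array.isArray(list) && list.every(isStoredMessage) ? list : null
  } catch {
    return null
  }
}
```

loadChatHistory and importChatHistory could both call parseStoredMessages and keep the current messages whenever it returns null.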

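4.3 Performance Tips

In real use you may run into performance problems. Here are some suggestions:

  1. Virtual scrolling: if the chat history gets very long, consider virtual scrolling
  2. Lazy image loading: defer loading images inside messages
  3. Web Worker: move expensive work such as Markdown rendering into a Web Worker
  4. Request cancellation: cancel the in-flight request when the user sends a new message
  5. Local caching: cache commonly used reply templates

Create a performance utility file, src/utils/performance.ts: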
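Of these, request cancellation (item 4) is the quickest to add, because fetch accepts an AbortSignal. The sketch below is one way to keep only the latest request alive. `LatestRequest` is a hypothetical class name, and wiring its signal into chatStream is left as an assumption about how you would extend the service.

```typescript
// Tracks the most recent request and aborts the previous one when a new one starts
class LatestRequest {
  private controller: AbortController | null = null

  // Aborts the previous request (if any) and returns a signal for the new one
  next(): AbortSignal {
    this.controller?.abort()
    this.controller = new AbortController()
    return this.controller.signal
  }

  // Aborts the current request, for example from a "stop generating" button
  cancel(): void {
    this.controller?.abort()
    this.controller = null
  }
}
```

The returned signal would be passed as fetch(url, { signal }). An aborted fetch rejects with an error whose name is AbortError, so the error callback can check for that name and skip the failure toast when the user cancelled on purpose.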

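/**
 * Performance utilities
 */

// Debounce
export function debounce<T extends (...args: any[]) => any>(
  func: T,
  wait: number
): (...args: Parameters<T>) => void {
  // ReturnType<typeof setTimeout> works in the browser without Node.js type definitions
  let timeout: ReturnType<typeof setTimeout> | null = null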
  
  return (...args: Parameters<T>) => {
    if (timeout) clearTimeout(timeout)
    timeout = setTimeout(() => func(...args), wait)
  }
}

// Throttle
export function throttle<T extends (...args: any[]) => any>(
  func: T,
  limit: number
): (...args: Parameters<T>) => void {
  let inThrottle: boolean = false
  
  return (...args: Parameters<T>) => {
    if (!inThrottle) {
      func(...args)
      inThrottle = true
      setTimeout(() => (inThrottle = false), limit)
    }
  }
}

// Measure a function's execution time
export function measurePerformance<T extends (...args: any[]) => any>(
  func: T,
  label: string = 'Function'
): (...args: Parameters<T>) => ReturnType<T> {
  return (...args: Parameters<T>) => {
    const start = performance.now()
    const result = func(...args)
    const end = performance.now()
    console.log(`${label} took ${(end - start).toFixed(2)}ms`)
    return result
  }
}

// Batch updates into the next animation frame
export function batchUpdate(callback: () => void) {
  if (typeof requestAnimationFrame !== 'undefined') {
    requestAnimationFrame(callback)
  } else {
    setTimeout(callback, 0)
  }
}

// Memory usage monitor
export class MemoryMonitor {
  private static instance: MemoryMonitor
  private updateInterval: ReturnType<typeof setInterval> | null = null
  
  private constructor() {}
  
  static getInstance(): MemoryMonitor {
    if (!MemoryMonitor.instance) {
      MemoryMonitor.instance = new MemoryMonitor()
    }
    return MemoryMonitor.instance
  }
  
  startMonitoring(interval: number = 30000) {
    if (this.updateInterval) {
      clearInterval(this.updateInterval)
    }
    
    this.updateInterval = setInterval(() => {
      if ('memory' in performance) {
        const memory = (performance as any).memory
        console.log('Memory usage:', {
          usedJSHeapSize: `${(memory.usedJSHeapSize / 1024 / 1024).toFixed(2)} MB`,
          totalJSHeapSize: `${(memory.totalJSHeapSize / 1024 / 1024).toFixed(2)} MB`,
          jsHeapSizeLimit: `${(memory.jsHeapSizeLimit / 1024 / 1024).toFixed(2)} MB`,
        })
      }
    }, interval)
  }
  
  stopMonitoring() {
    if (this.updateInterval) {
      clearInterval(this.updateInterval)
      this.updateInterval = null
    }
  }
}
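One place where batching tends to pay off in this project is streaming: every chunk updates the store, and every update re-renders the message's Markdown. A small buffer that collects chunks and hands them over in one piece, sketched below with the hypothetical name `createChunkBuffer`, reduces that to one update per flush. How often to flush (a timer, requestAnimationFrame, or the batchUpdate helper above) is left to the caller.

```typescript
// Collects streamed text and delivers it in batches
function createChunkBuffer(onFlush: (text: string) => void) {
  let pending = ''
  return {
    // Called once per streamed chunk; cheap, no rendering involved
    push(chunk: string): void {
      pending += chunk
    },
    // Delivers everything collected so far as a single update
    flush(): void {
      if (pending === '') return
      const text = pending
      pending = ''
      onFlush(text)
    },
  }
}
```

In the chat component, push could be used as the onChunk callback, with flush called on an interval while streaming and once more when the stream completes so no trailing text is lost.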

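5. Deployment and Production Configuration

5.1 Build Optimization

Update vite.config.ts with build optimizations. The visualizer plugin and terser are not installed by default, so add them first with npm install -D rollup-plugin-visualizer terser. If your existing config defines the @ alias under resolve.alias, keep that entry; the snippet below does not show it: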

import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
import { visualizer } from 'rollup-plugin-visualizer'

export default defineConfig({
  plugins: [
    vue(),
    visualizer({
      open: false,
      gzipSize: true,
      brotliSize: true,
    }),
  ],
  build: {
    rollupOptions: {
      output: {
        manualChunks: {
          'vendor': ['vue', 'vue-router', 'pinia'],
          'ui-library': ['element-plus'],
          'markdown': ['markdown-it', 'highlight.js'],
        },
        chunkFileNames: 'assets/js/[name]-[hash].js',
        entryFileNames: 'assets/js/[name]-[hash].js',
        assetFileNames: 'assets/[ext]/[name]-[hash].[ext]',
      },
    },
    chunkSizeWarningLimit: 1000,
    minify: 'terser',
    terserOptions: {
      compress: {
        drop_console: true,
        drop_debugger: true,
      },
    },
  },
  server: {
    proxy: {
      '/api': {
        target: 'http://localhost:8000',
        changeOrigin: true,
        rewrite: (path) => path.replace(/^\/api/, ''),
      },
    },
  },
})

5.2 Docker Deployment

Create a Dockerfile:

# Build stage
FROM node:18-alpine AS build-stage

WORKDIR /app

COPY package*.json ./
# Install devDependencies too: the build needs Vite and the TypeScript tooling
RUN npm ci

COPY . .
RUN npm run build

# Production stage
FROM nginx:alpine AS production-stage

COPY --from=build-stage /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/nginx.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

Create nginx.conf:

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Enable gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/xml text/javascript application/javascript application/xml+rss application/json;

    server {
        listen 80;
        server_name localhost;
        root /usr/share/nginx/html;
        index index.html;

        # Front-end (history mode) routing
        location / {
            try_files $uri $uri/ /index.html;
        }

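        # API proxy (if needed)
        location /api/ {
            proxy_pass http://backend:8000/;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            # Disable buffering so streamed (SSE) replies reach the browser chunk by chunk
            proxy_buffering off;
            # Long generations can exceed the default 60s read timeout
            proxy_read_timeout 300s;
        }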

        # Static asset caching
        location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
        }
    }
}

Create docker-compose.yml:

version: '3.8'

services:
  frontend:
    build: .
    ports:
      - "8080:80"
    depends_on:
      - backend
    networks:
      - glm-network

  backend:
    image: your-glm-backend-image
    ports:
      - "8000:8000"
    environment:
      - MODEL_NAME=glm-4-9b-chat-1m
    volumes:
      - ./models:/app/models
    networks:
      - glm-network
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

networks:
  glm-network:
    driver: bridge

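5.3 Environment Variables

Update .env.production (first created in section 2.2) so that API calls go through the Nginx /api proxy:

VITE_API_BASE_URL=/api
VITE_APP_TITLE=GLM Smart Chat Assistant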
VITE_ENABLE_ANALYTICS=false
VITE_SENTRY_DSN=

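6. Summary

If you've followed the whole process, you should now have GLM-4-9B-Chat-1M integrated into your Vue 3 project. From the initial API design, to building the chat interface, to the optimizations and deployment configuration, we've covered the key parts of a front-end integration.

In practice this approach is quite flexible. Streaming output feels better than returning everything at once: users watch the reply take shape in real time, which feels more natural. For long text, the chunking and overlap helpers are a starting point, but the parameters still need tuning for each scenario. Not every use case needs a 1-million-token context.

On performance, what the front end can do is limited; most of it comes down to the inference speed of the model on the back end. Still, the debounce and throttle helpers and the memory monitor, together with suggestions such as virtual scrolling, do help the user experience. The benefit shows most once the chat history grows long.

If you run into problems, I suggest starting from a simple configuration, getting it working, and then adding more complex features step by step. GLM-4-9B-Chat-1M is very capable, but use it sensibly: don't feed it too much content at once, and test its long-text handling incrementally.

There is plenty of room to extend this project, for example supporting more models, implementing file upload and parsing, or adding conversation templates. You can build these out as your needs require. Above all, keep the code structure clear so that later maintenance and extension stay manageable.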

