langchain4j: Integrating Image Understanding and Image Generation
The image module
Create a new image module.

The pom file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.whc</groupId>
        <artifactId>langChain4j-whc</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <artifactId>langchain4j-whc-image</artifactId>
    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- langChain4j -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-open-ai</artifactId>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
        </dependency>
        <!-- Alibaba Cloud Bailian platform -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-community-dashscope-spring-boot-starter</artifactId>
        </dependency>
        <!-- lombok -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <!-- hutool -->
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.22</version>
        </dependency>
        <!-- test -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
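Two models
Image understanding model
A VL (vision-language) model is used here.
Get the baseUrl from the API reference.

Image generation model
Image generation mainly uses the Tongyi Wanxiang (通义万相) models.
Either version 2.1 or 2.2 can be used.

The 2.2 model
Note: when using a Wanxiang model, only the model name needs to be configured.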


Configuration parameters
server:
  port: 9003
spring:
  application:
    name: langChain_whc_image
ai:
  dashScope:
    # DashScope settings
    apiKey: ${AI_DASHSCOPE_API_KEY}
    vlModelName: ${AI_DASHSCOPE_VL_MODEL_NAME:qwen-vl-max}
    vlBaseUrl: ${AI_DASHSCOPE_VL_BASE_URL:https://dashscope.aliyuncs.com/compatible-mode/v1}
    wxModelName: ${AI_DASHSCOPE_WX_MODEL_NAME:wan2.2-t2i-plus}
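The `${NAME:default}` placeholders in this configuration take their value from the environment variable when it is set and otherwise fall back to the text after the first colon; `apiKey` has no default, so it must be provided. The following is a minimal sketch of that fallback rule in plain Java. It is not Spring's actual resolver, and the class `PlaceholderSketch` and method `resolve` are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSketch {
    // Matches ${NAME} or ${NAME:default}; group 1 is the name, group 2 the optional default.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Hypothetical helper: resolves one placeholder against the given environment map.
    static String resolve(String raw, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(raw);
        if (!m.matches()) {
            return raw; // a plain literal, nothing to resolve
        }
        String name = m.group(1);
        String fallback = m.group(2);
        String value = env.get(name);
        if (value != null) {
            return value; // the environment wins over the default
        }
        if (fallback != null) {
            return fallback; // text after the first colon
        }
        // This sketch fails fast when a required value is missing.
        throw new IllegalStateException("Missing required setting: " + name);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println(resolve("${AI_DASHSCOPE_VL_MODEL_NAME:qwen-vl-max}", env));
        System.out.println(resolve(
                "${AI_DASHSCOPE_VL_BASE_URL:https://dashscope.aliyuncs.com/compatible-mode/v1}", env));
    }
}
```

Only the first colon separates the name from the default, which is why a default that itself contains colons, such as the base URL, still resolves as a whole.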
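Bean configuration
Configure a bean for each of the two models: a ChatModel for image understanding and a WanxImageModel for image generation.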
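import dev.langchain4j.community.model.dashscope.WanxImageModel;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Slf4j
@Data
@Configuration
public class LlmConfig {
    @Value("${ai.dashScope.apiKey}")
    private String dashScopeApiKey;
    @Value("${ai.dashScope.vlModelName}")
    private String dashScopeVlModelName;
    @Value("${ai.dashScope.vlBaseUrl}")
    private String dashScopeVlBaseUrl;
    @Value("${ai.dashScope.wxModelName}")
    private String dashScopeWxModelName;

    /**
     * Qwen VL-Max model (通义千问 VL-Max), used for image understanding.
     * It is called through the OpenAI-compatible endpoint, hence OpenAiChatModel.
     *
     * @return ChatModel
     */
    @Bean
    public ChatModel vlModel() {
        return OpenAiChatModel.builder()
                .apiKey(dashScopeApiKey)
                .modelName(dashScopeVlModelName)
                .baseUrl(dashScopeVlBaseUrl)
                .build();
    }

    /**
     * Tongyi Wanxiang model; this model does not need a baseUrl.<br/>
     * Wanxiang requires the dashscope-sdk-java package, which
     * langchain4j-community-dashscope-spring-boot-starter already pulls in.
     *
     * @return WanxImageModel
     */
    @Bean
    public WanxImageModel wxModel() {
        return WanxImageModel.builder()
                .apiKey(dashScopeApiKey)
                .modelName(dashScopeWxModelName)
                .build();
    }
}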
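Image understanding
Place an image under the resources/images directory and have the model describe it.

Endpoint
Most of the endpoint code was written by Tongyi Lingma (通义灵码).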
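import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.data.message.TextContent;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import java.util.Base64;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ResourceLoader;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@Slf4j
@RestController
@RequiredArgsConstructor
public class ImageModelController {
    private final ChatModel vlModel;
    private final ResourceLoader resourceLoader;

    @GetMapping(value = "/image/call")
    public String readImageContent(@RequestParam(value = "imageName") String imageName) {
        try {
            // Load the image from the resources/images directory
            Resource resource = resourceLoader.getResource("classpath:images/" + imageName);
            // Convert the image to Base64
            byte[] byteArray = resource.getContentAsByteArray();
            String base64Data = Base64.getEncoder().encodeToString(byteArray);
            // Build a UserMessage containing both the text prompt and the image.
            // Prompt: "Describe the actions, appearance, and personal details of the people in the image."
            // Note: the MIME type is hard-coded, so this assumes the image is a PNG.
            UserMessage userMessage = UserMessage.from(
                    TextContent.from("请描述图片中的人物动作,人物形态以及人物个人信息介绍!"),
                    ImageContent.from(base64Data, "image/png")
            );
            // Call the chat method of vlModel
            ChatResponse chatResponse = vlModel.chat(userMessage);
            String imageContent = chatResponse.aiMessage().text();
            log.info("imageContent: {}", imageContent);
            return imageContent;
        } catch (Exception e) {
            log.error("Error while processing the image: ", e);
            return "Error while processing the image: " + e.getMessage();
        }
    }
}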
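The endpoint takes `imageName` straight from the request and always declares the image as `image/png`. A possible refinement is to check the name and derive the MIME type from the file extension before building the message. The sketch below uses only the JDK; the class `ImageInputSketch` and its methods are hypothetical helpers, and both the allowed characters and the list of extensions are assumptions to adjust to your own images.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

class ImageInputSketch {
    // Accepts simple file names only, so values like "../application.yml" are rejected.
    // Assumption: image names use ASCII letters, digits, dot, underscore, and hyphen.
    static boolean isSafeImageName(String imageName) {
        return imageName != null
                && imageName.matches("[A-Za-z0-9._-]+")
                && !imageName.contains("..");
    }

    // Maps the file extension to a MIME type; returns null for unsupported extensions.
    static String mimeTypeFor(String imageName) {
        String lower = imageName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".png")) return "image/png";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        if (lower.endsWith(".webp")) return "image/webp";
        if (lower.endsWith(".gif")) return "image/gif";
        return null;
    }

    // Same Base64 encoding step as in the endpoint.
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        String name = "photo.JPG";
        System.out.println(isSafeImageName(name));
        System.out.println(mimeTypeFor(name));
        System.out.println(toBase64("demo".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

In the endpoint, the value returned by `mimeTypeFor(imageName)` could replace the hard-coded `"image/png"`, with unsafe names or unsupported types rejected before the model is called.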
Test result:
This is the response returned by the endpoint call:

Image generation
Reference documentation:

A simple adaptation
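// SDK classes used below:
//   com.alibaba.dashscope.aigc.imagesynthesis.ImageSynthesis
//   com.alibaba.dashscope.aigc.imagesynthesis.ImageSynthesisParam
//   com.alibaba.dashscope.aigc.imagesynthesis.ImageSynthesisResult
//   com.alibaba.dashscope.exception.ApiException
// Assumes the controller also declares: private final LlmConfig llmConfig;
@GetMapping(value = "/image/wx/call")
public String basicCall(@RequestParam(value = "prompt") String prompt) throws ApiException {
    ImageSynthesisParam param =
            ImageSynthesisParam.builder()
                    .apiKey(llmConfig.getDashScopeApiKey())
                    .model(llmConfig.getDashScopeWxModelName())
                    .prompt(prompt)
                    .n(1)
                    //.size("1328*1328")
                    .build();
    ImageSynthesis imageSynthesis = new ImageSynthesis();
    ImageSynthesisResult result;
    try {
        // Synchronous call: returns once the generation task has finished
        result = imageSynthesis.call(param);
    } catch (Exception e) {
        // Keep the original exception as the cause so the stack trace is not lost
        throw new RuntimeException(e.getMessage(), e);
    }
    return result.toString();
}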
Test
GET http://localhost:9003/image/wx/call?prompt=18岁亚洲少女,齐刘海双马尾,粉色草莓发夹,身穿白色短款卫衣和牛仔背带裤,站在奶油色房间的飘窗上,手里拿着一杯珍珠奶茶.背景左侧有日落灯投下的橘色光斑
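The prompt in this request contains Chinese characters and punctuation, which not every HTTP client encodes automatically. Below is a minimal sketch of sending the same request from Java with the prompt URL-encoded. It assumes the service listens on localhost:9003 as configured earlier; the class `WxCallClientSketch` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class WxCallClientSketch {
    // Builds the request URI with the prompt encoded as a UTF-8 query parameter.
    static URI buildUri(String baseUrl, String prompt) {
        String encoded = URLEncoder.encode(prompt, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/image/wx/call?prompt=" + encoded);
    }

    // Sends the GET request and returns the response body as text.
    static String call(String baseUrl, String prompt) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildUri(baseUrl, prompt)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only prints the encoded URI; use call(...) once the service is running.
        System.out.println(buildUri("http://localhost:9003", "18岁亚洲少女,齐刘海双马尾"));
    }
}
```

Image generation can take a while, so a real client would probably also set a request timeout.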
Response:
ImageSynthesisResult(requestId=31977e99-ded1-9b23-bd45-292be0b47f57,
output=ImageSynthesisOutput(taskId=cb6a3415-9b48-4573-91ba-a2f1bacc7f36, taskStatus=SUCCEEDED, code=null, message=null,
results=[{orig_prompt=18岁亚洲少女,齐刘海双马尾,粉色草莓发夹,身穿白色短款卫衣和牛仔背带裤,站在奶油色房间的飘窗上,手里拿着一杯珍珠奶茶.背景左侧有日落灯投下的橘色光斑,
actual_prompt=日系清新少女写真,18岁亚洲女孩扎齐刘海双马尾,佩戴粉色草莓发夹,身穿白色短款卫衣与牛仔背带裤,俏皮站在奶油色房间的飘窗上,手捧一杯珍珠奶茶。背景左侧洒落日落灯的橘色光斑,营造温暖柔和氛围,画面呈现青春活力与恬静治愈感,近景侧身对角线构图,柔焦光影效果。,
url=https://dashscope-result-wlcb-acdr-1.oss-cn-wulanchabu-acdr-1.aliyuncs.com/1d/d2/20250824/60eed7f2/cb6a3415-9b48-4573-91ba-a2f1bacc7f36546788731.png?Expires=1756134608&OSSAccessKeyId=LTAI5tKPD3TMqf2Lna1fASuh&Signature=WOfz5XYpwXMETVh75Q8UqKadiGg%3D}],
taskMetrics=ImageSynthesisTaskMetrics(total=1, succeeded=1, failed=0)),
usage=ImageSynthesisUsage(imageCount=1))
Copy the url from the response to download the image.
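The copy-and-download step can also be scripted. The sketch below pulls the `url=` value out of the printed result text, reads the `Expires` query parameter (a Unix timestamp in seconds on OSS signed URLs, so the link stops working after that time), and saves the file with the JDK HTTP client. Parsing the `toString()` text is only for illustration, since reading the URL from the result object itself would be more robust. The class `ResultUrlSketch`, its methods, and the sample URL in `main` are made up for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResultUrlSketch {
    // Captures the value after "url=" up to the closing brace of the result entry.
    private static final Pattern URL = Pattern.compile("url\\s*=\\s*(https?://[^}\\s,]+)");
    // Captures the numeric value of the Expires query parameter.
    private static final Pattern EXPIRES = Pattern.compile("[?&]Expires=(\\d+)");

    // Returns the first image URL found in the result text, or null if there is none.
    static String extractUrl(String resultText) {
        Matcher m = URL.matcher(resultText);
        return m.find() ? m.group(1) : null;
    }

    // Returns the expiry time of the signed URL, or null if it has no Expires parameter.
    static Instant expiresAt(String url) {
        Matcher m = EXPIRES.matcher(url);
        return m.find() ? Instant.ofEpochSecond(Long.parseLong(m.group(1))) : null;
    }

    // Downloads the image to the target path and returns that path.
    static Path download(String url, Path target) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<Path> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofFile(target));
        return response.body();
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same shape as the response text.
        String sample = "results=[{orig_prompt=x, actual_prompt=y, "
                + "url=https://example.com/a.png?Expires=1756134608&Signature=abc%3D}]";
        String url = extractUrl(sample);
        System.out.println(url);
        System.out.println(expiresAt(url));
        // To save the file: download(url, Path.of("generated.png"));
    }
}
```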
The generated image is shown below.
