<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LLM Evaluation | YuxiaDing's homepage</title><link>https://yuxiading.github.io/tags/llm-evaluation/</link><atom:link href="https://yuxiading.github.io/tags/llm-evaluation/index.xml" rel="self" type="application/rss+xml"/><description>LLM Evaluation</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Thu, 01 May 2025 00:00:00 +0000</lastBuildDate><image><url>https://yuxiading.github.io/media/icon_hu_982c5d63a71b2961.png</url><title>LLM Evaluation</title><link>https://yuxiading.github.io/tags/llm-evaluation/</link></image><item><title>Construction and Analysis of a Five-Factor Personality Assessment Model for Large Language Models (LLMs)</title><link>https://yuxiading.github.io/projects/llm-personality-assessment/</link><pubDate>Thu, 01 May 2025 00:00:00 +0000</pubDate><guid>https://yuxiading.github.io/projects/llm-personality-assessment/</guid><description>&lt;p&gt;This project develops a multidimensional assessment framework for studying AI agent personality traits in large language models. Based on the Big Five model and personality-oriented prompts inspired by psychological scales such as NEO-PI-R, it quantitatively compares behavioral differences among models including DeepSeek-V3 and Qwen 2.5.&lt;/p&gt;
&lt;p&gt;The analysis uses Python to process model-generated text and applies statistical methods, including correlation analysis, to compare model behavior across personality dimensions. The project provides benchmark-style results for future research on AI agent behavioral consistency and personality modeling.&lt;/p&gt;</description></item></channel></rss>