Education

Economic Statistics · Bachelor's Degree
2023.09 – 2027.06
GPA: 4.01 / 5.00

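Core Courses

Advanced Mathematics (97), Linear Algebra (94), Probability and Mathematical Statistics (92), Econometrics (91), Time Series Analysis (96), Regression Analysis (95), Nonparametric Statistics (94), Multivariate Statistical Analysis (93), Data Science (95), Computer Technology (Python) (91), R Software Fundamentals and Statistical Analysis Applications (100), Finance (94), Western Economics (96)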

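Competition Awards: First Prize (provincial level, Beijing), National College Student Statistical Modeling Competition; Gold Medal, WorldQuant Quantitative Investment Competition

Other Awards: University-level competition scholarship; First Prize (college level), Siyuan Lecture Team Lecture Competition; Second Prize (university level), Military Training Bulletin Board Competition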

Internship Experience

2026.02 – 2026.03

Overview: Contributed to OECD Pillar Two (global minimum tax) compliance projects for large enterprise clients including COFCO Group, Pacific Insurance, China Cinda, and Great Wall Motor. Independently handled the full data workflow: entity equity structure analysis, basic information table preparation, company name translation, and financial data consolidation.

Tool Development: Led the cleansing of equity structure data for 1,258 COFCO Group entities. Inconsistent company names prevented Excel formulas from matching records automatically, so independently developed a Python fuzzy matching program that uses string similarity algorithms to compare entity names and sort the matches into tiers, then generated a matching report for the team and the client to confirm. Building on the same pain points, developed a general-purpose Excel standardization tool (case normalization, whitespace normalization, CJK character cleansing). Both tools have a Flask web interface and are open-sourced on GitHub as reusable solutions: NameLink and Excel-Standardizer.

Equity Look-Through: Carried out equity look-through analysis for nearly 200 entities under the actual control of Pacific Insurance Group; cross-validated records across enterprise information platforms such as Qixinbao and Tianyancha; mapped multi-layer nested equity structures and delivered standardized equity structure information for consolidation.
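The name-matching step described under Tool Development can be illustrated with a minimal sketch. It is not the NameLink implementation: the use of `difflib.SequenceMatcher`, the tier names, the 0.85 review threshold, and the sample company names are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase, trim, and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", name.strip().lower())

def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def classify(score: float, review: float = 0.85) -> str:
    """Assumed tiers: identical after normalization, close enough to review, or no match."""
    if score >= 1.0:
        return "exact"
    if score >= review:
        return "needs_review"
    return "no_match"

def match_names(sources, candidates):
    """For each source name, report the best candidate, its score, and its tier."""
    report = []
    for src in sources:
        best = max(candidates, key=lambda c: similarity(src, c))
        score = similarity(src, best)
        report.append((src, best, round(score, 3), classify(score)))
    return report

if __name__ == "__main__":
    # Hypothetical names, not client data.
    for row in match_names(["Acme  Foods Ltd", "Zeta Grain"],
                           ["ACME Foods Ltd", "Zeta Grain Inc"]):
        print(row)
```

Rows in the middle tier are the ones a human reviewer would confirm, which mirrors the matching report handed to the team and the client.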

Corporate Banking Department
2025.07 – 2025.08

Credit Analysis: Contributed to corporate credit reports and financial due diligence; collected company registration and equity structure data from platforms such as Qichacha; analyzed client financial statements in depth and built a credit scoring model in Excel, supported by several chart-based analyses, to quantify corporate credit risk across multiple dimensions including solvency (current ratio, quick ratio), providing data support for credit approval decisions.

Business Management: Assisted with end-to-end management of corporate banking business; compiled consultation records for 12 categories of high-frequency inquiries, including foreign exchange, into a standardized knowledge base; took part in corporate account opening and helped review opening documents such as business licenses; became familiar with foreign exchange registration and bank confirmation letter procedures.
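The two solvency ratios named under Credit Analysis follow standard definitions. The sketch below only restates those definitions; the function names and the sample figures are hypothetical, and it says nothing about how the Excel scoring model weighted them.

```python
def current_ratio(current_assets: float, current_liabilities: float) -> float:
    """Current assets divided by current liabilities."""
    return current_assets / current_liabilities

def quick_ratio(current_assets: float, inventory: float, current_liabilities: float) -> float:
    """Current assets excluding inventory, divided by current liabilities."""
    return (current_assets - inventory) / current_liabilities

if __name__ == "__main__":
    # Hypothetical balance sheet figures.
    print(current_ratio(200.0, 100.0))
    print(quick_ratio(200.0, 50.0, 100.0))
```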

2024.07 – 2024.08

Gross Margin Benchmarking: Contributed to a cross-unit gross margin benchmarking analysis for a project; used Excel to process operating data of the affiliated construction units in their major markets inside and outside the system; compared gross margin trends of recent years across units (horizontal) and over time (vertical) to identify each unit's competitive strengths by market, providing data support for bidding strategy optimization.

Risk Control Modeling: Assisted in building two financial risk control models and refining their visual analysis. ① Contract asset quality evaluation model: sampled newly added contract assets from 190+ projects and built a weighted composite scoring system over three dimensions (revenue settlement timeliness score, over-contract-amount recognition risk score, project completion status score); cleansed the data in Excel and computed a composite quality score for newly added contract assets, identifying the risk exposure of 35 projects that exceeded their contract amounts. ② Market quality analysis model: integrated data from 4,000+ projects across 21 sub-markets and analyzed key metrics such as project gross margin along multiple dimensions; set up a two-way evaluation mechanism combining horizontal benchmarking across units with a 5-year vertical trend analysis, which identified high-risk market segments and informed the allocation of newly signed contracts.
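The weighted composite score in the contract asset quality model can be sketched as follows. The three dimension names come from the description above; the weights, the 0 to 100 score scale, and the function names are assumptions for illustration, since the actual Excel model's parameters are not stated.

```python
def contract_asset_score(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores; weights need not sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[dim] * weights[dim] for dim in scores) / total

def exceeds_contract(recognized_amount: float, contract_amount: float) -> bool:
    """Flag a project whose recognized amount is above its contract amount."""
    return recognized_amount > contract_amount

if __name__ == "__main__":
    # Hypothetical weights and scores.
    weights = {"settlement_timeliness": 0.4, "over_contract_risk": 0.4, "completion_status": 0.2}
    scores = {"settlement_timeliness": 80, "over_contract_risk": 50, "completion_status": 100}
    print(contract_asset_score(scores, weights))
    print(exceeds_contract(1200.0, 1000.0))
```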

Publications

Orion: Spatiotemporal Neural Networks with Self-Validating Directed Graphs for Traffic and Epidemic Forecasting
First author (corresponding) · KDD 2026 Research Track, Cycle 2 · Under review

Figure 1. Orion framework overview: multi-scale Embedding, N-layer Belt Block (TE-CausGAT + Multihead Attention), and Fusion Block, trained with a three-stage progressive strategy.

Problem Motivation: Existing spatiotemporal GNNs rely on symmetric adjacency matrices and cannot distinguish "who influences whom". A traffic accident propagates congestion in one direction along A→B→C, yet a symmetric graph wrongly assigns weight to C→A. In addition, existing graph learning methods have no mechanism for verifying whether a learned edge weight carries predictive meaning or merely reflects spurious correlation.

Method Design: Proposed the Orion framework. Its core TE-CausGAT module estimates time-varying directed dependency matrices segment by segment, following Granger causality principles, and combines the learned structure with the prior adjacency through Bayesian fusion. Introduced prediction-consistency self-validation: node i is masked as an intervention, the change in node j's prediction Δŷ_j is measured and required to be consistent with the edge weight C_ij, and edges that fail this intervention test are filtered out automatically during training. Architecturally, the Belt Block models hourly, daily, and weekly scales in parallel, and three-stage progressive training avoids the local optima of joint optimization.

Experimental Results: Achieved state-of-the-art results on 5 benchmarks: MAE reduced by 6.60% on METR-LA and by 8.65% on PEMS-08, and by 22.70% and 37.73% on the CA and TX epidemic datasets respectively. On directed graph discovery for the DREAM-3 gene regulatory network, AUC reached 0.6295, surpassing the strongest baseline, CUTS, by 6.42%, which supports the consistency between the learned structure and ground-truth causal relationships.
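The masking intervention in the self-validation step can be illustrated with a toy sketch. This is not the Orion code: the linear predictor stands in for a trained model, and the rule used here (drop an edge when masking its source node leaves the target's prediction unchanged within a tolerance) is a simplified assumption about what "consistent with C_ij" means.

```python
def predict_linear(x, W):
    """Toy predictor: y_j = sum_i x_i * W[i][j]; stands in for a trained model."""
    n = len(W)
    return [sum(x[i] * W[i][j] for i in range(n)) for j in range(n)]

def intervention_effect(predict, samples, i):
    """Mean absolute change in each node's prediction when node i is masked to zero."""
    n = len(samples[0])
    totals = [0.0] * n
    for x in samples:
        masked = list(x)
        masked[i] = 0.0
        base, after = predict(x), predict(masked)
        for j in range(n):
            totals[j] += abs(base[j] - after[j])
    return [t / len(samples) for t in totals]

def filter_spurious_edges(predict, samples, C, tol=1e-6):
    """Zero out C[i][j] when masking node i leaves node j's prediction unchanged."""
    n = len(C)
    validated = [row[:] for row in C]
    for i in range(n):
        effect = intervention_effect(predict, samples, i)
        for j in range(n):
            if effect[j] <= tol:
                validated[i][j] = 0.0
    return validated

if __name__ == "__main__":
    # Hypothetical chain A→B→C: W[i][j] is the true influence of node i on node j.
    W = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
    samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.5]]
    C = [[1.0] * 3 for _ in range(3)]  # a learned graph that includes a spurious C→A edge
    print(filter_spurious_edges(lambda x: predict_linear(x, W), samples, C))
```

In this toy setting the C→A entry is removed because masking C never changes the prediction for A, while the A→B entry survives.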


Projects

IELTS Writing AI Agent: Automated Review and Personal Learning Library
Independent design & development
2026.02 – 2026.03

System Architecture: Independently designed a modular Prompt Engineering Pipeline on top of Claude Code's Skill mechanism. SKILL.md acts as the master dispatch file and chains three independent prompt modules (topic interpretation, essay revision, and webpage generation), which together with the official IELTS band descriptors and a corpus of high-scoring sample essays form a complete skill package. The user places the topic and the original essay in a designated directory, and a single command triggers the Agent to run the full eight-step workflow automatically.

Review Pipeline: The core idea is to build a personalized learning library rather than a generic correction tool. The Agent completes eight steps in order: learn the scoring criteria → study the sample essays → interpret the topic → read the original essay → review and revise → generate files → update the website → deliver the results. Revision follows a four-tier progressive mechanism (grammar correction → vocabulary enhancement → sentence optimization → content suggestions) that makes targeted improvements while preserving the original structure, and every change is annotated with its type and rationale. Scoring follows the four official IELTS criteria (TR/CC/LR/GRA) with a before-and-after comparison, and reusable template sentences are generated from the user's own writing habits and the corpus, balancing review accuracy with teaching value.

Output & Extension: The system automatically generates and incrementally updates a personal learning website (homepage, writing templates page, useful words and sentences page, and others) with interactive features such as revision process visualization, multi-view switching, and category filtering, deployed via GitHub Pages for access from both phone and computer. Once the Task 2 (essay) review system had matured, the whole Pipeline was extended to Task 1 (report) review by bootstrapping, which validated the extensibility of the modular Prompt architecture.

Air China Operating Analysis and Five-Dimensional Competitiveness Evaluation
Core member
2025.11 – 2025.12

Sentiment Data Collection: Reproduced and optimized the open-source crawler EastMoney_Crawler; used Selenium to simulate browser visits to the EastMoney stock forums and CSS selectors to extract fields such as post title, view count, and posting time; combined random delays with stealth.js anti-detection injection to evade anti-crawling measures; stored the data in MongoDB, collecting 25,000 full-year sentiment records for Air China, China Eastern, and China Southern, and exported standardized CSV files after cleansing.

Sentiment Analysis: Built and systematically compared two scoring schemes. ① CFSD financial dictionary: counted positive and negative word frequencies in Excel with LEN/SUBSTITUTE and computed a normalized score. ② FinBERT (bert-base-chinese with financial fine-tuning): ran batch inference on a T4 GPU, extracted the three-class probabilities, and computed a continuous score. Used 6 typical cases to demonstrate quantitatively that FinBERT performs better on negation, sarcasm, and implicit sentiment (for example, the post "恐慌时反着来", roughly "do the opposite when others panic", was misjudged by the dictionary at 25 points and correctly identified by FinBERT at 58.64 points); the FinBERT score was ultimately adopted in the evaluation framework.

Comprehensive Evaluation: Contributed to the five-dimensional competitiveness model ("three financial dimensions + sentiment + ESG"); applied Min-Max normalization and robustness checks under multiple weight sets to quantify the structural differences among the three major airlines in resources, operations, adaptability, market perception, and sustainable development.
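Min-Max normalization combined with a robustness check over several weight sets can be sketched as below. The indicator values, the weight sets, and the assumption that higher is better for every indicator are hypothetical; the actual model's indicators and weights are not given in this document.

```python
def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def composite_scores(indicators, weights):
    """indicators: {dimension: [raw value per company]}; returns one weighted score per company."""
    normalized = {dim: min_max(vals) for dim, vals in indicators.items()}
    total = sum(weights[dim] for dim in indicators)
    n = len(next(iter(indicators.values())))
    return [sum(weights[d] * normalized[d][c] for d in indicators) / total for c in range(n)]

def ranking(scores):
    """Company indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)

def ranking_is_robust(indicators, weight_sets):
    """True if every weight set produces the same ranking."""
    rankings = [ranking(composite_scores(indicators, w)) for w in weight_sets]
    return all(r == rankings[0] for r in rankings)

if __name__ == "__main__":
    # Hypothetical data for three companies and two dimensions.
    indicators = {"resources": [10, 20, 30], "sentiment": [0.2, 0.5, 0.9]}
    weight_sets = [{"resources": 0.5, "sentiment": 0.5}, {"resources": 0.7, "sentiment": 0.3}]
    print(ranking_is_robust(indicators, weight_sets))
```

If the ranking changes between weight sets, the conclusion depends on the weighting choice and should be reported with that caveat.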


WorldQuant Quantitative Investment Competition: Gold Medal
Individual project
2025.08 – Present

WorldQuant Competition: The competition centers on mining and optimizing Alpha factors. Developed 10+ high-quality factors (Sharpe ≥ 1.25, turnover ≤ 70%) covering price-volume features, financial indicators, and alternative data; improved signal quality through feature engineering while controlling transaction costs. Cumulative score to date is 19,538, ranking in the global top 5%; awarded the competition's Gold Medal and the WorldQuant quant consultant qualification.

Factor Validation & Multi-Factor Model Construction: Selected top-performing factors (Sharpe ≥ 2.0) and migrated them to the A-share market; used Tushare to obtain full-market historical data and built a standardized single-factor testing framework (return analysis, IC/IR calculation, layered backtesting) to validate factor effectiveness; combined high-quality factors (Sharpe ≥ 1.0) and, using A-share data from 2018 to the present, built an XGBoost/LightGBM ensemble learning model for non-linear factor fusion.
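The IC/IR part of a single-factor test can be sketched with the standard definitions: the rank IC is the Spearman correlation between factor values and next-period returns in one cross-section, and the IR is the mean IC divided by its standard deviation over time. The function names and the sample numbers are hypothetical, and this stdlib-only version is not the project's framework.

```python
from statistics import mean, stdev

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            out[k] = avg
        pos = end + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def rank_ic(factor_values, forward_returns):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(factor_values), ranks(forward_returns))

def information_ratio(ic_series):
    """Mean IC divided by the sample standard deviation of IC."""
    return mean(ic_series) / stdev(ic_series)

if __name__ == "__main__":
    # Hypothetical cross-section of four stocks.
    print(rank_ic([0.3, 0.1, 0.4, 0.2], [0.02, -0.01, 0.01, 0.00]))
    print(information_ratio([0.05, 0.03, 0.04]))
```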

Weibo Trending Keyword Data Mining Project
Data collection lead
2025.07 – 2025.08

Data Collection: Built an automated pipeline for collecting Weibo keyword corpora with Selenium and ChromeDriver; split each query into 7-day sliding time windows to work around the pagination cap of the search interface; extracted author, timestamp, and post text as structured fields via XPath; collected tens of thousands of records and delivered them in Excel format to support downstream text mining and sentiment analysis modeling.
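The 7-day windowing strategy amounts to splitting a long date range into short consecutive segments, each of which is queried separately so that no single query hits the pagination cap. A minimal sketch, assuming non-overlapping windows with inclusive end dates (the function name and the dates are hypothetical):

```python
from datetime import date, timedelta

def date_windows(start: date, end: date, days: int = 7):
    """Yield consecutive (window_start, window_end) pairs covering start..end inclusive."""
    one_day = timedelta(days=1)
    cur = start
    while cur <= end:
        win_end = min(cur + timedelta(days=days) - one_day, end)
        yield cur, win_end
        cur = win_end + one_day

if __name__ == "__main__":
    for win_start, win_end in date_windows(date(2025, 1, 1), date(2025, 1, 16)):
        print(win_start.isoformat(), win_end.isoformat())
```

The last window is truncated at the end date, so the segments cover the range exactly once.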

National College Student Statistical Modeling Competition: Provincial First Prize
Team leader
2025.03 – 2025.04

Technical Innovation: Proposed a deep learning model for traffic flow prediction; designed a KNN-based dynamic graph attention network and a multi-head self-attention fusion module; built a three-period input with a weight constraint mechanism to capture multi-scale spatiotemporal dependencies. This work laid the foundation for the Orion framework later submitted to KDD 2026.

Experimental Validation: Led the team in processing 5 large-scale traffic datasets, including PeMS (1,000+ monitoring points, 100,000+ time steps); reduced prediction MAE by 15–30% relative to baselines and cut the growth rate of long-horizon prediction error by 50%; won the provincial-level First Prize for Beijing.
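A KNN-based graph can be built by connecting each node to its k nearest neighbors in some feature space. The sketch below shows only that construction step, under assumptions: Euclidean distance on hypothetical per-node feature vectors, a binary adjacency matrix, and no attention layer. It is not the competition model.

```python
import math

def knn_adjacency(features, k):
    """adj[i][j] = 1 if node j is among the k nearest neighbors of node i (self excluded)."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        neighbors = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in neighbors[:k]:
            adj[i][j] = 1
    return adj

if __name__ == "__main__":
    # Hypothetical one-dimensional features for three sensors.
    print(knn_adjacency([[0.0], [1.0], [10.0]], k=1))
```

The resulting matrix is generally asymmetric, because being someone's nearest neighbor is not a mutual relation; recomputing it as the features change over time is what makes the graph dynamic.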


Excel Multilingual Translation Tool
Independent developer
2025.02 – 2025.03

Tool Development: Independently designed and developed an Excel multilingual translation tool (5,000+ lines of Python) that integrates 5 translation engines, including DeepSeek and Claude; implemented cell content classification (formulas, numbers, URLs, and similar content are skipped automatically), JSON batch translation, checkpoint resume, and retry with exponential backoff; built a web interface with Flask + SSE that pushes real-time progress and generates bilingual side-by-side translation reports. The program is open-sourced on GitHub as ExcelTranslator-Pro.
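Retry with exponential backoff doubles the wait after each failed attempt, up to a cap, before trying again. The sketch below is a generic version and not code from ExcelTranslator-Pro: the function name, the default delays, the 10% jitter, and the injectable `sleep` parameter are assumptions for illustration.

```python
import random
import time

def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func until it succeeds; wait base_delay * 2**attempt (capped, plus jitter) between tries."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))  # small jitter spreads out retries

if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_translate():
        """Hypothetical engine call that fails twice before succeeding."""
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    print(retry_with_backoff(flaky_translate, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.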

Stock Price Prediction with a Self-Developed Multimodal Transformer and LLMs
Individual project
2024.09 – Present

Technical Approach: Built a new dataset with web crawlers and designed a new multimodal Transformer architecture; used a pre-trained BERT to process news text and multi-head attention to process six major technical indicators of stock prices; optimized the Encoder-Decoder structure to strengthen the model's ability to capture long-term dependencies.

Project Outcomes: Independently completed model development; in preliminary experiments, prediction accuracy was significantly better than that of traditional models; hyperparameter optimization is in progress, with the goal of a paper at a top-tier conference.

Extracurricular Experience

Siyuan Lecture Team, School of Mathematics and Statistics · Speaker
2024.05 – 2024.10

Content Design: Researched the theoretical meaning and practical significance of "self-reference problems" in depth, reviewing 20+ related sources; designed a 15-minute talk that paired abstract concepts with everyday examples and set out reflections on artificial intelligence; won First Prize (top 5%) in the college's Siyuan Lecture Competition.

Freelance Photography and Entrepreneurship · Lead
2024.01 – 2024.03

Overview: Developed a multi-channel client acquisition plan; set up fixed shooting spots and personalized routes at popular scenic areas such as the Temple of Heaven; independently handled the whole process from client needs analysis to post-production retouching; earned CNY 20,000 in revenue within 2 months, built a portfolio of 500+ quality photos, and reached a 98% positive review rate from clients.

College Photography Department · Photographer
2023.09 – 2024.06

Photography Execution: Shot 30+ events, including university-level sports meets and award ceremonies; proficient with LR, PS, and similar tools, and established a standardized workflow.

Skills / Interests

Languages

IELTS 6.5; CET-6 513; English Test for International Communication (Intermediate): "Good"

Programming & Data Analysis

Python (Pandas / NumPy / PyTorch / Scikit-learn), R, SQL, C; familiar with modeling time-series and graph-structured data; able to independently build an end-to-end ML Pipeline from data cleansing and feature engineering to model tuning.

Tools & Platforms

LaTeX, SPSS, EViews; MS Office (passed the National Computer Rank Examination, Level 2); proficient with AI-assisted development toolchains such as Claude Code and GitHub Copilot; skilled in Prompt Engineering and AI Agent workflows, with experience deploying open-source Agents such as OpenClaw and developing Skills; also practices Vibe Coding to turn new ideas quickly into working prototypes, which significantly improves data processing and development efficiency.

Hobbies

🎵 Hulusi (Grade 10) 🎹 Piano (Grade 5) 🎶 Bamboo flute (Grade 4) 🎨 Oil painting 🖌️ Watercolor 🏸 Badminton 🏀 Basketball 🏓 Table tennis 📸 Photography 🧩 Rubik's Cube