- v2.2.0.patch: Git patch file (56 KB)
- git_patch_deploy_v2.2.sh: automated deployment script
- DEPLOY_v2.2.0.md: complete deployment guide
- DEPLOY_v2.2_CHECKLIST.md: deployment checklist
1700 lines
56 KiB
Diff
From d7d21e19c9b578c8684051539b2ff3664e2c4ca5 Mon Sep 17 00:00:00 2001
From: Jowe <123822645+Selei1983@users.noreply.github.com>
Date: Tue, 30 Dec 2025 22:04:35 +0800
Subject: [PATCH 1/2] release: v2.2.0 - Bocha news search feature
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New features:
- Integrate the Bocha Web Search API to automatically fetch news related to each site
- Add source_name and source_icon fields to the News model
- Improve the news admin interface
- Show news on the site detail page (title, summary, source, link)
- Scheduled-task script supports batch news fetching
- Complete API routes and test script

Technical implementation:
- NewsSearcher utility class wraps the Bocha API
- Smart news search with deduplication
- Database migration script migrate_news_fields.py
- API routes: /api/fetch-site-news and /api/fetch-all-news
- Cron task script: fetch_news_cron.py

Modified files:
- config.py: add Bocha API configuration
- models.py: extend the News model
- app.py: news-fetching routes and NewsAdmin improvements
- templates/detail_new.html: news display UI

New files:
- utils/news_searcher.py (271 lines)
- migrate_news_fields.py (99 lines)
- fetch_news_cron.py (167 lines)
- test_news_feature.py (142 lines)
- NEWS_FEATURE_v2.2.md (408 lines)

Stats: 9 files, 1348 lines added

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---
 NEWS_FEATURE_v2.2.md      | 408 ++++++++++++++++++++++++++++++++++++++
 app.py                    | 214 +++++++++++++++++++-
 config.py                 |  10 +
 fetch_news_cron.py        | 167 ++++++++++++++++
 migrate_news_fields.py    |  99 +++++++++
 models.py                 |   4 +
 templates/detail_new.html |  41 +++-
 test_news_feature.py      | 142 +++++++++++++
 utils/news_searcher.py    | 271 +++++++++++++++++++++++++
 9 files changed, 1348 insertions(+), 8 deletions(-)
 create mode 100644 NEWS_FEATURE_v2.2.md
 create mode 100644 fetch_news_cron.py
 create mode 100644 migrate_news_fields.py
 create mode 100644 test_news_feature.py
 create mode 100644 utils/news_searcher.py

diff --git a/NEWS_FEATURE_v2.2.md b/NEWS_FEATURE_v2.2.md
new file mode 100644
index 0000000..c16bbe0
--- /dev/null
+++ b/NEWS_FEATURE_v2.2.md
@@ -0,0 +1,408 @@
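+# ZJPB v2.2.0 - News Search Feature
+
+**Version**: v2.2.0
+**Release date**: 2025-01-30
+**Main feature**: Integrates the Bocha Web Search API to automatically fetch news related to each site
+
+---
+
+## 📋 Feature Overview
+
+v2.2.0 introduces a new news search and display feature. It uses the Bocha AI search engine API to automatically fetch the latest related news for each site and shows it on the site detail page.
+
+### ✨ Core Features
+
+1. **Smart news search** 🔍
+   - Automatically searches for related news based on the site name
+   - Supports a configurable search time range (one day, one week, one month, one year)
+   - Automatically excludes the site's own content to keep news sources diverse
+
+2. **News management** 📰
+   - News list management in the admin backend
+   - Manual editing and deletion of news
+   - Shows news source information (site name, icon)
+
+3. **Front-end news display** 🎨
+   - The site detail page shows up to 5 related news items
+   - Shows the news title, summary, source, and publication time
+   - Clicking a news item opens the original article
+
+4. **Scheduled task support** ⏰
+   - Provides a cron task script for periodic batch news fetching
+   - Configurable fetch count, time range, and other parameters
+
+---
+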
+## 🚀 Quick Start
+
+### 1. Configure the Bocha API
+
+Add the Bocha API configuration to the `.env` file:
+
+```env
+# Bocha Web Search API configuration
+BOCHA_API_KEY=sk-your-api-key-here
+BOCHA_BASE_URL=https://api.bocha.cn
+```
+
+To get an API key, register on the [Bocha AI Open Platform](https://open.bocha.cn).
+
+### 2. Database Migration
+
+Run the migration script to add the new fields to the News table:
+
+```bash
+python migrate_news_fields.py
+```
+
+The migration adds the following fields:
+- `source_name`: name of the news source site
+- `source_icon`: icon URL of the news source site
+
+### 3. Test News Fetching
+
+#### Method 1: Use the Python test script
+
+```bash
+# Test the NewsSearcher class
+python utils/news_searcher.py
+```
+
+#### Method 2: Use the admin API
+
+After logging in to the admin backend, call the API endpoints:
+
+**Fetch news for a single site**:
+```bash
+curl -X POST http://localhost:5000/api/fetch-site-news \
+  -H "Content-Type: application/json" \
+  -d '{
+    "site_id": 1,
+    "count": 10,
+    "freshness": "oneMonth"
+  }'
+```
+
+**Fetch news in batch**:
+```bash
+curl -X POST http://localhost:5000/api/fetch-all-news \
+  -H "Content-Type: application/json" \
+  -d '{
+    "count": 5,
+    "freshness": "oneMonth",
+    "limit": 10
+  }'
+```
+
+### 4. View the News
+
+1. Open any site detail page (for example: `http://localhost:5000/site/12345678`)
+2. Scroll down to the "Related News" section
+3. Click a news title or the "Read more" link to open the original article
+
+---
+
+## 📖 Usage Guide
+
+### Admin Backend
+
+#### News management interface
+
+1. Log in to the admin backend: `http://localhost:5000/admin`
+2. Click "News Management" in the left-hand menu
+3. View, edit, or delete fetched news
+
+**News list fields**:
+- ID
+- Related site
+- News title
+- Source site
+- News type
+- Publication time
+- Enabled
+
+#### Fetching news manually
+
+API endpoints are provided, but there is currently no button for them in the admin UI. Fetching can be triggered in either of these ways:
+
+1. Call the API endpoints (see Quick Start)
+2. Run the scheduled task script (see below)
+
+### Scheduled Task Setup
+
+Use the `fetch_news_cron.py` script to fetch news automatically on a schedule.
+
+#### Manual run
+
+```bash
+# Default parameters: process 10 sites, fetch 5 news items per site, time range of one month
+python fetch_news_cron.py
+
+# Custom parameters
+python fetch_news_cron.py --limit 20 --count 10 --freshness oneWeek
+```
+
+**Parameters**:
+- `--limit`: maximum number of sites to process (default: 10)
+- `--count`: number of news items to fetch per site (default: 5)
+- `--freshness`: news time range (choices: noLimit, oneDay, oneWeek, oneMonth, oneYear; default: oneMonth)
+
+#### Configure crontab
+
+Set up the scheduled task on a Linux server:
+
+```bash
+# Edit the crontab
+crontab -e
+
+# Add the following line (runs every day at 8:00)
+0 8 * * * cd /opt/1panel/apps/zjpb && /opt/1panel/apps/zjpb/venv/bin/python fetch_news_cron.py --limit 10 >> logs/news_fetch.log 2>&1
+
+# Or run every 6 hours
+0 */6 * * * cd /opt/1panel/apps/zjpb && /opt/1panel/apps/zjpb/venv/bin/python fetch_news_cron.py --limit 20 >> logs/news_fetch.log 2>&1
+```
+
+**Notes**:
+- Make sure the `logs` directory exists: `mkdir -p logs`
+- Change the paths to the actual project path
+- Choose the run frequency according to your API quota
+
+---
+
+## 🗂️ File Structure
+
+### New files
+
+```
+zjpb/
+├── utils/
+│   └── news_searcher.py          # Bocha API wrapper class
+├── migrate_news_fields.py        # Database migration script
+├── fetch_news_cron.py            # Scheduled task script
+└── NEWS_FEATURE_v2.2.md          # This document
+```
+
+### Modified files
+
+```
+zjpb/
+├── config.py                     # Adds Bocha API configuration
+├── models.py                     # Adds source_name/source_icon fields to the News model
+├── app.py                        # Adds news-fetching API routes, updates NewsAdmin
+└── templates/
+    └── detail_new.html           # Improves the news display UI
+```
+
+---
+
+## 🔧 Technical Implementation
+
+### API Integration
+
+News search uses the Bocha Web Search API:
+
+- **Endpoint**: `https://api.bocha.cn/v1/web-search`
+- **Authentication**: Bearer token
+- **Request method**: POST
+- **Response format**: JSON
+
+### Data Model
+
+News model fields:
+
+| Field | Type | Description |
+|------|------|------|
+| id | Integer | Primary key |
+| site_id | Integer | Related site ID (foreign key) |
+| title | String(200) | News title |
+| content | Text | News content/summary |
+| news_type | String(50) | News type |
+| url | String(500) | News link |
+| source_name | String(100) | ⭐ Source site name (new) |
+| source_icon | String(500) | ⭐ Source site icon URL (new) |
+| published_at | DateTime | Publication time |
+| is_active | Boolean | Enabled |
+| created_at | DateTime | Creation time |
+| updated_at | DateTime | Update time |
+
+### API Routes
+
+**1. Fetch news for a single site**
+
+- **Path**: `/api/fetch-site-news`
+- **Method**: POST
+- **Auth**: login required
+- **Parameters**:
+  ```json
+  {
+    "site_id": 1,
+    "count": 10,
+    "freshness": "oneMonth"
+  }
+  ```
+
+**2. Fetch news in batch**
+
+- **Path**: `/api/fetch-all-news`
+- **Method**: POST
+- **Auth**: login required
+- **Parameters**:
+  ```json
+  {
+    "count": 5,
+    "freshness": "oneMonth",
+    "limit": 10
+  }
+  ```
+
+---
+
+## 📊 Configuration Options
+
+News-related settings in `config.py`:
+
+```python
+# Bocha Web Search API configuration
+BOCHA_API_KEY = os.environ.get('BOCHA_API_KEY')
+BOCHA_BASE_URL = os.environ.get('BOCHA_BASE_URL') or 'https://api.bocha.cn'
+BOCHA_SEARCH_ENDPOINT = '/v1/web-search'
+
+# News search configuration
+NEWS_SEARCH_COUNT = 10  # Number of news items returned per search
+NEWS_SEARCH_FRESHNESS = 'oneMonth'  # Search news from the past month by default
+NEWS_SEARCH_SUMMARY = True  # Whether to show summaries
+```
+
+---
+
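+## ⚠️ Notes
+
+### API quota limits
+
+- The Bocha API rate limit depends on the amount topped up on the account
+- Set a sensible schedule frequency to avoid burning through the quota
+- For details, see [Bocha API pricing](https://open.bocha.cn)
+
+### Deduplication
+
+News is deduplicated by URL per site, so the same article is not saved twice for a given site.
+
+### Search strategy
+
+- Search keywords: `{site name} 最新 新闻` (the site name followed by the Chinese words for "latest" and "news")
+- Automatic exclusion: content from the site's own domain
+- Recency first: the most recently published news is shown first
+
+### Performance
+
+- Limit the batch size when fetching in bulk (`--limit 10-20`)
+- Avoid calling the API repeatedly within a short time
+- Database queries are optimized with indexes and deduplication
+
+---
+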
+## 🐛 Troubleshooting
+
+### 1. News fetching fails
+
+**Possible causes**:
+- The Bocha API key is missing or invalid
+- The API quota is exhausted
+- Network connectivity problems
+
+**Fix**:
+```bash
+# Check the environment variable
+python -c "import os; from dotenv import load_dotenv; load_dotenv(); print(os.getenv('BOCHA_API_KEY'))"
+
+# Test the API connection
+python utils/news_searcher.py
+```
+
+### 2. Database column does not exist
+
+**Error message**: `Unknown column 'source_name' in 'field list'`
+
+**Fix**:
+```bash
+# Run the database migration
+python migrate_news_fields.py
+```
+
+### 3. The detail page shows no news
+
+**Possible causes**:
+- The database has no news records for the site
+- The news items are disabled (is_active=False)
+
+**Fix**:
+```bash
+# Fetch news for the site
+curl -X POST http://localhost:5000/api/fetch-site-news \
+  -H "Content-Type: application/json" \
+  -d '{"site_id": YOUR_SITE_ID}'
+```
+
+### 4. The scheduled task does not run
+
+**Checklist**:
+- [ ] Is the crontab entry correct?
+- [ ] Is the Python path correct?
+- [ ] Is the log file writable?
+- [ ] Check the cron log: `grep CRON /var/log/syslog`
+
+---
+
+## 📈 Future Improvements
+
+### Planned features
+
+- [ ] Admin UI button to trigger news fetching directly
+- [ ] News categories and tags
+- [ ] AI-improved news summaries
+- [ ] Sorting news by popularity
+- [ ] Letting users bookmark news
+- [ ] News RSS feed
+
+### Performance
+
+- [ ] Use an asynchronous task queue (Celery)
+- [ ] News caching
+- [ ] CDN acceleration for images
+
+---
+
+## 📞 Technical Support
+
+- **Project name**: ZJPB - 焦提示词 | AI工具导航 (AI tool directory)
+- **Version**: v2.2.0
+- **Release date**: 2025-01-30
+- **Bocha API documentation**: https://bocha-ai.feishu.cn/wiki/RXEOw02rFiwzGSkd9mUcqoeAnNK
+
+---
+
+## 📝 Changelog
+
+### v2.2.0 (2025-01-30)
+
+**Added**:
+- ✨ Bocha Web Search API integration
+- ✨ Automatic news search and storage
+- ✨ source_name and source_icon fields on the News model
+- ✨ Improved news display on the site detail page
+- ✨ Enhanced news admin interface
+- ✨ Scheduled task script (fetch_news_cron.py)
+- ✨ API routes: /api/fetch-site-news and /api/fetch-all-news
+
+**Changed**:
+- 🔧 config.py adds Bocha API configuration
+- 🔧 NewsAdmin shows the source_name field
+- 🔧 detail_new.html improves the news display UI
+
+**Docs**:
+- 📖 NEWS_FEATURE_v2.2.md feature documentation
+- 📖 migrate_news_fields.py migration script documentation
+
+---
+
+**Enjoy!** 🎉
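Reviewer note: the feature document above lists the endpoint, Bearer-token authentication, and POST method of the Bocha Web Search API, but no request is shown. The following is a minimal sketch of how such a request could be assembled with the standard library. The function name `build_search_request` and the JSON field names (`query`, `count`, `freshness`, `summary`) are assumptions for illustration; the actual `NewsSearcher` implementation is not visible here.

```python
import json
import urllib.request

# Values taken from the config.py settings in this patch.
BOCHA_BASE_URL = "https://api.bocha.cn"
BOCHA_SEARCH_ENDPOINT = "/v1/web-search"


def build_search_request(api_key, query, count=10, freshness="oneMonth", summary=True):
    """Build (but do not send) a POST request for the web-search endpoint.

    Hypothetical helper: the payload field names are assumed, not confirmed
    by the patch.
    """
    payload = {
        "query": query,
        "count": count,
        "freshness": freshness,
        "summary": summary,
    }
    return urllib.request.Request(
        BOCHA_BASE_URL + BOCHA_SEARCH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer-token authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Query follows the documented pattern: site name + "最新 新闻".
req = build_search_request("sk-your-api-key-here", "ExampleSite 最新 新闻", count=5)
```

Passing `req` to `urllib.request.urlopen` would send it; building the request separately keeps the sketch runnable without network access or a real key.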
diff --git a/app.py b/app.py
index 7a2e392..b0f27d4 100644
--- a/app.py
+++ b/app.py
@@ -9,6 +9,7 @@ from config import config
 from models import db, Site, Tag, Admin as AdminModel, News, site_tags, PromptTemplate
 from utils.website_fetcher import WebsiteFetcher
 from utils.tag_generator import TagGenerator
+from utils.news_searcher import NewsSearcher
 
 def create_app(config_name='default'):
     """Application factory function"""
@@ -442,6 +443,205 @@ def create_app(config_name='default'):
                 'message': f'Generation failed: {str(e)}'
             }), 500
 
+    # ========== News fetching routes ==========
+    @app.route('/api/fetch-site-news', methods=['POST'])
+    @login_required
+    def fetch_site_news():
+        """Fetch the latest news for a given site"""
+        try:
+            data = request.get_json()
+            site_id = data.get('site_id')
+            count = data.get('count', app.config.get('NEWS_SEARCH_COUNT', 10))
+            freshness = data.get('freshness', app.config.get('NEWS_SEARCH_FRESHNESS', 'oneMonth'))
+
+            if not site_id:
+                return jsonify({
+                    'success': False,
+                    'message': 'Please provide a site ID'
+                }), 400
+
+            # Look up the site
+            site = Site.query.get(site_id)
+            if not site:
+                return jsonify({
+                    'success': False,
+                    'message': 'Site not found'
+                }), 404
+
+            # Check the Bocha API configuration
+            api_key = app.config.get('BOCHA_API_KEY')
+            if not api_key:
+                return jsonify({
+                    'success': False,
+                    'message': 'Bocha API is not configured; set BOCHA_API_KEY in the .env file'
+                }), 500
+
+            # Create the news searcher
+            searcher = NewsSearcher(api_key)
+
+            # Search for news
+            news_items = searcher.search_site_news(
+                site_name=site.name,
+                site_url=site.url,
+                count=count,
+                freshness=freshness
+            )
+
+            if not news_items:
+                return jsonify({
+                    'success': False,
+                    'message': 'No related news found'
+                }), 404
+
+            # Save the news to the database
+            saved_count = 0
+            for item in news_items:
+                # Check whether the news already exists (matched by URL)
+                existing_news = News.query.filter_by(
+                    site_id=site_id,
+                    url=item['url']
+                ).first()
+
+                if not existing_news:
+                    # Create the news record
+                    news = News(
+                        site_id=site_id,
+                        title=item['title'],
+                        content=item.get('summary') or item.get('snippet', ''),
+                        url=item['url'],
+                        source_name=item.get('site_name', ''),
+                        source_icon=item.get('site_icon', ''),
+                        published_at=item.get('published_at'),
+                        news_type='Search Result',
+                        is_active=True
+                    )
+                    db.session.add(news)
+                    saved_count += 1
+
+            # Commit the transaction
+            db.session.commit()
+
+            return jsonify({
+                'success': True,
+                'message': f'Fetched and saved {saved_count} news items',
+                'total_found': len(news_items),
+                'saved': saved_count,
+                'news_items': searcher.format_news_for_display(news_items)
+            })
+
+        except Exception as e:
+            db.session.rollback()
+            return jsonify({
+                'success': False,
+                'message': f'Fetch failed: {str(e)}'
+            }), 500
+
+    @app.route('/api/fetch-all-news', methods=['POST'])
+    @login_required
+    def fetch_all_news():
+        """Fetch news for all sites in batch"""
+        try:
+            data = request.get_json()
+            count_per_site = data.get('count', 5)  # Number of news items to fetch per site
+            freshness = data.get('freshness', app.config.get('NEWS_SEARCH_FRESHNESS', 'oneMonth'))
+            limit = data.get('limit', 10)  # Maximum number of sites to process
+
+            # Check the Bocha API configuration
+            api_key = app.config.get('BOCHA_API_KEY')
+            if not api_key:
+                return jsonify({
+                    'success': False,
+                    'message': 'Bocha API is not configured; set BOCHA_API_KEY in the .env file'
+                }), 500
+
+            # Get active sites (ordered by update time, oldest first)
+            sites = Site.query.filter_by(is_active=True).order_by(Site.updated_at).limit(limit).all()
+
+            if not sites:
+                return jsonify({
+                    'success': False,
+                    'message': 'No sites available'
+                }), 404
+
+            # Create the news searcher
+            searcher = NewsSearcher(api_key)
+
+            # Counters
+            total_saved = 0
+            total_found = 0
+            processed_sites = []
+
+            # Fetch news for each site
+            for site in sites:
+                try:
+                    # Search for news
+                    news_items = searcher.search_site_news(
+                        site_name=site.name,
+                        site_url=site.url,
+                        count=count_per_site,
+                        freshness=freshness
+                    )
+
+                    site_saved = 0
+                    for item in news_items:
+                        # Check whether it already exists
+                        existing_news = News.query.filter_by(
+                            site_id=site.id,
+                            url=item['url']
+                        ).first()
+
+                        if not existing_news:
+                            news = News(
+                                site_id=site.id,
+                                title=item['title'],
+                                content=item.get('summary') or item.get('snippet', ''),
+                                url=item['url'],
+                                source_name=item.get('site_name', ''),
+                                source_icon=item.get('site_icon', ''),
+                                published_at=item.get('published_at'),
+                                news_type='Search Result',
+                                is_active=True
+                            )
+                            db.session.add(news)
+                            site_saved += 1
+
+                    total_found += len(news_items)
+                    total_saved += site_saved
+
+                    processed_sites.append({
+                        'id': site.id,
+                        'name': site.name,
+                        'found': len(news_items),
+                        'saved': site_saved
+                    })
+
+                except Exception as e:
+                    # A failure on one site does not affect the others
+                    processed_sites.append({
+                        'id': site.id,
+                        'name': site.name,
+                        'error': str(e)
+                    })
+                    continue
+
+            # Commit the transaction
+            db.session.commit()
+
+            return jsonify({
+                'success': True,
+                'message': f'Batch fetch finished, processed {len(processed_sites)} sites',
+                'total_found': total_found,
+                'total_saved': total_saved,
+                'processed_sites': processed_sites
+            })
+
+        except Exception as e:
+            db.session.rollback()
+            return jsonify({
+                'success': False,
+                'message': f'Batch fetch failed: {str(e)}'
+            }), 500
+
     # ========== Batch import routes ==========
     @app.route('/admin/batch-import', methods=['GET', 'POST'])
     @login_required
@@ -908,9 +1108,9 @@ def create_app(config_name='default'):
         # Show the actions column
         column_display_actions = True
 
-        column_list = ['id', 'site', 'title', 'news_type', 'published_at', 'is_active']
-        column_searchable_list = ['title', 'content']
-        column_filters = ['site', 'news_type', 'is_active', 'published_at']
+        column_list = ['id', 'site', 'title', 'source_name', 'news_type', 'published_at', 'is_active']
+        column_searchable_list = ['title', 'content', 'source_name']
+        column_filters = ['site', 'news_type', 'source_name', 'is_active', 'published_at']
         column_labels = {
             'id': 'ID',
             'site': 'Related site',
@@ -918,16 +1118,19 @@ def create_app(config_name='default'):
             'content': 'News content',
             'news_type': 'News type',
             'url': 'News link',
+            'source_name': 'Source site',
+            'source_icon': 'Source icon',
             'published_at': 'Publication time',
             'is_active': 'Enabled',
             'created_at': 'Creation time',
             'updated_at': 'Update time'
         }
-        form_columns = ['site', 'title', 'content', 'news_type', 'url', 'published_at', 'is_active']
+        form_columns = ['site', 'title', 'content', 'news_type', 'url', 'source_name', 'source_icon', 'published_at', 'is_active']
 
         # Available news types
         form_choices = {
             'news_type': [
+                ('Search Result', 'Search Result'),
                 ('Product Update', 'Product Update'),
                 ('Industry News', 'Industry News'),
                 ('Company News', 'Company News'),
@@ -935,6 +1138,9 @@ def create_app(config_name='default'):
             ]
         }
 
+        # Default sort order
+        column_default_sort = ('published_at', True)  # Sort by publication time, newest first
+
     # Prompt template admin view
     class PromptAdmin(SecureModelView):
         can_edit = True
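Reviewer note: both routes above check for an existing row by `(site_id, url)` once per search result. As a hedged illustration of the same URL-based deduplication, the sketch below filters a batch against a set of already-stored URLs in memory. `select_new_items` is a hypothetical helper, not part of this patch; the caller is assumed to load the stored URLs for the site with a single query.

```python
def select_new_items(items, existing_urls):
    """Return the items whose URL is neither stored already nor repeated in this batch."""
    seen = set(existing_urls)
    fresh = []
    for item in items:
        url = item["url"]
        if url in seen:
            continue  # already saved for this site, or a duplicate within the batch
        seen.add(url)
        fresh.append(item)
    return fresh


stored = {"https://news.example.org/a"}
batch = [
    {"url": "https://news.example.org/a", "title": "already stored"},
    {"url": "https://news.example.org/b", "title": "new"},
    {"url": "https://news.example.org/b", "title": "repeated in batch"},
]
new_items = select_new_items(batch, stored)
```

This trades one lookup per item for one lookup per site, which may matter for larger `count` values; the per-item query in the patch is simpler and behaves the same for small batches.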
diff --git a/config.py b/config.py
index 9cde07b..7a44182 100644
--- a/config.py
+++ b/config.py
@@ -46,6 +46,16 @@ class Config:
     DEEPSEEK_API_KEY = os.environ.get('DEEPSEEK_API_KEY')
     DEEPSEEK_BASE_URL = os.environ.get('DEEPSEEK_BASE_URL') or 'https://api.deepseek.com'
 
+    # Bocha Web Search API configuration
+    BOCHA_API_KEY = os.environ.get('BOCHA_API_KEY')
+    BOCHA_BASE_URL = os.environ.get('BOCHA_BASE_URL') or 'https://api.bocha.cn'
+    BOCHA_SEARCH_ENDPOINT = '/v1/web-search'
+
+    # News search configuration
+    NEWS_SEARCH_COUNT = 10  # Number of news items returned per search
+    NEWS_SEARCH_FRESHNESS = 'oneMonth'  # Search news from the past month by default
+    NEWS_SEARCH_SUMMARY = True  # Whether to show summaries
+
 class DevelopmentConfig(Config):
     """Development configuration"""
     DEBUG = True
diff --git a/fetch_news_cron.py b/fetch_news_cron.py
new file mode 100644
index 0000000..9e6b092
--- /dev/null
+++ b/fetch_news_cron.py
@@ -0,0 +1,167 @@
+"""
+Scheduled news-fetching task script
+Purpose: periodically fetch the latest news for sites in batch
+Usage: python fetch_news_cron.py [options]
+
+Can be run on a schedule via crontab:
+# Run every day at 8:00 and fetch news for 10 sites
+0 8 * * * cd /path/to/zjpb && /path/to/venv/bin/python fetch_news_cron.py --limit 10 >> logs/news_fetch.log 2>&1
+"""
+import os
+import sys
+import argparse
+from datetime import datetime
+from dotenv import load_dotenv
+
+# Load environment variables
+load_dotenv()
+
+# Add the project root to the Python path
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from app import create_app
+from models import db, Site, News
+from utils.news_searcher import NewsSearcher
+
+
+def fetch_news_for_sites(limit=10, count_per_site=5, freshness='oneMonth'):
+    """
+    Fetch news for sites in batch
+
+    Args:
+        limit: maximum number of sites to process
+        count_per_site: number of news items to fetch per site
+        freshness: news time range
+    """
+    # Create the Flask application context
+    app = create_app(os.getenv('FLASK_ENV', 'production'))
+
+    with app.app_context():
+        # Check the Bocha API configuration
+        api_key = app.config.get('BOCHA_API_KEY')
+        if not api_key:
+            print(f"[{datetime.now()}] Error: BOCHA_API_KEY is not configured")
+            return False
+
+        # Get active sites (ordered by update time, oldest first)
+        sites = Site.query.filter_by(is_active=True).order_by(Site.updated_at).limit(limit).all()
+
+        if not sites:
+            print(f"[{datetime.now()}] No sites to process")
+            return False
+
+        print(f"[{datetime.now()}] Starting batch news fetch for {len(sites)} sites")
+        print(f"Settings: {count_per_site} news items per site, time range: {freshness}")
+        print("-" * 60)
+
+        # Create the news searcher
+        searcher = NewsSearcher(api_key)
+
+        # Counters
+        total_saved = 0
+        total_found = 0
+        success_count = 0
+        error_count = 0
+
+        # Fetch news for each site
+        for i, site in enumerate(sites, 1):
+            print(f"[{i}/{len(sites)}] Processing site: {site.name}")
+
+            try:
+                # Search for news
+                news_items = searcher.search_site_news(
+                    site_name=site.name,
+                    site_url=site.url,
+                    count=count_per_site,
+                    freshness=freshness
+                )
+
+                if not news_items:
+                    print(f"  └─ No news found")
+                    continue
+
+                site_saved = 0
+                for item in news_items:
+                    # Check whether it already exists
+                    existing_news = News.query.filter_by(
+                        site_id=site.id,
+                        url=item['url']
+                    ).first()
+
+                    if not existing_news:
+                        news = News(
+                            site_id=site.id,
+                            title=item['title'],
+                            content=item.get('summary') or item.get('snippet', ''),
+                            url=item['url'],
+                            source_name=item.get('site_name', ''),
+                            source_icon=item.get('site_icon', ''),
+                            published_at=item.get('published_at'),
+                            news_type='Search Result',
+                            is_active=True
+                        )
+                        db.session.add(news)
+                        site_saved += 1
+
+                # Commit this site's news
+                db.session.commit()
+
+                total_found += len(news_items)
+                total_saved += site_saved
+                success_count += 1
+
+                print(f"  └─ Found {len(news_items)}, saved {site_saved} news items")
+
+            except Exception as e:
+                error_count += 1
+                print(f"  └─ Error: {str(e)}")
+                db.session.rollback()
+                continue
+
+        print("-" * 60)
+        print(f"[{datetime.now()}] Batch fetch finished")
+        print(f"Succeeded: {success_count} sites, failed: {error_count} sites")
+        print(f"Found {total_found} news items in total, saved {total_saved} new items")
+        print("=" * 60)
+
+        return True
+
+
+def main():
+    """Entry point"""
+    parser = argparse.ArgumentParser(description='Scheduled news-fetching task')
+    parser.add_argument('--limit', type=int, default=10, help='Maximum number of sites to process (default: 10)')
+    parser.add_argument('--count', type=int, default=5, help='Number of news items to fetch per site (default: 5)')
+    parser.add_argument('--freshness', type=str, default='oneMonth',
+                        choices=['noLimit', 'oneDay', 'oneWeek', 'oneMonth', 'oneYear'],
+                        help='News time range (default: oneMonth)')
+
+    args = parser.parse_args()
+
+    print("=" * 60)
+    print(f"Scheduled news-fetching task - started at: {datetime.now()}")
+    print("=" * 60)
+
+    try:
+        success = fetch_news_for_sites(
+            limit=args.limit,
+            count_per_site=args.count,
+            freshness=args.freshness
+        )
+
+        if success:
+            print(f"\nTask completed successfully!")
+            sys.exit(0)
+        else:
+            print(f"\nTask failed!")
+            sys.exit(1)
+
+    except Exception as e:
+        print(f"\n[{datetime.now()}] Fatal error: {str(e)}")
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
+
+
+if __name__ == '__main__':
+    main()
diff --git a/migrate_news_fields.py b/migrate_news_fields.py
new file mode 100644
index 0000000..5020516
--- /dev/null
+++ b/migrate_news_fields.py
@@ -0,0 +1,99 @@
+"""
+Database migration script - adds the source_name and source_icon fields to the News table
+Version: v2.2.0
+Date: 2025-01-30
+"""
+import pymysql
+import os
+from dotenv import load_dotenv
+
+# Load environment variables
+load_dotenv()
+
+def migrate():
+    """Run the database migration"""
+    # Database configuration
+    db_config = {
+        'host': os.environ.get('DB_HOST', 'localhost'),
+        'port': int(os.environ.get('DB_PORT', 3306)),
+        'user': os.environ.get('DB_USER', 'root'),
+        'password': os.environ.get('DB_PASSWORD', ''),
+        'database': os.environ.get('DB_NAME', 'ai_nav'),
+        'charset': 'utf8mb4'
+    }
+
+    try:
+        # Connect to the database
+        connection = pymysql.connect(**db_config)
+        cursor = connection.cursor()
+
+        print("=" * 60)
+        print("Starting database migration v2.2.0")
+        print("=" * 60)
+
+        # Check whether the columns already exist
+        cursor.execute("""
+            SELECT COLUMN_NAME
+            FROM INFORMATION_SCHEMA.COLUMNS
+            WHERE TABLE_SCHEMA = %s
+            AND TABLE_NAME = 'news'
+            AND COLUMN_NAME IN ('source_name', 'source_icon')
+        """, (db_config['database'],))
+
+        existing_columns = [row[0] for row in cursor.fetchall()]
+
+        # Add the source_name column
+        if 'source_name' not in existing_columns:
+            print("\n1. Adding the source_name column...")
+            cursor.execute("""
+                ALTER TABLE news
+                ADD COLUMN source_name VARCHAR(100)
+                COMMENT 'News source site name'
+                AFTER url
+            """)
+            print(">>> source_name column added")
+        else:
+            print("\n1. source_name column already exists, skipping")
+
+        # Add the source_icon column
+        if 'source_icon' not in existing_columns:
+            print("\n2. Adding the source_icon column...")
+            cursor.execute("""
+                ALTER TABLE news
+                ADD COLUMN source_icon VARCHAR(500)
+                COMMENT 'News source site icon URL'
+                AFTER source_name
+            """)
+            print(">>> source_icon column added")
+        else:
+            print("\n2. source_icon column already exists, skipping")
+
+        # Commit the transaction
+        connection.commit()
+
+        print("\n" + "=" * 60)
+        print(">>> Database migration complete!")
+        print("=" * 60)
+
+        # Show the table structure
+        print("\nCurrent news table structure:")
+        cursor.execute("DESCRIBE news")
+        for row in cursor.fetchall():
+            print(f"  - {row[0]}: {row[1]} {row[2]}")
+
+    except Exception as e:
+        print(f"\n>>> Migration failed: {str(e)}")
+        if 'connection' in locals():
+            connection.rollback()
+        raise
+
+    finally:
+        if 'cursor' in locals():
+            cursor.close()
+        if 'connection' in locals():
+            connection.close()
+            print("\nDatabase connection closed")
+
+
+if __name__ == '__main__':
+    migrate()
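Reviewer note: the migration above is idempotent because it checks which columns exist before issuing `ALTER TABLE`. The sketch below shows the same check-then-alter pattern against an in-memory SQLite database so it can run without a MySQL server. SQLite's `PRAGMA table_info` stands in for the `INFORMATION_SCHEMA.COLUMNS` query, and `add_missing_columns` is a hypothetical helper, not code from this patch.

```python
import sqlite3

# Column names and types mirror the two fields added by the migration.
NEW_COLUMNS = {"source_name": "VARCHAR(100)", "source_icon": "VARCHAR(500)"}


def add_missing_columns(conn, table="news"):
    """Add any of NEW_COLUMNS that the table lacks; return the names added."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, url VARCHAR(500))")
first_run = add_missing_columns(conn)   # adds both columns
second_run = add_missing_columns(conn)  # nothing left to add
```

Running the helper twice mirrors running `python migrate_news_fields.py` twice: the second run finds both columns and skips them.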
diff --git a/models.py b/models.py
index fae1c97..6de887f 100644
--- a/models.py
+++ b/models.py
@@ -90,6 +90,8 @@ class News(db.Model):
     content = db.Column(db.Text, comment='News content')
     news_type = db.Column(db.String(50), default='Industry News', comment='News type')
     url = db.Column(db.String(500), comment='News link')
+    source_name = db.Column(db.String(100), comment='News source site name')
+    source_icon = db.Column(db.String(500), comment='News source site icon URL')
     published_at = db.Column(db.DateTime, default=datetime.now, comment='Publication time')
     is_active = db.Column(db.Boolean, default=True, comment='Enabled')
     created_at = db.Column(db.DateTime, default=datetime.now, comment='Creation time')
@@ -110,6 +112,8 @@ class News(db.Model):
             'content': self.content,
             'news_type': self.news_type,
             'url': self.url,
+            'source_name': self.source_name,
+            'source_icon': self.source_icon,
             'published_at': self.published_at.strftime('%Y-%m-%d') if self.published_at else None,
             'created_at': self.created_at.strftime('%Y-%m-%d %H:%M:%S') if self.created_at else None
         }
diff --git a/templates/detail_new.html b/templates/detail_new.html
index c84192f..18b43a1 100644
--- a/templates/detail_new.html
+++ b/templates/detail_new.html
@@ -628,10 +628,43 @@
                 </h2>
                 {% for news in news_list %}
                 <div class="news-item">
-                    <span class="news-badge">{{ news.news_type }}</span>
-                    <h4>{{ news.title }}</h4>
-                    <p>{{ news.content[:200] }}...</p>
-                    <div class="news-date">{{ news.published_at.strftime('%b %d, %Y') }}</div>
+                    <div style="display: flex; justify-content: space-between; align-items: flex-start; margin-bottom: 12px;">
+                        <span class="news-badge">{{ news.news_type }}</span>
+                        {% if news.source_name %}
+                        <div style="display: flex; align-items: center; gap: 6px; font-size: 12px; color: var(--text-muted);">
+                            {% if news.source_icon %}
+                            <img src="{{ news.source_icon }}" alt="{{ news.source_name }}" style="width: 16px; height: 16px; border-radius: 2px;">
+                            {% endif %}
+                            <span>{{ news.source_name }}</span>
+                        </div>
+                        {% endif %}
+                    </div>
+                    <h4>
+                        {% if news.url %}
+                        <a href="{{ news.url }}" target="_blank" rel="noopener noreferrer" style="color: var(--text-primary); text-decoration: none;">
+                            {{ news.title }}
+                        </a>
+                        {% else %}
+                        {{ news.title }}
+                        {% endif %}
+                    </h4>
+                    {% if news.content %}
+                    <p>{{ news.content[:200] }}{% if news.content|length > 200 %}...{% endif %}</p>
+                    {% endif %}
+                    <div style="display: flex; justify-content: space-between; align-items: center;">
+                        <div class="news-date">
+                            {% if news.published_at %}
+                            {{ news.published_at.strftime('%Y年%m月%d日') }}
+                            {% else %}
+                            Unknown date
+                            {% endif %}
+                        </div>
+                        {% if news.url %}
+                        <a href="{{ news.url }}" target="_blank" rel="noopener noreferrer" style="font-size: 12px; color: var(--primary-blue); text-decoration: none;">
+                            Read more ↗
+                        </a>
+                        {% endif %}
+                    </div>
                 </div>
                 {% endfor %}
             </div>
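Reviewer note: the template change above replaces an unconditional `...` suffix with one that appears only when the content exceeds 200 characters, and it skips the paragraph when the content is empty. The same rule, written as a small Python function for clarity; `truncate_summary` is a hypothetical name and is not part of the patch.

```python
def truncate_summary(content, limit=200):
    """Mirror the template rule: cut at `limit` characters and append '...'
    only when something was actually cut; empty or missing content yields ''."""
    if not content:
        return ""
    return content[:limit] + "..." if len(content) > limit else content


short = truncate_summary("a" * 200)  # exactly at the limit: returned unchanged
cut = truncate_summary("a" * 201)    # one over: 200 characters plus '...'
```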
diff --git a/test_news_feature.py b/test_news_feature.py
new file mode 100644
index 0000000..3ac30e3
--- /dev/null
+++ b/test_news_feature.py
@@ -0,0 +1,142 @@
+"""
+Test the v2.2 news feature - end-to-end flow test
+"""
+import os
+import sys
+from dotenv import load_dotenv
+
+# Load environment variables
+load_dotenv()
+
+# Add the project path
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from app import create_app
+from models import db, Site, News
+from utils.news_searcher import NewsSearcher
+
+def test_news_feature():
+    """Test the news feature"""
+    print("=" * 60)
+    print("v2.2 news feature test")
+    print("=" * 60)
+
+    # Create the application context
+    app = create_app('development')
+
+    with app.app_context():
+        # 1. Check the API configuration
+        print("\n[1/4] Checking API configuration...")
+        api_key = app.config.get('BOCHA_API_KEY')
+        if not api_key:
+            print(">>> Error: BOCHA_API_KEY is not configured")
+            return False
+        print(f">>> API Key: {api_key[:20]}...")
+
+        # 2. Check the database connection
+        print("\n[2/4] Checking database...")
+        try:
+            site_count = Site.query.filter_by(is_active=True).count()
+            print(f">>> Found {site_count} active sites")
+
+            if site_count == 0:
+                print(">>> Warning: no sites available")
+                return False
+
+        except Exception as e:
+            print(f">>> Database error: {e}")
+            return False
+
+        # 3. Test news search
+        print("\n[3/4] Testing news search...")
+        searcher = NewsSearcher(api_key)
+
+        # Get the first site
+        site = Site.query.filter_by(is_active=True).first()
+        print(f">>> Test site: {site.name}")
+
+        try:
+            news_items = searcher.search_site_news(
+                site_name=site.name,
+                site_url=site.url,
+                count=3,
+                freshness='oneWeek'
+            )
+
+            print(f">>> Found {len(news_items)} news items")
+
+            if news_items:
+                print("\nNews list:")
+                for i, item in enumerate(news_items, 1):
+                    print(f"  {i}. {item['title'][:50]}...")
+                    print(f"     Source: {item.get('site_name', 'unknown')}")
+                    print(f"     URL: {item['url'][:60]}...")
+
+        except Exception as e:
+            print(f">>> Search failed: {e}")
+            return False
+
+        # 4. Test saving to the database
+        print("\n[4/4] Testing save to database...")
+
+        if not news_items:
+            print(">>> No news to save")
+            return True
+
+        try:
+            saved_count = 0
+            for item in news_items[:2]:  # Save only the first 2 items as a test
+                # Skip items that already exist
+                existing = News.query.filter_by(
+                    site_id=site.id,
+                    url=item['url']
+                ).first()
+
+                if not existing:
+                    news = News(
+                        site_id=site.id,
+                        title=item['title'],
+                        content=item.get('summary') or item.get('snippet', ''),
+                        url=item['url'],
+                        source_name=item.get('site_name', ''),
+                        source_icon=item.get('site_icon', ''),
+                        published_at=item.get('published_at'),
+                        news_type='Search Result',
+                        is_active=True
+                    )
+                    db.session.add(news)
+                    saved_count += 1
+
+            db.session.commit()
+            print(f">>> Saved {saved_count} news items")
+
+            # Verify the save
+            total_news = News.query.filter_by(site_id=site.id).count()
+            print(f">>> This site now has {total_news} news records")
+
+        except Exception as e:
+            db.session.rollback()
+            print(f">>> Save failed: {e}")
+            return False
+
+        print("\n" + "=" * 60)
+        print(">>> All tests passed!")
+        print("=" * 60)
+
+        # Suggest next steps
+        print("\nNext steps:")
+        print(f"1. View news on the site detail page: http://localhost:5000/site/{site.code}")
+        print("2. Open the news admin page: http://localhost:5000/admin/newsadmin/")
+        print("3. Run the scheduled task script: python fetch_news_cron.py --limit 5")
+
+        return True
+
+if __name__ == '__main__':
+    try:
+        success = test_news_feature()
+        sys.exit(0 if success else 1)
+    except Exception as e:
+        print(f"\nFatal error: {e}")
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
diff --git a/utils/news_searcher.py b/utils/news_searcher.py
new file mode 100644
index 0000000..452eb13
--- /dev/null
+++ b/utils/news_searcher.py
@@ -0,0 +1,271 @@
+"""
+News search utility using the Bocha (博查) Web Search API
+"""
+import requests
+import json
+from datetime import datetime
+from typing import List, Dict, Optional
+
+
+class NewsSearcher:
+    """Bocha news searcher"""
+
+    def __init__(self, api_key: str, base_url: str = 'https://api.bocha.cn'):
+        """
+        Initialize the news searcher
+
+        Args:
+            api_key: Bocha API key
+            base_url: Base URL of the API
+        """
+        self.api_key = api_key
+        self.base_url = base_url
+        self.endpoint = f"{base_url}/v1/web-search"
+
+    def search_news(
+        self,
+        query: str,
+        count: int = 10,
+        freshness: str = 'oneMonth',
+        summary: bool = True,
+        include: Optional[str] = None,
+        exclude: Optional[str] = None
+    ) -> Dict:
+        """
+        Search for news
+
+        Args:
+            query: Search keywords
+            count: Number of results to return (1-50)
+            freshness: Time range (noLimit/oneDay/oneWeek/oneMonth/oneYear)
+            summary: Whether to include a summary
+            include: Domains to restrict the search to (separate multiple domains with | or ,)
+            exclude: Domains to exclude from the search (separate multiple domains with | or ,)
+
+        Returns:
+            Search result dict; on a request failure, a dict with success=False, error and code
+        """
+        headers = {
+            'Authorization': f'Bearer {self.api_key}',
+            'Content-Type': 'application/json'
+        }
+
+        payload = {
+            'query': query,
+            'count': count,
+            'freshness': freshness,
+            'summary': summary
+        }
+
+        # Add optional parameters
+        if include:
+            payload['include'] = include
+        if exclude:
+            payload['exclude'] = exclude
+
+        try:
+            response = requests.post(
+                self.endpoint,
+                headers=headers,
+                data=json.dumps(payload),
+                timeout=30
+            )
+            response.raise_for_status()
+            return response.json()
+
+        except requests.exceptions.RequestException as e:
+            return {
+                'success': False,
+                'error': str(e),
+                'code': getattr(e.response, 'status_code', None)
+            }
+
+    def parse_news_items(self, search_result: Dict) -> List[Dict]:
+        """
+        Parse a search result into a list of news items
+
+        Args:
+            search_result: Search result returned by the Bocha API
+
+        Returns:
+            List of news items, each with title, url, snippet, summary, site_name, published_at, etc.
+        """
+        news_items = []
+
+        # Check the shape of the returned data
+        if 'data' not in search_result:
+            return news_items
+
+        data = search_result['data']
+        if 'webPages' not in data or 'value' not in data['webPages']:
+            return news_items
+
+        # Parse each news item
+        for item in data['webPages']['value']:
+            news_item = {
+                'title': item.get('name', ''),
+                'url': item.get('url', ''),
+                'snippet': item.get('snippet', ''),
+                'summary': item.get('summary', ''),
+                'site_name': item.get('siteName', ''),
+                'site_icon': item.get('siteIcon', ''),
+                'published_at': self._parse_date(item.get('datePublished')),
+                'display_url': item.get('displayUrl', ''),
+                'language': item.get('language', ''),
+            }
+            news_items.append(news_item)
+
+        return news_items
+
+    def search_site_news(
+        self,
+        site_name: str,
+        site_url: Optional[str] = None,
+        count: int = 10,
+        freshness: str = 'oneMonth'
+    ) -> List[Dict]:
+        """
+        Search for news related to a specific site
+
+        Args:
+            site_name: Site name (used as the search keyword)
+            site_url: Site URL (optional; used to exclude the site's own pages)
+            count: Number of results to return
+            freshness: Time range
+
+        Returns:
+            List of news items
+        """
+        # Build the query: site name + "最新" (latest) + "新闻" (news)
+        query = f"{site_name} 最新 新闻"
+
+        # If a site URL is given, exclude results from the site itself
+        exclude = None
+        if site_url:
+            # Extract the domain
+            try:
+                from urllib.parse import urlparse
+                parsed = urlparse(site_url)
+                domain = (parsed.netloc or parsed.path).split('/')[0]
+                # Strip only a leading "www." (str.replace would also hit "www." mid-domain)
+                domain = domain[4:] if domain.startswith('www.') else domain
+                exclude = domain
+            except Exception:
+                pass
+
+        # Run the search
+        search_result = self.search_news(
+            query=query,
+            count=count,
+            freshness=freshness,
+            summary=True,
+            exclude=exclude
+        )
+
+        # Parse the result
+        return self.parse_news_items(search_result)
+
+    def _parse_date(self, date_str: Optional[str]) -> Optional[datetime]:
+        """
+        Parse a date string
+
+        Args:
+            date_str: Date string (for example: 2025-02-23T08:18:30+08:00)
+
+        Returns:
+            A datetime object, or None if parsing fails
+        """
+        if not date_str:
+            return None
+
+        try:
+            # Try to parse ISO 8601 format
+            # Format returned by the Bocha API: 2025-02-23T08:18:30+08:00
+            if '+' in date_str or 'Z' in date_str:
+                # Use fromisoformat (Python 3.7+)
+                return datetime.fromisoformat(date_str.replace('Z', '+00:00'))
+            else:
+                # Plain format without a UTC offset
+                return datetime.strptime(date_str, '%Y-%m-%dT%H:%M:%S')
+        except Exception:
+            # Return None if parsing fails
+            return None
+
+    def format_news_for_display(self, news_items: List[Dict]) -> List[Dict]:
+        """
+        Format news items for front-end display
+
+        Args:
+            news_items: List of news items
+
+        Returns:
+            List of formatted news items
+        """
+        formatted_news = []
+
+        for item in news_items:
+            formatted_item = {
+                'title': item['title'],
+                'url': item['url'],
+                'description': item.get('summary') or item.get('snippet', ''),
+                'source': item.get('site_name') or '未知来源',  # UI text: "unknown source"; also covers an empty site_name
+                'published_date': self._format_date(item.get('published_at')),
+                'icon': item.get('site_icon', '')
+            }
+            formatted_news.append(formatted_item)
+
+        return formatted_news
+
+    def _format_date(self, dt: Optional[datetime]) -> str:
+        """
+        Format a date for display
+
+        Args:
+            dt: datetime object
+
+        Returns:
+            Formatted date string
+        """
+        if not dt:
+            return '未知日期'  # UI text: "unknown date"
+
+        try:
+            # Output format: 2025-01-30
+            return dt.strftime('%Y-%m-%d')
+        except Exception:
+            return '未知日期'
+
+
+# Test code
+if __name__ == '__main__':
+    import os
+    from dotenv import load_dotenv
+
+    load_dotenv()
+
+    # Read the API key from the environment
+    api_key = os.environ.get('BOCHA_API_KEY')
+    if not api_key:
+        print("Error: the BOCHA_API_KEY environment variable is not set")
+        raise SystemExit(1)
+
+    # Create the searcher
+    searcher = NewsSearcher(api_key)
+
+    # Run a test search
+    print("Searching: ChatGPT latest news...")
+    news_items = searcher.search_site_news(
+        site_name='ChatGPT',
+        count=5,
+        freshness='oneWeek'
+    )
+
+    # Show the results
+    print(f"\nFound {len(news_items)} news items:\n")
+    for i, news in enumerate(news_items, 1):
+        print(f"{i}. {news['title']}")
+        print(f"   Source: {news['site_name']}")
+        print(f"   Date: {searcher._format_date(news['published_at'])}")
+        print(f"   URL: {news['url']}")
+        print(f"   Summary: {(news.get('summary') or news.get('snippet', ''))[:100]}...")
+        print()
--
2.50.1.windows.1


From 495248bf5f161d83fb20246fdbf7c59d88959b27 Mon Sep 17 00:00:00 2001
From: Jowe <123822645+Selei1983@users.noreply.github.com>
Date: Tue, 30 Dec 2025 22:31:51 +0800
Subject: [PATCH 2/2] =?UTF-8?q?feat:=20v2.2.0=20=E6=99=BA=E8=83=BD?=
 =?UTF-8?q?=E6=96=B0=E9=97=BB=E6=9B=B4=E6=96=B0=E5=92=8C=E5=B8=83=E5=B1=80?=
 =?UTF-8?q?=E4=BC=98=E5=8C=96?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Automatically refresh news on the first visit of each day
- Fetch 3 news items from the past week for each site
- Place the news module in the left main column
- Move similar recommendations to the right sidebar
- Deduplicate automatically to prevent repeated news items
---
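Reviewer note (not part of the commit message): the once-per-day refresh check and the URL deduplication that this patch adds can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the code in app.py. The helper names `needs_refresh` and `drop_known_urls` are hypothetical, each news item is assumed to be a dict with a `url` key, and the newest stored record is assumed to expose a `created_at` datetime.

```python
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional, Set


def needs_refresh(latest_created_at: Optional[datetime], today: date) -> bool:
    """Return True when no news exists yet or the newest record predates today.

    Hypothetical helper mirroring the need_update decision in the patch.
    """
    if latest_created_at is None:
        return True
    return latest_created_at.date() < today


def drop_known_urls(items: Iterable[Dict], known_urls: Set[str]) -> List[Dict]:
    """Keep only items whose URL is non-empty and has not been seen before.

    Hypothetical helper: known_urls stands in for the URLs already stored
    for the site; duplicates inside the same batch are dropped as well.
    """
    seen = set(known_urls)
    fresh = []
    for item in items:
        url = item.get('url', '')
        if url and url not in seen:
            seen.add(url)
            fresh.append(item)
    return fresh
```

Keeping both checks as pure functions would make them testable without a database or a Bocha API key.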
 app.py                    | 64 +++++++++++++++++++++++++++++++++++++++
 templates/detail_new.html | 42 ++++++++++---------------
 2 files changed, 80 insertions(+), 26 deletions(-)

diff --git a/app.py b/app.py
index b0f27d4..2160534 100644
--- a/app.py
+++ b/app.py
@@ -116,6 +116,70 @@ def create_app(config_name='default'):
             site.view_count += 1
             db.session.commit()

+            # Smart news update: check whether news was already refreshed today
+            from datetime import date
+            today = date.today()
+
+            # Look at the creation time of this site's most recent news item
+            latest_news = News.query.filter_by(
+                site_id=site.id
+            ).order_by(News.created_at.desc()).first()
+
+            # Decide whether the news needs refreshing
+            need_update = False
+            if not latest_news:
+                # No news at all yet, so fetch some
+                need_update = True
+            elif latest_news.created_at.date() < today:
+                # The newest item was not created today, so refresh
+                need_update = True
+
+            # If a refresh is needed, fetch the latest news automatically
+            if need_update:
+                api_key = app.config.get('BOCHA_API_KEY')
+                if api_key:
+                    try:
+                        # Create the news searcher
+                        searcher = NewsSearcher(api_key)
+
+                        # Fetch news (at most 3 items, from the past week)
+                        news_items = searcher.search_site_news(
+                            site_name=site.name,
+                            site_url=site.url,
+                            count=3,
+                            freshness='oneWeek'
+                        )
+
+                        # Save the news to the database
+                        if news_items:
+                            for item in news_items:
+                                # Skip items that already exist (deduplicate by URL)
+                                existing = News.query.filter_by(
+                                    site_id=site.id,
+                                    url=item['url']
+                                ).first()
+
+                                if not existing:
+                                    news = News(
+                                        site_id=site.id,
+                                        title=item['title'],
+                                        content=item.get('summary') or item.get('snippet', ''),
+                                        url=item['url'],
+                                        source_name=item.get('site_name', ''),
+                                        source_icon=item.get('site_icon', ''),
+                                        published_at=item.get('published_at'),
+                                        news_type='Search Result',
+                                        is_active=True
+                                    )
+                                    db.session.add(news)
+
+                            db.session.commit()
+
+                    except Exception as e:
+                        # A failed fetch must not break page rendering
+                        app.logger.warning("Automatic news fetch failed: %s", e)
+                        db.session.rollback()
+
             # 获取该网站的相关新闻(最多显示5条)
             news_list = News.query.filter_by(
                 site_id=site.id,
diff --git a/templates/detail_new.html b/templates/detail_new.html
index 18b43a1..b6ac54b 100644
--- a/templates/detail_new.html
+++ b/templates/detail_new.html
@@ -669,7 +669,10 @@
                     {% endfor %}
                 </div>
                 {% endif %}
+            </div>

+            <!-- 侧边栏 -->
+            <div class="sidebar-column">
                 <!-- Similar Recommendations -->
                 {% if recommended_sites %}
                 <div class="content-block">
@@ -677,35 +680,22 @@
                         <span>✨</span>
                         相似推荐
                     </h2>
-                    <div class="recommendations-grid">
-                        {% for rec_site in recommended_sites %}
-                        <a href="/site/{{ rec_site.code }}" class="recommendation-card">
-                            {% if rec_site.logo %}
-                            <img src="{{ rec_site.logo }}" alt="{{ rec_site.name }}" class="rec-logo">
-                            {% else %}
-                            <div class="rec-logo" style="background: linear-gradient(135deg, #0ea5e9 0%, #8b5cf6 100%);"></div>
-                            {% endif %}
-                            <div class="rec-info">
-                                <h4>{{ rec_site.name }}</h4>
-                                <p>{{ rec_site.short_desc or rec_site.description }}</p>
-                                <div class="rec-tags">
-                                    {% for tag in rec_site.tags[:2] %}
-                                    <span class="rec-tag">{{ tag.name }}</span>
-                                    {% endfor %}
-                                </div>
-                            </div>
-                            <span class="arrow-icon">↗</span>
-                        </a>
-                        {% endfor %}
-                    </div>
+                    {% for rec_site in recommended_sites %}
+                    <a href="/site/{{ rec_site.code }}" class="recommendation-card" style="display: flex; gap: 12px; padding: 16px; border: 1px solid var(--border-color); border-radius: 12px; margin-bottom: 12px; text-decoration: none; transition: all 0.2s;">
+                        {% if rec_site.logo %}
+                        <img src="{{ rec_site.logo }}" alt="{{ rec_site.name }}" style="width: 48px; height: 48px; border-radius: 8px; flex-shrink: 0;">
+                        {% else %}
+                        <div style="width: 48px; height: 48px; border-radius: 8px; background: linear-gradient(135deg, #0ea5e9 0%, #8b5cf6 100%); flex-shrink: 0;"></div>
+                        {% endif %}
+                        <div style="flex: 1; min-width: 0;">
+                            <h4 style="font-size: 14px; font-weight: 600; margin: 0 0 4px 0; color: var(--text-primary); overflow: hidden; text-overflow: ellipsis; white-space: nowrap;">{{ rec_site.name }}</h4>
+                            <p style="font-size: 12px; color: var(--text-secondary); margin: 0; overflow: hidden; text-overflow: ellipsis; display: -webkit-box; -webkit-line-clamp: 2; -webkit-box-orient: vertical;">{{ rec_site.short_desc or rec_site.description }}</p>
+                        </div>
+                    </a>
+                    {% endfor %}
                 </div>
                 {% endif %}
             </div>
-
-            <!-- 侧边栏 -->
-            <div class="sidebar-column">
-                <!-- 预留侧边栏位置,可以后续添加其他模块 -->
-            </div>
         </div>
     </div>
 {% endblock %}
--
2.50.1.windows.1
