✨ feat: add author, tag, and category fields to .nfo for UGC videos #132
Thanks for the contribution, but could you take a look at a few details?
Also, please note the lint error in CI.
yutto/api/ugc_video.py
Outdated
    actors: list[Actor] = []
    if res_json_data.get("staff") and isinstance(res_json_data["staff"], list):
        _index: int = 0
        for staff in res_json_data["staff"]:
            actors.append(
                Actor(
                    name=staff["name"],
                    role=staff["title"],
                    thumb=staff["face"],
                    profile=f"https://space.bilibili.com/{staff['mid']}",
                    order=_index,
                )
            )
            _index += 1
    elif res_json_data.get("owner") and isinstance(res_json_data["owner"], dict):
        actors.append(
            Actor(
                name=res_json_data["owner"]["name"],
                role="UP主",
                thumb=res_json_data["owner"]["face"],
                profile=f"https://space.bilibili.com/{res_json_data['owner']['mid']}",
                order=0,
            )
        )
    else:
        Logger.warning(f"视频 {avid} 未找到演职人员信息")

    genres: list[str] = []
    if res_json_data.get("tname") and isinstance(res_json_data["tname"], str):
        genres.append(res_json_data["tname"])

    tags: list[str] = await get_ugc_video_tag(session, avid)
Could this logic be extracted into a function?
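This part is a bit odd. I haven't looked closely at the granularity of your functions yet, so I put everything that generates metadata in one place. If it looks too long, I can add a ugc_info function. One moment.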
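One way the extraction suggested above might look is sketched below. This is only an illustration, not the code that was merged: the helper names `extract_actors` and `extract_genres` are hypothetical, and the `Actor` fields are assumed from the keyword arguments used in the diff.

```python
from typing import Any, TypedDict


class Actor(TypedDict):
    # Fields assumed from the keyword arguments used in the diff above.
    name: str
    role: str
    thumb: str
    profile: str
    order: int


def _space_url(mid: Any) -> str:
    """Build the profile URL for a bilibili user id."""
    return f"https://space.bilibili.com/{mid}"


def extract_actors(res_json_data: dict[str, Any]) -> list[Actor]:
    """Hypothetical helper: prefer the `staff` list, fall back to `owner`."""
    staff_list = res_json_data.get("staff")
    if staff_list and isinstance(staff_list, list):
        # enumerate() replaces the manual `_index` counter.
        return [
            Actor(
                name=staff["name"],
                role=staff["title"],
                thumb=staff["face"],
                profile=_space_url(staff["mid"]),
                order=index,
            )
            for index, staff in enumerate(staff_list)
        ]
    owner = res_json_data.get("owner")
    if owner and isinstance(owner, dict):
        return [
            Actor(
                name=owner["name"],
                role="UP主",
                thumb=owner["face"],
                profile=_space_url(owner["mid"]),
                order=0,
            )
        ]
    # The caller can log a warning when this comes back empty.
    return []


def extract_genres(res_json_data: dict[str, Any]) -> list[str]:
    """Hypothetical helper: use the category name (`tname`) as the only genre."""
    tname = res_json_data.get("tname")
    return [tname] if tname and isinstance(tname, str) else []
```

Keeping the helpers free of I/O leaves the `await get_ugc_video_tag(session, avid)` call in the caller and makes the dictionary handling easy to test on its own.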
@@ -45,6 +45,9 @@ class _UgcVideoInfo(TypedDict):
    pubdate: int
    description: str
    pages: list[_UgcVideoPageInfo]
    genre: list[str]
What is genre an abbreviation of?
I saw the description in the commit message: genre is the category (分区), right? What a strange abbreviation 😂
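To be honest, I find it odd too. When emby scrapes, genre is treated as "genre" in the sense of style (流派). I handle UGC content as movies myself, and there genre does get recognized, so I used genre to hold the category name.
If there is a better option, I think it can be changed. I haven't tested this nfo on emby or infuse yet, so I don't know whether genre will work.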
class MetaData(TypedDict):
    title: str
    show_title: str
    plot: str
    thumb: str
    premiered: str
    dateadded: str
    actor: list[Actor]
    genre: list[str]
    tag: list[str]
The metadata has new fields; could Bangumi and Cheese be aligned with it? They can be left empty for now, with just a TODO.
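I'm worried that the MetaData formats for Bangumi and Cheese differ from UGC, so I think we can skip this for now. Does leaving them out have any impact? If leaving them out would make bangumi raise errors, then let's just add them. Actually, shouldn't MetaData be split into UGCMetaData and BangumiMetaData?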
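"Does leaving them out have any impact?"
Leaving them out would affect the type hints; the linter (pyright) probably won't pass.
"Actually, shouldn't MetaData be split into UGCMetaData and BangumiMetaData?"
I haven't used the metadata feature in depth, so I'm not clear on the details, but yes, it could be done that way.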
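A rough sketch of what such a split could look like with `TypedDict` inheritance follows. `UGCMetaData` and `BangumiMetaData` are the names floated in this thread; `BaseMetaData` and the reduced `Actor` type are assumptions made only for this example, and the PR itself may well keep a single `MetaData` type.

```python
from typing import TypedDict


class Actor(TypedDict):
    # Reduced stand-in for the real Actor type, for illustration only.
    name: str
    role: str


class BaseMetaData(TypedDict):
    """Hypothetical base type holding the fields every video kind shares."""

    title: str
    show_title: str
    plot: str
    thumb: str
    premiered: str
    dateadded: str


class UGCMetaData(BaseMetaData):
    """UGC videos additionally carry staff, category and tag information."""

    actor: list[Actor]
    genre: list[str]
    tag: list[str]


class BangumiMetaData(BaseMetaData):
    """Bangumi keeps only the shared fields until its own fields are defined."""
```

With this layout pyright would only require `actor`, `genre` and `tag` where a `UGCMetaData` is constructed, so the Bangumi and Cheese extractors would not need placeholder values.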
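Lint didn't catch it locally... OK, I just took a look. I plan to add these fields: actor, genre, tag, website.
Hmm... one moment, there are a few more things I'd like to change.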
@@ -147,6 +152,7 @@ async def extract_ugc_video_data(
    "series_title": UNKNOWN,
    "pubdate": UNKNOWN,
    "download_date": ugc_video_info["metadata"]["dateadded"],
    "owner_uid": owner_uid,
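Please add this to cheese and bangumi as well; just use UNKNOWN. Also, the "subpath template" (存放子路径模板) section of the docs (README.md) needs this field added, with a note in the table explaining when the field is available.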
A bit later, probably in a few days. I'll test the emby nfo first and then adjust everything together.
yutto/extractor/common.py
Outdated
@@ -139,6 +139,11 @@ async def extract_ugc_video_data(
    subtitles = await get_ugc_video_subtitles(session, avid, cid) if args.require_subtitle else []
    danmaku = await get_danmaku(session, cid, args.danmaku_format) if args.require_danmaku else EmptyDanmakuData
    metadata = ugc_video_info["metadata"] if args.require_metadata else None
    owner_uid: str | None = (
Can this be None here? If it can, how does it pass the type check for PathTemplateVariableDict? Strange...
Hmm... I'm thinking of just changing it to str. UGC content should in theory always have a uid, but non-UGC content may not.
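To make the `str` versus `str | None` point concrete, here is a minimal sketch of resolving the uid to a plain string with a placeholder fallback. The function name `resolve_owner_uid`, the shape of the input dictionary, and the value of `UNKNOWN` are all assumptions for illustration; the diff above truncates the real expression, so this is not a reconstruction of it.

```python
from typing import Any

# Assumed placeholder value; yutto defines its own UNKNOWN constant.
UNKNOWN = "unknown"


def resolve_owner_uid(video_info: dict[str, Any]) -> str:
    """Hypothetical helper: always return a str so the template dict type-checks."""
    owner = video_info.get("owner")
    if isinstance(owner, dict) and owner.get("mid") is not None:
        return str(owner["mid"])
    # Non-UGC content may have no owner, so fall back to the placeholder.
    return UNKNOWN
```

Because the return type is `str`, a typed path-variable dictionary can accept the value without an `Optional` field, and non-UGC extractors can pass the placeholder directly.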
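When the mixed/home content type is selected in emby, everything ends up being scraped as a movie. Overall, the metadata format that yutto generates is somewhat inconsistent with emby's, so it is not recognized; it is not yet clear how other users make use of the metadata yutto generates. The main differences are: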
For reference, an nfo after emby has finished scraping looks like this:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<movie>
  <plot><![CDATA[This is the synopsis~~~~]]></plot>
  <outline />
  <lockdata>false</lockdata>
  <dateadded>2023-05-10 00:13:55</dateadded>
  <title>This is the title~~~</title>
  <actor>
    <name>Person 2</name>
    <type>Actor</type>
  </actor>
  <actor>
    <name>Person 3</name>
    <role>UP主</role>
    <type>Actor</type>
  </actor>
  <actor>
    <name>Person 1</name>
    <type>Producer</type>
  </actor>
  <director>Person 4</director>
  <year>2023</year>
  <sorttitle>This is also the title~~~~</sorttitle>
  <premiered>2023-05-08</premiered>
  <releasedate>2023-05-08</releasedate>
  <genre>Genre 1</genre>
  <genre>Genre 2</genre>
  <studio>Studio 1</studio>
  <studio>Studio 2</studio>
  <tag>Tag 1</tag>
  <tag>Tag 2</tag>
  <fileinfo>
    <streamdetails />
  </fileinfo>
  <show_title>This is also the title~~~</show_title>
  <source />
  <original_filename />
  <website>https://www.bilibili.com/video/BV</website>
</movie>
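I'm not sure about this either. This feature was originally added by @WhileKing in #20; if @WhileKing thinks it's fine, these fields can be changed.
Never mind, I see the reply above has been edited 😂
@SigureMo, please take a look at the comparison above between the nfo generated by emby and the nfo generated by the original metadata code. The main problem is that in the .nfo format emby supports, list elements are not expected to use an extra tag. There is an assumption here: I believe users who need metadata all intend to scrape it, so conforming to emby should be a fairly general solution.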
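The layout in the emby reference, where list values become repeated sibling tags directly under the root, can be sketched with the standard library. This example deliberately uses `xml.etree.ElementTree` rather than the XML library used by yutto, and `build_movie_nfo` is a hypothetical name, so treat it as an illustration of the target shape only.

```python
import xml.etree.ElementTree as ET
from typing import Any


def build_movie_nfo(metadata: dict[str, Any]) -> str:
    """Serialize metadata so that each list item becomes its own repeated tag."""
    root = ET.Element("movie")
    for key, value in metadata.items():
        # Treat scalars as one-element lists so both cases share one code path.
        items = value if isinstance(value, list) else [value]
        for item in items:
            child = ET.SubElement(root, key)
            if isinstance(item, dict):
                # Nested mappings (for example an actor) become child elements.
                for sub_key, sub_value in item.items():
                    ET.SubElement(child, sub_key).text = str(sub_value)
            else:
                child.text = str(item)
    return ET.tostring(root, encoding="unicode")


nfo = build_movie_nfo(
    {
        "title": "example",
        "genre": ["Genre 1", "Genre 2"],
        "actor": [{"name": "Person 3", "role": "UP主"}],
    }
)
```

Each entry of `genre` ends up as its own `<genre>` element and each actor as its own `<actor>` element, which matches the repeated tags in the emby output above.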
LGTM
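Motivation
close #15
Solution
Note: this PR makes major changes to metadata. Because the XML generated by dicttoxml cannot remove tags or produce tags with duplicate names, the XML is now generated with dict2xml instead; the result was verified by scraping with emby.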
yutto 'https://www.bilibili.com/video/BV1vZ4y1M7mQ/' -d 'download' --with-metadata -tp='{owner_uid}/{name}'
This will be stored under download/100969474
Directory type
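As a rough illustration of how a subpath template such as `{owner_uid}/{name}` might expand, the sketch below fills the template with `str.format`. The helper `resolve_subpath` and the sample `name` value are hypothetical, and yutto's actual template handling may differ.

```python
from pathlib import PurePosixPath


def resolve_subpath(template: str, variables: dict[str, str]) -> PurePosixPath:
    """Fill a subpath template like '{owner_uid}/{name}' with concrete values."""
    return PurePosixPath(template.format(**variables))


# Sample values: the uid comes from the example above, the name is made up.
variables = {"owner_uid": "100969474", "name": "example-video"}
target = PurePosixPath("download") / resolve_subpath("{owner_uid}/{name}", variables)
print(target)
```

Since `str.format` raises `KeyError` for a variable missing from the dictionary, extractors that cannot provide `owner_uid` would still need to supply a placeholder value.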