How to get the transcription grouped by chapters? #225

viniciusarruda · 2023-09-08T14:13:44Z

How can I get the transcription grouped by chapters?
Or is there an alternative way to get each video chapter's time range and title?

jdepoix · 2023-09-21T07:07:17Z

Maintainer

Hi @viniciusarruda, this module does not support retrieving chapters. You would have to find some other way to get the chapters and then group the retrieved transcript by their timestamps.
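
The grouping step mentioned above could look roughly like the sketch below. It assumes each transcript entry is a dict with `text` and `start` (in seconds) keys and that the chapters are already available as dicts with `title` and `start` (also in seconds); the function name `group_by_chapter` and the sample data are hypothetical, not part of this module.

```python
from bisect import bisect_right


def group_by_chapter(transcript, chapters):
    """Assign each transcript entry to the last chapter that starts at or before it."""
    ordered = sorted(chapters, key=lambda c: c["start"])
    starts = [c["start"] for c in ordered]
    # Assumption: chapter titles are unique; duplicate titles would share one bucket.
    grouped = {c["title"]: [] for c in ordered}
    for entry in transcript:
        idx = bisect_right(starts, entry["start"]) - 1
        if idx < 0:
            continue  # entry begins before the first chapter, so it is skipped
        grouped[ordered[idx]["title"]].append(entry["text"])
    return grouped


# Hypothetical sample data, for illustration only.
chapters = [{"title": "Intro", "start": 0.0}, {"title": "Setup", "start": 30.0}]
transcript = [
    {"text": "hello", "start": 1.2},
    {"text": "welcome", "start": 12.0},
    {"text": "install it", "start": 30.0},
    {"text": "then run", "start": 45.5},
]
grouped = group_by_chapter(transcript, chapters)
```

An entry that starts exactly on a chapter boundary lands in the chapter that begins there, which is usually what you want for captions.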

2 replies

viniciusarruda Sep 21, 2023
Author

Do you have any idea how I can do it?

jdepoix Sep 21, 2023
Maintainer

I am sorry, I haven't tried to retrieve chapters myself yet, so you'll have to do your own research 😊

Gardusio · 2023-09-25T13:43:38Z

I had the same goal of retrieving chapter titles and timestamps. Here's how I managed to extract the chapters:

   import json

   # The "start" field is in milliseconds; divide by 1000 to get seconds.
   def extract_chapter_info(obj):
       renderer = obj.get("chapterRenderer")
       return {
           "title": renderer.get("title").get("simpleText"),
           "start": renderer.get("timeRangeStartMillis"),
       }

   # Written as a method (hence self), in the style of _extract_captions_json.
   def _extract_chapters_json(self, html, video_id):
       splitted_html = html.split('"chapters":')

       # HANDLE NO CHAPTERS SCENARIO AS YOU LIKE
       # (without a guard, splitted_html[1] raises IndexError when there are no chapters)

       chapters_json = json.loads(
           splitted_html[1].split(',"trackingParams"')[0].replace("\n", "")
       )

       return list(map(extract_chapter_info, chapters_json))

This is basically the same approach as _extract_captions_json here.

Hope this helps :) I don't know if this could become a PR.
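
Since the original question also asked for each chapter's range, here is a rough sketch of turning the extracted list (titles plus start times in milliseconds) into start/end ranges in seconds. The helper name `chapter_ranges` and the `video_length_s` parameter are hypothetical; the end of the last chapter is only known if you supply the video length from somewhere else.

```python
def chapter_ranges(chapters, video_length_s=None):
    """Turn [{"title", "start" (ms)}] into [{"title", "start" (s), "end" (s)}]."""
    ordered = sorted(chapters, key=lambda c: c["start"])
    ranges = []
    for i, chapter in enumerate(ordered):
        start_s = chapter["start"] / 1000
        if i + 1 < len(ordered):
            # A chapter ends where the next one begins.
            end_s = ordered[i + 1]["start"] / 1000
        else:
            # The last chapter runs to the end of the video, or None if unknown.
            end_s = video_length_s
        ranges.append({"title": chapter["title"], "start": start_s, "end": end_s})
    return ranges


# Hypothetical sample in the shape returned by extract_chapter_info.
sample = [{"title": "Intro", "start": 0}, {"title": "Setup", "start": 30000}]
ranges = chapter_ranges(sample, video_length_s=95.0)
```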

0 replies
