Merge branch 'master' of https://github.com/yt-dlp/yt-dlp into ytdlp

* 'master' of https://github.com/yt-dlp/yt-dlp:
  [version] update
  Release 2023.01.02
  [cleanup] Misc
  [docs] Improvements
  [extractor/generic] Decode unicode-escaped embed URLs (#5919)
  Update to ytdl-commit-195f22f6
This commit is contained in:
Lesmiscore
2023-01-03 11:39:20 +09:00
29 changed files with 797 additions and 255 deletions

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -62,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -74,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm requesting a site-specific feature
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -70,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -55,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell
validations:

View File

@@ -20,7 +20,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
@@ -51,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell

View File

@@ -26,7 +26,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.02** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true
@@ -57,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.02 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.02, Current version: 2023.01.02
yt-dlp is up to date (2023.01.02)
<more lines>
render: shell

View File

@@ -2,8 +2,6 @@
### Description of your *pull request* and other information
</details>
<!--
Explanation of your *pull request* in arbitrary form goes here. Please **make sure the description explains the purpose and effect** of your *pull request* and is worded well enough to be understood. Provide as much **context and examples** as possible
@@ -41,3 +39,5 @@ Fixes #
- [ ] New extractor ([Piracy websites will not be accepted](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy))
- [ ] Core bug fix/improvement
- [ ] New feature (It is strongly [recommended to open an issue first](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#adding-new-feature-or-making-overarching-changes))
</details>

5
.gitignore vendored
View File

@@ -30,6 +30,7 @@ cookies
*.f4v
*.flac
*.flv
*.gif
*.jpeg
*.jpg
*.m4a
@@ -163,8 +164,8 @@ cookies.txt
*/extractor/lazy_extractors.py
# Plugins
ytdlp_plugins/*
yt-dlp-plugins/*
ytdlp_plugins/
yt-dlp-plugins
# Added by GH Codespaces
pythonenv3.8/

View File

@@ -3,6 +3,7 @@ shirt-dev (collaborator)
coletdjnz/colethedj (collaborator)
Ashish0804 (collaborator)
nao20010128nao/Lesmiscore (collaborator)
bashonly (collaborator)
h-h-h-h
pauldubois98
nixxo
@@ -295,7 +296,6 @@ Mehavoid
winterbird-code
yashkc2025
aldoridhoni
bashonly
jacobtruman
masta79
palewire
@@ -357,3 +357,21 @@ SG5
the-marenga
tkgmomosheep
vitkhab
glensc
synthpop123
tntmod54321
milkknife
Bnyro
CapacitorSet
stelcodes
skbeh
muddi900
digitall
chengzhicn
mexus
JChris246
redraskal
Spicadox
barsnick
docbender
KurtBestor

View File

@@ -11,6 +11,135 @@
-->
## 2023.01.02
* **Improve plugin architecture** by [Grub4K](https://github.com/Grub4K), [coletdjnz](https://github.com/coletdjnz), [flashdagger](https://github.com/flashdagger), [pukkandan](https://github.com/pukkandan)
* Plugins can be loaded in any distribution of yt-dlp (binary, pip, source, etc.) and can be distributed and installed as packages. See [the readme](https://github.com/yt-dlp/yt-dlp/tree/05997b6e98e638d97d409c65bb5eb86da68f3b64#plugins) for more information
* Add `--compat-options 2021,2022`
* This allows devs to change defaults and make other potentially breaking changes more easily. If you need everything to work exactly as-is, put Use `--compat 2022` in your config to guard against future compat changes.
* [downloader/aria2c] Native progress for aria2c via RPC by [Lesmiscore](https://github.com/Lesmiscore), [pukkandan](https://github.com/pukkandan)
* Merge youtube-dl: Upto [commit/195f22f](https://github.com/ytdl-org/youtube-dl/commit/195f22f6) by [Grub4k](https://github.com/Grub4k), [pukkandan](https://github.com/pukkandan)
* Add pre-processor stage `video`
* Let `--parse/replace-in-metadata` run at any post-processing stage
* Add `--enable-file-urls` by [coletdjnz](https://github.com/coletdjnz)
* Add new field `aspect_ratio`
* Add `ac4` to known codecs
* Add `weba` to known extensions
* [FFmpegVideoConvertor] Add `gif` to `--recode-video`
* Add message when there are no subtitles/thumbnails
* Deprioritize HEVC-over-FLV formats by [Lesmiscore](https://github.com/Lesmiscore)
* Make early reject of `--match-filter` stricter
* Fix `--cookies-from-browser` CLI parsing
* Fix `original_url` in playlists
* Fix bug in writing playlist info-json
* Fix bugs in `PlaylistEntries`
* [downloader/ffmpeg] Fix headers for video+audio formats by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor] Add a way to distinguish IEs that returns only videos
* [extractor] Implement universal format sorting and deprecate `_sort_formats`
* [extractor] Let `_extract_format` functions obey `--ignore-no-formats`
* [extractor/generic] Add `fragment_query` extractor arg for DASH and HLS by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/generic] Decode unicode-escaped embed URLs by [bashonly](https://github.com/bashonly)
* [extractor/generic] Don't report redirect to https
* [extractor/generic] Fix JSON LD manifest extraction by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/generic] Use `Accept-Encoding: identity` for initial request by [coletdjnz](https://github.com/coletdjnz)
* [FormatSort] Add `mov` to `vext`
* [jsinterp] Escape regex that looks like nested set
* [webvtt] Handle premature EOF by [flashdagger](https://github.com/flashdagger)
* [utils] `classproperty`: Add cache support
* [utils] `get_exe_version`: Detect broken executables by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [utils] `js_to_json`: Fix bug in [f55523c](https://github.com/yt-dlp/yt-dlp/commit/f55523c) by [ChillingPepper](https://github.com/ChillingPepper), [pukkandan](https://github.com/pukkandan)
* [utils] Make `ExtractorError` mutable
* [utils] Move `FileDownloader.parse_bytes` into utils
* [utils] Move format sorting code into `utils`
* [utils] `windows_enable_vt_mode`: Proper implementation by [Grub4K](https://github.com/Grub4K)
* [update] Workaround [#5632](https://github.com/yt-dlp/yt-dlp/issues/5632)
* [docs] Improvements
* [cleanup] Misc fixes and cleanup
* [cleanup] Use `random.choices` by [freezboltz](https://github.com/freezboltz)
* [extractor/airtv] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/amazonminitv] Add extractors by [GautamMKGarg](https://github.com/GautamMKGarg), [nyuszika7h](https://github.com/nyuszika7h)
* [extractor/beatbump] Add extractors by [Bobscorn](https://github.com/Bobscorn), [pukkandan](https://github.com/pukkandan)
* [extractor/europarl] Add EuroParlWebstream extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/kanal2] Add extractor by [bashonly](https://github.com/bashonly), [glensc](https://github.com/glensc), [pukkandan](https://github.com/pukkandan)
* [extractor/kankanews] Add extractor by [synthpop123](https://github.com/synthpop123)
* [extractor/kick] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/mediastream] Add extractor by [HobbyistDev](https://github.com/HobbyistDev), [elyse0](https://github.com/elyse0)
* [extractor/noice] Add NoicePodcast extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/oneplace] Add OnePlacePodcast extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/rumble] Add RumbleIE extractor by [flashdagger](https://github.com/flashdagger)
* [extractor/screencastify] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/trtcocuk] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/Veoh] Add user extractor by [tntmod54321](https://github.com/tntmod54321)
* [extractor/videoken] Add extractors by [bashonly](https://github.com/bashonly)
* [extractor/webcamerapl] Add extractor by [milkknife](https://github.com/milkknife)
* [extractor/amazon] Add `AmazonReviews` extractor by [bashonly](https://github.com/bashonly)
* [extractor/netverse] Add `NetverseSearch` extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/vimeo] Add `VimeoProIE` by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/xiami] Remove extractors by [synthpop123](https://github.com/synthpop123)
* [extractor/youtube] Add `piped.video` by [Bnyro](https://github.com/Bnyro)
* [extractor/youtube] Consider language in format de-duplication
* [extractor/youtube] Extract DRC formats
* [extractor/youtube] Fix `ytuser:`
* [extractor/youtube] Fix bug in handling of music URLs
* [extractor/youtube] Subtitles cannot be translated to `und`
* [extractor/youtube:tab] Extract metadata from channel items by [coletdjnz](https://github.com/coletdjnz)
* [extractor/ARD] Add vtt subtitles by [CapacitorSet](https://github.com/CapacitorSet)
* [extractor/ArteTV] Extract chapters by [bashonly](https://github.com/bashonly), [iw0nderhow](https://github.com/iw0nderhow)
* [extractor/bandcamp] Add `album_artist` by [stelcodes](https://github.com/stelcodes)
* [extractor/bilibili] Fix `--no-playlist` for anthology
* [extractor/bilibili] Improve `_VALID_URL` by [skbeh](https://github.com/skbeh)
* [extractor/biliintl:series] Make partial download of series faster
* [extractor/BiliLive] Fix extractor
* [extractor/brightcove] Add `BrightcoveNewBaseIE` and fix embed extraction
* [extractor/cda] Support premium and misc improvements by [selfisekai](https://github.com/selfisekai)
* [extractor/ciscowebex] Support password-protected videos by [damianoamatruda](https://github.com/damianoamatruda)
* [extractor/curiositystream] Fix auth by [mnn](https://github.com/mnn)
* [extractor/embedly] Handle vimeo embeds
* [extractor/fifa] Fix Preplay extraction by [dirkf](https://github.com/dirkf)
* [extractor/foxsports] Fix extractor by [bashonly](https://github.com/bashonly)
* [extractor/gronkh] Fix `_VALID_URL` by [muddi900](https://github.com/muddi900)
* [extractor/hotstar] Improve format metadata
* [extractor/iqiyi] Fix `Iq` JS regex by [bashonly](https://github.com/bashonly)
* [extractor/la7] Improve extractor by [nixxo](https://github.com/nixxo)
* [extractor/mediaset] Better embed detection and error messages by [nixxo](https://github.com/nixxo)
* [extractor/mixch] Support `--wait-for-video`
* [extractor/naver] Improve `_VALID_URL` for `NaverNowIE` by [bashonly](https://github.com/bashonly)
* [extractor/naver] Treat fan subtitles as separate language
* [extractor/netverse] Extract comments by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/nosnl] Add support for /video by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/odnoklassniki] Extract subtitles by [bashonly](https://github.com/bashonly)
* [extractor/pinterest] Fix extractor by [bashonly](https://github.com/bashonly)
* [extractor/plutotv] Fix videos with non-zero start by [digitall](https://github.com/digitall)
* [extractor/polskieradio] Adapt to next.js redesigns by [selfisekai](https://github.com/selfisekai)
* [extractor/reddit] Add vcodec to fallback format by [chengzhicn](https://github.com/chengzhicn)
* [extractor/reddit] Extract crossposted media by [bashonly](https://github.com/bashonly)
* [extractor/reddit] Extract video embeds in text posts by [bashonly](https://github.com/bashonly)
* [extractor/rutube] Support private videos by [mexus](https://github.com/mexus)
* [extractor/sibnet] Separate from VKIE
* [extractor/slideslive] Fix extractor by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/slideslive] Support embeds and slides by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/soundcloud] Support user permalink by [nosoop](https://github.com/nosoop)
* [extractor/spankbang] Fix extractor by [JChris246](https://github.com/JChris246)
* [extractor/stv] Detect DRM
* [extractor/swearnet] Fix description bug
* [extractor/tencent] Fix geo-restricted video by [elyse0](https://github.com/elyse0)
* [extractor/tiktok] Fix subs, `DouyinIE`, improve `_VALID_URL` by [bashonly](https://github.com/bashonly)
* [extractor/tiktok] Update `_VALID_URL`, add `api_hostname` arg by [bashonly](https://github.com/bashonly)
* [extractor/tiktok] Update API hostname by [redraskal](https://github.com/redraskal)
* [extractor/twitcasting] Fix videos with password by [Spicadox](https://github.com/Spicadox), [bashonly](https://github.com/bashonly)
* [extractor/twitter] Heed `--no-playlist` for multi-video tweets by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/twitter] Refresh guest token when expired by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/twitter:spaces] Add `Referer` to m3u8 by [nixxo](https://github.com/nixxo)
* [extractor/udemy] Fix lectures that have no URL and detect DRM
* [extractor/unsupported] Add more URLs
* [extractor/urplay] Support for audio-only formats by [barsnick](https://github.com/barsnick)
* [extractor/wistia] Improve extension detection by [Grub4k](https://github.com/Grub4k), [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/yle_areena] Support restricted videos by [docbender](https://github.com/docbender)
* [extractor/youku] Fix extractor by [KurtBestor](https://github.com/KurtBestor)
* [extractor/youporn] Fix metadata by [marieell](https://github.com/marieell)
* [extractor/redgifs] Fix bug in [8c188d5](https://github.com/yt-dlp/yt-dlp/commit/8c188d5d09177ed213a05c900d3523867c5897fd)
### 2022.11.11
* Merge youtube-dl: Upto [commit/de39d12](https://github.com/ytdl-org/youtube-dl/commit/de39d128)

View File

@@ -42,7 +42,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* Improved/fixed support for HiDive, HotStar, Hungama, LBRY, LinkedInLearning, Mxplayer, SonyLiv, TV2, Vimeo, VLive etc
## [Lesmiscore](https://github.com/Lesmiscore) (nao20010128nao)
## [Lesmiscore](https://github.com/Lesmiscore) <sup><sub>(nao20010128nao)</sup></sub>
**Bitcoin**: bc1qfd02r007cutfdjwjmyy9w23rjvtls6ncve7r3s
**Monacoin**: mona1q3tf7dzvshrhfe3md379xtvt2n22duhglv5dskr
@@ -50,3 +50,10 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* Download live from start to end for YouTube
* Added support for new websites AbemaTV, mildom, PixivSketch, skeb, radiko, voicy, mirrativ, openrec, whowatch, damtomo, 17.live, mixch etc
* Improved/fixed support for fc2, YahooJapanNews, tver, iwara etc
## [bashonly](https://github.com/bashonly)
* `--cookies-from-browser` support for Firefox containers
* Added support for new websites Genius, Kick, NBCStations, Triller, VideoKen etc
* Improved/fixed support for Anvato, Brightcove, Instagram, ParamountPlus, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc

View File

@@ -17,8 +17,8 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites \
clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 *.mp4 \
*.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp \
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.gif *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.mp4 *.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp \
*.images *.lock *.aac
clean-dist:
rm -rf ytdl-patched.1.temp.md ytdl-patched.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \

View File

@@ -161,13 +161,13 @@ You'll get `DEBUG` column with tokens in arguments, and list of unmatched tokens
## NEW FEATURES IN YT-DLP
* Merged with **youtube-dl v2021.12.17+ [commit/de39d12](https://github.com/ytdl-org/youtube-dl/commit/de39d128)** <!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* Merged with **youtube-dl v2021.12.17+ [commit/195f22f](https://github.com/ytdl-org/youtube-dl/commit/195f22f)** <!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in YouTube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that the NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **YouTube improvements**:
* Supports Clips, Stories (`ytstories:<channel UCID>`), Search (including filters)**\***, YouTube Music Search, Channel-specific search, Search prefixes (`ytsearch:`, `ytsearchdate:`)**\***, Mixes, YouTube Music Albums/Channels ([except self-uploaded music](https://github.com/yt-dlp/yt-dlp/issues/723)), and Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`, `:ytnotif`)
@@ -315,7 +315,7 @@ You can use `yt-dlp -U` to update if you are [using the release binaries](#relea
If you [installed with PIP](https://github.com/yt-dlp/yt-dlp/wiki/Installation#with-pip), simply re-run the same command that was used to install the program
For other third-party package managers, see [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation) or refer their documentation
For other third-party package managers, see [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation#third-party-package-managers) or refer their documentation
If you [installed using Homebrew](#with-homebrew), run `brew upgrade lesmiscore/my/ytdl-patched`
@@ -342,7 +342,7 @@ File|Description
<!-- MANPAGE: END EXCLUDED SECTION -->
Note: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
**Note**: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
## DEPENDENCIES
Python versions 3.7+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly.
@@ -358,8 +358,9 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
* [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html)
<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
**Note**: There are some regressions in newer ffmpeg versions that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
There are bugs in ffmpeg that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
**Important**: What you need is ffmpeg *binary*, **NOT** [the python package of the same name](https://pypi.org/project/ffmpeg)
### Networking
* [**certifi**](https://github.com/certifi/python-certifi)\* - Provides Mozilla's root certificate bundle. Licensed under [MPLv2](https://github.com/certifi/python-certifi/blob/master/LICENSE)
@@ -406,7 +407,7 @@ On some systems, you may need to use `py` or `python` instead of `python3`.
`pyinst.py` accepts any arguments that can be passed to `pyinstaller`, such as `--onefile/-F` or `--onedir/-D`, which is further [documented here](https://pyinstaller.org/en/stable/usage.html#what-to-generate).
Note that pyinstaller with versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Note**: Pyinstaller versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
@@ -575,7 +576,9 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
--date DATE Download only videos uploaded on this date.
The date can be "YYYYMMDD" or in the format
[now|today|yesterday][-N[day|week|month|year]].
E.g. --date today-2weeks
E.g. "--date today-2weeks" downloads
only videos uploaded on the same day two
weeks ago
--datebefore DATE Download only videos uploaded on or before
this date. The date formats accepted is the
same as --date
@@ -652,8 +655,8 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
linear=1::2 --retry-sleep fragment:exp=1:20
--skip-unavailable-fragments Skip unavailable fragments for DASH,
hlsnative and ISM downloads (default)
(Alias: --no-abort-on-unavailable-fragment)
--abort-on-unavailable-fragment
(Alias: --no-abort-on-unavailable-fragments)
--abort-on-unavailable-fragments
Abort download if a fragment is unavailable
(Alias: --no-skip-unavailable-fragments)
--keep-fragments Keep downloaded fragments on disk after
@@ -1120,9 +1123,9 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
additional field "filepath" that contains
the final path of the downloaded file is
also available, and if no fields are passed,
%(filepath)q is appended to the end of the
command. This option can be used multiple
times
%(filepath,_filename|)q is appended to the
end of the command. This option can be used
multiple times
--no-exec Remove any previously defined --exec
--convert-subs FORMAT Convert the subtitles to another format
(currently supported: ass, lrc, srt, vtt)
@@ -1269,7 +1272,7 @@ E.g. with the following configuration file yt-dlp will always extract the audio,
-o ~/YouTube/%(title)s.%(ext)s
```
Note that options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. They must also be quoted when necessary as-if it were a UNIX shell.
**Note**: Options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. They must also be quoted when necessary as-if it were a UNIX shell.
You can use `--ignore-config` if you want to disable all configuration files for a particular yt-dlp run. If `--ignore-config` is found inside any configuration file, no further configuration will be loaded. For example, having the option in the portable configuration file prevents loading of home, user, and system configurations. Additionally, (for backward compatibility) if `--ignore-config` is found inside the system configuration file, the user configuration is not loaded.
@@ -1345,7 +1348,7 @@ Additionally, you can set different output templates for the various metadata fi
<a id="outtmpl-postprocess-note"></a>
Note: Due to post-processing (i.e. merging etc.), the actual output filename might differ. Use `--print after_move:filepath` to get the name after all post-processing is complete.
**Note**: Due to post-processing (i.e. merging etc.), the actual output filename might differ. Use `--print after_move:filepath` to get the name after all post-processing is complete.
The available fields are:
@@ -1468,7 +1471,7 @@ Available only in `--sponsorblock-chapter-title`:
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. E.g. for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `yt-dlp test video` and id `BaW_jenozKc`, this will result in a `yt-dlp test video-BaW_jenozKc.mp4` file created in the current directory.
Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
**Note**: Some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
**Tip**: Look at the `-j` output to identify which fields are available for the particular URL
@@ -1609,7 +1612,7 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
**Note**: None of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height<=?720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. You can also use the filters with `all` to download all formats that satisfy the filter, e.g. `-f "all[vcodec=none]"` selects all audio-only formats.
@@ -1862,7 +1865,7 @@ Some extractors accept additional arguments which can be passed using `--extract
The following extractors use this feature:
#### youtube
* `lang`: Language code to prefer translated metadata of this language (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `localization_mode`: Method for obtaining translated title and description. By default, `default` is used. `dataapi` does it by using YouTube Data API.
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
@@ -1925,7 +1928,7 @@ The following extractors use this feature:
#### twitter
* `force_graphql`: Force usage of the GraphQL API. By default it will only be used if login cookies are provided
NOTE: These options may be changed/removed in the future without concern for backward compatibility
**Note**: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
@@ -1974,26 +1977,25 @@ Plugins can be installed using various methods and locations.
3. **pip and other locations in `PYTHONPATH`**
* Plugin packages can be installed and managed using `pip`. See [yt-dlp-sample-plugins](https://github.com/yt-dlp/yt-dlp-sample-plugins) for an example.
* Note: plugin files between plugin packages installed with pip must have unique filenames
* Note: plugin files between plugin packages installed with pip must have unique filenames.
* Any path in `PYTHONPATH` is searched in for the `yt_dlp_plugins` namespace folder.
* Note: This does not apply for Pyinstaller/py2exe builds.
.zip, .egg and .whl archives containing a `yt_dlp_plugins` namespace folder in their root are also supported. These can be placed in the same locations `yt_dlp_plugins` namespace folders can be found.
- e.g. `${XDG_CONFIG_HOME}/yt-dlp/plugins/mypluginpkg.zip` where `mypluginpkg.zip` contains `yt_dlp_plugins/<type>/myplugin.py`
`.zip`, `.egg` and `.whl` archives containing a `yt_dlp_plugins` namespace folder in their root are also supported as plugin packages.
* e.g. `${XDG_CONFIG_HOME}/yt-dlp/plugins/mypluginpkg.zip` where `mypluginpkg.zip` contains `yt_dlp_plugins/<type>/myplugin.py`
Run yt-dlp with `--verbose`/`-v` to check if the plugin has been loaded.
Run yt-dlp with `--verbose` to check if the plugin has been loaded.
## Developing Plugins
See [yt-dlp-sample-plugins](https://github.com/yt-dlp/yt-dlp-sample-plugins) for a sample plugin package with instructions on how to set up an environment for plugin development.
See the [yt-dlp-sample-plugins](https://github.com/yt-dlp/yt-dlp-sample-plugins) repo for a template plugin package and the [Plugin Development](https://github.com/yt-dlp/yt-dlp/wiki/Plugin-Development) section of the wiki for a plugin development guide.
All public classes with a name ending in `IE`/`PP` are imported from each file for extractors and postprocessors repectively. This respects underscore prefix (e.g. `_MyBasePluginIE` is private) and `__all__`. Modules can similarly be excluded by prefixing the module name with an underscore (e.g. `_myplugin.py`)
All public classes with a name ending in `IE`/`PP` are imported from each file for extractors and postprocessors repectively. This respects underscore prefix (e.g. `_MyBasePluginIE` is private) and `__all__`. Modules can similarly be excluded by prefixing the module name with an underscore (e.g. `_myplugin.py`).
To replace an existing extractor with a subclass of one, set the `plugin_name` class keyword argument (e.g. `MyPluginIE(ABuiltInIE, plugin_name='myplugin')` will replace `ABuiltInIE` with `MyPluginIE`).
Due to the mechanics behind this, you should exclude the subclass extractor from being imported separately by making it private using one of the methods described above.
To replace an existing extractor with a subclass of one, set the `plugin_name` class keyword argument (e.g. `MyPluginIE(ABuiltInIE, plugin_name='myplugin')` will replace `ABuiltInIE` with `MyPluginIE`). Since the extractor replaces the parent, you should exclude the subclass extractor from being imported separately by making it private using one of the methods described above.
If you are a plugin author, add [yt-dlp-plugins](https://github.com/topics/yt-dlp-plugins) as a topic to your repository for discoverability
If you are a plugin author, add [yt-dlp-plugins](https://github.com/topics/yt-dlp-plugins) as a topic to your repository for discoverability.
See the [Developer Instructions](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions) on how to write and test an extractor.

View File

@@ -167,7 +167,7 @@
- **BilibiliSpacePlaylist**
- **BilibiliSpaceVideo**
- **BiliIntl**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **biliintl:series**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **biliIntl:series**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **BiliLive**
- **BioBioChileTV**
- **Biography**
@@ -872,7 +872,7 @@
- **NetPlusTVRecordings**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse**
- **NetversePlaylist**
- **NetverseSearch**:; "netsearch:" prefix
- **NetverseSearch**: "netsearch:" prefix
- **Netzkino**
- **Newgrounds**
- **Newgrounds:playlist**
@@ -1217,6 +1217,7 @@
- **SaltTVLive**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVRecordings**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SampleFocus**
- **SamplePlugin**: (**Currently broken**)
- **Sangiin**: 参議院インターネット審議中継 (archive)
- **Sapo**: SAPO Vídeos
- **savefrom.net**
@@ -1717,7 +1718,7 @@
- **YouPorn**
- **YourPorn**
- **YourUpload**
- **youtube**: YouTube
- **youtube+sample+NSIG+AGB**: YouTube
- **youtube:clip**
- **youtube:favorites**: YouTube liked videos; ":ytfav" keyword (requires cookies)
- **youtube:history**: Youtube watch history; ":ythis" keyword (requires cookies)

View File

@@ -3586,6 +3586,7 @@ class YoutubeDL:
reject = lambda k, v: v is None or k.startswith('__') or k in {
'requested_downloads', 'requested_formats', 'requested_subtitles', 'requested_entries',
'entries', 'filepath', '_filename', 'infojson_filename', 'original_url', 'playlist_autonumber',
'_format_sort_fields',
}
else:
reject = lambda k, v: False

View File

@@ -344,7 +344,7 @@ def validate_options(opts):
mobj = range_ != '-' and re.fullmatch(r'([^-]+)?\s*-\s*([^-]+)?', range_)
dur = mobj and (parse_timestamp(mobj.group(1) or '0'), parse_timestamp(mobj.group(2) or 'inf'))
if None in (dur or [None]):
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form *start-end')
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form "*start-end"')
ranges.append(dur)
continue
try:

View File

@@ -1898,6 +1898,11 @@ from .theweatherchannel import TheWeatherChannelIE
from .thisamericanlife import ThisAmericanLifeIE
from .thisav import ThisAVIE
from .thisoldhouse import ThisOldHouseIE
from .thisvid import (
ThisVidIE,
ThisVidMemberIE,
ThisVidPlaylistIE,
)
from .threespeak import (
ThreeSpeakIE,
ThreeSpeakUserIE,

View File

@@ -1013,7 +1013,7 @@ class BiliIntlIE(BiliIntlBaseIE):
class BiliIntlSeriesIE(BiliIntlBaseIE):
IE_NAME = 'biliintl:series'
IE_NAME = 'biliIntl:series'
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-zA-Z]{2}/)?play/(?P<id>\d+)/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://www.bilibili.tv/en/play/34613',

View File

@@ -1317,7 +1317,9 @@ class InfoExtractor:
Like _search_regex, but strips HTML tags and unescapes entities.
"""
res = self._search_regex(pattern, string, name, default, fatal, flags, group)
if res:
if isinstance(res, tuple):
return [clean_html(r).strip() for r in res]
elif res:
return clean_html(res).strip()
else:
return res
@@ -1451,10 +1453,16 @@ class InfoExtractor:
# And then there are the jokers who advertise that they use RTA, but actually don't.
AGE_LIMIT_MARKERS = [
r'Proudly Labeled <a href="http://www\.rtalabel\.org/" title="Restricted to Adults">RTA</a>',
r'>[^<]*you acknowledge you are at least (\d+) years old',
r'>\s*(?:18\s+U(?:\.S\.C\.|SC)\s+)?(?:§+\s*)?2257\b',
]
if any(re.search(marker, html) for marker in AGE_LIMIT_MARKERS):
return 18
return 0
age_limit = 0
for marker in AGE_LIMIT_MARKERS:
mobj = re.search(marker, html)
if mobj:
age_limit = max(age_limit, int(traverse_obj(mobj, 1, default=18)))
return age_limit
def _media_rating_search(self, html):
# See http://www.tjg-designs.com/WP/metadata-code-examples-adding-metadata-to-your-web-pages/
@@ -3281,7 +3289,7 @@ class InfoExtractor:
def _find_jwplayer_data(self, webpage, video_id=None, transform_source=js_to_json):
mobj = re.search(
r'(?s)jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)(?!</script>).*?\.setup\s*\((?P<options>[^)]+)\)',
r'''(?s)jwplayer\s*\(\s*(?P<q>'|")(?!(?P=q)).+(?P=q)\s*\)(?!</script>).*?\.\s*setup\s*\(\s*(?P<options>(?:\([^)]*\)|[^)])+)\s*\)''',
webpage)
if mobj:
try:
@@ -3302,19 +3310,20 @@ class InfoExtractor:
def _parse_jwplayer_data(self, jwplayer_data, video_id=None, require_title=True,
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
# JWPlayer backward compatibility: flattened playlists
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
if 'playlist' not in jwplayer_data:
jwplayer_data = {'playlist': [jwplayer_data]}
entries = []
if not isinstance(jwplayer_data, dict):
return entries
# JWPlayer backward compatibility: single playlist item
playlist_items = jwplayer_data.get('playlist')
# JWPlayer backward compatibility: single playlist item/flattened playlists
# https://github.com/jwplayer/jwplayer/blob/v7.7.0/src/js/playlist/playlist.js#L10
if not isinstance(jwplayer_data['playlist'], list):
jwplayer_data['playlist'] = [jwplayer_data['playlist']]
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
if not isinstance(playlist_items, list):
playlist_items = (playlist_items or jwplayer_data, )
for video_data in jwplayer_data['playlist']:
for video_data in playlist_items:
if not isinstance(video_data, dict):
continue
# JWPlayer backward compatibility: flattened sources
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/playlist/item.js#L29-L35
if 'sources' not in video_data:
@@ -3352,6 +3361,13 @@ class InfoExtractor:
'timestamp': int_or_none(video_data.get('pubdate')),
'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
'subtitles': subtitles,
'alt_title': clean_html(video_data.get('subtitle')), # attributes used e.g. by Tele5 ...
'genre': clean_html(video_data.get('genre')),
'channel': clean_html(dict_get(video_data, ('category', 'channel'))),
'season_number': int_or_none(video_data.get('season')),
'episode_number': int_or_none(video_data.get('episode')),
'release_year': int_or_none(video_data.get('releasedate')),
'age_limit': int_or_none(video_data.get('age_restriction')),
}
# https://github.com/jwplayer/jwplayer/blob/master/src/js/utils/validator.js#L32
if len(formats) == 1 and re.search(r'^(?:http|//).*(?:youtube\.com|youtu\.be)/.+', formats[0]['url']):
@@ -3369,7 +3385,7 @@ class InfoExtractor:
def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None,
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
urls = []
urls = set()
formats = []
for source in jwplayer_sources_data:
if not isinstance(source, dict):
@@ -3378,14 +3394,14 @@ class InfoExtractor:
base_url, self._proto_relative_url(source.get('file')))
if not source_url or source_url in urls:
continue
urls.append(source_url)
urls.add(source_url)
source_type = source.get('type') or ''
ext = mimetype2ext(source_type) or determine_ext(source_url)
if source_type == 'hls' or ext == 'm3u8':
if source_type == 'hls' or ext == 'm3u8' or 'format=m3u8-aapl' in source_url:
formats.extend(self._extract_m3u8_formats(
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=m3u8_id, fatal=False))
elif source_type == 'dash' or ext == 'mpd':
elif source_type == 'dash' or ext == 'mpd' or 'format=mpd-time-csf' in source_url:
formats.extend(self._extract_mpd_formats(
source_url, video_id, mpd_id=mpd_id, fatal=False))
elif ext == 'smil':
@@ -3400,13 +3416,12 @@ class InfoExtractor:
'ext': ext,
})
else:
format_id = str_or_none(source.get('label'))
height = int_or_none(source.get('height'))
if height is None:
if height is None and format_id:
# Often no height is provided but there is a label in
# format like "1080p", "720p SD", or 1080.
height = int_or_none(self._search_regex(
r'^(\d{3,4})[pP]?(?:\b|$)', str(source.get('label') or ''),
'height', default=None))
height = parse_resolution(format_id).get('height')
a_format = {
'url': source_url,
'width': int_or_none(source.get('width')),
@@ -3414,6 +3429,7 @@ class InfoExtractor:
'tbr': int_or_none(source.get('bitrate'), scale=1000),
'filesize': int_or_none(source.get('filesize')),
'ext': ext,
'format_id': format_id
}
if source_url.startswith('rtmp'):
a_format['ext'] = 'flv'
@@ -3567,7 +3583,7 @@ class InfoExtractor:
elif cls.IE_DESC:
desc += f' {cls.IE_DESC}'
if cls.SEARCH_KEY:
desc += f'; "{cls.SEARCH_KEY}:" prefix'
desc += f'{";" if cls.IE_DESC else ""} "{cls.SEARCH_KEY}:" prefix'
if search_examples:
_COUNTS = ('', '5', '10', 'all')
desc += f' (e.g. "{cls.SEARCH_KEY}{random.choice(_COUNTS)}:{random.choice(search_examples)}")'

View File

@@ -182,7 +182,7 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
self.to_screen(
'To get all formats of a hardsub language, use '
'"--extractor-args crunchyrollbeta:hardsub=<language_code or all>". '
'See https://github.com/yt-dlp/yt-dlp#crunchyrollbeta for more info',
'See https://github.com/yt-dlp/yt-dlp#crunchyrollbeta-crunchyroll for more info',
only_once=True)
else:
full_format_langs = set(map(str.lower, available_formats))

View File

@@ -33,6 +33,7 @@ from ..utils import (
unified_timestamp,
unsmuggle_url,
url_or_none,
urljoin,
variadic,
xpath_attr,
xpath_text,
@@ -1858,11 +1859,13 @@ class GenericIE(InfoExtractor):
'display_id': 'kelis-4th-of-july',
'ext': 'mp4',
'title': 'Kelis - 4th Of July',
'thumbnail': 'https://kvs-demo.com/contents/videos_screenshots/0/105/preview.jpg',
'description': 'Kelis - 4th Of July',
'thumbnail': r're:https://(?:www\.)?kvs-demo.com/contents/videos_screenshots/0/105/preview.jpg',
},
'params': {
'skip_download': True,
},
'expected_warnings': ['Untested major version'],
}, {
# KVS Player
'url': 'https://www.kvs-demo.com/embed/105/',
@@ -1871,35 +1874,12 @@ class GenericIE(InfoExtractor):
'display_id': 'kelis-4th-of-july',
'ext': 'mp4',
'title': 'Kelis - 4th Of July / Embed Player',
'thumbnail': 'https://kvs-demo.com/contents/videos_screenshots/0/105/preview.jpg',
'thumbnail': r're:https://(?:www\.)?kvs-demo.com/contents/videos_screenshots/0/105/preview.jpg',
},
'params': {
'skip_download': True,
},
}, {
# KVS Player
'url': 'https://thisvid.com/videos/french-boy-pantsed/',
'md5': '3397979512c682f6b85b3b04989df224',
'info_dict': {
'id': '2400174',
'display_id': 'french-boy-pantsed',
'ext': 'mp4',
'title': 'French Boy Pantsed - ThisVid.com',
'thumbnail': 'https://media.thisvid.com/contents/videos_screenshots/2400000/2400174/preview.mp4.jpg',
}
}, {
# KVS Player
'url': 'https://thisvid.com/embed/2400174/',
'md5': '3397979512c682f6b85b3b04989df224',
'info_dict': {
'id': '2400174',
'display_id': 'french-boy-pantsed',
'ext': 'mp4',
'title': 'French Boy Pantsed - ThisVid.com',
'thumbnail': 'https://media.thisvid.com/contents/videos_screenshots/2400000/2400174/preview.mp4.jpg',
}
}, {
# KVS Player
'url': 'https://youix.com/video/leningrad-zoj/',
'md5': '94f96ba95706dc3880812b27b7d8a2b8',
'info_dict': {
@@ -1907,8 +1887,8 @@ class GenericIE(InfoExtractor):
'display_id': 'leningrad-zoj',
'ext': 'mp4',
'title': 'Клип: Ленинград - ЗОЖ скачать, смотреть онлайн | Youix.com',
'thumbnail': 'https://youix.com/contents/videos_screenshots/18000/18485/preview_480x320_youix_com.mp4.jpg',
}
'thumbnail': r're:https://youix.com/contents/videos_screenshots/18000/18485/preview(?:_480x320_youix_com.mp4)?\.jpg',
},
}, {
# KVS Player
'url': 'https://youix.com/embed/18485',
@@ -1918,19 +1898,20 @@ class GenericIE(InfoExtractor):
'display_id': 'leningrad-zoj',
'ext': 'mp4',
'title': 'Ленинград - ЗОЖ',
'thumbnail': 'https://youix.com/contents/videos_screenshots/18000/18485/preview_480x320_youix_com.mp4.jpg',
}
'thumbnail': r're:https://youix.com/contents/videos_screenshots/18000/18485/preview(?:_480x320_youix_com.mp4)?\.jpg',
},
}, {
# KVS Player
'url': 'https://bogmedia.org/videos/21217/40-nochey-40-nights-2016/',
'md5': '94166bdb26b4cb1fb9214319a629fc51',
'info_dict': {
'id': '21217',
'display_id': '40-nochey-40-nights-2016',
'display_id': '40-nochey-2016',
'ext': 'mp4',
'title': '40 ночей (2016) - BogMedia.org',
'description': 'md5:4e6d7d622636eb7948275432eb256dc3',
'thumbnail': 'https://bogmedia.org/contents/videos_screenshots/21000/21217/preview_480p.mp4.jpg',
}
},
},
{
# Firebase Dynamic Link
@@ -1982,9 +1963,9 @@ class GenericIE(InfoExtractor):
'id': '389508',
'display_id': 'syren-de-mer-onlyfans-05-07-2020have-a-happy-safe-holiday5f014e68a220979bdb8cd-source',
'ext': 'mp4',
'title': 'Syren De Mer onlyfans_05-07-2020Have_a_happy_safe_holiday5f014e68a220979bdb8cd_source / Embed плеер',
'thumbnail': 'http://www.camhub.world/contents/videos_screenshots/389000/389508/preview.mp4.jpg',
}
'title': 'Syren De Mer onlyfans_05-07-2020Have_a_happy_safe_holiday5f014e68a220979bdb8cd_source / Embed плеер',
'thumbnail': r're:https?://www\.camhub\.world/contents/videos_screenshots/389000/389508/preview\.mp4\.jpg',
},
},
{
# Reddit-hosted video that will redirect and be processed by RedditIE
@@ -2187,7 +2168,8 @@ class GenericIE(InfoExtractor):
'age_limit': 0,
'direct': True,
}
}, {
},
{
'note': 'server returns data in brotli compression by default if `accept-encoding: *` is specified.',
'url': 'https://www.extra.cz/cauky-lidi-70-dil-babis-predstavil-pohadky-prymulanek-nebo-andrejovy-nove-saty-ac867',
'info_dict': {
@@ -2201,8 +2183,37 @@ class GenericIE(InfoExtractor):
'duration': 318.0,
'direct': True,
'age_limit': 0,
}
}
},
},
{
'note': 'JW Player embed with unicode-escape sequences in URL',
'url': 'https://www.medici.tv/en/concerts/lahav-shani-mozart-mahler-israel-philharmonic-abu-dhabi-classics',
'info_dict': {
'id': 'm',
'ext': 'mp4',
'title': 'Lahav Shani conducts the Israel Philharmonic\'s first-ever concert in Abu Dhabi',
'description': 'Mahler\'s ',
'uploader': 'www.medici.tv',
'age_limit': 0,
'thumbnail': r're:^https?://.+\.jpg',
},
'params': {
'skip_download': True,
},
},
{
'url': 'https://shooshtime.com/videos/284002/just-out-of-the-shower-joi/',
'md5': 'e2f0a4c329f7986280b7328e24036d60',
'info_dict': {
'id': '284002',
'display_id': 'just-out-of-the-shower-joi',
'ext': 'mp4',
'title': 'Just Out Of The Shower JOI - Shooshtime',
'thumbnail': 'https://i.shoosh.co/contents/videos_screenshots/284000/284002/preview.mp4.jpg',
'height': 720,
'age_limit': 18,
},
},
]
_CORRUPTED_SCHEME_CONVERSION_TABLE = {
@@ -2282,43 +2293,87 @@ class GenericIE(InfoExtractor):
'entries': entries,
}
def _kvs_getrealurl(self, video_url, license_code):
@classmethod
def _kvs_get_real_url(cls, video_url, license_code):
if not video_url.startswith('function/0/'):
return video_url # not obfuscated
url_path, _, url_query = video_url.partition('?')
urlparts = url_path.split('/')[2:]
license = self._kvs_getlicensetoken(license_code)
newmagic = urlparts[5][:32]
parsed = urllib.parse.urlparse(video_url[len('function/0/'):])
license = cls._kvs_get_license_token(license_code)
urlparts = parsed.path.split('/')
for o in range(len(newmagic) - 1, -1, -1):
new = ''
l = (o + sum(int(n) for n in license[o:])) % 32
HASH_LENGTH = 32
hash = urlparts[3][:HASH_LENGTH]
indices = list(range(HASH_LENGTH))
for i in range(0, len(newmagic)):
if i == o:
new += newmagic[l]
elif i == l:
new += newmagic[o]
else:
new += newmagic[i]
newmagic = new
# Swap indices of hash according to the destination calculated from the license token
accum = 0
for src in reversed(range(HASH_LENGTH)):
accum += license[src]
dest = (src + accum) % HASH_LENGTH
indices[src], indices[dest] = indices[dest], indices[src]
urlparts[5] = newmagic + urlparts[5][32:]
return '/'.join(urlparts) + '?' + url_query
urlparts[3] = ''.join(hash[index] for index in indices) + urlparts[3][HASH_LENGTH:]
return urllib.parse.urlunparse(parsed._replace(path='/'.join(urlparts)))
def _kvs_getlicensetoken(self, license):
modlicense = license.replace('$', '').replace('0', '1')
center = int(len(modlicense) / 2)
@staticmethod
def _kvs_get_license_token(license):
license = license.replace('$', '')
license_values = [int(char) for char in license]
modlicense = license.replace('0', '1')
center = len(modlicense) // 2
fronthalf = int(modlicense[:center + 1])
backhalf = int(modlicense[center:])
modlicense = str(4 * abs(fronthalf - backhalf))[:center + 1]
modlicense = str(4 * abs(fronthalf - backhalf))
retval = ''
for o in range(0, center + 1):
for i in range(1, 5):
retval += str((int(license[o + i]) + int(modlicense[o])) % 10)
return retval
return [
(license_values[index + offset] + current) % 10
for index, current in enumerate(map(int, modlicense))
for offset in range(4)
]
def _extract_kvs(self, url, webpage, video_id):
flashvars = self._search_json(
r'(?s:<script\b[^>]*>.*?var\s+flashvars\s*=)',
webpage, 'flashvars', video_id, transform_source=js_to_json)
# extract the part after the last / as the display_id from the
# canonical URL.
display_id = self._search_regex(
r'(?:<link href="https?://[^"]+/(.+?)/?" rel="canonical"\s*/?>'
r'|<link rel="canonical" href="https?://[^"]+/(.+?)/?"\s*/?>)',
webpage, 'display_id', fatal=False)
title = self._html_search_regex(r'<(?:h1|title)>(?:Video: )?(.+?)</(?:h1|title)>', webpage, 'title')
thumbnail = flashvars['preview_url']
if thumbnail.startswith('//'):
protocol, _, _ = url.partition('/')
thumbnail = protocol + thumbnail
url_keys = list(filter(re.compile(r'^video_(?:url|alt_url\d*)$').match, flashvars.keys()))
formats = []
for key in url_keys:
if '/get_file/' not in flashvars[key]:
continue
format_id = flashvars.get(f'{key}_text', key)
formats.append({
'url': urljoin(url, self._kvs_get_real_url(flashvars[key], flashvars['license_code'])),
'format_id': format_id,
'ext': 'mp4',
**(parse_resolution(format_id) or parse_resolution(flashvars[key])),
'http_headers': {'Referer': url},
})
if not formats[-1].get('height'):
formats[-1]['quality'] = 1
return {
'id': flashvars['video_id'],
'display_id': display_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}
def _real_extract(self, url):
if url.startswith('//'):
@@ -2676,6 +2731,17 @@ class GenericIE(InfoExtractor):
self.report_detected('video.js embed')
return [{'formats': formats, 'subtitles': subtitles}]
# Look for generic KVS player (before json-ld bc of some urls that break otherwise)
found = self._search_regex((
r'<script\b[^>]+?\bsrc\s*=\s*(["\'])https?://(?:\S+?/)+kt_player\.js\?v=(?P<ver>\d+(?:\.\d+)+)\1[^>]*>',
r'kt_player\s*\(\s*(["\'])(?:(?!\1)[\w\W])+\1\s*,\s*(["\'])https?://(?:\S+?/)+kt_player\.swf\?v=(?P<ver>\d+(?:\.\d+)+)\2\s*,',
), webpage, 'KVS player', group='ver', default=False)
if found:
self.report_detected('KWS Player')
if found.split('.')[0] not in ('4', '5', '6'):
self.report_warning(f'Untested major version ({found}) in player engine - download may fail.')
return [self._extract_kvs(url, webpage, video_id)]
# Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(webpage, video_id, default={})
if json_ld.get('url') not in (url, None):
@@ -2718,52 +2784,6 @@ class GenericIE(InfoExtractor):
['"]?file['"]?\s*:\s*["\'](.*?)["\']''', webpage))
if found:
self.report_detected('JW Player embed')
if not found:
# Look for generic KVS player
found = re.search(r'<script [^>]*?src="https?://.+?/kt_player\.js\?v=(?P<ver>(?P<maj_ver>\d+)(\.\d+)+)".*?>', webpage)
if found:
self.report_detected('KWS Player')
if found.group('maj_ver') not in ['4', '5']:
self.report_warning('Untested major version (%s) in player engine--Download may fail.' % found.group('ver'))
flashvars = re.search(r'(?ms)<script.*?>.*?var\s+flashvars\s*=\s*(\{.*?\});.*?</script>', webpage)
flashvars = self._parse_json(flashvars.group(1), video_id, transform_source=js_to_json)
# extract the part after the last / as the display_id from the
# canonical URL.
display_id = self._search_regex(
r'(?:<link href="https?://[^"]+/(.+?)/?" rel="canonical"\s*/?>'
r'|<link rel="canonical" href="https?://[^"]+/(.+?)/?"\s*/?>)',
webpage, 'display_id', fatal=False
)
title = self._html_search_regex(r'<(?:h1|title)>(?:Video: )?(.+?)</(?:h1|title)>', webpage, 'title')
thumbnail = flashvars['preview_url']
if thumbnail.startswith('//'):
protocol, _, _ = url.partition('/')
thumbnail = protocol + thumbnail
url_keys = list(filter(re.compile(r'video_url|video_alt_url\d*').fullmatch, flashvars.keys()))
formats = []
for key in url_keys:
if '/get_file/' not in flashvars[key]:
continue
format_id = flashvars.get(f'{key}_text', key)
formats.append({
'url': self._kvs_getrealurl(flashvars[key], flashvars['license_code']),
'format_id': format_id,
'ext': 'mp4',
**(parse_resolution(format_id) or parse_resolution(flashvars[key]))
})
if not formats[-1].get('height'):
formats[-1]['quality'] = 1
return [{
'id': flashvars['video_id'],
'display_id': display_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}]
if not found:
# Broaden the search a little bit
found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage))
@@ -2844,6 +2864,7 @@ class GenericIE(InfoExtractor):
entries = []
for video_url in orderedSet(found):
video_url = video_url.encode().decode('unicode-escape')
video_url = unescapeHTML(video_url)
video_url = video_url.replace('\\/', '/')
video_url = urllib.parse.urljoin(url, video_url)

View File

@@ -1,71 +1,128 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
get_element_by_class,
int_or_none,
merge_dicts,
url_or_none,
)
class PeekVidsIE(InfoExtractor):
class PeekVidsBaseIE(InfoExtractor):
def _real_extract(self, url):
domain, video_id = self._match_valid_url(url).group('domain', 'id')
webpage = self._download_webpage(url, video_id, expected_status=429)
if '>Rate Limit Exceeded' in webpage:
raise ExtractorError(
f'You are suspected as a bot. Wait, or pass the captcha on the site and provide cookies. {self._login_hint()}',
video_id=video_id, expected=True)
title = self._html_search_regex(r'(?s)<h1\b[^>]*>(.+?)</h1>', webpage, 'title')
display_id = video_id
video_id = self._search_regex(r'(?s)<video\b[^>]+\bdata-id\s*=\s*["\']?([\w-]+)', webpage, 'short video ID')
srcs = self._download_json(
f'https://www.{domain}/v-alt/{video_id}', video_id,
note='Downloading list of source files')
formats = []
for k, v in srcs.items():
f_url = url_or_none(v)
if not f_url:
continue
height = self._search_regex(r'^data-src(\d{3,})$', k, 'height', default=None)
if not height:
continue
formats.append({
'url': f_url,
'format_id': height,
'height': int_or_none(height),
})
if not formats:
formats = [{'url': url} for url in srcs.values()]
info = self._search_json_ld(webpage, video_id, expected_type='VideoObject', default={})
info.pop('url', None)
# may not have found the thumbnail if it was in a list in the ld+json
info.setdefault('thumbnail', self._og_search_thumbnail(webpage))
detail = (get_element_by_class('detail-video-block', webpage)
or get_element_by_class('detail-block', webpage) or '')
info['description'] = self._html_search_regex(
rf'(?s)(.+?)(?:{re.escape(info.get("description", ""))}\s*<|<ul\b)',
detail, 'description', default=None) or None
info['title'] = re.sub(r'\s*[,-][^,-]+$', '', info.get('title') or title) or self._generic_title(url)
def cat_tags(name, html):
l = self._html_search_regex(
rf'(?s)<span\b[^>]*>\s*{re.escape(name)}\s*:\s*</span>(.+?)</li>',
html, name, default='')
return list(filter(None, re.split(r'\s+', l)))
return merge_dicts({
'id': video_id,
'display_id': display_id,
'age_limit': 18,
'formats': formats,
'categories': cat_tags('Categories', detail),
'tags': cat_tags('Tags', detail),
'uploader': self._html_search_regex(r'[Uu]ploaded\s+by\s(.+?)"', webpage, 'uploader', default=None),
}, info)
class PeekVidsIE(PeekVidsBaseIE):
_VALID_URL = r'''(?x)
https?://(?:www\.)?peekvids\.com/
https?://(?:www\.)?(?P<domain>peekvids\.com)/
(?:(?:[^/?#]+/){2}|embed/?\?(?:[^#]*&)?v=)
(?P<id>[^/?&#]*)
'''
_TESTS = [{
'url': 'https://peekvids.com/pc/dane-jones-cute-redhead-with-perfect-tits-with-mini-vamp/BSyLMbN0YCd',
'md5': 'a00940646c428e232407e3e62f0e8ef5',
'md5': '2ff6a357a9717dc9dc9894b51307e9a2',
'info_dict': {
'id': 'BSyLMbN0YCd',
'title': ' Dane Jones - Cute redhead with perfect tits with Mini Vamp, SEXYhub',
'id': '1262717',
'display_id': 'BSyLMbN0YCd',
'title': ' Dane Jones - Cute redhead with perfect tits with Mini Vamp',
'ext': 'mp4',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Watch Dane Jones - Cute redhead with perfect tits with Mini Vamp (7 min), uploaded by SEXYhub.com',
'description': 'md5:0a61df3620de26c0af8963b1a730cd69',
'timestamp': 1642579329,
'upload_date': '20220119',
'duration': 416,
'view_count': int,
'age_limit': 18,
'uploader': 'SEXYhub.com',
'categories': list,
'tags': list,
},
}]
_DOMAIN = 'www.peekvids.com'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
short_video_id = self._html_search_regex(r'<video [^>]*data-id="(.+?)"', webpage, 'short video ID')
srcs = self._download_json(
f'https://{self._DOMAIN}/v-alt/{short_video_id}', video_id,
note='Downloading list of source files')
formats = [{
'url': url,
'ext': 'mp4',
'format_id': name[8:],
} for name, url in srcs.items() if len(name) > 8 and name.startswith('data-src')]
if not formats:
formats = [{'url': url} for url in srcs.values()]
info = self._search_json_ld(webpage, video_id, expected_type='VideoObject')
info.update({
'id': video_id,
'age_limit': 18,
'formats': formats,
})
return info
class PlayVidsIE(PeekVidsIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?://(?:www\.)?playvids\.com/(?:embed/|[^/]{2}/)?(?P<id>[^/?#]*)'
class PlayVidsIE(PeekVidsBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?P<domain>playvids\.com)/(?:embed/|\w\w?/)?(?P<id>[^/?#]*)'
_TESTS = [{
'url': 'https://www.playvids.com/U3pBrYhsjXM/pc/dane-jones-cute-redhead-with-perfect-tits-with-mini-vamp',
'md5': 'cd7dfd8a2e815a45402369c76e3c1825',
'md5': '2f12e50213dd65f142175da633c4564c',
'info_dict': {
'id': 'U3pBrYhsjXM',
'title': ' Dane Jones - Cute redhead with perfect tits with Mini Vamp, SEXYhub',
'id': '1978030',
'display_id': 'U3pBrYhsjXM',
'title': ' Dane Jones - Cute redhead with perfect tits with Mini Vamp',
'ext': 'mp4',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'Watch Dane Jones - Cute redhead with perfect tits with Mini Vamp video in HD, uploaded by SEXYhub.com',
'description': 'md5:0a61df3620de26c0af8963b1a730cd69',
'timestamp': 1640435839,
'upload_date': '20211225',
'duration': 416,
'view_count': int,
'age_limit': 18,
'uploader': 'SEXYhub.com',
'categories': list,
'tags': list,
},
}, {
'url': 'https://www.playvids.com/es/U3pBrYhsjXM/pc/dane-jones-cute-redhead-with-perfect-tits-with-mini-vamp',
@@ -73,5 +130,62 @@ class PlayVidsIE(PeekVidsIE): # XXX: Do not subclass from concrete IE
}, {
'url': 'https://www.playvids.com/embed/U3pBrYhsjXM',
'only_matching': True,
}, {
'url': 'https://www.playvids.com/bKmGLe3IwjZ/sv/brazzers-800-phone-sex-madison-ivy-always-on-the-line',
'md5': 'e783986e596cafbf46411a174ab42ba6',
'info_dict': {
'id': '762385',
'display_id': 'bKmGLe3IwjZ',
'ext': 'mp4',
'title': 'Brazzers - 1 800 Phone Sex: Madison Ivy Always On The Line 6',
'description': 'md5:bdcd2db2b8ad85831a491d7c8605dcef',
'timestamp': 1516958544,
'upload_date': '20180126',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 480,
'uploader': 'Brazzers',
'age_limit': 18,
'view_count': int,
'age_limit': 18,
'categories': list,
'tags': list,
},
}, {
'url': 'https://www.playvids.com/v/47iUho33toY',
'md5': 'b056b5049d34b648c1e86497cf4febce',
'info_dict': {
'id': '700621',
'display_id': '47iUho33toY',
'ext': 'mp4',
'title': 'KATEE OWEN STRIPTIASE IN SEXY RED LINGERIE',
'description': None,
'timestamp': 1507052209,
'upload_date': '20171003',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 332,
'uploader': 'Cacerenele',
'age_limit': 18,
'view_count': int,
'categories': list,
'tags': list,
},
}, {
'url': 'https://www.playvids.com/z3_7iwWCmqt/sexy-teen-filipina-striptease-beautiful-pinay-bargirl-strips-and-dances',
'md5': 'efa09be9f031314b7b7e3bc6510cd0df',
'info_dict': {
'id': '1523518',
'display_id': 'z3_7iwWCmqt',
'ext': 'mp4',
'title': 'SEXY TEEN FILIPINA STRIPTEASE - Beautiful Pinay Bargirl Strips and Dances',
'description': None,
'timestamp': 1607470323,
'upload_date': '20201208',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 593,
'uploader': 'yorours',
'age_limit': 18,
'view_count': int,
'categories': list,
'tags': list,
},
}]
_DOMAIN = 'www.playvids.com'

226
yt_dlp/extractor/thisvid.py Normal file
View File

@@ -0,0 +1,226 @@
import itertools
import re
import urllib.parse
from .common import InfoExtractor
from ..utils import (
clean_html,
get_element_by_class,
int_or_none,
url_or_none,
urljoin,
)
class ThisVidIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?thisvid\.com/(?P<type>videos|embed)/(?P<id>[A-Za-z0-9-]+)'
_TESTS = [{
'url': 'https://thisvid.com/videos/sitting-on-ball-tight-jeans/',
'md5': '839becb572995687e11a69dc4358a386',
'info_dict': {
'id': '3533241',
'ext': 'mp4',
'title': 'Sitting on ball tight jeans',
'description': 'md5:372353bb995883d1b65fddf507489acd',
'thumbnail': r're:https?://\w+\.thisvid\.com/(?:[^/]+/)+3533241/preview\.jpg',
'uploader_id': '150629',
'uploader': 'jeanslevisjeans',
'display_id': 'sitting-on-ball-tight-jeans',
'age_limit': 18,
}
}, {
'url': 'https://thisvid.com/embed/3533241/',
'md5': '839becb572995687e11a69dc4358a386',
'info_dict': {
'id': '3533241',
'ext': 'mp4',
'title': 'Sitting on ball tight jeans',
'thumbnail': r're:https?://\w+\.thisvid\.com/(?:[^/]+/)+3533241/preview\.jpg',
'uploader_id': '150629',
'uploader': 'jeanslevisjeans',
'display_id': 'sitting-on-ball-tight-jeans',
'age_limit': 18,
}
}]
def _real_extract(self, url):
main_id, type_ = re.match(self._VALID_URL, url).group('id', 'type')
webpage = self._download_webpage(url, main_id)
title = self._html_search_regex(
r'<title\b[^>]*?>(?:Video:\s+)?(.+?)(?:\s+-\s+ThisVid(?:\.com| tube))?</title>',
webpage, 'title')
if type_ == 'embed':
# look for more metadata
video_alt_url = url_or_none(self._search_regex(
rf'''video_alt_url\s*:\s+'({self._VALID_URL}/)',''',
webpage, 'video_alt_url', default=None))
if video_alt_url and video_alt_url != url:
webpage = self._download_webpage(
video_alt_url, main_id,
note='Redirecting embed to main page', fatal=False) or webpage
video_holder = get_element_by_class('video-holder', webpage) or ''
if '>This video is a private video' in video_holder:
self.raise_login_required(
(clean_html(video_holder) or 'Private video').partition('\n')[0])
uploader = self._html_search_regex(
r'''(?s)<span\b[^>]*>Added by:\s*</span><a\b[^>]+\bclass\s*=\s*["']author\b[^>]+\bhref\s*=\s*["']https://thisvid\.com/members/([0-9]+/.{3,}?)\s*</a>''',
webpage, 'uploader', default='')
uploader = re.split(r'''/["'][^>]*>\s*''', uploader)
if len(uploader) == 2:
# id must be non-empty, uploader could be ''
uploader_id, uploader = uploader
uploader = uploader or None
else:
uploader_id = uploader = None
return self.url_result(
url, ie='Generic', url_transparent=True,
title=title,
age_limit=18,
uploader=uploader,
uploader_id=uploader_id)
class ThisVidPlaylistBaseIE(InfoExtractor):
_PLAYLIST_URL_RE = None
@classmethod
def _find_urls(cls, html):
for m in re.finditer(rf'''<a\b[^>]+\bhref\s*=\s*["'](?P<url>{cls._PLAYLIST_URL_RE}\b)[^>]+>''', html):
yield m.group('url')
def _generate_playlist_entries(self, url, playlist_id, html=None):
page_url = url
for page in itertools.count(1):
if not html:
html = self._download_webpage(
page_url, playlist_id, note=f'Downloading page {page}',
fatal=False) or ''
yield from self._find_urls(html)
next_page = get_element_by_class('pagination-next', html) or ''
if next_page:
# member list page
next_page = urljoin(url, self._search_regex(
r'''<a\b[^>]+\bhref\s*=\s*("|')(?P<url>(?!#)(?:(?!\1).)+)''',
next_page, 'next page link', group='url', default=None))
# in case a member page should have pagination-next with empty link, not just `else:`
if next_page is None:
# playlist page
parsed_url = urllib.parse.urlparse(page_url)
base_path, _, num = parsed_url.path.rpartition('/')
num = int_or_none(num)
if num is None:
base_path, num = parsed_url.path.rstrip('/'), 1
parsed_url = parsed_url._replace(path=f'{base_path}/{num + 1}')
next_page = urllib.parse.urlunparse(parsed_url)
if page_url == next_page:
next_page = None
if not next_page:
return
page_url, html = next_page, None
def _make_playlist_result(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title = re.split(
r'(?i)\s*\|\s*ThisVid\.com\s*$',
self._og_search_title(webpage, default=None)
or self._html_search_regex(r'(?s)<title\b[^>]*>(.+?)</title', webpage, 'title', fatal=False) or '', 1)[0] or None
return self.playlist_from_matches(
self._generate_playlist_entries(url, playlist_id, webpage),
playlist_id=playlist_id, playlist_title=title, ie=ThisVidIE)
class ThisVidMemberIE(ThisVidPlaylistBaseIE):
_VALID_URL = r'https?://thisvid\.com/members/(?P<id>\d+)'
_TESTS = [{
'url': 'https://thisvid.com/members/2140501/',
'info_dict': {
'id': '2140501',
'title': 'Rafflesia\'s Profile',
},
'playlist_mincount': 16,
}, {
'url': 'https://thisvid.com/members/2140501/favourite_videos/',
'info_dict': {
'id': '2140501',
'title': 'Rafflesia\'s Favourite Videos',
},
'playlist_mincount': 15,
}, {
'url': 'https://thisvid.com/members/636468/public_videos/',
'info_dict': {
'id': '636468',
'title': 'Happymouth\'s Public Videos',
},
'playlist_mincount': 196,
}]
_PLAYLIST_URL_RE = ThisVidIE._VALID_URL
def _real_extract(self, url):
return self._make_playlist_result(url)
class ThisVidPlaylistIE(ThisVidPlaylistBaseIE):
_VALID_URL = r'https?://thisvid\.com/playlist/(?P<id>\d+)/video/(?P<video_id>[A-Za-z0-9-]+)'
_TESTS = [{
'url': 'https://thisvid.com/playlist/6615/video/big-italian-booty-28/',
'info_dict': {
'id': '6615',
'title': 'Underwear Stuff',
},
'playlist_mincount': 200,
}, {
'url': 'https://thisvid.com/playlist/6615/video/big-italian-booty-28/',
'info_dict': {
'id': '1072387',
'ext': 'mp4',
'title': 'Big Italian Booty 28',
'description': 'md5:1bccf7b13765e18fb27bf764dba7ede2',
'uploader_id': '367912',
'uploader': 'Jcmusclefun',
'age_limit': 18,
'display_id': 'big-italian-booty-28',
'thumbnail': r're:https?://\w+\.thisvid\.com/(?:[^/]+/)+1072387/preview\.jpg',
},
'params': {
'noplaylist': True,
},
}]
_PLAYLIST_URL_RE = _VALID_URL
def _generate_playlist_entries(self, url, playlist_id, html=None):
for wrapped_url in super()._generate_playlist_entries(url, playlist_id, html):
video_id = re.match(self._VALID_URL, wrapped_url).group('video_id')
yield urljoin(url, f'/videos/{video_id}/')
def _real_extract(self, url):
playlist_id, video_id = self._match_valid_url(url).group('id', 'video_id')
if not self._yes_playlist(playlist_id, video_id):
redirect_url = urljoin(url, f'/videos/{video_id}/')
return self.url_result(redirect_url, ThisVidIE)
result = self._make_playlist_result(url)
# Fix duplicated title (`the title - the title` => `the title`)
title = result['title']
t_len = len(title)
if t_len > 5 and t_len % 2 != 0:
t_len = t_len // 2
if title[t_len] == '-':
first, second = map(str.strip, (title[:t_len], title[t_len + 1:]))
if first and first == second:
result['title'] = first
return result

View File

@@ -270,9 +270,9 @@ class ZenYandexIE(InfoExtractor):
for s_url in stream_urls:
ext = determine_ext(s_url)
if ext == 'mpd':
formats.extend(self._extract_mpd_formats(s_url, id, mpd_id='dash'))
formats.extend(self._extract_mpd_formats(s_url, video_id, mpd_id='dash'))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(s_url, id, 'mp4'))
formats.extend(self._extract_m3u8_formats(s_url, video_id, 'mp4'))
return {
'id': video_id,
'title': video_json.get('title') or self._og_search_title(webpage),

View File

@@ -291,7 +291,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
"""Provide base functions for Youtube extractors"""
_RESERVED_NAMES = (
r'channel|c|user|playlist|watch|w|v|embed|e|watch_popup|clip|'
r'channel|c|user|playlist|watch|w|v|embed|e|live|watch_popup|clip|'
r'shorts|movies|results|search|shared|hashtag|trending|explore|feed|feeds|'
r'browse|oembed|get_video_info|iframe_api|s/player|source|'
r'storefront|oops|index|account|t/terms|about|upload|signin|logout')
@@ -3712,7 +3712,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'url': fmt_url,
'width': int_or_none(fmt.get('width')),
'language': join_nonempty(audio_track.get('id', '').split('.')[0],
'desc' if language_preference < -1 else ''),
'desc' if language_preference < -1 else '') or None,
'language_preference': language_preference,
# Strictly de-prioritize damaged and 3gp formats
'preference': -10 if is_damaged else -2 if itag == '17' else None,

View File

@@ -113,7 +113,7 @@ def parseOpts(overrideArguments=None, ignore_config_files='if_override'):
opts = optparse.Values({'verbose': True, 'print_help': False})
try:
try:
if overrideArguments:
if overrideArguments is not None:
root.append_config(overrideArguments, label='Override')
else:
root.append_config(sys.argv[1:], label='Command-line')
@@ -668,8 +668,9 @@ def create_parser():
'--date',
metavar='DATE', dest='date', default=None,
help=(
'Download only videos uploaded on this date. The date can be "YYYYMMDD" or in the format '
'[now|today|yesterday][-N[day|week|month|year]]. E.g. --date today-2weeks'))
'Download only videos uploaded on this date. '
'The date can be "YYYYMMDD" or in the format [now|today|yesterday][-N[day|week|month|year]]. '
'E.g. "--date today-2weeks" downloads only videos uploaded on the same day two weeks ago'))
selection.add_option(
'--datebefore',
metavar='DATE', dest='datebefore', default=None,
@@ -1000,11 +1001,11 @@ def create_parser():
'This option can be used multiple times to set the sleep for the different retry types, '
'e.g. --retry-sleep linear=1::2 --retry-sleep fragment:exp=1:20'))
downloader.add_option(
'--skip-unavailable-fragments', '--no-abort-on-unavailable-fragment',
'--skip-unavailable-fragments', '--no-abort-on-unavailable-fragments',
action='store_true', dest='skip_unavailable_fragments', default=True,
help='Skip unavailable fragments for DASH, hlsnative and ISM downloads (default) (Alias: --no-abort-on-unavailable-fragment)')
help='Skip unavailable fragments for DASH, hlsnative and ISM downloads (default) (Alias: --no-abort-on-unavailable-fragments)')
downloader.add_option(
'--abort-on-unavailable-fragment', '--no-skip-unavailable-fragments',
'--abort-on-unavailable-fragments', '--no-skip-unavailable-fragments',
action='store_false', dest='skip_unavailable_fragments',
help='Abort download if a fragment is unavailable (Alias: --no-skip-unavailable-fragments)')
downloader.add_option(
@@ -1802,7 +1803,7 @@ def create_parser():
'Supported values of "WHEN" are the same as that of --use-postprocessor (default: after_move). '
'Same syntax as the output template can be used to pass any field as arguments to the command. '
'After download, an additional field "filepath" that contains the final path of the downloaded file '
'is also available, and if no fields are passed, %(filepath)q is appended to the end of the command. '
'is also available, and if no fields are passed, %(filepath,_filename|)q is appended to the end of the command. '
'This option can be used multiple times'))
postproc.add_option(
'--no-exec',

View File

@@ -468,7 +468,7 @@ class FFmpegPostProcessor(PostProcessor, RunsFFmpeg, ShowsProgress):
"""
concat_file = f'{out_file}.concat'
self.write_debug(f'Writing concat spec to {concat_file}')
with open(concat_file, 'wt', encoding='utf-8') as f:
with open(concat_file, 'w', encoding='utf-8') as f:
f.writelines(self._concat_spec(in_files, concat_opts))
out_flags = list(self.stream_copy_opts(ext=determine_ext(out_file)))
@@ -783,7 +783,7 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
@staticmethod
def _get_chapter_opts(chapters, metadata_filename):
with open(metadata_filename, 'wt', encoding='utf-8') as f:
with open(metadata_filename, 'w', encoding='utf-8') as f:
def ffmpeg_escape(text):
return re.sub(r'([\\=;#\n])', r'\\\1', text)
@@ -1058,7 +1058,7 @@ class FFmpegSubtitlesConvertorPP(FFmpegPostProcessor):
with open(dfxp_file, 'rb') as f:
srt_data = dfxp2srt(f.read())
with open(srt_file, 'wt', encoding='utf-8') as f:
with open(srt_file, 'w', encoding='utf-8') as f:
f.write(srt_data)
old_file = srt_file

View File

@@ -3395,7 +3395,7 @@ def js_to_json(code, vars={}, *, strict=False):
try:
if not strict:
json.loads(vars[v])
except json.decoder.JSONDecodeError:
except json.JSONDecodeError:
return json.dumps(vars[v])
else:
return vars[v]
@@ -5466,7 +5466,7 @@ def traverse_obj(
The keys in the path can be one of:
- `None`: Return the current object.
- `str`/`int`: Return `obj[key]`. For `re.Match, return `obj.group(key)`.
- `str`/`int`: Return `obj[key]`. For `re.Match`, return `obj.group(key)`.
- `slice`: Branch out and return all values in `obj[key]`.
- `Ellipsis`: Branch out and return a list of all values.
- `tuple`/`list`: Branch out and return a list of all matching values.

View File

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2022.11.11'
__version__ = '2023.01.02'
RELEASE_GIT_HEAD = '8b644025b'
RELEASE_GIT_HEAD = 'd83b0ad80'
VARIANT = None