[cleanup, docs] Misc cleanup

Closes #2828, closes #2734, closes #2802, closes #2937
[xinpianchang] Add extractor (#2963)
2024-11-09 19:00:39 +00:00 · 2022-03-08 22:38:06 +05:30 · 2022-03-08 08:55:40 -08:00 · 2022-03-08 08:52:51 -08:00 · 2022-03-08 08:48:35 -08:00 · 2022-03-08 08:45:23 -08:00
30 changed files with 537 additions and 259 deletions
--- a/.gitignore
+++ b/.gitignore
@ -24,6 +24,7 @@ cookies

 *.3gp
 *.ape
+*.ass
 *.avi
 *.desktop
 *.flac
@ -106,6 +107,7 @@ yt-dlp.zip
 *.iml
 .vscode
 *.sublime-*
+*.code-workspace

 # Lazy extractors
 */extractor/lazy_extractors.py
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@ -11,6 +11,7 @@
    - [Is anyone going to need the feature?](#is-anyone-going-to-need-the-feature)
    - [Is your question about yt-dlp?](#is-your-question-about-yt-dlp)
    - [Are you willing to share account details if needed?](#are-you-willing-to-share-account-details-if-needed)
+    - [Is the website primarily used for piracy?](#is-the-website-primarily-used-for-piracy)
 - [DEVELOPER INSTRUCTIONS](#developer-instructions)
    - [Adding new feature or making overarching changes](#adding-new-feature-or-making-overarching-changes)
    - [Adding support for a new site](#adding-support-for-a-new-site)
@ -24,6 +25,7 @@
        - [Collapse fallbacks](#collapse-fallbacks)
        - [Trailing parentheses](#trailing-parentheses)
        - [Use convenience conversion and parsing functions](#use-convenience-conversion-and-parsing-functions)
+    - [My pull request is labeled pending-fixes](#my-pull-request-is-labeled-pending-fixes)
 - [EMBEDDING YT-DLP](README.md#embedding-yt-dlp)


@ -123,6 +125,10 @@ While these steps won't necessarily ensure that no misuse of the account takes p
 - Change the password before sharing the account to something random (use [this](https://passwordsgenerator.net/) if you don't have a random password generator).
 - Change the password after receiving the account back.

+### Is the website primarily used for piracy?
+
+We follow [youtube-dl's policy](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) of not supporting services that are primarily used for infringing copyright. Additionally, it has been decided not to support porn sites that specialize in deep fakes. We also cannot support any service that serves only [DRM protected content](https://en.wikipedia.org/wiki/Digital_rights_management).
+



@ -210,7 +216,7 @@ After you have ensured this site is distributing its content legally, you can fo
            }
    ```
 1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py).
-1. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
+1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
 1. Make sure you have atleast one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
 1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
 1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
@ -658,6 +664,10 @@ duration = float_or_none(video.get('durationMs'), scale=1000)
 view_count = int_or_none(video.get('views'))
 ```

+## My pull request is labeled pending-fixes
+
+The `pending-fixes` label is added when there are changes requested to a PR. When the necessary changes are made, the label should be removed. However, despite our best efforts, it may sometimes happen that the maintainer did not see the changes or forgot to remove the label. If your PR is still marked as `pending-fixes` a few days after all requested changes have been made, feel free to ping the maintainer who labeled your issue and ask them to re-review and remove the label.
+



--- a/4
+++ b/4
@ -146,7 +146,7 @@ chio0hai
 cntrl-s
 Deer-Spangle
 DEvmIb
-Grabien
+Grabien/MaximVol
 j54vc1bk
 mpeter50
 mrpapersonic
@ -160,7 +160,7 @@ PilzAdam
 zmousm
 iw0nderhow
 unit193
-TwoThousandHedgehogs
+TwoThousandHedgehogs/KathrynElrod
 Jertzukka
 cypheron
 Hyeeji
--- a/2
+++ b/2
@ -16,7 +16,7 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites com
 clean-test:
 	rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
 	*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
-	*.3gp *.ape *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
+	*.3gp *.ape *.ass *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
 	*.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
 clean-dist:
 	rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \
--- a/README.md
+++ b/README.md
@ -112,7 +112,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t

 * **Other new options**: Many new options have been added such as `--concat-playlist`, `--print`, `--wait-for-video`, `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc

-* **Improvements**: Regex and other operators in `--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
+* **Improvements**: Regex and other operators in `--format`/`--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc

 * **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details

@ -130,7 +130,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
 * The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
 * The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
 * Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
-* `--ignore-errors` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
+* `--no-abort-on-error` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
 * When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files
 * `--add-metadata` attaches the `infojson` to `mkv` files in addition to writing the metadata when used with `--write-info-json`. Use `--no-embed-info-json` or `--compat-options no-attach-info-json` to revert this
 * Some metadata are embedded into different fields when using `--add-metadata` as compared to youtube-dl. Most notably, `comment` field contains the `webpage_url` and `synopsis` contains the `description`. You can [use `--parse-metadata`](#modifying-metadata) to modify this to your liking or use `--compat-options embed-metadata` to revert this
@ -267,7 +267,8 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
 * [**pycryptodomex**](https://github.com/Legrandin/pycryptodome) - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD2](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
 * [**websockets**](https://github.com/aaugustin/websockets) - For downloading over websocket. Licensed under [BSD3](https://github.com/aaugustin/websockets/blob/main/LICENSE)
 * [**secretstorage**](https://github.com/mitya57/secretstorage) - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
-* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen is not present. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
+* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen/ffmpeg cannot. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
+* [**brotli**](https://github.com/google/brotli) or [**brotlicffi**](https://github.com/python-hyper/brotlicffi) - [Brotli](https://en.wikipedia.org/wiki/Brotli) content encoding support. Both licensed under MIT <sup>[1](https://github.com/google/brotli/blob/master/LICENSE) [2](https://github.com/python-hyper/brotlicffi/blob/master/LICENSE) </sup>
 * [**rtmpdump**](http://rtmpdump.mplayerhq.hu) - For downloading `rtmp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](http://rtmpdump.mplayerhq.hu)
 * [**mplayer**](http://mplayerhq.hu/design7/info.html) or [**mpv**](https://mpv.io) - For downloading `rstp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](https://github.com/mpv-player/mpv/blob/master/Copyright)
 * [**phantomjs**](https://github.com/ariya/phantomjs) - Used in extractors where javascript needs to be run. Licensed under [BSD3](https://github.com/ariya/phantomjs/blob/master/LICENSE.BSD)
@ -278,13 +279,14 @@ To use or redistribute the dependencies, you must agree to their respective lice

 The Windows and MacOS standalone release binaries are already built with the python interpreter, mutagen, pycryptodomex and websockets included.

+<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
 **Note**: There are some regressions in newer ffmpeg versions that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds


 ## COMPILE

 **For Windows**:
-To build the Windows executable, you must have pyinstaller (and optionally mutagen, pycryptodomex, websockets). Once you have all the necessary dependencies installed, (optionally) build lazy extractors using `devscripts/make_lazy_extractors.py`, and then just run `pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it.
+To build the Windows executable, you must have pyinstaller (and any of yt-dlp's optional dependencies if needed). Once you have all the necessary dependencies installed, (optionally) build lazy extractors using `devscripts/make_lazy_extractors.py`, and then just run `pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it.

    py -m pip install -U pyinstaller -r requirements.txt
    py devscripts/make_lazy_extractors.py
@ -605,11 +607,11 @@ You can also fork the project on github and run your fork's [build workflow](.gi
                                     --write-description etc. (default)
    --no-write-playlist-metafiles    Do not write playlist metadata when using
                                     --write-info-json, --write-description etc.
-    --clean-infojson                 Remove some private fields such as
+    --clean-info-json                Remove some private fields such as
                                     filenames from the infojson. Note that it
                                     could still contain some personal
                                     information (default)
-    --no-clean-infojson              Write all fields to the infojson
+    --no-clean-info-json             Write all fields to the infojson
    --write-comments                 Retrieve video comments to be placed in the
                                     infojson. The comments are fetched even
                                     without this option if the extraction is
@ -1598,25 +1600,28 @@ This option also has a few special uses:
 * You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)` will download the first vimeo video found in the description
 * You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file. For example, you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values.

+**Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes.
+
 For reference, these are the fields yt-dlp adds by default to the file metadata:

-Metadata fields|From
-:---|:---
-`title`|`track` or `title`
-`date`|`upload_date`
-`description`,  `synopsis`|`description`
-`purl`, `comment`|`webpage_url`
-`track`|`track_number`
-`artist`|`artist`, `creator`, `uploader` or `uploader_id`
-`genre`|`genre`
-`album`|`album`
-`album_artist`|`album_artist`
-`disc`|`disc_number`
-`show`|`series`
-`season_number`|`season_number`
-`episode_id`|`episode` or `episode_id`
-`episode_sort`|`episode_number`
-`language` of each stream|From the format's `language`
+Metadata fields            | From
+:--------------------------|:------------------------------------------------
+`title`                    | `track` or `title`
+`date`                     | `upload_date`
+`description`,  `synopsis` | `description`
+`purl`, `comment`          | `webpage_url`
+`track`                    | `track_number`
+`artist`                   | `artist`, `creator`, `uploader` or `uploader_id`
+`genre`                    | `genre`
+`album`                    | `album`
+`album_artist`             | `album_artist`
+`disc`                     | `disc_number`
+`show`                     | `series`
+`season_number`            | `season_number`
+`episode_id`               | `episode` or `episode_id`
+`episode_sort`             | `episode_number`
+`language` of each stream  | the format's `language`
+
 **Note**: The file format may not support some of these fields


@ -1815,12 +1820,11 @@ ydl_opts = {
    }],
    'logger': MyLogger(),
    'progress_hooks': [my_hook],
+    # Add custom headers
+    'http_headers': {'Referer': 'https://www.google.com'}
 }


-# Add custom headers
-yt_dlp.utils.std_headers.update({'Referer': 'https://www.google.com'})
-
 # ℹ️ See the public functions in yt_dlp.YoutubeDL for for other available functions.
 # Eg: "ydl.download", "ydl.download_with_info_file"
 with yt_dlp.YoutubeDL(ydl_opts) as ydl:
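
The README hunk above replaces a mutation of the module-wide `yt_dlp.utils.std_headers` with an `http_headers` entry in the options dict. A minimal sketch of that pattern follows; `build_ydl_opts` is a hypothetical helper used only for illustration, and only the `http_headers` key is taken from the diff.

```python
def build_ydl_opts(referer):
    """Hypothetical helper: keep custom headers inside the options dict."""
    return {
        # Same key the updated README example uses
        'http_headers': {'Referer': referer},
    }


ydl_opts = build_ydl_opts('https://www.google.com')

# With yt-dlp installed, the dict would be passed as in the README example:
#     with yt_dlp.YoutubeDL(ydl_opts) as ydl:
#         ...
```

Keeping the headers in the dict means two option sets built in the same process do not share header state.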
--- a/devscripts/prepare_manpage.py
+++ b/devscripts/prepare_manpage.py
@ -75,7 +75,11 @@ def filter_options(readme):
    section = re.search(r'(?sm)^# USAGE AND OPTIONS\n.+?(?=^# )', readme).group(0)
    options = '# OPTIONS\n'
    for line in section.split('\n')[1:]:
-        mobj = re.fullmatch(r'\s{4}(?P<opt>-(?:,\s|[^\s])+)(?:\s(?P<meta>([^\s]|\s(?!\s))+))?(\s{2,}(?P<desc>.+))?', line)
+        mobj = re.fullmatch(r'''(?x)
+                \s{4}(?P<opt>-(?:,\s|[^\s])+)
+                (?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
+                (\s{2,}(?P<desc>.+))?
+            ''', line)
        if not mobj:
            options += f'{line.lstrip()}\n'
            continue
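
As a quick check of the reformatted pattern, the sketch below compiles the same verbose regex and applies it to sample help lines. `OPTION_LINE` and `parse_option_line` are names invented here for illustration, and the sample lines are made up in the style of the README option listing.

```python
import re

# Same pattern as in the hunk above; (?x) lets it span several lines
OPTION_LINE = re.compile(r'''(?x)
    \s{4}(?P<opt>-(?:,\s|[^\s])+)
    (?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
    (\s{2,}(?P<desc>.+))?
''')


def parse_option_line(line):
    """Return the named groups for an option line, or None if it is not one."""
    mobj = OPTION_LINE.fullmatch(line)
    return mobj.groupdict() if mobj else None


with_meta = parse_option_line('    -o, --output TEMPLATE            Output filename template')
without_meta = parse_option_line('    --write-comments                 Retrieve video comments')
continuation = parse_option_line('                                     infojson. The comments are fetched even')
```

Lines that do not start with four spaces followed by `-` (such as wrapped description lines) yield `None`, which corresponds to the `if not mobj` branch in `filter_options`.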
--- a/pyinst.py
+++ b/pyinst.py
@ -74,7 +74,7 @@ def version_to_list(version):


 def dependency_options():
-    dependencies = [pycryptodome_module(), 'mutagen'] + collect_submodules('websockets')
+    dependencies = [pycryptodome_module(), 'mutagen', 'brotli'] + collect_submodules('websockets')
    excluded_modules = ['test', 'ytdlp_plugins', 'youtube-dl', 'youtube-dlc']

    yield from (f'--hidden-import={module}' for module in dependencies)
--- a/requirements.txt
+++ b/requirements.txt
@ -1,3 +1,5 @@
 mutagen
 pycryptodomex
 websockets
+brotli; platform_python_implementation=='CPython'
+brotlicffi; platform_python_implementation!='CPython'
--- a/setup.py
+++ b/setup.py
@ -21,9 +21,9 @@ DESCRIPTION = 'A youtube-dl fork with additional features and patches'
 LONG_DESCRIPTION = '\n\n'.join((
    'Official repository: <https://github.com/yt-dlp/yt-dlp>',
    '**PS**: Some links in this document will not work since this is a copy of the README.md from Github',
-    open('README.md', 'r', encoding='utf-8').read()))
+    open('README.md').read()))

-REQUIREMENTS = ['mutagen', 'pycryptodomex', 'websockets']
+REQUIREMENTS = open('requirements.txt').read().splitlines()


 if sys.argv[1:2] == ['py2exe']:
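
Since `REQUIREMENTS` is now read from `requirements.txt`, every line of that file becomes a requirement string, environment markers included. The sketch below is a slightly more defensive variant, not what the commit does: `read_requirements` is a hypothetical helper that also skips blank lines and comments and pins the file encoding.

```python
import os
import tempfile


def read_requirements(path):
    """Hypothetical helper: return non-empty, non-comment requirement lines."""
    with open(path, encoding='utf-8') as f:
        stripped = [line.strip() for line in f]
    return [line for line in stripped if line and not line.startswith('#')]


# Demonstration on a throwaway file that mimics requirements.txt
with tempfile.TemporaryDirectory() as tmp:
    req_path = os.path.join(tmp, 'requirements.txt')
    with open(req_path, 'w', encoding='utf-8') as f:
        f.write("mutagen\n\n# optional\nbrotli; platform_python_implementation=='CPython'\n")
    requirements = read_requirements(req_path)
```

The marker after the semicolon is kept verbatim, so the installer still decides per interpreter whether `brotli` or `brotlicffi` applies.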
--- a/yt_dlp/YoutubeDL.py
+++ b/yt_dlp/YoutubeDL.py
@ -32,6 +32,7 @@ from string import ascii_letters

 from .compat import (
    compat_basestring,
+    compat_brotli,
    compat_get_terminal_size,
    compat_kwargs,
    compat_numeric_types,
@ -234,6 +235,8 @@ class YoutubeDL(object):
                       See "Sorting Formats" for more details.
    format_sort_force: Force the given format_sort. see "Sorting Formats"
                       for more details.
+    prefer_free_formats: Whether to prefer video formats with free containers
+                       over non-free ones of same quality.
    allow_multiple_video_streams:   Allow multiple video streams to be merged
                       into a single file
    allow_multiple_audio_streams:   Allow multiple audio streams to be merged
@ -3675,6 +3678,7 @@ class YoutubeDL(object):
        from .cookies import SQLITE_AVAILABLE, SECRETSTORAGE_AVAILABLE

        lib_str = join_nonempty(
+            compat_brotli and compat_brotli.__name__,
            compat_pycrypto_AES and compat_pycrypto_AES.__name__.split('.')[0],
            SECRETSTORAGE_AVAILABLE and 'secretstorage',
            has_mutagen and 'mutagen',
--- a/yt_dlp/compat.py
+++ b/yt_dlp/compat.py
@ -170,6 +170,13 @@ except ImportError:
    except ImportError:
        compat_pycrypto_AES = None

+try:
+    import brotlicffi as compat_brotli
+except ImportError:
+    try:
+        import brotli as compat_brotli
+    except ImportError:
+        compat_brotli = None

 WINDOWS_VT_MODE = False if compat_os_name == 'nt' else None

@ -258,6 +265,7 @@ __all__ = [
    'compat_asyncio_run',
    'compat_b64decode',
    'compat_basestring',
+    'compat_brotli',
    'compat_chr',
    'compat_collections_abc',
    'compat_cookiejar',
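
The optional import above leaves `compat_brotli` as either a module or `None`. The sketch below shows one way calling code might branch on that; `accepted_encodings` and `decode_body` are hypothetical helpers and are not taken from yt-dlp's network code.

```python
# Same fallback chain as in compat.py
try:
    import brotlicffi as compat_brotli
except ImportError:
    try:
        import brotli as compat_brotli
    except ImportError:
        compat_brotli = None


def accepted_encodings():
    """Hypothetical helper: advertise 'br' only when a brotli module is present."""
    encodings = ['gzip', 'deflate']
    if compat_brotli:
        encodings.append('br')
    return encodings


def decode_body(data, content_encoding):
    """Hypothetical helper: decompress brotli bodies, pass everything else through."""
    if content_encoding == 'br':
        if compat_brotli is None:
            raise RuntimeError('brotli support is not installed')
        return compat_brotli.decompress(data)
    return data
```

Both `brotli` and `brotlicffi` expose a module-level `decompress`, which is why a single alias is enough here.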
--- a/yt_dlp/downloader/youtube_live_chat.py
+++ b/yt_dlp/downloader/youtube_live_chat.py
@ -22,6 +22,9 @@ class YoutubeLiveChatFD(FragmentFD):
    def real_download(self, filename, info_dict):
        video_id = info_dict['video_id']
        self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
+        if not self.params.get('skip_download'):
+            self.report_warning('Live chat download runs until the livestream ends. '
+                                'If you wish to download the video simultaneously, run a separate yt-dlp instance')

        fragment_retries = self.params.get('fragment_retries', 0)
        test = self.params.get('test', False)
--- a/yt_dlp/extractor/abematv.py
+++ b/yt_dlp/extractor/abematv.py
@ -8,10 +8,6 @@ import struct
 from base64 import urlsafe_b64encode
 from binascii import unhexlify

-import typing
-if typing.TYPE_CHECKING:
-    from ..YoutubeDL import YoutubeDL
-
 from .common import InfoExtractor
 from ..aes import aes_ecb_decrypt
 from ..compat import (
@ -36,15 +32,15 @@ from ..utils import (

 # NOTE: network handler related code is temporary thing until network stack overhaul PRs are merged (#2861/#2862)

-def add_opener(self: 'YoutubeDL', handler):
+def add_opener(ydl, handler):
    ''' Add a handler for opening URLs, like _download_webpage '''
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
-    assert isinstance(self._opener, compat_urllib_request.OpenerDirector)
-    self._opener.add_handler(handler)
+    assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
+    ydl._opener.add_handler(handler)


-def remove_opener(self: 'YoutubeDL', handler):
+def remove_opener(ydl, handler):
    '''
    Remove handler(s) for opening URLs
    @param handler Either handler object itself or handler type.
@ -52,8 +48,8 @@ def remove_opener(self: 'YoutubeDL', handler):
    '''
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
-    opener = self._opener
-    assert isinstance(self._opener, compat_urllib_request.OpenerDirector)
+    opener = ydl._opener
+    assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
    if isinstance(handler, (type, tuple)):
        find_cp = lambda x: isinstance(x, handler)
    else:
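
For readers unfamiliar with `OpenerDirector`, the sketch below shows the standard-library mechanism that `add_opener` relies on, using `urllib.request` directly instead of `compat_urllib_request`. `RefererHandler` and the standalone `opener` are assumptions made for illustration; in the extractor the opener is `ydl._opener`.

```python
import urllib.request


class RefererHandler(urllib.request.BaseHandler):
    """Hypothetical handler that adds a header to every outgoing request."""

    def http_request(self, req):
        req.add_unredirected_header('Referer', 'https://example.com')
        return req

    https_request = http_request


def add_opener(opener, handler):
    """Register a handler on an OpenerDirector, mirroring the function in the diff."""
    assert isinstance(opener, urllib.request.OpenerDirector)
    opener.add_handler(handler)


opener = urllib.request.build_opener()
handler = RefererHandler()
add_opener(opener, handler)
```

`add_handler` only registers a handler that defines at least one protocol method such as `http_request`, which is why the example class provides one.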
--- a/yt_dlp/extractor/adobepass.py
+++ b/yt_dlp/extractor/adobepass.py
@ -1345,6 +1345,11 @@ MSO_INFO = {
        'username_field': 'username',
        'password_field': 'password',
    },
+    'Suddenlink': {
+        'name': 'Suddenlink',
+        'username_field': 'username',
+        'password_field': 'password',
+    },
 }


@ -1635,6 +1640,52 @@ class AdobePassIE(InfoExtractor):
                        urlh.geturl(), video_id, 'Sending final bookend',
                        query=hidden_data)

+                    post_form(mvpd_confirm_page_res, 'Confirming Login')
+                elif mso_id == 'Suddenlink':
+                    # Suddenlink is similar to SlingTV in using a tab history count and a meta refresh,
+                    # but they also do a dynamic redirect using javascript that has to be followed as well
+                    first_bookend_page, urlh = post_form(
+                        provider_redirect_page_res, 'Pressing Continue...')
+
+                    hidden_data = self._hidden_inputs(first_bookend_page)
+                    hidden_data['history_val'] = 1
+
+                    provider_login_redirect_page = self._download_webpage(
+                        urlh.geturl(), video_id, 'Sending First Bookend',
+                        query=hidden_data)
+
+                    provider_tryauth_url = self._html_search_regex(
+                        r'url:\s*[\'"]([^\'"]+)', provider_login_redirect_page, 'ajaxurl')
+
+                    provider_tryauth_page = self._download_webpage(
+                        provider_tryauth_url, video_id, 'Submitting TryAuth',
+                        query=hidden_data)
+
+                    provider_login_page_res = self._download_webpage_handle(
+                        f'https://authorize.suddenlink.net/saml/module.php/authSynacor/login.php?AuthState={provider_tryauth_page}',
+                        video_id, 'Getting Login Page',
+                        query=hidden_data)
+
+                    provider_association_redirect, urlh = post_form(
+                        provider_login_page_res, 'Logging in', {
+                            mso_info['username_field']: username,
+                            mso_info['password_field']: password
+                        })
+
+                    provider_refresh_redirect_url = extract_redirect_url(
+                        provider_association_redirect, url=urlh.geturl())
+
+                    last_bookend_page, urlh = self._download_webpage_handle(
+                        provider_refresh_redirect_url, video_id,
+                        'Downloading Auth Association Redirect Page')
+
+                    hidden_data = self._hidden_inputs(last_bookend_page)
+                    hidden_data['history_val'] = 3
+
+                    mvpd_confirm_page_res = self._download_webpage_handle(
+                        urlh.geturl(), video_id, 'Sending Final Bookend',
+                        query=hidden_data)
+
                    post_form(mvpd_confirm_page_res, 'Confirming Login')
                else:
                    # Some providers (e.g. DIRECTV NOW) have another meta refresh
--- a/yt_dlp/extractor/ant1newsgr.py
+++ b/yt_dlp/extractor/ant1newsgr.py
@ -97,8 +97,8 @@ class Ant1NewsGrArticleIE(Ant1NewsGrBaseIE):
        embed_urls = list(Ant1NewsGrEmbedIE._extract_urls(webpage))
        if not embed_urls:
            raise ExtractorError('no videos found for %s' % video_id, expected=True)
-        return self.url_result_or_playlist_from_matches(
-            embed_urls, video_id, info['title'], ie=Ant1NewsGrEmbedIE.ie_key(),
+        return self.playlist_from_matches(
+            embed_urls, video_id, info.get('title'), ie=Ant1NewsGrEmbedIE.ie_key(),
            video_kwargs={'url_transparent': True, 'timestamp': info.get('timestamp')})


--- a/yt_dlp/extractor/ccma.py
+++ b/yt_dlp/extractor/ccma.py
@ -1,17 +1,14 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import calendar
-import datetime
-
 from .common import InfoExtractor
 from ..utils import (
    clean_html,
-    extract_timezone,
    int_or_none,
    parse_duration,
    parse_resolution,
    try_get,
+    unified_timestamp,
    url_or_none,
 )

@ -95,14 +92,8 @@ class CCMAIE(InfoExtractor):
        duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
        tematica = try_get(informacio, lambda x: x['tematica']['text'])

-        timestamp = None
        data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
-        try:
-            timezone, data_utc = extract_timezone(data_utc)
-            timestamp = calendar.timegm((datetime.datetime.strptime(
-                data_utc, '%Y-%d-%mT%H:%M:%S') - timezone).timetuple())
-        except TypeError:
-            pass
+        timestamp = unified_timestamp(data_utc)

        subtitles = {}
        subtitols = media.get('subtitols') or []
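
The hunk above replaces hand-rolled `strptime` parsing with `unified_timestamp`. To illustrate the intent (a timezone-aware string in, a UNIX timestamp or `None` out), here is a standard-library stand-in; `iso_to_timestamp` is a hypothetical function and does not reproduce `unified_timestamp`, which accepts many more formats.

```python
from datetime import datetime, timezone


def iso_to_timestamp(value):
    """Parse an ISO 8601 string into a UNIX timestamp; return None on bad input."""
    if not isinstance(value, str):
        return None
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption for this sketch: naive values are treated as UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())
```

Returning `None` for missing or malformed input removes the need for the `try`/`except TypeError` block that the old code carried.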
--- a/yt_dlp/extractor/common.py
+++ b/yt_dlp/extractor/common.py
@ -226,6 +226,7 @@ class InfoExtractor(object):

    The following fields are optional:

+    direct:         True if a direct video file was given (must only be set by GenericIE)
    alt_title:      A secondary title of the video.
    display_id      An alternative identifier for the video, not necessarily
                    unique, but available before title. Typically, id is
@ -274,7 +275,7 @@ class InfoExtractor(object):
                        * "url": A URL pointing to the subtitles file
                    It can optionally also have:
                        * "name": Name or description of the subtitles
-                        * http_headers: A dictionary of additional HTTP headers
+                        * "http_headers": A dictionary of additional HTTP headers
                                  to add to the request.
                    "ext" will be calculated from URL if missing
    automatic_captions: Like 'subtitles'; contains automatically generated
@ -425,8 +426,8 @@ class InfoExtractor(object):
    title, description etc.


-    Subclasses of this one should re-define the _real_initialize() and
-    _real_extract() methods and define a _VALID_URL regexp.
+    Subclasses of this should define a _VALID_URL regexp, and re-define the
+    _real_extract() and (optionally) _real_initialize() methods.
    Probably, they should also be added to the list of extractors.

    Subclasses may also override suitable() if necessary, but ensure the function
@ -661,7 +662,7 @@ class InfoExtractor(object):
        return False

    def set_downloader(self, downloader):
-        """Sets the downloader for this IE."""
+        """Sets a YoutubeDL instance as the downloader for this IE."""
        self._downloader = downloader

    def _real_initialize(self):
@ -670,7 +671,7 @@ class InfoExtractor(object):

    def _real_extract(self, url):
        """Real extraction process. Redefine in subclasses."""
-        pass
+        raise NotImplementedError('This method must be implemented by subclasses')

    @classmethod
    def ie_key(cls):
@ -1661,31 +1662,31 @@ class InfoExtractor(object):
            'format_id': {'type': 'alias', 'field': 'id'},
            'preference': {'type': 'alias', 'field': 'ie_pref'},
            'language_preference': {'type': 'alias', 'field': 'lang'},
+            'source_preference': {'type': 'alias', 'field': 'source'},
+            'protocol': {'type': 'alias', 'field': 'proto'},
+            'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},

            # Deprecated
-            'dimension': {'type': 'alias', 'field': 'res'},
-            'resolution': {'type': 'alias', 'field': 'res'},
-            'extension': {'type': 'alias', 'field': 'ext'},
-            'bitrate': {'type': 'alias', 'field': 'br'},
-            'total_bitrate': {'type': 'alias', 'field': 'tbr'},
-            'video_bitrate': {'type': 'alias', 'field': 'vbr'},
-            'audio_bitrate': {'type': 'alias', 'field': 'abr'},
-            'framerate': {'type': 'alias', 'field': 'fps'},
-            'protocol': {'type': 'alias', 'field': 'proto'},
-            'source_preference': {'type': 'alias', 'field': 'source'},
-            'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
-            'filesize_estimate': {'type': 'alias', 'field': 'size'},
-            'samplerate': {'type': 'alias', 'field': 'asr'},
-            'video_ext': {'type': 'alias', 'field': 'vext'},
-            'audio_ext': {'type': 'alias', 'field': 'aext'},
-            'video_codec': {'type': 'alias', 'field': 'vcodec'},
-            'audio_codec': {'type': 'alias', 'field': 'acodec'},
-            'video': {'type': 'alias', 'field': 'hasvid'},
-            'has_video': {'type': 'alias', 'field': 'hasvid'},
-            'audio': {'type': 'alias', 'field': 'hasaud'},
-            'has_audio': {'type': 'alias', 'field': 'hasaud'},
-            'extractor': {'type': 'alias', 'field': 'ie_pref'},
-            'extractor_preference': {'type': 'alias', 'field': 'ie_pref'},
+            'dimension': {'type': 'alias', 'field': 'res', 'deprecated': True},
+            'resolution': {'type': 'alias', 'field': 'res', 'deprecated': True},
+            'extension': {'type': 'alias', 'field': 'ext', 'deprecated': True},
+            'bitrate': {'type': 'alias', 'field': 'br', 'deprecated': True},
+            'total_bitrate': {'type': 'alias', 'field': 'tbr', 'deprecated': True},
+            'video_bitrate': {'type': 'alias', 'field': 'vbr', 'deprecated': True},
+            'audio_bitrate': {'type': 'alias', 'field': 'abr', 'deprecated': True},
+            'framerate': {'type': 'alias', 'field': 'fps', 'deprecated': True},
+            'filesize_estimate': {'type': 'alias', 'field': 'size', 'deprecated': True},
+            'samplerate': {'type': 'alias', 'field': 'asr', 'deprecated': True},
+            'video_ext': {'type': 'alias', 'field': 'vext', 'deprecated': True},
+            'audio_ext': {'type': 'alias', 'field': 'aext', 'deprecated': True},
+            'video_codec': {'type': 'alias', 'field': 'vcodec', 'deprecated': True},
+            'audio_codec': {'type': 'alias', 'field': 'acodec', 'deprecated': True},
+            'video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
+            'has_video': {'type': 'alias', 'field': 'hasvid', 'deprecated': True},
+            'audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
+            'has_audio': {'type': 'alias', 'field': 'hasaud', 'deprecated': True},
+            'extractor': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
+            'extractor_preference': {'type': 'alias', 'field': 'ie_pref', 'deprecated': True},
        }

        def __init__(self, ie, field_preference):
@ -1785,7 +1786,7 @@ class InfoExtractor(object):
                    continue
                if self._get_field_setting(field, 'type') == 'alias':
                    alias, field = field, self._get_field_setting(field, 'field')
-                    if alias not in ('format_id', 'preference', 'language_preference'):
+                    if self._get_field_setting(alias, 'deprecated'):
                        self.ydl.deprecation_warning(
                            f'Format sorting alias {alias} is deprecated '
                            f'and may be removed in a future version. Please use {field} instead')
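
The hunk above replaces the hard-coded allow-list of non-deprecated aliases with a per-alias `deprecated` flag. A minimal standalone sketch of that lookup, assuming a plain dict in place of the sorter's settings table (`SETTINGS` and `resolve_alias` are hypothetical names used only for illustration, not yt-dlp API):

```python
# Hypothetical, trimmed-down stand-in for the settings table in the diff.
SETTINGS = {
    'format_id': {'type': 'alias', 'field': 'id'},
    'resolution': {'type': 'alias', 'field': 'res', 'deprecated': True},
    'res': {'type': 'multiple'},
}


def resolve_alias(field):
    """Return (real_field, is_deprecated) for a sort field name."""
    setting = SETTINGS.get(field, {})
    if setting.get('type') != 'alias':
        return field, False
    # A missing 'deprecated' key means the alias is still supported.
    return setting['field'], bool(setting.get('deprecated'))


for name in ('format_id', 'resolution', 'res'):
    real, deprecated = resolve_alias(name)
    if deprecated:
        print(f'Format sorting alias {name} is deprecated. Please use {real} instead')
```

Keeping the flag next to each alias means a newly supported alias only needs one table entry, rather than a table entry plus an update to a separate tuple.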
--- a/yt_dlp/extractor/extractors.py
+++ b/yt_dlp/extractor/extractors.py
@ -520,6 +520,7 @@ from .foxnews import (
    FoxNewsArticleIE,
 )
 from .foxsports import FoxSportsIE
+from .fptplay import FptplayIE
 from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
 from .francetv import (
@ -848,6 +849,7 @@ from .microsoftvirtualacademy import (
 from .mildom import (
    MildomIE,
    MildomVodIE,
+    MildomClipIE,
    MildomUserVodIE,
 )
 from .minds import (
@ -2010,6 +2012,7 @@ from .ximalaya import (
    XimalayaIE,
    XimalayaAlbumIE
 )
+from .xinpianchang import XinpianchangIE
 from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xstream import XstreamIE
--- a/yt_dlp/extractor/fptplay.py
+++ b/yt_dlp/extractor/fptplay.py
@ -0,0 +1,102 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import hashlib
+import time
+import urllib.parse
+
+from .common import InfoExtractor
+from ..utils import (
+    join_nonempty,
+)
+
+
+class FptplayIE(InfoExtractor):
+    _VALID_URL = r'https?://fptplay\.vn/(?P<type>xem-video)/[^/]+\-(?P<id>\w+)(?:/tap-(?P<episode>[^/]+)?/?(?:[?#]|$)|)'
+    _GEO_COUNTRIES = ['VN']
+    IE_NAME = 'fptplay'
+    IE_DESC = 'fptplay.vn'
+    _TESTS = [{
+        'url': 'https://fptplay.vn/xem-video/nhan-duyen-dai-nhan-xin-dung-buoc-621a123016f369ebbde55945',
+        'md5': 'ca0ee9bc63446c0c3e9a90186f7d6b33',
+        'info_dict': {
+            'id': '621a123016f369ebbde55945',
+            'ext': 'mp4',
+            'title': 'Nhân Duyên Đại Nhân Xin Dừng Bước - Ms. Cupid In Love',
+            'description': 'md5:23cf7d1ce0ade8e21e76ae482e6a8c6c',
+        },
+    }, {
+        'url': 'https://fptplay.vn/xem-video/ma-toi-la-dai-gia-61f3aa8a6b3b1d2e73c60eb5/tap-3',
+        'md5': 'b35be968c909b3e4e1e20ca45dd261b1',
+        'info_dict': {
+            'id': '61f3aa8a6b3b1d2e73c60eb5',
+            'ext': 'mp4',
+            'title': 'Má Tôi Là Đại Gia - 3',
+            'description': 'md5:ff8ba62fb6e98ef8875c42edff641d1c',
+        },
+    }, {
+        'url': 'https://fptplay.vn/xem-video/nha-co-chuyen-hi-alls-well-ends-well-1997-6218995f6af792ee370459f0',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        type_url, video_id, episode = self._match_valid_url(url).group('type', 'id', 'episode')
+        webpage = self._download_webpage(url, video_id=video_id, fatal=False)
+        info = self._download_json(self.get_api_with_st_token(video_id, episode or 0), video_id)
+        formats, subtitles = self._extract_m3u8_formats_and_subtitles(info['data']['url'], video_id, 'mp4')
+        self._sort_formats(formats)
+        return {
+            'id': video_id,
+            'title': join_nonempty(
+                self._html_search_meta(('og:title', 'twitter:title'), webpage), episode, delim=' - '),
+            'description': self._html_search_meta(['og:description', 'twitter:description'], webpage),
+            'formats': formats,
+            'subtitles': subtitles,
+        }
+
+    def get_api_with_st_token(self, video_id, episode):
+        path = f'/api/v6.2_w/stream/vod/{video_id}/{episode}/auto_vip'
+        timestamp = int(time.time()) + 10800
+
+        t = hashlib.md5(f'WEBv6Dkdsad90dasdjlALDDDS{timestamp}{path}'.encode()).hexdigest().upper()
+        r = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
+        n = [int(f'0x{t[2 * o: 2 * o + 2]}', 16) for o in range(len(t) // 2)]
+
+        def convert(e):
+            t = ''
+            n = 0
+            i = [0, 0, 0]
+            a = [0, 0, 0, 0]
+            s = len(e)
+            c = 0
+            for z in range(s, 0, -1):
+                if n <= 3:
+                    i[n] = e[c]
+                n += 1
+                c += 1
+                if 3 == n:
+                    a[0] = (252 & i[0]) >> 2
+                    a[1] = ((3 & i[0]) << 4) + ((240 & i[1]) >> 4)
+                    a[2] = ((15 & i[1]) << 2) + ((192 & i[2]) >> 6)
+                    a[3] = (63 & i[2])
+                    for v in range(4):
+                        t += r[a[v]]
+                    n = 0
+            if n:
+                for o in range(n, 3):
+                    i[o] = 0
+
+                for o in range(n + 1):
+                    a[0] = (252 & i[0]) >> 2
+                    a[1] = ((3 & i[0]) << 4) + ((240 & i[1]) >> 4)
+                    a[2] = ((15 & i[1]) << 2) + ((192 & i[2]) >> 6)
+                    a[3] = (63 & i[2])
+                    t += r[a[o]]
+                n += 1
+                while n < 3:
+                    t += ''
+                    n += 1
+            return t
+
+        st_token = convert(n).replace('+', '-').replace('/', '_').replace('=', '')
+        return f'https://api.fptplay.net{path}?{urllib.parse.urlencode({"st": st_token, "e": timestamp})}'
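
The hand-written `convert` helper above appears to be standard Base64 without `=` padding, applied to the raw MD5 digest bytes; together with the `+` and `/` replacements it should match unpadded URL-safe Base64. A sketch under that assumption (`make_st_token` is a hypothetical name, and the secret is passed in as a parameter rather than hard-coded):

```python
import base64
import hashlib


def make_st_token(secret, timestamp, path):
    """Unpadded URL-safe Base64 of md5(secret + timestamp + path)."""
    digest = hashlib.md5(f'{secret}{timestamp}{path}'.encode()).digest()
    # urlsafe_b64encode already maps '+' to '-' and '/' to '_'.
    return base64.urlsafe_b64encode(digest).decode().rstrip('=')


# Made-up inputs, only to show the shape of the result.
token = make_st_token('example-secret', 1700000000, '/api/example/path')
print(len(token))
```

An MD5 digest is 16 bytes, so the padded encoding is 24 characters with two `=`; stripping them leaves a 22-character token.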
--- a/yt_dlp/extractor/frontendmasters.py
+++ b/yt_dlp/extractor/frontendmasters.py
@ -252,9 +252,9 @@ class FrontendMastersCourseIE(FrontendMastersPageBaseIE):
        entries = []
        for lesson in lessons:
            lesson_name = lesson.get('slug')
-            if not lesson_name:
-                continue
            lesson_id = lesson.get('hash') or lesson.get('statsId')
+            if not lesson_id or not lesson_name:
+                continue
            entries.append(self._extract_lesson(chapters, lesson_id, lesson))

        title = course.get('title')
--- a/yt_dlp/extractor/iqiyi.py
+++ b/yt_dlp/extractor/iqiyi.py
@ -621,7 +621,7 @@ class IqIE(InfoExtractor):
        preview_time = traverse_obj(
            initial_format_data, ('boss_ts', (None, 'data'), ('previewTime', 'rtime')), expected_type=float_or_none, get_all=False)
        if traverse_obj(initial_format_data, ('boss_ts', 'data', 'prv'), expected_type=int_or_none):
-            self.report_warning('This preview video is limited%s' % format_field(preview_time, template='to %s seconds'))
+            self.report_warning('This preview video is limited%s' % format_field(preview_time, template=' to %s seconds'))

        # TODO: Extract audio-only formats
        for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none, default=[])):
--- a/yt_dlp/extractor/mildom.py
+++ b/yt_dlp/extractor/mildom.py
@ -1,102 +1,42 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import base64
-from datetime import datetime
-import itertools
+import functools
 import json

 from .common import InfoExtractor
 from ..utils import (
-    update_url_query,
-    random_uuidv4,
-    try_get,
+    determine_ext,
+    dict_get,
+    ExtractorError,
    float_or_none,
-    dict_get
-)
-from ..compat import (
-    compat_str,
+    OnDemandPagedList,
+    random_uuidv4,
+    traverse_obj,
 )


 class MildomBaseIE(InfoExtractor):
    _GUEST_ID = None
-    _DISPATCHER_CONFIG = None

-    def _call_api(self, url, video_id, query=None, note='Downloading JSON metadata', init=False):
-        query = query or {}
-        if query:
-            query['__platform'] = 'web'
-        url = update_url_query(url, self._common_queries(query, init=init))
-        content = self._download_json(url, video_id, note=note)
-        if content['code'] == 0:
-            return content['body']
-        else:
-            self.raise_no_formats(
-                f'Video not found or premium content. {content["code"]} - {content["message"]}',
+    def _call_api(self, url, video_id, query=None, note='Downloading JSON metadata', body=None):
+        if not self._GUEST_ID:
+            self._GUEST_ID = f'pc-gp-{random_uuidv4()}'
+
+        content = self._download_json(
+            url, video_id, note=note, data=json.dumps(body).encode() if body else None,
+            headers={'Content-Type': 'application/json'} if body else {},
+            query={
+                '__guest_id': self._GUEST_ID,
+                '__platform': 'web',
+                **(query or {}),
+            })
+
+        if content['code'] != 0:
+            raise ExtractorError(
+                f'Mildom says: {content["message"]} (code {content["code"]})',
                expected=True)
-
-    def _common_queries(self, query={}, init=False):
-        dc = self._fetch_dispatcher_config()
-        r = {
-            'timestamp': self.iso_timestamp(),
-            '__guest_id': '' if init else self.guest_id(),
-            '__location': dc['location'],
-            '__country': dc['country'],
-            '__cluster': dc['cluster'],
-            '__platform': 'web',
-            '__la': self.lang_code(),
-            '__pcv': 'v2.9.44',
-            'sfr': 'pc',
-            'accessToken': '',
-        }
-        r.update(query)
-        return r
-
-    def _fetch_dispatcher_config(self):
-        if not self._DISPATCHER_CONFIG:
-            tmp = self._download_json(
-                'https://disp.mildom.com/serverListV2', 'initialization',
-                note='Downloading dispatcher_config', data=json.dumps({
-                    'protover': 0,
-                    'data': base64.b64encode(json.dumps({
-                        'fr': 'web',
-                        'sfr': 'pc',
-                        'devi': 'Windows',
-                        'la': 'ja',
-                        'gid': None,
-                        'loc': '',
-                        'clu': '',
-                        'wh': '1919*810',
-                        'rtm': self.iso_timestamp(),
-                        'ua': self.get_param('http_headers')['User-Agent'],
-                    }).encode('utf8')).decode('utf8').replace('\n', ''),
-                }).encode('utf8'))
-            self._DISPATCHER_CONFIG = self._parse_json(base64.b64decode(tmp['data']), 'initialization')
-        return self._DISPATCHER_CONFIG
-
-    @staticmethod
-    def iso_timestamp():
-        'new Date().toISOString()'
-        return datetime.utcnow().isoformat()[0:-3] + 'Z'
-
-    def guest_id(self):
-        'getGuestId'
-        if self._GUEST_ID:
-            return self._GUEST_ID
-        self._GUEST_ID = try_get(
-            self, (
-                lambda x: x._call_api(
-                    'https://cloudac.mildom.com/nonolive/gappserv/guest/h5init', 'initialization',
-                    note='Downloading guest token', init=True)['guest_id'] or None,
-                lambda x: x._get_cookies('https://www.mildom.com').get('gid').value,
-                lambda x: x._get_cookies('https://m.mildom.com').get('gid').value,
-            ), compat_str) or ''
-        return self._GUEST_ID
-
-    def lang_code(self):
-        'getCurrentLangCode'
-        return 'ja'
+        return content['body']


 class MildomIE(MildomBaseIE):
@ -106,31 +46,13 @@ class MildomIE(MildomBaseIE):

    def _real_extract(self, url):
        video_id = self._match_id(url)
-        url = 'https://www.mildom.com/%s' % video_id
-
-        webpage = self._download_webpage(url, video_id)
+        webpage = self._download_webpage(f'https://www.mildom.com/{video_id}', video_id)

        enterstudio = self._call_api(
            'https://cloudac.mildom.com/nonolive/gappserv/live/enterstudio', video_id,
            note='Downloading live metadata', query={'user_id': video_id})
        result_video_id = enterstudio.get('log_id', video_id)

-        title = try_get(
-            enterstudio, (
-                lambda x: self._html_search_meta('twitter:description', webpage),
-                lambda x: x['anchor_intro'],
-            ), compat_str)
-        description = try_get(
-            enterstudio, (
-                lambda x: x['intro'],
-                lambda x: x['live_intro'],
-            ), compat_str)
-        uploader = try_get(
-            enterstudio, (
-                lambda x: self._html_search_meta('twitter:title', webpage),
-                lambda x: x['loginname'],
-            ), compat_str)
-
        servers = self._call_api(
            'https://cloudac.mildom.com/nonolive/gappserv/live/liveserver', result_video_id,
            note='Downloading live server list', query={
@ -138,17 +60,20 @@ class MildomIE(MildomBaseIE):
                'live_server_type': 'hls',
            })

-        stream_query = self._common_queries({
-            'streamReqId': random_uuidv4(),
-            'is_lhls': '0',
-        })
-        m3u8_url = update_url_query(servers['stream_server'] + '/%s_master.m3u8' % video_id, stream_query)
-        formats = self._extract_m3u8_formats(m3u8_url, result_video_id, 'mp4', headers={
-            'Referer': 'https://www.mildom.com/',
-            'Origin': 'https://www.mildom.com',
-        }, note='Downloading m3u8 information')
+        playback_token = self._call_api(
+            'https://cloudac.mildom.com/nonolive/gappserv/live/token', result_video_id,
+            note='Obtaining live playback token', body={'host_id': video_id, 'type': 'hls'})
+        playback_token = traverse_obj(playback_token, ('data', ..., 'token'), get_all=False)
+        if not playback_token:
+            raise ExtractorError('Failed to obtain live playback token')
+
+        formats = self._extract_m3u8_formats(
+            f'{servers["stream_server"]}/{video_id}_master.m3u8?{playback_token}',
+            result_video_id, 'mp4', headers={
+                'Referer': 'https://www.mildom.com/',
+                'Origin': 'https://www.mildom.com',
+            })

-        del stream_query['streamReqId'], stream_query['timestamp']
        for fmt in formats:
            fmt.setdefault('http_headers', {})['Referer'] = 'https://www.mildom.com/'

@ -156,10 +81,10 @@ class MildomIE(MildomBaseIE):

        return {
            'id': result_video_id,
-            'title': title,
-            'description': description,
+            'title': self._html_search_meta('twitter:description', webpage, default=None) or traverse_obj(enterstudio, 'anchor_intro'),
+            'description': traverse_obj(enterstudio, 'intro', 'live_intro', expected_type=str),
            'timestamp': float_or_none(enterstudio.get('live_start_ms'), scale=1000),
-            'uploader': uploader,
+            'uploader': self._html_search_meta('twitter:title', webpage, default=None) or traverse_obj(enterstudio, 'loginname'),
            'uploader_id': video_id,
            'formats': formats,
            'is_live': True,
@ -168,7 +93,7 @@ class MildomIE(MildomBaseIE):

 class MildomVodIE(MildomBaseIE):
    IE_NAME = 'mildom:vod'
-    IE_DESC = 'Download a VOD in Mildom'
+    IE_DESC = 'VOD in Mildom'
    _VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/playback/(?P<user_id>\d+)/(?P<id>(?P=user_id)-[a-zA-Z0-9]+-?[0-9]*)'
    _TESTS = [{
        'url': 'https://www.mildom.com/playback/10882672/10882672-1597662269',
@ -215,11 +140,8 @@ class MildomVodIE(MildomBaseIE):
    }]

    def _real_extract(self, url):
-        m = self._match_valid_url(url)
-        user_id, video_id = m.group('user_id'), m.group('id')
-        url = 'https://www.mildom.com/playback/%s/%s' % (user_id, video_id)
-
-        webpage = self._download_webpage(url, video_id)
+        user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
+        webpage = self._download_webpage(f'https://www.mildom.com/playback/{user_id}/{video_id}', video_id)

        autoplay = self._call_api(
            'https://cloudac.mildom.com/nonolive/videocontent/playback/getPlaybackDetail', video_id,
@ -227,20 +149,6 @@ class MildomVodIE(MildomBaseIE):
                'v_id': video_id,
            })['playback']

-        title = try_get(
-            autoplay, (
-                lambda x: self._html_search_meta('og:description', webpage),
-                lambda x: x['title'],
-            ), compat_str)
-        description = try_get(
-            autoplay, (
-                lambda x: x['video_intro'],
-            ), compat_str)
-        uploader = try_get(
-            autoplay, (
-                lambda x: x['author_info']['login_name'],
-            ), compat_str)
-
        formats = [{
            'url': autoplay['audio_url'],
            'format_id': 'audio',
@ -265,17 +173,81 @@ class MildomVodIE(MildomBaseIE):

        return {
            'id': video_id,
-            'title': title,
-            'description': description,
-            'timestamp': float_or_none(autoplay['publish_time'], scale=1000),
-            'duration': float_or_none(autoplay['video_length'], scale=1000),
+            'title': self._html_search_meta(('og:description', 'description'), webpage, default=None) or autoplay.get('title'),
+            'description': traverse_obj(autoplay, 'video_intro'),
+            'timestamp': float_or_none(autoplay.get('publish_time'), scale=1000),
+            'duration': float_or_none(autoplay.get('video_length'), scale=1000),
            'thumbnail': dict_get(autoplay, ('upload_pic', 'video_pic')),
-            'uploader': uploader,
+            'uploader': traverse_obj(autoplay, ('author_info', 'login_name')),
            'uploader_id': user_id,
            'formats': formats,
        }


+class MildomClipIE(MildomBaseIE):
+    IE_NAME = 'mildom:clip'
+    IE_DESC = 'Clip in Mildom'
+    _VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/clip/(?P<id>(?P<user_id>\d+)-[a-zA-Z0-9]+)'
+    _TESTS = [{
+        'url': 'https://www.mildom.com/clip/10042245-63921673e7b147ebb0806d42b5ba5ce9',
+        'info_dict': {
+            'id': '10042245-63921673e7b147ebb0806d42b5ba5ce9',
+            'title': '全然違ったよ',
+            'timestamp': 1619181890,
+            'duration': 59,
+            'thumbnail': r're:https?://.+',
+            'uploader': 'ざきんぽ',
+            'uploader_id': '10042245',
+        },
+    }, {
+        'url': 'https://www.mildom.com/clip/10111524-ebf4036e5aa8411c99fb3a1ae0902864',
+        'info_dict': {
+            'id': '10111524-ebf4036e5aa8411c99fb3a1ae0902864',
+            'title': 'かっこいい',
+            'timestamp': 1621094003,
+            'duration': 59,
+            'thumbnail': r're:https?://.+',
+            'uploader': '(ルーキー',
+            'uploader_id': '10111524',
+        },
+    }, {
+        'url': 'https://www.mildom.com/clip/10660174-2c539e6e277c4aaeb4b1fbe8d22cb902',
+        'info_dict': {
+            'id': '10660174-2c539e6e277c4aaeb4b1fbe8d22cb902',
+            'title': 'あ',
+            'timestamp': 1614769431,
+            'duration': 31,
+            'thumbnail': r're:https?://.+',
+            'uploader': 'ドルゴルスレンギーン＝ダグワドルジ',
+            'uploader_id': '10660174',
+        },
+    }]
+
+    def _real_extract(self, url):
+        user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
+        webpage = self._download_webpage(f'https://www.mildom.com/clip/{video_id}', video_id)
+
+        clip_detail = self._call_api(
+            'https://cloudac-cf-jp.mildom.com/nonolive/videocontent/clip/detail', video_id,
+            note='Downloading playback metadata', query={
+                'clip_id': video_id,
+            })
+
+        return {
+            'id': video_id,
+            'title': self._html_search_meta(
+                ('og:description', 'description'), webpage, default=None) or clip_detail.get('title'),
+            'timestamp': float_or_none(clip_detail.get('create_time')),
+            'duration': float_or_none(clip_detail.get('length')),
+            'thumbnail': clip_detail.get('cover'),
+            'uploader': traverse_obj(clip_detail, ('user_info', 'loginname')),
+            'uploader_id': user_id,
+
+            'url': clip_detail['url'],
+            'ext': determine_ext(clip_detail.get('url'), 'mp4'),
+        }
+
+
 class MildomUserVodIE(MildomBaseIE):
    IE_NAME = 'mildom:user:vod'
    IE_DESC = 'Download all VODs from specific user in Mildom'
@ -286,29 +258,32 @@ class MildomUserVodIE(MildomBaseIE):
            'id': '10093333',
            'title': 'Uploads from ねこばたけ',
        },
-        'playlist_mincount': 351,
+        'playlist_mincount': 732,
    }, {
        'url': 'https://www.mildom.com/profile/10882672',
        'info_dict': {
            'id': '10882672',
            'title': 'Uploads from kson組長(けいそん)',
        },
-        'playlist_mincount': 191,
+        'playlist_mincount': 201,
    }]

-    def _entries(self, user_id):
-        for page in itertools.count(1):
-            reply = self._call_api(
-                'https://cloudac.mildom.com/nonolive/videocontent/profile/playbackList',
-                user_id, note='Downloading page %d' % page, query={
-                    'user_id': user_id,
-                    'page': page,
-                    'limit': '30',
-                })
-            if not reply:
-                break
-            for x in reply:
-                yield self.url_result('https://www.mildom.com/playback/%s/%s' % (user_id, x['v_id']))
+    def _fetch_page(self, user_id, page):
+        page += 1
+        reply = self._call_api(
+            'https://cloudac.mildom.com/nonolive/videocontent/profile/playbackList',
+            user_id, note=f'Downloading page {page}', query={
+                'user_id': user_id,
+                'page': page,
+                'limit': '30',
+            })
+        if not reply:
+            return
+        for x in reply:
+            v_id = x.get('v_id')
+            if not v_id:
+                continue
+            yield self.url_result(f'https://www.mildom.com/playback/{user_id}/{v_id}')

    def _real_extract(self, url):
        user_id = self._match_id(url)
@ -319,4 +294,5 @@ class MildomUserVodIE(MildomBaseIE):
            query={'user_id': user_id}, note='Downloading user profile')['user_info']

        return self.playlist_result(
-            self._entries(user_id), user_id, 'Uploads from %s' % profile['loginname'])
+            OnDemandPagedList(functools.partial(self._fetch_page, user_id), 30),
+            user_id, f'Uploads from {profile["loginname"]}')
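
The `_fetch_page` rewrite above turns an open-ended generator into a per-page callback bound with `functools.partial`. A self-contained sketch of that pattern, assuming the page index passed to the callback is zero-based while the API counts from 1 (as the `page += 1` line suggests); `fetch_page` and `fake_api` are hypothetical stand-ins, not yt-dlp code:

```python
import functools


def fetch_page(list_api, user_id, page):
    """Yield playback URLs for one zero-based page index."""
    page += 1  # the API is assumed to number pages from 1
    for item in list_api(user_id, page) or []:
        v_id = item.get('v_id')
        if not v_id:
            continue  # skip entries without an ID instead of raising KeyError
        yield f'https://www.mildom.com/playback/{user_id}/{v_id}'


def fake_api(user_id, page):
    # Hypothetical replacement for the playbackList endpoint: two pages, then nothing.
    pages = {1: [{'v_id': 'a'}, {}], 2: [{'v_id': 'b'}]}
    return pages.get(page)


# Bind everything except the page index, mirroring functools.partial in the diff.
page_func = functools.partial(fetch_page, fake_api, '42')
urls = [url for page in range(3) for url in page_func(page)]
print(urls)
```

Because each call handles exactly one page, a caller can request only the pages it needs instead of walking the whole list from the start.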
--- a/yt_dlp/extractor/peertube.py
+++ b/yt_dlp/extractor/peertube.py
@ -87,6 +87,7 @@ class PeerTubeIE(InfoExtractor):
                            maindreieck-tv\.de|
                            mani\.tube|
                            manicphase\.me|
+                            media\.fsfe\.org|
                            media\.gzevd\.de|
                            media\.inno3\.cricket|
                            media\.kaitaia\.life|
--- a/yt_dlp/extractor/periscope.py
+++ b/yt_dlp/extractor/periscope.py
@ -33,7 +33,7 @@ class PeriscopeBaseIE(InfoExtractor):

        return {
            'id': broadcast.get('id') or video_id,
-            'title': self._live_title(title) if is_live else title,
+            'title': title,
            'timestamp': parse_iso8601(broadcast.get('created_at')),
            'uploader': uploader,
            'uploader_id': broadcast.get('user_id') or broadcast.get('username'),
--- a/yt_dlp/extractor/soundcloud.py
+++ b/yt_dlp/extractor/soundcloud.py
@ -59,8 +59,16 @@ class SoundcloudEmbedIE(InfoExtractor):


 class SoundcloudBaseIE(InfoExtractor):
+    _NETRC_MACHINE = 'soundcloud'
+
    _API_V2_BASE = 'https://api-v2.soundcloud.com/'
    _BASE_URL = 'https://soundcloud.com/'
+    _USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
+    _API_AUTH_QUERY_TEMPLATE = '?client_id=%s'
+    _API_AUTH_URL_PW = 'https://api-auth.soundcloud.com/web-auth/sign-in/password%s'
+    _API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
+    _access_token = None
+    _HEADERS = {}

    def _store_client_id(self, client_id):
        self._downloader.cache.store('soundcloud', 'client_id', client_id)
@ -103,14 +111,6 @@ class SoundcloudBaseIE(InfoExtractor):
        self._CLIENT_ID = self._downloader.cache.load('soundcloud', 'client_id') or 'a3e059563d7fd3372b49b37f00a00bcf'
        self._login()

-    _USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
-    _API_AUTH_QUERY_TEMPLATE = '?client_id=%s'
-    _API_AUTH_URL_PW = 'https://api-auth.soundcloud.com/web-auth/sign-in/password%s'
-    _API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
-    _access_token = None
-    _HEADERS = {}
-    _NETRC_MACHINE = 'soundcloud'
-
    def _login(self):
        username, password = self._get_login_info()
        if username is None:
--- a/yt_dlp/extractor/sovietscloset.py
+++ b/yt_dlp/extractor/sovietscloset.py
@ -67,6 +67,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
                'series': 'The Witcher',
                'season': 'Misc',
                'episode_number': 13,
+                'episode': 'Episode 13',
            },
        },
        {
@ -92,6 +93,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
                'series': 'Arma 3',
                'season': 'Zeus Games',
                'episode_number': 3,
+                'episode': 'Episode 3',
            },
        },
    ]
--- a/yt_dlp/extractor/xinpianchang.py
+++ b/yt_dlp/extractor/xinpianchang.py
@ -0,0 +1,95 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    try_get,
+    update_url_query,
+    url_or_none,
+)
+
+
+class XinpianchangIE(InfoExtractor):
+    _VALID_URL = r'https?://www\.xinpianchang\.com/(?P<id>[^/]+?)(?:\D|$)'
+    IE_NAME = 'xinpianchang'
+    IE_DESC = 'xinpianchang.com'
+    _TESTS = [{
+        'url': 'https://www.xinpianchang.com/a11766551',
+        'info_dict': {
+            'id': 'a11766551',
+            'ext': 'mp4',
+            'title': '北京2022冬奥会闭幕式再见短片-冰墩墩下班了',
+            'description': 'md5:4a730c10639a82190fabe921c0fa4b87',
+            'duration': 151,
+            'thumbnail': r're:^https?://oss-xpc0\.xpccdn\.com.+/assets/',
+            'uploader': '正时文创',
+            'uploader_id': 10357277,
+            'categories': ['宣传片', '国家城市', '广告', '其他'],
+            'keywords': ['北京冬奥会', '冰墩墩', '再见', '告别', '冰墩墩哭了', '感动', '闭幕式', '熄火']
+        },
+    }, {
+        'url': 'https://www.xinpianchang.com/a11762904',
+        'info_dict': {
+            'id': 'a11762904',
+            'ext': 'mp4',
+            'title': '冬奥会决胜时刻《法国派出三只鸡？》',
+            'description': 'md5:55cb139ef8f48f0c877932d1f196df8b',
+            'duration': 136,
+            'thumbnail': r're:^https?://oss-xpc0\.xpccdn\.com.+/assets/',
+            'uploader': '精品动画',
+            'uploader_id': 10858927,
+            'categories': ['动画', '三维CG'],
+            'keywords': ['France Télévisions', '法国3台', '蠢萌', '冬奥会']
+        },
+    }, {
+        'url': 'https://www.xinpianchang.com/a11779743?from=IndexPick&part=%E7%BC%96%E8%BE%91%E7%B2%BE%E9%80%89&index=2',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id=video_id)
+        domain = self.find_value_with_regex(var='requireNewDomain', webpage=webpage)
+        vid = self.find_value_with_regex(var='vid', webpage=webpage)
+        app_key = self.find_value_with_regex(var='modeServerAppKey', webpage=webpage)
+        api = update_url_query(f'{domain}/mod/api/v2/media/{vid}', {'appKey': app_key})
+        data = self._download_json(api, video_id=video_id)['data']
+        formats, subtitles = [], {}
+        for k, v in (data.get('resource') or {}).items():
+            if k in ('dash', 'hls'):
+                v_url = v.get('url')
+                if not v_url:
+                    continue
+                if k == 'dash':
+                    fmts, subs = self._extract_mpd_formats_and_subtitles(v_url, video_id=video_id)
+                elif k == 'hls':
+                    fmts, subs = self._extract_m3u8_formats_and_subtitles(v_url, video_id=video_id)
+                formats.extend(fmts)
+                subtitles = self._merge_subtitles(subtitles, subs)
+            elif k == 'progressive':
+                formats.extend([{
+                    'url': url_or_none(prog.get('url')),
+                    'width': int_or_none(prog.get('width')),
+                    'height': int_or_none(prog.get('height')),
+                    'ext': 'mp4',
+                } for prog in (v or []) if prog.get('url')])
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': data.get('title'),
+            'description': data.get('description'),
+            'duration': int_or_none(data.get('duration')),
+            'categories': data.get('categories'),
+            'keywords': data.get('keywords'),
+            'thumbnail': data.get('cover'),
+            'uploader': try_get(data, lambda x: x['owner']['username']),
+            'uploader_id': try_get(data, lambda x: x['owner']['id']),
+            'formats': formats,
+            'subtitles': subtitles,
+        }
+
+    def find_value_with_regex(self, var, webpage):
+        return self._search_regex(rf'var\s{var}\s=\s\"(?P<vid>[^\"]+)\"', webpage, name=var)
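
`find_value_with_regex` above pulls a quoted string out of an inline `var name = "value"` assignment. A standalone sketch of the same match using only `re`, with a made-up page snippet; `find_js_string_var` is a hypothetical name, and unlike the extractor it escapes the variable name and returns `None` rather than raising when nothing matches:

```python
import re


def find_js_string_var(name, webpage):
    """Return the string assigned to `var <name> = "..."`, or None."""
    pattern = rf'var\s{re.escape(name)}\s=\s"(?P<value>[^"]+)"'
    match = re.search(pattern, webpage)
    return match.group('value') if match else None


# Made-up HTML fragment for illustration only.
page = 'var vid = "abc123"; var requireNewDomain = "https://example.invalid";'
print(find_js_string_var('vid', page))
```

The pattern requires exactly one whitespace character on each side of `=`, so it would miss minified markup such as `var vid="abc123"`.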
--- a/yt_dlp/extractor/youtube.py
+++ b/yt_dlp/extractor/youtube.py
@ -3094,6 +3094,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            # Some formats may have much smaller duration than others (possibly damaged during encoding)
            # Eg: 2-nOtRESiUc Ref: https://github.com/yt-dlp/yt-dlp/issues/2823
            is_damaged = try_get(fmt, lambda x: float(x['approxDurationMs']) < approx_duration - 10000)
+            if is_damaged:
+                self.report_warning(f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True)
            dct = {
                'asr': int_or_none(fmt.get('audioSampleRate')),
                'filesize': int_or_none(fmt.get('contentLength')),
--- a/yt_dlp/extractor/zingmp3.py
+++ b/yt_dlp/extractor/zingmp3.py
@ -149,7 +149,7 @@ class ZingMp3IE(ZingMp3BaseIE):
        },
    }, {
        'url': 'https://zingmp3.vn/video-clip/Suong-Hoa-Dua-Loi-K-ICM-RYO/ZO8ZF7C7.html',
-        'md5': 'e9c972b693aa88301ef981c8151c4343',
+        'md5': 'c7f23d971ac1a4f675456ed13c9b9612',
        'info_dict': {
            'id': 'ZO8ZF7C7',
            'title': 'Sương Hoa Đưa Lối',
@ -158,6 +158,8 @@ class ZingMp3IE(ZingMp3BaseIE):
            'duration': 207,
            'track': 'Sương Hoa Đưa Lối',
            'artist': 'K-ICM, RYO',
+            'album': 'Sương Hoa Đưa Lối (Single)',
+            'album_artist': 'K-ICM, RYO',
        },
    }, {
        'url': 'https://zingmp3.vn/embed/song/ZWZEI76B?start=false',
--- a/yt_dlp/utils.py
+++ b/yt_dlp/utils.py
@ -47,6 +47,7 @@ from .compat import (
    compat_HTMLParser,
    compat_HTTPError,
    compat_basestring,
+    compat_brotli,
    compat_chr,
    compat_cookiejar,
    compat_ctypes_WINFUNCTYPE,
@ -143,10 +144,16 @@ def random_user_agent():
    return _USER_AGENT_TPL % random.choice(_CHROME_VERSIONS)


+SUPPORTED_ENCODINGS = [
+    'gzip', 'deflate'
+]
+if compat_brotli:
+    SUPPORTED_ENCODINGS.append('br')
+
 std_headers = {
    'User-Agent': random_user_agent(),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
-    'Accept-Encoding': 'gzip, deflate',
+    'Accept-Encoding': ', '.join(SUPPORTED_ENCODINGS),
    'Accept-Language': 'en-us,en;q=0.5',
    'Sec-Fetch-Mode': 'navigate',
 }
@ -1023,7 +1030,7 @@ def make_HTTPS_handler(params, **kwargs):
 def bug_reports_message(before=';'):
    msg = ('please report this issue on  https://github.com/yt-dlp/yt-dlp , '
           'filling out the "Broken site" issue template properly. '
-           'Confirm you are on the latest version using -U')
+           'Confirm you are on the latest version using  yt-dlp -U')

    before = before.rstrip()
    if not before or before.endswith(('.', '!', '?')):
@ -1357,6 +1364,12 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
        except zlib.error:
            return zlib.decompress(data)

+    @staticmethod
+    def brotli(data):
+        if not data:
+            return data
+        return compat_brotli.decompress(data)
+
    def http_request(self, req):
        # According to RFC 3986, URLs can not contain non-ASCII characters, however this is not
        # always respected by websites, some tend to give out URLs with non percent-encoded
@ -1417,6 +1430,12 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
            resp = compat_urllib_request.addinfourl(gz, old_resp.headers, old_resp.url, old_resp.code)
            resp.msg = old_resp.msg
            del resp.headers['Content-encoding']
+        # brotli
+        if resp.headers.get('Content-encoding', '') == 'br':
+            resp = compat_urllib_request.addinfourl(
+                io.BytesIO(self.brotli(resp.read())), old_resp.headers, old_resp.url, old_resp.code)
+            resp.msg = old_resp.msg
+            del resp.headers['Content-encoding']
        # Percent-encode redirect URL of Location HTTP header to satisfy RFC 3986 (see
        # https://github.com/ytdl-org/youtube-dl/issues/6457).
        if 300 <= resp.code < 400:
@ -5462,5 +5481,5 @@ has_websockets = bool(compat_websockets)


 def merge_headers(*dicts):
-    """Merge dicts of network headers case insensitively, prioritizing the latter ones"""
+    """Merge dicts of http headers case insensitively, prioritizing the latter ones"""
    return {k.capitalize(): v for k, v in itertools.chain.from_iterable(map(dict.items, dicts))}
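
The `merge_headers` helper in the last hunk can be tried in isolation. The sketch below copies its one-line body as shown in the diff; the two sample header dicts are illustrative assumptions, not values taken from the commit. It shows that later dicts win and that `str.capitalize()` lowercases everything after the first character, so the merged key is `Accept-encoding` rather than `Accept-Encoding`.

```python
import itertools


def merge_headers(*dicts):
    """Merge dicts of HTTP headers case-insensitively; later dicts take priority."""
    # str.capitalize() normalizes each key, so 'accept-encoding' and
    # 'Accept-Encoding' collapse into the single key 'Accept-encoding'.
    return {k.capitalize(): v for k, v in itertools.chain.from_iterable(map(dict.items, dicts))}


# Hypothetical inputs: library defaults plus a caller override in a different case.
defaults = {'Accept-Encoding': 'gzip, deflate', 'Accept-Language': 'en-us,en;q=0.5'}
overrides = {'accept-encoding': 'gzip, deflate, br'}

merged = merge_headers(defaults, overrides)
print(merged)
```

Because the normalization relies on `capitalize()`, callers that look a header up afterwards need to use the same normalized spelling, or a case-insensitive lookup.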
Author	SHA1	Message	Date
pukkandan	08d30158ec	[cleanup, docs] Misc cleanup Closes #2828, closes #2734, closes #2802, closes #2937	2022-03-08 22:38:06 +05:30
Ha Tien Loi	c89bec262c	[xinpianchang] Add extractor (#2963) Authored by: hatienl0i261299	2022-03-08 08:55:40 -08:00
Ha Tien Loi	151f8f1c02	[fptplay] Add extractor (#2949) Closes #2857 Authored by: hatienl0i261299	2022-03-08 08:52:51 -08:00
Max Mehl	a35155be17	[peertube] Add media.fsfe.org (#2986) Authored by: mxmehl	2022-03-08 08:48:35 -08:00
nyuszika7h	e66662b1e0	[ccma] Fix timestamp parsing (#2989) Authored by: nyuszika7h	2022-03-08 08:45:23 -08:00
coletdev	4390d5ec12	Add brotli content-encoding support (#2433) Authored by: coletdjnz	2022-03-08 08:44:05 -08:00
CplPwnies	9e0e6adb2d	[adobepass] Add Suddenlink MSO (#2977) Closes #2704 Authored by: CplPwnies	2022-03-08 08:18:52 -08:00
Lesmiscore	b637c4e22e	[mildom] Fix linter	2022-03-08 23:56:30 +09:00
Lesmiscore (Naoya Ozaki)	fb6e3f4389	[mildom] Rework extractors (#2940) Authored by: Lesmiscore	2022-03-08 23:49:10 +09:00