Compare commits

..

No commits in common. "f92347c3121e89e9097c2cdab2a2a11dd041a147" and "926ccc84ef91498f3147b07d15eb5f40cd070471" have entirely different histories.

148 changed files with 5890 additions and 8368 deletions

View File

@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.05.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@ -51,12 +51,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.06.22.1 (exe)
[debug] yt-dlp version 2022.05.18 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.06.22.1)
yt-dlp is up to date (2022.05.18)
<more lines>
render: shell
validations:

View File

@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.05.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@ -62,12 +62,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.06.22.1 (exe)
[debug] yt-dlp version 2022.05.18 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.06.22.1)
yt-dlp is up to date (2022.05.18)
<more lines>
render: shell
validations:

View File

@ -9,9 +9,9 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm requesting a site-specific feature
- label: I'm reporting a site feature request
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.05.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@ -60,12 +60,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.06.22.1 (exe)
[debug] yt-dlp version 2022.05.18 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.06.22.1)
yt-dlp is up to date (2022.05.18)
<more lines>
render: shell
validations:

View File

@ -11,7 +11,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.05.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@ -45,12 +45,12 @@ body:
[debug] Portable config file: yt-dlp.conf
[debug] Portable config: ['-i']
[debug] Encodings: locale cp1252, fs utf-8, stdout utf-8, stderr utf-8, pref cp1252
[debug] yt-dlp version 2022.06.22.1 (exe)
[debug] yt-dlp version 2022.05.18 (exe)
[debug] Python version 3.8.8 (CPython 64bit) - Windows-10-10.0.19041-SP0
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Optional libraries: Cryptodome, keyring, mutagen, sqlite, websockets
[debug] Proxy map: {}
yt-dlp is up to date (2022.06.22.1)
yt-dlp is up to date (2022.05.18)
<more lines>
render: shell
validations:

View File

@ -9,11 +9,11 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm requesting a feature unrelated to a specific site
- label: I'm reporting a feature request
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.05.18** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues including closed ones. DO NOT post duplicates
required: true

View File

@ -9,16 +9,14 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm asking a question and **not** reporting a bug or requesting a feature
- label: I'm asking a question and **not** reporting a bug/feature request
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.06.22.1** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions including closed ones. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions including closed ones
required: true
- type: textarea
id: question
attributes:

View File

@ -9,7 +9,7 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm requesting a site-specific feature
- label: I'm reporting a site feature request
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true

View File

@ -9,7 +9,7 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm requesting a feature unrelated to a specific site
- label: I'm reporting a feature request
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true

View File

@ -9,16 +9,14 @@ body:
description: |
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
options:
- label: I'm asking a question and **not** reporting a bug or requesting a feature
- label: I'm asking a question and **not** reporting a bug/feature request
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions including closed ones. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions including closed ones
required: true
- type: textarea
id: question
attributes:

View File

@ -2,20 +2,27 @@ name: Build
on: workflow_dispatch
jobs:
create_release:
build_unix:
runs-on: ubuntu-latest
outputs:
version_suffix: ${{ steps.version_suffix.outputs.version_suffix }}
ytdlp_version: ${{ steps.bump_version.outputs.ytdlp_version }}
upload_url: ${{ steps.create_release.outputs.upload_url }}
sha256_bin: ${{ steps.sha256_bin.outputs.sha256_bin }}
sha512_bin: ${{ steps.sha512_bin.outputs.sha512_bin }}
sha256_tar: ${{ steps.sha256_tar.outputs.sha256_tar }}
sha512_tar: ${{ steps.sha512_tar.outputs.sha512_tar }}
steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0
- uses: actions/setup-python@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.10'
python-version: '3.8'
- name: Install packages
run: sudo apt-get -y install zip pandoc man
- name: Set version suffix
id: version_suffix
env:
@ -27,27 +34,83 @@ jobs:
run: |
python devscripts/update-version.py ${{ steps.version_suffix.outputs.version_suffix }}
make issuetemplates
- name: Push to release
id: push_release
run: |
git config --global user.name github-actions
git config --global user.email github-actions@example.com
git add -u
git commit -m "[version] update" -m "Created by: ${{ github.event.sender.login }}" -m ":ci skip all :ci run dl"
git commit -m "[version] update" -m "Created by: ${{ github.event.sender.login }}" -m ":ci skip all"
git push origin --force ${{ github.event.ref }}:release
echo ::set-output name=head_sha::$(git rev-parse HEAD)
- name: Update master
id: push_master
env:
PUSH_VERSION_COMMIT: ${{ secrets.PUSH_VERSION_COMMIT }}
if: "env.PUSH_VERSION_COMMIT != ''"
run: git push origin ${{ github.event.ref }}
- name: Get Changelog
id: get_changelog
run: |
changelog=$(grep -oPz '(?s)(?<=### ${{ steps.bump_version.outputs.ytdlp_version }}\n{2}).+?(?=\n{2,3}###)' Changelog.md) || true
changelog=$(cat Changelog.md | grep -oPz '(?s)(?<=### ${{ steps.bump_version.outputs.ytdlp_version }}\n{2}).+?(?=\n{2,3}###)') || true
echo "changelog<<EOF" >> $GITHUB_ENV
echo "$changelog" >> $GITHUB_ENV
echo "EOF" >> $GITHUB_ENV
- name: Build lazy extractors
id: lazy_extractors
run: python devscripts/make_lazy_extractors.py
- name: Run Make
run: make all tar
- name: Get SHA2-256SUMS for yt-dlp
id: sha256_bin
run: echo "::set-output name=sha256_bin::$(sha256sum yt-dlp | awk '{print $1}')"
- name: Get SHA2-256SUMS for yt-dlp.tar.gz
id: sha256_tar
run: echo "::set-output name=sha256_tar::$(sha256sum yt-dlp.tar.gz | awk '{print $1}')"
- name: Get SHA2-512SUMS for yt-dlp
id: sha512_bin
run: echo "::set-output name=sha512_bin::$(sha512sum yt-dlp | awk '{print $1}')"
- name: Get SHA2-512SUMS for yt-dlp.tar.gz
id: sha512_tar
run: echo "::set-output name=sha512_tar::$(sha512sum yt-dlp.tar.gz | awk '{print $1}')"
- name: Install dependencies for pypi
env:
PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
if: "env.PYPI_TOKEN != ''"
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
- name: Build and publish on pypi
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
if: "env.TWINE_PASSWORD != ''"
run: |
rm -rf dist/*
python setup.py sdist bdist_wheel
twine upload dist/*
- name: Install SSH private key
env:
BREW_TOKEN: ${{ secrets.BREW_TOKEN }}
if: "env.BREW_TOKEN != ''"
uses: yt-dlp/ssh-agent@v0.5.3
with:
ssh-private-key: ${{ env.BREW_TOKEN }}
- name: Update Homebrew Formulae
env:
BREW_TOKEN: ${{ secrets.BREW_TOKEN }}
if: "env.BREW_TOKEN != ''"
run: |
git clone git@github.com:yt-dlp/homebrew-taps taps/
python3 devscripts/update-formulae.py taps/Formula/yt-dlp.rb "${{ steps.bump_version.outputs.ytdlp_version }}"
git -C taps/ config user.name github-actions
git -C taps/ config user.email github-actions@example.com
git -C taps/ commit -am 'yt-dlp: ${{ steps.bump_version.outputs.ytdlp_version }}'
git -C taps/ push
- name: Create Release
id: create_release
uses: actions/create-release@v1
@ -66,60 +129,13 @@ jobs:
${{ env.changelog }}
draft: false
prerelease: false
build_unix:
needs: create_release
runs-on: ubuntu-18.04 # Standalone executable should be built on minimum supported OS
outputs:
sha256_bin: ${{ steps.get_sha.outputs.sha256_bin }}
sha512_bin: ${{ steps.get_sha.outputs.sha512_bin }}
sha256_tar: ${{ steps.get_sha.outputs.sha256_tar }}
sha512_tar: ${{ steps.get_sha.outputs.sha512_tar }}
sha256_linux: ${{ steps.get_sha.outputs.sha256_linux }}
sha512_linux: ${{ steps.get_sha.outputs.sha512_linux }}
sha256_linux_zip: ${{ steps.get_sha.outputs.sha256_linux_zip }}
sha512_linux_zip: ${{ steps.get_sha.outputs.sha512_linux_zip }}
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
python-version: '3.10'
- name: Install Requirements
run: |
sudo apt-get -y install zip pandoc man
python -m pip install --upgrade pip setuptools wheel twine
python -m pip install Pyinstaller -r requirements.txt
- name: Prepare
run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py
- name: Build Unix executables
run: |
make all tar
python pyinst.py --onedir
(cd ./dist/yt-dlp_linux && zip -r ../yt-dlp_linux.zip .)
python pyinst.py
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_bin::$(sha256sum yt-dlp | awk '{print $1}')"
echo "::set-output name=sha512_bin::$(sha512sum yt-dlp | awk '{print $1}')"
echo "::set-output name=sha256_tar::$(sha256sum yt-dlp.tar.gz | awk '{print $1}')"
echo "::set-output name=sha512_tar::$(sha512sum yt-dlp.tar.gz | awk '{print $1}')"
echo "::set-output name=sha256_linux::$(sha256sum dist/yt-dlp_linux | awk '{print $1}')"
echo "::set-output name=sha512_linux::$(sha512sum dist/yt-dlp_linux | awk '{print $1}')"
echo "::set-output name=sha256_linux_zip::$(sha256sum dist/yt-dlp_linux.zip | awk '{print $1}')"
echo "::set-output name=sha512_linux_zip::$(sha512sum dist/yt-dlp_linux.zip | awk '{print $1}')"
- name: Upload zip binary
- name: Upload yt-dlp Unix binary
id: upload-release-asset
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: ./yt-dlp
asset_name: yt-dlp
asset_content_type: application/octet-stream
@ -128,269 +144,270 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: ./yt-dlp.tar.gz
asset_name: yt-dlp.tar.gz
asset_content_type: application/gzip
- name: Upload standalone binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_linux
asset_name: yt-dlp_linux
asset_content_type: application/octet-stream
- name: Upload onedir binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_linux.zip
asset_name: yt-dlp_linux.zip
asset_content_type: application/zip
- name: Build and publish on PyPi
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
if: "env.TWINE_PASSWORD != ''"
run: |
rm -rf dist/*
python setup.py sdist bdist_wheel
twine upload dist/*
- name: Install SSH private key for Homebrew
env:
BREW_TOKEN: ${{ secrets.BREW_TOKEN }}
if: "env.BREW_TOKEN != ''"
uses: yt-dlp/ssh-agent@v0.5.3
with:
ssh-private-key: ${{ env.BREW_TOKEN }}
- name: Update Homebrew Formulae
env:
BREW_TOKEN: ${{ secrets.BREW_TOKEN }}
if: "env.BREW_TOKEN != ''"
run: |
git clone git@github.com:yt-dlp/homebrew-taps taps/
python devscripts/update-formulae.py taps/Formula/yt-dlp.rb "${{ needs.create_release.outputs.ytdlp_version }}"
git -C taps/ config user.name github-actions
git -C taps/ config user.email github-actions@example.com
git -C taps/ commit -am 'yt-dlp: ${{ needs.create_release.outputs.ytdlp_version }}'
git -C taps/ push
build_macos:
runs-on: macos-11
needs: create_release
needs: build_unix
outputs:
sha256_macos: ${{ steps.get_sha.outputs.sha256_macos }}
sha512_macos: ${{ steps.get_sha.outputs.sha512_macos }}
sha256_macos_zip: ${{ steps.get_sha.outputs.sha256_macos_zip }}
sha512_macos_zip: ${{ steps.get_sha.outputs.sha512_macos_zip }}
sha256_macos: ${{ steps.sha256_macos.outputs.sha256_macos }}
sha512_macos: ${{ steps.sha512_macos.outputs.sha512_macos }}
sha256_macos_zip: ${{ steps.sha256_macos_zip.outputs.sha256_macos_zip }}
sha512_macos_zip: ${{ steps.sha512_macos_zip.outputs.sha512_macos_zip }}
steps:
- uses: actions/checkout@v2
# NB: In order to create a universal2 application, the version of python3 in /usr/bin has to be used
# In order to create a universal2 application, the version of python3 in /usr/bin has to be used
- name: Install Requirements
run: |
brew install coreutils
/usr/bin/python3 -m pip install -U --user pip Pyinstaller -r requirements.txt
- name: Prepare
run: |
/usr/bin/python3 devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }}
/usr/bin/python3 devscripts/make_lazy_extractors.py
- name: Build
run: |
/usr/bin/python3 pyinst.py --target-architecture universal2 --onedir
(cd ./dist/yt-dlp_macos && zip -r ../yt-dlp_macos.zip .)
/usr/bin/python3 pyinst.py --target-architecture universal2
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_macos::$(sha256sum dist/yt-dlp_macos | awk '{print $1}')"
echo "::set-output name=sha512_macos::$(sha512sum dist/yt-dlp_macos | awk '{print $1}')"
echo "::set-output name=sha256_macos_zip::$(sha256sum dist/yt-dlp_macos.zip | awk '{print $1}')"
echo "::set-output name=sha512_macos_zip::$(sha512sum dist/yt-dlp_macos.zip | awk '{print $1}')"
- name: Upload standalone binary
/usr/bin/python3 -m pip install -U --user pip Pyinstaller==4.10 -r requirements.txt
- name: Bump version
id: bump_version
run: /usr/bin/python3 devscripts/update-version.py
- name: Build lazy extractors
id: lazy_extractors
run: /usr/bin/python3 devscripts/make_lazy_extractors.py
- name: Run PyInstaller Script
run: /usr/bin/python3 pyinst.py --target-architecture universal2 --onefile
- name: Upload yt-dlp MacOS binary
id: upload-release-macos
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp_macos
asset_name: yt-dlp_macos
asset_content_type: application/octet-stream
- name: Upload onedir binary
- name: Get SHA2-256SUMS for yt-dlp_macos
id: sha256_macos
run: echo "::set-output name=sha256_macos::$(sha256sum dist/yt-dlp_macos | awk '{print $1}')"
- name: Get SHA2-512SUMS for yt-dlp_macos
id: sha512_macos
run: echo "::set-output name=sha512_macos::$(sha512sum dist/yt-dlp_macos | awk '{print $1}')"
- name: Run PyInstaller Script with --onedir
run: |
/usr/bin/python3 pyinst.py --target-architecture universal2 --onedir
zip ./dist/yt-dlp_macos.zip ./dist/yt-dlp_macos
- name: Upload yt-dlp MacOS onedir
id: upload-release-macos-zip
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp_macos.zip
asset_name: yt-dlp_macos.zip
asset_content_type: application/zip
- name: Get SHA2-256SUMS for yt-dlp_macos.zip
id: sha256_macos_zip
run: echo "::set-output name=sha256_macos_zip::$(sha256sum dist/yt-dlp_macos.zip | awk '{print $1}')"
- name: Get SHA2-512SUMS for yt-dlp_macos.zip
id: sha512_macos_zip
run: echo "::set-output name=sha512_macos_zip::$(sha512sum dist/yt-dlp_macos.zip | awk '{print $1}')"
build_windows:
runs-on: windows-latest
needs: create_release
needs: build_unix
outputs:
sha256_win: ${{ steps.get_sha.outputs.sha256_win }}
sha512_win: ${{ steps.get_sha.outputs.sha512_win }}
sha256_py2exe: ${{ steps.get_sha.outputs.sha256_py2exe }}
sha512_py2exe: ${{ steps.get_sha.outputs.sha512_py2exe }}
sha256_win_zip: ${{ steps.get_sha.outputs.sha256_win_zip }}
sha512_win_zip: ${{ steps.get_sha.outputs.sha512_win_zip }}
sha256_win: ${{ steps.sha256_win.outputs.sha256_win }}
sha512_win: ${{ steps.sha512_win.outputs.sha512_win }}
sha256_py2exe: ${{ steps.sha256_py2exe.outputs.sha256_py2exe }}
sha512_py2exe: ${{ steps.sha512_py2exe.outputs.sha512_py2exe }}
sha256_win_zip: ${{ steps.sha256_win_zip.outputs.sha256_win_zip }}
sha512_win_zip: ${{ steps.sha512_win_zip.outputs.sha512_win_zip }}
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with: # 3.8 is used for Win7 support
# 3.8 is used for Win7 support
- name: Set up Python 3.8
uses: actions/setup-python@v2
with:
python-version: '3.8'
- name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
# Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
run: |
python -m pip install --upgrade pip setuptools wheel py2exe
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-4.10-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py
- name: Build
run: |
python setup.py py2exe
Move-Item ./dist/yt-dlp.exe ./dist/yt-dlp_min.exe
python pyinst.py
python pyinst.py --onedir
Compress-Archive -Path ./dist/yt-dlp/* -DestinationPath ./dist/yt-dlp_win.zip
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_py2exe::$((Get-FileHash dist\yt-dlp_min.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_py2exe::$((Get-FileHash dist\yt-dlp_min.exe -Algorithm SHA512).Hash.ToLower())"
echo "::set-output name=sha256_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA512).Hash.ToLower())"
echo "::set-output name=sha256_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA512).Hash.ToLower())"
- name: Upload py2exe binary
- name: Bump version
id: bump_version
env:
version_suffix: ${{ needs.build_unix.outputs.version_suffix }}
run: python devscripts/update-version.py ${{ env.version_suffix }}
- name: Build lazy extractors
id: lazy_extractors
run: python devscripts/make_lazy_extractors.py
- name: Run PyInstaller Script
run: python pyinst.py
- name: Upload yt-dlp.exe Windows binary
id: upload-release-windows
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
asset_path: ./dist/yt-dlp_min.exe
asset_name: yt-dlp_min.exe
asset_content_type: application/vnd.microsoft.portable-executable
- name: Upload standalone binary
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp.exe
asset_name: yt-dlp.exe
asset_content_type: application/vnd.microsoft.portable-executable
- name: Upload onedir binary
- name: Get SHA2-256SUMS for yt-dlp.exe
id: sha256_win
run: echo "::set-output name=sha256_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA256).Hash.ToLower())"
- name: Get SHA2-512SUMS for yt-dlp.exe
id: sha512_win
run: echo "::set-output name=sha512_win::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA512).Hash.ToLower())"
- name: Run PyInstaller Script with --onedir
run: |
python pyinst.py --onedir
Compress-Archive -LiteralPath ./dist/yt-dlp -DestinationPath ./dist/yt-dlp_win.zip
- name: Upload yt-dlp Windows onedir
id: upload-release-windows-zip
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp_win.zip
asset_name: yt-dlp_win.zip
asset_content_type: application/zip
- name: Get SHA2-256SUMS for yt-dlp_win.zip
id: sha256_win_zip
run: echo "::set-output name=sha256_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA256).Hash.ToLower())"
- name: Get SHA2-512SUMS for yt-dlp_win.zip
id: sha512_win_zip
run: echo "::set-output name=sha512_win_zip::$((Get-FileHash dist\yt-dlp_win.zip -Algorithm SHA512).Hash.ToLower())"
- name: Run py2exe Script
run: python setup.py py2exe
- name: Upload yt-dlp_min.exe Windows binary
id: upload-release-windows-py2exe
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp.exe
asset_name: yt-dlp_min.exe
asset_content_type: application/vnd.microsoft.portable-executable
- name: Get SHA2-256SUMS for yt-dlp_min.exe
id: sha256_py2exe
run: echo "::set-output name=sha256_py2exe::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA256).Hash.ToLower())"
- name: Get SHA2-512SUMS for yt-dlp_min.exe
id: sha512_py2exe
run: echo "::set-output name=sha512_py2exe::$((Get-FileHash dist\yt-dlp.exe -Algorithm SHA512).Hash.ToLower())"
build_windows32:
runs-on: windows-latest
needs: create_release
needs: build_unix
outputs:
sha256_win32: ${{ steps.get_sha.outputs.sha256_win32 }}
sha512_win32: ${{ steps.get_sha.outputs.sha512_win32 }}
sha256_win32: ${{ steps.sha256_win32.outputs.sha256_win32 }}
sha512_win32: ${{ steps.sha512_win32.outputs.sha512_win32 }}
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with: # 3.7 is used for Vista support. See https://github.com/yt-dlp/yt-dlp/issues/390
# 3.7 is used for Vista support. See https://github.com/yt-dlp/yt-dlp/issues/390
- name: Set up Python 3.7 32-Bit
uses: actions/setup-python@v2
with:
python-version: '3.7'
architecture: 'x86'
- name: Install Requirements
run: |
python -m pip install --upgrade pip setuptools wheel
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-4.10-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |
python devscripts/update-version.py ${{ needs.create_release.outputs.version_suffix }}
python devscripts/make_lazy_extractors.py
- name: Build
run: |
python pyinst.py
- name: Get SHA2-SUMS
id: get_sha
run: |
echo "::set-output name=sha256_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA256).Hash.ToLower())"
echo "::set-output name=sha512_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA512).Hash.ToLower())"
- name: Upload standalone binary
- name: Bump version
id: bump_version
env:
version_suffix: ${{ needs.build_unix.outputs.version_suffix }}
run: python devscripts/update-version.py ${{ env.version_suffix }}
- name: Build lazy extractors
id: lazy_extractors
run: python devscripts/make_lazy_extractors.py
- name: Run PyInstaller Script for 32 Bit
run: python pyinst.py
- name: Upload Executable yt-dlp_x86.exe
id: upload-release-windows32
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./dist/yt-dlp_x86.exe
asset_name: yt-dlp_x86.exe
asset_content_type: application/vnd.microsoft.portable-executable
- name: Get SHA2-256SUMS for yt-dlp_x86.exe
id: sha256_win32
run: echo "::set-output name=sha256_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA256).Hash.ToLower())"
- name: Get SHA2-512SUMS for yt-dlp_x86.exe
id: sha512_win32
run: echo "::set-output name=sha512_win32::$((Get-FileHash dist\yt-dlp_x86.exe -Algorithm SHA512).Hash.ToLower())"
finish:
runs-on: ubuntu-latest
needs: [create_release, build_unix, build_windows, build_windows32, build_macos]
needs: [build_unix, build_windows, build_windows32, build_macos]
steps:
- name: Make SHA2-SUMS files
- name: Make SHA2-256SUMS file
env:
SHA256_BIN: ${{ needs.build_unix.outputs.sha256_bin }}
SHA256_TAR: ${{ needs.build_unix.outputs.sha256_tar }}
SHA256_WIN: ${{ needs.build_windows.outputs.sha256_win }}
SHA256_PY2EXE: ${{ needs.build_windows.outputs.sha256_py2exe }}
SHA256_WIN_ZIP: ${{ needs.build_windows.outputs.sha256_win_zip }}
SHA256_WIN32: ${{ needs.build_windows32.outputs.sha256_win32 }}
SHA256_MACOS: ${{ needs.build_macos.outputs.sha256_macos }}
SHA256_MACOS_ZIP: ${{ needs.build_macos.outputs.sha256_macos_zip }}
run: |
echo "${{ needs.build_unix.outputs.sha256_bin }} yt-dlp" >> SHA2-256SUMS
echo "${{ needs.build_unix.outputs.sha256_tar }} yt-dlp.tar.gz" >> SHA2-256SUMS
echo "${{ needs.build_unix.outputs.sha256_linux }} yt-dlp_linux" >> SHA2-256SUMS
echo "${{ needs.build_unix.outputs.sha256_linux_zip }} yt-dlp_linux.zip" >> SHA2-256SUMS
echo "${{ needs.build_windows.outputs.sha256_win }} yt-dlp.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows.outputs.sha256_py2exe }} yt-dlp_min.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows32.outputs.sha256_win32 }} yt-dlp_x86.exe" >> SHA2-256SUMS
echo "${{ needs.build_windows.outputs.sha256_win_zip }} yt-dlp_win.zip" >> SHA2-256SUMS
echo "${{ needs.build_macos.outputs.sha256_macos }} yt-dlp_macos" >> SHA2-256SUMS
echo "${{ needs.build_macos.outputs.sha256_macos_zip }} yt-dlp_macos.zip" >> SHA2-256SUMS
echo "${{ needs.build_unix.outputs.sha512_bin }} yt-dlp" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_tar }} yt-dlp.tar.gz" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_linux }} yt-dlp_linux" >> SHA2-512SUMS
echo "${{ needs.build_unix.outputs.sha512_linux_zip }} yt-dlp_linux.zip" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_win }} yt-dlp.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_py2exe }} yt-dlp_min.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows32.outputs.sha512_win32 }} yt-dlp_x86.exe" >> SHA2-512SUMS
echo "${{ needs.build_windows.outputs.sha512_win_zip }} yt-dlp_win.zip" >> SHA2-512SUMS
echo "${{ needs.build_macos.outputs.sha512_macos }} yt-dlp_macos" >> SHA2-512SUMS
echo "${{ needs.build_macos.outputs.sha512_macos_zip }} yt-dlp_macos.zip" >> SHA2-512SUMS
- name: Upload SHA2-256SUMS file
echo "${{ env.SHA256_BIN }} yt-dlp" >> SHA2-256SUMS
echo "${{ env.SHA256_TAR }} yt-dlp.tar.gz" >> SHA2-256SUMS
echo "${{ env.SHA256_WIN }} yt-dlp.exe" >> SHA2-256SUMS
echo "${{ env.SHA256_PY2EXE }} yt-dlp_min.exe" >> SHA2-256SUMS
echo "${{ env.SHA256_WIN32 }} yt-dlp_x86.exe" >> SHA2-256SUMS
echo "${{ env.SHA256_WIN_ZIP }} yt-dlp_win.zip" >> SHA2-256SUMS
echo "${{ env.SHA256_MACOS }} yt-dlp_macos" >> SHA2-256SUMS
echo "${{ env.SHA256_MACOS_ZIP }} yt-dlp_macos.zip" >> SHA2-256SUMS
- name: Upload 256SUMS file
id: upload-sums
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./SHA2-256SUMS
asset_name: SHA2-256SUMS
asset_content_type: text/plain
- name: Upload SHA2-512SUMS file
- name: Make SHA2-512SUMS file
env:
SHA512_BIN: ${{ needs.build_unix.outputs.sha512_bin }}
SHA512_TAR: ${{ needs.build_unix.outputs.sha512_tar }}
SHA512_WIN: ${{ needs.build_windows.outputs.sha512_win }}
SHA512_PY2EXE: ${{ needs.build_windows.outputs.sha512_py2exe }}
SHA512_WIN_ZIP: ${{ needs.build_windows.outputs.sha512_win_zip }}
SHA512_WIN32: ${{ needs.build_windows32.outputs.sha512_win32 }}
SHA512_MACOS: ${{ needs.build_macos.outputs.sha512_macos }}
SHA512_MACOS_ZIP: ${{ needs.build_macos.outputs.sha512_macos_zip }}
run: |
echo "${{ env.SHA512_BIN }} yt-dlp" >> SHA2-512SUMS
echo "${{ env.SHA512_TAR }} yt-dlp.tar.gz" >> SHA2-512SUMS
echo "${{ env.SHA512_WIN }} yt-dlp.exe" >> SHA2-512SUMS
echo "${{ env.SHA512_WIN_ZIP }} yt-dlp_win.zip" >> SHA2-512SUMS
echo "${{ env.SHA512_PY2EXE }} yt-dlp_min.exe" >> SHA2-512SUMS
echo "${{ env.SHA512_WIN32 }} yt-dlp_x86.exe" >> SHA2-512SUMS
echo "${{ env.SHA512_MACOS }} yt-dlp_macos" >> SHA2-512SUMS
echo "${{ env.SHA512_MACOS_ZIP }} yt-dlp_macos.zip" >> SHA2-512SUMS
- name: Upload 512SUMS file
id: upload-512sums
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ needs.create_release.outputs.upload_url }}
upload_url: ${{ needs.build_unix.outputs.upload_url }}
asset_path: ./SHA2-512SUMS
asset_name: SHA2-512SUMS
asset_content_type: text/plain

View File

@ -10,15 +10,12 @@ jobs:
matrix:
os: [ubuntu-latest]
# CPython 3.9 is in quick-test
python-version: ['3.6', '3.7', '3.10', 3.11-dev, pypy-3.6, pypy-3.7, pypy-3.8]
python-version: ['3.6', '3.7', '3.10', 3.11-dev, pypy-3.6, pypy-3.7, pypy-3.8, pypy-3.9]
run-tests-ext: [sh]
include:
# atleast one of each CPython/PyPy tests must be in windows
# atleast one of the tests must be in windows
- os: windows-latest
python-version: '3.8'
run-tests-ext: bat
- os: windows-latest
python-version: pypy-3.9
python-version: 3.8
run-tests-ext: bat
steps:
- uses: actions/checkout@v2

View File

@ -9,15 +9,11 @@ jobs:
fail-fast: true
matrix:
os: [ubuntu-latest]
python-version: ['3.6', '3.7', '3.9', '3.10', 3.11-dev, pypy-3.6, pypy-3.7, pypy-3.8]
python-version: ['3.6', '3.7', '3.9', '3.10', 3.11-dev, pypy-3.6, pypy-3.7, pypy-3.8, pypy-3.9]
run-tests-ext: [sh]
include:
# atleast one of each CPython/PyPy tests must be in windows
- os: windows-latest
python-version: '3.8'
run-tests-ext: bat
- os: windows-latest
python-version: pypy-3.9
python-version: 3.8
run-tests-ext: bat
steps:
- uses: actions/checkout@v2

View File

@ -214,7 +214,7 @@ After you have ensured this site is distributing its content legally, you can fo
# TODO more properties (see yt_dlp/extractor/common.py)
}
```
1. Add an import in [`yt_dlp/extractor/_extractors.py`](yt_dlp/extractor/_extractors.py). Note that the class name must end with `IE`.
1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py).
1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Make sure you have atleast one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
@ -225,7 +225,7 @@ After you have ensured this site is distributing its content legally, you can fo
1. Make sure your code works under all [Python](https://www.python.org/) versions supported by yt-dlp, namely CPython and PyPy for Python 3.6 and above. Backward compatibility is not required for even older versions of Python.
1. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files, [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
$ git add yt_dlp/extractor/_extractors.py
$ git add yt_dlp/extractor/extractors.py
$ git add yt_dlp/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add extractor'
$ git push origin yourextractor
@ -300,10 +300,14 @@ description = meta['summary'] # incorrect
The latter will break extraction process with `KeyError` if `summary` disappears from `meta` at some later time but with the former approach extraction will just go ahead with `description` set to `None` which is perfectly fine (remember `None` is equivalent to the absence of data).
If the data is nested, do not use `.get` chains, but instead make use of `traverse_obj`.
If the data is nested, do not use `.get` chains, but instead make use of the utility functions `try_get` or `traverse_obj`
Considering the above `meta` again, assume you want to extract `["user"]["name"]` and put it in the resulting info dict as `uploader`
```python
uploader = try_get(meta, lambda x: x['user']['name']) # correct
```
or
```python
uploader = traverse_obj(meta, ('user', 'name')) # correct
```
@ -317,10 +321,6 @@ or
```python
uploader = meta.get('user', {}).get('name') # incorrect
```
or
```python
uploader = try_get(meta, lambda x: x['user']['name']) # old utility
```
Similarly, you should pass `fatal=False` when extracting optional data from a webpage with `_search_regex`, `_html_search_regex` or similar methods, for instance:
@ -346,25 +346,25 @@ On failure this code will silently continue the extraction with `description` se
Another thing to remember is not to try to iterate over `None`
Say you extracted a list of thumbnails into `thumbnail_data` and want to iterate over them
Say you extracted a list of thumbnails into `thumbnail_data` using `try_get` and now want to iterate over them
```python
thumbnail_data = data.get('thumbnails') or []
thumbnail_data = try_get(...)
thumbnails = [{
'url': item['url']
} for item in thumbnail_data] # correct
} for item in thumbnail_data or []] # correct
```
and not like:
```python
thumbnail_data = data.get('thumbnails')
thumbnail_data = try_get(...)
thumbnails = [{
'url': item['url']
} for item in thumbnail_data] # incorrect
```
In this case, `thumbnail_data` will be `None` if the field was not found and this will cause the loop `for item in thumbnail_data` to raise a fatal error. Using `or []` avoids this error and results in setting an empty list in `thumbnails` instead.
In the later case, `thumbnail_data` will be `None` if the field was not found and this will cause the loop `for item in thumbnail_data` to raise a fatal error. Using `for item in thumbnail_data or []` avoids this error and results in setting an empty list in `thumbnails` instead.
### Provide fallbacks
@ -431,7 +431,7 @@ title = self._search_regex( # correct
r'<span[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
```
which tolerates potential changes in the `style` attribute's value. Or even better:
Or even better:
```python
title = self._search_regex( # correct
@ -439,7 +439,7 @@ title = self._search_regex( # correct
webpage, 'title', group='title')
```
which also handles both single quotes in addition to double quotes.
Note how you tolerate potential changes in the `style` attribute's value or switch from using double quotes to single for `class` attribute:
The code definitely should not look like:
@ -460,41 +460,6 @@ title = self._search_regex( # incorrect
Here the presence or absence of other attributes including `style` is irrelevent for the data we need, and so the regex must not depend on it
#### Keep the regular expressions as simple as possible, but no simpler
Since many extractors deal with unstructured data provided by websites, we will often need to use very complex regular expressions. You should try to use the *simplest* regex that can accomplish what you want. In other words, each part of the regex must have a reason for existing. If you can take out a symbol and the functionality does not change, the symbol should not be there.
##### Example
Correct:
```python
_VALID_URL = r'https?://(?:www\.)?website\.com/(?:[^/]+/){3,4}(?P<display_id>[^/]+)_(?P<id>\d+)'
```
Incorrect:
```python
_VALID_URL = r'https?:\/\/(?:www\.)?website\.com\/[^\/]+/[^\/]+/[^\/]+(?:\/[^\/]+)?\/(?P<display_id>[^\/]+)_(?P<id>\d+)'
```
#### Do not misuse `.` and use the correct quantifiers (`+*?`)
Avoid creating regexes that over-match because of wrong use of quantifiers. Also try to avoid non-greedy matching (`?`) where possible since they could easily result in [catastrophic backtracking](https://www.regular-expressions.info/catastrophic.html)
Correct:
```python
title = self._search_regex(r'<span\b[^>]+class="title"[^>]*>([^<]+)', webpage, 'title')
```
Incorrect:
```python
title = self._search_regex(r'<span\b.*class="title".*>(.+?)<', webpage, 'title')
```
### Long lines policy
There is a soft limit to keep lines of code under 100 characters long. This means it should be respected if possible and if it does not make readability and code maintenance worse. Sometimes, it may be reasonable to go upto 120 characters and sometimes even 80 can be unreadable. Keep in mind that this is not a hard limit and is just one of many tools to make the code more readable.
@ -556,22 +521,19 @@ formats = self._extract_m3u8_formats(m3u8_url,
### Quotes
Always use single quotes for strings (even if the string has `'`) and double quotes for docstrings. Use `'''` only for multi-line strings. An exception can be made if a string has multiple single quotes in it and escaping makes it *significantly* harder to read. For f-strings, use you can use double quotes on the inside. But avoid f-strings that have too many quotes inside.
Always use single quotes for strings (even if the string has `'`) and double quotes for docstrings. Use `'''` only for multi-line strings. An exception can be made if a string has multiple single quotes in it and escaping makes it significantly harder to read. For f-strings, use you can use double quotes on the inside. But avoid f-strings that have too many quotes inside.
### Inline values
Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Examples
#### Example
Correct:
```python
return {
'title': self._html_search_regex(r'<h1>([^<]+)</h1>', webpage, 'title'),
# ...some lines of code...
}
title = self._html_search_regex(r'<h1>([^<]+)</h1>', webpage, 'title')
```
Incorrect:
@ -580,11 +542,6 @@ Incorrect:
TITLE_RE = r'<h1>([^<]+)</h1>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
# ...some lines of code...
return {
'title': title,
# ...some lines of code...
}
```
@ -616,32 +573,33 @@ Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`,
### Trailing parentheses
Always move trailing parentheses used for grouping/functions after the last argument. On the other hand, multi-line literal list/tuple/dict/set should closed be in a new line. Generators and list/dict comprehensions may use either style
Always move trailing parentheses used for grouping/functions after the last argument. On the other hand, literal list/tuple/dict/set should closed be in a new line. Generators and list/dict comprehensions may use either style
#### Examples
Correct:
```python
url = traverse_obj(info, (
'context', 'dispatcher', 'stores', 'VideoTitlePageStore', 'data', 'video', 0, 'VideoUrlSet', 'VideoUrl'), list)
url = try_get(
info,
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Correct:
```python
url = traverse_obj(
info,
('context', 'dispatcher', 'stores', 'VideoTitlePageStore', 'data', 'video', 0, 'VideoUrlSet', 'VideoUrl'),
list)
url = try_get(info,
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
url = traverse_obj(
url = try_get(
info,
('context', 'dispatcher', 'stores', 'VideoTitlePageStore', 'data', 'video', 0, 'VideoUrlSet', 'VideoUrl'),
list
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
@ -690,17 +648,21 @@ Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field ext
Explore [`yt_dlp/utils.py`](yt_dlp/utils.py) for more useful convenience functions.
#### Examples
#### More examples
##### Safely extract optional description from parsed JSON
```python
description = traverse_obj(response, ('result', 'video', 'summary'), expected_type=str)
thumbnails = traverse_obj(response, ('result', 'thumbnails', ..., 'url'), expected_type=url_or_none)
```
##### Safely extract more optional metadata
```python
video = traverse_obj(response, ('result', 'video', 0), default={}, expected_type=dict)
description = video.get('summary')
duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))
```
# My pull request is labeled pending-fixes
The `pending-fixes` label is added when there are changes requested to a PR. When the necessary changes are made, the label should be removed. However, despite our best efforts, it may sometimes happen that the maintainer did not see the changes or forgot to remove the label. If your PR is still marked as `pending-fixes` a few days after all requested changes have been made, feel free to ping the maintainer who labeled your issue and ask them to re-review and remove the label.

View File

@ -248,22 +248,3 @@ rand-net
vertan
Wikidepia
Yipten
moench-tegeder
christoph-heinrich
HobbyistDev
LunarFang416
sbor23
aurelg
adamanldo
gamer191
vkorablin
Burve
mnn
ZhymabekRoman
mozbugbox
aejdl
ping
sqrtNOT
bubbleguuum
darkxex
miseran

View File

@ -11,131 +11,6 @@
-->
### 2022.06.22.1
* [build] Fix updating homebrew formula
### 2022.06.22
* [**Deprecate support for Python 3.6**](https://github.com/yt-dlp/yt-dlp/issues/3764#issuecomment-1154051119)
* **Add option `--download-sections` to download video partially**
* Chapter regex and time ranges are accepted (Eg: `--download-sections *1:10-2:20`)
* Add option `--alias`
* Add option `--lazy-playlist` to process entries as they are received
* Add option `--retry-sleep`
* Add slicing notation to `--playlist-items`
* Adds support for negative indices and step
* Add `-I` as alias for `--playlist-index`
* Makes `--playlist-start`, `--playlist-end`, `--playlist-reverse`, `--no-playlist-reverse` redundant
* `--config-location -` to provide options interactively
* [build] Add Linux standalone builds
* [update] Self-restart after update
* Merge youtube-dl: Upto [commit/8a158a9](https://github.com/ytdl-org/youtube-dl/commit/8a158a9)
* Add `--no-update`
* Allow extractors to specify section_start/end for clips
* Do not print progress to `stderr` with `-q`
* Ensure pre-processor errors do not block video download
* Fix `--simulate --max-downloads`
* Improve error handling of bad config files
* Return an error code if update fails
* Fix bug in [3a408f9](https://github.com/yt-dlp/yt-dlp/commit/3a408f9d199127ca2626359e21a866a09ab236b3)
* [ExtractAudio] Allow conditional conversion
* [ModifyChapters] Fix repeated removal of small segments
* [ThumbnailsConvertor] Allow conditional conversion
* [cookies] Detect profiles for cygwin/BSD by [moench-tegeder](https://github.com/moench-tegeder)
* [dash] Show fragment count with `--live-from-start` by [flashdagger](https://github.com/flashdagger)
* [extractor] Add `_search_json` by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor] Add `default` parameter to `_search_json` by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor] Add dev option `--load-pages`
* [extractor] Handle `json_ld` with multiple `@type`s
* [extractor] Import `_ALL_CLASSES` lazily
* [extractor] Recognize `src` attribute from HTML5 media elements by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/generic] Revert e6ae51c123897927eb3c9899923d8ffd31c7f85d
* [f4m] Bugfix
* [ffmpeg] Check version lazily
* [jsinterp] Some optimizations and refactoring by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [utils] Improve performance using `functools.cache`
* [utils] Send HTTP/1.1 ALPN extension by [coletdjnz](https://github.com/coletdjnz)
* [utils] `ExtractorError`: Fix `exc_info`
* [utils] `ISO3166Utils`: Add `EU` and `AP`
* [utils] `Popen`: Refactor to use contextmanager
* [utils] `locked_file`: Fix for PyPy on Windows
* [update] Expose more functionality to API
* [update] Use `.git` folder to distinguish `source`/`unknown`
* [compat] Add `functools.cached_property`
* [test] Fix `FakeYDL` signatures by [coletdjnz](https://github.com/coletdjnz)
* [docs] Improvements
* [cleanup, ExtractAudio] Refactor
* [cleanup, downloader] Refactor `report_progress`
* [cleanup, extractor] Refactor `_download_...` methods
* [cleanup, extractor] Rename `extractors.py` to `_extractors.py`
* [cleanup, utils] Don't use kwargs for `format_field`
* [cleanup, build] Refactor
* [cleanup, docs] Re-indent "Usage and Options" section
* [cleanup] Deprecate `YoutubeDL.parse_outtmpl`
* [cleanup] Misc fixes and cleanup by [Lesmiscore](https://github.com/Lesmiscore), [MrRawes](https://github.com/MrRawes), [christoph-heinrich](https://github.com/christoph-heinrich), [flashdagger](https://github.com/flashdagger), [gamer191](https://github.com/gamer191), [kwconder](https://github.com/kwconder), [pukkandan](https://github.com/pukkandan)
* [extractor/DailyWire] Add extractors by [HobbyistDev](https://github.com/HobbyistDev), [pukkandan](https://github.com/pukkandan)
* [extractor/fourzerostudio] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/GoogleDrive] Add folder extractor by [evansp](https://github.com/evansp), [pukkandan](https://github.com/pukkandan)
* [extractor/MirrorCoUK] Add extractor by [LunarFang416](https://github.com/LunarFang416), [pukkandan](https://github.com/pukkandan)
* [extractor/atscaleconfevent] Add extractor by [Ashish0804](https://github.com/Ashish0804)
* [extractor/freetv] Add extractor by [elyse0](https://github.com/elyse0)
* [extractor/ixigua] Add Extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/kicker.de] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/netverse] Add extractors by [HobbyistDev](https://github.com/HobbyistDev), [pukkandan](https://github.com/pukkandan)
* [extractor/playsuisse] Add extractor by [pukkandan](https://github.com/pukkandan), [sbor23](https://github.com/sbor23)
* [extractor/substack] Add extractor by [elyse0](https://github.com/elyse0)
* [extractor/youtube] **Support downloading clips**
* [extractor/youtube] Add `innertube_host` and `innertube_key` extractor args by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Add warning for PostLiveDvr
* [extractor/youtube] Bring back `_extract_chapters_from_description`
* [extractor/youtube] Extract `comment_count` from webpage
* [extractor/youtube] Fix `:ytnotifications` extractor by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Fix initial player response extraction by [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [extractor/youtube] Fix live chat for videos with content warning by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Make signature extraction non-fatal
* [extractor/youtube:tab] Detect `videoRenderer` in `_post_thread_continuation_entries`
* [extractor/BiliIntl] Fix metadata extraction
* [extractor/BiliIntl] Fix subtitle extraction by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/FranceCulture] Fix extractor by [aurelg](https://github.com/aurelg), [pukkandan](https://github.com/pukkandan)
* [extractor/PokemonSoundLibrary] Remove extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/StreamCZ] Fix extractor by [adamanldo](https://github.com/adamanldo), [dirkf](https://github.com/dirkf)
* [extractor/WatchESPN] Support free videos and BAM_DTC by [ischmidt20](https://github.com/ischmidt20)
* [extractor/animelab] Remove extractor by [gamer191](https://github.com/gamer191)
* [extractor/bloomberg] Change playback endpoint by [m4tu4g](https://github.com/m4tu4g)
* [extractor/ccc] Extract view_count by [vkorablin](https://github.com/vkorablin)
* [extractor/crunchyroll:beta] Fix extractor after API change by [Burve](https://github.com/Burve), [tejing1](https://github.com/tejing1)
* [extractor/curiositystream] Get `auth_token` from cookie by [mnn](https://github.com/mnn)
* [extractor/digitalconcerthall] Fix extractor by [ZhymabekRoman](https://github.com/ZhymabekRoman)
* [extractor/dropbox] Extract the correct `mountComponent`
* [extractor/dropout] Login is not mandatory
* [extractor/duboku] Fix for hostname change by [mozbugbox](https://github.com/mozbugbox)
* [extractor/espn] Add `WatchESPN` extractor by [ischmidt20](https://github.com/ischmidt20), [pukkandan](https://github.com/pukkandan)
* [extractor/expressen] Fix extractor by [aejdl](https://github.com/aejdl)
* [extractor/foxnews] Update embed extraction by [elyse0](https://github.com/elyse0)
* [extractor/ina] Fix extractor by [elyse0](https://github.com/elyse0)
* [extractor/iwara:user] Make paging better by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/jwplatform] Look for `data-video-jw-id`
* [extractor/lbry] Update livestream API by [flashdagger](https://github.com/flashdagger)
* [extractor/mediaset] Improve `_VALID_URL`
* [extractor/naver] Add `navernow` extractor by [ping](https://github.com/ping)
* [extractor/niconico:series] Fix extractor by [sqrtNOT](https://github.com/sqrtNOT)
* [extractor/npr] Use stream url from json-ld by [r5d](https://github.com/r5d)
* [extractor/pornhub] Extract `uploader_id` field by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/radiofrance] Add more radios by [bubbleguuum](https://github.com/bubbleguuum)
* [extractor/rumble] Detect JS embed
* [extractor/rumble] Extract subtitles by [fstirlitz](https://github.com/fstirlitz)
* [extractor/southpark] Add `southpark.lat` extractor by [darkxex](https://github.com/darkxex)
* [extractor/spotify:show] Fix extractor
* [extractor/tiktok] Detect embeds
* [extractor/tiktok] Extract `SIGI_STATE` by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan), [sulyi](https://github.com/sulyi)
* [extractor/tver] Fix extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/vevo] Fix extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/yahoo:gyao] Fix extractor
* [extractor/zattoo] Fix live streams by [miseran](https://github.com/miseran)
* [extractor/zdf] Improve format sorting by [elyse0](https://github.com/elyse0)
### 2022.05.18
* Add support for SSL client certificate authentication by [coletdjnz](https://github.com/coletdjnz), [dirkf](https://github.com/dirkf)
@ -1281,7 +1156,7 @@
* [build] Automate more of the release process by [animelover1984](https://github.com/animelover1984), [pukkandan](https://github.com/pukkandan)
* [build] Fix sha256 by [nihil-admirari](https://github.com/nihil-admirari)
* [build] Bring back brew taps by [nao20010128nao](https://github.com/nao20010128nao)
* [build] Provide `--onedir` zip for windows
* [build] Provide `--onedir` zip for windows by [pukkandan](https://github.com/pukkandan)
* [cleanup,docs] Add deprecation warning in docs for some counter intuitive behaviour
* [cleanup] Fix line endings for `nebula.py` by [glenn-slayden](https://github.com/glenn-slayden)
* [cleanup] Improve `make clean-test` by [sulyi](https://github.com/sulyi)

View File

@ -9,8 +9,7 @@ tar: yt-dlp.tar.gz
# Keep this list in sync with MANIFEST.in
# intended use: when building a source distribution,
# make pypi-files && python setup.py sdist
pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites \
completions yt-dlp.1 requirements.txt setup.cfg devscripts/* test/*
pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites completions yt-dlp.1 devscripts/* test/*
.PHONY: all clean install test tar pypi-files completions ot offlinetest codetest supportedsites
@ -43,7 +42,7 @@ PYTHON ?= /usr/bin/env python3
SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi)
# set markdown input format to "markdown-smart" for pandoc version 2 and to "markdown" for pandoc prior to version 2
MARKDOWN = $(shell if [ `pandoc -v | head -n1 | cut -d" " -f2 | head -c1` = "2" ]; then echo markdown-smart; else echo markdown; fi)
MARKDOWN = $(shell if [ "$(pandoc -v | head -n1 | cut -d" " -f2 | head -c1)" = "2" ]; then echo markdown-smart; else echo markdown; fi)
install: lazy-extractors yt-dlp yt-dlp.1 completions
mkdir -p $(DESTDIR)$(BINDIR)
@ -92,10 +91,10 @@ yt-dlp: yt_dlp/*.py yt_dlp/*/*.py
rm yt-dlp.zip
chmod a+x yt-dlp
README.md: yt_dlp/*.py yt_dlp/*/*.py devscripts/make_readme.py
COLUMNS=80 $(PYTHON) yt_dlp/__main__.py --ignore-config --help | $(PYTHON) devscripts/make_readme.py
README.md: yt_dlp/*.py yt_dlp/*/*.py
COLUMNS=80 $(PYTHON) yt_dlp/__main__.py --help | $(PYTHON) devscripts/make_readme.py
CONTRIBUTING.md: README.md devscripts/make_contributing.py
CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.yml .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.yml .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.yml .github/ISSUE_TEMPLATE_tmpl/4_bug_report.yml .github/ISSUE_TEMPLATE_tmpl/5_feature_request.yml yt_dlp/version.py
@ -112,7 +111,7 @@ supportedsites:
README.txt: README.md
pandoc -f $(MARKDOWN) -t plain README.md -o README.txt
yt-dlp.1: README.md devscripts/prepare_manpage.py
yt-dlp.1: README.md
$(PYTHON) devscripts/prepare_manpage.py yt-dlp.1.temp.md
pandoc -s -f $(MARKDOWN) -t man yt-dlp.1.temp.md -o yt-dlp.1
rm -f yt-dlp.1.temp.md
@ -129,7 +128,7 @@ completions/fish/yt-dlp.fish: yt_dlp/*.py yt_dlp/*/*.py devscripts/fish-completi
mkdir -p completions/fish
$(PYTHON) devscripts/fish-completion.py
_EXTRACTOR_FILES = $(shell find yt_dlp/extractor -name '*.py' -and -not -name 'lazy_extractors.py')
_EXTRACTOR_FILES = $(shell find yt_dlp/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py')
yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
$(PYTHON) devscripts/make_lazy_extractors.py $@
@ -148,7 +147,7 @@ yt-dlp.tar.gz: all
CONTRIBUTING.md Collaborators.md CONTRIBUTORS AUTHORS \
Makefile MANIFEST.in yt-dlp.1 README.txt completions \
setup.py setup.cfg yt-dlp yt_dlp requirements.txt \
devscripts test
devscripts test tox.ini pytest.ini
AUTHORS: .mailmap
git shortlog -s -n | cut -f2 | sort > AUTHORS

1556
README.md

File diff suppressed because it is too large Load Diff

View File

@ -1,4 +1,5 @@
#!/usr/bin/env python3
import io
import optparse

View File

@ -53,7 +53,7 @@ def get_all_ies():
if os.path.exists(PLUGINS_DIRNAME):
os.rename(PLUGINS_DIRNAME, BLOCKED_DIRNAME)
try:
from yt_dlp.extractor.extractors import _ALL_CLASSES
from yt_dlp.extractor import _ALL_CLASSES
finally:
if os.path.exists(BLOCKED_DIRNAME):
os.rename(BLOCKED_DIRNAME, PLUGINS_DIRNAME)

View File

@ -2,7 +2,6 @@
# yt-dlp --help | make_readme.py
# This must be run in a console of correct width
import functools
import re
import sys
@ -11,60 +10,21 @@ README_FILE = 'README.md'
OPTIONS_START = 'General Options:'
OPTIONS_END = 'CONFIGURATION'
EPILOG_START = 'See full documentation'
ALLOWED_OVERSHOOT = 2
DISABLE_PATCH = object()
def take_section(text, start=None, end=None, *, shift=0):
return text[
text.index(start) + shift if start else None:
text.index(end) + shift if end else None
]
helptext = sys.stdin.read()
if isinstance(helptext, bytes):
helptext = helptext.decode()
def apply_patch(text, patch):
return text if patch[0] is DISABLE_PATCH else re.sub(*patch, text)
options = take_section(sys.stdin.read(), f'\n {OPTIONS_START}', f'\n{EPILOG_START}', shift=1)
max_width = max(map(len, options.split('\n')))
switch_col_width = len(re.search(r'(?m)^\s{5,}', options).group())
delim = f'\n{" " * switch_col_width}'
PATCHES = (
( # Headings
r'(?m)^ (\w.+\n)( (?=\w))?',
r'## \1'
),
( # Do not split URLs
rf'({delim[:-1]})? (?P<label>\[\S+\] )?(?P<url>https?({delim})?:({delim})?/({delim})?/(({delim})?\S+)+)\s',
lambda mobj: ''.join((delim, mobj.group('label') or '', re.sub(r'\s+', '', mobj.group('url')), '\n'))
),
( # Do not split "words"
rf'(?m)({delim}\S+)+$',
lambda mobj: ''.join((delim, mobj.group(0).replace(delim, '')))
),
( # Allow overshooting last line
rf'(?m)^(?P<prev>.+)${delim}(?P<current>.+)$(?!{delim})',
lambda mobj: (mobj.group().replace(delim, ' ')
if len(mobj.group()) - len(delim) + 1 <= max_width + ALLOWED_OVERSHOOT
else mobj.group())
),
( # Avoid newline when a space is available b/w switch and description
DISABLE_PATCH, # This creates issues with prepare_manpage
r'(?m)^(\s{4}-.{%d})(%s)' % (switch_col_width - 6, delim),
r'\1 '
),
)
start, end = helptext.index(f'\n {OPTIONS_START}'), helptext.index(f'\n{EPILOG_START}')
options = re.sub(r'(?m)^ (\w.+)$', r'## \1', helptext[start + 1: end + 1])
with open(README_FILE, encoding='utf-8') as f:
readme = f.read()
header = readme[:readme.index(f'## {OPTIONS_START}')]
footer = readme[readme.index(f'# {OPTIONS_END}'):]
with open(README_FILE, 'w', encoding='utf-8') as f:
f.write(''.join((
take_section(readme, end=f'## {OPTIONS_START}'),
functools.reduce(apply_patch, PATCHES, options),
take_section(readme, f'# {OPTIONS_END}'),
)))
for part in (header, options, footer):
f.write(part)

View File

@ -1,4 +1,4 @@
#!/usr/bin/env sh
#!/bin/sh
if [ -z $1 ]; then
test_set='test'

View File

@ -5,7 +5,24 @@ import sys
from PyInstaller.__main__ import run as run_pyinstaller
OS_NAME, ARCH = sys.platform, platform.architecture()[0][:2]
OS_NAME = platform.system()
if OS_NAME == 'Windows':
from PyInstaller.utils.win32.versioninfo import (
FixedFileInfo,
SetVersion,
StringFileInfo,
StringStruct,
StringTable,
VarFileInfo,
VarStruct,
VSVersionInfo,
)
elif OS_NAME == 'Darwin':
pass
else:
raise Exception(f'{OS_NAME} is not supported')
ARCH = platform.architecture()[0][:2]
def main():
@ -16,7 +33,10 @@ def main():
if not onedir and '-F' not in opts and '--onefile' not in opts:
opts.append('--onefile')
name, final_file = exe(onedir)
name = 'yt-dlp%s' % ('_macos' if OS_NAME == 'Darwin' else '_x86' if ARCH == '32' else '')
final_file = ''.join((
'dist/', f'{name}/' if onedir else '', name, '.exe' if OS_NAME == 'Windows' else ''))
print(f'Building yt-dlp v{version} {ARCH}bit for {OS_NAME} with options {opts}')
print('Remember to update the version using "devscripts/update-version.py"')
if not os.path.isfile('yt_dlp/extractor/lazy_extractors.py'):
@ -59,21 +79,6 @@ def read_version(fname):
return locals()['__version__']
def exe(onedir):
"""@returns (name, path)"""
name = '_'.join(filter(None, (
'yt-dlp',
{'win32': '', 'darwin': 'macos'}.get(OS_NAME, OS_NAME),
ARCH == '32' and 'x86'
)))
return name, ''.join(filter(None, (
'dist/',
onedir and f'{name}/',
name,
OS_NAME == 'win32' and '.exe'
)))
def version_to_list(version):
version_list = version.split('.')
return list(map(int, version_list)) + [0] * (4 - len(version_list))
@ -104,22 +109,11 @@ def pycryptodome_module():
def set_version_info(exe, version):
if OS_NAME == 'win32':
if OS_NAME == 'Windows':
windows_set_version(exe, version)
def windows_set_version(exe, version):
from PyInstaller.utils.win32.versioninfo import (
FixedFileInfo,
SetVersion,
StringFileInfo,
StringStruct,
StringTable,
VarFileInfo,
VarStruct,
VSVersionInfo,
)
version_list = version_to_list(version)
suffix = '_x86' if ARCH == '32' else ''
SetVersion(exe, VSVersionInfo(

4
pytest.ini Normal file
View File

@ -0,0 +1,4 @@
[pytest]
addopts = -ra -v --strict-markers
markers =
download

View File

@ -1,39 +1,6 @@
[wheel]
universal = true
universal = True
[flake8]
exclude = build,venv,.tox,.git,.pytest_cache
exclude = devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv
ignore = E402,E501,E731,E741,W503
max_line_length = 120
per_file_ignores =
devscripts/lazy_load_template.py: F401
[tool:pytest]
addopts = -ra -v --strict-markers
markers =
download
[tox:tox]
skipsdist = true
envlist = py{36,37,38,39,310},pypy{36,37,38,39}
skip_missing_interpreters = true
[testenv] # tox
deps =
pytest
commands = pytest {posargs:"-m not download"}
passenv = HOME # For test_compat_expanduser
setenv =
# PYTHONWARNINGS = error # Catches PIP's warnings too
[isort]
py_version = 36
multi_line_output = VERTICAL_HANGING_INDENT
line_length = 80
reverse_relative = true
ensure_newline_before_comments = true
include_trailing_comma = true

View File

@ -36,7 +36,7 @@ REQUIREMENTS = read('requirements.txt').splitlines()
if sys.argv[1:2] == ['py2exe']:
import py2exe # noqa: F401
import py2exe
warnings.warn(
'py2exe builds do not support pycryptodomex and needs VC++14 to run. '
'The recommended way is to use "pyinst.py" to build using pyinstaller')
@ -140,9 +140,6 @@ setup(
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.10',
'Programming Language :: Python :: 3.11',
'Programming Language :: Python :: Implementation',
'Programming Language :: Python :: Implementation :: CPython',
'Programming Language :: Python :: Implementation :: PyPy',

View File

@ -1,6 +1,4 @@
# Supported sites
- **0000studio:archive**
- **0000studio:clip**
- **17live**
- **17live:clip**
- **1tv**: Первый канал
@ -62,6 +60,8 @@
- **AmHistoryChannel**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimalPlanet**
- **AnimeLab**: [<abbr title="netrc machine"><em>animelab</em></abbr>]
- **AnimeLabShows**: [<abbr title="netrc machine"><em>animelab</em></abbr>]
- **AnimeOnDemand**: [<abbr title="netrc machine"><em>animeondemand</em></abbr>]
- **ant1newsgr:article**: ant1news.gr articles
- **ant1newsgr:embed**: ant1news.gr embedded videos
@ -89,7 +89,6 @@
- **AsianCrush**
- **AsianCrushPlaylist**
- **AtresPlayer**: [<abbr title="netrc machine"><em>atresplayer</em></abbr>]
- **AtScaleConfEvent**
- **ATTTechChannel**
- **ATVAt**
- **AudiMedia**
@ -277,8 +276,6 @@
- **dailymotion**: [<abbr title="netrc machine"><em>dailymotion</em></abbr>]
- **dailymotion:playlist**: [<abbr title="netrc machine"><em>dailymotion</em></abbr>]
- **dailymotion:user**: [<abbr title="netrc machine"><em>dailymotion</em></abbr>]
- **DailyWire**
- **DailyWirePodcast**
- **damtomo:record**
- **damtomo:video**
- **daum.net**
@ -325,8 +322,8 @@
- **drtv**
- **drtv:live**
- **DTube**
- **duboku**: www.duboku.io
- **duboku:list**: www.duboku.io entire series
- **duboku**: www.duboku.co
- **duboku:list**: www.duboku.co entire series
- **Dumpert**
- **dvtv**: http://video.aktualne.cz/
- **dw**
@ -406,8 +403,6 @@
- **FranceTVSite**
- **Freesound**
- **freespeech.org**
- **freetv:series**
- **FreeTvMovies**
- **FrontendMasters**: [<abbr title="netrc machine"><em>frontendmasters</em></abbr>]
- **FrontendMastersCourse**: [<abbr title="netrc machine"><em>frontendmasters</em></abbr>]
- **FrontendMastersLesson**: [<abbr title="netrc machine"><em>frontendmasters</em></abbr>]
@ -457,7 +452,6 @@
- **google:podcasts**
- **google:podcasts:feed**
- **GoogleDrive**
- **GoogleDrive:Folder**
- **GoPro**
- **Goshgay**
- **GoToStage**
@ -541,7 +535,6 @@
- **Iwara**
- **iwara:playlist**
- **iwara:user**
- **Ixigua**
- **Izlesene**
- **Jable**
- **JablePlaylist**
@ -561,14 +554,12 @@
- **Ketnet**
- **khanacademy**
- **khanacademy:unit**
- **Kicker**
- **KickStarter**
- **KinjaEmbed**
- **KinoPoisk**
- **KonserthusetPlay**
- **Koo**
- **KrasView**: Красвью
- **KTH**
- **Ku6**
- **KUSI**
- **kuwo:album**: 酷我音乐 - 专辑
@ -684,7 +675,6 @@
- **miomio.tv**
- **mirrativ**
- **mirrativ:user**
- **MirrorCoUK**
- **MiTele**: mitele.es
- **mixch**
- **mixch:archive**
@ -750,7 +740,6 @@
- **NationalGeographicTV**
- **Naver**
- **Naver:live**
- **navernow**
- **NBA**
- **nba:watch**
- **nba:watch:collection**
@ -780,8 +769,6 @@
- **netease:singer**: 网易云音乐 - 歌手
- **netease:song**: 网易云音乐
- **NetPlus**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse**
- **NetversePlaylist**
- **Netzkino**
- **Newgrounds**
- **Newgrounds:playlist**
@ -945,7 +932,6 @@
- **PlayPlusTV**: [<abbr title="netrc machine"><em>playplustv</em></abbr>]
- **PlayStuff**
- **PlaysTV**
- **PlaySuisse**
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
- **Playvid**
- **PlayVids**
@ -956,6 +942,7 @@
- **Podchaser**
- **podomatic**
- **Pokemon**
- **PokemonSoundLibrary**
- **PokemonWatch**
- **PokerGo**: [<abbr title="netrc machine"><em>pokergo</em></abbr>]
- **PokerGoCollection**: [<abbr title="netrc machine"><em>pokergo</em></abbr>]
@ -1163,7 +1150,6 @@
- **southpark.cc.com**
- **southpark.cc.com:español**
- **southpark.de**
- **southpark.lat**
- **southpark.nl**
- **southparkstudios.dk**
- **SovietsCloset**
@ -1203,7 +1189,6 @@
- **StretchInternet**
- **Stripchat**
- **stv:player**
- **Substack**
- **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
@ -1478,7 +1463,6 @@
- **washingtonpost:article**
- **wat.tv**
- **WatchBox**
- **WatchESPN**
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**: (**Currently broken**)
@ -1551,7 +1535,6 @@
- **YourPorn**
- **YourUpload**
- **youtube**: YouTube
- **youtube:clip**
- **youtube:favorites**: YouTube liked videos; ":ytfav" keyword (requires cookies)
- **youtube:history**: Youtube watch history; ":ythis" keyword (requires cookies)
- **youtube:music:search_url**: YouTube music search URLs with selectable sections (Eg: #songs)

View File

@ -44,7 +44,7 @@ def try_rm(filename):
raise
def report_warning(message, *args, **kwargs):
def report_warning(message):
'''
Print the message to stderr, it will be prefixed with 'WARNING:'
If stderr is a tty file the 'WARNING:' will be colored
@ -67,10 +67,10 @@ class FakeYDL(YoutubeDL):
super().__init__(params, auto_init=False)
self.result = []
def to_screen(self, s, *args, **kwargs):
def to_screen(self, s, skip_eol=None):
print(s)
def trouble(self, s, *args, **kwargs):
def trouble(self, s, tb=None):
raise Exception(s)
def download(self, x):
@ -80,10 +80,10 @@ class FakeYDL(YoutubeDL):
# Silence an expected warning matching a regex
old_report_warning = self.report_warning
def report_warning(self, message, *args, **kwargs):
def report_warning(self, message):
if re.match(regex, message):
return
old_report_warning(message, *args, **kwargs)
old_report_warning(message)
self.report_warning = types.MethodType(report_warning, self)
@ -301,9 +301,9 @@ def assertEqual(self, got, expected, msg=None):
def expect_warnings(ydl, warnings_re):
real_warning = ydl.report_warning
def _report_warning(w, *args, **kwargs):
def _report_warning(w):
if not any(re.search(w_re, w) for w_re in warnings_re):
real_warning(w, *args, **kwargs)
real_warning(w)
ydl.report_warning = _report_warning

View File

@ -502,24 +502,6 @@ class TestInfoExtractor(unittest.TestCase):
}],
})
# from https://0000.studio/
# with type attribute but without extension in URL
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://0000.studio',
r'''
<video src="https://d1ggyt9m8pwf3g.cloudfront.net/protected/ap-northeast-1:1864af40-28d5-492b-b739-b32314b1a527/archive/clip/838db6a7-8973-4cd6-840d-8517e4093c92"
controls="controls" type="video/mp4" preload="metadata" autoplay="autoplay" playsinline class="object-contain">
</video>
''', None)[0],
{
'formats': [{
'url': 'https://d1ggyt9m8pwf3g.cloudfront.net/protected/ap-northeast-1:1864af40-28d5-492b-b739-b32314b1a527/archive/clip/838db6a7-8973-4cd6-840d-8517e4093c92',
'ext': 'mp4',
}],
})
def test_extract_jwplayer_data_realworld(self):
# from http://www.suffolk.edu/sjc/
expect_dict(

View File

@ -23,7 +23,6 @@ from yt_dlp.postprocessor.common import PostProcessor
from yt_dlp.utils import (
ExtractorError,
LazyList,
OnDemandPagedList,
int_or_none,
match_filter_func,
)
@ -40,7 +39,7 @@ class YDL(FakeYDL):
def process_info(self, info_dict):
self.downloaded_info_dicts.append(info_dict.copy())
def to_screen(self, msg, *args, **kwargs):
def to_screen(self, msg):
self.msgs.append(msg)
def dl(self, *args, **kwargs):
@ -990,79 +989,41 @@ class TestYoutubeDL(unittest.TestCase):
self.assertEqual(res, [])
def test_playlist_items_selection(self):
INDICES, PAGE_SIZE = list(range(1, 11)), 3
entries = [{
'id': compat_str(i),
'title': compat_str(i),
'url': TEST_URL,
} for i in range(1, 5)]
playlist = {
'_type': 'playlist',
'id': 'test',
'entries': entries,
'extractor': 'test:playlist',
'extractor_key': 'test:playlist',
'webpage_url': 'http://example.com',
}
def entry(i, evaluated):
evaluated.append(i)
return {
'id': str(i),
'title': str(i),
'url': TEST_URL,
}
def pagedlist_entries(evaluated):
def page_func(n):
start = PAGE_SIZE * n
for i in INDICES[start: start + PAGE_SIZE]:
yield entry(i, evaluated)
return OnDemandPagedList(page_func, PAGE_SIZE)
def page_num(i):
return (i + PAGE_SIZE - 1) // PAGE_SIZE
def generator_entries(evaluated):
for i in INDICES:
yield entry(i, evaluated)
def list_entries(evaluated):
return list(generator_entries(evaluated))
def lazylist_entries(evaluated):
return LazyList(generator_entries(evaluated))
def get_downloaded_info_dicts(params, entries):
def get_downloaded_info_dicts(params):
ydl = YDL(params)
ydl.process_ie_result({
'_type': 'playlist',
'id': 'test',
'extractor': 'test:playlist',
'extractor_key': 'test:playlist',
'webpage_url': 'http://example.com',
'entries': entries,
})
# make a deep copy because the dictionary and nested entries
# can be modified
ydl.process_ie_result(copy.deepcopy(playlist))
return ydl.downloaded_info_dicts
def test_selection(params, expected_ids, evaluate_all=False):
expected_ids = list(expected_ids)
if evaluate_all:
generator_eval = pagedlist_eval = INDICES
elif not expected_ids:
generator_eval = pagedlist_eval = []
else:
generator_eval = INDICES[0: max(expected_ids)]
pagedlist_eval = INDICES[PAGE_SIZE * page_num(min(expected_ids)) - PAGE_SIZE:
PAGE_SIZE * page_num(max(expected_ids))]
def test_selection(params, expected_ids):
results = [
(v['playlist_autonumber'] - 1, (int(v['id']), v['playlist_index']))
for v in get_downloaded_info_dicts(params)]
self.assertEqual(results, list(enumerate(zip(expected_ids, expected_ids))))
for name, func, expected_eval in (
('list', list_entries, INDICES),
('Generator', generator_entries, generator_eval),
# ('LazyList', lazylist_entries, generator_eval), # Generator and LazyList follow the exact same code path
('PagedList', pagedlist_entries, pagedlist_eval),
):
evaluated = []
entries = func(evaluated)
results = [(v['playlist_autonumber'] - 1, (int(v['id']), v['playlist_index']))
for v in get_downloaded_info_dicts(params, entries)]
self.assertEqual(results, list(enumerate(zip(expected_ids, expected_ids))), f'Entries of {name} for {params}')
self.assertEqual(sorted(evaluated), expected_eval, f'Evaluation of {name} for {params}')
test_selection({}, INDICES)
test_selection({'playlistend': 20}, INDICES, True)
test_selection({'playlistend': 2}, INDICES[:2])
test_selection({'playliststart': 11}, [], True)
test_selection({'playliststart': 2}, INDICES[1:])
test_selection({'playlist_items': '2-4'}, INDICES[1:4])
test_selection({}, [1, 2, 3, 4])
test_selection({'playlistend': 10}, [1, 2, 3, 4])
test_selection({'playlistend': 2}, [1, 2])
test_selection({'playliststart': 10}, [])
test_selection({'playliststart': 2}, [2, 3, 4])
test_selection({'playlist_items': '2-4'}, [2, 3, 4])
test_selection({'playlist_items': '2,4'}, [2, 4])
test_selection({'playlist_items': '20'}, [], True)
test_selection({'playlist_items': '10'}, [])
test_selection({'playlist_items': '0'}, [])
# Tests for https://github.com/ytdl-org/youtube-dl/issues/10591
@ -1071,33 +1032,11 @@ class TestYoutubeDL(unittest.TestCase):
# Tests for https://github.com/yt-dlp/yt-dlp/issues/720
# https://github.com/yt-dlp/yt-dlp/issues/302
test_selection({'playlistreverse': True}, INDICES[::-1])
test_selection({'playliststart': 2, 'playlistreverse': True}, INDICES[:0:-1])
test_selection({'playlistreverse': True}, [4, 3, 2, 1])
test_selection({'playliststart': 2, 'playlistreverse': True}, [4, 3, 2])
test_selection({'playlist_items': '2,4', 'playlistreverse': True}, [4, 2])
test_selection({'playlist_items': '4,2'}, [4, 2])
# Tests for --playlist-items start:end:step
test_selection({'playlist_items': ':'}, INDICES, True)
test_selection({'playlist_items': '::1'}, INDICES, True)
test_selection({'playlist_items': '::-1'}, INDICES[::-1], True)
test_selection({'playlist_items': ':6'}, INDICES[:6])
test_selection({'playlist_items': ':-6'}, INDICES[:-5], True)
test_selection({'playlist_items': '-1:6:-2'}, INDICES[:4:-2], True)
test_selection({'playlist_items': '9:-6:-2'}, INDICES[8:3:-2], True)
test_selection({'playlist_items': '1:inf:2'}, INDICES[::2], True)
test_selection({'playlist_items': '-2:inf'}, INDICES[-2:], True)
test_selection({'playlist_items': ':inf:-1'}, [], True)
test_selection({'playlist_items': '0-2:2'}, [2])
test_selection({'playlist_items': '1-:2'}, INDICES[::2], True)
test_selection({'playlist_items': '0--2:2'}, INDICES[1:-1:2], True)
test_selection({'playlist_items': '10::3'}, [10], True)
test_selection({'playlist_items': '-1::3'}, [10], True)
test_selection({'playlist_items': '11::3'}, [], True)
test_selection({'playlist_items': '-15::2'}, INDICES[1::2], True)
test_selection({'playlist_items': '-15::15'}, [], True)
def test_urlopen_no_file_protocol(self):
# see https://github.com/ytdl-org/youtube-dl/issues/8227
ydl = YDL()

View File

@ -14,16 +14,16 @@ from yt_dlp.cookies import (
class Logger:
def debug(self, message, *args, **kwargs):
def debug(self, message):
print(f'[verbose] {message}')
def info(self, message, *args, **kwargs):
def info(self, message):
print(message)
def warning(self, message, *args, **kwargs):
def warning(self, message, only_once=False):
self.error(message)
def error(self, message, *args, **kwargs):
def error(self, message):
raise Exception(message)

View File

@ -43,7 +43,7 @@ class YoutubeDL(yt_dlp.YoutubeDL):
self.processed_info_dicts = []
super().__init__(*args, **kwargs)
def report_warning(self, message, *args, **kwargs):
def report_warning(self, message):
# Don't accept warnings during tests
raise ExtractorError(message)
@ -102,10 +102,9 @@ def generator(test_case, tname):
def print_skipping(reason):
print('Skipping %s: %s' % (test_case['name'], reason))
self.skipTest(reason)
if not ie.working():
print_skipping('IE marked as not _WORKING')
return
for tc in test_cases:
info_dict = tc.get('info_dict', {})
@ -119,10 +118,11 @@ def generator(test_case, tname):
if 'skip' in test_case:
print_skipping(test_case['skip'])
return
for other_ie in other_ies:
if not other_ie.working():
print_skipping('test depends on %sIE, marked as not WORKING' % other_ie.ie_key())
return
params = get_params(test_case.get('params', {}))
params['outtmpl'] = tname + '_' + params['outtmpl']

View File

@ -38,9 +38,6 @@ class BaseTestSubtitles(unittest.TestCase):
self.DL = FakeYDL()
self.ie = self.IE()
self.DL.add_info_extractor(self.ie)
if not self.IE.working():
print('Skipping: %s marked as not _WORKING' % self.IE.ie_key())
self.skipTest('IE marked as not _WORKING')
def getInfoDict(self):
info_dict = self.DL.extract_info(self.url, download=False)
@ -60,21 +57,6 @@ class BaseTestSubtitles(unittest.TestCase):
@is_download_test
class TestYoutubeSubtitles(BaseTestSubtitles):
# Available subtitles for QRS8MkLhQmM:
# Language formats
# ru vtt, ttml, srv3, srv2, srv1, json3
# fr vtt, ttml, srv3, srv2, srv1, json3
# en vtt, ttml, srv3, srv2, srv1, json3
# nl vtt, ttml, srv3, srv2, srv1, json3
# de vtt, ttml, srv3, srv2, srv1, json3
# ko vtt, ttml, srv3, srv2, srv1, json3
# it vtt, ttml, srv3, srv2, srv1, json3
# zh-Hant vtt, ttml, srv3, srv2, srv1, json3
# hi vtt, ttml, srv3, srv2, srv1, json3
# pt-BR vtt, ttml, srv3, srv2, srv1, json3
# es-MX vtt, ttml, srv3, srv2, srv1, json3
# ja vtt, ttml, srv3, srv2, srv1, json3
# pl vtt, ttml, srv3, srv2, srv1, json3
url = 'QRS8MkLhQmM'
IE = YoutubeIE
@ -83,60 +65,47 @@ class TestYoutubeSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(len(subtitles.keys()), 13)
self.assertEqual(md5(subtitles['en']), 'ae1bd34126571a77aabd4d276b28044d')
self.assertEqual(md5(subtitles['it']), '0e0b667ba68411d88fd1c5f4f4eab2f9')
self.assertEqual(md5(subtitles['en']), '688dd1ce0981683867e7fe6fde2a224b')
self.assertEqual(md5(subtitles['it']), '31324d30b8430b309f7f5979a504a769')
for lang in ['fr', 'de']:
self.assertTrue(subtitles.get(lang) is not None, 'Subtitles for \'%s\' not extracted' % lang)
def _test_subtitles_format(self, fmt, md5_hash, lang='en'):
self.DL.params['writesubtitles'] = True
self.DL.params['subtitlesformat'] = fmt
subtitles = self.getSubtitles()
self.assertEqual(md5(subtitles[lang]), md5_hash)
def test_youtube_subtitles_ttml_format(self):
self._test_subtitles_format('ttml', 'c97ddf1217390906fa9fbd34901f3da2')
self.DL.params['writesubtitles'] = True
self.DL.params['subtitlesformat'] = 'ttml'
subtitles = self.getSubtitles()
self.assertEqual(md5(subtitles['en']), 'c97ddf1217390906fa9fbd34901f3da2')
def test_youtube_subtitles_vtt_format(self):
self._test_subtitles_format('vtt', 'ae1bd34126571a77aabd4d276b28044d')
def test_youtube_subtitles_json3_format(self):
self._test_subtitles_format('json3', '688dd1ce0981683867e7fe6fde2a224b')
def _test_automatic_captions(self, url, lang):
self.url = url
self.DL.params['writeautomaticsub'] = True
self.DL.params['subtitleslangs'] = [lang]
self.DL.params['writesubtitles'] = True
self.DL.params['subtitlesformat'] = 'vtt'
subtitles = self.getSubtitles()
self.assertTrue(subtitles[lang] is not None)
self.assertEqual(md5(subtitles['en']), 'ae1bd34126571a77aabd4d276b28044d')
def test_youtube_automatic_captions(self):
# Available automatic captions for 8YoUxe5ncPo:
# Language formats (all in vtt, ttml, srv3, srv2, srv1, json3)
# gu, zh-Hans, zh-Hant, gd, ga, gl, lb, la, lo, tt, tr,
# lv, lt, tk, th, tg, te, fil, haw, yi, ceb, yo, de, da,
# el, eo, en, eu, et, es, ru, rw, ro, bn, be, bg, uk, jv,
# bs, ja, or, xh, co, ca, cy, cs, ps, pt, pa, vi, pl, hy,
# hr, ht, hu, hmn, hi, ha, mg, uz, ml, mn, mi, mk, ur,
# mt, ms, mr, ug, ta, my, af, sw, is, am,
# *it*, iw, sv, ar,
# su, zu, az, id, ig, nl, no, ne, ny, fr, ku, fy, fa, fi,
# ka, kk, sr, sq, ko, kn, km, st, sk, si, so, sn, sm, sl,
# ky, sd
# ...
self._test_automatic_captions('8YoUxe5ncPo', 'it')
self.url = '8YoUxe5ncPo'
self.DL.params['writeautomaticsub'] = True
self.DL.params['subtitleslangs'] = ['it']
subtitles = self.getSubtitles()
self.assertTrue(subtitles['it'] is not None)
def test_youtube_no_automatic_captions(self):
self.url = 'QRS8MkLhQmM'
self.DL.params['writeautomaticsub'] = True
subtitles = self.getSubtitles()
self.assertTrue(not subtitles)
@unittest.skip('Video unavailable')
def test_youtube_translated_subtitles(self):
# This video has a subtitles track, which can be translated (#4555)
self._test_automatic_captions('Ky9eprVWzlI', 'it')
# This video has a subtitles track, which can be translated
self.url = 'i0ZabxXmH4Y'
self.DL.params['writeautomaticsub'] = True
self.DL.params['subtitleslangs'] = ['it']
subtitles = self.getSubtitles()
self.assertTrue(subtitles['it'] is not None)
def test_youtube_nosubtitles(self):
self.DL.expect_warning('video doesn\'t have subtitles')
# Available automatic captions for 8YoUxe5ncPo:
# ...
# 8YoUxe5ncPo has no subtitles
self.url = '8YoUxe5ncPo'
self.url = 'n5BB19UTcdA'
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
@ -168,7 +137,6 @@ class TestDailymotionSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestTedSubtitles(BaseTestSubtitles):
url = 'http://www.ted.com/talks/dan_dennett_on_our_consciousness.html'
IE = TedTalkIE
@ -194,12 +162,12 @@ class TestVimeoSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'de', 'en', 'es', 'fr'})
self.assertEqual(md5(subtitles['en']), '386cbc9320b94e25cb364b97935e5dd1')
self.assertEqual(md5(subtitles['fr']), 'c9b69eef35bc6641c0d4da8a04f9dfac')
self.assertEqual(md5(subtitles['en']), '8062383cf4dec168fc40a088aa6d5888')
self.assertEqual(md5(subtitles['fr']), 'b6191146a6c5d3a452244d853fde6dc8')
def test_nosubtitles(self):
self.DL.expect_warning('video doesn\'t have subtitles')
self.url = 'http://vimeo.com/68093876'
self.url = 'http://vimeo.com/56015672'
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
@ -207,7 +175,6 @@ class TestVimeoSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestWallaSubtitles(BaseTestSubtitles):
url = 'http://vod.walla.co.il/movie/2705958/the-yes-men'
IE = WallaIE
@ -230,7 +197,6 @@ class TestWallaSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestCeskaTelevizeSubtitles(BaseTestSubtitles):
url = 'http://www.ceskatelevize.cz/ivysilani/10600540290-u6-uzasny-svet-techniky'
IE = CeskaTelevizeIE
@ -253,7 +219,6 @@ class TestCeskaTelevizeSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestLyndaSubtitles(BaseTestSubtitles):
url = 'http://www.lynda.com/Bootstrap-tutorials/Using-exercise-files/110885/114408-4.html'
IE = LyndaIE
@ -267,7 +232,6 @@ class TestLyndaSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestNPOSubtitles(BaseTestSubtitles):
url = 'http://www.npo.nl/nos-journaal/28-08-2014/POW_00722860'
IE = NPOIE
@ -281,7 +245,6 @@ class TestNPOSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestMTVSubtitles(BaseTestSubtitles):
url = 'http://www.cc.com/video-clips/p63lk0/adam-devine-s-house-party-chasing-white-swans'
IE = ComedyCentralIE
@ -306,8 +269,8 @@ class TestNRKSubtitles(BaseTestSubtitles):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'nb-ttv'})
self.assertEqual(md5(subtitles['nb-ttv']), '67e06ff02d0deaf975e68f6cb8f6a149')
self.assertEqual(set(subtitles.keys()), {'no'})
self.assertEqual(md5(subtitles['no']), '544fa917d3197fcbee64634559221cc2')
@is_download_test
@ -332,7 +295,6 @@ class TestRaiPlaySubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken - DRM only')
class TestVikiSubtitles(BaseTestSubtitles):
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
IE = VikiIE
@ -361,7 +323,6 @@ class TestThePlatformSubtitles(BaseTestSubtitles):
@is_download_test
@unittest.skip('IE broken')
class TestThePlatformFeedSubtitles(BaseTestSubtitles):
url = 'http://feed.theplatform.com/f/7wvmTC/msnbc_video-p-test?form=json&pretty=true&range=-40&byGuid=n_hardball_5biden_140207'
IE = ThePlatformFeedIE
@ -399,7 +360,7 @@ class TestDemocracynowSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'en'})
self.assertEqual(md5(subtitles['en']), 'a3cc4c0b5eadd74d9974f1c1f5101045')
self.assertEqual(md5(subtitles['en']), 'acaca989e24a9e45a6719c9b3d60815c')
def test_subtitles_in_page(self):
self.url = 'http://www.democracynow.org/2015/7/3/this_flag_comes_down_today_bree'
@ -407,7 +368,7 @@ class TestDemocracynowSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'en'})
self.assertEqual(md5(subtitles['en']), 'a3cc4c0b5eadd74d9974f1c1f5101045')
self.assertEqual(md5(subtitles['en']), 'acaca989e24a9e45a6719c9b3d60815c')
@is_download_test

16
tox.ini Normal file
View File

@ -0,0 +1,16 @@
[tox]
envlist = py26,py27,py33,py34,py35
# Needed?
[testenv]
deps =
nose
coverage
# We need a valid $HOME for test_compat_expanduser
passenv = HOME
defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
--exclude test_subtitles.py --exclude test_write_annotations.py
--exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
--exclude test_socks.py
commands = nosetests --verbose {posargs:{[testenv]defaultargs}} # --with-coverage --cover-package=yt_dlp --cover-html
# test.test_download:TestDownload.test_NowVideo

View File

@ -1,2 +1,2 @@
#!/usr/bin/env sh
#!/bin/sh
exec "${PYTHON:-python3}" -bb -Werror -Xdev "$(dirname "$(realpath "$0")")/yt_dlp/__main__.py" "$@"

View File

@ -27,17 +27,19 @@ from string import ascii_letters
from .cache import Cache
from .compat import (
HAS_LEGACY as compat_has_legacy,
compat_get_terminal_size,
compat_os_name,
compat_shlex_quote,
compat_str,
compat_urllib_error,
compat_urllib_request,
windows_enable_vt_mode,
)
from .cookies import load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version
from .extractor import _LAZY_LOADER
from .extractor import _PLUGIN_CLASSES as plugin_extractors
from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text
@ -58,7 +60,6 @@ from .postprocessor import (
from .update import detect_variant
from .utils import (
DEFAULT_OUTTMPL,
IDENTITY,
LINK_TEMPLATES,
NO_DEFAULT,
NUMBER_RE,
@ -75,13 +76,13 @@ from .utils import (
ExtractorError,
GeoRestrictedError,
HEADRequest,
InAdvancePagedList,
ISO3166Utils,
LazyList,
MaxDownloadsReached,
Namespace,
PagedList,
PerRequestProxyHandler,
PlaylistEntries,
Popen,
PostProcessingError,
ReExtractInfo,
@ -141,7 +142,6 @@ from .utils import (
url_basename,
variadic,
version_tuple,
windows_enable_vt_mode,
write_json_file,
write_string,
)
@ -194,6 +194,13 @@ class YoutubeDL:
For compatibility, a single list is also accepted
print_to_file: A dict with keys WHEN (same as forceprint) mapped to
a list of tuples with (template, filename)
forceurl: Force printing final URL. (Deprecated)
forcetitle: Force printing title. (Deprecated)
forceid: Force printing ID. (Deprecated)
forcethumbnail: Force printing thumbnail URL. (Deprecated)
forcedescription: Force printing description. (Deprecated)
forcefilename: Force printing final filename. (Deprecated)
forceduration: Force printing duration. (Deprecated)
forcejson: Force printing info_dict as JSON.
dump_single_json: Force printing the info_dict of the whole playlist
(or video) as a single JSON line.
@ -243,9 +250,11 @@ class YoutubeDL:
and don't overwrite any file if False
For compatibility with youtube-dl,
"nooverwrites" may also be used instead
playliststart: Playlist item to start at.
playlistend: Playlist item to end at.
playlist_items: Specific indices of playlist to download.
playlistreverse: Download playlist items in reverse order.
playlistrandom: Download playlist items in random order.
lazy_playlist: Process playlist entries as they are received.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logger: Log messages to a logging.Logger instance.
@ -268,6 +277,9 @@ class YoutubeDL:
writedesktoplink: Write a Linux internet shortcut file (.desktop)
writesubtitles: Write the video subtitles to a file
writeautomaticsub: Write the automatically generated subtitles to a file
allsubtitles: Deprecated - Use subtitleslangs = ['all']
Downloads all the subtitles of the video
(requires writesubtitles or writeautomaticsub)
listsubtitles: Lists all available subtitles for the video
subtitlesformat: The format code for subtitles
subtitleslangs: List of languages of the subtitles to download (can be regex).
@ -321,6 +333,7 @@ class YoutubeDL:
bidi_workaround: Work around buggy terminals without bidirectional text
support, using fridibi
debug_printtraffic:Print out sent and received HTTP traffic
include_ads: Download ads as well (deprecated)
default_search: Prepend this string if an input url is not valid.
'auto' for elaborate guessing
encoding: Use this encoding instead of the system-specified.
@ -336,6 +349,10 @@ class YoutubeDL:
* when: When to run the postprocessor. Allowed values are
the entries of utils.POSTPROCESS_WHEN
Assumed to be 'post_process' if not given
post_hooks: Deprecated - Register a custom postprocessor instead
A list of functions that get called as the final step
for each video file, after all postprocessors have been
called. The filename will be passed as the only argument.
progress_hooks: A list of functions that get called on download
progress, with a dictionary with the entries
* status: One of "downloading", "error", or "finished".
@ -380,6 +397,8 @@ class YoutubeDL:
- "detect_or_warn": check whether we can do anything
about it, warn otherwise (default)
source_address: Client-side IP address to bind to.
call_home: Boolean, true iff we are allowed to contact the
yt-dlp servers for debugging. (BROKEN)
sleep_interval_requests: Number of seconds to sleep between requests
during extraction
sleep_interval: Number of seconds to sleep before each download when
@ -414,10 +433,17 @@ class YoutubeDL:
geo_bypass_ip_block:
IP range in CIDR notation that will be used similarly to
geo_bypass_country
The following options determine which downloader is picked:
external_downloader: A dictionary of protocol keys and the executable of the
external downloader to use for it. The allowed protocols
are default|http|ftp|m3u8|dash|rtsp|rtmp|mms.
Set the value to 'native' to use the native downloader
hls_prefer_native: Deprecated - Use external_downloader = {'m3u8': 'native'}
or {'m3u8': 'ffmpeg'} instead.
Use the native HLS downloader instead of ffmpeg/avconv
if True, otherwise use ffmpeg/avconv if False, otherwise
use downloader suggested by extractor if None.
compat_opts: Compatibility options. See "Differences in default behavior".
The following options do not work when used through the API:
filename, abort-on-error, multistreams, no-live-chat, format-sort
@ -427,16 +453,6 @@ class YoutubeDL:
Allowed keys are 'download', 'postprocess',
'download-title' (console title) and 'postprocess-title'.
The template is mapped on a dictionary with keys 'progress' and 'info'
retry_sleep_functions: Dictionary of functions that takes the number of attempts
as argument and returns the time to sleep in seconds.
Allowed keys are 'http', 'fragment', 'file_access'
download_ranges: A function that gets called for every video with the signature
(info_dict, *, ydl) -> Iterable[Section].
Only the returned sections will be downloaded. Each Section contains:
* start_time: Start time of the section in seconds
* end_time: End time of the section in seconds
* title: Section title (Optional)
* index: Section number (Optional)
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py):
@ -446,6 +462,8 @@ class YoutubeDL:
external_downloader_args, concurrent_fragment_downloads.
The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg. (avconv support is deprecated)
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A dictionary of postprocessor/executable keys (in lower case)
@ -465,54 +483,12 @@ class YoutubeDL:
See "EXTRACTOR ARGUMENTS" for details.
Eg: {'youtube': {'skip': ['dash', 'hls']}}
mark_watched: Mark videos watched (even with --simulate). Only for YouTube
The following options are deprecated and may be removed in the future:
playliststart: - Use playlist_items
Playlist item to start at.
playlistend: - Use playlist_items
Playlist item to end at.
playlistreverse: - Use playlist_items
Download playlist items in reverse order.
forceurl: - Use forceprint
Force printing final URL.
forcetitle: - Use forceprint
Force printing title.
forceid: - Use forceprint
Force printing ID.
forcethumbnail: - Use forceprint
Force printing thumbnail URL.
forcedescription: - Use forceprint
Force printing description.
forcefilename: - Use forceprint
Force printing final filename.
forceduration: - Use forceprint
Force printing duration.
allsubtitles: - Use subtitleslangs = ['all']
Downloads all the subtitles of the video
(requires writesubtitles or writeautomaticsub)
include_ads: - Doesn't work
Download ads as well
call_home: - Not implemented
Boolean, true iff we are allowed to contact the
yt-dlp servers for debugging.
post_hooks: - Register a custom postprocessor
A list of functions that get called as the final step
for each video file, after all postprocessors have been
called. The filename will be passed as the only argument.
hls_prefer_native: - Use external_downloader = {'m3u8': 'native'} or {'m3u8': 'ffmpeg'}.
Use the native HLS downloader instead of ffmpeg/avconv
if True, otherwise use ffmpeg/avconv if False, otherwise
use downloader suggested by extractor if None.
prefer_ffmpeg: - avconv support is deprecated
If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg.
youtube_include_dash_manifest: - Use extractor_args
youtube_include_dash_manifest: Deprecated - Use extractor_args instead.
If True (default), DASH manifests and related
data will be downloaded and processed by extractor.
You can reduce network I/O by disabling it if you don't
care about DASH. (only for youtube)
youtube_include_hls_manifest: - Use extractor_args
youtube_include_hls_manifest: Deprecated - Use extractor_args instead.
If True (default), HLS manifests and related
data will be downloaded and processed by extractor.
You can reduce network I/O by disabling it if you don't
@ -579,17 +555,12 @@ class YoutubeDL:
)
self._allow_colors = Namespace(**{
type_: not self.params.get('no_color') and supports_terminal_sequences(stream)
for type_, stream in self._out_files.items_ if type_ != 'console'
for type_, stream in self._out_files if type_ != 'console'
})
MIN_SUPPORTED, MIN_RECOMMENDED = (3, 6), (3, 7)
current_version = sys.version_info[:2]
if current_version < MIN_RECOMMENDED:
msg = 'Support for Python version %d.%d has been deprecated and will break in future versions of yt-dlp'
if current_version < MIN_SUPPORTED:
msg = 'Python version %d.%d is no longer supported'
self.deprecation_warning(
f'{msg}! Please update to Python %d.%d or above' % (*current_version, *MIN_RECOMMENDED))
if sys.version_info < (3, 6):
self.report_warning(
'Python version %d.%d is not supported! Please update to Python 3.6 or above' % sys.version_info[:2])
if self.params.get('allow_unplayable_formats'):
self.report_warning(
@ -617,10 +588,7 @@ class YoutubeDL:
for msg in self.params.get('_deprecation_warnings', []):
self.deprecation_warning(msg)
self.params['compat_opts'] = set(self.params.get('compat_opts', ()))
if not compat_has_legacy:
self.params['compat_opts'].add('no-compat-legacy')
if 'list-formats' in self.params['compat_opts']:
if 'list-formats' in self.params.get('compat_opts', []):
self.params['listformats_table'] = False
if 'overwrites' not in self.params and self.params.get('nooverwrites') is not None:
@ -675,7 +643,7 @@ class YoutubeDL:
'Set the LC_ALL environment variable to fix this.')
self.params['restrictfilenames'] = True
self._parse_outtmpl()
self.outtmpl_dict = self.parse_outtmpl()
# Creating format selector here allows us to catch syntax errors before the extraction
self.format_selector = (
@ -775,7 +743,6 @@ class YoutubeDL:
def add_post_processor(self, pp, when='post_process'):
"""Add a PostProcessor object to the end of the chain."""
assert when in POSTPROCESS_WHEN, f'Invalid when={when}'
self._pps[when].append(pp)
pp.set_downloader(self)
@ -818,9 +785,9 @@ class YoutubeDL:
"""Print message to stdout"""
if quiet is not None:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead')
if skip_eol is not False:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. Use "YoutubeDL.to_screen" instead')
self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.out)
self._write_string(
'%s%s' % (self._bidi_workaround(message), ('' if skip_eol else '\n')),
self._out_files.out)
def to_screen(self, message, skip_eol=False, quiet=None):
"""Print message to screen if not in quiet mode"""
@ -972,7 +939,7 @@ class YoutubeDL:
'''Log debug message or Print message to stderr'''
if not self.params.get('verbose', False):
return
message = f'[debug] {message}'
message = '[debug] %s' % message
if self.params.get('logger'):
self.params['logger'].debug(message)
else:
@ -1003,19 +970,21 @@ class YoutubeDL:
self.report_warning(msg)
def parse_outtmpl(self):
self.deprecation_warning('"YoutubeDL.parse_outtmpl" is deprecated and may be removed in a future version')
self._parse_outtmpl()
return self.params['outtmpl']
def _parse_outtmpl(self):
sanitize = IDENTITY
if self.params.get('restrictfilenames'): # Remove spaces in the default template
outtmpl_dict = self.params.get('outtmpl', {})
if not isinstance(outtmpl_dict, dict):
outtmpl_dict = {'default': outtmpl_dict}
# Remove spaces in the default template
if self.params.get('restrictfilenames'):
sanitize = lambda x: x.replace(' - ', ' ').replace(' ', '-')
outtmpl = self.params.setdefault('outtmpl', {})
if not isinstance(outtmpl, dict):
self.params['outtmpl'] = outtmpl = {'default': outtmpl}
outtmpl.update({k: sanitize(v) for k, v in DEFAULT_OUTTMPL.items() if outtmpl.get(k) is None})
else:
sanitize = lambda x: x
outtmpl_dict.update({
k: sanitize(v) for k, v in DEFAULT_OUTTMPL.items()
if outtmpl_dict.get(k) is None})
for _, val in outtmpl_dict.items():
if isinstance(val, bytes):
self.report_warning('Parameter outtmpl is bytes, but should be a unicode string')
return outtmpl_dict
def get_output_path(self, dir_type='', filename=None):
paths = self.params.get('paths', {})
@ -1066,7 +1035,6 @@ class YoutubeDL:
def _copy_infodict(info_dict):
info_dict = dict(info_dict)
info_dict.pop('__postprocessors', None)
info_dict.pop('__pending_error', None)
return info_dict
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
@ -1164,7 +1132,7 @@ class YoutubeDL:
def filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames')):
return sanitize_filename(str(value), restricted=restricted, is_id=(
bool(re.search(r'(^|[_.])id(\.|$)', key))
if 'filename-sanitization' in self.params['compat_opts']
if 'filename-sanitization' in self.params.get('compat_opts', [])
else NO_DEFAULT))
sanitizer = sanitize if callable(sanitize) else filename_sanitizer
@ -1253,7 +1221,7 @@ class YoutubeDL:
def _prepare_filename(self, info_dict, *, outtmpl=None, tmpl_type=None):
assert None in (outtmpl, tmpl_type), 'outtmpl and tmpl_type are mutually exclusive'
if outtmpl is None:
outtmpl = self.params['outtmpl'].get(tmpl_type or 'default', self.params['outtmpl']['default'])
outtmpl = self.outtmpl_dict.get(tmpl_type or 'default', self.outtmpl_dict['default'])
try:
outtmpl = self._outtmpl_expandpath(outtmpl)
filename = self.evaluate_outtmpl(outtmpl, info_dict, True)
@ -1419,7 +1387,7 @@ class YoutubeDL:
else:
self.report_error('no suitable InfoExtractor for URL %s' % url)
def _handle_extraction_exceptions(func):
def __handle_extraction_exceptions(func):
@functools.wraps(func)
def wrapper(self, *args, **kwargs):
while True:
@ -1492,7 +1460,7 @@ class YoutubeDL:
self.to_screen('')
raise
@_handle_extraction_exceptions
@__handle_extraction_exceptions
def __extract_info(self, url, ie, download, extra_info, process):
ie_result = ie.extract(url)
if ie_result is None: # Finished already (backwards compatibility; listformats and friends should be moved here)
@ -1558,7 +1526,6 @@ class YoutubeDL:
self.add_extra_info(info_copy, extra_info)
info_copy, _ = self.pre_process(info_copy)
self.__forced_printings(info_copy, self.prepare_filename(info_copy), incomplete=True)
self._raise_pending_errors(info_copy)
if self.params.get('force_write_download_archive', False):
self.record_download_archive(info_copy)
return ie_result
@ -1566,7 +1533,6 @@ class YoutubeDL:
if result_type == 'video':
self.add_extra_info(ie_result, extra_info)
ie_result = self.process_video_result(ie_result, download=download)
self._raise_pending_errors(ie_result)
additional_urls = (ie_result or {}).get('additional_urls')
if additional_urls:
# TODO: Improve MetadataParserPP to allow setting a list
@ -1601,13 +1567,9 @@ class YoutubeDL:
if not info:
return info
exempted_fields = {'_type', 'url', 'ie_key'}
if not ie_result.get('section_end') and ie_result.get('section_start') is None:
# For video clips, the id etc of the clip extractor should be used
exempted_fields |= {'id', 'extractor', 'extractor_key'}
new_result = info.copy()
new_result.update(filter_dict(ie_result, lambda k, v: v is not None and k not in exempted_fields))
new_result.update(filter_dict(ie_result, lambda k, v: (
v is not None and k not in {'_type', 'url', 'id', 'extractor', 'extractor_key', 'ie_key'})))
# Extracted info may not be a video result (i.e.
# info.get('_type', 'video') != video) but rather an url or
@ -1679,31 +1641,112 @@ class YoutubeDL:
}
def __process_playlist(self, ie_result, download):
"""Process each entry in the playlist"""
title = ie_result.get('title') or ie_result.get('id') or '<Untitled>'
self.to_screen(f'[download] Downloading playlist: {title}')
# We process each entry in the playlist
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)
all_entries = PlaylistEntries(self, ie_result)
entries = orderedSet(all_entries.get_requested_items(), lazy=True)
if 'entries' not in ie_result:
raise EntryNotInPlaylist('There are no entries')
lazy = self.params.get('lazy_playlist')
if lazy:
resolved_entries, n_entries = [], 'N/A'
ie_result['requested_entries'], ie_result['entries'] = None, None
MissingEntry = object()
incomplete_entries = bool(ie_result.get('requested_entries'))
if incomplete_entries:
def fill_missing_entries(entries, indices):
ret = [MissingEntry] * max(indices)
for i, entry in zip(indices, entries):
ret[i - 1] = entry
return ret
ie_result['entries'] = fill_missing_entries(ie_result['entries'], ie_result['requested_entries'])
playlist_results = []
playliststart = self.params.get('playliststart', 1)
playlistend = self.params.get('playlistend')
# For backwards compatibility, interpret -1 as whole list
if playlistend == -1:
playlistend = None
playlistitems_str = self.params.get('playlist_items')
playlistitems = None
if playlistitems_str is not None:
def iter_playlistitems(format):
for string_segment in format.split(','):
if '-' in string_segment:
start, end = string_segment.split('-')
for item in range(int(start), int(end) + 1):
yield int(item)
else:
yield int(string_segment)
playlistitems = orderedSet(iter_playlistitems(playlistitems_str))
ie_entries = ie_result['entries']
if isinstance(ie_entries, list):
playlist_count = len(ie_entries)
msg = f'Collected {playlist_count} videos; downloading %d of them'
ie_result['playlist_count'] = ie_result.get('playlist_count') or playlist_count
def get_entry(i):
return ie_entries[i - 1]
else:
entries = resolved_entries = list(entries)
n_entries = len(resolved_entries)
ie_result['requested_entries'], ie_result['entries'] = tuple(zip(*resolved_entries)) or ([], [])
if not ie_result.get('playlist_count'):
# Better to do this after potentially exhausting entries
ie_result['playlist_count'] = all_entries.get_full_count()
msg = 'Downloading %d videos'
if not isinstance(ie_entries, (PagedList, LazyList)):
ie_entries = LazyList(ie_entries)
elif isinstance(ie_entries, InAdvancePagedList):
if ie_entries._pagesize == 1:
playlist_count = ie_entries._pagecount
def get_entry(i):
return YoutubeDL.__handle_extraction_exceptions(
lambda self, i: ie_entries[i - 1]
)(self, i)
entries, broken = [], False
items = playlistitems if playlistitems is not None else itertools.count(playliststart)
for i in items:
if i == 0:
continue
if playlistitems is None and playlistend is not None and playlistend < i:
break
entry = None
try:
entry = get_entry(i)
if entry is MissingEntry:
raise EntryNotInPlaylist()
except (IndexError, EntryNotInPlaylist):
if incomplete_entries:
raise EntryNotInPlaylist(f'Entry {i} cannot be found')
elif not playlistitems:
break
entries.append(entry)
try:
if entry is not None:
# TODO: Add auto-generated fields
self._match_entry(entry, incomplete=True, silent=True)
except (ExistingVideoReached, RejectedVideoReached):
broken = True
break
ie_result['entries'] = entries
# Save playlist_index before re-ordering
entries = [
((playlistitems[i - 1] if playlistitems else i + playliststart - 1), entry)
for i, entry in enumerate(entries, 1)
if entry is not None]
n_entries = len(entries)
if not (ie_result.get('playlist_count') or broken or playlistitems or playlistend):
ie_result['playlist_count'] = n_entries
if not playlistitems and (playliststart != 1 or playlistend):
playlistitems = list(range(playliststart, playliststart + n_entries))
ie_result['requested_entries'] = playlistitems
_infojson_written = False
write_playlist_files = self.params.get('allow_playlist_files', True)
if write_playlist_files and self.params.get('list_thumbnails'):
self.list_thumbnails(ie_result)
if write_playlist_files and not self.params.get('simulate'):
ie_copy = self._playlist_infodict(ie_result, n_entries=int_or_none(n_entries))
ie_copy = self._playlist_infodict(ie_result, n_entries=n_entries)
_infojson_written = self._write_info_json(
'playlist', ie_result, self.prepare_filename(ie_copy, 'pl_infojson'))
if _infojson_written is None:
@ -1714,41 +1757,33 @@ class YoutubeDL:
# TODO: This should be passed to ThumbnailsConvertor if necessary
self._write_thumbnails('playlist', ie_copy, self.prepare_filename(ie_copy, 'pl_thumbnail'))
if lazy:
if self.params.get('playlistreverse') or self.params.get('playlistrandom'):
self.report_warning('playlistreverse and playlistrandom are not supported with lazy_playlist', only_once=True)
elif self.params.get('playlistreverse'):
entries.reverse()
elif self.params.get('playlistrandom'):
if self.params.get('playlistreverse', False):
entries = entries[::-1]
if self.params.get('playlistrandom', False):
random.shuffle(entries)
self.to_screen(f'[{ie_result["extractor"]}] Playlist {title}: Downloading {n_entries} videos'
f'{format_field(ie_result, "playlist_count", " of %s")}')
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
self.to_screen(f'[{ie_result["extractor"]}] playlist {playlist}: {msg % n_entries}')
failures = 0
max_failures = self.params.get('skip_playlist_after_errors') or float('inf')
for i, (playlist_index, entry) in enumerate(entries):
if lazy:
resolved_entries.append((playlist_index, entry))
# TODO: Add auto-generated fields
if self._match_entry(entry, incomplete=True) is not None:
continue
for i, entry_tuple in enumerate(entries, 1):
playlist_index, entry = entry_tuple
if 'playlist-index' in self.params.get('compat_opts', []):
playlist_index = playlistitems[i - 1] if playlistitems else i + playliststart - 1
self.to_screen('[download] Downloading video %s of %s' % (
self._format_screen(i + 1, self.Styles.ID), self._format_screen(n_entries, self.Styles.EMPHASIS)))
entry['__x_forwarded_for_ip'] = ie_result.get('__x_forwarded_for_ip')
if not lazy and 'playlist-index' in self.params.get('compat_opts', []):
playlist_index = ie_result['requested_entries'][i]
entry_result = self.__process_iterable_entry(entry, download, {
'n_entries': int_or_none(n_entries),
'__last_playlist_index': max(ie_result['requested_entries'] or (0, 0)),
self._format_screen(i, self.Styles.ID), self._format_screen(n_entries, self.Styles.EMPHASIS)))
# This __x_forwarded_for_ip thing is a bit ugly but requires
# minimal changes
if x_forwarded_for:
entry['__x_forwarded_for_ip'] = x_forwarded_for
extra = {
'n_entries': n_entries,
'__last_playlist_index': max(playlistitems) if playlistitems else (playlistend or n_entries),
'playlist_count': ie_result.get('playlist_count'),
'playlist_index': playlist_index,
'playlist_autonumber': i + 1,
'playlist': title,
'playlist_autonumber': i,
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
@ -1758,17 +1793,20 @@ class YoutubeDL:
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'webpage_url_domain': get_domain(ie_result['webpage_url']),
'extractor_key': ie_result['extractor_key'],
})
}
if self._match_entry(entry, incomplete=True) is not None:
continue
entry_result = self.__process_iterable_entry(entry, download, extra)
if not entry_result:
failures += 1
if failures >= max_failures:
self.report_error(
f'Skipping the remaining entries in playlist "{title}" since {failures} items failed extraction')
'Skipping the remaining entries in playlist "%s" since %d items failed extraction' % (playlist, failures))
break
resolved_entries[i] = (playlist_index, entry_result)
# Update with processed data
ie_result['requested_entries'], ie_result['entries'] = tuple(zip(*resolved_entries)) or ([], [])
playlist_results.append(entry_result)
ie_result['entries'] = playlist_results
# Write the updated info to json
if _infojson_written is True and self._write_info_json(
@ -1777,10 +1815,10 @@ class YoutubeDL:
return
ie_result = self.run_all_pps('playlist', ie_result)
self.to_screen(f'[download] Finished downloading playlist: {title}')
self.to_screen(f'[download] Finished downloading playlist: {playlist}')
return ie_result
@_handle_extraction_exceptions
@__handle_extraction_exceptions
def __process_iterable_entry(self, entry, download, extra_info):
return self.process_ie_result(
entry, download=download, extra_info=extra_info)
@ -1862,7 +1900,7 @@ class YoutubeDL:
temp_file.close()
try:
success, _ = self.dl(temp_file.name, f, test=True)
except (DownloadError, OSError, ValueError) + network_exceptions:
except (DownloadError, IOError, OSError, ValueError) + network_exceptions:
success = False
finally:
if os.path.exists(temp_file.name):
@ -1887,11 +1925,11 @@ class YoutubeDL:
and (
not can_merge()
or info_dict.get('is_live') and not self.params.get('live_from_start')
or self.params['outtmpl']['default'] == '-'))
or self.outtmpl_dict['default'] == '-'))
compat = (
prefer_best
or self.params.get('allow_multiple_audio_streams', False)
or 'format-spec' in self.params['compat_opts'])
or 'format-spec' in self.params.get('compat_opts', []))
return (
'best/bestvideo+bestaudio' if prefer_best
@ -2232,7 +2270,7 @@ class YoutubeDL:
def _calc_headers(self, info_dict):
res = merge_headers(self.params['http_headers'], info_dict.get('http_headers') or {})
cookies = self._calc_cookies(info_dict['url'])
cookies = self._calc_cookies(info_dict)
if cookies:
res['Cookie'] = cookies
@ -2243,8 +2281,8 @@ class YoutubeDL:
return res
def _calc_cookies(self, url):
pr = sanitized_Request(url)
def _calc_cookies(self, info_dict):
pr = sanitized_Request(info_dict['url'])
self.cookiejar.add_cookie_header(pr)
return pr.get_header('Cookie')
@ -2342,11 +2380,6 @@ class YoutubeDL:
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
def _raise_pending_errors(self, info):
err = info.pop('__pending_error', None)
if err:
self.report_error(err, tb=False)
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
self._num_videos += 1
@ -2378,8 +2411,6 @@ class YoutubeDL:
sanitize_string_field(info_dict, 'id')
sanitize_numeric_fields(info_dict)
if info_dict.get('section_end') and info_dict.get('section_start') is not None:
info_dict['duration'] = round(info_dict['section_end'] - info_dict['section_start'], 3)
if (info_dict.get('duration') or 0) <= 0 and info_dict.pop('duration', None):
self.report_warning('"duration" field is negative, there is an error in extractor')
@ -2507,7 +2538,7 @@ class YoutubeDL:
format['dynamic_range'] = 'SDR'
if (info_dict.get('duration') and format.get('tbr')
and not format.get('filesize') and not format.get('filesize_approx')):
format['filesize_approx'] = int(info_dict['duration'] * format['tbr'] * (1024 / 8))
format['filesize_approx'] = info_dict['duration'] * format['tbr'] * (1024 / 8)
# Add HTTP headers, so that external programs can use them from the
# json output
@ -2554,7 +2585,7 @@ class YoutubeDL:
if list_only:
# Without this printing, -F --print-json will not work
self.__forced_printings(info_dict, self.prepare_filename(info_dict), incomplete=True)
return info_dict
return
format_selector = self.format_selector
if format_selector is None:
@ -2595,40 +2626,20 @@ class YoutubeDL:
# Process what we can, even without any available formats.
formats_to_download = [{}]
requested_ranges = self.params.get('download_ranges')
if requested_ranges:
requested_ranges = tuple(requested_ranges(info_dict, self))
best_format, downloaded_formats = formats_to_download[-1], []
best_format = formats_to_download[-1]
if download:
if best_format:
def to_screen(*msg):
self.to_screen(f'[info] {info_dict["id"]}: {" ".join(", ".join(variadic(m)) for m in msg)}')
to_screen(f'Downloading {len(formats_to_download)} format(s):',
(f['format_id'] for f in formats_to_download))
if requested_ranges:
to_screen(f'Downloading {len(requested_ranges)} time ranges:',
(f'{int(c["start_time"])}-{int(c["end_time"])}' for c in requested_ranges))
self.to_screen(
f'[info] {info_dict["id"]}: Downloading {len(formats_to_download)} format(s): '
+ ', '.join([f['format_id'] for f in formats_to_download]))
max_downloads_reached = False
for fmt, chapter in itertools.product(formats_to_download, requested_ranges or [{}]):
new_info = self._copy_infodict(info_dict)
for i, fmt in enumerate(formats_to_download):
formats_to_download[i] = new_info = self._copy_infodict(info_dict)
new_info.update(fmt)
offset, duration = info_dict.get('section_start') or 0, info_dict.get('duration') or float('inf')
if chapter or offset:
new_info.update({
'section_start': offset + chapter.get('start_time', 0),
'section_end': offset + min(chapter.get('end_time', 0), duration),
'section_title': chapter.get('title'),
'section_number': chapter.get('index'),
})
downloaded_formats.append(new_info)
try:
self.process_info(new_info)
except MaxDownloadsReached:
max_downloads_reached = True
self._raise_pending_errors(new_info)
# Remove copied info
for key, val in tuple(new_info.items()):
if info_dict.get(key) == val:
@ -2636,12 +2647,12 @@ class YoutubeDL:
if max_downloads_reached:
break
write_archive = {f.get('__write_download_archive', False) for f in downloaded_formats}
write_archive = {f.get('__write_download_archive', False) for f in formats_to_download}
assert write_archive.issubset({True, False, 'ignore'})
if True in write_archive and False not in write_archive:
self.record_download_archive(info_dict)
info_dict['requested_downloads'] = downloaded_formats
info_dict['requested_downloads'] = formats_to_download
info_dict = self.run_all_pps('after_video', info_dict)
if max_downloads_reached:
raise MaxDownloadsReached()
@ -2863,13 +2874,8 @@ class YoutubeDL:
# Forced printings
self.__forced_printings(info_dict, full_filename, incomplete=('format' not in info_dict))
def check_max_downloads():
if self._num_downloads >= float(self.params.get('max_downloads') or 'inf'):
raise MaxDownloadsReached()
if self.params.get('simulate'):
info_dict['__write_download_archive'] = self.params.get('force_write_download_archive')
check_max_downloads()
return
if full_filename is None:
@ -2973,8 +2979,12 @@ class YoutubeDL:
info_dict.clear()
info_dict.update(new_info)
new_info, files_to_move = self.pre_process(info_dict, 'before_dl', files_to_move)
replace_info_dict(new_info)
try:
new_info, files_to_move = self.pre_process(info_dict, 'before_dl', files_to_move)
replace_info_dict(new_info)
except PostProcessingError as err:
self.report_error('Preprocessing: %s' % str(err))
return
if self.params.get('skip_download'):
info_dict['filepath'] = temp_filename
@ -2996,16 +3006,7 @@ class YoutubeDL:
info_dict['ext'] = os.path.splitext(file)[1][1:]
return file
fd, success = None, True
if info_dict.get('protocol') or info_dict.get('url'):
fd = get_suitable_downloader(info_dict, self.params, to_stdout=temp_filename == '-')
if fd is not FFmpegFD and (
info_dict.get('section_start') or info_dict.get('section_end')):
msg = ('This format cannot be partially downloaded' if FFmpegFD.available()
else 'You have requested downloading the video partially, but ffmpeg is not installed')
self.report_error(f'{msg}. Aborting')
return
success = True
if info_dict.get('requested_formats') is not None:
def compatible_formats(formats):
@ -3038,7 +3039,7 @@ class YoutubeDL:
and info_dict.get('thumbnails')
# check with type instead of pp_key, __name__, or isinstance
# since we dont want any custom PPs to trigger this
and any(type(pp) == EmbedThumbnailPP for pp in self._pps['post_process'])): # noqa: E721
and any(type(pp) == EmbedThumbnailPP for pp in self._pps['post_process'])):
info_dict['ext'] = 'mkv'
self.report_warning(
'webm doesn\'t support embedding a thumbnail, mkv will be used')
@ -3060,8 +3061,10 @@ class YoutubeDL:
dl_filename = existing_video_file(full_filename, temp_filename)
info_dict['__real_download'] = False
merger = FFmpegMergerPP(self)
downloaded = []
merger = FFmpegMergerPP(self)
fd = get_suitable_downloader(info_dict, self.params, to_stdout=temp_filename == '-')
if dl_filename is not None:
self.report_file_already_downloaded(dl_filename)
elif fd:
@ -3141,7 +3144,6 @@ class YoutubeDL:
self.report_error(f'content too short (expected {err.expected} bytes and served {err.downloaded})')
return
self._raise_pending_errors(info_dict)
if success and full_filename != '-':
def fixup():
@ -3211,10 +3213,15 @@ class YoutubeDL:
return
info_dict['__write_download_archive'] = True
assert info_dict is original_infodict # Make sure the info_dict was modified in-place
if self.params.get('force_write_download_archive'):
info_dict['__write_download_archive'] = True
check_max_downloads()
# Make sure the info_dict was modified in-place
assert info_dict is original_infodict
max_downloads = self.params.get('max_downloads')
if max_downloads is not None and self._num_downloads >= int(max_downloads):
raise MaxDownloadsReached()
def __download_wrapper(self, func):
@functools.wraps(func)
@ -3236,7 +3243,7 @@ class YoutubeDL:
def download(self, url_list):
"""Download a given list of URLs."""
url_list = variadic(url_list) # Passing a single URL is a common mistake
outtmpl = self.params['outtmpl']['default']
outtmpl = self.outtmpl_dict['default']
if (len(url_list) > 1
and outtmpl != '-'
and '%' not in outtmpl
@ -3357,12 +3364,7 @@ class YoutubeDL:
def pre_process(self, ie_info, key='pre_process', files_to_move=None):
info = dict(ie_info)
info['__files_to_move'] = files_to_move or {}
try:
info = self.run_all_pps(key, info)
except PostProcessingError as err:
msg = f'Preprocessing: {err}'
info.setdefault('__pending_error', msg)
self.report_error(msg, is_error=False)
info = self.run_all_pps(key, info)
return info, info.pop('__files_to_move', None)
def post_process(self, filename, info, files_to_move=None):
@ -3597,14 +3599,10 @@ class YoutubeDL:
if not self.params.get('verbose'):
return
# These imports can be slow. So import them only as needed
from .extractor.extractors import _LAZY_LOADER
from .extractor.extractors import _PLUGIN_CLASSES as plugin_extractors
def get_encoding(stream):
ret = str(getattr(stream, 'encoding', 'missing (%s)' % type(stream).__name__))
if not supports_terminal_sequences(stream):
from .utils import WINDOWS_VT_MODE # Must be imported locally
from .compat import WINDOWS_VT_MODE # Must be imported locally
ret += ' (No VT)' if WINDOWS_VT_MODE is False else ' (No ANSI)'
return ret
@ -3613,7 +3611,7 @@ class YoutubeDL:
sys.getfilesystemencoding(),
self.get_encoding(),
', '.join(
f'{key} {get_encoding(stream)}' for key, stream in self._out_files.items_
f'{key} {get_encoding(stream)}' for key, stream in self._out_files
if stream is not None and key != 'console')
)
@ -3640,17 +3638,19 @@ class YoutubeDL:
write_debug('Plugins: %s' % [
'%s%s' % (klass.__name__, '' if klass.__name__ == name else f' as {name}')
for name, klass in itertools.chain(plugin_extractors.items(), plugin_postprocessors.items())])
if self.params['compat_opts']:
write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
if self.params.get('compat_opts'):
write_debug('Compatibility options: %s' % ', '.join(self.params.get('compat_opts')))
if source == 'source':
try:
stdout, _, _ = Popen.run(
sp = Popen(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
write_debug(f'Git HEAD: {stdout.strip()}')
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
cwd=os.path.dirname(os.path.abspath(__file__)))
out, err = sp.communicate_or_kill()
out = out.decode().strip()
if re.match('[0-9a-f]+', out):
write_debug('Git HEAD: %s' % out)
except Exception:
with contextlib.suppress(Exception):
sys.exc_clear()

View File

@ -4,16 +4,14 @@ f'You are using an unsupported version of Python. Only Python versions 3.6 and a
__license__ = 'Public Domain'
import itertools
import optparse
import os
import re
import sys
from .compat import compat_getpass, compat_shlex_quote
from .compat import compat_getpass, compat_os_name, compat_shlex_quote
from .cookies import SUPPORTED_BROWSERS, SUPPORTED_KEYRINGS
from .downloader import FileDownloader
from .downloader.external import get_external_downloader
from .extractor import list_extractor_classes
from .extractor import GenericIE, list_extractor_classes
from .extractor.adobepass import MSO_INFO
from .extractor.common import InfoExtractor
from .options import parseOpts
@ -26,7 +24,7 @@ from .postprocessor import (
MetadataFromFieldPP,
MetadataParserPP,
)
from .update import Updater
from .update import run_update
from .utils import (
NO_DEFAULT,
POSTPROCESS_WHEN,
@ -34,47 +32,41 @@ from .utils import (
DownloadCancelled,
DownloadError,
GeoUtils,
PlaylistEntries,
SameFileError,
decodeOption,
download_range_func,
expand_path,
float_or_none,
format_field,
int_or_none,
match_filter_func,
parse_duration,
preferredencoding,
read_batch_urls,
read_stdin,
render_table,
setproctitle,
std_headers,
traverse_obj,
variadic,
write_string,
)
from .YoutubeDL import YoutubeDL
def _exit(status=0, *args):
for msg in args:
sys.stderr.write(msg)
raise SystemExit(status)
def get_urls(urls, batchfile, verbose):
# Batch file verification
batch_urls = []
if batchfile is not None:
try:
batch_urls = read_batch_urls(
read_stdin('URLs') if batchfile == '-'
else open(expand_path(batchfile), encoding='utf-8', errors='ignore'))
if batchfile == '-':
write_string('Reading URLs from stdin - EOF (%s) to end:\n' % (
'Ctrl+Z' if compat_os_name == 'nt' else 'Ctrl+D'))
batchfd = sys.stdin
else:
batchfd = open(
expand_path(batchfile), encoding='utf-8', errors='ignore')
batch_urls = read_batch_urls(batchfd)
if verbose:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except OSError:
_exit(f'ERROR: batch file {batchfile} could not be read')
sys.exit('ERROR: batch file %s could not be read' % batchfile)
_enc = preferredencoding()
return [
url.strip().decode(_enc, 'ignore') if isinstance(url, bytes) else url.strip()
@ -82,10 +74,6 @@ def get_urls(urls, batchfile, verbose):
def print_extractor_information(opts, urls):
# Importing GenericIE is currently slow since it imports other extractors
# TODO: Move this back to module level after generalization of embed detection
from .extractor.generic import GenericIE
out = ''
if opts.list_extractors:
urls = dict.fromkeys(urls, False)
@ -221,11 +209,15 @@ def validate_options(opts):
validate_regex('format sorting', f, InfoExtractor.FormatSort.regex)
# Postprocessor formats
validate_regex('audio format', opts.audioformat, FFmpegExtractAudioPP.FORMAT_RE)
validate_in('audio format', opts.audioformat, ['best'] + list(FFmpegExtractAudioPP.SUPPORTED_EXTS))
validate_in('subtitle format', opts.convertsubtitles, FFmpegSubtitlesConvertorPP.SUPPORTED_EXTS)
validate_regex('thumbnail format', opts.convertthumbnails, FFmpegThumbnailsConvertorPP.FORMAT_RE)
validate_regex('recode video format', opts.recodevideo, FFmpegVideoConvertorPP.FORMAT_RE)
validate_regex('remux video format', opts.remuxvideo, FFmpegVideoRemuxerPP.FORMAT_RE)
validate_in('thumbnail format', opts.convertthumbnails, FFmpegThumbnailsConvertorPP.SUPPORTED_EXTS)
if opts.recodevideo is not None:
opts.recodevideo = opts.recodevideo.replace(' ', '')
validate_regex('video recode format', opts.recodevideo, FFmpegVideoConvertorPP.FORMAT_RE)
if opts.remuxvideo is not None:
opts.remuxvideo = opts.remuxvideo.replace(' ', '')
validate_regex('video remux format', opts.remuxvideo, FFmpegVideoRemuxerPP.FORMAT_RE)
if opts.audioquality:
opts.audioquality = opts.audioquality.strip('k').strip('K')
# int_or_none prevents inf, nan
@ -247,28 +239,6 @@ def validate_options(opts):
opts.extractor_retries = parse_retries('extractor', opts.extractor_retries)
opts.file_access_retries = parse_retries('file access', opts.file_access_retries)
# Retry sleep function
def parse_sleep_func(expr):
NUMBER_RE = r'\d+(?:\.\d+)?'
op, start, limit, step, *_ = tuple(re.fullmatch(
rf'(?:(linear|exp)=)?({NUMBER_RE})(?::({NUMBER_RE})?)?(?::({NUMBER_RE}))?',
expr.strip()).groups()) + (None, None)
if op == 'exp':
return lambda n: min(float(start) * (float(step or 2) ** n), float(limit or 'inf'))
else:
default_step = start if op or limit else 0
return lambda n: min(float(start) + float(step or default_step) * n, float(limit or 'inf'))
for key, expr in opts.retry_sleep.items():
if not expr:
del opts.retry_sleep[key]
continue
try:
opts.retry_sleep[key] = parse_sleep_func(expr)
except AttributeError:
raise ValueError(f'invalid {key} retry sleep expression {expr!r}')
# Bytes
def parse_bytes(name, value):
if value is None:
@ -313,25 +283,20 @@ def validate_options(opts):
'Cannot download a video and extract audio into the same file! '
f'Use "{outtmpl_default}.%(ext)s" instead of "{outtmpl_default}" as the output template')
def parse_chapters(name, value):
chapters, ranges = [], []
for regex in value or []:
if regex.startswith('*'):
for range in regex[1:].split(','):
dur = tuple(map(parse_duration, range.strip().split('-')))
if len(dur) == 2 and all(t is not None for t in dur):
ranges.append(dur)
else:
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form *start-end')
# Remove chapters
remove_chapters_patterns, opts.remove_ranges = [], []
for regex in opts.remove_chapters or []:
if regex.startswith('*'):
dur = list(map(parse_duration, regex[1:].split('-')))
if len(dur) == 2 and all(t is not None for t in dur):
opts.remove_ranges.append(tuple(dur))
continue
try:
chapters.append(re.compile(regex))
except re.error as err:
raise ValueError(f'invalid {name} regex "{regex}" - {err}')
return chapters, ranges
opts.remove_chapters, opts.remove_ranges = parse_chapters('--remove-chapters', opts.remove_chapters)
opts.download_ranges = download_range_func(*parse_chapters('--download-sections', opts.download_ranges))
raise ValueError(f'invalid --remove-chapters time range "{regex}". Must be of the form *start-end')
try:
remove_chapters_patterns.append(re.compile(regex))
except re.error as err:
raise ValueError(f'invalid --remove-chapters regex "{regex}" - {err}')
opts.remove_chapters = remove_chapters_patterns
# Cookies from browser
if opts.cookiesfrombrowser:
@ -375,12 +340,6 @@ def validate_options(opts):
opts.parse_metadata = list(itertools.chain(*map(metadataparser_actions, parse_metadata)))
# Other options
if opts.playlist_items is not None:
try:
tuple(PlaylistEntries.parse_playlist_items(opts.playlist_items))
except Exception as err:
raise ValueError(f'Invalid playlist-items {opts.playlist_items!r}: {err}')
geo_bypass_code = opts.geo_bypass_ip_block or opts.geo_bypass_country
if geo_bypass_code is not None:
try:
@ -401,15 +360,6 @@ def validate_options(opts):
if opts.no_sponsorblock:
opts.sponsorblock_mark = opts.sponsorblock_remove = set()
default_downloader = None
for proto, path in opts.external_downloader.items():
ed = get_external_downloader(path)
if ed is None:
raise ValueError(
f'No such {format_field(proto, None, "%s ", ignore="default")}external downloader "{path}"')
elif ed and proto == 'default':
default_downloader = ed.get_basename()
warnings, deprecation_warnings = [], []
# Common mistake: -f best
@ -420,18 +370,13 @@ def validate_options(opts):
'If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning')))
# --(postprocessor/downloader)-args without name
def report_args_compat(name, value, key1, key2=None, where=None):
def report_args_compat(name, value, key1, key2=None):
if key1 in value and key2 not in value:
warnings.append(f'{name.title()} arguments given without specifying name. '
f'The arguments will be given to {where or f"all {name}s"}')
warnings.append(f'{name} arguments given without specifying name. The arguments will be given to all {name}s')
return True
return False
if report_args_compat('external downloader', opts.external_downloader_args,
'default', where=default_downloader) and default_downloader:
# Compat with youtube-dl's behavior. See https://github.com/ytdl-org/youtube-dl/commit/49c5293014bc11ec8c009856cd63cffa6296c1e1
opts.external_downloader_args.setdefault(default_downloader, opts.external_downloader_args.pop('default'))
report_args_compat('external downloader', opts.external_downloader_args, 'default')
if report_args_compat('post-processor', opts.postprocessor_args, 'default-compat', 'default'):
opts.postprocessor_args['default'] = opts.postprocessor_args.pop('default-compat')
opts.postprocessor_args.setdefault('sponskrub', [])
@ -450,9 +395,6 @@ def validate_options(opts):
setattr(opts, opt1, default)
# Conflicting options
report_conflict('--playlist-reverse', 'playlist_reverse', '--playlist-random', 'playlist_random')
report_conflict('--playlist-reverse', 'playlist_reverse', '--lazy-playlist', 'lazy_playlist')
report_conflict('--playlist-random', 'playlist_random', '--lazy-playlist', 'lazy_playlist')
report_conflict('--dateafter', 'dateafter', '--date', 'date', default=None)
report_conflict('--datebefore', 'datebefore', '--date', 'date', default=None)
report_conflict('--exec-before-download', 'exec_before_dl_cmd',
@ -685,7 +627,7 @@ def parse_options(argv=None):
final_ext = (
opts.recodevideo if opts.recodevideo in FFmpegVideoConvertorPP.SUPPORTED_EXTS
else opts.remuxvideo if opts.remuxvideo in FFmpegVideoRemuxerPP.SUPPORTED_EXTS
else opts.audioformat if (opts.extractaudio and opts.audioformat in FFmpegExtractAudioPP.SUPPORTED_EXTS)
else opts.audioformat if (opts.extractaudio and opts.audioformat != 'best')
else None)
return parser, opts, urls, {
@ -744,7 +686,6 @@ def parse_options(argv=None):
'file_access_retries': opts.file_access_retries,
'fragment_retries': opts.fragment_retries,
'extractor_retries': opts.extractor_retries,
'retry_sleep_functions': opts.retry_sleep,
'skip_unavailable_fragments': opts.skip_unavailable_fragments,
'keep_fragments': opts.keep_fragments,
'concurrent_fragment_downloads': opts.concurrent_fragment_downloads,
@ -759,7 +700,6 @@ def parse_options(argv=None):
'playlistend': opts.playlistend,
'playlistreverse': opts.playlist_reverse,
'playlistrandom': opts.playlist_random,
'lazy_playlist': opts.lazy_playlist,
'noplaylist': opts.noplaylist,
'logtostderr': opts.outtmpl.get('default') == '-',
'consoletitle': opts.consoletitle,
@ -791,7 +731,6 @@ def parse_options(argv=None):
'verbose': opts.verbose,
'dump_intermediate_pages': opts.dump_intermediate_pages,
'write_pages': opts.write_pages,
'load_pages': opts.load_pages,
'test': opts.test,
'keepvideo': opts.keepvideo,
'min_filesize': opts.min_filesize,
@ -840,8 +779,6 @@ def parse_options(argv=None):
'max_sleep_interval': opts.max_sleep_interval,
'sleep_interval_subtitles': opts.sleep_interval_subtitles,
'external_downloader': opts.external_downloader,
'download_ranges': opts.download_ranges,
'force_keyframes_at_cuts': opts.force_keyframes_at_cuts,
'list_thumbnails': opts.list_thumbnails,
'playlist_items': opts.playlist_items,
'xattr_set_filesize': opts.xattr_set_filesize,
@ -873,63 +810,62 @@ def _real_main(argv=None):
if opts.dump_user_agent:
ua = traverse_obj(opts.headers, 'User-Agent', casesense=False, default=std_headers['User-Agent'])
write_string(f'{ua}\n', out=sys.stdout)
return
sys.exit(0)
if print_extractor_information(opts, all_urls):
return
sys.exit(0)
with YoutubeDL(ydl_opts) as ydl:
pre_process = opts.update_self or opts.rm_cachedir
actual_use = all_urls or opts.load_info_filename
# Remove cache dir
if opts.rm_cachedir:
ydl.cache.remove()
updater = Updater(ydl)
if opts.update_self and updater.update() and actual_use:
if updater.cmd:
return updater.restart()
# This code is reachable only for zip variant in py < 3.10
# It makes sense to exit here, but the old behavior is to continue
ydl.report_warning('Restart yt-dlp to use the updated version')
# return 100, 'ERROR: The program must exit for the update to complete'
# Update version
if opts.update_self:
# If updater returns True, exit. Required for windows
if run_update(ydl):
if actual_use:
sys.exit('ERROR: The program must exit for the update to complete')
sys.exit()
# Maybe do nothing
if not actual_use:
if pre_process:
return ydl._download_retcode
if opts.update_self or opts.rm_cachedir:
sys.exit()
ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
parser.error(
'You must provide at least one URL.\n'
'Type yt-dlp --help to see a list of all options.')
parser.destroy()
try:
if opts.load_info_filename is not None:
return ydl.download_with_info_file(expand_path(opts.load_info_filename))
retcode = ydl.download_with_info_file(expand_path(opts.load_info_filename))
else:
return ydl.download(all_urls)
retcode = ydl.download(all_urls)
except DownloadCancelled:
ydl.to_screen('Aborting remaining downloads')
return 101
retcode = 101
sys.exit(retcode)
def main(argv=None):
try:
_exit(*variadic(_real_main(argv)))
_real_main(argv)
except DownloadError:
_exit(1)
sys.exit(1)
except SameFileError as e:
_exit(f'ERROR: {e}')
sys.exit(f'ERROR: {e}')
except KeyboardInterrupt:
_exit('\nERROR: Interrupted by user')
sys.exit('\nERROR: Interrupted by user')
except BrokenPipeError as e:
# https://docs.python.org/3/library/signal.html#note-on-sigpipe
devnull = os.open(os.devnull, os.O_WRONLY)
os.dup2(devnull, sys.stdout.fileno())
_exit(f'\nERROR: {e}')
except optparse.OptParseError as e:
_exit(2, f'\n{e}')
sys.exit(f'\nERROR: {e}')
from .extractor import gen_extractors, list_extractors

View File

@ -1,4 +1,6 @@
import contextlib
import os
import subprocess
import sys
import warnings
import xml.etree.ElementTree as etree
@ -9,13 +11,8 @@ from .compat_utils import passthrough_module
# XXX: Implement this the same way as other DeprecationWarnings without circular import
try:
passthrough_module(__name__, '._legacy', callback=lambda attr: warnings.warn(
DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=2))
HAS_LEGACY = True
except ModuleNotFoundError:
# Keep working even without _legacy module
HAS_LEGACY = False
passthrough_module(__name__, '._legacy', callback=lambda attr: warnings.warn(
DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=2))
del passthrough_module
@ -55,7 +52,7 @@ if compat_os_name == 'nt' and sys.version_info < (3, 8):
def compat_realpath(path):
while os.path.islink(path):
path = os.path.abspath(os.readlink(path))
return os.path.realpath(path)
return path
else:
compat_realpath = os.path.realpath
@ -77,3 +74,17 @@ if compat_os_name in ('nt', 'ce'):
return userhome + path[i:]
else:
compat_expanduser = os.path.expanduser
WINDOWS_VT_MODE = False if compat_os_name == 'nt' else None
def windows_enable_vt_mode(): # TODO: Do this the proper way https://bugs.python.org/issue30075
if compat_os_name != 'nt':
return
global WINDOWS_VT_MODE
startupinfo = subprocess.STARTUPINFO()
startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
with contextlib.suppress(Exception):
subprocess.Popen('', shell=True, startupinfo=startupinfo).wait()
WINDOWS_VT_MODE = True

View File

@ -55,10 +55,3 @@ compat_xml_parse_error = etree.ParseError
compat_xpath = lambda xpath: xpath
compat_zip = zip
workaround_optparse_bug9161 = lambda: None
def __getattr__(name):
if name in ('WINDOWS_VT_MODE', 'windows_enable_vt_mode'):
from .. import utils
return getattr(utils, name)
raise AttributeError(name)

View File

@ -33,7 +33,7 @@ def _is_package(module):
def passthrough_module(parent, child, *, callback=lambda _: None):
parent_module = importlib.import_module(parent)
child_module = None # Import child module only as needed
child_module = importlib.import_module(child, parent)
class PassthroughModule(types.ModuleType):
def __getattr__(self, attr):
@ -41,9 +41,6 @@ def passthrough_module(parent, child, *, callback=lambda _: None):
with contextlib.suppress(ImportError):
return importlib.import_module(f'.{attr}', parent)
nonlocal child_module
child_module = child_module or importlib.import_module(child, parent)
ret = _NO_ATTRIBUTE
with contextlib.suppress(AttributeError):
ret = getattr(child_module, attr)

View File

@ -1,26 +0,0 @@
# flake8: noqa: F405
from functools import * # noqa: F403
from .compat_utils import passthrough_module
passthrough_module(__name__, 'functools')
del passthrough_module
try:
cache # >= 3.9
except NameError:
cache = lru_cache(maxsize=None)
try:
cached_property # >= 3.8
except NameError:
class cached_property:
def __init__(self, func):
update_wrapper(self, func)
self.func = func
def __get__(self, instance, _):
if instance is None:
return self
setattr(instance, self.func.__name__, self.func(instance))
return getattr(instance, self.func.__name__)

View File

@ -156,16 +156,30 @@ def _extract_firefox_cookies(profile, logger):
def _firefox_browser_dir():
if sys.platform in ('cygwin', 'win32'):
if sys.platform in ('linux', 'linux2'):
return os.path.expanduser('~/.mozilla/firefox')
elif sys.platform == 'win32':
return os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles')
elif sys.platform == 'darwin':
return os.path.expanduser('~/Library/Application Support/Firefox')
return os.path.expanduser('~/.mozilla/firefox')
else:
raise ValueError(f'unsupported platform: {sys.platform}')
def _get_chromium_based_browser_settings(browser_name):
# https://chromium.googlesource.com/chromium/src/+/HEAD/docs/user_data_dir.md
if sys.platform in ('cygwin', 'win32'):
if sys.platform in ('linux', 'linux2'):
config = _config_home()
browser_dir = {
'brave': os.path.join(config, 'BraveSoftware/Brave-Browser'),
'chrome': os.path.join(config, 'google-chrome'),
'chromium': os.path.join(config, 'chromium'),
'edge': os.path.join(config, 'microsoft-edge'),
'opera': os.path.join(config, 'opera'),
'vivaldi': os.path.join(config, 'vivaldi'),
}[browser_name]
elif sys.platform == 'win32':
appdata_local = os.path.expandvars('%LOCALAPPDATA%')
appdata_roaming = os.path.expandvars('%APPDATA%')
browser_dir = {
@ -189,15 +203,7 @@ def _get_chromium_based_browser_settings(browser_name):
}[browser_name]
else:
config = _config_home()
browser_dir = {
'brave': os.path.join(config, 'BraveSoftware/Brave-Browser'),
'chrome': os.path.join(config, 'google-chrome'),
'chromium': os.path.join(config, 'chromium'),
'edge': os.path.join(config, 'microsoft-edge'),
'opera': os.path.join(config, 'opera'),
'vivaldi': os.path.join(config, 'vivaldi'),
}[browser_name]
raise ValueError(f'unsupported platform: {sys.platform}')
# Linux keyring names can be determined by snooping on dbus while opening the browser in KDE:
# dbus-monitor "interface='org.kde.KWallet'" "type=method_return"
@ -337,11 +343,14 @@ class ChromeCookieDecryptor:
def get_cookie_decryptor(browser_root, browser_keyring_name, logger, *, keyring=None):
if sys.platform == 'darwin':
if sys.platform in ('linux', 'linux2'):
return LinuxChromeCookieDecryptor(browser_keyring_name, logger, keyring=keyring)
elif sys.platform == 'darwin':
return MacChromeCookieDecryptor(browser_keyring_name, logger)
elif sys.platform in ('win32', 'cygwin'):
elif sys.platform == 'win32':
return WindowsChromeCookieDecryptor(browser_root, logger)
return LinuxChromeCookieDecryptor(browser_keyring_name, logger, keyring=keyring)
else:
raise NotImplementedError(f'Chrome cookie decryption is not supported on this platform: {sys.platform}')
class LinuxChromeCookieDecryptor(ChromeCookieDecryptor):
@ -709,19 +718,21 @@ def _get_kwallet_network_wallet(logger):
"""
default_wallet = 'kdewallet'
try:
stdout, _, returncode = Popen.run([
proc = Popen([
'dbus-send', '--session', '--print-reply=literal',
'--dest=org.kde.kwalletd5',
'/modules/kwalletd5',
'org.kde.KWallet.networkWallet'
], text=True, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
if returncode:
stdout, stderr = proc.communicate_or_kill()
if proc.returncode != 0:
logger.warning('failed to read NetworkWallet')
return default_wallet
else:
logger.debug(f'NetworkWallet = "{stdout.strip()}"')
return stdout.strip()
network_wallet = stdout.decode().strip()
logger.debug(f'NetworkWallet = "{network_wallet}"')
return network_wallet
except Exception as e:
logger.warning(f'exception while obtaining NetworkWallet: {e}')
return default_wallet
@ -739,16 +750,17 @@ def _get_kwallet_password(browser_keyring_name, logger):
network_wallet = _get_kwallet_network_wallet(logger)
try:
stdout, _, returncode = Popen.run([
proc = Popen([
'kwallet-query',
'--read-password', f'{browser_keyring_name} Safe Storage',
'--folder', f'{browser_keyring_name} Keys',
network_wallet
], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
if returncode:
logger.error(f'kwallet-query failed with return code {returncode}. '
'Please consult the kwallet-query man page for details')
stdout, stderr = proc.communicate_or_kill()
if proc.returncode != 0:
logger.error(f'kwallet-query failed with return code {proc.returncode}. Please consult '
'the kwallet-query man page for details')
return b''
else:
if stdout.lower().startswith(b'failed to read'):
@ -763,7 +775,9 @@ def _get_kwallet_password(browser_keyring_name, logger):
return b''
else:
logger.debug('password found')
return stdout.rstrip(b'\n')
if stdout[-1:] == b'\n':
stdout = stdout[:-1]
return stdout
except Exception as e:
logger.warning(f'exception running kwallet-query: {error_to_str(e)}')
return b''
@ -810,13 +824,17 @@ def _get_linux_keyring_password(browser_keyring_name, keyring, logger):
def _get_mac_keyring_password(browser_keyring_name, logger):
logger.debug('using find-generic-password to obtain password from OSX keychain')
try:
stdout, _, _ = Popen.run(
proc = Popen(
['security', 'find-generic-password',
'-w', # write password to stdout
'-a', browser_keyring_name, # match 'account'
'-s', f'{browser_keyring_name} Safe Storage'], # match 'service'
stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
return stdout.rstrip(b'\n')
stdout, stderr = proc.communicate_or_kill()
if stdout[-1:] == b'\n':
stdout = stdout[:-1]
return stdout
except Exception as e:
logger.warning(f'exception running find-generic-password: {error_to_str(e)}')
return None

View File

@ -1,3 +1,4 @@
from ..compat import compat_str
from ..utils import NO_DEFAULT, determine_protocol
@ -84,13 +85,13 @@ def _get_suitable_downloader(info_dict, protocol, params, default):
if default is NO_DEFAULT:
default = HttpFD
if (info_dict.get('section_start') or info_dict.get('section_end')) and FFmpegFD.can_download(info_dict):
return FFmpegFD
# if (info_dict.get('start_time') or info_dict.get('end_time')) and not info_dict.get('requested_formats') and FFmpegFD.can_download(info_dict):
# return FFmpegFD
info_dict['protocol'] = protocol
downloaders = params.get('external_downloader')
external_downloader = (
downloaders if isinstance(downloaders, str) or downloaders is None
downloaders if isinstance(downloaders, compat_str) or downloaders is None
else downloaders.get(shorten_protocol_name(protocol, True), downloaders.get('default')))
if external_downloader is None:

View File

@ -15,18 +15,14 @@ from ..utils import (
NUMBER_RE,
LockingUnsupportedError,
Namespace,
classproperty,
decodeArgument,
encodeFilename,
error_to_compat_str,
float_or_none,
format_bytes,
join_nonempty,
sanitize_open,
shell_quote,
timeconvert,
timetuple_from_msec,
try_call,
)
@ -45,7 +41,6 @@ class FileDownloader:
verbose: Print additional info to stdout.
quiet: Do not print messages to stdout.
ratelimit: Download speed limit, in bytes/sec.
continuedl: Attempt to continue downloads if possible
throttledratelimit: Assume the download is being throttled below this speed (bytes/sec)
retries: Number of times to retry for HTTP error 5xx
file_access_retries: Number of times to retry on file access error
@ -69,7 +64,6 @@ class FileDownloader:
useful for bypassing bandwidth throttling imposed by
a webserver (experimental)
progress_template: See YoutubeDL.py
retry_sleep_functions: See YoutubeDL.py
Subclasses of this one must re-define the real_download method.
"""
@ -104,16 +98,12 @@ class FileDownloader:
def to_screen(self, *args, **kargs):
self.ydl.to_screen(*args, quiet=self.params.get('quiet'), **kargs)
__to_screen = to_screen
@classproperty
def FD_NAME(cls):
return re.sub(r'(?<=[a-z])(?=[A-Z])', '_', cls.__name__[:-2]).lower()
@property
def FD_NAME(self):
return re.sub(r'(?<!^)(?=[A-Z])', '_', type(self).__name__[:-2]).lower()
@staticmethod
def format_seconds(seconds):
if seconds is None:
return ' Unknown'
time = timetuple_from_msec(seconds * 1000)
if time.hours > 99:
return '--:--:--'
@ -121,8 +111,6 @@ class FileDownloader:
return '%02d:%02d' % time[1:-1]
return '%02d:%02d:%02d' % time[:-1]
format_eta = format_seconds
@staticmethod
def calc_percent(byte_counter, data_len):
if data_len is None:
@ -131,7 +119,11 @@ class FileDownloader:
@staticmethod
def format_percent(percent):
return ' N/A%' if percent is None else f'{percent:>5.1f}%'
if percent is None:
return '---.-%'
elif percent == 100:
return '100%'
return '%6s' % ('%3.1f%%' % percent)
@staticmethod
def calc_eta(start, now, total, current):
@ -145,6 +137,12 @@ class FileDownloader:
rate = float(current) / dif
return int((float(total) - float(current)) / rate)
@staticmethod
def format_eta(eta):
if eta is None:
return '--:--'
return FileDownloader.format_seconds(eta)
@staticmethod
def calc_speed(start, now, bytes):
dif = now - start
@ -154,11 +152,13 @@ class FileDownloader:
@staticmethod
def format_speed(speed):
return ' Unknown B/s' if speed is None else f'{format_bytes(speed):>10s}/s'
if speed is None:
return '%10s' % '---b/s'
return '%10s' % ('%s/s' % format_bytes(speed))
@staticmethod
def format_retries(retries):
return 'inf' if retries == float('inf') else int(retries)
return 'inf' if retries == float('inf') else '%.0f' % retries
@staticmethod
def best_block_size(elapsed_time, bytes):
@ -232,8 +232,7 @@ class FileDownloader:
self.to_screen(
f'[download] Unable to {action} file due to file access error. '
f'Retrying (attempt {retry} of {self.format_retries(file_access_retries)}) ...')
if not self.sleep_retry('file_access', retry):
time.sleep(0.01)
time.sleep(0.01)
return inner
return outer
@ -283,9 +282,9 @@ class FileDownloader:
elif self.ydl.params.get('logger'):
self._multiline = MultilineLogger(self.ydl.params['logger'], lines)
elif self.params.get('progress_with_newline'):
self._multiline = BreaklineStatusPrinter(self.ydl._out_files.out, lines)
self._multiline = BreaklineStatusPrinter(self.ydl._out_files.screen, lines)
else:
self._multiline = MultilinePrinter(self.ydl._out_files.out, lines, not self.params.get('quiet'))
self._multiline = MultilinePrinter(self.ydl._out_files.screen, lines, not self.params.get('quiet'))
self._multiline.allow_colors = self._multiline._HAVE_FULLCAP and not self.params.get('no_color')
def _finish_multiline_status(self):
@ -302,7 +301,7 @@ class FileDownloader:
)
def _report_progress_status(self, s, default_template):
for name, style in self.ProgressStyles.items_:
for name, style in self.ProgressStyles:
name = f'_{name}_str'
if name not in s:
continue
@ -326,52 +325,63 @@ class FileDownloader:
self._multiline.stream, self._multiline.allow_colors, *args, **kwargs)
def report_progress(self, s):
def with_fields(*tups, default=''):
for *fields, tmpl in tups:
if all(s.get(f) is not None for f in fields):
return tmpl
return default
if s['status'] == 'finished':
if self.params.get('noprogress'):
self.to_screen('[download] Download completed')
s.update({
'_total_bytes_str': format_bytes(s.get('total_bytes')),
'_elapsed_str': self.format_seconds(s.get('elapsed')),
'_percent_str': self.format_percent(100),
})
self._report_progress_status(s, join_nonempty(
'100%%',
with_fields(('total_bytes', 'of %(_total_bytes_str)s')),
with_fields(('elapsed', 'in %(_elapsed_str)s')),
delim=' '))
msg_template = '100%%'
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template += ' of %(_total_bytes_str)s'
if s.get('elapsed') is not None:
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template += ' in %(_elapsed_str)s'
s['_percent_str'] = self.format_percent(100)
self._report_progress_status(s, msg_template)
return
if s['status'] != 'downloading':
return
s.update({
'_eta_str': self.format_eta(s.get('eta')),
'_speed_str': self.format_speed(s.get('speed')),
'_percent_str': self.format_percent(try_call(
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
lambda: s['downloaded_bytes'] == 0 and 0)),
'_total_bytes_str': format_bytes(s.get('total_bytes')),
'_total_bytes_estimate_str': format_bytes(s.get('total_bytes_estimate')),
'_downloaded_bytes_str': format_bytes(s.get('downloaded_bytes')),
'_elapsed_str': self.format_seconds(s.get('elapsed')),
})
if s.get('eta') is not None:
s['_eta_str'] = self.format_eta(s['eta'])
else:
s['_eta_str'] = 'Unknown'
msg_template = with_fields(
('total_bytes', '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'),
('total_bytes_estimate', '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'),
('downloaded_bytes', 'elapsed', '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'),
('downloaded_bytes', '%(_downloaded_bytes_str)s at %(_speed_str)s'),
default='%(_percent_str)s at %(_speed_str)s ETA %(_eta_str)s')
if s.get('total_bytes') and s.get('downloaded_bytes') is not None:
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes'])
elif s.get('total_bytes_estimate') and s.get('downloaded_bytes') is not None:
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes_estimate'])
else:
if s.get('downloaded_bytes') == 0:
s['_percent_str'] = self.format_percent(0)
else:
s['_percent_str'] = 'Unknown %'
msg_template += with_fields(
('fragment_index', 'fragment_count', ' (frag %(fragment_index)s/%(fragment_count)s)'),
('fragment_index', ' (frag %(fragment_index)s)'))
if s.get('speed') is not None:
s['_speed_str'] = self.format_speed(s['speed'])
else:
s['_speed_str'] = 'Unknown speed'
if s.get('total_bytes') is not None:
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
msg_template = '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'
elif s.get('total_bytes_estimate') is not None:
s['_total_bytes_estimate_str'] = format_bytes(s['total_bytes_estimate'])
msg_template = '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'
else:
if s.get('downloaded_bytes') is not None:
s['_downloaded_bytes_str'] = format_bytes(s['downloaded_bytes'])
if s.get('elapsed'):
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'
else:
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s'
else:
msg_template = '%(_percent_str)s at %(_speed_str)s ETA %(_eta_str)s'
if s.get('fragment_index') and s.get('fragment_count'):
msg_template += ' (frag %(fragment_index)s/%(fragment_count)s)'
elif s.get('fragment_index'):
msg_template += ' (frag %(fragment_index)s)'
self._report_progress_status(s, msg_template)
def report_resuming_byte(self, resume_len):
@ -380,23 +390,14 @@ class FileDownloader:
def report_retry(self, err, count, retries):
"""Report retry in case of HTTP error 5xx"""
self.__to_screen(
self.to_screen(
'[download] Got server HTTP error: %s. Retrying (attempt %d of %s) ...'
% (error_to_compat_str(err), count, self.format_retries(retries)))
self.sleep_retry('http', count)
def report_unable_to_resume(self):
"""Report it was impossible to resume download."""
self.to_screen('[download] Unable to resume')
def sleep_retry(self, retry_type, count):
sleep_func = self.params.get('retry_sleep_functions', {}).get(retry_type)
delay = float_or_none(sleep_func(n=count - 1)) if sleep_func else None
if delay:
self.__to_screen(f'Sleeping {delay:.2f} seconds ...')
time.sleep(delay)
return sleep_func is not None
@staticmethod
def supports_manifest(manifest):
""" Whether the downloader can download the fragments from the manifest.

View File

@ -1,7 +1,7 @@
import time
from . import get_suitable_downloader
from .fragment import FragmentFD
from ..downloader import get_suitable_downloader
from ..utils import urljoin
@ -73,7 +73,6 @@ class DashSegmentsFD(FragmentFD):
yield {
'frag_index': frag_index,
'fragment_count': fragment.get('fragment_count'),
'index': i,
'url': fragment_url,
}

View File

@ -1,4 +1,3 @@
import enum
import os.path
import re
import subprocess
@ -6,8 +5,7 @@ import sys
import time
from .fragment import FragmentFD
from ..compat import functools # isort: split
from ..compat import compat_setenv
from ..compat import compat_setenv, compat_str
from ..postprocessor.ffmpeg import EXT_TO_OUT_FORMATS, FFmpegPostProcessor
from ..utils import (
Popen,
@ -26,15 +24,9 @@ from ..utils import (
)
class Features(enum.Enum):
TO_STDOUT = enum.auto()
MULTIPLE_FORMATS = enum.auto()
class ExternalFD(FragmentFD):
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps')
SUPPORTED_FEATURES = ()
_CAPTURE_STDERR = True
can_download_to_stdout = False
def real_download(self, filename, info_dict):
self.report_destination(filename)
@ -82,7 +74,7 @@ class ExternalFD(FragmentFD):
def EXE_NAME(cls):
return cls.get_basename()
@functools.cached_property
@property
def exe(self):
return self.EXE_NAME
@ -98,11 +90,9 @@ class ExternalFD(FragmentFD):
@classmethod
def supports(cls, info_dict):
return all((
not info_dict.get('to_stdout') or Features.TO_STDOUT in cls.SUPPORTED_FEATURES,
'+' not in info_dict['protocol'] or Features.MULTIPLE_FORMATS in cls.SUPPORTED_FEATURES,
all(proto in cls.SUPPORTED_PROTOCOLS for proto in info_dict['protocol'].split('+')),
))
return (
(cls.can_download_to_stdout or not info_dict.get('to_stdout'))
and info_dict['protocol'] in cls.SUPPORTED_PROTOCOLS)
@classmethod
def can_download(cls, info_dict, path=None):
@ -129,31 +119,29 @@ class ExternalFD(FragmentFD):
self._debug_cmd(cmd)
if 'fragments' not in info_dict:
_, stderr, returncode = Popen.run(
cmd, text=True, stderr=subprocess.PIPE if self._CAPTURE_STDERR else None)
if returncode and stderr:
self.to_stderr(stderr)
return returncode
p = Popen(cmd, stderr=subprocess.PIPE)
_, stderr = p.communicate_or_kill()
if p.returncode != 0:
self.to_stderr(stderr.decode('utf-8', 'replace'))
return p.returncode
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
count = 0
while count <= fragment_retries:
_, stderr, returncode = Popen.run(cmd, text=True, stderr=subprocess.PIPE)
if not returncode:
p = Popen(cmd, stderr=subprocess.PIPE)
_, stderr = p.communicate_or_kill()
if p.returncode == 0:
break
# TODO: Decide whether to retry based on error code
# https://aria2.github.io/manual/en/html/aria2c.html#exit-status
if stderr:
self.to_stderr(stderr)
self.to_stderr(stderr.decode('utf-8', 'replace'))
count += 1
if count <= fragment_retries:
self.to_screen(
'[%s] Got error. Retrying fragments (attempt %d of %s)...'
% (self.get_basename(), count, self.format_retries(fragment_retries)))
self.sleep_retry('fragment', count)
if count > fragment_retries:
if not skip_unavailable_fragments:
self.report_error('Giving up after %s fragment retries' % fragment_retries)
@ -182,7 +170,6 @@ class ExternalFD(FragmentFD):
class CurlFD(ExternalFD):
AVAILABLE_OPT = '-V'
_CAPTURE_STDERR = False # curl writes the progress to stderr
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '--location', '-o', tmpfilename, '--compressed']
@ -207,6 +194,16 @@ class CurlFD(ExternalFD):
cmd += ['--', info_dict['url']]
return cmd
def _call_downloader(self, tmpfilename, info_dict):
cmd = [encodeArgument(a) for a in self._make_cmd(tmpfilename, info_dict)]
self._debug_cmd(cmd)
# curl writes the progress to stderr so don't capture it.
p = Popen(cmd)
p.communicate_or_kill()
return p.returncode
class AxelFD(ExternalFD):
AVAILABLE_OPT = '-V'
@ -325,7 +322,7 @@ class HttpieFD(ExternalFD):
class FFmpegFD(ExternalFD):
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'm3u8', 'm3u8_native', 'rtsp', 'rtmp', 'rtmp_ffmpeg', 'mms', 'http_dash_segments')
SUPPORTED_FEATURES = (Features.TO_STDOUT, Features.MULTIPLE_FORMATS)
can_download_to_stdout = True
@classmethod
def available(cls, path=None):
@ -333,6 +330,10 @@ class FFmpegFD(ExternalFD):
# Fixme: This may be wrong when --ffmpeg-location is used
return FFmpegPostProcessor().available
@classmethod
def supports(cls, info_dict):
return all(proto in cls.SUPPORTED_PROTOCOLS for proto in info_dict['protocol'].split('+'))
def on_process_started(self, proc, stdin):
""" Override this in subclasses """
pass
@ -377,6 +378,13 @@ class FFmpegFD(ExternalFD):
# http://trac.ffmpeg.org/ticket/6125#comment:10
args += ['-seekable', '1' if seekable else '0']
# start_time = info_dict.get('start_time') or 0
# if start_time:
# args += ['-ss', compat_str(start_time)]
# end_time = info_dict.get('end_time')
# if end_time:
# args += ['-t', compat_str(end_time - start_time)]
http_headers = None
if info_dict.get('http_headers'):
youtubedl_headers = handle_youtubedl_headers(info_dict['http_headers'])
@ -434,31 +442,25 @@ class FFmpegFD(ExternalFD):
if isinstance(conn, list):
for entry in conn:
args += ['-rtmp_conn', entry]
elif isinstance(conn, str):
elif isinstance(conn, compat_str):
args += ['-rtmp_conn', conn]
start_time, end_time = info_dict.get('section_start') or 0, info_dict.get('section_end')
for i, url in enumerate(urls):
# We need to specify headers for each http input stream
# otherwise, it will only be applied to the first.
# https://github.com/yt-dlp/yt-dlp/issues/2696
if http_headers is not None and re.match(r'^https?://', url):
args += http_headers
if start_time:
args += ['-ss', str(start_time)]
if end_time:
args += ['-t', str(end_time - start_time)]
args += self._configuration_args((f'_i{i + 1}', '_i')) + ['-i', url]
if not (start_time or end_time) or not self.params.get('force_keyframes_at_cuts'):
args += ['-c', 'copy']
args += ['-c', 'copy']
if info_dict.get('requested_formats') or protocol == 'http_dash_segments':
for (i, fmt) in enumerate(info_dict.get('requested_formats') or [info_dict]):
stream_number = fmt.get('manifest_stream_number', 0)
args.extend(['-map', f'{i}:{stream_number}'])
if self.params.get('test', False):
args += ['-fs', str(self._TEST_FILE_SIZE)]
args += ['-fs', compat_str(self._TEST_FILE_SIZE)]
ext = info_dict['ext']
if protocol in ('m3u8', 'm3u8_native'):
@ -493,23 +495,24 @@ class FFmpegFD(ExternalFD):
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
self._debug_cmd(args)
with Popen(args, stdin=subprocess.PIPE, env=env) as proc:
if url in ('-', 'pipe:'):
self.on_process_started(proc, proc.stdin)
try:
retval = proc.wait()
except BaseException as e:
# subprocces.run would send the SIGKILL signal to ffmpeg and the
# mp4 file couldn't be played, but if we ask ffmpeg to quit it
# produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable
# files (see https://github.com/ytdl-org/youtube-dl/issues/8300).
if isinstance(e, KeyboardInterrupt) and sys.platform != 'win32' and url not in ('-', 'pipe:'):
proc.communicate_or_kill(b'q')
else:
proc.kill(timeout=None)
raise
return retval
proc = Popen(args, stdin=subprocess.PIPE, env=env)
if url in ('-', 'pipe:'):
self.on_process_started(proc, proc.stdin)
try:
retval = proc.wait()
except BaseException as e:
# subprocces.run would send the SIGKILL signal to ffmpeg and the
# mp4 file couldn't be played, but if we ask ffmpeg to quit it
# produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable
# files (see https://github.com/ytdl-org/youtube-dl/issues/8300).
if isinstance(e, KeyboardInterrupt) and sys.platform != 'win32' and url not in ('-', 'pipe:'):
proc.communicate_or_kill(b'q')
else:
proc.kill()
proc.wait()
raise
return retval
class AVconvFD(FFmpegFD):

View File

@ -391,10 +391,9 @@ class F4mFD(FragmentFD):
query.append(info_dict['extra_param_to_segment_url'])
url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query))
try:
success = self._download_fragment(ctx, url_parsed.geturl(), info_dict)
success, down_data = self._download_fragment(ctx, url_parsed.geturl(), info_dict)
if not success:
return False
down_data = self._read_fragment(ctx)
reader = FlvReader(down_data)
while True:
try:

View File

@ -23,7 +23,11 @@ class HttpQuietDownloader(HttpFD):
def to_screen(self, *args, **kargs):
pass
to_console_title = to_screen
console_title = to_screen
def report_retry(self, err, count, retries):
super().to_screen(
f'[download] Got server HTTP error: {err}. Retrying (attempt {count} of {self.format_retries(retries)}) ...')
class FragmentFD(FileDownloader):
@ -66,7 +70,6 @@ class FragmentFD(FileDownloader):
self.to_screen(
'\r[download] Got server HTTP error: %s. Retrying fragment %d (attempt %d of %s) ...'
% (error_to_compat_str(err), frag_index, count, self.format_retries(retries)))
self.sleep_retry('fragment', count)
def report_skip_fragment(self, frag_index, err=None):
err = f' {err};' if err else ''
@ -165,11 +168,18 @@ class FragmentFD(FileDownloader):
total_frags_str = 'unknown (live)'
self.to_screen(f'[{self.FD_NAME}] Total fragments: {total_frags_str}')
self.report_destination(ctx['filename'])
dl = HttpQuietDownloader(self.ydl, {
**self.params,
'noprogress': True,
'test': False,
})
dl = HttpQuietDownloader(
self.ydl,
{
'continuedl': self.params.get('continuedl', True),
'quiet': self.params.get('quiet'),
'noprogress': True,
'ratelimit': self.params.get('ratelimit'),
'retries': self.params.get('retries', 0),
'nopart': self.params.get('nopart', False),
'test': False,
}
)
tmpfilename = self.temp_name(ctx['filename'])
open_mode = 'wb'
resume_len = 0
@ -242,9 +252,6 @@ class FragmentFD(FileDownloader):
if s['status'] not in ('downloading', 'finished'):
return
if not total_frags and ctx.get('fragment_count'):
state['fragment_count'] = ctx['fragment_count']
if ctx_id is not None and s.get('ctx_id') != ctx_id:
return
@ -453,7 +460,6 @@ class FragmentFD(FileDownloader):
fatal, count = is_fatal(fragment.get('index') or (frag_index - 1)), 0
while count <= fragment_retries:
try:
ctx['fragment_count'] = fragment.get('fragment_count')
if self._download_fragment(ctx, fragment['url'], info_dict, headers):
break
return
@ -500,20 +506,12 @@ class FragmentFD(FileDownloader):
self.report_warning('The download speed shown is only of one thread. This is a known issue and patches are welcome')
with tpe or concurrent.futures.ThreadPoolExecutor(max_workers) as pool:
try:
for fragment, frag_index, frag_filename in pool.map(_download_fragment, fragments):
ctx.update({
'fragment_filename_sanitized': frag_filename,
'fragment_index': frag_index,
})
if not append_fragment(decrypt_fragment(fragment, self._read_fragment(ctx)), frag_index, ctx):
return False
except KeyboardInterrupt:
self._finish_multiline_status()
self.report_error(
'Interrupted by user. Waiting for all threads to shutdown...', is_error=False, tb=False)
pool.shutdown(wait=False)
raise
for fragment, frag_index, frag_filename in pool.map(_download_fragment, fragments):
ctx['fragment_filename_sanitized'] = frag_filename
ctx['fragment_index'] = frag_index
result = append_fragment(decrypt_fragment(fragment, self._read_fragment(ctx)), frag_index, ctx)
if not result:
return False
else:
for fragment in fragments:
if not interrupt_trigger[0]:

View File

@ -2,12 +2,12 @@ import binascii
import io
import re
from . import get_suitable_downloader
from .external import FFmpegFD
from .fragment import FragmentFD
from .. import webvtt
from ..compat import compat_urlparse
from ..dependencies import Cryptodome_AES
from ..downloader import get_suitable_downloader
from ..utils import bug_reports_message, parse_m3u8_attributes, update_url_query

View File

@ -136,18 +136,20 @@ class HttpFD(FileDownloader):
if has_range:
content_range = ctx.data.headers.get('Content-Range')
content_range_start, content_range_end, content_len = parse_http_range(content_range)
# Content-Range is present and matches requested Range, resume is possible
if range_start == content_range_start and (
if content_range_start is not None and range_start == content_range_start:
# Content-Range is present and matches requested Range, resume is possible
accept_content_len = (
# Non-chunked download
not ctx.chunk_size
# Chunked download and requested piece or
# its part is promised to be served
or content_range_end == range_end
or content_len < range_end):
ctx.content_len = content_len
if content_len or req_end:
ctx.data_len = min(content_len or req_end, req_end or content_len) - (req_start or 0)
return
or content_len < range_end)
if accept_content_len:
ctx.content_len = content_len
if content_len or req_end:
ctx.data_len = min(content_len or req_end, req_end or content_len) - (req_start or 0)
return
# Content-Range is either not present or invalid. Assuming remote webserver is
# trying to send the whole file, resume is not possible, so wiping the local file
# and performing entire redownload

View File

@ -1,7 +1,8 @@
import threading
from . import get_suitable_downloader
from .common import FileDownloader
from ..downloader import get_suitable_downloader
from ..extractor.niconico import NiconicoIE
from ..utils import sanitized_Request
@ -9,9 +10,8 @@ class NiconicoDmcFD(FileDownloader):
""" Downloading niconico douga from DMC with heartbeat """
def real_download(self, filename, info_dict):
from ..extractor.niconico import NiconicoIE
self.to_screen('[%s] Downloading from DMC' % self.FD_NAME)
ie = NiconicoIE(self.ydl)
info_dict, heartbeat_info_dict = ie._get_heartbeat_info(info_dict)

View File

@ -92,7 +92,8 @@ class RtmpFD(FileDownloader):
self.to_screen('')
return proc.wait()
except BaseException: # Including KeyboardInterrupt
proc.kill(timeout=None)
proc.kill()
proc.wait()
raise
url = info_dict['url']

View File

@ -3,6 +3,7 @@ import time
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..extractor.youtube import YoutubeBaseInfoExtractor as YT_BaseIE
from ..utils import RegexNotFoundError, dict_get, int_or_none, try_get
@ -25,9 +26,7 @@ class YoutubeLiveChatFD(FragmentFD):
'total_frags': None,
}
from ..extractor.youtube import YoutubeBaseInfoExtractor
ie = YoutubeBaseInfoExtractor(self.ydl)
ie = YT_BaseIE(self.ydl)
start_time = int(time.time() * 1000)

View File

@ -1,15 +1,32 @@
from ..compat.compat_utils import passthrough_module
import contextlib
import os
passthrough_module(__name__, '.extractors')
del passthrough_module
from ..utils import load_plugins
_LAZY_LOADER = False
if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
with contextlib.suppress(ImportError):
from .lazy_extractors import * # noqa: F403
from .lazy_extractors import _ALL_CLASSES
_LAZY_LOADER = True
if not _LAZY_LOADER:
from .extractors import * # noqa: F403
_ALL_CLASSES = [ # noqa: F811
klass
for name, klass in globals().items()
if name.endswith('IE') and name != 'GenericIE'
]
_ALL_CLASSES.append(GenericIE) # noqa: F405
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals())
_ALL_CLASSES = list(_PLUGIN_CLASSES.values()) + _ALL_CLASSES
def gen_extractor_classes():
""" Return a list of supported extractors.
The order does matter; the first extractor matched is the one handling the URL.
"""
from .extractors import _ALL_CLASSES
return _ALL_CLASSES
@ -22,12 +39,10 @@ def gen_extractors():
def list_extractor_classes(age_limit=None):
"""Return a list of extractors that are suitable for the given age, sorted by extractor name"""
from .generic import GenericIE
yield from sorted(filter(
lambda ie: ie.is_suitable(age_limit) and ie != GenericIE,
lambda ie: ie.is_suitable(age_limit) and ie != GenericIE, # noqa: F405
gen_extractor_classes()), key=lambda ie: ie.IE_NAME.lower())
yield GenericIE
yield GenericIE # noqa: F405
def list_extractors(age_limit=None):
@ -37,6 +52,4 @@ def list_extractors(age_limit=None):
def get_info_extractor(ie_name):
"""Returns the info extractor class with the given ie_name"""
from . import extractors
return getattr(extractors, f'{ie_name}IE')
return globals()[ie_name + 'IE']

File diff suppressed because it is too large Load Diff

View File

@ -16,7 +16,7 @@ from ..compat import compat_urllib_parse_urlparse, compat_urllib_request
from ..utils import (
ExtractorError,
bytes_to_intlist,
decode_base_n,
decode_base,
int_or_none,
intlist_to_bytes,
request_to_url,
@ -123,7 +123,7 @@ class AbemaLicenseHandler(compat_urllib_request.BaseHandler):
'Content-Type': 'application/json',
})
res = decode_base_n(license_response['k'], table=self.STRTABLE)
res = decode_base(license_response['k'], self.STRTABLE)
encvideokey = bytes_to_intlist(struct.pack('>QQ', res >> 64, res & 0xffffffffffffffff))
h = hmac.new(

View File

@ -0,0 +1,270 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
urlencode_postdata,
int_or_none,
str_or_none,
determine_ext,
)
from ..compat import compat_HTTPError
class AnimeLabBaseIE(InfoExtractor):
_LOGIN_URL = 'https://www.animelab.com/login'
_NETRC_MACHINE = 'animelab'
_LOGGED_IN = False
def _is_logged_in(self, login_page=None):
if not self._LOGGED_IN:
if not login_page:
login_page = self._download_webpage(self._LOGIN_URL, None, 'Downloading login page')
AnimeLabBaseIE._LOGGED_IN = 'Sign In' not in login_page
return self._LOGGED_IN
def _perform_login(self, username, password):
if self._is_logged_in():
return
login_form = {
'email': username,
'password': password,
}
try:
response = self._download_webpage(
self._LOGIN_URL, None, 'Logging in', 'Wrong login info',
data=urlencode_postdata(login_form),
headers={'Content-Type': 'application/x-www-form-urlencoded'})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
raise ExtractorError('Unable to log in (wrong credentials?)', expected=True)
raise
if not self._is_logged_in(response):
raise ExtractorError('Unable to login (cannot verify if logged in)')
def _real_initialize(self):
if not self._is_logged_in():
self.raise_login_required('Login is required to access any AnimeLab content')
class AnimeLabIE(AnimeLabBaseIE):
_VALID_URL = r'https?://(?:www\.)?animelab\.com/player/(?P<id>[^/]+)'
_TEST = {
'url': 'https://www.animelab.com/player/fullmetal-alchemist-brotherhood-episode-42',
'md5': '05bde4b91a5d1ff46ef5b94df05b0f7f',
'info_dict': {
'id': '383',
'ext': 'mp4',
'display_id': 'fullmetal-alchemist-brotherhood-episode-42',
'title': 'Fullmetal Alchemist: Brotherhood - Episode 42 - Signs of a Counteroffensive',
'description': 'md5:103eb61dd0a56d3dfc5dbf748e5e83f4',
'series': 'Fullmetal Alchemist: Brotherhood',
'episode': 'Signs of a Counteroffensive',
'episode_number': 42,
'duration': 1469,
'season': 'Season 1',
'season_number': 1,
'season_id': '38',
},
'params': {
# Ensure the same video is downloaded whether the user is premium or not
'format': '[format_id=21711_yeshardsubbed_ja-JP][height=480]',
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
# unfortunately we can get different URLs for the same formats
# e.g. if we are using a "free" account so no dubs available
# (so _remove_duplicate_formats is not effective)
# so we use a dictionary as a workaround
formats = {}
for language_option_url in ('https://www.animelab.com/player/%s/subtitles',
'https://www.animelab.com/player/%s/dubbed'):
actual_url = language_option_url % display_id
webpage = self._download_webpage(actual_url, display_id, 'Downloading URL ' + actual_url)
video_collection = self._parse_json(self._search_regex(r'new\s+?AnimeLabApp\.VideoCollection\s*?\((.*?)\);', webpage, 'AnimeLab VideoCollection'), display_id)
position = int_or_none(self._search_regex(r'playlistPosition\s*?=\s*?(\d+)', webpage, 'Playlist Position'))
raw_data = video_collection[position]['videoEntry']
video_id = str_or_none(raw_data['id'])
# create a title from many sources (while grabbing other info)
# TODO use more fallback sources to get some of these
series = raw_data.get('showTitle')
video_type = raw_data.get('videoEntryType', {}).get('name')
episode_number = raw_data.get('episodeNumber')
episode_name = raw_data.get('name')
title_parts = (series, video_type, episode_number, episode_name)
if None not in title_parts:
title = '%s - %s %s - %s' % title_parts
else:
title = episode_name
description = raw_data.get('synopsis') or self._og_search_description(webpage, default=None)
duration = int_or_none(raw_data.get('duration'))
thumbnail_data = raw_data.get('images', [])
thumbnails = []
for thumbnail in thumbnail_data:
for instance in thumbnail['imageInstances']:
image_data = instance.get('imageInfo', {})
thumbnails.append({
'id': str_or_none(image_data.get('id')),
'url': image_data.get('fullPath'),
'width': image_data.get('width'),
'height': image_data.get('height'),
})
season_data = raw_data.get('season', {}) or {}
season = str_or_none(season_data.get('name'))
season_number = int_or_none(season_data.get('seasonNumber'))
season_id = str_or_none(season_data.get('id'))
for video_data in raw_data['videoList']:
current_video_list = {}
current_video_list['language'] = video_data.get('language', {}).get('languageCode')
is_hardsubbed = video_data.get('hardSubbed')
for video_instance in video_data['videoInstances']:
httpurl = video_instance.get('httpUrl')
url = httpurl if httpurl else video_instance.get('rtmpUrl')
if url is None:
# this video format is unavailable to the user (not premium etc.)
continue
current_format = current_video_list.copy()
format_id_parts = []
format_id_parts.append(str_or_none(video_instance.get('id')))
if is_hardsubbed is not None:
if is_hardsubbed:
format_id_parts.append('yeshardsubbed')
else:
format_id_parts.append('nothardsubbed')
format_id_parts.append(current_format['language'])
format_id = '_'.join([x for x in format_id_parts if x is not None])
ext = determine_ext(url)
if ext == 'm3u8':
for format_ in self._extract_m3u8_formats(
url, video_id, m3u8_id=format_id, fatal=False):
formats[format_['format_id']] = format_
continue
elif ext == 'mpd':
for format_ in self._extract_mpd_formats(
url, video_id, mpd_id=format_id, fatal=False):
formats[format_['format_id']] = format_
continue
current_format['url'] = url
quality_data = video_instance.get('videoQuality')
if quality_data:
quality = quality_data.get('name') or quality_data.get('description')
else:
quality = None
height = None
if quality:
height = int_or_none(self._search_regex(r'(\d+)p?$', quality, 'Video format height', default=None))
if height is None:
self.report_warning('Could not get height of video')
else:
current_format['height'] = height
current_format['format_id'] = format_id
formats[current_format['format_id']] = current_format
formats = list(formats.values())
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'series': series,
'episode': episode_name,
'episode_number': int_or_none(episode_number),
'thumbnails': thumbnails,
'duration': duration,
'formats': formats,
'season': season,
'season_number': season_number,
'season_id': season_id,
}
class AnimeLabShowsIE(AnimeLabBaseIE):
_VALID_URL = r'https?://(?:www\.)?animelab\.com/shows/(?P<id>[^/]+)'
_TEST = {
'url': 'https://www.animelab.com/shows/attack-on-titan',
'info_dict': {
'id': '45',
'title': 'Attack on Titan',
'description': 'md5:989d95a2677e9309368d5cf39ba91469',
},
'playlist_count': 59,
'skip': 'All AnimeLab content requires authentication',
}
def _real_extract(self, url):
_BASE_URL = 'http://www.animelab.com'
_SHOWS_API_URL = '/api/videoentries/show/videos/'
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id, 'Downloading requested URL')
show_data_str = self._search_regex(r'({"id":.*}),\svideoEntry', webpage, 'AnimeLab show data')
show_data = self._parse_json(show_data_str, display_id)
show_id = str_or_none(show_data.get('id'))
title = show_data.get('name')
description = show_data.get('shortSynopsis') or show_data.get('longSynopsis')
entries = []
for season in show_data['seasons']:
season_id = season['id']
get_data = urlencode_postdata({
'seasonId': season_id,
'limit': 1000,
})
# despite using urlencode_postdata, we are sending a GET request
target_url = _BASE_URL + _SHOWS_API_URL + show_id + "?" + get_data.decode('utf-8')
response = self._download_webpage(
target_url,
None, 'Season id %s' % season_id)
season_data = self._parse_json(response, display_id)
for video_data in season_data['list']:
entries.append(self.url_result(
_BASE_URL + '/player/' + video_data['slug'], 'AnimeLab',
str_or_none(video_data.get('id')), video_data.get('name')
))
return {
'_type': 'playlist',
'id': show_id,
'title': title,
'description': description,
'entries': entries,
}
# TODO implement myqueue

View File

@ -442,10 +442,9 @@ class YoutubeWebArchiveIE(InfoExtractor):
'only_matching': True
},
]
_YT_INITIAL_DATA_RE = YoutubeBaseInfoExtractor._YT_INITIAL_DATA_RE
_YT_INITIAL_PLAYER_RESPONSE_RE = fr'''(?x)
(?:window\s*\[\s*["\']ytInitialPlayerResponse["\']\s*\]|ytInitialPlayerResponse)\s*=[(\s]*|
{YoutubeBaseInfoExtractor._YT_INITIAL_PLAYER_RESPONSE_RE}'''
_YT_INITIAL_DATA_RE = r'(?:(?:(?:window\s*\[\s*["\']ytInitialData["\']\s*\]|ytInitialData)\s*=\s*({.+?})\s*;)|%s)' % YoutubeBaseInfoExtractor._YT_INITIAL_DATA_RE
_YT_INITIAL_PLAYER_RESPONSE_RE = r'(?:(?:(?:window\s*\[\s*["\']ytInitialPlayerResponse["\']\s*\]|ytInitialPlayerResponse)\s*=[(\s]*({.+?})[)\s]*;)|%s)' % YoutubeBaseInfoExtractor._YT_INITIAL_PLAYER_RESPONSE_RE
_YT_INITIAL_BOUNDARY_RE = r'(?:(?:var\s+meta|</script|\n)|%s)' % YoutubeBaseInfoExtractor._YT_INITIAL_BOUNDARY_RE
_YT_DEFAULT_THUMB_SERVERS = ['i.ytimg.com'] # thumbnails most likely archived on these servers
_YT_ALL_THUMB_SERVERS = orderedSet(
@ -475,6 +474,11 @@ class YoutubeWebArchiveIE(InfoExtractor):
elif not isinstance(res, list) or len(res) != 0:
self.report_warning('Error while parsing CDX API response' + bug_reports_message())
def _extract_yt_initial_variable(self, webpage, regex, video_id, name):
return self._parse_json(self._search_regex(
(fr'{regex}\s*{self._YT_INITIAL_BOUNDARY_RE}',
regex), webpage, name, default='{}'), video_id, fatal=False)
def _extract_webpage_title(self, webpage):
page_title = self._html_extract_title(webpage, default='')
# YouTube video pages appear to always have either 'YouTube -' as prefix or '- YouTube' as suffix.
@ -484,11 +488,10 @@ class YoutubeWebArchiveIE(InfoExtractor):
def _extract_metadata(self, video_id, webpage):
search_meta = ((lambda x: self._html_search_meta(x, webpage, default=None)) if webpage else (lambda x: None))
player_response = self._search_json(
self._YT_INITIAL_PLAYER_RESPONSE_RE, webpage, 'initial player response',
video_id, default={})
initial_data = self._search_json(
self._YT_INITIAL_DATA_RE, webpage, 'initial data', video_id, default={})
player_response = self._extract_yt_initial_variable(
webpage, self._YT_INITIAL_PLAYER_RESPONSE_RE, video_id, 'initial player response') or {}
initial_data = self._extract_yt_initial_variable(
webpage, self._YT_INITIAL_DATA_RE, video_id, 'initial player response') or {}
initial_data_video = traverse_obj(
initial_data, ('contents', 'twoColumnWatchNextResults', 'results', 'results', 'contents', ..., 'videoPrimaryInfoRenderer'),

View File

@ -90,7 +90,7 @@ class ArnesIE(InfoExtractor):
'timestamp': parse_iso8601(video.get('creationTime')),
'channel': channel.get('name'),
'channel_id': channel_id,
'channel_url': format_field(channel_id, None, f'{self._BASE_URL}/?channel=%s'),
'channel_url': format_field(channel_id, template=f'{self._BASE_URL}/?channel=%s'),
'duration': float_or_none(video.get('duration'), 1000),
'view_count': int_or_none(video.get('views')),
'tags': video.get('hashtags'),

View File

@ -1,34 +0,0 @@
import re
from .common import InfoExtractor
class AtScaleConfEventIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atscaleconference\.com/events/(?P<id>[^/&$?]+)'
_TESTS = [{
'url': 'https://atscaleconference.com/events/data-scale-spring-2022/',
'playlist_mincount': 13,
'info_dict': {
'id': 'data-scale-spring-2022',
'title': 'Data @Scale Spring 2022',
'description': 'md5:7d7ca1c42ac9c6d8a785092a1aea4b55'
},
}, {
'url': 'https://atscaleconference.com/events/video-scale-2021/',
'playlist_mincount': 14,
'info_dict': {
'id': 'video-scale-2021',
'title': 'Video @Scale 2021',
'description': 'md5:7d7ca1c42ac9c6d8a785092a1aea4b55'
},
}]
def _real_extract(self, url):
id = self._match_id(url)
webpage = self._download_webpage(url, id)
return self.playlist_from_matches(
re.findall(r'data-url\s*=\s*"(https?://(?:www\.)?atscaleconference\.com/videos/[^"]+)"', webpage),
ie='Generic', playlist_id=id,
title=self._og_search_title(webpage), description=self._og_search_description(webpage))

View File

@ -41,7 +41,7 @@ class AWAANBaseIE(InfoExtractor):
'id': video_id,
'title': title,
'description': video_data.get('description_en') or video_data.get('description_ar'),
'thumbnail': format_field(img, None, 'http://admin.mangomolo.com/analytics/%s'),
'thumbnail': format_field(img, template='http://admin.mangomolo.com/analytics/%s'),
'duration': int_or_none(video_data.get('duration')),
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
'is_live': is_live,

View File

@ -24,7 +24,7 @@ class BellMediaIE(InfoExtractor):
)/.*?(?:\b(?:vid(?:eoid)?|clipId)=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
_TESTS = [{
'url': 'https://www.bnnbloomberg.ca/video/david-cockfield-s-top-picks~1403070',
'md5': '3e5b8e38370741d5089da79161646635',
'md5': '36d3ef559cfe8af8efe15922cd3ce950',
'info_dict': {
'id': '1403070',
'ext': 'flv',
@ -32,14 +32,6 @@ class BellMediaIE(InfoExtractor):
'description': 'md5:810f7f8c6a83ad5b48677c3f8e5bb2c3',
'upload_date': '20180525',
'timestamp': 1527288600,
'season_id': 73997,
'season': '2018',
'thumbnail': 'http://images2.9c9media.com/image_asset/2018_5_25_baf30cbd-b28d-4a18-9903-4bb8713b00f5_PNG_956x536.jpg',
'tags': [],
'categories': ['ETFs'],
'season_number': 8,
'duration': 272.038,
'series': 'Market Call Tonight',
},
}, {
'url': 'http://www.thecomedynetwork.ca/video/player?vid=923582',

View File

@ -677,11 +677,6 @@ class BilibiliAudioIE(BilibiliAudioBaseIE):
'vcodec': 'none'
}]
for a_format in formats:
a_format.setdefault('http_headers', {}).update({
'Referer': url,
})
song = self._call_api('song/info', au_id)
title = song['title']
statistic = song.get('statistic') or {}
@ -789,8 +784,7 @@ class BiliIntlBaseIE(InfoExtractor):
def json2srt(self, json):
data = '\n\n'.join(
f'{i + 1}\n{srt_subtitles_timecode(line["from"])} --> {srt_subtitles_timecode(line["to"])}\n{line["content"]}'
for i, line in enumerate(traverse_obj(json, (
'body', lambda _, l: l['content'] and l['from'] and l['to']))))
for i, line in enumerate(json['body']) if line.get('content'))
return data
def _get_subtitles(self, *, ep_id=None, aid=None):
@ -953,11 +947,12 @@ class BiliIntlIE(BiliIntlBaseIE):
video_id = ep_id or aid
webpage = self._download_webpage(url, video_id)
# Bstation layout
initial_data = (
self._search_json(r'window\.__INITIAL_(?:DATA|STATE)__\s*=', webpage, 'preload state', video_id, default={})
or self._search_nuxt_data(webpage, video_id, '__initialState', fatal=False, traverse=None))
video_data = traverse_obj(
initial_data, ('OgvVideo', 'epDetail'), ('UgcVideo', 'videoData'), ('ugc', 'archive'), expected_type=dict)
initial_data = self._parse_json(self._search_regex(
r'window\.__INITIAL_(?:DATA|STATE)__\s*=\s*({.+?});', webpage,
'preload state', default='{}'), video_id, fatal=False) or {}
video_data = (
traverse_obj(initial_data, ('OgvVideo', 'epDetail'), expected_type=dict)
or traverse_obj(initial_data, ('UgcVideo', 'videoData'), expected_type=dict) or {})
if season_id and not video_data:
# Non-Bstation layout, read through episode list
@ -965,7 +960,7 @@ class BiliIntlIE(BiliIntlBaseIE):
video_data = traverse_obj(season_json,
('sections', ..., 'episodes', lambda _, v: str(v['episode_id']) == ep_id),
expected_type=dict, get_all=False)
return self._extract_video_info(video_data or {}, ep_id=ep_id, aid=aid)
return self._extract_video_info(video_data, ep_id=ep_id, aid=aid)
class BiliIntlSeriesIE(BiliIntlBaseIE):

View File

@ -7,11 +7,13 @@ class BloombergIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bloomberg\.com/(?:[^/]+/)*(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bloomberg.com/news/videos/2021-09-14/apple-unveils-the-new-iphone-13-stock-doesn-t-move-much-video',
'url': 'http://www.bloomberg.com/news/videos/b/aaeae121-5949-481e-a1ce-4562db6f5df2',
# The md5 checksum changes
'info_dict': {
'id': 'V8cFcYMxTHaMcEiiYVr39A',
'id': 'qurhIVlJSB6hzkVi229d8g',
'ext': 'flv',
'title': 'Apple Unveils the New IPhone 13, Stock Doesn\'t Move Much',
'title': 'Shah\'s Presentation on Foreign-Exchange Strategies',
'description': 'md5:a8ba0302912d03d246979735c17d2761',
},
'params': {
'format': 'best[format_id^=hds]',
@ -55,7 +57,7 @@ class BloombergIE(InfoExtractor):
title = re.sub(': Video$', '', self._og_search_title(webpage))
embed_info = self._download_json(
'http://www.bloomberg.com/multimedia/api/embed?id=%s' % video_id, video_id)
'http://www.bloomberg.com/api/embed?id=%s' % video_id, video_id)
formats = []
for stream in embed_info['streams']:
stream_url = stream.get('url')

View File

@ -75,7 +75,6 @@ class CCCIE(InfoExtractor):
'thumbnail': event_data.get('thumb_url'),
'timestamp': parse_iso8601(event_data.get('date')),
'duration': int_or_none(event_data.get('length')),
'view_count': int_or_none(event_data.get('view_count')),
'tags': event_data.get('tags'),
'formats': formats,
}

View File

@ -11,7 +11,6 @@ import sys
import time
import xml.etree.ElementTree
from ..compat import functools, re # isort: split
from ..compat import (
compat_cookiejar_Cookie,
compat_cookies_SimpleCookie,
@ -26,6 +25,7 @@ from ..compat import (
compat_urllib_parse_urlencode,
compat_urllib_request,
compat_urlparse,
re,
)
from ..downloader import FileDownloader
from ..downloader.f4m import get_base_url, remove_encrypted_media
@ -35,7 +35,6 @@ from ..utils import (
ExtractorError,
GeoRestrictedError,
GeoUtils,
LenientJSONDecoder,
RegexNotFoundError,
UnsupportedError,
age_restricted,
@ -385,11 +384,6 @@ class InfoExtractor:
release_year: Year (YYYY) when the album was released.
composer: Composer of the piece
The following fields should only be set for clips that should be cut from the original video:
section_start: Start time of the section in seconds
section_end: End time of the section in seconds
Unless mentioned otherwise, the fields should be Unicode strings.
Unless mentioned otherwise, None is equivalent to absence of information.
@ -616,7 +610,8 @@ class InfoExtractor:
if ip_block:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block)
self.write_debug(f'Using fake IP {self._x_forwarded_for_ip} as X-Forwarded-For')
self._downloader.write_debug(
'[debug] Using fake IP %s as X-Forwarded-For' % self._x_forwarded_for_ip)
return
# Path 2: bypassing based on country code
@ -730,13 +725,6 @@ class InfoExtractor:
else:
return err.code in variadic(expected_status)
def _create_request(self, url_or_request, data=None, headers={}, query={}):
if isinstance(url_or_request, compat_urllib_request.Request):
return update_Request(url_or_request, data=data, headers=headers, query=query)
if query:
url_or_request = update_url_query(url_or_request, query)
return sanitized_Request(url_or_request, data, headers)
def _request_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, data=None, headers={}, query={}, expected_status=None):
"""
Return the response handle.
@ -768,8 +756,16 @@ class InfoExtractor:
if 'X-Forwarded-For' not in headers:
headers['X-Forwarded-For'] = self._x_forwarded_for_ip
if isinstance(url_or_request, compat_urllib_request.Request):
url_or_request = update_Request(
url_or_request, data=data, headers=headers, query=query)
else:
if query:
url_or_request = update_url_query(url_or_request, query)
if data is not None or headers:
url_or_request = sanitized_Request(url_or_request, data, headers)
try:
return self._downloader.urlopen(self._create_request(url_or_request, data, headers, query))
return self._downloader.urlopen(url_or_request)
except network_exceptions as err:
if isinstance(err, compat_urllib_error.HTTPError):
if self.__can_accept_status_code(err, expected_status):
@ -792,40 +788,12 @@ class InfoExtractor:
self.report_warning(errmsg)
return False
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True,
encoding=None, data=None, headers={}, query={}, expected_status=None):
def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None):
"""
Return a tuple (page content as string, URL handle).
Arguments:
url_or_request -- plain text URL as a string or
a compat_urllib_request.Requestobject
video_id -- Video/playlist/item identifier (string)
Keyword arguments:
note -- note printed before downloading (string)
errnote -- note printed in case of an error (string)
fatal -- flag denoting whether error should be considered fatal,
i.e. whether it should cause ExtractionError to be raised,
otherwise a warning will be reported and extraction continued
encoding -- encoding for a page content decoding, guessed automatically
when not explicitly specified
data -- POST data (bytes)
headers -- HTTP headers (dict)
query -- URL query (dict)
expected_status -- allows to accept failed HTTP requests (non 2xx
status code) by explicitly specifying a set of accepted status
codes. Can be any of the following entities:
- an integer type specifying an exact failed status code to
accept
- a list or a tuple of integer types specifying a list of
failed status codes to accept
- a callable accepting an actual failed status code and
returning True if it should be accepted
Note that this argument does not affect success status codes (2xx)
which are always accepted.
See _download_webpage docstring for arguments specification.
"""
# Strip hashes from the URL (#1038)
if isinstance(url_or_request, (compat_str, str)):
url_or_request = url_or_request.partition('#')[0]
@ -882,48 +850,140 @@ class InfoExtractor:
'Visit http://blocklist.rkn.gov.ru/ for a block reason.',
expected=True)
def _request_dump_filename(self, url, video_id):
basen = f'{video_id}_{url}'
trim_length = self.get_param('trim_file_name') or 240
if len(basen) > trim_length:
h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
basen = basen[:trim_length - len(h)] + h
filename = sanitize_filename(f'{basen}.dump', restricted=True)
# Working around MAX_PATH limitation on Windows (see
# http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
if compat_os_name == 'nt':
absfilepath = os.path.abspath(filename)
if len(absfilepath) > 259:
filename = fR'\\?\{absfilepath}'
return filename
def __decode_webpage(self, webpage_bytes, encoding, headers):
if not encoding:
encoding = self._guess_encoding_from_content(headers.get('Content-Type', ''), webpage_bytes)
try:
return webpage_bytes.decode(encoding, 'replace')
except LookupError:
return webpage_bytes.decode('utf-8', 'replace')
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True, prefix=None, encoding=None):
content_type = urlh.headers.get('Content-Type', '')
webpage_bytes = urlh.read()
if prefix is not None:
webpage_bytes = prefix + webpage_bytes
if not encoding:
encoding = self._guess_encoding_from_content(content_type, webpage_bytes)
if self.get_param('dump_intermediate_pages', False):
self.to_screen('Dumping request to ' + urlh.geturl())
dump = base64.b64encode(webpage_bytes).decode('ascii')
self._downloader.to_screen(dump)
if self.get_param('write_pages'):
filename = self._request_dump_filename(urlh.geturl(), video_id)
self.to_screen(f'Saving request to {filename}')
if self.get_param('write_pages', False):
basen = f'{video_id}_{urlh.geturl()}'
trim_length = self.get_param('trim_file_name') or 240
if len(basen) > trim_length:
h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
basen = basen[:trim_length - len(h)] + h
raw_filename = basen + '.dump'
filename = sanitize_filename(raw_filename, restricted=True)
self.to_screen('Saving request to ' + filename)
# Working around MAX_PATH limitation on Windows (see
# http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
if compat_os_name == 'nt':
absfilepath = os.path.abspath(filename)
if len(absfilepath) > 259:
filename = '\\\\?\\' + absfilepath
with open(filename, 'wb') as outf:
outf.write(webpage_bytes)
content = self.__decode_webpage(webpage_bytes, encoding, urlh.headers)
try:
content = webpage_bytes.decode(encoding, 'replace')
except LookupError:
content = webpage_bytes.decode('utf-8', 'replace')
self.__check_blocked(content)
return content
def _download_webpage(
self, url_or_request, video_id, note=None, errnote=None,
fatal=True, tries=1, timeout=5, encoding=None, data=None,
headers={}, query={}, expected_status=None):
"""
Return the data of the page as a string.
Arguments:
url_or_request -- plain text URL as a string or
a compat_urllib_request.Requestobject
video_id -- Video/playlist/item identifier (string)
Keyword arguments:
note -- note printed before downloading (string)
errnote -- note printed in case of an error (string)
fatal -- flag denoting whether error should be considered fatal,
i.e. whether it should cause ExtractionError to be raised,
otherwise a warning will be reported and extraction continued
tries -- number of tries
timeout -- sleep interval between tries
encoding -- encoding for a page content decoding, guessed automatically
when not explicitly specified
data -- POST data (bytes)
headers -- HTTP headers (dict)
query -- URL query (dict)
expected_status -- allows to accept failed HTTP requests (non 2xx
status code) by explicitly specifying a set of accepted status
codes. Can be any of the following entities:
- an integer type specifying an exact failed status code to
accept
- a list or a tuple of integer types specifying a list of
failed status codes to accept
- a callable accepting an actual failed status code and
returning True if it should be accepted
Note that this argument does not affect success status codes (2xx)
which are always accepted.
"""
success = False
try_count = 0
while success is False:
try:
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal,
encoding=encoding, data=data, headers=headers, query=query,
expected_status=expected_status)
success = True
except compat_http_client.IncompleteRead as e:
try_count += 1
if try_count >= tries:
raise e
self._sleep(timeout, video_id)
if res is False:
return res
else:
content, _ = res
return content
def _download_xml_handle(
self, url_or_request, video_id, note='Downloading XML',
errnote='Unable to download XML', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None):
"""
Return a tuple (xml as an xml.etree.ElementTree.Element, URL handle).
See _download_webpage docstring for arguments specification.
"""
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query,
expected_status=expected_status)
if res is False:
return res
xml_string, urlh = res
return self._parse_xml(
xml_string, video_id, transform_source=transform_source,
fatal=fatal), urlh
def _download_xml(
self, url_or_request, video_id,
note='Downloading XML', errnote='Unable to download XML',
transform_source=None, fatal=True, encoding=None,
data=None, headers={}, query={}, expected_status=None):
"""
Return the xml as an xml.etree.ElementTree.Element.
See _download_webpage docstring for arguments specification.
"""
res = self._download_xml_handle(
url_or_request, video_id, note=note, errnote=errnote,
transform_source=transform_source, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query,
expected_status=expected_status)
return res if res is False else res[0]
def _parse_xml(self, xml_string, video_id, transform_source=None, fatal=True):
if transform_source:
xml_string = transform_source(xml_string)
@ -936,126 +996,101 @@ class InfoExtractor:
else:
self.report_warning(errmsg + str(ve))
def _parse_json(self, json_string, video_id, transform_source=None, fatal=True, **parser_kwargs):
def _download_json_handle(
self, url_or_request, video_id, note='Downloading JSON metadata',
errnote='Unable to download JSON metadata', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None):
"""
Return a tuple (JSON object, URL handle).
See _download_webpage docstring for arguments specification.
"""
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query,
expected_status=expected_status)
if res is False:
return res
json_string, urlh = res
return self._parse_json(
json_string, video_id, transform_source=transform_source,
fatal=fatal), urlh
def _download_json(
self, url_or_request, video_id, note='Downloading JSON metadata',
errnote='Unable to download JSON metadata', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None):
"""
Return the JSON object as a dict.
See _download_webpage docstring for arguments specification.
"""
res = self._download_json_handle(
url_or_request, video_id, note=note, errnote=errnote,
transform_source=transform_source, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query,
expected_status=expected_status)
return res if res is False else res[0]
def _parse_json(self, json_string, video_id, transform_source=None, fatal=True):
if transform_source:
json_string = transform_source(json_string)
try:
return json.loads(
json_string, cls=LenientJSONDecoder, strict=False, transform_source=transform_source, **parser_kwargs)
return json.loads(json_string, strict=False)
except ValueError as ve:
errmsg = f'{video_id}: Failed to parse JSON'
errmsg = '%s: Failed to parse JSON ' % video_id
if fatal:
raise ExtractorError(errmsg, cause=ve)
else:
self.report_warning(f'{errmsg}: {ve}')
self.report_warning(errmsg + str(ve))
def _parse_socket_response_as_json(self, data, video_id, transform_source=None, fatal=True):
return self._parse_json(
data[data.find('{'):data.rfind('}') + 1],
video_id, transform_source, fatal)
def __create_download_methods(name, parser, note, errnote, return_value):
def parse(ie, content, *args, **kwargs):
if parser is None:
return content
# parser is fetched by name so subclasses can override it
return getattr(ie, parser)(content, *args, **kwargs)
def download_handle(self, url_or_request, video_id, note=note, errnote=errnote, transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None):
res = self._download_webpage_handle(
url_or_request, video_id, note=note, errnote=errnote, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query, expected_status=expected_status)
if res is False:
return res
content, urlh = res
return parse(self, content, video_id, transform_source=transform_source, fatal=fatal), urlh
def download_content(self, url_or_request, video_id, note=note, errnote=errnote, transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={}, expected_status=None):
if self.get_param('load_pages'):
url_or_request = self._create_request(url_or_request, data, headers, query)
filename = self._request_dump_filename(url_or_request.full_url, video_id)
self.to_screen(f'Loading request from {filename}')
try:
with open(filename, 'rb') as dumpf:
webpage_bytes = dumpf.read()
except OSError as e:
self.report_warning(f'Unable to load request from disk: {e}')
else:
content = self.__decode_webpage(webpage_bytes, encoding, url_or_request.headers)
return parse(self, content, video_id, transform_source, fatal)
kwargs = {
'note': note,
'errnote': errnote,
'transform_source': transform_source,
'fatal': fatal,
'encoding': encoding,
'data': data,
'headers': headers,
'query': query,
'expected_status': expected_status,
}
if parser is None:
kwargs.pop('transform_source')
# The method is fetched by name so subclasses can override _download_..._handle
res = getattr(self, download_handle.__name__)(url_or_request, video_id, **kwargs)
return res if res is False else res[0]
def impersonate(func, name, return_value):
func.__name__, func.__qualname__ = name, f'InfoExtractor.{name}'
func.__doc__ = f'''
@param transform_source Apply this transformation before parsing
@returns {return_value}
See _download_webpage_handle docstring for other arguments specification
'''
impersonate(download_handle, f'_download_{name}_handle', f'({return_value}, URL handle)')
impersonate(download_content, f'_download_{name}', f'{return_value}')
return download_handle, download_content
_download_xml_handle, _download_xml = __create_download_methods(
'xml', '_parse_xml', 'Downloading XML', 'Unable to download XML', 'xml as an xml.etree.ElementTree.Element')
_download_json_handle, _download_json = __create_download_methods(
'json', '_parse_json', 'Downloading JSON metadata', 'Unable to download JSON metadata', 'JSON object as a dict')
_download_socket_json_handle, _download_socket_json = __create_download_methods(
'socket_json', '_parse_socket_response_as_json', 'Polling socket', 'Unable to poll socket', 'JSON object as a dict')
__download_webpage = __create_download_methods('webpage', None, None, None, 'data of the page as a string')[1]
def _download_webpage(
self, url_or_request, video_id, note=None, errnote=None,
fatal=True, tries=1, timeout=NO_DEFAULT, *args, **kwargs):
def _download_socket_json_handle(
self, url_or_request, video_id, note='Polling socket',
errnote='Unable to poll socket', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None):
"""
Return the data of the page as a string.
Return a tuple (JSON object, URL handle).
Keyword arguments:
tries -- number of tries
timeout -- sleep interval between tries
See _download_webpage_handle docstring for other arguments specification.
See _download_webpage docstring for arguments specification.
"""
res = self._download_webpage_handle(
url_or_request, video_id, note, errnote, fatal=fatal,
encoding=encoding, data=data, headers=headers, query=query,
expected_status=expected_status)
if res is False:
return res
webpage, urlh = res
return self._parse_socket_response_as_json(
webpage, video_id, transform_source=transform_source,
fatal=fatal), urlh
R''' # NB: These are unused; should they be deprecated?
if tries != 1:
self._downloader.deprecation_warning('tries argument is deprecated in InfoExtractor._download_webpage')
if timeout is NO_DEFAULT:
timeout = 5
else:
self._downloader.deprecation_warning('timeout argument is deprecated in InfoExtractor._download_webpage')
'''
def _download_socket_json(
self, url_or_request, video_id, note='Polling socket',
errnote='Unable to poll socket', transform_source=None,
fatal=True, encoding=None, data=None, headers={}, query={},
expected_status=None):
"""
Return the JSON object as a dict.
try_count = 0
while True:
try:
return self.__download_webpage(url_or_request, video_id, note, errnote, None, fatal, *args, **kwargs)
except compat_http_client.IncompleteRead as e:
try_count += 1
if try_count >= tries:
raise e
self._sleep(timeout, video_id)
See _download_webpage docstring for arguments specification.
"""
res = self._download_socket_json_handle(
url_or_request, video_id, note=note, errnote=errnote,
transform_source=transform_source, fatal=fatal, encoding=encoding,
data=data, headers=headers, query=query,
expected_status=expected_status)
return res if res is False else res[0]
def report_warning(self, msg, video_id=None, *args, only_once=False, **kwargs):
idstr = format_field(video_id, None, '%s: ')
idstr = format_field(video_id, template='%s: ')
msg = f'[{self.IE_NAME}] {idstr}{msg}'
if only_once:
if f'WARNING: {msg}' in self._printed_messages:
@ -1101,7 +1136,7 @@ class InfoExtractor:
self.get_param('ignore_no_formats_error') or self.get_param('wait_for_video')):
self.report_warning(msg)
return
msg += format_field(self._login_hint(method), None, '. %s')
msg += format_field(self._login_hint(method), template='. %s')
raise ExtractorError(msg, expected=True)
def raise_geo_restricted(
@ -1193,33 +1228,6 @@ class InfoExtractor:
self.report_warning('unable to extract %s' % _name + bug_reports_message())
return None
def _search_json(self, start_pattern, string, name, video_id, *, end_pattern='',
contains_pattern='(?s:.+)', fatal=True, default=NO_DEFAULT, **kwargs):
"""Searches string for the JSON object specified by start_pattern"""
# NB: end_pattern is only used to reduce the size of the initial match
if default is NO_DEFAULT:
default, has_default = {}, False
else:
fatal, has_default = False, True
json_string = self._search_regex(
rf'{start_pattern}\s*(?P<json>{{\s*{contains_pattern}\s*}})\s*{end_pattern}',
string, name, group='json', fatal=fatal, default=None if has_default else NO_DEFAULT)
if not json_string:
return default
_name = self._downloader._format_err(name, self._downloader.Styles.EMPHASIS)
try:
return self._parse_json(json_string, video_id, ignore_extra=True, **kwargs)
except ExtractorError as e:
if fatal:
raise ExtractorError(
f'Unable to extract {_name} - Failed to parse JSON', cause=e.cause, video_id=video_id)
elif not has_default:
self.report_warning(
f'Unable to extract {_name} - Failed to parse JSON: {e}', video_id=video_id)
return default
def _html_search_regex(self, pattern, string, name, default=NO_DEFAULT, fatal=True, flags=0, group=None):
"""
Like _search_regex, but strips HTML tags and unescapes entities.
@ -1443,10 +1451,6 @@ class InfoExtractor:
'ViewAction': 'view',
}
def is_type(e, *expected_types):
type = variadic(traverse_obj(e, '@type'))
return any(x in type for x in expected_types)
def extract_interaction_type(e):
interaction_type = e.get('interactionType')
if isinstance(interaction_type, dict):
@ -1460,7 +1464,9 @@ class InfoExtractor:
if not isinstance(interaction_statistic, list):
return
for is_e in interaction_statistic:
if not is_type(is_e, 'InteractionCounter'):
if not isinstance(is_e, dict):
continue
if is_e.get('@type') != 'InteractionCounter':
continue
interaction_type = extract_interaction_type(is_e)
if not interaction_type:
@ -1497,10 +1503,10 @@ class InfoExtractor:
info['chapters'] = chapters
def extract_video_object(e):
assert is_type(e, 'VideoObject')
assert e['@type'] == 'VideoObject'
author = e.get('author')
info.update({
'url': traverse_obj(e, 'contentUrl', 'embedUrl', expected_type=url_or_none),
'url': url_or_none(e.get('contentUrl')),
'title': unescapeHTML(e.get('name')),
'description': unescapeHTML(e.get('description')),
'thumbnails': [{'url': url}
@ -1513,7 +1519,7 @@ class InfoExtractor:
# however some websites are using 'Text' type instead.
# 1. https://schema.org/VideoObject
'uploader': author.get('name') if isinstance(author, dict) else author if isinstance(author, compat_str) else None,
'filesize': int_or_none(float_or_none(e.get('contentSize'))),
'filesize': float_or_none(e.get('contentSize')),
'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')),
'height': int_or_none(e.get('height')),
@ -1529,12 +1535,13 @@ class InfoExtractor:
if at_top_level and set(e.keys()) == {'@context', '@graph'}:
traverse_json_ld(variadic(e['@graph'], allowed_types=(dict,)), at_top_level=False)
break
if expected_type is not None and not is_type(e, expected_type):
item_type = e.get('@type')
if expected_type is not None and expected_type != item_type:
continue
rating = traverse_obj(e, ('aggregateRating', 'ratingValue'), expected_type=float_or_none)
if rating is not None:
info['average_rating'] = rating
if is_type(e, 'TVEpisode', 'Episode'):
if item_type in ('TVEpisode', 'Episode'):
episode_name = unescapeHTML(e.get('name'))
info.update({
'episode': episode_name,
@ -1544,39 +1551,37 @@ class InfoExtractor:
if not info.get('title') and episode_name:
info['title'] = episode_name
part_of_season = e.get('partOfSeason')
if is_type(part_of_season, 'TVSeason', 'Season', 'CreativeWorkSeason'):
if isinstance(part_of_season, dict) and part_of_season.get('@type') in ('TVSeason', 'Season', 'CreativeWorkSeason'):
info.update({
'season': unescapeHTML(part_of_season.get('name')),
'season_number': int_or_none(part_of_season.get('seasonNumber')),
})
part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
if is_type(part_of_series, 'TVSeries', 'Series', 'CreativeWorkSeries'):
if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
info['series'] = unescapeHTML(part_of_series.get('name'))
elif is_type(e, 'Movie'):
elif item_type == 'Movie':
info.update({
'title': unescapeHTML(e.get('name')),
'description': unescapeHTML(e.get('description')),
'duration': parse_duration(e.get('duration')),
'timestamp': unified_timestamp(e.get('dateCreated')),
})
elif is_type(e, 'Article', 'NewsArticle'):
elif item_type in ('Article', 'NewsArticle'):
info.update({
'timestamp': parse_iso8601(e.get('datePublished')),
'title': unescapeHTML(e.get('headline')),
'description': unescapeHTML(e.get('articleBody') or e.get('description')),
})
if is_type(traverse_obj(e, ('video', 0)), 'VideoObject'):
if traverse_obj(e, ('video', 0, '@type')) == 'VideoObject':
extract_video_object(e['video'][0])
elif is_type(traverse_obj(e, ('subjectOf', 0)), 'VideoObject'):
extract_video_object(e['subjectOf'][0])
elif is_type(e, 'VideoObject'):
elif item_type == 'VideoObject':
extract_video_object(e)
if expected_type is None:
continue
else:
break
video = e.get('video')
if is_type(video, 'VideoObject'):
if isinstance(video, dict) and video.get('@type') == 'VideoObject':
extract_video_object(video)
if expected_type is None:
continue
@ -1593,13 +1598,15 @@ class InfoExtractor:
webpage, 'next.js data', fatal=fatal, **kw),
video_id, transform_source=transform_source, fatal=fatal)
def _search_nuxt_data(self, webpage, video_id, context_name='__NUXT__', *, fatal=True, traverse=('data', 0)):
"""Parses Nuxt.js metadata. This works as long as the function __NUXT__ invokes is a pure function"""
def _search_nuxt_data(self, webpage, video_id, context_name='__NUXT__'):
''' Parses Nuxt.js metadata. This works as long as the function __NUXT__ invokes is a pure function. '''
# not all website do this, but it can be changed
# https://stackoverflow.com/questions/67463109/how-to-change-or-hide-nuxt-and-nuxt-keyword-in-page-source
rectx = re.escape(context_name)
FUNCTION_RE = r'\(function\((?P<arg_keys>.*?)\){return\s+(?P<js>{.*?})\s*;?\s*}\((?P<arg_vals>.*?)\)'
js, arg_keys, arg_vals = self._search_regex(
(rf'<script>\s*window\.{rectx}={FUNCTION_RE}\s*\)\s*;?\s*</script>', rf'{rectx}\(.*?{FUNCTION_RE}'),
webpage, context_name, group=('js', 'arg_keys', 'arg_vals'), fatal=fatal)
(r'<script>window\.%s=\(function\((?P<arg_keys>.*?)\)\{return\s(?P<js>\{.*?\})\}\((?P<arg_vals>.+?)\)\);?</script>' % rectx,
r'%s\(.*?\(function\((?P<arg_keys>.*?)\)\{return\s(?P<js>\{.*?\})\}\((?P<arg_vals>.*?)\)' % rectx),
webpage, context_name, group=['js', 'arg_keys', 'arg_vals'])
args = dict(zip(arg_keys.split(','), arg_vals.split(',')))
@ -1607,8 +1614,7 @@ class InfoExtractor:
if val in ('undefined', 'void 0'):
args[key] = 'null'
ret = self._parse_json(js, video_id, transform_source=functools.partial(js_to_json, vars=args), fatal=fatal)
return traverse_obj(ret, traverse) or {}
return self._parse_json(js_to_json(js, args), video_id)['data'][0]
@staticmethod
def _hidden_inputs(html):
@ -3184,8 +3190,7 @@ class InfoExtractor:
return f
return {}
def _media_formats(src, cur_media_type, type_info=None):
type_info = type_info or {}
def _media_formats(src, cur_media_type, type_info={}):
full_url = absolute_url(src)
ext = type_info.get('ext') or determine_ext(full_url)
if ext == 'm3u8':
@ -3203,7 +3208,6 @@ class InfoExtractor:
formats = [{
'url': full_url,
'vcodec': 'none' if cur_media_type == 'audio' else None,
'ext': ext,
}]
return is_plain_url, formats
@ -3230,8 +3234,7 @@ class InfoExtractor:
media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src'))
if src:
f = parse_content_type(media_attributes.get('type'))
_, formats = _media_formats(src, media_type, f)
_, formats = _media_formats(src, media_type)
media_info['formats'].extend(formats)
media_info['thumbnail'] = absolute_url(media_attributes.get('poster'))
if media_content:
@ -3599,7 +3602,9 @@ class InfoExtractor:
def _get_cookies(self, url):
""" Return a compat_cookies_SimpleCookie with the cookies for the url """
return compat_cookies_SimpleCookie(self._downloader._calc_cookies(url))
req = sanitized_Request(url)
self._downloader.cookiejar.add_cookie_header(req)
return compat_cookies_SimpleCookie(req.get_header('Cookie'))
def _apply_first_set_cookie_header(self, url_handle, cookie):
"""
@ -3743,7 +3748,7 @@ class InfoExtractor:
def _get_automatic_captions(self, *args, **kwargs):
raise NotImplementedError('This method must be implemented by subclasses')
@functools.cached_property
@property
def _cookies_passed(self):
"""Whether cookies have been passed to YoutubeDL"""
return self.get_param('cookiefile') is not None or self.get_param('cookiesfrombrowser') is not None

View File

@ -728,12 +728,11 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
headers={
'Authorization': auth_response['token_type'] + ' ' + auth_response['access_token']
})
cms = traverse_obj(policy_response, 'cms_beta', 'cms')
bucket = cms['bucket']
bucket = policy_response['cms']['bucket']
params = {
'Policy': cms['policy'],
'Signature': cms['signature'],
'Key-Pair-Id': cms['key_pair_id']
'Policy': policy_response['cms']['policy'],
'Signature': policy_response['cms']['signature'],
'Key-Pair-Id': policy_response['cms']['key_pair_id']
}
locale = traverse_obj(initial_state, ('localization', 'locale'))
if locale:

View File

@ -23,11 +23,6 @@ class CuriosityStreamBaseIE(InfoExtractor):
def _call_api(self, path, video_id, query=None):
headers = {}
if not self._auth_token:
auth_cookie = self._get_cookies('https://curiositystream.com').get('auth_token')
if auth_cookie:
self.write_debug('Obtained auth_token cookie')
self._auth_token = auth_cookie.value
if self._auth_token:
headers['X-Auth-Token'] = self._auth_token
result = self._download_json(

View File

@ -5,15 +5,13 @@ import re
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
ExtractorError,
OnDemandPagedList,
age_restricted,
clean_html,
ExtractorError,
int_or_none,
traverse_obj,
OnDemandPagedList,
try_get,
unescapeHTML,
unsmuggle_url,
urlencode_postdata,
)
@ -222,7 +220,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
return urls
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url)
video_id, playlist_id = self._match_valid_url(url).groups()
if playlist_id:
@ -255,7 +252,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
metadata = self._download_json(
'https://www.dailymotion.com/player/metadata/video/' + xid,
xid, 'Downloading metadata JSON',
query=traverse_obj(smuggled_data, 'query') or {'app': 'com.dailymotion.neon'})
query={'app': 'com.dailymotion.neon'})
error = metadata.get('error')
if error:

View File

@ -1,114 +0,0 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
float_or_none,
join_nonempty,
traverse_obj,
url_or_none,
)
class DailyWireBaseIE(InfoExtractor):
_JSON_PATH = {
'episode': ('props', 'pageProps', 'episodeData', 'episode'),
'videos': ('props', 'pageProps', 'videoData', 'video'),
'podcasts': ('props', 'pageProps', 'episode'),
}
def _get_json(self, url):
sites_type, slug = self._match_valid_url(url).group('sites_type', 'id')
json_data = self._search_nextjs_data(self._download_webpage(url, slug), slug)
return slug, traverse_obj(json_data, self._JSON_PATH[sites_type])
class DailyWireIE(DailyWireBaseIE):
_VALID_URL = r'https?://(?:www\.)dailywire(?:\.com)/(?P<sites_type>episode|videos)/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://www.dailywire.com/episode/1-fauci',
'info_dict': {
'id': 'ckzsl50xnqpy30850in3v4bu7',
'ext': 'mp4',
'display_id': '1-fauci',
'title': '1. Fauci',
'description': 'md5:9df630347ef85081b7e97dd30bc22853',
'thumbnail': 'https://daily-wire-production.imgix.net/episodes/ckzsl50xnqpy30850in3v4bu7/ckzsl50xnqpy30850in3v4bu7-1648237399554.jpg',
'creator': 'Caroline Roberts',
'series_id': 'ckzplm0a097fn0826r2vc3j7h',
'series': 'China: The Enemy Within',
}
}, {
'url': 'https://www.dailywire.com/episode/ep-124-bill-maher',
'info_dict': {
'id': 'cl0ngbaalplc80894sfdo9edf',
'ext': 'mp3',
'display_id': 'ep-124-bill-maher',
'title': 'Ep. 124 - Bill Maher',
'thumbnail': 'https://daily-wire-production.imgix.net/episodes/cl0ngbaalplc80894sfdo9edf/cl0ngbaalplc80894sfdo9edf-1647065568518.jpg',
'creator': 'Caroline Roberts',
'description': 'md5:adb0de584bcfa9c41374999d9e324e98',
'series_id': 'cjzvep7270hp00786l9hwccob',
'series': 'The Sunday Special',
}
}, {
'url': 'https://www.dailywire.com/videos/the-hyperions',
'only_matching': True,
}]
def _real_extract(self, url):
slug, episode_info = self._get_json(url)
urls = traverse_obj(
episode_info, (('segments', 'videoUrl'), ..., ('video', 'audio')), expected_type=url_or_none)
formats, subtitles = [], {}
for url in urls:
if determine_ext(url) != 'm3u8':
formats.append({'url': url})
continue
format_, subs_ = self._extract_m3u8_formats_and_subtitles(url, slug)
formats.extend(format_)
self._merge_subtitles(subs_, target=subtitles)
self._sort_formats(formats)
return {
'id': episode_info['id'],
'display_id': slug,
'title': traverse_obj(episode_info, 'title', 'name'),
'description': episode_info.get('description'),
'creator': join_nonempty(('createdBy', 'firstName'), ('createdBy', 'lastName'), from_dict=episode_info, delim=' '),
'duration': float_or_none(episode_info.get('duration')),
'is_live': episode_info.get('isLive'),
'thumbnail': traverse_obj(episode_info, 'thumbnail', 'image', expected_type=url_or_none),
'formats': formats,
'subtitles': subtitles,
'series_id': traverse_obj(episode_info, ('show', 'id')),
'series': traverse_obj(episode_info, ('show', 'name')),
}
class DailyWirePodcastIE(DailyWireBaseIE):
_VALID_URL = r'https?://(?:www\.)dailywire(?:\.com)/(?P<sites_type>podcasts)/(?P<podcaster>[\w-]+/(?P<id>[\w-]+))'
_TESTS = [{
'url': 'https://www.dailywire.com/podcasts/morning-wire/get-ready-for-recession-6-15-22',
'info_dict': {
'id': 'cl4f01d0w8pbe0a98ydd0cfn1',
'ext': 'm4a',
'display_id': 'get-ready-for-recession-6-15-22',
'title': 'Get Ready for Recession | 6.15.22',
'description': 'md5:c4afbadda4e1c38a4496f6d62be55634',
'thumbnail': 'https://daily-wire-production.imgix.net/podcasts/ckx4otgd71jm508699tzb6hf4-1639506575562.jpg',
'duration': 900.117667,
}
}]
def _real_extract(self, url):
slug, episode_info = self._get_json(url)
audio_id = traverse_obj(episode_info, 'audioMuxPlaybackId', 'VUsAipTrBVSgzw73SpC2DAJD401TYYwEp')
return {
'id': episode_info['id'],
'url': f'https://stream.media.dailywire.com/{audio_id}/audio.m4a',
'display_id': slug,
'title': episode_info.get('title'),
'duration': float_or_none(episode_info.get('duration')),
'thumbnail': episode_info.get('thumbnail'),
'description': episode_info.get('description'),
}

View File

@ -86,7 +86,7 @@ class DigitalConcertHallIE(InfoExtractor):
})
m3u8_url = traverse_obj(
stream_info, ('channel', lambda k, _: k.startswith('vod_mixed'), 'stream', 0, 'url'), get_all=False)
stream_info, ('channel', lambda x: x.startswith('vod_mixed'), 'stream', 0, 'url'), get_all=False)
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False)
self._sort_formats(formats)

View File

@ -53,8 +53,8 @@ class DropboxIE(InfoExtractor):
else:
raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
info_json = self._search_json(r'InitReact\.mountComponent\(.*?,', webpage, 'mountComponent', video_id,
contains_pattern=r'.+?"preview".+?', end_pattern=r'\)')['props']
json_string = self._html_search_regex(r'InitReact\.mountComponent\(.*?,\s*(\{.+\})\s*?\)', webpage, 'Info JSON')
info_json = self._parse_json(json_string, video_id).get('props')
transcode_url = traverse_obj(info_json, ((None, 'preview'), 'file', 'preview', 'content', 'transcode_url'), get_all=False)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(transcode_url, video_id)

View File

@ -1,8 +1,8 @@
from .common import InfoExtractor
from .vimeo import VHXEmbedIE
from ..utils import (
ExtractorError,
clean_html,
ExtractorError,
get_element_by_class,
get_element_by_id,
get_elements_by_class,
@ -96,12 +96,11 @@ class DropoutIE(InfoExtractor):
def _login(self, display_id):
username, password = self._get_login_info()
if not username:
return True
if not (username and password):
self.raise_login_required(method='password')
response = self._download_webpage(
self._LOGIN_URL, display_id, note='Logging in', fatal=False,
data=urlencode_postdata({
self._LOGIN_URL, display_id, note='Logging in', data=urlencode_postdata({
'email': username,
'password': password,
'authenticity_token': self._get_authenticity_token(display_id),
@ -111,25 +110,19 @@ class DropoutIE(InfoExtractor):
user_has_subscription = self._search_regex(
r'user_has_subscription:\s*["\'](.+?)["\']', response, 'subscription status', default='none')
if user_has_subscription.lower() == 'true':
return
return response
elif user_has_subscription.lower() == 'false':
return 'Account is not subscribed'
raise ExtractorError('Account is not subscribed')
else:
return 'Incorrect username/password'
raise ExtractorError('Incorrect username/password')
def _real_extract(self, url):
display_id = self._match_id(url)
login_err, webpage = False, ''
try:
login_err = self._login(display_id)
webpage = self._download_webpage(url, display_id)
self._login(display_id)
webpage = self._download_webpage(url, display_id, note='Downloading video webpage')
finally:
if not login_err:
self._download_webpage('https://www.dropout.tv/logout', display_id, note='Logging out', fatal=False)
elif '<div id="watch-unauthorized"' in webpage:
if login_err is True:
self.raise_login_required(method='password')
raise ExtractorError(login_err, expected=True)
self._download_webpage('https://www.dropout.tv/logout', display_id, note='Logging out', fatal=False)
embed_url = self._search_regex(r'embed_url:\s*["\'](.+?)["\']', webpage, 'embed url')
thumbnail = self._og_search_thumbnail(webpage)

View File

@ -51,39 +51,31 @@ def _get_element_by_tag_and_attrib(html, tag=None, attribute=None, value=None, e
class DubokuIE(InfoExtractor):
IE_NAME = 'duboku'
IE_DESC = 'www.duboku.io'
IE_DESC = 'www.duboku.co'
_VALID_URL = r'(?:https?://[^/]+\.duboku\.io/vodplay/)(?P<id>[0-9]+-[0-9-]+)\.html.*'
_VALID_URL = r'(?:https?://[^/]+\.duboku\.co/vodplay/)(?P<id>[0-9]+-[0-9-]+)\.html.*'
_TESTS = [{
'url': 'https://w.duboku.io/vodplay/1575-1-1.html',
'url': 'https://www.duboku.co/vodplay/1575-1-1.html',
'info_dict': {
'id': '1575-1-1',
'ext': 'mp4',
'ext': 'ts',
'series': '白色月光',
'title': 'contains:白色月光',
'season_number': 1,
'episode_number': 1,
'season': 'Season 1',
'episode_id': '1',
'season_id': '1',
'episode': 'Episode 1',
},
'params': {
'skip_download': 'm3u8 download',
},
}, {
'url': 'https://w.duboku.io/vodplay/1588-1-1.html',
'url': 'https://www.duboku.co/vodplay/1588-1-1.html',
'info_dict': {
'id': '1588-1-1',
'ext': 'mp4',
'ext': 'ts',
'series': '亲爱的自己',
'title': 'contains:第1集',
'title': 'contains:预告片',
'season_number': 1,
'episode_number': 1,
'episode': 'Episode 1',
'season': 'Season 1',
'episode_id': '1',
'season_id': '1',
},
'params': {
'skip_download': 'm3u8 download',
@ -99,7 +91,7 @@ class DubokuIE(InfoExtractor):
season_id = temp[1]
episode_id = temp[2]
webpage_url = 'https://w.duboku.io/vodplay/%s.html' % video_id
webpage_url = 'https://www.duboku.co/vodplay/%s.html' % video_id
webpage_html = self._download_webpage(webpage_url, video_id)
# extract video url
@ -132,13 +124,12 @@ class DubokuIE(InfoExtractor):
data_from = player_data.get('from')
# if it is an embedded iframe, maybe it's an external source
headers = {'Referer': webpage_url}
if data_from == 'iframe':
# use _type url_transparent to retain the meaningful details
# of the video.
return {
'_type': 'url_transparent',
'url': smuggle_url(data_url, {'http_headers': headers}),
'url': smuggle_url(data_url, {'http_headers': {'Referer': webpage_url}}),
'id': video_id,
'title': title,
'series': series_title,
@ -148,7 +139,7 @@ class DubokuIE(InfoExtractor):
'episode_id': episode_id,
}
formats = self._extract_m3u8_formats(data_url, video_id, 'mp4', headers=headers)
formats = self._extract_m3u8_formats(data_url, video_id, 'mp4')
return {
'id': video_id,
@ -159,29 +150,36 @@ class DubokuIE(InfoExtractor):
'episode_number': int_or_none(episode_id),
'episode_id': episode_id,
'formats': formats,
'http_headers': headers
'http_headers': {'Referer': 'https://www.duboku.co/static/player/videojs.html'}
}
class DubokuPlaylistIE(InfoExtractor):
IE_NAME = 'duboku:list'
IE_DESC = 'www.duboku.io entire series'
IE_DESC = 'www.duboku.co entire series'
_VALID_URL = r'(?:https?://[^/]+\.duboku\.io/voddetail/)(?P<id>[0-9]+)\.html.*'
_VALID_URL = r'(?:https?://[^/]+\.duboku\.co/voddetail/)(?P<id>[0-9]+)\.html.*'
_TESTS = [{
'url': 'https://w.duboku.io/voddetail/1575.html',
'url': 'https://www.duboku.co/voddetail/1575.html',
'info_dict': {
'id': 'startswith:1575',
'title': '白色月光',
},
'playlist_count': 12,
}, {
'url': 'https://w.duboku.io/voddetail/1554.html',
'url': 'https://www.duboku.co/voddetail/1554.html',
'info_dict': {
'id': 'startswith:1554',
'title': '以家人之名',
},
'playlist_mincount': 30,
}, {
'url': 'https://www.duboku.co/voddetail/1554.html#playlist2',
'info_dict': {
'id': '1554#playlist2',
'title': '以家人之名',
},
'playlist_mincount': 27,
}]
def _real_extract(self, url):
@ -191,7 +189,7 @@ class DubokuPlaylistIE(InfoExtractor):
series_id = mobj.group('id')
fragment = compat_urlparse.urlparse(url).fragment
webpage_url = 'https://w.duboku.io/voddetail/%s.html' % series_id
webpage_url = 'https://www.duboku.co/voddetail/%s.html' % series_id
webpage_html = self._download_webpage(webpage_url, series_id)
# extract title
@ -236,6 +234,6 @@ class DubokuPlaylistIE(InfoExtractor):
# return url results
return self.playlist_result([
self.url_result(
compat_urlparse.urljoin('https://w.duboku.io', x['href']),
compat_urlparse.urljoin('https://www.duboku.co', x['href']),
ie=DubokuIE.ie_key(), video_title=x.get('title'))
for x in playlist], series_id + '#' + playlist_id, title)

View File

@ -1,11 +1,8 @@
import base64
import json
import re
import urllib
from .common import InfoExtractor
from .adobepass import AdobePassIE
from .once import OnceIE
from ..compat import compat_str
from ..utils import (
determine_ext,
dict_get,
@ -27,6 +24,7 @@ class ESPNIE(OnceIE):
(?:
(?:
video/(?:clip|iframe/twitter)|
watch/player
)
(?:
.*?\?.*?\bid=|
@ -49,8 +47,6 @@ class ESPNIE(OnceIE):
'description': 'md5:39370c2e016cb4ecf498ffe75bef7f0f',
'timestamp': 1390936111,
'upload_date': '20140128',
'duration': 1302,
'thumbnail': r're:https://.+\.jpg',
},
'params': {
'skip_download': True,
@ -75,6 +71,15 @@ class ESPNIE(OnceIE):
}, {
'url': 'https://cdn.espn.go.com/video/clip/_/id/19771774',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?id=19141491',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?bucketId=257&id=19505875',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player/_/id/19141491',
'only_matching': True,
}, {
'url': 'http://www.espn.com/video/clip?id=10365079',
'only_matching': True,
@ -93,13 +98,7 @@ class ESPNIE(OnceIE):
}, {
'url': 'http://www.espn.com/espnw/video/26066627/arkansas-gibson-completes-hr-cycle-four-innings',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?id=19141491',
'only_matching': True,
}, {
'url': 'http://www.espn.com/watch/player?bucketId=257&id=19505875',
'only_matching': True,
}, ]
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@ -117,7 +116,7 @@ class ESPNIE(OnceIE):
for source_id, source in source.items():
if source_id == 'alert':
continue
elif isinstance(source, str):
elif isinstance(source, compat_str):
extract_source(source, base_source_id)
elif isinstance(source, dict):
traverse_source(
@ -197,7 +196,7 @@ class ESPNArticleIE(InfoExtractor):
@classmethod
def suitable(cls, url):
return False if (ESPNIE.suitable(url) or WatchESPNIE.suitable(url)) else super(ESPNArticleIE, cls).suitable(url)
return False if ESPNIE.suitable(url) else super(ESPNArticleIE, cls).suitable(url)
def _real_extract(self, url):
video_id = self._match_id(url)
@ -278,119 +277,3 @@ class ESPNCricInfoIE(InfoExtractor):
'formats': formats,
'subtitles': subtitles,
}
class WatchESPNIE(AdobePassIE):
_VALID_URL = r'https://www.espn.com/watch/player/_/id/(?P<id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})'
_TESTS = [{
'url': 'https://www.espn.com/watch/player/_/id/ba7d17da-453b-4697-bf92-76a99f61642b',
'info_dict': {
'id': 'ba7d17da-453b-4697-bf92-76a99f61642b',
'ext': 'mp4',
'title': 'Serbia vs. Turkey',
'thumbnail': 'https://artwork.api.espn.com/artwork/collections/media/ba7d17da-453b-4697-bf92-76a99f61642b/default?width=640&apikey=1ngjw23osgcis1i1vbj96lmfqs',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.espn.com/watch/player/_/id/4e9b5bd1-4ceb-4482-9d28-1dd5f30d2f34',
'info_dict': {
'id': '4e9b5bd1-4ceb-4482-9d28-1dd5f30d2f34',
'ext': 'mp4',
'title': 'Real Madrid vs. Real Betis (LaLiga)',
'thumbnail': 'https://s.secure.espncdn.com/stitcher/artwork/collections/media/bd1f3d12-0654-47d9-852e-71b85ea695c7/16x9.jpg?timestamp=202201112217&showBadge=true&cb=12&package=ESPN_PLUS',
},
'params': {
'skip_download': True,
},
}]
_API_KEY = 'ZXNwbiZicm93c2VyJjEuMC4w.ptUt7QxsteaRruuPmGZFaJByOoqKvDP2a5YkInHrc7c'
def _call_bamgrid_api(self, path, video_id, payload=None, headers={}):
if 'Authorization' not in headers:
headers['Authorization'] = f'Bearer {self._API_KEY}'
parse = urllib.parse.urlencode if path == 'token' else json.dumps
return self._download_json(
f'https://espn.api.edge.bamgrid.com/{path}', video_id, headers=headers, data=parse(payload).encode())
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(
f'https://watch-cdn.product.api.espn.com/api/product/v3/watchespn/web/playback/event?id={video_id}',
video_id)['playbackState']
# ESPN+ subscription required, through cookies
if 'DTC' in video_data.get('sourceId'):
cookie = self._get_cookies(url).get('ESPN-ONESITE.WEB-PROD.token')
if not cookie:
self.raise_login_required(method='cookies')
assertion = self._call_bamgrid_api(
'devices', video_id,
headers={'Content-Type': 'application/json; charset=UTF-8'},
payload={
'deviceFamily': 'android',
'applicationRuntime': 'android',
'deviceProfile': 'tv',
'attributes': {},
})['assertion']
token = self._call_bamgrid_api(
'token', video_id, payload={
'subject_token': assertion,
'subject_token_type': 'urn:bamtech:params:oauth:token-type:device',
'platform': 'android',
'grant_type': 'urn:ietf:params:oauth:grant-type:token-exchange'
})['access_token']
assertion = self._call_bamgrid_api(
'accounts/grant', video_id, payload={'id_token': cookie.value.split('|')[1]},
headers={
'Authorization': token,
'Content-Type': 'application/json; charset=UTF-8'
})['assertion']
token = self._call_bamgrid_api(
'token', video_id, payload={
'subject_token': assertion,
'subject_token_type': 'urn:bamtech:params:oauth:token-type:account',
'platform': 'android',
'grant_type': 'urn:ietf:params:oauth:grant-type:token-exchange'
})['access_token']
playback = self._download_json(
video_data['videoHref'].format(scenario='browser~ssai'), video_id,
headers={
'Accept': 'application/vnd.media-service+json; version=5',
'Authorization': token
})
m3u8_url, headers = playback['stream']['complete'][0]['url'], {'authorization': token}
# No login required
elif video_data.get('sourceId') == 'ESPN_FREE':
asset = self._download_json(
f'https://watch.auth.api.espn.com/video/auth/media/{video_id}/asset?apikey=uiqlbgzdwuru14v627vdusswb',
video_id)
m3u8_url, headers = asset['stream'], {}
# TV Provider required
else:
resource = self._get_mvpd_resource('ESPN', video_data['name'], video_id, None)
auth = self._extract_mvpd_auth(url, video_id, 'ESPN', resource).encode()
asset = self._download_json(
f'https://watch.auth.api.espn.com/video/auth/media/{video_id}/asset?apikey=uiqlbgzdwuru14v627vdusswb',
video_id, data=f'adobeToken={urllib.parse.quote_plus(base64.b64encode(auth))}&drmSupport=HLS'.encode())
m3u8_url, headers = asset['stream'], {}
formats, subtitles = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, 'mp4', m3u8_id='hls')
self._sort_formats(formats)
return {
'id': video_id,
'title': video_data.get('name'),
'formats': formats,
'subtitles': subtitles,
'thumbnail': video_data.get('posterHref'),
'http_headers': headers,
}

View File

@ -19,10 +19,9 @@ class ExpressenIE(InfoExtractor):
'''
_TESTS = [{
'url': 'https://www.expressen.se/tv/ledare/ledarsnack/ledarsnack-om-arbetslosheten-bland-kvinnor-i-speciellt-utsatta-omraden/',
'md5': 'deb2ca62e7b1dcd19fa18ba37523f66e',
'md5': '2fbbe3ca14392a6b1b36941858d33a45',
'info_dict': {
'id': 'ba90f5a9-78d1-4511-aa02-c177b9c99136',
'display_id': 'ledarsnack-om-arbetslosheten-bland-kvinnor-i-speciellt-utsatta-omraden',
'id': '8690962',
'ext': 'mp4',
'title': 'Ledarsnack: Om arbetslösheten bland kvinnor i speciellt utsatta områden',
'description': 'md5:f38c81ff69f3de4d269bbda012fcbbba',
@ -65,7 +64,7 @@ class ExpressenIE(InfoExtractor):
display_id, transform_source=unescapeHTML)
info = extract_data('video-tracking-info')
video_id = info['contentId']
video_id = info['videoId']
data = extract_data('article-data')
stream = data['stream']

File diff suppressed because it is too large Load Diff

View File

@ -1,7 +1,9 @@
import re
from .common import InfoExtractor
from ..compat import compat_parse_qs
from ..compat import (
compat_parse_qs,
)
from ..dependencies import websockets
from ..utils import (
ExtractorError,
@ -207,7 +209,7 @@ class FC2LiveIE(InfoExtractor):
'User-Agent': self.get_param('http_headers')['User-Agent'],
})
self.write_debug('Sending HLS server request')
self.write_debug('[debug] Sending HLS server request')
while True:
recv = ws.recv()
@ -229,10 +231,13 @@ class FC2LiveIE(InfoExtractor):
if not data or not isinstance(data, dict):
continue
if data.get('name') == '_response_' and data.get('id') == 1:
self.write_debug('Goodbye')
self.write_debug('[debug] Goodbye.')
playlist_data = data
break
self.write_debug('Server said: %s%s' % (recv[:100], '...' if len(recv) > 100 else ''))
elif self._downloader.params.get('verbose', False):
if len(recv) > 100:
recv = recv[:100] + '...'
self.to_screen('[debug] Server said: %s' % recv)
if not playlist_data:
raise ExtractorError('Unable to fetch HLS playlist info via WebSocket')

View File

@ -94,7 +94,7 @@ class FlickrIE(InfoExtractor):
owner = video_info.get('owner', {})
uploader_id = owner.get('nsid')
uploader_path = owner.get('path_alias') or uploader_id
uploader_url = format_field(uploader_path, None, 'https://www.flickr.com/photos/%s/')
uploader_url = format_field(uploader_path, template='https://www.flickr.com/photos/%s/')
return {
'id': video_id,

View File

@ -1,107 +0,0 @@
from .common import InfoExtractor
from ..utils import traverse_obj, unified_timestamp
class FourZeroStudioArchiveIE(InfoExtractor):
_VALID_URL = r'https?://0000\.studio/(?P<uploader_id>[^/]+)/broadcasts/(?P<id>[^/]+)/archive'
IE_NAME = '0000studio:archive'
_TESTS = [{
'url': 'https://0000.studio/mumeijiten/broadcasts/1290f433-fce0-4909-a24a-5f7df09665dc/archive',
'info_dict': {
'id': '1290f433-fce0-4909-a24a-5f7df09665dc',
'title': 'noteで『canape』様へのファンレターを執筆します。数秘術その2',
'timestamp': 1653802534,
'release_timestamp': 1653796604,
'thumbnails': 'count:1',
'comments': 'count:7',
'uploader': '『中崎雄心』の執務室。',
'uploader_id': 'mumeijiten',
}
}]
def _real_extract(self, url):
video_id, uploader_id = self._match_valid_url(url).group('id', 'uploader_id')
webpage = self._download_webpage(url, video_id)
nuxt_data = self._search_nuxt_data(webpage, video_id, traverse=None)
pcb = traverse_obj(nuxt_data, ('ssrRefs', lambda _, v: v['__typename'] == 'PublicCreatorBroadcast'), get_all=False)
uploader_internal_id = traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'id'), get_all=False)
formats, subs = self._extract_m3u8_formats_and_subtitles(pcb['archiveUrl'], video_id, ext='mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': pcb.get('title'),
'age_limit': 18 if pcb.get('isAdult') else None,
'timestamp': unified_timestamp(pcb.get('finishTime')),
'release_timestamp': unified_timestamp(pcb.get('createdAt')),
'thumbnails': [{
'url': pcb['thumbnailUrl'],
'ext': 'png',
}] if pcb.get('thumbnailUrl') else None,
'formats': formats,
'subtitles': subs,
'comments': [{
'author': c.get('username'),
'author_id': c.get('postedUserId'),
'author_thumbnail': c.get('userThumbnailUrl'),
'id': c.get('id'),
'text': c.get('body'),
'timestamp': unified_timestamp(c.get('createdAt')),
'like_count': c.get('likeCount'),
'is_favorited': c.get('isLikedByOwner'),
'author_is_uploader': c.get('postedUserId') == uploader_internal_id,
} for c in traverse_obj(nuxt_data, (
'ssrRefs', ..., lambda _, v: v['__typename'] == 'PublicCreatorBroadcastComment')) or []],
'uploader_id': uploader_id,
'uploader': traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'username'), get_all=False),
}
class FourZeroStudioClipIE(InfoExtractor):
_VALID_URL = r'https?://0000\.studio/(?P<uploader_id>[^/]+)/archive-clip/(?P<id>[^/]+)'
IE_NAME = '0000studio:clip'
_TESTS = [{
'url': 'https://0000.studio/soeji/archive-clip/e46b0278-24cd-40a8-92e1-b8fc2b21f34f',
'info_dict': {
'id': 'e46b0278-24cd-40a8-92e1-b8fc2b21f34f',
'title': 'わたベーさんからイラスト差し入れいただきました。ありがとうございました!',
'timestamp': 1652109105,
'like_count': 1,
'uploader': 'ソエジマケイタ',
'uploader_id': 'soeji',
}
}]
def _real_extract(self, url):
video_id, uploader_id = self._match_valid_url(url).group('id', 'uploader_id')
webpage = self._download_webpage(url, video_id)
nuxt_data = self._search_nuxt_data(webpage, video_id, traverse=None)
clip_info = traverse_obj(nuxt_data, ('ssrRefs', lambda _, v: v['__typename'] == 'PublicCreatorArchivedClip'), get_all=False)
info = next((
m for m in self._parse_html5_media_entries(url, webpage, video_id)
if 'mp4' in traverse_obj(m, ('formats', ..., 'ext'))
), None)
if not info:
self.report_warning('Failed to find a desired media element. Falling back to using NUXT data.')
info = {
'formats': [{
'ext': 'mp4',
'url': url,
} for url in clip_info.get('mediaFiles') or [] if url],
}
return {
**info,
'id': video_id,
'title': clip_info.get('clipComment'),
'timestamp': unified_timestamp(clip_info.get('createdAt')),
'like_count': clip_info.get('likeCount'),
'uploader_id': uploader_id,
'uploader': traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'username'), get_all=False),
}

View File

@ -59,13 +59,10 @@ class FoxNewsIE(AMPIE):
@staticmethod
def _extract_urls(webpage):
return [
f'https://video.foxnews.com/v/video-embed.html?video_id={mobj.group("video_id")}'
mobj.group('url')
for mobj in re.finditer(
r'''(?x)
<(?:script|(?:amp-)?iframe)[^>]+\bsrc=["\']
(?:https?:)?//video\.foxnews\.com/v/(?:video-embed\.html|embed\.js)\?
(?:[^>"\']+&)?(?:video_)?id=(?P<video_id>\d+)
''', webpage)]
r'<(?:amp-)?iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//video\.foxnews\.com/v/video-embed\.html?.*?\bvideo_id=\d+.*?)\1',
webpage)]
def _real_extract(self, url):
host, video_id = self._match_valid_url(url).groups()

View File

@ -0,0 +1,125 @@
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
extract_attributes,
int_or_none,
traverse_obj,
unified_strdate,
)
class FranceCultureIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
# playlist
'url': 'https://www.franceculture.fr/emissions/serie/hasta-dente',
'playlist_count': 12,
'info_dict': {
'id': 'hasta-dente',
'title': 'Hasta Dente',
'description': 'md5:57479af50648d14e9bb649e6b1f8f911',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20201024',
},
'playlist': [{
'info_dict': {
'id': '3c1c2e55-41a0-11e5-9fe0-005056a87c89',
'ext': 'mp3',
'title': 'Jeudi, vous avez dit bizarre ?',
'description': 'md5:47cf1e00cc21c86b0210279996a812c6',
'duration': 604,
'upload_date': '20201024',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1603576680
},
},
],
}, {
'url': 'https://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
'info_dict': {
'id': 'rendez-vous-au-pays-des-geeks',
'display_id': 'rendez-vous-au-pays-des-geeks',
'ext': 'mp3',
'title': 'Rendez-vous au pays des geeks',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140301',
'vcodec': 'none',
'duration': 3569,
},
}, {
# no thumbnail
'url': 'https://www.franceculture.fr/emissions/la-recherche-montre-en-main/la-recherche-montre-en-main-du-mercredi-10-octobre-2018',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
info = {
'id': display_id,
'title': self._html_search_regex(
r'(?s)<h1[^>]*itemprop="[^"]*name[^"]*"[^>]*>(.+?)</h1>',
webpage, 'title', default=self._og_search_title(webpage)),
'description': self._html_search_regex(
r'(?s)<div[^>]+class="excerpt"[^>]*>(.*?)</div>', webpage, 'description', default=None),
'thumbnail': self._og_search_thumbnail(webpage),
'uploader': self._html_search_regex(
r'(?s)<span class="author">(.*?)</span>', webpage, 'uploader', default=None),
'upload_date': unified_strdate(self._html_search_regex(
r'(?s)class="teaser-text-date".*?(\d{2}/\d{2}/\d{4})', webpage, 'date', default=None)),
}
playlist_data = self._search_regex(
r'''(?sx)
<section[^>]+data-xiti-place="[^"]*?liste_episodes[^"?]*?"[^>]*>
(.*?)
</section>
''',
webpage, 'playlist data', fatal=False, default=None)
if playlist_data:
entries = []
for item, item_description in re.findall(
r'(?s)(<button[^<]*class="[^"]*replay-button[^>]*>).*?<p[^>]*class="[^"]*teaser-text-chapo[^>]*>(.*?)</p>',
playlist_data):
item_attributes = extract_attributes(item)
entries.append({
'id': item_attributes.get('data-emission-uuid'),
'url': item_attributes.get('data-url'),
'title': item_attributes.get('data-diffusion-title'),
'duration': int_or_none(traverse_obj(item_attributes, 'data-duration-seconds', 'data-duration-seconds')),
'description': item_description,
'timestamp': int_or_none(item_attributes.get('data-start-time')),
'thumbnail': info['thumbnail'],
'uploader': info['uploader'],
})
return {
'_type': 'playlist',
'entries': entries,
**info
}
video_data = extract_attributes(self._search_regex(
r'''(?sx)
(?:
</h1>|
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
).*?
(<button[^>]+data-(?:url|asset-source)="[^"]+"[^>]+>)
''',
webpage, 'video data'))
video_url = traverse_obj(video_data, 'data-url', 'data-asset-source')
ext = determine_ext(video_url.lower())
return {
'display_id': display_id,
'url': video_url,
'ext': ext,
'vcodec': 'none' if ext == 'mp3' else None,
'duration': int_or_none(video_data.get('data-duration')),
**info
}

View File

@ -1,141 +0,0 @@
import itertools
import re
from .common import InfoExtractor
from ..utils import int_or_none, traverse_obj, urlencode_postdata
class FreeTvBaseIE(InfoExtractor):
def _get_api_response(self, content_id, resource_type, postdata):
return self._download_json(
'https://www.freetv.com/wordpress/wp-admin/admin-ajax.php',
content_id, data=urlencode_postdata(postdata),
note=f'Downloading {content_id} {resource_type} JSON')['data']
class FreeTvMoviesIE(FreeTvBaseIE):
_VALID_URL = r'https?://(?:www\.)?freetv\.com/peliculas/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://www.freetv.com/peliculas/atrapame-si-puedes/',
'md5': 'dc62d5abf0514726640077cd1591aa92',
'info_dict': {
'id': '428021',
'title': 'Atrápame Si Puedes',
'description': 'md5:ca63bc00898aeb2f64ec87c6d3a5b982',
'ext': 'mp4',
}
}, {
'url': 'https://www.freetv.com/peliculas/monstruoso/',
'md5': '509c15c68de41cb708d1f92d071f20aa',
'info_dict': {
'id': '377652',
'title': 'Monstruoso',
'description': 'md5:333fc19ee327b457b980e54a911ea4a3',
'ext': 'mp4',
}
}]
def _extract_video(self, content_id, action='olyott_video_play'):
api_response = self._get_api_response(content_id, 'video', {
'action': action,
'contentID': content_id,
})
video_id, video_url = api_response['displayMeta']['contentID'], api_response['displayMeta']['streamURLVideo']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(video_url, video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': traverse_obj(api_response, ('displayMeta', 'title')),
'description': traverse_obj(api_response, ('displayMeta', 'desc')),
'formats': formats,
'subtitles': subtitles,
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return self._extract_video(
self._search_regex((
r'class=["\'][^>]+postid-(?P<video_id>\d+)',
r'<link[^>]+freetv.com/\?p=(?P<video_id>\d+)',
r'<div[^>]+data-params=["\'][^>]+post_id=(?P<video_id>\d+)',
), webpage, 'video id', group='video_id'))
class FreeTvIE(FreeTvBaseIE):
IE_NAME = 'freetv:series'
_VALID_URL = r'https?://(?:www\.)?freetv\.com/series/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://www.freetv.com/series/el-detective-l/',
'info_dict': {
'id': 'el-detective-l',
'title': 'El Detective L',
'description': 'md5:f9f1143bc33e9856ecbfcbfb97a759be'
},
'playlist_count': 24,
}, {
'url': 'https://www.freetv.com/series/esmeraldas/',
'info_dict': {
'id': 'esmeraldas',
'title': 'Esmeraldas',
'description': 'md5:43d7ec45bd931d8268a4f5afaf4c77bf'
},
'playlist_count': 62,
}, {
'url': 'https://www.freetv.com/series/las-aventuras-de-leonardo/',
'info_dict': {
'id': 'las-aventuras-de-leonardo',
'title': 'Las Aventuras de Leonardo',
'description': 'md5:0c47130846c141120a382aca059288f6'
},
'playlist_count': 13,
},
]
def _extract_series_season(self, season_id, series_title):
episodes = self._get_api_response(season_id, 'series', {
'contentID': season_id,
'action': 'olyott_get_dynamic_series_content',
'type': 'list',
'perPage': '1000',
})['1']
for episode in episodes:
video_id = str(episode['contentID'])
formats, subtitles = self._extract_m3u8_formats_and_subtitles(episode['streamURL'], video_id, 'mp4')
self._sort_formats(formats)
yield {
'id': video_id,
'title': episode.get('fullTitle'),
'description': episode.get('description'),
'formats': formats,
'subtitles': subtitles,
'thumbnail': episode.get('thumbnail'),
'series': series_title,
'series_id': traverse_obj(episode, ('contentMeta', 'displayMeta', 'seriesID')),
'season_id': traverse_obj(episode, ('contentMeta', 'displayMeta', 'seasonID')),
'season_number': traverse_obj(
episode, ('contentMeta', 'displayMeta', 'seasonNum'), expected_type=int_or_none),
'episode_number': traverse_obj(
episode, ('contentMeta', 'displayMeta', 'episodeNum'), expected_type=int_or_none),
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._html_search_regex(
r'<h1[^>]+class=["\']synopis[^>]>(?P<title>[^<]+)', webpage, 'title', group='title', fatal=False)
description = self._html_search_regex(
r'<div[^>]+class=["\']+synopis content[^>]><p>(?P<description>[^<]+)',
webpage, 'description', group='description', fatal=False)
return self.playlist_result(
itertools.chain.from_iterable(
self._extract_series_season(season_id, title)
for season_id in re.findall(r'<option[^>]+value=["\'](\d+)["\']', webpage)),
display_id, title, description)

View File

@ -69,13 +69,11 @@ from .spankwire import SpankwireIE
from .sportbox import SportBoxIE
from .spotify import SpotifyBaseIE
from .springboardplatform import SpringboardPlatformIE
from .substack import SubstackIE
from .svt import SVTIE
from .teachable import TeachableIE
from .ted import TedEmbedIE
from .theplatform import ThePlatformIE
from .threeqsdn import ThreeQSDNIE
from .tiktok import TikTokIE
from .tnaflix import TNAFlixNetworkEmbedIE
from .tube8 import Tube8IE
from .tunein import TuneInBaseIE
@ -2543,104 +2541,7 @@ class GenericIE(InfoExtractor):
'timestamp': 1652833414,
'age_limit': 0,
}
},
{
'url': 'https://www.mollymovieclub.com/p/interstellar?s=r#details',
'md5': '198bde8bed23d0b23c70725c83c9b6d9',
'info_dict': {
'id': '53602801',
'ext': 'mpga',
'title': 'Interstellar',
'description': 'Listen now | Episode One',
'thumbnail': 'md5:c30d9c83f738e16d8551d7219d321538',
'uploader': 'Molly Movie Club',
'uploader_id': '839621',
},
},
{
'url': 'https://www.blockedandreported.org/p/episode-117-lets-talk-about-depp?s=r',
'md5': 'c0cc44ee7415daeed13c26e5b56d6aa0',
'info_dict': {
'id': '57962052',
'ext': 'mpga',
'title': 'md5:855b2756f0ee10f6723fa00b16266f8d',
'description': 'md5:fe512a5e94136ad260c80bde00ea4eef',
'thumbnail': 'md5:2218f27dfe517bb5ac16c47d0aebac59',
'uploader': 'Blocked and Reported',
'uploader_id': '500230',
},
},
{
'url': 'https://www.skimag.com/video/ski-people-1980/',
'info_dict': {
'id': 'ski-people-1980',
'title': 'Ski People (1980)',
},
'playlist_count': 1,
'playlist': [{
'md5': '022a7e31c70620ebec18deeab376ee03',
'info_dict': {
'id': 'YTmgRiNU',
'ext': 'mp4',
'title': '1980 Ski People',
'timestamp': 1610407738,
'description': 'md5:cf9c3d101452c91e141f292b19fe4843',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/YTmgRiNU/poster.jpg?width=720',
'duration': 5688.0,
'upload_date': '20210111',
}
}]
},
{
'note': 'Rumble embed',
'url': 'https://rumble.com/vdmum1-moose-the-dog-helps-girls-dig-a-snow-fort.html',
'md5': '53af34098a7f92c4e51cf0bd1c33f009',
'info_dict': {
'id': 'vb0ofn',
'ext': 'mp4',
'timestamp': 1612662578,
'uploader': 'LovingMontana',
'channel': 'LovingMontana',
'upload_date': '20210207',
'title': 'Winter-loving dog helps girls dig a snow fort ',
'channel_url': 'https://rumble.com/c/c-546523',
'thumbnail': 'https://sp.rmbl.ws/s8/1/5/f/x/x/5fxxb.OvCc.1-small-Moose-The-Dog-Helps-Girls-D.jpg',
'duration': 103,
}
},
{
'note': 'Rumble JS embed',
'url': 'https://therightscoop.com/what-does-9-plus-1-plus-1-equal-listen-to-this-audio-of-attempted-kavanaugh-assassins-call-and-youll-get-it',
'md5': '4701209ac99095592e73dbba21889690',
'info_dict': {
'id': 'v15eqxl',
'ext': 'mp4',
'channel': 'Mr Producer Media',
'duration': 92,
'title': '911 Audio From The Man Who Wanted To Kill Supreme Court Justice Kavanaugh',
'channel_url': 'https://rumble.com/c/RichSementa',
'thumbnail': 'https://sp.rmbl.ws/s8/1/P/j/f/A/PjfAe.OvCc-small-911-Audio-From-The-Man-Who-.jpg',
'timestamp': 1654892716,
'uploader': 'Mr Producer Media',
'upload_date': '20220610',
}
},
{
'note': 'JSON LD with multiple @type',
'url': 'https://www.nu.nl/280161/video/hoe-een-bladvlo-dit-verwoestende-japanse-onkruid-moet-vernietigen.html',
'md5': 'c7949f34f57273013fb7ccb1156393db',
'info_dict': {
'id': 'ipy2AcGL',
'ext': 'mp4',
'description': 'md5:6a9d644bab0dc2dc06849c2505d8383d',
'thumbnail': r're:https://media\.nu\.nl/m/.+\.jpg',
'title': 'Hoe een bladvlo dit verwoestende Japanse onkruid moet vernietigen',
'timestamp': 1586577474,
'upload_date': '20200411',
'age_limit': 0,
'duration': 111.0,
}
},
}
]
def report_following_redirect(self, new_url):
@ -3116,7 +3017,6 @@ class GenericIE(InfoExtractor):
wistia_urls = WistiaIE._extract_urls(webpage)
if wistia_urls:
playlist = self.playlist_from_matches(wistia_urls, video_id, video_title, ie=WistiaIE.ie_key())
playlist['entries'] = list(playlist['entries'])
for entry in playlist['entries']:
entry.update({
'_type': 'url_transparent',
@ -3136,11 +3036,6 @@ class GenericIE(InfoExtractor):
# Don't set the extractor because it can be a track url or an album
return self.url_result(burl)
# Check for Substack custom domains
substack_url = SubstackIE._extract_url(webpage, url)
if substack_url:
return self.url_result(substack_url, SubstackIE)
# Look for embedded Vevo player
mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:cache\.)?vevo\.com/.+?)\1', webpage)
@ -3861,11 +3756,6 @@ class GenericIE(InfoExtractor):
if ruutu_urls:
return self.playlist_from_matches(ruutu_urls, video_id, video_title)
# Look for Tiktok embeds
tiktok_urls = TikTokIE._extract_urls(webpage)
if tiktok_urls:
return self.playlist_from_matches(tiktok_urls, video_id, video_title)
# Look for HTML5 media
entries = self._parse_html5_media_entries(url, webpage, video_id, m3u8_id='hls')
if entries:
@ -3975,10 +3865,15 @@ class GenericIE(InfoExtractor):
json_ld = self._search_json_ld(webpage, video_id, default={})
if json_ld.get('url') not in (url, None):
self.report_detected('JSON LD')
return merge_dicts({
'_type': 'url_transparent',
'url': smuggle_url(json_ld['url'], {'force_videoid': video_id, 'to_generic': True}),
}, json_ld, info_dict)
if determine_ext(json_ld['url']) == 'm3u8':
json_ld['formats'], json_ld['subtitles'] = self._extract_m3u8_formats_and_subtitles(
json_ld['url'], video_id, 'mp4')
json_ld.pop('url')
self._sort_formats(json_ld['formats'])
else:
json_ld['_type'] = 'url_transparent'
json_ld['url'] = smuggle_url(json_ld['url'], {'force_videoid': video_id, 'to_generic': True})
return merge_dicts(json_ld, info_dict)
def check_video(vurl):
if YoutubeIE.suitable(vurl):

View File

@ -276,59 +276,3 @@ class GoogleDriveIE(InfoExtractor):
'automatic_captions': self.extract_automatic_captions(
video_id, subtitles_id, hl),
}
class GoogleDriveFolderIE(InfoExtractor):
IE_NAME = 'GoogleDrive:Folder'
_VALID_URL = r'https?://(?:docs|drive)\.google\.com/drive/folders/(?P<id>[\w-]{28,})'
_TESTS = [{
'url': 'https://drive.google.com/drive/folders/1dQ4sx0-__Nvg65rxTSgQrl7VyW_FZ9QI',
'info_dict': {
'id': '1dQ4sx0-__Nvg65rxTSgQrl7VyW_FZ9QI',
'title': 'Forrest'
},
'playlist_count': 3,
}]
_BOUNDARY = '=====vc17a3rwnndj====='
_REQUEST = "/drive/v2beta/files?openDrive=true&reason=102&syncType=0&errorRecovery=false&q=trashed%20%3D%20false%20and%20'{folder_id}'%20in%20parents&fields=kind%2CnextPageToken%2Citems(kind%2CmodifiedDate%2CmodifiedByMeDate%2ClastViewedByMeDate%2CfileSize%2Cowners(kind%2CpermissionId%2Cid)%2ClastModifyingUser(kind%2CpermissionId%2Cid)%2ChasThumbnail%2CthumbnailVersion%2Ctitle%2Cid%2CresourceKey%2Cshared%2CsharedWithMeDate%2CuserPermission(role)%2CexplicitlyTrashed%2CmimeType%2CquotaBytesUsed%2Ccopyable%2CfileExtension%2CsharingUser(kind%2CpermissionId%2Cid)%2Cspaces%2Cversion%2CteamDriveId%2ChasAugmentedPermissions%2CcreatedDate%2CtrashingUser(kind%2CpermissionId%2Cid)%2CtrashedDate%2Cparents(id)%2CshortcutDetails(targetId%2CtargetMimeType%2CtargetLookupStatus)%2Ccapabilities(canCopy%2CcanDownload%2CcanEdit%2CcanAddChildren%2CcanDelete%2CcanRemoveChildren%2CcanShare%2CcanTrash%2CcanRename%2CcanReadTeamDrive%2CcanMoveTeamDriveItem)%2Clabels(starred%2Ctrashed%2Crestricted%2Cviewed))%2CincompleteSearch&appDataFilter=NO_APP_DATA&spaces=drive&pageToken={page_token}&maxResults=50&supportsTeamDrives=true&includeItemsFromAllDrives=true&corpora=default&orderBy=folder%2Ctitle_natural%20asc&retryCount=0&key={key} HTTP/1.1"
_DATA = f'''--{_BOUNDARY}
content-type: application/http
content-transfer-encoding: binary
GET %s
--{_BOUNDARY}
'''
def _call_api(self, folder_id, key, data, **kwargs):
response = self._download_webpage(
'https://clients6.google.com/batch/drive/v2beta',
folder_id, data=data.encode('utf-8'),
headers={
'Content-Type': 'text/plain;charset=UTF-8;',
'Origin': 'https://drive.google.com',
}, query={
'$ct': f'multipart/mixed; boundary="{self._BOUNDARY}"',
'key': key
}, **kwargs)
return self._search_json('', response, 'api response', folder_id, **kwargs) or {}
def _get_folder_items(self, folder_id, key):
page_token = ''
while page_token is not None:
request = self._REQUEST.format(folder_id=folder_id, page_token=page_token, key=key)
page = self._call_api(folder_id, key, self._DATA % request)
yield from page['items']
page_token = page.get('nextPageToken')
def _real_extract(self, url):
folder_id = self._match_id(url)
webpage = self._download_webpage(url, folder_id)
key = self._search_regex(r'"(\w{39})"', webpage, 'key')
folder_info = self._call_api(folder_id, key, self._DATA % f'/drive/v2beta/files/{folder_id} HTTP/1.1', fatal=False)
return self.playlist_from_matches(
self._get_folder_items(folder_id, key), folder_id, folder_info.get('title'),
ie=GoogleDriveIE, getter=lambda item: f'https://drive.google.com/file/d/{item["id"]}')

View File

@ -1,19 +1,23 @@
from .common import InfoExtractor
from ..utils import unified_strdate
from ..utils import (
determine_ext,
int_or_none,
strip_or_none,
xpath_attr,
xpath_text,
)
class InaIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|m)\.)?ina\.fr/(?:[^/]+/)?(?:video|audio)/(?P<id>\w+)'
_VALID_URL = r'https?://(?:(?:www|m)\.)?ina\.fr/(?:video|audio)/(?P<id>[A-Z0-9_]+)'
_TESTS = [{
'url': 'https://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
'md5': 'c5a09e5cb5604ed10709f06e7a377dda',
'url': 'http://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
'md5': 'a667021bf2b41f8dc6049479d9bb38a3',
'info_dict': {
'id': 'I12055569',
'ext': 'mp4',
'title': 'François Hollande "Je crois que c\'est clair"',
'description': 'md5:08201f1c86fb250611f0ba415d21255a',
'upload_date': '20070712',
'thumbnail': 'https://cdn-hub.ina.fr/notice/690x517/3c4/I12055569.jpeg',
'description': 'md5:3f09eb072a06cb286b8f7e4f77109663',
}
}, {
'url': 'https://www.ina.fr/video/S806544_001/don-d-organes-des-avancees-mais-d-importants-besoins-video.html',
@ -27,37 +31,53 @@ class InaIE(InfoExtractor):
}, {
'url': 'http://m.ina.fr/video/I12055569',
'only_matching': True,
}, {
'url': 'https://www.ina.fr/ina-eclaire-actu/video/cpb8205116303/les-jeux-electroniques',
'md5': '4b8284a9a3a184fdc7e744225b8251e7',
'info_dict': {
'id': 'CPB8205116303',
'ext': 'mp4',
'title': 'Les jeux électroniques',
'description': 'md5:e09f7683dad1cc60b74950490127d233',
'upload_date': '19821204',
'duration': 657,
'thumbnail': 'https://cdn-hub.ina.fr/notice/690x517/203/CPB8205116303.jpeg',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url).upper()
webpage = self._download_webpage(url, video_id)
video_id = self._match_id(url)
info_doc = self._download_xml(
'http://player.ina.fr/notices/%s.mrss' % video_id, video_id)
item = info_doc.find('channel/item')
title = xpath_text(item, 'title', fatal=True)
media_ns_xpath = lambda x: self._xpath_ns(x, 'http://search.yahoo.com/mrss/')
content = item.find(media_ns_xpath('content'))
api_url = self._html_search_regex(
r'asset-details-url\s*=\s*["\'](?P<api_url>[^"\']+)',
webpage, 'api_url').replace(video_id, f'{video_id}.json')
get_furl = lambda x: xpath_attr(content, media_ns_xpath(x), 'url')
formats = []
for q, w, h in (('bq', 400, 300), ('mq', 512, 384), ('hq', 768, 576)):
q_url = get_furl(q)
if not q_url:
continue
formats.append({
'format_id': q,
'url': q_url,
'width': w,
'height': h,
})
if not formats:
furl = get_furl('player') or content.attrib['url']
ext = determine_ext(furl)
formats = [{
'url': furl,
'vcodec': 'none' if ext == 'mp3' else None,
'ext': ext,
}]
api_response = self._download_json(api_url, video_id)
thumbnails = []
for thumbnail in content.findall(media_ns_xpath('thumbnail')):
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'url': thumbnail_url,
'height': int_or_none(thumbnail.get('height')),
'width': int_or_none(thumbnail.get('width')),
})
return {
'id': video_id,
'url': api_response['resourceUrl'],
'ext': {'video': 'mp4', 'audio': 'mp3'}.get(api_response.get('type')),
'title': api_response.get('title'),
'description': api_response.get('description'),
'upload_date': unified_strdate(api_response.get('dateOfBroadcast')),
'duration': api_response.get('duration'),
'thumbnail': api_response.get('resourceThumbnail'),
'formats': formats,
'title': title,
'description': strip_or_none(xpath_text(item, 'description')),
'thumbnails': thumbnails,
}

View File

@ -410,7 +410,7 @@ class InstagramIE(InstagramBaseIE):
if nodes:
return self.playlist_result(
self._extract_nodes(nodes, True), video_id,
format_field(username, None, 'Post by %s'), description)
format_field(username, template='Post by %s'), description)
video_url = self._og_search_video_url(webpage, secure=False)

View File

@ -37,7 +37,7 @@ def md5_text(text):
return hashlib.md5(text.encode('utf-8')).hexdigest()
class IqiyiSDK:
class IqiyiSDK(object):
def __init__(self, target, ip, timestamp):
self.target = target
self.ip = ip
@ -131,7 +131,7 @@ class IqiyiSDK:
self.target = self.digit_sum(self.timestamp) + chunks[0] + compat_str(sum(ip))
class IqiyiSDKInterpreter:
class IqiyiSDKInterpreter(object):
def __init__(self, sdk_code):
self.sdk_code = sdk_code
@ -610,7 +610,7 @@ class IqIE(InfoExtractor):
preview_time = traverse_obj(
initial_format_data, ('boss_ts', (None, 'data'), ('previewTime', 'rtime')), expected_type=float_or_none, get_all=False)
if traverse_obj(initial_format_data, ('boss_ts', 'data', 'prv'), expected_type=int_or_none):
self.report_warning('This preview video is limited%s' % format_field(preview_time, None, ' to %s seconds'))
self.report_warning('This preview video is limited%s' % format_field(preview_time, template=' to %s seconds'))
# TODO: Extract audio-only formats
for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none, default=[])):

View File

@ -1,4 +1,3 @@
import itertools
import re
import urllib
@ -172,70 +171,37 @@ class IwaraUserIE(IwaraBaseIE):
IE_NAME = 'iwara:user'
_TESTS = [{
'note': 'number of all videos page is just 1 page. less than 40 videos',
'url': 'https://ecchi.iwara.tv/users/infinityyukarip',
'url': 'https://ecchi.iwara.tv/users/CuteMMD',
'info_dict': {
'title': 'Uploaded videos from Infinity_YukariP',
'id': 'infinityyukarip',
'uploader': 'Infinity_YukariP',
'uploader_id': 'infinityyukarip',
'id': 'CuteMMD',
},
'playlist_mincount': 39,
'playlist_mincount': 198,
}, {
'note': 'no even all videos page. probably less than 10 videos',
'url': 'https://ecchi.iwara.tv/users/mmd-quintet',
# urlencoded
'url': 'https://ecchi.iwara.tv/users/%E5%92%95%E5%98%BF%E5%98%BF',
'info_dict': {
'title': 'Uploaded videos from mmd quintet',
'id': 'mmd-quintet',
'uploader': 'mmd quintet',
'uploader_id': 'mmd-quintet',
'id': '咕嘿嘿',
},
'playlist_mincount': 6,
}, {
'note': 'has paging. more than 40 videos',
'url': 'https://ecchi.iwara.tv/users/theblackbirdcalls',
'info_dict': {
'title': 'Uploaded videos from TheBlackbirdCalls',
'id': 'theblackbirdcalls',
'uploader': 'TheBlackbirdCalls',
'uploader_id': 'theblackbirdcalls',
},
'playlist_mincount': 420,
}, {
'note': 'foreign chars in URL. there must be foreign characters in URL',
'url': 'https://ecchi.iwara.tv/users/ぶた丼',
'info_dict': {
'title': 'Uploaded videos from ぶた丼',
'id': 'ぶた丼',
'uploader': 'ぶた丼',
'uploader_id': 'ぶた丼',
},
'playlist_mincount': 170,
'playlist_mincount': 141,
}]
def _entries(self, playlist_id, base_url):
webpage = self._download_webpage(
f'{base_url}/users/{playlist_id}', playlist_id)
videos_url = self._search_regex(r'<a href="(/users/[^/]+/videos)(?:\?[^"]+)?">', webpage, 'all videos url', default=None)
if not videos_url:
yield from self._extract_playlist(base_url, webpage)
return
def _entries(self, playlist_id, base_url, webpage):
yield from self._extract_playlist(base_url, webpage)
videos_url = urljoin(base_url, videos_url)
page_urls = re.findall(
r'class="pager-item"[^>]*>\s*<a[^<]+href="([^"]+)', webpage)
for n in itertools.count(1):
page = self._download_webpage(
videos_url, playlist_id, note=f'Downloading playlist page {n}',
query={'page': str(n - 1)} if n > 1 else {})
for n, path in enumerate(page_urls, 2):
yield from self._extract_playlist(
base_url, page)
if f'page={n}' not in page:
break
base_url, self._download_webpage(
urljoin(base_url, path), playlist_id, note=f'Downloading playlist page {n}'))
def _real_extract(self, url):
playlist_id, base_url = self._match_valid_url(url).group('id', 'base_url')
playlist_id = urllib.parse.unquote(playlist_id)
webpage = self._download_webpage(
f'{base_url}/users/{playlist_id}/videos', playlist_id)
return self.playlist_result(
self._entries(playlist_id, base_url), playlist_id)
self._entries(playlist_id, base_url, webpage), playlist_id)

View File

@ -1,84 +0,0 @@
import base64
from .common import InfoExtractor
from ..utils import (
ExtractorError,
get_element_by_id,
int_or_none,
js_to_json,
str_or_none,
traverse_obj,
)
class IxiguaIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.)?ixigua\.com/(?:video/)?(?P<id>\d+).+'
_TESTS = [{
'url': 'https://www.ixigua.com/6996881461559165471',
'info_dict': {
'id': '6996881461559165471',
'ext': 'mp4',
'title': '盲目涉水风险大,亲身示范高水位行车注意事项',
'description': 'md5:8c82f46186299add4a1c455430740229',
'tags': ['video_car'],
'like_count': int,
'dislike_count': int,
'view_count': int,
'uploader': '懂车帝原创',
'uploader_id': '6480145787',
'thumbnail': r're:^https?://.+\.(avif|webp)',
'timestamp': 1629088414,
'duration': 1030,
}
}]
def _get_json_data(self, webpage, video_id):
js_data = get_element_by_id('SSR_HYDRATED_DATA', webpage)
if not js_data:
if self._cookies_passed:
raise ExtractorError('Failed to get SSR_HYDRATED_DATA')
raise ExtractorError('Cookies (not necessarily logged in) are needed', expected=True)
return self._parse_json(
js_data.replace('window._SSR_HYDRATED_DATA=', ''), video_id, transform_source=js_to_json)
def _media_selector(self, json_data):
for path, override in (
(('video_list', ), {}),
(('dynamic_video', 'dynamic_video_list'), {'acodec': 'none'}),
(('dynamic_video', 'dynamic_audio_list'), {'vcodec': 'none', 'ext': 'm4a'}),
):
for media in traverse_obj(json_data, (..., *path, lambda _, v: v['main_url'])):
yield {
'url': base64.b64decode(media['main_url']).decode(),
'width': int_or_none(media.get('vwidth')),
'height': int_or_none(media.get('vheight')),
'fps': int_or_none(media.get('fps')),
'vcodec': media.get('codec_type'),
'format_id': str_or_none(media.get('quality_type')),
'filesize': int_or_none(media.get('size')),
'ext': 'mp4',
**override,
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
json_data = self._get_json_data(webpage, video_id)['anyVideo']['gidInformation']['packerData']['video']
formats = list(self._media_selector(json_data.get('videoResource')))
self._sort_formats(formats)
return {
'id': video_id,
'title': json_data.get('title'),
'description': json_data.get('video_abstract'),
'formats': formats,
'like_count': json_data.get('video_like_count'),
'duration': int_or_none(json_data.get('duration')),
'tags': [json_data.get('tag')],
'uploader_id': traverse_obj(json_data, ('user_info', 'user_id')),
'uploader': traverse_obj(json_data, ('user_info', 'name')),
'view_count': json_data.get('video_watch_count'),
'dislike_count': json_data.get('video_unlike_count'),
'timestamp': int_or_none(json_data.get('video_publish_time')),
}

View File

@ -70,7 +70,7 @@ class JojIE(InfoExtractor):
r'(\d+)[pP]\.', format_url, 'height', default=None)
formats.append({
'url': format_url,
'format_id': format_field(height, None, '%sp'),
'format_id': format_field(height, template='%sp'),
'height': int(height),
})
if not formats:

View File

@ -5,7 +5,7 @@ from ..utils import unsmuggle_url
class JWPlatformIE(InfoExtractor):
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|manifest)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_TESTS = [{
'url': 'http://content.jwplatform.com/players/nPripu9l-ALJ3XQCI.js',
'md5': 'fa8899fa601eb7c83a64e9d568bdf325',
@ -37,9 +37,6 @@ class JWPlatformIE(InfoExtractor):
webpage)
if ret:
return ret
mobj = re.search(r'<div\b[^>]* data-video-jw-id="([a-zA-Z0-9]{8})"', webpage)
if mobj:
return [f'jwplatform:{mobj.group(1)}']
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})

View File

@ -382,5 +382,5 @@ class KalturaIE(InfoExtractor):
'duration': info.get('duration'),
'timestamp': info.get('createdAt'),
'uploader_id': format_field(info, 'userId', ignore=('None', None)),
'view_count': int_or_none(info.get('plays')),
'view_count': info.get('plays'),
}

View File

@ -68,7 +68,7 @@ class KeezMoviesIE(InfoExtractor):
video_url, title, 32).decode('utf-8')
formats.append({
'url': format_url,
'format_id': format_field(height, None, '%dp'),
'format_id': format_field(height, template='%dp'),
'height': height,
'tbr': tbr,
})

View File

@ -1,55 +0,0 @@
from .common import InfoExtractor
from .dailymotion import DailymotionIE
class KickerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)kicker\.(?:de)/(?P<id>[\w-]+)/video'
_TESTS = [{
'url': 'https://www.kicker.de/pogba-dembel-co-die-top-11-der-abloesefreien-spieler-905049/video',
'info_dict': {
'id': 'km04mrK0DrRAVxy2GcA',
'title': 'md5:b91d145bac5745ac58d5479d8347a875',
'ext': 'mp4',
'duration': 350,
'description': 'md5:a5a3dd77dbb6550dbfb997be100b9998',
'uploader_id': 'x2dfupo',
'timestamp': 1654677626,
'like_count': int,
'uploader': 'kicker.de',
'view_count': int,
'age_limit': 0,
'thumbnail': r're:https://s\d+\.dmcdn\.net/v/T-x741YeYAx8aSZ0Z/x1080',
'tags': ['published', 'category.InternationalSoccer'],
'upload_date': '20220608'
}
}, {
'url': 'https://www.kicker.de/ex-unioner-in-der-bezirksliga-felix-kroos-vereinschallenge-in-pankow-902825/video',
'info_dict': {
'id': 'k2omNsJKdZ3TxwxYSFJ',
'title': 'md5:72ec24d7f84b8436fe1e89d198152adf',
'ext': 'mp4',
'uploader_id': 'x2dfupo',
'duration': 331,
'timestamp': 1652966015,
'thumbnail': r're:https?://s\d+\.dmcdn\.net/v/TxU4Z1YYCmtisTbMq/x1080',
'tags': ['FELIX KROOS', 'EINFACH MAL LUPPEN', 'KROOS', 'FSV FORTUNA PANKOW', 'published', 'category.Amateurs', 'marketingpreset.Spreekick'],
'age_limit': 0,
'view_count': int,
'upload_date': '20220519',
'uploader': 'kicker.de',
'description': 'md5:0c2060c899a91c8bf40f578f78c5846f',
'like_count': int,
}
}]
def _real_extract(self, url):
video_slug = self._match_id(url)
webpage = self._download_webpage(url, video_slug)
dailymotion_video_id = self._search_regex(
r'data-dmprivateid\s*=\s*[\'"](?P<video_id>\w+)', webpage,
'video id', group='video_id')
return self.url_result(
f'https://www.dailymotion.com/video/{dailymotion_video_id}',
ie=DailymotionIE, video_title=self._html_extract_title(webpage))

View File

@ -1,28 +0,0 @@
from .common import InfoExtractor
from ..utils import smuggle_url
class KTHIE(InfoExtractor):
_VALID_URL = r'https?://play\.kth\.se/(?:[^/]+/)+(?P<id>[a-z0-9_]+)'
_TEST = {
'url': 'https://play.kth.se/media/Lunch+breakA+De+nya+aff%C3%A4rerna+inom+Fordonsdalen/0_uoop6oz9',
'md5': 'd83ada6d00ca98b73243a88efe19e8a6',
'info_dict': {
'id': '0_uoop6oz9',
'ext': 'mp4',
'title': 'md5:bd1d6931facb6828762a33e6ce865f37',
'thumbnail': 're:https?://.+/thumbnail/.+',
'duration': 3516,
'timestamp': 1647345358,
'upload_date': '20220315',
'uploader_id': 'md5:0ec23e33a89e795a4512930c8102509f',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
result = self.url_result(
smuggle_url('kaltura:308:%s' % video_id, {
'service_url': 'https://api.kaltura.nordu.net'}),
'Kaltura')
return result

View File

@ -15,7 +15,7 @@ class LastFMPlaylistBaseIE(InfoExtractor):
for page_number in range(start_page_number, (last_page_number or start_page_number) + 1):
webpage = self._download_webpage(
url, playlist_id,
note='Downloading page %d%s' % (page_number, format_field(last_page_number, None, ' of %d')),
note='Downloading page %d%s' % (page_number, format_field(last_page_number, template=' of %d')),
query={'page': page_number})
page_entries = [
self.url_result(player_url, 'Youtube')

View File

@ -192,11 +192,10 @@ class LBRYIE(LBRYBaseIE):
claim_id, is_live = result['signing_channel']['claim_id'], True
headers = {'referer': 'https://player.odysee.live/'}
live_data = self._download_json(
'https://api.odysee.live/livestream/is_live', claim_id,
query={'channel_claim_id': claim_id},
f'https://api.live.odysee.com/v1/odysee/live/{claim_id}', claim_id,
note='Downloading livestream JSON metadata')['data']
streaming_url = final_url = live_data.get('VideoURL')
if not final_url and not live_data.get('Live'):
streaming_url = final_url = live_data.get('url')
if not final_url and not live_data.get('live'):
self.raise_no_formats('This stream is not live', True, claim_id)
else:
raise UnsupportedError(url)

View File

@ -34,7 +34,7 @@ class LineLiveBaseIE(InfoExtractor):
'timestamp': int_or_none(item.get('createdAt')),
'channel': channel.get('name'),
'channel_id': channel_id,
'channel_url': format_field(channel_id, None, 'https://live.line.me/channels/%s'),
'channel_url': format_field(channel_id, template='https://live.line.me/channels/%s'),
'duration': int_or_none(item.get('archiveDuration')),
'view_count': int_or_none(item.get('viewerCount')),
'comment_count': int_or_none(item.get('chatCount')),

View File

@ -116,7 +116,7 @@ class MedalTVIE(InfoExtractor):
author = try_get(
hydration_data, lambda x: list(x['profiles'].values())[0], dict) or {}
author_id = str_or_none(author.get('id'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
author_url = format_field(author_id, template='https://medal.tv/users/%s')
return {
'id': video_id,

View File

@ -20,7 +20,7 @@ class MediasetIE(ThePlatformBaseIE):
(?:
mediaset:|
https?://
(?:\w+\.)+mediaset\.it/
(?:(?:www|static3)\.)?mediasetplay\.mediaset\.it/
(?:
(?:video|on-demand|movie)/(?:[^/]+/)+[^/]+_|
player/index\.html\?.*?\bprogramGuid=
@ -159,9 +159,6 @@ class MediasetIE(ThePlatformBaseIE):
}, {
'url': 'https://www.mediasetplay.mediaset.it/movie/herculeslaleggendahainizio/hercules-la-leggenda-ha-inizio_F305927501000102',
'only_matching': True,
}, {
'url': 'https://mediasetinfinity.mediaset.it/video/braveandbeautiful/episodio-113_F310948005000402',
'only_matching': True,
}]
@staticmethod
@ -289,7 +286,7 @@ class MediasetShowIE(MediasetIE):
_VALID_URL = r'''(?x)
(?:
https?://
(\w+\.)+mediaset\.it/
(?:(?:www|static3)\.)?mediasetplay\.mediaset\.it/
(?:
(?:fiction|programmi-tv|serie-tv|kids)/(?:.+?/)?
(?:[a-z-]+)_SE(?P<id>\d{12})

Some files were not shown because too many files have changed in this diff Show More