1vuio0pswjnm7 3 days ago

"If you dive into the yt-dlp source code, you see the insane complexity of calculations needed to download a video. "

Indeed the complexity is insane

https://news.ycombinator.com/item?id=45256043

But what is meant by "a video"? Is this referring to the common case or an edge/corner case? Does "a" mean one particular video or all videos?

"There is code to handle nsig checks, internal YouTube API quirks, and constant obfuscation that makes it a nightmare(and the maintainers heroes) to keep up."

True, but is this code required for all YouTube videos?

The majority of YT videos are non-commercial and unpromoted, with low view counts. These are simple to download

For example, the current yt-dlp project contains approximately 218 YT IDs. A 2024 version contained approximately 201 YT IDs. These are often for testing edge cases
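One way such a count might be approximated is sketched below. This is not necessarily how the figures above were obtained. It assumes the IDs appear in watch?v= URLs in the source tree and that grep supports -o (GNU, BSD and busybox grep do). The function name list_yt_ids is hypothetical

```shell
#!/bin/sh
# Hypothetical helper: print the unique 11-character video IDs found in
# watch?v= URLs on stdin, one per line.
# Assumption: IDs are only counted when they follow "watch?v=".
list_yt_ids() {
    grep -oE 'watch\?v=[A-Za-z0-9_-]{11}' | cut -d= -f2 | sort -u
}
```

For example, from the root of a local checkout (the path is an assumption): find . -name '*.py' -exec cat {} + | list_yt_ids | wc -l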

The example 1,525-character shell script below outputs download URLs for almost all the YT IDs found in yt-dlp. No Python needed

By comparison the yt-dlp project is 15,679,182 characters, approximately

The curl binary is used in the example only because it's popular, not because I use it. I use simpler, more flexible software than curl

I have been using a tiny shell script to download YT videos for over 15 years. I have been downloading videos from googlevideo.com for even longer, before Google acquired YouTube.^1 Surprisingly (or not), when YT changes something that requires updating the script (and this has only happened to me about five times or fewer in 15 years) I have generally been able to fix the shell script faster than yt-dl(p) fixes its Python program (same for NewPipe/NewPipeSB)

I prefer non-commercial videos that are not promoted. The ones with relatively low view counts. For more popular videos, I listen to the audio file first before downloading the video file. After listening to the audio, I may decide to skip the video. Also I am not overly concerned about throttling

1. The original Google Video made a distinction between commercial and non-commercial (free) videos. The latter were always easy to download, and no sign-in/log-in was required. This might be a more plausible theory why YT has always allowed downloads for non-commercial videos

   # custom C filters to make scripts faster, easier to write
   # yy030 filters URLs from stdin
   # yy082 filters various strings from stdin, 
   # e.g., f == print format descriptions, v == print YT IDs
   # x is a YouTube ID
   # script accepts YT ID on stdin
   
   #!/bin/sh
   read x;
   y=https://www.youtube.com/youtubei/v1/player?prettyPrint=false 
   curl -K/dev/stdin $y <<eof|yy030|if test $# -gt 0;then egrep itag=$1;else yy082 f|uniq;fi;
   silent
   #verbose
   ipv4
   http1.0
   tlsv1.3
   tcp-nodelay
   resolve www.youtube.com:443:142.251.215.238 
   user-agent "com.google.ios.youtube/19.45.4 (iPhone16,2; U; CPU iOS 18_1_0 like Mac OS X;)"
   header "content-type: application/json"
   header "X-Youtube-Client-Name: 5"
   header "X-Youtube-Client-Version: 19.45.4"
   header "X-Goog-Visitor-Id: CgtpN1NtNlFnajBsRSjy1bjGBjIKCgJVUxIEGgAgIw=="
   cookie "PREF=hl=en&tz=UTC; SOCS=CAI; GPS=1; YSC=4sueFctSML0; __Secure-ROLLOUT_TOKEN=CJO64Zqggdaw7gEQiZW-9r3mjwMYiZW-9r3mjwM%=; VISITOR_INFO1_LIVE=i7Sm6Qgj0lE; VISITOR_PRIVACY_METADATA=CgJVUxIEGgAgIw=="
   data "{\"context\": {\"client\": {\"clientName\": \"IOS\", \"clientVersion\": \"19.45.4\", \"deviceMake\": \"Apple\", \"deviceModel\": \"iPhone16,2\", \"userAgent\": \"com.google.ios.youtube/19.45.4 (iPhone16,2; U; CPU iOS 18_1_0 like Mac OS X;)\", \"osName\": \"iPhone\", \"osVersion\": \"18.1.0.22B83\", \"hl\": \"en\", \"timeZone\": \"UTC\", \"utcOffsetMinutes\": 0}}, \"videoId\": \"$x\", \"playbackContext\": {\"contentPlaybackContext\": {\"html5Preference\": \"HTML5_PREF_WANTS\", \"signatureTimestamp\": 20347}}, \"contentCheckOk\": true, \"racyCheckOk\": true}"
   eof
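The yy030 and yy082 filters are custom C programs and are not shown here. As a rough stand-in using only standard tools, the URL extraction and itag selection might be sketched as below. The function names extract_urls and filter_itag are hypothetical, and the sketch assumes the URLs appear in the JSON response as plain double-quoted strings with unescaped "&". It also assumes a grep that supports -o

```shell
#!/bin/sh
# Hypothetical stand-in for yy030: print every http(s) URL found on stdin,
# one per line. A URL is taken to end at a double quote or whitespace.
extract_urls() {
    grep -oE 'https?://[^"[:space:]]+'
}

# Hypothetical helper: keep only URLs whose query string carries exactly
# the given itag. Unlike a bare "itag=$1" match, 18 does not match 180.
filter_itag() {
    grep -E "[?&]itag=$1(&|\$)"
}
```

With these, the tail of the pipeline could read: ... | extract_urls | filter_itag 18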