Category: Home Lab

Downloading audio from Youtube using youtube-dl

I saw Mike over at Initial Charge link to Listenbox Turns YouTube Channels Into Podcasts. This is a neat idea and something I’ve been doing sparingly with longer-form Youtube content that is mainly audio.

However, I don’t need to pay a subscription for another thing in my life I’ll use on occasion. So I tweaked my Youtube-dl setup to pull audio from Youtube videos instead of the video itself.

First, I have to thank Jason for his youtube-dl setup which I replicated in a container on my server at home. He’s updated his configuration to support M1 Macs and some fancy iPhone shortcuts.

I have mine running on Ubuntu sitting in a container in proxmox in my office.

I’ve created a small shell script called music.sh. The contents are below, and I’ll explain each part.

# YTDL-Music Playlist - Audio Only - Goes to Music folder
/usr/local/bin/youtube-dl \
--extract-audio  --audio-format mp3  --embed-thumbnail --ignore-errors \
-o "/mnt/Youtube/Music/%(title)s.%(ext)s" \
--download-archive /var/www/video/music.txt \
"https://www.youtube.com/playlist?list=PL9C818BCBDA822A24"

The first line is a comment reminding me of what this is and does.

/usr/local/bin/youtube-dl – The location where I have youtube-dl installed.

--extract-audio – Downloads only the audio and discards the video.

--audio-format mp3 – Sets the format to MP3 for compatibility.

--embed-thumbnail – Embeds the thumbnail from the Youtube video. Sometimes useful, sometimes not. But it’s nice to have some kind of artwork on the file.

--ignore-errors – Skips past errors, like a video that has been removed or can’t be downloaded for some reason.

-o "/mnt/Youtube/Music/%(title)s.%(ext)s" – The -o stands for output and tells youtube-dl where to save the file. In my case, I have it saved to my NAS mounted through NFS at /mnt/Youtube/Music.

The title and extension are pulled right from an example output. I don’t know what the lowercase s means. I just used it because the examples showed it.
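For what it’s worth, youtube-dl’s output templates follow Python’s string formatting, where the trailing s just means “insert this field as a string”. Shell’s printf has the same idea with %s, so a rough analogy (the function name and sample values are made up for illustration) looks like this:

```shell
# Hypothetical helper: mimics how "%(title)s.%(ext)s" gets filled in.
render_name() {
    # $1 = title, $2 = extension; each %s is replaced by one string argument
    printf '%s.%s\n' "$1" "$2"
}
```

Calling render_name "Some Song" mp3 prints Some Song.mp3, which is the same shape of filename youtube-dl writes into the Music folder.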

--download-archive /var/www/video/music.txt – I write a log of everything I download so if I delete it after watching it, it doesn’t get pulled back down every single time. The file logs the youtube ID as opposed to the title or something human-readable but it’s better than nothing.
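The archive is plain text with one line per download: the extractor name followed by the video ID. A minimal sketch for checking whether a given ID has already been logged, assuming that format (the function name and any ID you pass in are placeholders):

```shell
# Hypothetical helper: succeeds if the video ID is already in the archive.
already_downloaded() {
    video_id=$1
    archive=$2
    # -x matches the whole line, -F treats the text literally, -q stays quiet
    grep -qxF "youtube $video_id" "$archive"
}
```

Something like already_downloaded SOME_VIDEO_ID /var/www/video/music.txt && echo "already have it" would do the lookup. Deleting that line from the file should let the video download again on the next run.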

Finally, the URL links to the Youtube playlist I want to download from. This makes triggering the downloads as simple as adding a video to that playlist and letting youtube-dl do the rest.

How does youtube-dl know to check the list for new videos? I have a cron job set to run the script every 10 minutes. So anything newly added to the playlist will get downloaded.

*/10 * * * * sh /var/www/video/music.sh
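One thing to watch with a 10 minute schedule: a long playlist could still be downloading when cron fires again. A small sketch of a guard against overlapping runs, assuming a lock directory under /tmp (LOCK_DIR and run_once are names invented for this example, not part of youtube-dl or cron):

```shell
# Hypothetical lock location; override LOCK_DIR to put it elsewhere.
LOCK_DIR="${LOCK_DIR:-/tmp/music-sh.lock}"

run_once() {
    # mkdir is atomic: it fails if the directory already exists,
    # which here means an earlier run has not finished yet.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous run still active, skipping"
        return 0
    fi
    # Run the given command, then release the lock whatever the result.
    "$@"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}
```

A wrapper script could end with run_once sh /var/www/video/music.sh, with cron pointed at the wrapper instead. If a run is killed midway the lock directory is left behind and has to be removed by hand, which is the price of keeping the sketch this small.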

Once I download the audio, it gets picked up by Plex. Plex is set to watch that folder and add new content it finds to the library as audio. So I can download music, podcasts uploaded to Youtube, or technical shows where the content is largely audio.

It works well, other than the hiccups any self-hosted setup can run into, and it costs me nothing additional since I’ve already paid for the computer and pay for the electricity and internet at home anyway.

Did you try turning it off and on again?

Today has been a good lesson in complexity. I self-host a number of things. I really enjoy having things running at home that I can play with, without paying a monthly fee for something that sits idle when I lose time or interest in it.

But tonight, I was reminded how irritating it can be. We had a series of severe thunderstorms and lost power for a moment. And I mean the blink of an eye. My NAS and desktop computer rebooted. The laptops were (obviously) OK, but not even the monitors flickered, nor did the router notice anything.

But it was enough to knock the NAS and Proxmox server out of whack. I tried to pull up something to listen to on Plex and it dutifully told me there was no media, since the media lives on the NAS and the multimedia NFS share was currently unavailable.

So I rebooted the NAS and the proxmox host since I didn’t feel like getting into a troubleshooting session tonight. And that didn’t work.

So I went looking for how to simply reconnect the NFS share I was sure was available from the NAS. And… got lost in a hole of Proxmox forums and people talking around each other’s questions.

I got as far as being able to see which shares the proxmox server knew about, but not how to actually get them reconnected. This is something I would still like to know. I’m sure there’s a pvesm incantation I can chant to make it all work. But I’ve not been able to find it.
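A possible starting point for next time, offered as a sketch rather than a confirmed fix: check whether the path is actually a live mount before rebooting anything. The function name is invented for this example, and the remount step in the note below assumes the share is listed in /etc/fstab, which may not match this setup.

```shell
# Hypothetical helper: reports whether a path is currently a mount point.
check_share() {
    # mountpoint -q exits 0 only when the path is an active mount point
    if mountpoint -q "$1"; then
        echo "mounted: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

Something like check_share /mnt/Youtube/Music || mount -a -t nfs would then try to remount every NFS entry from /etc/fstab when the share is missing. On the Proxmox host itself, pvesm status lists each configured storage and whether it is active, which at least narrows down where the problem sits.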

Eventually, after the third (fourth?) reboot, my multimedia, backup and other shares came back online and all is well in the world. But it’s still an irritating diversion and a reminder that I don’t know as much as I think I do. I know just enough to be dangerous and then run to the hive mind when there’s trouble.

Cleaner nginx proxy configs

I’ve been slowly teaching myself self-hosting over the past few years. I’ve got a decent variety of things running at home. But I find it frustrating to keep track of their IPs and ports, so I set up a reverse proxy, which is a fancy term for one little Linux container on my network that directs you to everything else. Now I can hit plex.domain.tld or wiki.domain.tld and get to where I want to go.

I had all of these set up with their own individual config files, which were very short. Most of them were little more than:

# A small thing running on a raspberry pi

server {
    listen 80;
    server_name pi.domain.tld;

    location / {
        proxy_pass http://192.168.0.99:80;
    }
}

I knew there had to be an easier way. And tonight I found a post that made it click for me.

Timothy Quinn’s Using Nginx as a Reverse Proxy for Multiple Sites laid it all out and I realized all I needed to do was combine all of my little config files into one big file. I added a comment to the front of each config to remind myself what it was and anything important to remember. This worked great for the Tiddlywiki, RSS reader, Calibre-web and Bookstack instances I have running.
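The merge itself can be scripted. A minimal sketch, assuming the small configs live in one directory and end in .conf (the function name and both paths are placeholders, not the actual layout on this server):

```shell
# Hypothetical helper: merge every *.conf in a directory into one file,
# with a comment line in front of each block naming where it came from.
combine_configs() {
    src_dir=$1
    out_file=$2
    : > "$out_file"                      # start with an empty output file
    for conf in "$src_dir"/*.conf; do
        [ -e "$conf" ] || continue       # skip if the glob matched nothing
        printf '# %s\n' "$(basename "$conf" .conf)" >> "$out_file"
        cat "$conf" >> "$out_file"
        printf '\n' >> "$out_file"       # blank line between server blocks
    done
}
```

Writing the combined file outside the source directory keeps it from being swallowed into itself on the next run. Running nginx -t before reloading checks the result for syntax errors.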

His example is for sites with SSL enabled, which I’m not going to move tonight. But I replicated this for all of my sites not running SSL. My next task is to move the SSL-enabled ones. But I run Nextcloud for myself and my wife, so I need to take more time and make sure I do that right, since it’s the closest thing to “production” that I self-host. If it’s down, it affects more than just me.

Backups will save you

It’s true what they say. Backups are important!
 
Today was the perfect storm of why backups are important. Last night, I violated my own rule about not working on projects after midnight. I thought it would be a good idea to update PHP on the server where my Nextcloud installation lives. That’s the place where I keep and sync my files for work, and which I set up for my wife so she could stop paying for Dropbox.
 
I wanted to upgrade PHP so I could move to the newer versions of Nextcloud. Then, I decided to upgrade Nextcloud. So I updated PHP and Nextcloud itself. After midnight. A sure recipe for success!
 
Logging back into Nextcloud told me, “This directory is unavailable, please check the logs or contact the administrator.” Well, I am the administrator so that option’s out. I asked him. He’s clueless. So I went looking at the logs and they were full of errors I didn’t understand. Not enough to craft the search term that might lead to help. After a brief trip through GitHub issues and forum posts, I gave up. I had to roll back the server to the last backup.
 
The latest one was from two nights ago. So I started the restore and went to sleep.
 
The next morning, I checked on Proxmox and after about 5 hours, the data restore completed. I took a deep breath and logged into the server.
 
No errors.
 
Files were all there.
 
Things looked good.
 
Until later that morning when my wife made sounds of distress, which I feared was my doing. Sure enough, there was a directory missing from her files. One she needed for work today. In about 30 minutes.
 
I had forgotten to tell her that morning what had happened. She was mad. She was right.
 
I took my second deep breath of the day, asked for her laptop and the name of the folder and about where it was in her folders. (There are SO. MANY. FOLDERS.)
 
I opened Time Machine and hoped the NAS downstairs had done its job. I’d had such a hard time getting the Mac mini on her desk to stay connected to the NAS to back itself up. I had set up our laptops to back up on the same day. My laptop had not complained. So I was hopeful my planning would pay off.
 
I was not disappointed. Time Machine did its job. I was able to locate and recover the directory and all its contents from a backup from yesterday evening.
 
Let my near-fatal errors be a lesson to you!
 
(I’m not sure my wife would have spared me and no jury would have convicted her.)
 
Backup your data.
 
Proxmox, where I’m running Nextcloud and Plex and some other toys, has an option to back itself up. Turn It On!
 
You know that laptop you carry around? The one lucky enough to not have had a drink spilled into it? The computer that occasionally flies off desks and sofas? Back it up.

On the Mac, it’s as easy and low tech as plugging in an external hard drive, telling Time Machine to use that drive, and walking away. Plug that drive in as often as you like and let it handle the rest.
 
 
On any platform, you can use a service like Backblaze to send your data to the cloud. But please, whatever you do, learn from my mistakes. Whether it’s a stupid thing you do, or an accident you didn’t cause, you will lose data.
 
A backup will save you.
 
And before you think you’re safe because you use Dropbox, Google Drive, or iCloud, I ask you. Do you sync those files? Sync is Not Backup. Replace Nextcloud in my story with any of the alternatives and you get to the same place. On the next sync, those files in the cloud are gone.
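To make the difference concrete, here is a toy sketch of what a backup does that sync does not: it keeps dated copies that a later deletion cannot touch. The function name and directory layout are invented for this example; real tools like Time Machine or Backblaze do far more.

```shell
# Hypothetical helper: copy a folder into a new timestamped directory.
snapshot() {
    src=$1
    dest_root=$2
    stamp=$(date +%Y-%m-%d-%H%M%S)
    mkdir -p "$dest_root/$stamp"
    # -a preserves permissions and timestamps; "/." copies the contents
    cp -a "$src/." "$dest_root/$stamp/"
    echo "$dest_root/$stamp"             # print where the copy landed
}
```

Delete a file from the source folder after running this and the copy under the dated directory is still there. With sync, that same deletion is faithfully repeated everywhere.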
 
And for the server savvy who think you’re safe because your data is in a RAID: RAID is not backup!