Early Saturday morning I decided to start cleaning up files on our media centre. A while back I took inventory of all the movies on our media centre and entered them into a simple three-field online database that included the movie title, the format of the movie (DVD/Blu-ray/VHS/Digital/Other), and the date of the movie. Over the past few months I've picked up a lot of movies that didn't make it into the online database. I use that database to make sure I don't already have a movie when I'm out looking for new ones.
The first step was to take inventory of all the filenames of the movies on the system. All movies are stored under the directory /mnt/media/Movies. That directory is subdivided into two directories, /Movies/DVDs and /Movies/Blu-rays. Those directories contain both files and folders, all of which are archives of our DVDs and Blu-ray discs.
In both folders I have a directory called 111. When I rip movies with MakeMKV they go into the appropriate 111 folder. For Blu-ray discs this would be /Movies/Blu-rays/111 and for DVDs it would be /Movies/DVDs/111. MakeMKV generates large files. I use the command-line version of HandBrake, handbrake-cli, to compress those large files in the 111 folder. I use a short script that compresses every file in the folder, so I don't have to compress files individually; I just run the script and leave it until all the files are compressed. This process can take a long time, even with an i7-2600 CPU doing all the processing, particularly when compressing a few months' worth of Blu-ray pick-ups.
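A batch script for this kind of job can be quite short. The sketch below is only an illustration, not the exact script described above. It assumes the encoder binary is named HandBrakeCLI (as in upstream HandBrake builds), that the rips are .mkv files, and that compressed copies go to a separate output folder. The function name, the output folder, and the "Fast 1080p30" preset are all placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: batch-compress every .mkv in a rip folder.
# Assumptions (not from the original setup): the output folder,
# the "Fast 1080p30" preset, and the HandBrakeCLI binary name.

compress_all() {
    local src_dir="$1"                 # e.g. /mnt/media/Movies/Blu-rays/111
    local out_dir="$2"                 # hypothetical folder for compressed copies
    local encoder="${3:-HandBrakeCLI}" # overridable, handy for a dry run
    local f name

    mkdir -p "$out_dir"
    for f in "$src_dir"/*.mkv; do
        # If nothing matches, the glob stays literal; skip it.
        if [ ! -e "$f" ]; then continue; fi
        name="$(basename "$f")"
        # Skip anything already compressed on an earlier run.
        if [ -e "$out_dir/$name" ]; then continue; fi
        "$encoder" -i "$f" -o "$out_dir/$name" --preset "Fast 1080p30"
    done
}
```

Calling something like compress_all /mnt/media/Movies/Blu-rays/111 /mnt/media/Movies/Blu-rays/compressed would then work through the whole folder unattended. Because finished files are skipped, an interrupted run can simply be started again.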
To list all the files in the Movies folder I changed into that folder and ran a variation of:
ls -R1 -I "*.jpg" -I "*.srt" > ~/Movies-$(date +%F).txt
This ls command recursively (-R) lists all the files and directories in a single column (-1), without all the extra data. The -I "*.jpg" ignores all the files that end in the .jpg extension. The -I switch comes in handy because you can exclude images and subtitles (*.srt). The last part of the command, >, redirects the output to a file named with the current date in year-month-day order, for example Movies-2019-08-12.txt for August 12th. I then downloaded that file from our media centre to my workstation using FileZilla. At first I opened the file on another system using the Mousepad text editor and started printing the list. I found that when you open the list with a text editor and print it, many of the file names get cut off at the bottom of each page (likely because the text editor assumes legal rather than letter-size paper). When opened and printed with LibreOffice Writer, the list printed fine.
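A different way to get a similar list, and not the one used here, is find. It prints one relative path per line with no per-directory headers or blank lines, which can be easier to sort or compare later. The function name below is made up for the example.

```shell
# Sketch: list movie files under a root folder, one relative path per line,
# skipping cover images (*.jpg) and subtitles (*.srt).
list_movie_files() {
    local root="$1"
    # The subshell keeps the cd from affecting the caller.
    ( cd "$root" && find . -type f ! -name '*.jpg' ! -name '*.srt' | LC_ALL=C sort )
}
```

For example, list_movie_files /mnt/media/Movies > ~/Movies-$(date +%F).txt would write the list to a dated file. One difference from ls: find -type f lists only files, so a movie archived as a folder shows up as the files inside that folder rather than as a single entry.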
The next issue I ran into was the pagination of my online list. My online list shows 100 movies at a time, which makes it difficult to find particular movies, since some are listed slightly differently in the movie list than in the file name. I'm using Drupal on the site that hosts the list, so it was a simple matter of copying the "view" I was using and creating a new "view" that eliminated the pager. With this long list I was able to select all the movies and movie data, then paste that list into a LibreOffice Calc spreadsheet. Why not Microsoft Office? Lately I've found MS Office a real pain for simple tasks that LibreOffice just does as you'd expect. That and the whole software freedom argument are strong points for LibreOffice.
With the online list in LibreOffice Calc, I sorted the data by the second field (the format) and then started highlighting the files in my printed list that do not appear in the spreadsheet.
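A first pass at the same comparison could be done on the command line instead of on paper. The sketch below is an assumption-heavy illustration, not what I did: it assumes the title column from the spreadsheet has been saved on its own as a plain text file, one title per line, and that the file listing is also one name per line. The function names and the list of video extensions are made up for the example.

```shell
# Sketch: print titles that are in the file listing but not in the database list.
# Both inputs are plain text, one title per line.

normalize_titles() {
    # Lowercase, drop blank lines and the "dir:" headers that ls -R prints,
    # strip leading directories and a few common video extensions.
    tr '[:upper:]' '[:lower:]' |
        sed -E -e '/^$/d' -e '/:$/d' -e 's#.*/##' -e 's/\.(mkv|mp4|m4v|avi)$//' |
        LC_ALL=C sort -u
}

missing_from_database() {
    local files_list="$1" db_list="$2"
    local a b
    a="$(mktemp)"
    b="$(mktemp)"
    normalize_titles < "$files_list" > "$a"
    normalize_titles < "$db_list" > "$b"
    # comm -23 keeps only the lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Running missing_from_database on the dated listing and the exported title list would print the candidates to add. It only catches exact matches after normalizing, so titles that are worded differently in the database, and folder names such as 111, still show up and have to be checked by hand.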
It's tedious manual labour, and the process reminded me that it's important to get newly purchased titles into the online database before I make a video about them. This leaves me with a decision: do I put them into the online database before I rip/archive them, or after? The advantage of putting them in the list immediately is that I won't forget. The disadvantage is that if a movie doesn't rip, I then have to remove it from the list. (Will I remember?)
I have a small stack of movies that haven't ripped, either because of flaws in the disc or because the disc is simply dirty. There are a couple of movies in my collection (one of which is the first Guardians of the Galaxy) that simply won't rip correctly, no matter how much I scrub the disc with isopropyl alcohol. Another thing to look into.
I've heard of people using apps to scan and catalog movies. This might be a future stage.
The next step after getting all the movies into the online database is to figure out why approximately 8-10 movies are not scraping correctly in Kodi (usually a naming issue).