Script to automate building an adblocking hosts file
Originally posted by SteveRiley
... Over the long weekend I plan to make this script smarter. I hope to figure out a way to incorporate it into VMs, too... if you're using Windows on VirtualBox, say, then when you update your host's hosts (haha), you can pull that into your VM's hosts file, too. ...
Did you ever get around to making that script smarter?
If so, I'd like to mooch it!
GG
-
I've already discovered a minor, inconvenient side effect... if you receive marketing email and want to click the opt-out link, many of the hostnames in those URLs are included in the block lists. You'll have to temporarily disable the blocking hosts file (sudo mv, then sudo mv back) for the opt-out to work.
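A rough sketch of that temporary switch, for anyone who wants it as two commands. Instead of the mv pair, it reuses the two cp commands that the script itself prints (restore ~/hosts-system, then put ~/hosts-block back). The function names adblock_off and adblock_on and the override variables HOSTS_TARGET, HOSTS_SYSTEM, HOSTS_BLOCK and SUDO are made up for this example; only the default paths come from the script in this thread.

```shell
#!/bin/bash
# Sketch only: toggle /etc/hosts between the saved original and the
# ad-blocking version. The variable and function names are hypothetical;
# the default paths are the ones used by the gethosts script.
HOSTS_TARGET="${HOSTS_TARGET:-/etc/hosts}"
HOSTS_SYSTEM="${HOSTS_SYSTEM:-$HOME/hosts-system}"
HOSTS_BLOCK="${HOSTS_BLOCK:-$HOME/hosts-block}"
SUDO="${SUDO-sudo}"   # set SUDO="" to experiment on scratch files

# Put the original hosts file in place (blocking off).
adblock_off() {
    $SUDO cp "$HOSTS_SYSTEM" "$HOSTS_TARGET"
}

# Put the ad-blocking hosts file back (blocking on).
adblock_on() {
    $SUDO cp "$HOSTS_BLOCK" "$HOSTS_TARGET"
}
```

With these functions loaded in your shell, you would run adblock_off, click the opt-out link, then run adblock_on.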
Over the long weekend I plan to make this script smarter. I hope to figure out a way to incorporate it into VMs, too... if you're using Windows on VirtualBox, say, then when you update your host's hosts (haha), you can pull that into your VM's hosts file, too.
I will likely start a new thread and move this over there, to improve discoverability.
-
Wow -- verrrrry cool, Steve! I wish I could script like that.
I might add that chromium-browser has the Ghostery plugin, which does a nice job of blocking trackers (and lets you see what it blocked).
-
Originally posted by SteveRiley
After comparing the performance of browser-based ad blockers to custom-crafted hosts files, I've concluded that the latter is better. I've found four reasonably updated sources -- the winhelp one is the largest and probably most familiar, but it seems to be updated less frequently than some of the others.

The only advertisements that sneak through, and that the browser ad blocker used to catch, are the occasional Flash-based ones. I'm looking for ways to block those, but at this point I like the performance of the hosts file.
-
After comparing the performance of browser-based ad blockers to custom-crafted hosts files, I've concluded that the latter is better. I've found four reasonably updated sources -- the winhelp one is the largest and probably most familiar, but it seems to be updated less frequently than some of the others.
So I've spent the last couple hours teaching myself bash scripts and especially the handy little sed utility. I've built a script that downloads the files, cleans out all their comments, de-duplicates entries, and merges the result with your system's original hosts file.
Code:
#!/bin/bash

# If this is our first run, save a copy of the system's original hosts file and set to read-only for safety
if [ ! -f ~/hosts-system ]
then
    echo "Saving copy of system's original hosts file..."
    cp /etc/hosts ~/hosts-system
    chmod 444 ~/hosts-system
fi

# Perform work in temporary files
temphosts1=$(mktemp)
temphosts2=$(mktemp)

# Obtain various hosts files and merge into one
echo "Downloading ad-blocking hosts files..."
wget -nv -O - http://winhelp2002.mvps.org/hosts.txt >> "$temphosts1"
wget -nv -O - http://hosts-file.net/ad_servers.asp >> "$temphosts1"
wget -nv -O - http://someonewhocares.org/hosts/hosts >> "$temphosts1"
wget -nv -O - "http://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext" >> "$temphosts1"

# Do some work on the file:
# 1. Remove MS-DOS carriage returns
# 2. Delete all lines that don't begin with 127.0.0.1
# 3. Delete any lines containing the word localhost because we'll obtain that from the original hosts file
# 4. Replace 127.0.0.1 with 0.0.0.0 because then we don't have to wait for the resolver to fail
# 5. Scrunch extraneous spaces separating address from name into a single tab
# 6. Delete any comments on lines
# 7. Clean up leftover trailing blanks
# Pass all this through sort with the unique flag to remove duplicates and save the result
echo "Parsing, cleaning, de-duplicating, sorting..."
sed -e 's/\r//' -e '/^127.0.0.1/!d' -e '/localhost/d' -e 's/127.0.0.1/0.0.0.0/' -e 's/ \+/\t/' -e 's/#.*$//' -e 's/[ \t]*$//' < "$temphosts1" | sort -u > "$temphosts2"

# Combine system hosts with adblocks
echo Merging with original system hosts...
echo -e "\n# Ad blocking hosts generated "$(date) | cat ~/hosts-system - "$temphosts2" > ~/hosts-block

# Clean up temp files and remind user to copy new file
echo "Cleaning up..."
rm "$temphosts1" "$temphosts2"
echo "Done."
echo
echo "Copy ad-blocking hosts file with this command:"
echo " sudo cp ~/hosts-block /etc/hosts"
echo
echo "You can always restore your original hosts file with this command:"
echo " sudo cp ~/hosts-system /etc/hosts"
echo "so don't delete that file! (It's saved read-only for your protection.)"
echo
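To see what the cleaning step does without downloading anything, here is a small sketch that runs the same sed expressions over a few made-up lines. The function names clean_hosts and sample_hosts and the example.com / example.org / example.net hostnames are placeholders invented for this illustration, and the \t escapes assume GNU sed, as on a typical Linux install.

```shell
#!/bin/bash
# Same sed expressions as the script, wrapped in a function (hypothetical
# name) so they can be tried on sample input read from stdin.
clean_hosts() {
    sed -e 's/\r//' \
        -e '/^127.0.0.1/!d' \
        -e '/localhost/d' \
        -e 's/127.0.0.1/0.0.0.0/' \
        -e 's/ \+/\t/' \
        -e 's/#.*$//' \
        -e 's/[ \t]*$//' | LC_ALL=C sort -u
}

# Made-up input covering each case the script's comments list.
sample_hosts() {
    printf '# a full-line comment\n'                                 # dropped: not 127.0.0.1
    printf '127.0.0.1  localhost\n'                                  # dropped: localhost
    printf '127.0.0.1   ads.example.com   # trailing comment\r\n'    # CR, spaces, comment
    printf '127.0.0.1 ads.example.com\n'                             # duplicate after cleaning
    printf '127.0.0.1 banner.example.org\n'
    printf '0.0.0.0 other.example.net\n'                             # dropped: not 127.0.0.1
}

sample_hosts | clean_hosts
```

Of the six input lines, two survive: ads.example.com and banner.example.org, each rewritten to 0.0.0.0, a single tab, and the hostname. Note that the last sample line is discarded even though it is already in the target form, because the filter keeps only lines that begin with 127.0.0.1.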
Save the script as ~/gethosts and make it executable:
Code:
chmod +x ~/gethosts
Then run it:
Code:
~/gethosts
The script writes its output to the file ~/hosts-block. Each time you run it, you'll need to manually replace your existing hosts file with this command:
Code:
sudo cp ~/hosts-block /etc/hosts
Minor addition
If you want to slightly reduce the number of keystrokes required to run the utility, you can create a bin subdirectory in your home folder. When you log in, if the directory ~/bin exists, it is automatically added to your $PATH (this comes from the default ~/.profile on Ubuntu-based systems, so log out and back in after creating the directory). Now place the script in this subdirectory. Then you can simply run
Code:
gethosts
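If your setup does not pick up ~/bin by itself, the check is easy to reproduce. The sketch below is modeled on what the default Ubuntu ~/.profile does, with an extra guard against adding the directory twice; the function name add_home_bin_to_path is made up for this example.

```shell
#!/bin/bash
# Sketch: prepend ~/bin to PATH if the directory exists and is not
# already listed. add_home_bin_to_path is a hypothetical helper name.
add_home_bin_to_path() {
    if [ -d "$HOME/bin" ]; then
        case ":$PATH:" in
            *":$HOME/bin:"*) ;;                 # already present, nothing to do
            *) PATH="$HOME/bin:$PATH" ;;        # put ~/bin first
        esac
    fi
}
```

You could paste the function into ~/.profile followed by a call to it, then create the directory with mkdir -p ~/bin and move the script there with mv ~/gethosts ~/bin/.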
(Thanks to SecretCode for the idea!)
Last edited by SteveRiley; Nov 14, 2013, 03:33 AM.