Wow! This is a PRIME example of how Open Source is supposed to work. Steve creates a very neat Bash script, his first, to create a specialized /etc/hosts file, and folks jump in and add mods for their special purposes. Everyone benefits! Now, suppose he had created a binary to sell as shareware? Only he could have made changes, depriving himself and others of the improvements, changes, bug fixes, etc. that other, more experienced Bash script writers could have contributed. Everyone benefits from Steve, Feathers and the other contributors.
Script to automate building an adblocking hosts file
This topic is closed.
This is a sticky topic.
-
Google Analytics for WordPress works by inserting the analytics code into the header of each page. This is the code:
Code:
<script type="text/javascript">
//<![CDATA[
// Google Analytics for WordPress by Yoast v4.3.3 | http://yoast.com/wordpress/google-analytics/
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'XXXXXXXXX']);
_gaq.push(['_setCustomVar',2,'post_type','page',3],['_setCustomVar',4,'year','2013',3],['_trackPageview']);
(function () {
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
})();
//]]>
</script>
Feathers
Last edited by Feathers McGraw; Oct 28, 2013, 12:11 PM.
-
I don't want to take over Steve's thread, so I created a new one here for the router project, so I don't clutter this one with stuff not specific to Kubuntu.
-
Nice script, but there are a few inherent weaknesses in /etc/hosts blocking. For example, it only affects hostname lookups, so anything that connects to an IP address directly, without looking up a name, is unaffected (I think this is becoming more common these days, Google Safe Browsing for example).
I'd personally prefer something like Privoxy (http://en.wikipedia.org/wiki/Privoxy) for ad-blocking/privacy, but of course there is nothing wrong with using both.
In case you're interested in suggestions for the script:
1. Since /etc/hosts is system-wide, I'd probably use something like /var/local/hostblock instead of the user's $HOME for storing the backup of hosts, and /usr/local/sbin for the script.
2. Use variables instead of hardcoded paths/filenames for easier modification.
3. You could create separate dynamic hostblock files in /var/local/hostblock which could be used to generate the hosts file, like:
hostblock.localhost (localhost hosts entries)
hostblock.static (static addresses, like static LAN addresses)
hostblock.dynamic (user-configurable dynamic addresses queried at runtime, like shortcut entries for dynamic DNS hostnames)
hostblock.block (null addresses for ad-blocking)
hostblock.blacklist (user-configurable additions to blocked hosts)
hostblock.whitelist (addresses the user wants to whitelist, removed from blocked hosts)
and possibly some configuration files:
hostblock.conf (could be used to store the variables)
hostblock.blocklists (stores the list of URLs of ad-blocking hosts files downloaded from the net)
4. Make it cronjob friendly. This could include variable times for changing hosts (entries in hostblock.dynamic could be checked every 10 minutes, hostblock.block once a week, etc.) and some error checking to make sure it still produces a valid hosts file in case /etc/hosts is modified automatically.
(All just suggestions, of course, if you prefer to keep it simple that's completely fine)
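As a rough illustration of suggestions 1 to 4, here is a minimal sketch of how the hosts file could be assembled from those pieces. The function name build_hosts, the sample host names and the "must contain localhost" check are all assumptions made for the example, not anything from Steve's script. It also assumes whitelist lines are written in the same "0.0.0.0<TAB>host" form as the block lists. hostblock.dynamic and the configuration files are left out to keep it short.

```shell
#!/bin/bash
# Sketch only: file names follow the suggestions above; build_hosts and the
# sample data are hypothetical and not part of Steve's script.
set -e
export LC_ALL=C   # sort and comm must agree on the ordering

# Paths live in variables (suggestion 2) so they are easy to change.
HOSTBLOCK_DIR="${HOSTBLOCK_DIR:-/var/local/hostblock}"

# build_hosts DIR OUTFILE: assemble OUTFILE from the hostblock.* files in DIR.
# Assumes whitelist lines use the same "0.0.0.0<TAB>host" form as the block
# lists, so whole lines can be compared.
build_hosts() {
    dir="$1"
    out="$2"
    work="$(mktemp -d)"

    sort -u "$dir/hostblock.block" "$dir/hostblock.blacklist" > "$work/blocked"
    sort -u "$dir/hostblock.whitelist" > "$work/allowed"
    # Keep only the blocked lines that are not whitelisted.
    comm -23 "$work/blocked" "$work/allowed" > "$work/final"

    cat "$dir/hostblock.localhost" "$dir/hostblock.static" "$work/final" > "$out.new"
    rm -rf "$work"

    # Error check (suggestion 4): never install a file without a localhost entry.
    if grep -q 'localhost' "$out.new"; then
        mv "$out.new" "$out"
    else
        rm -f "$out.new"
        echo "generated hosts file has no localhost entry, keeping the old file" >&2
        return 1
    fi
}

# Demonstration in a scratch directory instead of $HOSTBLOCK_DIR.
demo_dir="$(mktemp -d)"
printf '127.0.0.1\tlocalhost\n' > "$demo_dir/hostblock.localhost"
printf '192.168.1.10\tnas\n' > "$demo_dir/hostblock.static"
printf '0.0.0.0\tads.example.com\n0.0.0.0\tgood.example.org\n' > "$demo_dir/hostblock.block"
printf '0.0.0.0\ttracker.example.net\n' > "$demo_dir/hostblock.blacklist"
printf '0.0.0.0\tgood.example.org\n' > "$demo_dir/hostblock.whitelist"

build_hosts "$demo_dir" "$demo_dir/hosts"
cat "$demo_dir/hosts"
```

A real run would be something like build_hosts "$HOSTBLOCK_DIR" /etc/hosts as root, either by hand or from a cron entry. Writing to a .new file first and only then moving it into place means a failed run leaves the existing hosts file alone.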
-
Originally posted by Feathers McGraw
Can you not add an IP address to a hosts file?
-
Since, two years later, you're still interested in comment...
Feathers had a problem with truncation copying the script, I think, and it brought to mind a reaction I had originally, but dismissed at the time as just being pernickety, though now I regret not speaking up. Anyway, I would lay out the long sed command using line continuations, maybe:
Code:
sed -e 's/\r//' \
    -e '/^127.0.0.1/!d' \
    -e '/localhost/d' \
    -e 's/127.0.0.1/0.0.0.0/' \
    -e 's/ \+/\t/' \
    -e 's/#.*$//' \
    -e 's/[ \t]*$//' \
    < $temphosts1 | sort -u > $temphosts2
Code:
echo -e "\n# Ad blocking hosts generated "$(date) | cat ~/hosts-system - $temphosts2 > ~/hosts-block
Regards, John Little
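To see what each expression in that sed command does, it can be tried on a few made-up lines. This is only a sketch: the host names are invented, the temp files are created with mktemp just for the test, and it assumes GNU sed, which understands \r, \t and \+ the way the command uses them.

```shell
#!/bin/bash
# Sketch: run the sed pipeline on sample data (host names are made up).
# Assumes GNU sed; other sed implementations may not accept \r, \t or \+.
temphosts1="$(mktemp)"
temphosts2="$(mktemp)"

# Sample download with DOS line endings, a comment line, a localhost
# entry, a trailing comment and a duplicate.
printf '%s\r\n' \
    '# Example block list' \
    '127.0.0.1  localhost' \
    '127.0.0.1  ads.example.com  # banner ads' \
    '127.0.0.1 tracker.example.net' \
    '127.0.0.1  ads.example.com' \
    > "$temphosts1"

# In order: strip the carriage return, drop lines that do not start with
# 127.0.0.1, drop localhost entries, switch to 0.0.0.0, turn the first run
# of spaces into a tab, strip trailing comments, strip trailing whitespace.
sed -e 's/\r//' \
    -e '/^127.0.0.1/!d' \
    -e '/localhost/d' \
    -e 's/127.0.0.1/0.0.0.0/' \
    -e 's/ \+/\t/' \
    -e 's/#.*$//' \
    -e 's/[ \t]*$//' \
    < "$temphosts1" | sort -u > "$temphosts2"

cat "$temphosts2"
```

With GNU sed this should leave two tab-separated lines, one for ads.example.com and one for tracker.example.net, with the duplicate removed by sort -u.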
-
Originally posted by Feathers McGraw
If I've done anything embarrassingly inefficient, let me know lol.
Code:
comm -23 $temphosts2 whitelist > $temphosts3
Regards, John Little
-
Originally posted by jlittle
Right you are then, the comm command does this stuff, only it needs sorted input, so would have to be applied before Steve's merge step, and the whitelist would need to be sorted:
Code:
comm -23 $temphosts2 whitelist > $temphosts3
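The sorted-input requirement is easy to trip over, so here is a small sketch of the whitelist step with an unsorted whitelist. The file contents and the whitelist_sorted name are made up for illustration, and it assumes the whitelist holds complete lines in the same form as the block list, since comm compares whole lines.

```shell
#!/bin/bash
# Sketch: subtract a whitelist from the sorted block list with comm.
# All host names are made up; whitelist_sorted is a hypothetical helper file.
export LC_ALL=C   # sort and comm must use the same ordering

temphosts2="$(mktemp)"
temphosts3="$(mktemp)"
whitelist="$(mktemp)"
whitelist_sorted="$(mktemp)"

# Block list, already sorted (it came out of sort -u).
printf '0.0.0.0\tads.example.com\n0.0.0.0\tgood.example.org\n0.0.0.0\ttracker.example.net\n' \
    > "$temphosts2"

# Whitelist as a user might write it: not sorted.
printf '0.0.0.0\tzeta.example.com\n0.0.0.0\tgood.example.org\n' > "$whitelist"

# comm needs both inputs sorted, so sort the whitelist first.
sort -u "$whitelist" > "$whitelist_sorted"

# -2 hides lines only in the whitelist, -3 hides lines in both files,
# leaving the blocked lines that are not whitelisted.
comm -23 "$temphosts2" "$whitelist_sorted" > "$temphosts3"

cat "$temphosts3"
```

Here good.example.org drops out of the result, and the zeta.example.com whitelist entry, which was never blocked, is simply ignored.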