Blacklisting websites using /etc/hosts

    I've been using a 17 MB hosts file I downloaded to block Facebook, Google and other annoying sites.
    Today I decided to freshen that list and went looking for it. Instead, I found this site on GitHub that contains a hosts file twice as big.
    https://github.com/mitchellkrogza/Ul...osts.Blacklist

    Here is the link to the actual hosts file:
    https://hosts.ubuntu101.co.za/hosts

    I am using it now. It doesn't slow my Internet browsing at all, but it stops the ad junk, Facebook and other annoying sites from taking over my browser.

    There is a script that will automatically install the big hosts file:
    Code:
    #!/bin/bash
    
    # Linux hosts Installer for the Ultimate Hosts Blacklist
    # Repo Url: https://github.com/mitchellkrogza/Ultimate.Hosts.Blacklist
    # Copyright - Mitchell Krog - mitchellkrog@gmail.com
    # https://github.com/mitchellkrogza
    
    # First Backup Existing hosts file
    sudo mv /etc/hosts /etc/hosts.bak
    
    # Now download the new hosts file and put it into place
    sudo wget https://hosts.ubuntu101.co.za/hosts -O /etc/hosts
    
    exit 0
    You will have to add a line at the top:
    127.0.0.1 yourcomputerdomainname
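
A slightly more defensive take on the installer above, as a rough sketch only. The helper name `install_hosts` and its three arguments are made up for illustration and are not part of the Ultimate Hosts Blacklist repo. It copies the old file instead of moving it (so a hosts file always exists), refuses an empty download, and puts the `127.0.0.1 yourcomputerdomainname` line at the top.

```shell
#!/bin/bash
# install_hosts NEW_FILE TARGET HOST_NAME   (hypothetical helper, not from the repo)
install_hosts() {
    local new_file=$1 target=$2 host_name=$3

    # Refuse to install an empty or missing download
    if [ ! -s "$new_file" ]; then
        echo "install_hosts: $new_file is empty or missing, keeping $target" >&2
        return 1
    fi

    # Copy (not move) the current file so a hosts file always exists
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi

    # Local machine line first, then the downloaded blacklist
    {
        printf '127.0.0.1\t%s\n' "$host_name"
        cat "$new_file"
    } > "$target.new" && mv "$target.new" "$target"
}

# Intended use, run as root (not executed here):
#   wget https://hosts.ubuntu101.co.za/hosts -O /tmp/hosts.new &&
#       install_hosts /tmp/hosts.new /etc/hosts "$(hostname)"
```

If the download fails or comes back empty, the existing /etc/hosts is left untouched.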
    "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
    – John F. Kennedy, February 26, 1962.

    #2
    Thanks GG. This is the kind of tool we need in today's world. 33.4 MB? Wow.

    I have to wonder how the targeted sites will respond to being so completely blocked? Run to their "safe space"? Scream out loud? The images in my mind are good for a laugh...

    I'm not sure if I'm really any "safer" using this, but it certainly feels better now that I can use this as a statement of my dislike for the tactics of such websites.

    It's much better than simply making a rude gesture at the screen before I close it.

    I find it ironic that the author of this file is using a gmail address...
    Last edited by TWPonKubuntu; Jun 20, 2018, 02:17 PM. Reason: added comments
    Kubuntu 24.11 64bit under Kernel 6.11.4, Hp Pavilion, 6MB ram. Stay away from all things Google...


      #3
      This seemed like a good idea to me, so I loaded the new hosts file. Today I found I can't load http://www.naturalnews.com. I checked the hosts file and found it blacklisted! Really? What's up with that?

      -=Ken=-
      Last edited by kenj70; Jun 22, 2018, 03:46 PM.
      -=Ken=-
      "A man has to know his limitations." Harry Callahan (Dirty Harry)
      DIY ASRock AB350, AMD Ryzen 3 1200, 16 GB RAM, nvidia GT-710, kubuntu 20.04


        #4
        kenj70;

        This points out a problem with using a list from somebody else.

        I'm going to guess that naturalnews.com might have used some code that warranted a blacklisting, but that is just a guess...

        Perhaps they are using third party scripts from a blacklisted site? Guilt by association?

        Look at the long list of sources for the blocked sites:

        https://github.com/mitchellkrogza/Ul...ts-and-credits

        Sadly, it is a case of "Trust, but verify" à la Ronald Reagan... With more than 1.3 million items in the list, it could take a while...
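
Checking every entry by hand isn't realistic, but spot-checking the sites you actually care about is. A minimal sketch, assuming the list uses the usual `address hostname` layout with 127.0.0.1 or 0.0.0.0 as the address; `count_entries` and `check_sites` are hypothetical helper names, not something shipped with the blacklist.

```shell
#!/bin/bash
# count_entries FILE: how many 127.0.0.1 / 0.0.0.0 entries the file holds
count_entries() {
    awk '$1 == "127.0.0.1" || $1 == "0.0.0.0" { n++ } END { print n + 0 }' "$1"
}

# check_sites FILE DOMAIN...: report which of the given domains are blocked
check_sites() {
    local file=$1 site
    shift
    for site in "$@"; do
        # Exact match on the hostname field, so "example.com" does not
        # also match "notexample.com"
        if awk -v d="$site" '
               ($1 == "127.0.0.1" || $1 == "0.0.0.0") && $2 == d { found = 1; exit }
               END { exit !found }' "$file"
        then
            echo "BLOCKED $site"
        else
            echo "ok $site"
        fi
    done
}

# Intended use (not executed here):
#   count_entries /etc/hosts
#   check_sites /etc/hosts www.naturalnews.com kubuntuforums.net
```

Running `check_sites` against a freshly downloaded list before installing it shows whether anything you rely on would disappear.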
        Kubuntu 24.11 64bit under Kernel 6.11.4, Hp Pavilion, 6MB ram. Stay away from all things Google...


          #5
          Hey thanks, TW. I reverted - changed things back to default. A blacklist hosts file sure sounds like a good idea but I think there is a political component to some of the listings. Naturalnews.com has generated some controversy due to exposure of vaccine problems and such. So it's bound to be on someone's list. The problem with hosts files is of course maintenance. It could get to be a pain in the hinder parts to keep updated to one's own preferences.

          -=Ken=-
          -=Ken=-
          "A man has to know his limitations." Harry Callahan (Dirty Harry)
          DIY ASRock AB350, AMD Ryzen 3 1200, 16 GB RAM, nvidia GT-710, kubuntu 20.04


            #6
            Originally posted by kenj70
            ...
            A blacklist hosts file sure sounds like a good idea but I think there is a political component to some of the listings.
            ...
            -=Ken=-
            That sounds very likely... All someone would need to do is report a site to one of the "PC watchdogs" and it could be blackballed. I don't think the person who assembled the GitHub list, Mitchell Krog, is going to check all of the 1.3 million+ sites by himself.
            Kubuntu 24.11 64bit under Kernel 6.11.4, Hp Pavilion, 6MB ram. Stay away from all things Google...


              #7
              Originally posted by kenj70
              Hey thanks, TW. I reverted - changed things back to default. A blacklist hosts file sure sounds like a good idea but I think there is a political component to some of the listings. Naturalnews.com has generated some controversy due to exposure of vaccine problems and such. So it's bound to be on someone's list. The problem with hosts files is of course maintenance. It could get to be a pain in the hinder parts to keep updated to one's own preferences.

              -=Ken=-
              All you have to do is put a hash mark in front of that line to allow access. No use throwing the baby out with the bath water.
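
With a file this size, a small helper can add the hash mark for you. This is only a sketch: `unblock_host` is a hypothetical name, and it assumes entries look like `0.0.0.0 hostname` or `127.0.0.1 hostname`. It comments out the entry for the domain and for its `www.` variant and leaves everything else alone.

```shell
#!/bin/bash
# unblock_host FILE DOMAIN: put a "#" in front of the blocking lines for DOMAIN
unblock_host() {
    local file=$1 domain=$2 tmp
    tmp=$(mktemp) || return 1
    awk -v d="$domain" '
        # Exact hostname match, with or without a leading "www."
        ($1 == "127.0.0.1" || $1 == "0.0.0.0") && ($2 == d || $2 == ("www." d)) {
            print "#" $0
            next
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file owner and mode
    rm -f "$tmp"
}

# Intended use, run as root (not executed here):
#   cp /etc/hosts /etc/hosts.bak
#   unblock_host /etc/hosts naturalnews.com
```

Removing the `#` again later re-enables the block, so nothing is lost from the list.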
              "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
              – John F. Kennedy, February 26, 1962.


                #8
                Ooooh. All the entries I looked at had numbers in front of them. So I didn't know if I could delete them. But a hash mark sounds do-able.

                If naturalnews.com is blacklisted then someone has gone overboard against more conservative websites.

                Thanks GG.

                -=Ken=-
                -=Ken=-
                "A man has to know his limitations." Harry Callahan (Dirty Harry)
                DIY ASRock AB350, AMD Ryzen 3 1200, 16 GB RAM, nvidia GT-710, kubuntu 20.04


                  #9
                  Originally posted by kenj70
                  Ooooh. All the entries I looked at had numbers in front of them. So I didn't know if I could delete them. But a hash mark sounds do-able.

                  If naturalnews.com is blacklisted then someone has gone overboard against more conservative websites.

                  Thanks GG.

                  -=Ken=-
                  The numbers in front of each entry are IP addresses. 127.0.0.1 is your PC's own loopback address (localhost) and 0.0.0.0 is a non-routable address, so a domain name mapped to either one goes nowhere. When your browser needs a domain name, the system's resolver normally looks in hosts first to see if that domain name is there. If it is, it will use the IP address in front of that domain name instead of asking a DNS server for a lookup. If you know the IP address of naturalnews, you can use it to replace the 127.0.0.1 or 0.0.0.0 that is currently in front of that domain name in hosts. I use that as a pseudo DNS for special websites.
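
The "pseudo DNS" trick can be sketched like this. `pin_host` is a hypothetical helper, and 203.0.113.7 in the usage comment is a documentation-only placeholder address, not the real address of any site. The helper swaps the blocking address in front of one hostname for an address you supply.

```shell
#!/bin/bash
# pin_host FILE DOMAIN IP: replace the 127.0.0.1 / 0.0.0.0 in front of DOMAIN with IP
pin_host() {
    local file=$1 domain=$2 ip=$3 tmp
    tmp=$(mktemp) || return 1
    awk -v d="$domain" -v ip="$ip" '
        # Only touch the blocking entry for exactly this hostname
        ($1 == "127.0.0.1" || $1 == "0.0.0.0") && $2 == d {
            print ip "\t" d
            next
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file owner and mode
    rm -f "$tmp"
}

# Intended use, run as root (not executed here; the address is a placeholder):
#   pin_host /etc/hosts special.example.com 203.0.113.7
#   getent hosts special.example.com    # shows what the resolver now returns
```

A pinned address goes stale if the site ever moves to a different IP, so this suits a handful of special sites rather than general use.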
                  "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
                  – John F. Kennedy, February 26, 1962.


                    #10
                    Here's how I do it (mostly stolen from Steve Riley), but I think it needs some updating.

                    I have a file /etc/hosts.local that contains the normal /etc/hosts entries along with my local edits. I have a monthly cronjob that runs via anacron that downloads, cleans up, and compiles various adblocker lists, then it adds /etc/hosts.local to the front and saves the whole mess as /etc/hosts. If I need to add or remove a local (my home network) address from the hosts file, I edit /etc/hosts.local and run the script manually. Once a month, anacron runs the script and updates the blocked hosts list. Here's the cron script I keep in /etc/cron.monthly:

                    Code:
                    #!/bin/bash
                    
                    echo "gethosts cron script: Making new hosts file..." > /dev/kmsg
                    # If this is our first run, save a copy of the system's original hosts file
                    if [ ! -f /etc/hosts.local ]
                    then
                    # echo "Moving hosts files to hosts.local..."
                     cp /etc/hosts /etc/hosts.local
                    fi
                    
                    # After initial run, backup current hosts file
                    if [ -f /etc/hosts.local ]
                    then
                    # echo "Backing up copy of system's current hosts file..."
                     cp /etc/hosts /etc/hosts.backup
                    fi
                    
                    # Perform work in temporary files
                    temphosts1=$(mktemp)
                    temphosts2=$(mktemp)
                    
                    # Obtain various hosts files and merge into one
                    echo "gethosts cron script: Downloading ad-blocking hosts files..." > /dev/kmsg
                    wget -o /dev/kmsg -nv -O - http://winhelp2002.mvps.org/hosts.txt >> "$temphosts1"
                    wget -o /dev/kmsg -nv -O - http://hosts-file.net/ad_servers.asp >> "$temphosts1"
                    wget -o /dev/kmsg -nv -O - http://someonewhocares.org/hosts/hosts >> "$temphosts1"
                    wget -o /dev/kmsg -nv -O - "http://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext" >> "$temphosts1"
                    
                    # Include additional hosts manually added by admin
                    if [ -f /etc/hosts.bad ]
                    then
                     cat /etc/hosts.bad >> "$temphosts1"
                    fi
                    
                    # Do some work on the file:
                    # 1. Remove MS-DOS carriage returns
                    # 2. Delete all lines that don't begin with 127.0.0.1 or 0.0.0.0
                    # 3. Delete any lines containing the word localhost because we'll obtain that from the original hosts file
                    # 4. Replace 127.0.0.1 with 0.0.0.0 because then we don't have to wait for the resolver to fail
                    # 5. Scrunch extraneous spaces separating address from name into a single tab
                    # 6. Delete any comments on lines
                    # 7. Clean up leftover trailing blanks
                    # Pass all this through sort with the unique flag to remove duplicates and save the result
                    #echo "Parsing, cleaning, de-duplicating, sorting..."
                    sed -e 's/\r$//' -e '/^\(127\.0\.0\.1\|0\.0\.0\.0\)[ \t]/!d' -e '/localhost/d' -e 's/^127\.0\.0\.1/0.0.0.0/' -e 's/[ \t]\+/\t/' -e 's/#.*$//' -e 's/[ \t]*$//' < "$temphosts1" | sort -u > "$temphosts2"
                    # Now remove any sites you don't want blocked
                    # Copy this line and change "pattern to match" to the key word or phrase that matches the host you don't want to block
                    # sed -i '/pattern to match/d' "$temphosts2"
                    
                    # Combine system hosts with adblocks: hosts.local first, then a dated marker line, then the block list
                    echo "gethosts cron script: Merging with original system hosts..." > /dev/kmsg
                    echo -e "\n# Ad blocking hosts generated $(date)" | cat /etc/hosts.local - "$temphosts2" > /etc/hosts
                    
                    # Clean up temp files
                    #echo "Cleaning up..."
                    rm "$temphosts1" "$temphosts2"
                    echo "gethosts cron script: Done making new hosts file." > /dev/kmsg
                    
                    I also have /etc/hosts.bad for manual entries and there's a line in the script where you can unblock specific hosts.
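
Instead of one `sed -i '/pattern to match/d'` line per site, the unblocking step could read from a list file. A minimal sketch, assuming a hypothetical allow-list with one hostname per line (`#` comments allowed); the name `filter_allowed` and the path `/etc/hosts.good` are invented for this example and are not in the script above.

```shell
#!/bin/bash
# filter_allowed BLOCKLIST ALLOWLIST: print BLOCKLIST minus any entry whose
# hostname (second field) appears in ALLOWLIST
filter_allowed() {
    local blocklist=$1 allowlist=$2
    awk '
        # First file: remember each allowed hostname, skipping blanks and comments
        FILENAME == ARGV[1] { if (NF && $1 !~ /^#/) allow[$1] = 1; next }
        # Second file: keep only lines whose hostname is not allowed
        !($2 in allow)
    ' "$allowlist" "$blocklist"
}

# Intended use inside the cron script, after the sed/sort step (not executed here):
#   if [ -f /etc/hosts.good ]; then
#       filter_allowed "$temphosts2" /etc/hosts.good > "$temphosts1"
#       cp "$temphosts1" "$temphosts2"
#   fi
```

Matching is exact on the hostname, so allowing `example.com` does not also unblock `ads.example.com`, unlike a loose sed pattern.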
                    Last edited by oshunluvr; Jul 03, 2018, 02:46 PM.


                      #11
                      Or just https://pi-hole.net/

                      I played with that running in a container instead of on an actual Raspberry Pi, and it seemed to work quite well.
                      On #kubuntu-devel & #kubuntu on libera.chat - IRC Nick: RikMills - Launchpad ID: click


                        #12
                        I'd love to run that on my router. asuswrt has a lot of benefits but I haven't gotten around to checking that out.
