Monday, September 14, 2009

My Server Was Hacked!

This post is a bit outdated since the hack actually occurred about 8 months ago. But I thought it might be helpful to some folks who find themselves in the same boat I was in.

Back in early February I was hacking around my production Linux server when I happened to stumble upon a bizarre command in my "history" file. It was a removal command for a script I had never heard of... My stomach about fell through the floor. I quickly realized my production server had been hacked!

I quickly checked my last login history and found that someone had logged in as "root" just a few hours before. According to the logs they were on my server for about 9 minutes. During that time they installed a web-based file system browser and then spent the remainder of their time browsing my web site files. It appeared they were looking for passwords or database connection information. They also tried to grab my database, but according to the log files it didn't seem to work.
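
If you've never had to do this, the places to look are pretty standard (exact log locations vary by distro); for example:

last -a root                                        # recent root logins, with the originating host in the last column
lastb | head                                        # recent failed login attempts
grep "Accepted password for root" /var/log/secure   # successful SSH logins (use /var/log/auth.log on Debian/Ubuntu)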

After 9 minutes they logged off. Then they came back on about half an hour later for just a minute or two. No activity that I could detect, other than possibly removing the web script they had left behind.

Aside from being scared to death of what I had just witnessed, I came to the realization that I could not even trust what I was reading in the log files. Anyone who obtained root access could easily have doctored the logs to make me think they didn't get very far...

I quickly started scanning my security log files to see how the heck they got in. For a few months I had been noticing a brute-force attack on my server: someone was trying to guess a userid/password via SSH. My firewall was configured to block them by IP if they tried too many times, so I was not too worried about it. Well, I should have been more worried. They eventually got in via their brute-force attack. They successfully guessed my root password and bingo, the doors opened up. Wow. My worst nightmare.

I quickly changed every password on the box and locked it down. I had previously left a small door open (SSH) so I could remote into my box from anywhere if needed. I hadn't wanted to restrict incoming SSH to just my home IP address, since I travel a lot. Well, this came back to bite me.

So I locked SSH down via my firewall (APF) and came up with an alternative solution: a third-party dynamic DNS service that lets me keep APF locked down while still reaching the box from a changing home IP. That is all fine and dandy, but I still cannot be 100% sure the hacker didn't leave some other back-door programs lying around.
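
For anyone curious, the gist of the new setup is a small cron job along these lines (a simplified sketch rather than my exact script; the hostname is just a placeholder) that resolves my dynamic DNS name and whitelists that IP for SSH in APF:

#!/bin/sh
# Sketch: resolve the dynamic DNS name and allow only that IP to reach SSH through APF
HOST="myhome.example-dyndns.net"          # placeholder dynamic DNS hostname
IP=$(dig +short "$HOST" | tail -n 1)
RULE="tcp:in:d=22:s=$IP"                  # APF allow_hosts.rules format: inbound TCP port 22 from $IP
if [ -n "$IP" ] && ! grep -q "$RULE" /etc/apf/allow_hosts.rules; then
  echo "$RULE" >> /etc/apf/allow_hosts.rules
  apf -r                                  # restart APF so the new rule takes effect
fi

(A real version would also prune the stale entries left behind when the home address changes.)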

I contacted my ISP to see if they could help me sniff out any potential problems. They checked my box out and said they didn't see anything, but they agreed that there could still be a hidden trojan and we'd never know it until it was too late.

So we made a big decision: build a completely new server from scratch. We took a backup of our websites and data from before the break-in and used it to restore our websites on the new server. We locked down SSH access via a dynamic DNS service, and we made our passwords as long, random and secure as we could.

We learned a hard lesson that month. Never ignore a potential security threat. Never think that your defenses are better than they are. Never underestimate.

Friday, August 07, 2009

The Bogon List - Why Some Users May Not Be Able To Reach Your Site

I ran into a very interesting problem in the last month or so regarding something called the "Bogon list". I had never heard of the Bogon list before, but after doing some research, I learned it is a list of reserved IP ranges that have not been released for public use.

Now from what I read online, many network routers across the internet will by default block requests to and from IP addresses in the Bogon lists. Why? Well, apparently spammers and other bad sorts have been known to try and use IP addresses from these ranges. So as a preventative measure, routers may simply ignore requests from these IP ranges, since they are technically supposed to be reserved and not in use by the public. This also saves on router processing cycles, since they can simply discard the packets and move on.

Well, on to my problem. I had a few users of my site complaining to me over email that they could no longer get to my site. For weeks they had been trying with no luck. They would get a timeout or a "page/site not available" message.

Initially I checked whether their ISPs were blocking our site (for whatever reason), and found they were not. One ISP, however, tipped me off that the user in question had an IP address that was recently released from the Bogon list. The ISP simply gave the user a new IP and they were then able to access my site.

Armed with this information I started talking to my other users who were also unable to get to my site. Surprisingly, they were all using IP addresses that had been released from the Bogon list recently (within the last year).

What this tells me is that the Bogon system is broken. If they can release IP ranges to the public, and up to a year later those ranges are still being blocked by routers all over the net, how is anyone supposed to know whether a site is really down or up?

I was pissed, because it makes my site look like it is offline. That is bad PR, and it can earn my site a bad reputation in Google search results when users visit my links, find they can't get to my pages, go back to Google, and click some place else.

Now I realize this problem appears to be isolated to a small set of IP ranges that get released periodically, but the problem is extremely difficult to troubleshoot, and even more difficult to track down and resolve. The "problem" router that hasn't had its Bogon list updated could be any one of a hundred in the path from your site to the user's computer.

Long story short, if you have some users that just can't get to your site and you don't have a clue why, check their IP addresses against the Bogon release lists from the past year. Your users may need to request a new IP to regain access to your site.
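
If you want to script the check, here is a rough sketch in plain JavaScript (IPv4 only; the ranges shown are placeholders, so substitute the recently released ranges you actually care about):

// Test an IPv4 address against a list of CIDR ranges (placeholder ranges below)
function ipToInt(ip) {
  var p = ip.split('.');
  return ((((+p[0]) * 256 + (+p[1])) * 256 + (+p[2])) * 256) + (+p[3]);
}

function inCidr(ip, cidr) {
  var parts = cidr.split('/');
  var hostBits = 32 - parseInt(parts[1], 10);
  var block = Math.pow(2, hostBits);   // size of the range
  return Math.floor(ipToInt(ip) / block) === Math.floor(ipToInt(parts[0]) / block);
}

var recentlyReleased = ['192.0.2.0/24', '198.51.100.0/24'];   // placeholders, not real bogon data
var userIp = '192.0.2.45';                                    // the address a user reports
var flagged = false;
for (var i = 0; i < recentlyReleased.length; i++) {
  if (inCidr(userIp, recentlyReleased[i])) { flagged = true; }
}
// flagged === true means the user's IP falls inside one of the listed ranges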

Monday, January 26, 2009

Money Saving Tip: Lease-to-Own Servers

My ISP started offering lease-to-own deals on dedicated servers many years ago.  At the time, you would pay an extra $25/month or so to sign up.  Then after a year, your payment would get prorated down, and after two years you would own the server outright.

I didn't think much of the long-term benefit at the time, since in the early days I was swapping out my servers so often to keep up with increasing traffic and the new sites I was adding each year.

Well, come to find out, my 2 years had come and gone this past summer.  I was doing some catching up on my books and realized my development server, which I had been paying around $70/month for over the past 2 years, was now billing out at $2/month!  Nice!  I had completely forgotten about the lease-to-own program.  My server was essentially free from here on out!  The $2/month they charge me now is for the additional IP addresses I lease, unrelated to the hardware.

Since this is my development box, I estimate keeping it for another 3+ years before it probably dies on me...  That will pocket me a total savings of about $2,520 ($70/month over 36 months)!

As an added benefit, my ISP will ship me the server once I terminate our contract, or upgrade to a new server.  It is my server!   Sweet!  

For my business I lease a few dedicated servers.   I have another one approaching the 2-year mark next month.  That one I pay close to $300/month for.  Once the hardware portion of that payment is dropped, I'll be paying less than $50/month for the server.  Assuming I hang on to it for another 2 years (possible, since I recently implemented a sweet memcached solution to help scale my server), that would give me a total savings of roughly $6,000 for this second server alone (about $250/month over 24 months)!

Between the two servers, I'm going to save $8,500+ over the next few years.  Plus I'll get the servers shipped to me to keep for nostalgia's sake... ;-)

Tuesday, January 06, 2009

MySQL Performance Tip - Indexes and Wildcards

This one is kind of obvious once you think about it, but I had overlooked it in my code, so I figured maybe some other webmasters out there have as well.  I found this snippet, which explains the tip I want to share, on a site called websitedatabases.com:


 MySQL also uses indexes for LIKE comparisons if the argument to LIKE is a constant string that doesn't start with a wildcard character. For example, the following SELECT statements use indexes:

SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';
SELECT * FROM tbl_name WHERE key_col LIKE 'Pat%_ck%';

In the first statement, only rows with 'Patrick' <= key_col < 'Patricl' are considered. In the second statement, only rows with 'Pat' <= key_col < 'Pau' are considered.

The following SELECT statements will not use indexes:

SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';
SELECT * FROM tbl_name WHERE key_col LIKE other_col;

For me, some of my most visited pages had hidden queries buried deep in includes that were using front-end (leading) wildcards to search some of my largest tables.  It came to my attention the other day when I had to restart my server during the busiest time of my website's day (around 2-3pm).  My server could not catch up with the traffic due to the heavy pounding on my database.

Typically it is not an issue, since I use memcached to limit the load on my database.  But in this case my cache was lost due to the reboot, and every request was hammering the database.  So it forced me to take a closer look at my code.  Lo and behold, I found a number of these front-end wildcards in queries that I thought were using my indexes.   The funny part was that I really didn't need the wildcard on the front end of the string pattern.  I must have just added it to try and get more results.
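
To make it concrete, here is the kind of change involved (the table and column names below are made up, not my actual schema):

-- Leading wildcard: MySQL cannot use the index on title, so it scans the whole table
SELECT * FROM articles WHERE title LIKE '%widget%';

-- Constant prefix: MySQL can use the index and scan only the matching range
SELECT * FROM articles WHERE title LIKE 'widget%';

-- EXPLAIN shows the difference: type "ALL" (full scan) for the first, "range" using the index for the second
EXPLAIN SELECT * FROM articles WHERE title LIKE 'widget%';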

I quickly fixed my code/queries and the site bounced back to its typical performance levels... and the database let out a sigh of relief... ;-)

Saturday, December 13, 2008

The NoFollow Debacle...

I've been researching whether or not to utilize the "nofollow" attribute on my user-generated links for about a week now.


It is pretty amazing how many different valid opinions are out there on this topic.  From one perspective (Google), Doctor Matt Cutts insists that the world is indeed round, and that we are all safe to use the nofollow attribute wherever user-generated content allows links to be added.  He then goes on to say that IF you have the ability to separate trusted users from non-trusted users, you could consider removing the nofollow on any links contributed by trusted users.

Easy enough, right?  Google says use it, so we should use it? 

Well, not so easy.  There is also the viewpoint that applying such a blanket attribute to all your user-generated links may actually hurt your site.  Some believe it is a red flag to Google that your site/content must not be that trustworthy if you have so many "unchecked/untrusted" links on your pages.  Then there is the theory that if you are not going to trust any of your outgoing links, karma will come back and bite you, because no one else will want to trust a link back to your site.

Ok, so forget it.  I'm not going to use nofollow.  Who wants to be flagged as untrusted content?  I want to be a good internet citizen.  I'm going to try something else.

What about redirects?  WebmasterWorld does a nifty 302 redirect for any and all user-added links on their pages.  Is that legit?  Well, according to Google, 302 redirects do not pass any PR, but are they safe to use?  Many feel you should avoid 302 redirects altogether, since they are often abused by spammers/hijackers.   Of course, Matt Cutts himself stated that redirect scripts are one of a few recommended ways folks can "sell links" legitimately.  Since 302 redirects don't pass any PR juice, they are safe to use for that purpose.  Ok great, but if I utilize a redirect script, am I now labeling myself as someone who sells links, just legitimately?  Is Google going to view me as a potential hijacker with all my new 302 redirects?  Well, no, IF you make sure to use robots.txt to block crawler access to your redirect script.  But how many webmasters actually know how to implement this correctly?  One mistake can cost you dearly in search rankings.
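
For what it's worth, the robots.txt part itself is only two lines, assuming your redirect script lives at something like /redirect.php:

User-agent: *
Disallow: /redirect.php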

Why the hell is this suddenly getting like 10 times more complicated?  I just want to know if it is okay to allow my users to add links in comments.  A basic, fundamental aspect of the web, and I cannot come to a decision on how to do it!?!

Okay fine, let's just leave the links as natural links.  Heck, we trust our users.  But what if we get one rogue user who starts posting links to unsafe sites?  Or one of our trusted members sells out and starts offering links from our site to other sites for cash?  Who has the time or ability to monitor all that when you are getting hundreds if not thousands of comments a day? 

Ok, forget that idea.

So I'm back to where I started.  Comments full of links that are not really links (just text URLs), which totally sucks for user-friendliness... all because I am forced to give a crap about what search engines think about my site, instead of just having to worry about what my users think.

I guess I owe special thanks to all the comment-spammers of the world!  

UPDATE:  I've come to a decision.  I'm going old-school JavaScript style.  I'm going to have a JavaScript function that does the redirect for any URLs shared by users in comments.   I avoid the nofollow paranoia, I avoid the potential 302 pitfalls, I avoid the natural-linking risks... and I still get to offer "working" links to my users.  

Here is my simple solution:

Javascript function:

// Navigate to the URL contained in the clicked element's text
function goLink(o){
  var u = o.innerHTML;     // the element's text is the URL itself
  document.location = u;   // send the browser there
}
Then I simply made a matching style class to have the link appear as a link:

.link {
color: blue;  
cursor: pointer; 
text-decoration: underline;
}
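
So a user-submitted URL in a comment ends up rendered roughly like this (the exact markup can vary; the key is that the element's text is the URL itself, since goLink() reads it from innerHTML):

<span class="link" onclick="goLink(this)">http://www.example.com/some-page</span>

Because it is not an anchor tag, there is no href for crawlers to follow, but users still get something that looks and behaves like a link.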

Thursday, November 20, 2008

Simple Trick To Increase Your Pageviews & Revenues By 20%

A few weeks back I was reading a forum thread over at DigitalPoint about a webmaster who had added some handy navigation links to his pages.  He had initially added the links to make it easier for him to monitor his own content.  However, after implementing them, he found the total pageviews on his site had jumped up.  Users were finding the new links helpful as well.


This simple idea of adding easy navigation to your site is one that had escaped me up until I read this guy's post.  It was so simple; why hadn't I noticed it before?  My site is cleanly designed, I have a good search tool, and plenty of "related article" links after each story.  But those links require the user to keep scrolling down beyond the article, and then actually read the links to see if they are something of interest before clicking to the next page.

So it got me thinking: why not simply add a "Read Next Article >>" link to the end of each article?  With a little logic, I could even make it jump to the next related article.  
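
The "little logic" behind the link can be as simple as a single query; here is a rough sketch using made-up table and column names:

-- Pick the next article in the same category, published after the current one
SELECT id, title
FROM articles
WHERE category_id = 42                       -- the current article's category
  AND published_at > '2008-11-01 12:00:00'   -- the current article's publish time
ORDER BY published_at ASC
LIMIT 1;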

So now a user gets to the end of the article and does not have to scroll or think... They can simply click to see another article... click again to jump to the next.  Clicking through my pages looking for something interesting to read.

Since I implemented this strategy, I've seen my pageviews jump by 20% across the board.  And as most of you know, pageviews relate to revenues.  I've seen a similar jump in revenues from both CPM ads and PPC ads.

If you have a content site, give it a try.  I'm guessing you will see similar results.

Monday, October 27, 2008

Are You Losing Out On Links To Your Site?

This is a huge tip for webmasters out there who have not yet discovered the new 404 feature released in Google's Webmaster Tools.


Matt Cutts did a great review of the new 404 error page feature, outlining the exact steps you can take to see if you are missing out on a slew of links from other sites that may simply be mistyped or malformed.

I reviewed my 404 error stats and found close to 100 links pointing to various pages on my site that were not working due to case problems in the URL or other typos.

I was able to create a few redirect rules in my Apache httpd.conf to point the broken links to the corresponding active pages on my site.  I used 301 (permanent) redirects so search engines would start counting these as legitimate, active links to my site.
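
The rules themselves are one-liners. For example (these paths are made up, not my real URLs), mod_alias can handle both an exact typo and a simple pattern:

# Map a mis-cased URL to the real page with a permanent (301) redirect
Redirect permanent /Articles/Some-Story.html /articles/some-story.html

# Catch a commonly mis-typed directory name with a regex
RedirectMatch permanent ^/artcles/(.*)$ /articles/$1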

After downloading the 404 error report from Google's Webmaster Tools I had fixed a majority of the broken inbound links within an hour.  

Probably the most productive hour I have ever spent.  If you have not already explored this new feature, I would definitely recommend checking out Matt's blog and following his steps to review your broken inbound links.