EC-Council CEH 312-50 v10 – FootPrinting – Reconnaissance and Information Gathering

  1. Instructor Demo: Google Alerts

In this lecture demo, we’re going to be talking about the Google Hacking Database that Johnny Long put into the public domain. So let’s go ahead and take a look at that. Right now I’m going to go ahead and open up a browser, and I’m just going to type in “Google hacking database.” And you can naturally see a large number of people that are hosting this database or have pointers to it. Exploit-DB has these, and Offensive Security is also hosting it. It’s in the public domain, and so consequently it’s free for everyone to use. The best thing to do to defend against the Google Hacking Database is to, quote unquote, Google hack yourself. So how would we do something like that?

Well, it’s very easy. We could just simply type in site: and then our site, and then put in whatever hacking string we want to use to see if we are vulnerable to that. If you’re thinking to yourself that sounds like a lot of trouble, well, you’re probably right. So let me make it a little bit easier for you. There are different ways that we can approach this. One of them is to let Google tell you if you have something that is detectable. What you would need to do would be to go out and get access to Johnny Long’s database. Let’s see if I can pull that up real quick.

And you can see all of these different categories that he has: various online devices, files containing juicy information, advisories and vulnerabilities, some with passwords, all kinds of things. And these are the ways that we can find these. And so naturally all we would do would be to grab this right here and paste it into where we had looked before in Google, but preface it with site: and whatever site you are wanting to test it on. It still seems like a lot of repetitive work, and that actually is true.
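
The “site: plus hacking string” step described above can be sketched in a few lines of Python. This is only an illustration: the two dork strings and the example.com domain are made-up placeholders, not entries taken from the database, and the helper names are hypothetical.

```python
from urllib.parse import quote_plus


def site_dork(site, dork):
    """Prefix a search string with site: so it only matches one domain."""
    return f"site:{site} {dork}"


def search_url(query):
    """Build a Google search URL for the query (URL-encoded)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Placeholder dorks for illustration only; real ones come from the database.
DORKS = [
    'intitle:"index of" backup',
    "filetype:log inurl:password",
]

for dork in DORKS:
    query = site_dork("example.com", dork)
    print(query)
    print(search_url(query))
```

Only run queries like these against a site you own or are authorized to test.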

So what we oftentimes find, after we find out the power of Google, is that we’ll go ahead and do this for the first couple of months, but after that it seems like such a drudgery. What if you could get Google to tell you itself? I wonder if anyone has ever heard of Google Alerts. So landing on Google Alerts, you can see, as an example, I put in just one alert. So I’m looking for any web page that would happen to show up with my name and the word security. Sometimes things show up in the press that don’t accurately reflect what you’ve done.

And before they get all the way up to the top of Google search, you may decide that you want to do something about that. So consequently you can detect it. How often? As it happens, at most once a day, or at most once a week. Sources: I’m just going to click Automatic. Language: English. Any region, only the best results, and deliver it to my email address. Using that same idea, why couldn’t you type in... let’s just do another one real quick.

Why couldn’t you type in something like this and then paste in that Google hacking string? Now it says that it doesn’t have any particular results, but remember we had put in there “as it happens.” So when Google indexes something of yours that matches that Google dork, Google will send you an email. How cool is that? So consequently, what you’re able to do now is go in and create your own alert system. And Google does the heavy lifting for you.

  1. Removing Searchable Content

In this next lecture, we’re going to talk about how to stop Google and other robots from indexing certain areas of your website. Let’s take, for example, if we had TS Web or Outlook Web Access on our website: would we want the whole world to know about it? Probably not. Naturally, we’d want our own internal employees to know about it, but we don’t really want to advertise that to the world. What would we typically do? Well, primarily what we would do would be to go out to the site, and let’s just use an example here: cnn.com/robots.txt. We’d go out and put that information into what’s called a robots.txt file. This file is typically in the root directory, right underneath the domain, and we would list what we don’t want to have indexed. Think about this. What have you done right here?

I need to tell you a quick story about my six-year-old nephew. My six-year-old nephew is the one that can’t keep his hands off of all of the controls in my car. I just bought a new BMW and I drove up to the store and I told him, don’t touch anything on the car while I’m in here. I’m just going to run in here and get a soda or something. Well, I probably shouldn’t have said that to a six-year-old. When I got back into the car, the windshield wipers were going, the radio was up all the way. I mean, I probably would have been better off not telling him anything, and I probably would have got a better result. In effect, hackers are kind of like that little boy. Their curiosity is piqued. And so what you’ve done right here with robots.txt is you’ve told them, now don’t look here, okay? And for God’s sake, don’t look here. All right? And this PR, more than likely public relations, don’t look there either.
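
To see how much a robots.txt file gives away, here is a minimal Python sketch that pulls out every Disallow path, which is exactly the list a curious attacker would read first. The sample file contents are invented for illustration, and the function name is a hypothetical helper.

```python
def disallowed_paths(robots_txt):
    """Return the paths listed on Disallow lines of a robots.txt file."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Invented sample; a real file would be fetched from the site root.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /owa/   # webmail
Disallow: /PR/
Allow: /
"""

print(disallowed_paths(SAMPLE))
```

Each path printed is a place the site owner has effectively pointed to and said “don’t look here.”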

You’ve basically told people that want to hack you where not to look. You’re giving away a lot of information. Okay, Tim, well, that’s fine, but how else would I solve it? Let me show you. You can put a header on the various pages that you don’t want to show up, telling robots not to index them. This way, the robot won’t index that page. Moreover, you haven’t given away that information. A lot of websites have administration pages. A lot of websites have employee portals like Outlook Web Access, like terminal servers, things like that, that you don’t really want the whole world to know about. This is how you would solve that. If you guys want to know more information on it, just simply put in a couple of these terms here in Google and it will give you a number of different places to find instructions to do just that. So countering Google hacks means stopping any information you want to keep private from being indexed by Google. In the case of sensitive files and devices, this would be a configuration issue.

Perhaps the files and control panels should be unavailable to the internet in general, not just to web robots. For web pages which require external access but should not be searchable on Google, the robots meta tags allow finer tuning, and I’m talking about the meta tags that are in the pages. Now a gentleman by the name of Matt Cutts, head of Google’s web spam team, notes on his blog: this is behavior that we’ve had for the last several years, and webmasters are used to it. The noindex meta tag gives a good way, in fact one of the only ways, to completely remove all traces of a site from Google. Another way is our URL removal tool.
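
As a rough way to audit your own pages for the noindex approach described above, the Python sketch below checks a page for a robots meta tag containing noindex, and also for the equivalent X-Robots-Tag response header. The class and function names are hypothetical, and the sample HTML is made up.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        for directive in (attributes.get("content") or "").split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def is_noindex(html, headers=None):
    """True if the page or its response headers ask robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return True
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    return False


# Made-up admin page that carries the meta tag.
PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(is_noindex(PAGE))
```

Unlike robots.txt, the tag lives in the page itself, so nothing about the page is advertised in a public file.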

Now, the URL removal tool can be actually used at Google itself. So if you find out Google has this page and you want it taken out, remember the cache is going to live for about a year. You can go in to Google’s index and tell them please remove this and it will do just so. Now, speaking of robots, there was a clip on the NBC Evening news once and I found it online and have to show you this. This little girl has mistaken a water heater for a robot.

  1. Internet Archive: The WayBack Machine

Okay, folks, the next thing I need to discuss here is an organization that actually indexes and archives the Internet. It’s called archive.org. More precisely, people refer to it as the Wayback Machine. What we can do is go out to archive.org and put in the address of any website that has been created up to today and see if it pulls it up. And let’s see if we can find out a couple of things we can do because of that. So I’ve gone out to archive.org, and as you can see, it’s going to give me a place to put in the URL of any website that I want to take a look at. I’m just going to try a couple here. I’m going to do dub, dub, dub. Okay. This was actually a company that I owned from about 2001 to a little bit into 2005. And it’s kind of interesting because you can see all the different times that something has been changed. I actually still own this particular domain, so I had updated the website on January 19, 2002, again on the 23rd, and so on. Let’s go ahead and look at what it looked like. And it probably is going to be pretty old-fashioned compared to today’s websites. Okay. And it looks like the website right here is having a little bit of a difficult time with pictures.

We were covering CompTIA A+, Network+, that kind of stuff. But you can see I could actually browse around inside of this, if it’s indexed those web pages as well. So it’s kind of interesting. You can pull things up from quite a while ago. One thing that I always preach in all of my classes is that if you put something on the Internet, there’s no taking it back off. It’s kind of like taking a feather pillow and going up on top of a large mountain and shaking the pillow to get rid of all the feathers. Then someone tells you, okay, now go collect every one of those feathers. It’s going to be virtually impossible, isn’t it? Well, that’s kind of how it is when you put something on the Internet. I actually was on Miami’s Today show when I was down doing a conference.

Let’s take a look and see if we can find it. This is where I was on CBS News. This is my website here, where I was on Fox News. And this is the one I was wanting to show you. Well, social media is huge today. Almost everybody has a Facebook or a Twitter account. Question is, how can using these online sources put our personal information in jeopardy? Hacking expert Tim Pearson is here with some safety tips. He’s taking time out of his Hacker Halted conference that’s actually going on right now. And you’re going to give us some tips, I think, that even I didn’t know. So this is good, especially for those of us who use social media. So where do we start? Obviously you’re a guy that is using your skills for the better. That’s correct. So where do we start? Basically, probably the thing that you need to know about to begin with is when you put something on the Internet, you’ll never get it back off.

There’s no retracting. There is no retracting. It’s kind of like a tattoo. And even a tattoo today you can remove. You cannot remove this. Even though you delete some photograph, you delete this or you delete that, somebody else has copied it. Somebody else has replicated it somewhere else. It’s there for life. And I’ve got some clips to show you about some things that I think I’d like to start off with. She’s not a fan. Let’s get to it, though: you have some tips to help protect ourselves on this. You’re saying, you know, especially for maybe our kids and our family. Especially for our kids: educate them, don’t let them put up stuff. Tell them how permanent these things are. We don’t want something like this to appear on our page. Oh, no. I think there’s a lot of teenagers and even college kids with things like that.

I have to admit that this happened to me when I was in college. It just doesn’t live on Facebook. Exactly. To a prospective employer, how are you going to explain this? How are you going to explain this ten years from now, when it’s going to be extremely important for you to get that job that you really need, and they’re just not going to look twice at the person? Well, we even recommend when we have interns come in that they take off any pictures of them holding alcohol or in bathing suits. I mean, even things like that: when you’re going for a job, you don’t want things like that posted either. Some interesting things that have happened here recently: Mark Zuckerberg, who is the creator of Facebook, wants his latest makeover of Facebook to provide a cradle-to-grave lifeline for yourself. Okay. Now, while this is kind of interesting, this also provides test strings for the bad guys, basically. For example, who’s your childhood sweetheart?

That’s going to be on there. What’s your best friend’s name in high school? Well, that’s going to be on there. Sites like Ancestry.com are going to provide things like your mother’s maiden name, your father’s middle name, some things of that nature, where you’ve probably used or been asked for those test strings. Not only passwords? Yeah, not only for passwords, but for the security questions. Exactly. So this digital tattoo: there is a study that was done not too long ago that basically said one out of every ten teenagers has posted a nude or semi-nude photo of themselves on the internet that could haunt them. Absolutely, the rest of their lives.

Okay. And guys, my point in all of this is that even if this was on the Internet at one time and you took it back off, guess what? We can use the Wayback Machine to still pull it back. So be careful. Make sure that your kids know this, because I go on in this interview, and I’ll post this as a link in the resources. Because I tell you what, most of the kids today, they don’t have any issue with putting whatever they want to up on the Internet. And while it may be cool now, it’s not going to be cool maybe ten or 20 years from now. And you’ll really wish that hadn’t happened.
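
The Wayback Machine lookup from this section can also be scripted. The sketch below builds a request URL for the Internet Archive’s availability endpoint and picks the closest snapshot out of a response. The sample response is made up in the shape that endpoint returns, example.com is a placeholder, and no network request is made here.

```python
import json
from urllib.parse import urlencode


def availability_url(site, timestamp=None):
    """Build a Wayback Machine availability query (timestamp is YYYYMMDD)."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (timestamp, snapshot_url) from a response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["timestamp"], closest["url"]
    return None


# Made-up response for illustration; fetch the URL above to get a real one.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20020119000000/http://example.com/",
            "timestamp": "20020119000000",
            "status": "200",
        }
    }
})

print(availability_url("example.com", "20020119"))
print(closest_snapshot(SAMPLE_RESPONSE))
```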

  1. Domain Name Registration-WhoIs Information

Now, I want to talk about this real quickly. You’ve heard me talk about domain name registration and may not know exactly what that is. Now, it’s important for you to understand that when we first started out, there were certain organizations that kept track of this WHOIS information and provided you domains. In reality, they still do, but they have more or less subcontracted it or delegated it out to about 250 registrars. A real popular one is GoDaddy.

And so, the original ones, and this is on the test, so you need to know this: ARIN (arin.net) covers North and South America and sub-Saharan Africa. APNIC stands for Asia Pacific; apnic.net covers Asia Pacific. And RIPE covers Europe, the Middle East and parts of Africa. You will possibly be asked a question just on that, so you definitely want to know that information. So what does this actually look like anyway? Well, let’s take a look. The WHOIS output is going to tell you the particular details about an individual that may have registered this domain name. We’re going to do a little demonstration here on the next slide and talk about this.
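
For a sense of what a WHOIS lookup is underneath, here is a minimal Python sketch of the protocol: open a TCP connection to port 43, send the query followed by a carriage return and line feed, and read until the server closes the connection. The function names are hypothetical, and the server hostnames are the registries’ WHOIS hosts as I understand them, so treat them as assumptions to verify.

```python
import socket

# WHOIS hosts for the three registries named above (assumed hostnames).
RIR_WHOIS = {
    "ARIN": "whois.arin.net",
    "APNIC": "whois.apnic.net",
    "RIPE": "whois.ripe.net",
}


def whois_request(query):
    """Encode a WHOIS query: the text followed by CRLF."""
    return query.strip().encode("ascii") + b"\r\n"


def whois_query(query, server, port=43, timeout=10.0):
    """Send one WHOIS query and return the full text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(whois_request(query))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (not executed here; 192.0.2.1 is a documentation address):
# print(whois_query("192.0.2.1", RIR_WHOIS["ARIN"]))
```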

  1. Instructor Demonstration – CentralOps

You know, whenever I teach a class, it never ceases to amaze me that there’s so much material in here. I always get asked, well, where do you even start, Tim? So let me show you where I actually start. In most cases, I start off at a place called CentralOps.net, and they have a number of tools here that I find work very well. They’re a free site and they give you quite a bit of information. Now, I played football for the University of Missouri and I ended up getting hurt. And the people that hurt me were the people from SMU. So I was trying to hack SMU, just kind of my own way of getting back at them. So I’m just going to type in the domain name smu.edu. I’m going to tell it that I want to do a service scan and a traceroute, and click on Go.

Now you can see what’s happened right here. I have quite a bit of information in here. It’s telling me their registrant. Now, this is the WHOIS information, guys. This is the domain’s WHOIS information. The registrar will have this: the one who registered you, GoDaddy or one of those other databases we talked about before. The domain name is smu.edu. Here is their address. The administrative contact is David Nugent, I guess it is. Now remember what we talked about earlier. If this individual has the authority to register a domain name, they are definitely someone of note at that organization. Here’s even his email. This is not a good practice at all.

The administrative contact should be hostmaster. The email should be hostmaster@smu.edu. You don’t want to give away any information if you can help it. That shouldn’t be a specific phone number for him. It should be the 1-800 number that gets the opening line. Same way with the technical contact. Now, with the technical contact, they actually did do a better job: the technical contact is the network operations center. So that actually works out pretty good. Now, on name servers: to be on the Internet, you have to have at least two name servers. You could hold one and the Internet provider holds another. The Internet provider may hold both of them. It just depends on how you want to balance it. You need to have at least two.
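
The “use hostmaster, not a person” advice above is easy to check against your own WHOIS record. The sketch below scans WHOIS text for email addresses and flags any that do not use a role account. The list of role names and the sample record are assumptions made up for illustration.

```python
import re

# Role-style mailbox names considered acceptable (an illustrative list).
ROLE_ACCOUNTS = {"hostmaster", "postmaster", "webmaster", "noc", "abuse", "admin"}

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def personal_emails(whois_text):
    """Return emails in the WHOIS text that are not role accounts."""
    found = []
    for email in EMAIL_RE.findall(whois_text):
        local_part = email.split("@", 1)[0].lower()
        if local_part not in ROLE_ACCOUNTS and email not in found:
            found.append(email)
    return found


# Invented record for illustration.
SAMPLE_WHOIS = """\
Administrative Contact:
    Jane Doe
    jdoe@example.edu
Technical Contact:
    Network Operations Center
    noc@example.edu
"""

print(personal_emails(SAMPLE_WHOIS))
```

Anything this prints is a named individual an attacker could target, which is the exposure the role account is meant to avoid.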

But in this case right here, SMU chose to have three. Sorry, four. So you’ve got pony.cis.smu.edu, seas.smu.edu, xpony, and so on and so on. The Mustang is their mascot, and so that’s where they’re getting these pony things. So if we go down and look a little bit further in here, it has some other information. Now notice right here, this is some older information, Mikey Bruce, that they had in here before. I want you to also notice it gave me their DNS records as well. Now I want you to notice the type of records we have.

We’ve got an SOA record, a name server record, and here are all four name servers. An A record indicates the individual host it may be pointing to. In this case, it’s actually indicating the SMU domain. The MX record is the mail exchange record. A TXT record is usually for informational purposes. A PTR record is a pointer record; that’s a reverse lookup record, and so on. Now, you can see right here that the SOA record will always point to the primary DNS server. Suppose we decide we want to do a zone transfer. In other words, we’re going to try and hack this database and have it download a lot of information for us, having it point to different places. We’ll talk about that more in the enumeration section.

We need to be able to isolate which one is the SOA server. And the SOA server, or the primary server, would be this one right here. The primary always transfers to a secondary. So you need to use this as the server you’re going to transfer from. If we scroll down just a little bit more, it’s got a traceroute on here as well. And this traceroute is going to give us superlative information. Usually when a hop doesn’t come back with either an IP address or a fully qualified name, you can pretty much think that’s probably a firewall. Now, in this case, they did show some of the IP addresses, but as you can see, these are all public IP addresses in here. So where did we start from? Now keep in mind where we are starting from. My IP address is not going to be in SMU’s records. CentralOps’s is. So this started from CentralOps and went all the way down to SMU.
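
Picking the primary server out of a DNS record listing can be sketched as follows. The first field of an SOA record’s data is the primary name server, so the code simply reads that field. The record listing is invented (example.edu, not the domain from the demo), and the simple whitespace-separated layout it parses is an assumption about how the records are displayed.

```python
def parse_records(text):
    """Parse lines of the form: name IN TYPE rdata... into tuples."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[1].upper() != "IN":
            continue
        name, _, record_type, *rdata = parts
        records.append((name, record_type.upper(), rdata))
    return records


def primary_server(records):
    """Return the primary name server named in the SOA record, if any."""
    for _, record_type, rdata in records:
        if record_type == "SOA":
            return rdata[0].rstrip(".")
    return None


def name_servers(records):
    """Return every host listed in an NS record."""
    return [rdata[0].rstrip(".") for _, t, rdata in records if t == "NS"]


# Invented listing for illustration.
SAMPLE_RECORDS = """\
example.edu. IN SOA ns1.example.edu. hostmaster.example.edu. 2024010101 3600 600 604800 3600
example.edu. IN NS ns1.example.edu.
example.edu. IN NS ns2.example.edu.
example.edu. IN MX 10 mail.example.edu.
example.edu. IN A 192.0.2.10
"""

records = parse_records(SAMPLE_RECORDS)
print(primary_server(records))
print(name_servers(records))
```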

It then did a port scan on the ports that are available here. So it’s basically saying they’re using HTTP/1.1. And let’s see, what are we? Okay, IIS 7.5. It’s grabbed the banner off of that particular web server. So I would just need to look up vulnerabilities for an IIS 7.5 web server and throw them at this company, at that particular IP address, to see if something hiccups. 443 is our SSL or TLS port. And so consequently, it’s going to show us all the information on that particular record. Now, as you can see, we’ve gleaned a tremendous amount of information off of this that we can use to start building our blueprint, building our footprint.
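
Banner grabbing comes down to reading the status line and the Server header from the web server’s response. The sketch below parses those out of raw response bytes. The sample response is made up to resemble the kind of banner described above, and no connection is made to any host.

```python
def server_banner(raw_response):
    """Return (status_line, server_header) from raw HTTP response bytes."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return status_line, headers.get("server")


# Made-up response for illustration.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Server: Microsoft-IIS/7.5\r\n"
    b"\r\n"
    b"<html></html>"
)

print(server_banner(SAMPLE_RESPONSE))
```

On the defensive side, the same check shows what your own server advertises; hiding or trimming the Server header removes an easy version lookup for an attacker.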
