Tag Archives: deep web

What information can Google not store?

Surface vs deep web

There are two types of web. The surface web is the part that search engines can see and index, for example BBC News. The deep web refers to the parts that search engines cannot access; online banking information, for instance, is hidden behind a password wall. Google therefore cannot store anything from the deep web, because its crawlers cannot get past firewalls, passwords or other restricted access points.
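
As a rough illustration of the split, a crawler can be pictured as labelling each URL by the response it receives. The sketch below is a simplification, not how Google's crawler is known to work: the URLs, the labels and the `classify_response` function are made up for this example, and it assumes a restricted page answers with HTTP status 401 or 403 (many real sites show a login form or redirect instead).

```python
# HTTP status codes that usually mean "you need credentials or permission".
RESTRICTED_CODES = {401, 403}  # 401 Unauthorized, 403 Forbidden


def classify_response(status_code: int) -> str:
    """Label a fetched URL as 'surface', 'deep' or 'unknown' (hypothetical labels)."""
    if status_code == 200:
        # The crawler received the page, so it can be indexed.
        return "surface"
    if status_code in RESTRICTED_CODES:
        # The crawler was stopped at a password wall or similar barrier.
        return "deep"
    return "unknown"


# Made-up crawl results: URL -> status code the crawler received.
crawl_results = {
    "https://news.example/": 200,
    "https://bank.example/account": 401,
    "https://intranet.example/": 403,
    "https://old.example/missing": 404,
}

labels = {url: classify_response(code) for url, code in crawl_results.items()}
for url, label in labels.items():
    print(f"{url} -> {label}")
```

Only the pages labelled "surface" could end up in a search index; the others never reach the crawler in readable form.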

Photo showing different parts of the web. Source: ConetIslandDreams

Cookies, IP address and browsers

Any information you give directly, for example your username and password, can be stored by Google, as can indirect pieces of information such as cookies, IP addresses and browser details. Some of these technologies are less transparent than others. Cookies, for example, are not explained line by line, so their precise use is not clear to the user. Can a cookie take note of an IP address, which ISP you use, or where you live? It is not impossible for Google to trace an IP address to a specific location. In fact, other pieces of technology can pinpoint your location.
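
To make cookies a little less mysterious, the sketch below breaks one down into its parts using Python's standard `http.cookies` module. The header value is invented for this example and is not a real Google cookie. It shows that a cookie is a name, a value and a few attributes chosen by the website; the IP address is not read from the cookie, because the server already sees it from the network connection itself.

```python
from http.cookies import SimpleCookie

# A made-up Set-Cookie header value; real cookies differ from site to site.
raw_header = (
    "session_id=abc123; Domain=example.com; Path=/; "
    "Max-Age=3600; Secure; HttpOnly"
)

cookie = SimpleCookie()
cookie.load(raw_header)

# Each cookie is stored as a "morsel": a value plus its attributes.
morsel = cookie["session_id"]
print("name:    ", morsel.key)
print("value:   ", morsel.value)
print("domain:  ", morsel["domain"])    # which site may read the cookie back
print("path:    ", morsel["path"])      # which pages on that site
print("max-age: ", morsel["max-age"])   # lifetime in seconds
print("secure:  ", morsel["secure"])    # only sent over HTTPS
print("httponly:", morsel["httponly"])  # hidden from page scripts
```

What the value means (here `abc123`) is decided by the website, which is why a user cannot tell from the cookie alone what is being recorded against it.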

Case study of Google Street View

It is not uncommon for large companies to use and misuse information gathered through their products and services. Google’s Street View cars collected data as they took pictures for the Street View service, and Google was ordered to delete it. So Google has misused and stored a lot of unauthorised information, and for that reason can be referred to as being “evil”.

A devil theme to Google’s logo. Source: 4.bp

What information does Google store?

Google is likely to archive most things from the surface web. Your bank’s public website, as an example of a surface web site, is likely to be crawled and stored in an index, but Google cannot search or store your bank account information. That information is considered to be within the deep web because it is hidden behind password walls and secure servers….

Android viewpoint

If you own and use an Android mobile, Google may be able to collect even more information about you. Phone numbers and call records can be stored. Could Google’s future business model offer you cheap flights to Australia because you frequently call someone there?

If you are interested in what information Google can store, read the references below this post to learn more. Would you like me to post about a specific search engine topic? Tweet Gerald.

Posted by Gerald Murphy

References

  1. Channel 4. (no date) What does Google know about you?
  2. Google. (2013) Google’s privacy policy.
  3. Peng, W. (2000) HTTP cookies – a promising technology. Online Information Review. 24(2) pp. 150–153.
  4. Rawlinson, K. (2013) Google ordered to delete data collected by Street View cars.

What is the Web?

What is the Internet? What is the Web? Is there a difference?

People use the terms Web and Internet interchangeably, but they are in fact very different.

“The World Wide Web, or Web, is in fact just one of a number of ways information can be exchanged over the Internet, another being e-mail” (Murphy and Persson 2009:4).

The Internet, on the other hand, refers to the physical makeup of how we communicate (i.e. the cables that carry the images and the switches that receive the signals from these cables).

Sir Tim Berners-Lee invented HyperText Markup Language (HTML) and allowed people to use it for free. HTML is the backbone of the Web because it allows everyone to participate in the communication of information.

So the Internet refers to the physical network(s), whereas the Web allows us to use the Internet as a means of communication.

Are there different types of Web?

In short, the Web is all the same; what differs is how we access and interact with it. Because information on the Web can be accessed in so many different ways, people refer to different sections of the Web by different names. The list below gives an overview of these names:

  • Opaque Web refers to files that can be indexed but are not (Sherman & Price 2001).
  • The private Web is technically indexable but is kept out of search engines, for example because it is protected by passwords.
  • Proprietary Web requires users to agree to special terms before they use a service (e.g. NYTimes).
  • Invisible Web refers to parts of the Web that search engines cannot access, such as social media.
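
One way to keep these names apart is to treat them as a series of yes/no questions about a page. The sketch below is only an illustration of the list above: the `Page` fields, the `web_category` function and the order in which the questions are asked are assumptions made for this example, not definitions taken from Sherman & Price.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical description of a page from a search engine's point of view."""
    reachable_by_crawler: bool        # can a crawler read the page at all?
    password_protected: bool = False  # is it behind a login?
    requires_agreement: bool = False  # must users accept special terms first?
    indexed: bool = False             # has a search engine actually indexed it?


def web_category(page: Page) -> str:
    """Map a page to one of the names in the list (check order is an assumption)."""
    if not page.reachable_by_crawler:
        return "invisible"
    if page.password_protected:
        return "private"
    if page.requires_agreement:
        return "proprietary"
    if not page.indexed:
        return "opaque"
    return "surface"


examples = {
    "news article": Page(reachable_by_crawler=True, indexed=True),
    "unindexed file": Page(reachable_by_crawler=True),
    "members-only page": Page(reachable_by_crawler=True, password_protected=True),
    "registration-wall article": Page(reachable_by_crawler=True, requires_agreement=True),
    "closed social feed": Page(reachable_by_crawler=False),
}

for name, page in examples.items():
    print(f"{name}: {web_category(page)}")
```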

The fact that there are so many different names for parts of the Web reflects its truly global scale: the Web is a huge place, and people are still unsure just how big it actually is.

References

  1. Murphy, C., and Persson, N. (2009) HTML and CSS Web Standards Solutions. USA: Apress and Friends of Ed.
  2. Pedley, P. (2001) The Invisible Web. London: Aslib-IMI.
  3. Sherman, C., & Price, G. (2001) The Invisible Web. New Jersey: Info Today Inc.