What is the World Wide Web (WWW) and a Search Engine?
WORLD WIDE WEB
The World Wide Web (abbreviated WWW or the Web) is an
information space where documents and other web resources are identified by
Uniform Resource Locators (URLs), interlinked by hypertext links, and can be accessed
via the Internet. English scientist Tim Berners-Lee invented the World Wide Web
in 1989 while employed at CERN in Switzerland. The first web browser was released
outside of CERN in 1991, first to other research institutions starting in
January 1991 and to the general public on the Internet in August 1991.
The World Wide Web has been central to the development of the Information Age and is the primary tool billions of people use to interact on
the Internet. Web pages are primarily text documents formatted and annotated
with Hypertext Markup Language (HTML). In addition to formatted text, web
pages may contain images, video, audio, and software components that are
rendered in the user's web browser as coherent pages of multimedia content.
Embedded hyperlinks permit users to navigate between web pages. Multiple web pages with a common theme, a
common domain name, or both, make up a website. Website content can largely be
provided by the publisher, or it can be interactive, where users contribute content or the
content depends upon the users or their actions. Websites may be mostly
informative, primarily for entertainment, or largely for commercial,
governmental, or non-governmental organizational purposes. In the 2006 Great
British Design Quest organized by the BBC and the Design Museum, the World Wide
Web was voted among the 10 British design icons.
Viewing a web page on the World Wide Web normally begins
either by typing the URL of the page into a web browser or by following a hyperlink to that page or resource. The web browser then initiates a series of
background communication messages to fetch and display the requested page. In
the 1990s, using a browser to view web pages, and to move from one web page to
another through hyperlinks, came to be known as "browsing", "web surfing"
(after channel surfing), or "navigating the web". Early studies of this new
behavior investigated user patterns in using web browsers. One study, for
example, found five user patterns: exploratory surfing, windows surfing,
evolved surfing, bounded navigation, and targeted navigation.
The following example demonstrates the functioning of a web
browser when accessing a page at the
URL http://www.example.org/home.html. The browser
resolves the server name of the URL (www.example.org) into an
Internet Protocol address using the globally distributed Domain Name System
(DNS). This lookup returns an IP address such as 203.0.113.4 or
2001:db8:2e::7334. The browser then requests the resource by sending an HTTP
request across the Internet to the computer at that address. It requests
service from a specific TCP port number that is well known for the HTTP
service, so that the receiving host can distinguish an HTTP request from other
network protocols it may be servicing. The HTTP protocol normally uses port
number 80. The content of the HTTP request can be as simple as two lines of
text:
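GET /home.html HTTP/1.1
Host: www.example.org

The sketch below reproduces these two steps, DNS resolution followed by an HTTP GET request, in TypeScript. It is an illustrative example rather than a real browser, and it assumes Node.js 18 or later for the built-in dns module and the global fetch function.

import { promises as dns } from "node:dns";

async function fetchPage(host: string, path: string): Promise<void> {
  // Step 1: resolve the server name into an IP address via DNS.
  const { address } = await dns.lookup(host);
  console.log(`${host} resolved to ${address}`);

  // Step 2: send an HTTP GET request; for plain http:// URLs the
  // connection goes to the well-known TCP port 80.
  const response = await fetch(`http://${host}${path}`);
  console.log(`Status: ${response.status}`);
  console.log(await response.text()); // the HTML of the requested page
}

fetchPage("www.example.org", "/home.html").catch(console.error);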
JavaScript is a scripting language that was initially
developed in 1995 by Brendan Eich, then
of Netscape, for use within web pages.[35] The standardised version is ECMAScript.[35] To make web pages
more interactive, some web applications also use JavaScript techniques such as
Ajax (asynchronous JavaScript and XML). Client-side script is delivered with
the page and can make additional HTTP requests to the server, either in
response to user actions such as mouse movements or clicks, or based on elapsed
time. The server's responses are used to modify the current page rather than
creating a new page with each response, so the server needs only to provide
limited, incremental information. Multiple Ajax requests can be handled at the
same time, and users can interact with the page while data is retrieved. Web
pages may also regularly poll the server to check whether new information is
available, as in the sketch below.
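A minimal sketch of this Ajax pattern in TypeScript, assuming a browser page; the /api/updates endpoint and the "news" element are hypothetical names used only for illustration.

async function pollForUpdates(): Promise<void> {
  // The client-side script makes an additional HTTP request in the
  // background, without reloading the page.
  const response = await fetch("/api/updates");
  const data: { message: string } = await response.json();

  // The response is used to modify the current page rather than
  // creating a new one.
  const target = document.getElementById("news");
  if (target !== null) {
    target.textContent = data.message;
  }
}

// Regularly poll the server (here every 30 seconds) for new information.
setInterval(pollForUpdates, 30_000);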
Many hostnames used for the World Wide Web begin with www because of
the long-standing practice of naming Internet hosts according to the services
they provide.
The hostname of a web server is often www, in the same way that
it may be ftp for an FTP server, and news or nntp for a Usenet news server. These
host names appear as Domain Name System (DNS) or sub-domain names, as in www.example.com. The use of www is not required by
any technical or policy standard and many web sites do not use it; indeed, the
first ever web server was called nxoc01.cern.ch. According to Paolo Palazzi, who worked at
CERN along with Tim Berners-Lee, the popular use of www as a sub-domain was
accidental; the World Wide Web project page was intended to be published at www.cern.ch while info.cern.ch was intended to
be the CERN home page. However, the DNS records were never switched, and the
practice of prepending www to an institution's website domain name was
subsequently copied.
Many established websites still use the prefix, or they
employ other sub-domain names such as www2, secure, or en for special purposes.
Many such web servers are set up so that both the main domain name (e.g., example.com) and the www sub-domain (e.g., www.example.com) refer to the same site; others
require one form or the other, or they may map to different web sites. The use
of a sub-domain name is useful for load balancing incoming web traffic by
creating a CNAME record that points to a cluster of web servers. Since,
currently, only a sub-domain can be used in a CNAME record, the same result cannot be
achieved by using the bare domain root.
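For illustration, the DNS records for such a setup might look like this in zone-file notation; every name and the address here are hypothetical examples, not real records.

; the bare root must use an A (or AAAA) record and cannot be a CNAME
example.com.      IN  A      203.0.113.4
; www can be an alias for a load-balanced cluster of web servers
www.example.com.  IN  CNAME  cluster.example.net.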
Some web browsers will also try adding "www." to the beginning, and possibly ".com" to the end, of a typed address if no host is found without them. For example, entering "Microsoft" may be transformed to http://www.microsoft.com/ and "open office"
to http://www.openoffice.org.
SEARCH ENGINE
A web search engine is a software system that is designed to search
for information on the World Wide Web. The search results are generally
presented in a line of results, often referred to as search engine results pages
(SERPs). The information may be a mix of web pages, images, and other types of
files. Some search engines also mine data available in databases or open
directories.
Unlike web directories, which are maintained only by human editors,
search engines also maintain real-time information by running an algorithm on a
web crawler. Web search engines get their information by web crawling from site
to site. The "spider" checks for the standard filename robots.txt, addressed to
it, before sending certain information back to be indexed, depending on many
factors such as the titles, page content, JavaScript, Cascading Style Sheets
(CSS), and headings, as evidenced by the standard HTML markup of the informational
content, or its metadata in HTML meta tags.
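A much-simplified sketch of a crawler's robots.txt check, again assuming Node.js 18+ for the global fetch; it honours only Disallow rules in the "User-agent: *" group, whereas real parsers handle far more.

async function isAllowed(siteUrl: string, path: string): Promise<boolean> {
  // Fetch the standard robots.txt file from the site root.
  const response = await fetch(new URL("/robots.txt", siteUrl));
  if (!response.ok) return true; // no robots.txt means crawling is permitted

  let appliesToUs = false;
  for (const raw of (await response.text()).split("\n")) {
    const line = raw.trim();
    const lower = line.toLowerCase();
    if (lower.startsWith("user-agent:")) {
      appliesToUs = line.slice(11).trim() === "*";
    } else if (appliesToUs && lower.startsWith("disallow:")) {
      const rule = line.slice(9).trim();
      if (rule !== "" && path.startsWith(rule)) return false;
    }
  }
  return true;
}

isAllowed("https://www.example.org", "/home.html").then(console.log);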
Indexing means associating words and other definable tokens
found on web pages to their domain names and HTML-based fields. The associations are made in a public database, made available for web search
queries. A query from a user can be a single word. The index helps find
information relating to the query as quickly as possible.
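A toy inverted index in TypeScript sketches the idea; the pages and their text are invented for illustration. Each token maps to the set of pages on which it appears, so a single-word query is answered by one lookup.

const index = new Map<string, Set<string>>();

function indexPage(url: string, text: string): void {
  // Split the page text into lowercase word tokens and record each one.
  for (const token of text.toLowerCase().split(/\W+/)) {
    if (token === "") continue;
    if (!index.has(token)) index.set(token, new Set());
    index.get(token)!.add(url);
  }
}

indexPage("www.example.com/a", "The World Wide Web");
indexPage("www.example.com/b", "Web search engines");

// A single-word query becomes one lookup in the index.
console.log(index.get("web")); // Set { 'www.example.com/a', 'www.example.com/b' }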
Some of the techniques for indexing and caching are trade
secrets, whereas web crawling is a straightforward process of visiting all
sites on a systematic basis.
Between visits by the spider, the cached version of a page (some
or all the content needed to render it) stored in the search engine's working
memory is quickly sent to an inquirer. If a visit is overdue, the search engine can just act as a web proxy instead. In this case the page may differ from the
search terms indexed. The cached page holds the appearance of the version whose
words were indexed, so a cached version of a page can be useful to the web site
when the actual page has been lost, but this problem is also considered a mild
form of link rot.
Beyond simple keyword lookups, search engines offer their own
GUI- or command-driven operators and search parameters to refine the search
results. These provide the necessary controls for the
feedback loop users create by filtering and weighting while refining the search
results, given the initial pages of the first search results. For example, since
2007 the Google.com search engine has allowed one to filter by date by
clicking "Show search tools" in the leftmost column of the initial search
results page, and then selecting the desired date range.
It is also possible to
weight by date because each page has a modification time. Most search engines
support the use of the Boolean operators AND, OR, and NOT to help end users refine
the search query. Boolean operators are for literal searches that allow the
user to refine and extend the terms of the search. The engine looks for the
words or phrases exactly as entered. Some search engines provide an advanced
feature called proximity search, which allows users to define the distance
between keywords. The sketch below shows how such Boolean operators can combine result sets.
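A sketch of Boolean query evaluation over a toy inverted index like the one above; the postings lists are invented for illustration. AND intersects result sets, OR unions them, and NOT subtracts one from the other.

const postings = new Map<string, Set<string>>([
  ["web", new Set(["pageA", "pageB"])],
  ["search", new Set(["pageB", "pageC"])],
]);

const lookup = (term: string): Set<string> => postings.get(term) ?? new Set();

// AND: pages containing both terms; OR: pages containing either;
// NOT: pages containing the first term but not the second.
const and = (a: Set<string>, b: Set<string>) => new Set([...a].filter((x) => b.has(x)));
const or = (a: Set<string>, b: Set<string>) => new Set([...a, ...b]);
const not = (a: Set<string>, b: Set<string>) => new Set([...a].filter((x) => !b.has(x)));

console.log(and(lookup("web"), lookup("search"))); // Set { 'pageB' }
console.log(not(lookup("web"), lookup("search"))); // Set { 'pageA' }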
There is also concept-based searching, where the research involves using statistical analysis on pages
containing the words or phrases you search for. As well, natural language queries
allow the user to type a question in the same form one would ask it to a human. A site like this would be ask.com. The usefulness of a search engine depends on
the relevance of the result set it gives back. While there may be millions of
web pages that include a particular word or phrase, some pages may be more
relevant, popular, or authoritative than others.
Most search engines employ
methods to rank the results to provide the "best" results first. How a search
engine decides which pages are the best matches, and what order the results
should be shown in, varies widely from one engine to another. The methods also
change over time as Internet usage changes and new techniques evolve. There
are two main types of search engine that have evolved: one is a system of
predefined and hierarchically ordered keywords that humans have programmed
extensively; the other is a system that generates an inverted index by analyzing the texts it locates.