Browser Software Architecture
The WWW browsers share a common architecture, and a certain amount
of common code. (See also: Browser operation, and the utility modules
which are used throughout.)
In the control flow diagram, common code is to the right of the grey
line.
- Application
- This module is the main program, and is window-system-dependent.
In the line mode browser, it is HTBrowse. The application is called
by the operating system, and manages the overall running of the program.
It asks the navigation module to load the default page.
- Navigation
- The module which actually loads documents is based in HTAccess.c.
This uses all the protocol modules. Given an anchor ID to jump to,
it asks the anchor object for the address in order to load it.
- History
- This module records the documents which the user visits, and
replays them on request.
- Format manager
- The format manager uses the parser modules to load
the document as appropriate. It can also decide on the format of a
file from its name.
- Anchor object
- The HTAnchor module takes care of creating anchors,
managing the links between them and their attributes. This module
is independent of the type of graphic object (text, line drawing
etc.). It stores hypertext addresses of anchors, and ensures that anchors
with the same address are the same anchor. (More)
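The rule that anchors with the same address are the same anchor can be illustrated with a find-or-create lookup. The following is a minimal sketch only, not the HTAnchor code: the `Anchor` structure, the `anchor_list` variable and the `anchor_find_or_create` function are hypothetical names chosen for illustration, and a linked list is assumed purely for simplicity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical anchor record; a real anchor would also hold links
   and attributes. */
typedef struct Anchor {
    char *address;          /* hypertext address of this anchor */
    struct Anchor *next;    /* next anchor in the list of all anchors */
} Anchor;

Anchor *anchor_list = NULL; /* every anchor created so far */

/* Return the anchor for an address, creating one only if no anchor
   with that address exists yet. Returns NULL if memory runs out. */
Anchor *anchor_find_or_create(const char *address)
{
    Anchor *a;

    assert(address != NULL);
    for (a = anchor_list; a != NULL; a = a->next)
        if (strcmp(a->address, address) == 0)
            return a;       /* same address, so same anchor */

    a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->address = malloc(strlen(address) + 1);
    if (a->address == NULL) {
        free(a);
        return NULL;
    }
    strcpy(a->address, address);
    a->next = anchor_list;  /* push onto the front of the list */
    anchor_list = a;
    return a;
}
```

Because callers always go through the lookup, two requests for the same address yield the same pointer, so anchors can be compared by identity.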
A protocol module is invoked by the navigation module in order to
access a document. Each protocol module is responsible for extracting
information from a local file or remote server using a particular
protocol. Depending on the protocol, the protocol module either builds
a graphic object (e.g. hypertext) itself, or it passes a socket descriptor
to the format manager for parsing by one of the parser modules.
- File access
- HTFile.c provides access to files, using HTFTP.c for remote
access. The latter uses HTTCP for common TCP routines.
- HTTP access
- The HTTP module handles document search and retrieval using
the HTTP protocol.
- News access
- The NNTP internet news protocol is handled by HTNews which
builds a hypertext.
- Gopher access
- Internet gopher access to menus and flat files (and
links to telnet nodes etc.) is handled by HTGopher.
- WAIS access
- WAIS access is implemented in a separate gateway program.
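The way the navigation module hands an address to the right protocol module can be pictured as a table lookup on the part of the address before the colon. This is a sketch under assumptions, not the HTAccess.c code: the `LoadFn` type, the `Protocol` table, the stub loaders and `protocol_for` are all hypothetical names, and the stubs merely stand in for the real modules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader signature: 0 on success, negative on failure. */
typedef int (*LoadFn)(const char *address);

/* Stub loaders standing in for the protocol modules. */
int load_file(const char *address)   { (void)address; return 0; }
int load_http(const char *address)   { (void)address; return 0; }
int load_news(const char *address)   { (void)address; return 0; }
int load_gopher(const char *address) { (void)address; return 0; }

typedef struct {
    const char *scheme;     /* text before the ':' in an address */
    LoadFn load;            /* module which handles that scheme */
} Protocol;

/* WAIS is absent on purpose: it is reached through a gateway. */
static const Protocol protocols[] = {
    { "file",   load_file },
    { "http",   load_http },
    { "news",   load_news },
    { "gopher", load_gopher },
};

/* Return the loader for an address, or NULL if the address has no
   scheme or the scheme is not in the table. */
LoadFn protocol_for(const char *address)
{
    const char *colon;
    size_t len, i;

    assert(address != NULL);
    colon = strchr(address, ':');
    if (colon == NULL)
        return NULL;        /* this sketch does not guess a scheme */
    len = (size_t)(colon - address);
    for (i = 0; i < sizeof protocols / sizeof protocols[0]; i++)
        if (strlen(protocols[i].scheme) == len &&
            strncmp(protocols[i].scheme, address, len) == 0)
            return protocols[i].load;
    return NULL;
}
```

With a table of this kind, supporting a further protocol means adding one row and one loader; the lookup itself does not change.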
The parser modules allow different formats to be used to generate
graphic objects. A parser is invoked by the format manager. Currently
we only parse HTML and plain text, but obviously other formats can
be added.
- HTML
- Basic hypertext parsing is done by HTML.c which uses the simple
SGML engine SGML.c as a basic tokeniser and element stack manager.
- Plain text
- This is built directly by the format manager as it is so
simple.
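The format manager's ability to decide on a format from a file name, and so to pick between the two formats currently parsed, might look something like the sketch below. The `Format` enumeration and `format_from_name` are hypothetical names, and both the ".html" suffix test and the fall-back to plain text are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The two formats currently parsed. */
typedef enum { FORMAT_PLAIN_TEXT, FORMAT_HTML } Format;

/* Guess the format of a file from the suffix of its name.
   Assumption: ".html" means HTML; anything else is plain text. */
Format format_from_name(const char *name)
{
    const char *dot;

    assert(name != NULL);
    dot = strrchr(name, '.');       /* last '.' in the name, if any */
    if (dot != NULL && strcmp(dot, ".html") == 0)
        return FORMAT_HTML;
    return FORMAT_PLAIN_TEXT;       /* assumed default */
}
```

The result would then select which parser, if any, is run over the incoming data.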
A graphic object is a (complex) displayable entity. It is built by
a protocol module directly or using a parser. Graphic objects are
in general necessarily coded differently on different window systems.
The graphic object is responsible for displaying itself, catching mouse
clicks, and calling the navigation object in order to follow links.
We use the more common term "document" to describe the logical entity
which a graphic object represents and displays.
- Hypertext
- This object is window-system dependent. In the line mode
browser, the GridText module is the hypertext object, providing the
generic functionality of HText.h.
_________________________________________________________________
Tim BL