Knowledge servers



One of the advantages of building a knowledge base is the ability to browse through it. In TROEPS, this can be achieved easily and widely through the World Wide Web. Sample knowledge bases are available on-line at the URL http://co4.inrialpes.fr, which also contains extensive material about TROEPS as an HTTP server.

Setting up your own server

TROEPS can be used as a knowledge base server accessible through HTTP. For that purpose, you must have an HTTP server running on one of your machines, with the ability to add CGI scripts. Let us assume that a machine whose host name is myserv.mydom.domain runs an HTTP server on port 80 and launches CGI scripts for URLs (Uniform Resource Locators) prefixed by /cgi-bin. Also assume that the TROEPS server will run on port 5555 of a machine (itsmach.itsdom.domain) which can be different from the one running the HTTP server.
In order to set up your Troeps server, you must:
http://myserv.mydom.domain:80/cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/home-page

It is also possible to serve the Java classes used for displaying the class hierarchies from your own server by copying them (i.e. the troeps-classes.zip file) from the co4.inrialpes.fr server into a directory (say /java on your HTTP server) and setting the -j option of the server command line to http://myserv.mydom.domain:80/java. The same can be done for the help pages: they can be found on the same server (help directory) and their location can be set through the -d option to http://myserv.mydom.domain/help.

HTML output

Any TROEPS entity can be printed as an HTML (HyperText Mark-up Language) page. This can be used for browsing a knowledge base in the absence of a graphical user interface or for making the base available to others. Experiments have been carried out both on generating the whole set of pages corresponding to a document and on generating on the fly the pages requested from the HTTP server.
Any TROEPS entity is provided with a URL that can be safely referred to by other web sites. When a client passes the URL to the server, the server returns the HTML page corresponding to the entity.
The URL is always prefixed by what corresponds to:
http://myserv.mydom.domain:80/cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/get/
and has an individual suffix as follows:
base         <basename>/<basename>.html
concept      <basename>/<conceptname>/<conceptname>.html
conceptslot  <basename>/<conceptname>/s/<slotname>/<slotname>.html
bridge       <basename>/<conceptname>/b/<bridgename>/<bridgename>.html
conceptview  <basename>/<conceptname>/v/<viewname>/<viewname>.html
class        <basename>/<conceptname>/v/<viewname>/<classname>/<classname>.html
classslot    <basename>/<conceptname>/v/<viewname>/<classname>/<slotname>/<slotname>.html
object       <basename>/<conceptname>/o/<urlencodedstringobject>.html
The objects are normalised with regard to their printable form:
#<<conceptname keyvalue1,... keyvaluen>>
This form is transformed into a string which, in turn, is transformed into a URL-encoded string. A URL-encoded string is one in which spaces are replaced by + and special characters are replaced by %xx, where xx is their hexadecimal ISO-Latin code.
For instance, the house concept corresponds to the URL:
real-estate/house/house.html
and the #<<house "rue Monge", 3, 2>> object corresponds to the URL:
real-estate/house/o/%23%3C%3Chouse+%22rue+Monge%22,+3,+2%3E%3E.html
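
The naming scheme above can be sketched in a few lines of Python. The sketch is for illustration only and is not part of TROEPS. The function names (encode_object, concept_suffix, object_suffix) are hypothetical, and leaving commas unescaped is an assumption made so that the result matches the example URL above.

```python
from urllib.parse import quote_plus

# Prefix taken from the running example of this chapter.
PREFIX = ("http://myserv.mydom.domain:80/cgi-bin/hytropes/"
          "itsmach.itsdom.domain:5555/TRP/get/")


def encode_object(printed_form: str) -> str:
    """URL-encode the printable form of an object.

    Spaces become '+' and special characters become %xx (ISO-Latin-1 code).
    Commas are kept as-is because the example URL keeps them (assumption).
    """
    return quote_plus(printed_form, safe=",", encoding="latin-1")


def concept_suffix(base: str, concept: str) -> str:
    """Suffix of a concept page: <basename>/<conceptname>/<conceptname>.html"""
    return f"{base}/{concept}/{concept}.html"


def object_suffix(base: str, concept: str, printed_form: str) -> str:
    """Suffix of an object page: <basename>/<conceptname>/o/<encoded>.html"""
    return f"{base}/{concept}/o/{encode_object(printed_form)}.html"


if __name__ == "__main__":
    # Full URLs are the fixed prefix followed by the individual suffix.
    print(PREFIX + concept_suffix("real-estate", "house"))
    print(PREFIX + object_suffix("real-estate", "house",
                                 '#<<house "rue Monge", 3, 2>>'))
```

Under these assumptions, the two calls reproduce the house concept and house object suffixes given above, appended to the common prefix.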

Read-only and editable servers

The server can be made read-only or editable by setting the global variable *edit-p* in the htrp-config.t file. The () value disables editing, while any other value enables editing of the knowledge base. () is the default value.

Security

Access to the server (whether read-only or editable) can be further restricted through the basic security mechanisms of your HTTP server. Access control can be tied to particular pages, particular IP addresses or particular users (through passwords). For example, access to the editing, clustering and shutdown URLs can be restricted to the users in the edit group (identified by the passwords given in the corresponding files). Here is the corresponding part of the httpd.conf file for the CERN 3.0 HTTP server:

#
#	Protection for Troeps files (for CERN server)
#

Protection EDIT-RIGHT {
	UserId nobody
	GroupId nogroup
	ServerId co4
	AuthType basic
	PasswdFile /myconfdir/passwd
	GroupFile /myconfdir/group
	GetMask edit
}

Protect /cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/shutdown/* EDIT-RIGHT
Protect /cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/edit/* EDIT-RIGHT
Protect /cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/lex-edit/* EDIT-RIGHT
Protect /cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/clustering/* EDIT-RIGHT
and the corresponding part of the access.conf file for the Apache 1.2 HTTP server:
# ******************************************
# *** Security access (for Apache server) **
# ******************************************

<Location ~ "/cgi-bin/hytropes/itsmach.itsdom.domain:5555/TRP/(clustering|lex-edit|edit|shutdown)/*">
	AllowOverride AuthConfig
	AuthName "Troeps"
	AuthType Basic
	AuthUserFile /myconfdir/passwd
	AuthGroupFile /myconfdir/group
	require group edit
</Location>

Editing knowledge bases through the server

All the screendumps appearing in the previous part of the manual were created through the TROEPS HTTP interface. If some of the buttons in the screendumps do not show on your screen, the *edit-p* flag has most probably been set to () (see above).

Customising the appearance

Many parameters of the generation of pages from knowledge bases can be controlled, as explained below. Instructions of the kind given here are typically placed in the htrp-config.t file of the server.

Adding HTML pages

The HTTP server can only generate pages from the knowledge it has. So, while it can display slot values and taxonomies, it cannot create any picture or explanation of the notions used (even if it is connected to a lexicon). However, users can add textual and pictorial information in the form of HTML pages, which are integrated into the pages generated on the fly each time they are generated. The included pages can be arbitrarily complex (containing URLs or images).
These pages reside on the machine where the server runs and are connected to the object through annotations. The annotation values must be valid filenames. There are two annotations, corresponding to two different places in the page: prehtml and posthtml (both appear in the examples of this chapter; posthtml places the file content at the end of the page).
Thus, once the pages have been designed and before the server is launched, each page must be attached to its object (which can be a TROEPS object, a concept or a bridge: any TROEPS notion) by setting the corresponding annotation to the filename of the HTML page. For instance,

(tr-set-annotation
	(tr-find-object (tr-find-concept "house") (list "rue Monge" 2 3))
	"posthtml"
	#f"/home/bases/real-estate/html-pages/map.html")

will append the content of the file given as the value of the annotation (here map.html) to the end of the page displaying the object.
This feature does not support multilingualism: the generated HTML can be produced in any language you want (provided that the messages have been defined), but each of these files exists in a single version.

Linking to other bases

The added pages can be arbitrarily complex. They can contain text and pictures. They can also contain URLs referring to static pages available somewhere on the WWW or to dynamic pages such as those generated on the fly by TROEPS.

Modifying object pages

If the display of the objects of a particular concept needs to be improved, this can be done by programming the display in a function that takes the object and a boolean as arguments and prints an HTML page on the standard output. This function is tied to the base through an annotation (funhtml) attached to the concept. Thus, the instruction:

(tr-set-annotation (tr-find-concept "house") "funhtml" #'my-print-house)

tells the system that the instances of house have to be displayed through the function my-print-house instead of the standard procedure.
This function can be used in order to generate the page depending on the object which can be manipulated through the TROEPS API. For instance, it is possible to embed in the page URLs which are built from the content of the knowledge base but correspond to queries to remote resources (such as a site providing e-mails of individuals).
Modifying object pages also makes it possible to embed applets into the display since, from the object, all the data needed as applet parameters are available. The call to the applet can be used to show some features of the object through specialised editors (e.g. a graph editor) which run on the client side.

Adding functions

The TROEPS HTTP server uses a set of URLs to identify objects and the operations on these objects (e.g. editing, classifying). It is however possible to extend the set of URLs understood by the server and to provide the functions which correspond to these URLs.
The advantage of this technique is that the functions are called in the TROEPS environment, and thus they can access the knowledge base through the TROEPS API to compute a result or even modify the knowledge base. It is possible to build a complete set of pages on that model (and the TROEPS editor was first built that way).
In order to do so, the manage-url-hook-names dynamic variable contains a list of names of functions to apply when the URL is not recognised by the regular server. Each function is called with the part of the URL remaining after hytropes as argument and must return an integer (0 if the URL has not been recognised and something else otherwise). Typically, these hooks first recognise their own keyword after hytropes/.
The input can be decoded with the help of a few functions provided by TROEPS:
troeps.http.urldecode: decodes the URL between two positions;
troeps.http.file-object: transforms a piece of this URL into the corresponding TROEPS object.
The function must print HTML data on standard output. If manage-url-hook-names is empty, or if every function returns 0, the nrk-reply function is called to notify through HTTP that the URL was not recognised.
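
The hook protocol described above (try each hook in turn, stop at the first non-zero result, fall back to a "not recognised" reply) can be illustrated outside TROEPS with a minimal Python sketch. All names here (hooks, first_keyword, re_hook, not_recognised, dispatch) are hypothetical stand-ins and not part of the TROEPS API. In particular, not_recognised only plays the role that nrk-reply has in the real server.

```python
import html
from typing import Callable, List, Tuple
from urllib.parse import unquote_plus

# Hypothetical stand-in for the list held in manage-url-hook-names.
hooks: List[Callable[[str], int]] = []


def first_keyword(url: str) -> Tuple[str, str]:
    """Split '/re/print-map/' into the decoded keyword 're' and the rest."""
    keyword, _, rest = url.lstrip("/").partition("/")
    return unquote_plus(keyword, encoding="latin-1"), rest


def re_hook(url: str) -> int:
    """Example hook: handles URLs whose first keyword is 're'."""
    keyword, rest = first_keyword(url)
    if keyword != "re":
        return 0                      # not ours: let the next hook try
    action, _, _argument = rest.partition("/")
    if action == "print-map":
        print("<HTML><BODY>map goes here</BODY></HTML>")
        return 1                      # any non-zero value means "recognised"
    return 0


def not_recognised(url: str) -> None:
    """Stand-in for nrk-reply: report that the URL was not understood."""
    print(f"<HTML><BODY>Not recognised: {html.escape(url)}</BODY></HTML>")


def dispatch(url: str) -> bool:
    """Apply the hooks in order; fall back to the 'not recognised' reply."""
    for hook in hooks:
        if hook(url) != 0:
            return True
    not_recognised(url)
    return False


# Register the hook, as the pushf call does in the configuration file.
hooks.append(re_hook)
```

With this registration, dispatch("/re/print-map/") prints the page produced by the hook, while a URL with an unknown keyword ends in the fallback reply.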

Example

As an example, here is a typical htrp-config.t file illustrating all the features presented above:

;;; This is the configuration file for a Troeps server
;;; ==============================================================

;;; Required information

(setf (global *dir-prefix*) "")
(setf (global *edit-p*) t)

;;; ==============================================================
;;; Initialising

(tr-init)
(setf (languages) '(french english))

;;; ==============================================================
;;; Loading sample base

;;; loading the knowledge base
(tr-load-base #f"real-estate.bdf")

;;; ==============================================================
;;; Adding HTML page annotations

(tr-set-annotation (tr-current-base) "prehtml" #f"real-estate.html")
(tr-set-annotation (tr-object (tr-concept "house") '("rue Monge" 2 3))
  "posthtml" #f"house/rue-monge-2-3.html")

;;; ==============================================================
;;; loading additional talk functions

(load-file #f"real-estate.t")

;;; initialisation of these functions
;;; (like computing a map of the area from Troeps structures)

(re-init)

;;; ==============================================================
;;; special printing function for displaying the instances of concept person

(defun re-display-person (object edit)
  ;; re-use the usual description
  (troeps.http.print-html-description object edit)
  ;; Add an inlined button for fetching the e-mail of the person
  (printf "<LI><FORM METHOD=POST\n")
  (printf "ACTION=\"http://www.four11.com/cgi-bin/SledMain?")
  (printf "FS,234,2,377FB00,3F70FA11\">\n")
  (printf "  <INPUT TYPE=hidden NAME=\"FirstName\" VALUE=\"%s\">\n"
          (tr-get-value object "firstname"))
  (printf "  <INPUT TYPE=hidden NAME=\"LastName\" VALUE=\"%s\">\n"
          (tr-get-value object "name"))
  (printf "  <INPUT TYPE=submit NAME=\"Search\" VALUE=\"Find email\">\n")
  (printf "</FORM></UL>\n")
  ;; Add an Applet (using the Troeps tree display applet)
  (printf "<APPLET CODE=\"TreeDisplayApplet.class\"\n")
  (printf "        CODEBASE=\"%s\"\n" (global *url-java*))
  (printf "        ARCHIVE=\"troeps-classes.zip\"\n")
  (printf "        WIDTH=600 HEIGHT=500>\n")
  ;; Get the parameters: the set of houses owned by the person
  (printf "  <PARAM NAME=tree VALUE=\"(%s" (tr-get-value object "firstname"))
  (for ((h in: (tr-get-value object "houses")))
    ;; print the address of each house h
    (printf " (%s)" (tr-get-value h "address")))
  (printf ")\">\n")
  (printf "</APPLET>\n")
  ;; Add buttons for calling other pages (printing picture)
  (troeps.http.display-button
    (catenate *url-prefix* "/re/print-picture" (troeps.http.refname object))
    "Afficher doc"))

;;; ==============================================================
;;; Adding display annotations (after the definition of functions)

;; Tell the system to use the above function for printing person instances
(tr-set-annotation (tr-concept "person") "funhtml" #'re-display-person)

;;; ==============================================================
;;; Menu for special functions
;;; (like printing the map on demand and getting the instance from a click)

(defun re-manage-url-hook (url)
	(let* ((begin (i1+ (char-index #\/ url 1)))
			 (end (char-index #\/ url begin))
			 (keyword (troeps.http.urldecode url begin (i1- end))))
		;; Recognise the re prefix
		(if (string= keyword "re")
			(progn
				(setf begin (i1+ end))
				(setf end (char-index #\/ url begin))
				(setf keyword (troeps.http.urldecode url begin (i1- end)))
				(cond
					;; Recognise the print-map action
					((string= keyword "print-map")
						(re-print-map))
					;; Recognise the print-picture action
					((string= keyword "print-picture")
						;; Get the object corresponding to the URL and print it
						(re-print-picture (troeps.http.file-object (substring url (i1+ end)))))
					(t 0)))
			0)))

;; Tell the system to use the above function for recognising re-prefixed urls
(pushf (dynamic manage-url-hook-names) `re-manage-url-hook)

;;; ==============================================================
 (printf ";;; real estate initialised\n")

API

(tr-save-base-as-html base dirbase ) function in [libtrh]

-> boolean, base base nab, dirbase string nas
Prints the knowledge base base as an HTML document. The result is written to a directory named after the base, inside the directory dirbase. All the URLs used in the output are relative: the files are stored under the dirbase directory, but the URLs that they contain do not refer to this directory (they are only relative to the current directory). Note that the dynamic capabilities of the HTTP server are not available in this mode (no TROEPS filtering, no editing and so on). Returns t if no error has been detected, signals an error otherwise.
(troeps.http.file-object url) function in [libtrh]

-> entity, url string
Returns the TROEPS entity corresponding to the string url, interpreted as a URL as generated by the function troeps.http.link. If no object corresponds to the string, an error message is printed on standard output in the form of an HTML page.
(troeps.http.urldecode url begin end ) function in [libtrh]

-> string, url string, begin integer, end integer
Returns a string corresponding to the substring between positions begin and end in the input string url. The result is decoded according to the HTTP encoding rules.
(troeps.http.link entity) function in [libtrh]

-> string, entity entity
Returns the string corresponding to an HTML link to the TROEPS entity entity (the link can be decoded back through troeps.http.file-object).
(nrk-reply keyword socket) function in [libtrh]

-> boolean, keyword string, socket stream
Sends on the socket socket an HTML page signalling that the string keyword is not understood by the HTTP server.
(global *edit-p*) global-variable in [libtrh]

-> boolean
Indicates whether the TROEPS server is read-only (() value) or writable (any other value).
(global *revision-p*) global-variable in [libtrh]

-> boolean
Indicates whether the TROEPS server triggers the revision mechanism when an error is encountered (any other value) or not (() value).

Concurrent editing with Co4

CO4 is a framework that makes it possible to relate several TROEPS knowledge bases in order to build consensual knowledge bases. To that end, the CO4 protocol deals with registration and with submissions from one knowledge base to another (called the group base). The group base automatically processes all the queries from its subscribers.
For more information on CO4, see http://co4.inrialpes.fr/docs/co4-manual.html, which provides pointers to CO4 articles, documentation and software. The feature considered here is the connection of TROEPS to CO4; it cannot be read in isolation from the CO4 documentation. In the following, it is considered that:
A ready-to-use TROEPS-CO4 connection is provided in the $TRDIR/examples/co4 directory. It can be recompiled for your installation by simply editing the makefile to set the TRPDIR and CO4DIR variables and typing:
$ make
in that directory. This will put a new binary called co4trpserver in the $TRDIR/$PORTNAME/bin directory.
As for the TROEPS server, in order to set up your new CO4-handled TROEPS server, you must (* means that it has already been done when installing the TROEPS knowledge servers):
			# The -j and -d options and the htrp-config.t argument are optional.
			$TRDIR/$PORTNAME/bin/co4trpserver \
				-c <name> $ANSURL <co4port> <troepsport> \
				-j http://co4.inrialpes.fr/java \
				-d http://co4.inrialpes.fr/help \
				htrp-config.t
Your base is then accessible from the following URL:
http://myserv.mydom.domain:80/cgi-bin/hytropes/itsmach.itsdom.domain:5555/CO4/home-page
There are some differences between a simple TROEPS knowledge server and a CO4-handled one:
It is noteworthy that CO4-handled TROEPS knowledge servers can be shut down and woken up. At shutdown time, the system stores the content and the parameters of the knowledge bases in the ANS. While the server is down, the messages addressed to it are stored in a mailbox. At wake-up time, the content and the parameters of the knowledge base are restored and all the messages are visible from the interface.
The shutdown is processed from the HTTP client but the wake-up must occur at the shell top level (so far). It is achieved by using the -w option of co4trpserver:
			# The -j and -d options and the htrp-config.t argument are optional.
			$TRDIR/$PORTNAME/bin/co4trpserver \
				-w <name> $ANSURL <co4port> <troepsport> \
				-j http://co4.inrialpes.fr/java \
				-d http://co4.inrialpes.fr/help \
				htrp-config.t