💾 Archived View for gemini.spam.works › mirrors › textfiles › politics › SPUNK › sp000152.txt captured on 2022-04-29 at 02:20:46.

View Raw

More Information

⬅️ Previous capture (2022-03-01)

-=-=-=-=-=-=-

Anarchy Internet University, Fall Session '93 - Lesson 2

What you want is what you get is what you want
                               or, 
                                     I never Metadata I didn't like.

(An)Archie, Veronica, and Jughead--
                   You don't need a Weatherbee to know which way the wind blows

Remember at the end of the film Brazil, when Michael Palin was wearing that
funny mask and torturing Sam, all part of the Information Retrieval process?
You might wish that you could take to the internet with a good pair of 
needlenose and wrest its riches out by force, but fortunately its physical
resources are so scattered that you'd probably do more damage to your PC or 
work site (which in some cases wouldn't be that bad a thing :>) than convince 
the net to fork over what you're looking for.

Net hacks have written tools that allow you to search metadata--data about data.
This is a key problem in the proliferation of network access points (the 
internet is 'growing' at 12% per month): How do you keep track of such vast 
volumes of data, text, pictures, sounds, movies, numbers?  The task of 
organizing such a heterogeneous mix of resources has never been attempted and 
is such a complex task that it will never fully succeed :) but is an interesting
library research problem.  

The tools that go out and graze the net use the various protocols.  We already
talked about ftp, the File Transfer Protocol.  There are many other protocols
on the internet that serve many different functions that the end user never 
need know about--this message came to you courtesy of the SMTP: Simple Mail 
Transfer Protocol (and the number 25).  A handful of the protocols, though 
are of worth a cursory familiarity if you want to optimize your time in front 
of the screen.  

PROTOCOL     ACRONYM EXPANSION         CLIENT   WHAT IT DOES
-------------------------------------------------------------------------
FTP          File Transfer                ftp   Get/Put files on remote site
             Protocol                           Remote file system manipulation.

GOPHER       Gopher Information Server gopher   Browse menu heirarchy and 
             Protocol                 xgopher   retrieve data. Character based, 
             Not an acronym :)                  graphics by separate program.

NNTP         Network News Transport     nn,rn   Read/Post news articles.
             Protocol       

WAIS         Wide Area Information waissearch   Search and retrieve documents
             Server                     xwais   Networked database access.

HTTP         HyperText Transport      xmosaic   Browse and search networked 
             Protocol                   cello   hypertext using above protocols.
                                        midas   Unix X, Mac, Windows clients.
                                        viola   incorporates support for movies,
                                         lynx   images, sound, point and click
                                        tkWWW   graphic interface. slick.

Each protocol has its own meta-data search method.  Computer geeks have
taken the name for the ftp meta-data search system, archie, and extended it
to include gopher's search system, veronica, and just last week I heard
announced a meta-data knowbot for HTTP, jughead.  Remember what Anarchy said:
"You don't need a Weatherbee to know which way the wind blows."

ARCHIE - FTP

The most second commonly used protocol on the internet is FTP.  In the last
lesson, we talked about how to log onto a ftp site and retrieve a file. 
Unless you have a network of friends who constantly use the most popular
protocol on the internet SMTP to keep you up to date on whats new out there,
you have to have ways to ftp what you want.  Archie is the tool you use to 
do this.  Its easy.  Just type:

% archie country-codes

You get back a set of citations that include ftp sites, pathnames and 
filenames.  I've edited this for space.  The real query has many more
hits, but I've left some to show you how the country codes distribute 
in an archie query.,  For the first cite, the host is plaza.aarnet.edu.au.
The pathname is /usrnet/FAQs/alt.answers/mail and the filename is
country-codes.  (FAQ = Frequently Asked Questions) Read for stuff you 
are interested in and you won't bother people with the most common questions.

Host plaza.aarnet.edu.au               AUSTRALIA -too far away
    Location: /usenet/FAQs/alt.answers/mail
           FILE -r--r--r--      19681  Oct 13 02:10  country-codes

Host rzsun2.informatik.uni-hamburg.de  GERMANY -too far away
    Location: /pub/doc/news.answers/mail
           FILE -rw-r--r--      18947  Oct 13 10:27  country-codes

Host bloom-picayune.mit.edu            MIT - a good, fast bet
    Location: /pub/usenet-by-group/alt.answers/mail
           FILE -rw-rw-r--      19681  Oct 13 02:10  country-codes

Host charon.mit.edu                    Mega MIT server - may be busy but fast
    Location: /pub/usenet-by-group/alt.answers/mail
           FILE -rw-rw-r--      19612  Sep  1 06:30  country-codes

Host sunsite.unc.edu                   Mega Univ of NC server - fast and busy
    Location: /pub/docs/about-the-net
           FILE -rw-r--r--      20137  Jun  3 15:40  country-codes

Host grasp1.univ-lyon1.fr              FRANCE - blew '68 so why ftp from them?
    Location: /pub/faq-by-newsgroup/alt/alt.internet.services/mail
           FILE -rw-r--r--      19560  Sep  1 05:01  country-codes

Host han.hana.nm.kr                    KR? check out the country-codes for info
    Location: /netinfo/sh.cs.net
           FILE -rw-r--r--      16442  Jan 14 1993  country-codes

Host nnsc.nsf.net                      .net = network support site
    Location: /info
           FILE -rw-rw-r--      17455  Oct 20 1992  country-codes

Host svin02.info.win.tue.nl            NETHERLANDS - slow link
    Location: /pub/usenet/news.answers/mail
           FILE -rw-r--r--      19652  Sep  1 02:00  country-codes

Host ugle.unit.no                      NORWAY - exotic, but slow link
    Location: /faq/comp.answers/mail
           FILE -rw-rw-r--      19633  Sep  1 05:04  country-codes

Usually the information in the citation helps you decide if its worth your 
time to go check it out you get the file size in bytes (characters) the
last modification data and its location.  You need to check out the location 
of the site too.  Although the network is fast, you are limited in the speed 
of your search by the slowest link in the virtual circuit between you and the 
ftp site you are on.  Usually, the closer the ftp site to you physically the
faster the transfer and response will be faster, all things being equal.  If 
I can get something from Berkeley or MIT instead of from Finland, New Zealand 
or Taiwan, I'll get it from the backbone instead of from the spines.  Anyway, 
if you want to find out the country codes so you don't ftp the otherside of 
the world, this archie search will do it for you:

All you have to do is 
  
  1) ftp to the ftp site,
  2) cd to the pathname,
  3) get file.

VERONICA - GOPHER

Gopher is a menu-driven networked information retrieval system developed at
the University of Minnesota.  I never read the manual on gopher and it was
just totally intuitive to use if you have ever used a computer for anything.
All you have to do is hit return and use the up/down arrow keys, pgup and pgdn
and the 'u' key to go up levels.  Otherwise follow instructions and you can't 
go wrong.  Have fun in 'gopherspace.'

The Veronica server is located off of the root gopher at gopher.umich.edu.
All you have to do is:

  1) run gopher.  Typing gopher automatically connects to the root gopher
     server at the Univ of Minn: gopher.tc.umn.edu.  You can also connect to
     any other gopher server by typing '% gopher host.id.domain' where 
     host.id.domain is the real id for the system you are interested in.

% gopher

  2) select menu item 8. Other Gopher and Information Servers/
     slink down the menu with the down arrow key till you get to number
     8 and then hit return.  You can also type 8 and then return.
     (names ending with a / mean that there is another gopher level
     available when you hit return for that item.  names without a /
     are terminal, in that they point to a resource.)

  3) select item 2.  Search titles in Gopherspace using veronica/.

  4) You will get a list of available search methods.  You can experiment
     to find which works better for you.  There are some search methods that
     return results that are 'protected' and you can't read.  But  you won't
     know until you try :>.  All of these menu items (ending with a <?>)
     are searchable indexes.  If you hit return on them, you will be prompted
     for a search term.  When you enter it, blammo, a new gopher level is
     created for you of all gopher items that matched your query.

WAIS

WAIS is a searching protocol based on the NISO Z39.50 Information Retrieval
standard.  It currently exists as a networked set of database servers that
can register with central sites.  You search by querying the central site (a
centralization that cannot continue for long if the WAIS server community
continues to grow as it has, but is now quake.think.com) and it returns a set 
of servers that you would use in a second query, that would return documents 
or referrals to other sources.  It, like all of the protocols discussed, is 
a client-server system, which means that there are clients that operate on 
many different platforms, PC's, Windows, Mac's, probably VMS as well as 
several UNIX X front-ends.

Most all of this software is freely distributed in source form (this means 
that you have the human-readable program) which means that anyone who knows 
the programming language can alter it to suit their needs.  This public domain 
software is cool because its free, its a standard that no corporation has 
control over and the most popular serious operating system, UNIX, is 
practically free in source form.  The only catch is that you cannot use the
code for profit, unless you pay a hefty fee, but no one ever said that 
undermining the state was a profit-motivated endeavor, at least as far as 
the law is concerned :).

You can ftp to think.com and wais.com for information and sources for WAIS
servers and clients.  If you are going to use WAIS to put up content that
is non-profit and contributes to making the state an endangered species, drop 
me an e-mail and I can provide some pro-bono consulting.  You basically take 
your text (and associated images or sounds and stuff) index it through your 
own database system or one that comes with WAIS.  You then set up a WAIS 
server.  

When someone asks a question of a WAIS client it connects to a server via a
special network place (a port) that the WAIS server has been patiently listening
to.  The server spins off a copy of itself to handle the search and returns 
to its regimen of listening.  The handler process then searches the database, 
and returns a set of matches, or hits that include a direct key that contains
an unique identifier so you have all the data you need to snarf the data file 
in the next transaction.  

This way the server can be STATELESS in that it needn't remember any previous 
transaction because each transaction contains enough data to completely 
describe the query.  The client puts the hit list on the screen, the user then 
selects some of the hits to retrieve.  The client whisks the direct-access 
key off to the server, which returns the requested data.  This whole 
transaction session is described in formal internet protocols so that any
host that has a Z39.50 server can be queried by a user on any computer that 
has a Z39.50 client.

JUGHEAD - HTTP

This package has just been announced and it probably won't be functional for
several months yet.  Today someone mentioned a jughead for gopher, so its up
in the air.

The implementation of HTTP, with clients and servers and a hypertext scripting
markup language makes up the World Wide Web, or WWW, www, W3 or just the web.
The cool part is that you can edit a page in vi, entering HTML, the hypertext
markup language, descriptions of a page, that contains 'hot spot' links that
can point to any resource available on the network.  One document on anarchy,
for example, can point to resources--sound, video, images, text, data retrieval
anything--around the world that make a kind of exhibit that can be accessed
by anyone else on the network so equipped.  The browsers all incorporate
support for ftp, wais and gopher into their interface, so users need only
learn the HTTP client.

The so equipped part is the problem right now.  Most of these clients are
networked applications, in that they do more than put characters up on a
terminal screen like a kermit or procomm type connection does.  They 
interoperate with special servers that run graphics screens, so you have to 
have a network connection.  This kind of network connections transmit so much 
data (high bandwidth) that most people can't afford it.  There are text-based 
browsers that peruse this hypertext jungle, but they don't have the glitz of 
the X (unix window system) implementation.  Soon enough we will most all have
access to high-speed networking, either by the local library or at home, so 
if you're into this, it shouldn't be a problem.

No test afterward.  Go out, drink a beer and netnavigate the state away.

Coming soon--encryption.