You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

73 lines
3.1 KiB
Plaintext

SGT
===
The files here contain a C++ class for implementing simple Good-Turing
re-estimation, as described by Geoff Sampson in the book Empirical Linguistics
(2001), and on the web at http://www.grsampson.net/RGoodTur.html. The code
here is a C++ adaptation of the published code by Sampson and Gale, with the
bug fix issued in 2000. It is encapsulated as a class to allow it to be
incorporated into other programs. An additional coding change is that the data
can be presented in any order, whereas the original code required the data to
be in ascending order.
Sampson's original code was issued with no restrictions on use. In keeping
with the spirit of this, the code here is issued under an open source licence
which allows essentially unrestricted use.
LICENCE
-------
Copyright (c) David Elworthy 2004.
All rights reserved.
Redistribution and use in source and binary forms for any purpose, with or
without modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the disclaimer that follows
these conditions in the documentation and/or other materials
provided with the distribution.
THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Contact details
---------------
You may contact me at david@friendlymoose.com. I would be happy to hear of any
experiences you have with the code; please feel free to send me updated
versions. The reference site for the code is http://www.friendlymoose.com/.
Files and use
-------------
There are three files:
sgt.h SGT header file
sgttest.cpp A test and example program
There is no source file, as the SGT class is a template over the observation
type, typically either an int or a double.
Information about using the class is included in the header file. The code has
been tested with g++ version 3.2 on cygwin and Microsoft Visual Studio version
6 on Windows 2000. You can compile and link the test program using g++ using
the command
g++ -o sgttest sgttest.cpp
For Visual Studio, from the command line, you can compile and link with
cl -GX sgttest.cpp
Version history
---------------
Initial version released January 2004.
Updated to a better implementation April 2004.