SGT
===

The files here contain a C++ class for implementing simple Good-Turing
re-estimation, as described by Geoff Sampson in the book Empirical Linguistics
(2001), and on the web at http://www.grsampson.net/RGoodTur.html. The code
here is a C++ adaptation of the published code by Sampson and Gale, with the
bug fix issued in 2000. It is encapsulated as a class to allow it to be
incorporated into other programs. An additional coding change is that the data
can be presented in any order, whereas the original code required the data to
be in ascending order.
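
As a rough illustration of the quantities involved, the sketch below computes
the unsmoothed Turing estimate r* = (r + 1) N(r+1) / N(r) and the unseen mass
P0 = N(1) / N from (r, N(r)) pairs supplied in any order. It is not the code in
sgt.h: the function names are invented for this example, a C++11 compiler is
assumed, and the smoothing of the N(r) values that Sampson and Gale's method
adds is left out.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper, not part of sgt.h: gathers (r, N_r) pairs given in
// any order. std::map keeps its keys sorted, so later iteration over the
// result is in ascending r without the caller having to sort the input.
std::map<unsigned, unsigned>
collect(const std::vector<std::pair<unsigned, unsigned>>& pairs)
{
    std::map<unsigned, unsigned> freqOfFreq;
    for (const auto& p : pairs) {
        assert(p.first > 0);              // a frequency of 0 is never observed
        freqOfFreq[p.first] += p.second;  // merge repeated entries for one r
    }
    return freqOfFreq;
}

// Total number of observations: N = sum over r of r * N_r.
double totalObservations(const std::map<unsigned, unsigned>& freqOfFreq)
{
    double total = 0.0;
    for (const auto& entry : freqOfFreq)
        total += static_cast<double>(entry.first) * entry.second;
    return total;
}

// Probability mass reserved for unseen items: P0 = N_1 / N.
double unseenMass(const std::map<unsigned, unsigned>& freqOfFreq)
{
    const double total = totalObservations(freqOfFreq);
    assert(total > 0.0);
    const auto one = freqOfFreq.find(1);
    return one == freqOfFreq.end() ? 0.0 : one->second / total;
}

// Unsmoothed Turing estimate r* = (r + 1) * N_{r+1} / N_r.
// Returns -1.0 when N_r or N_{r+1} is absent (or N_r is zero); those gaps
// are what the smoothing step of the full method is there to handle.
double turingEstimate(const std::map<unsigned, unsigned>& freqOfFreq,
                      unsigned r)
{
    const auto cur = freqOfFreq.find(r);
    const auto next = freqOfFreq.find(r + 1);
    if (cur == freqOfFreq.end() || next == freqOfFreq.end()
        || cur->second == 0)
        return -1.0;
    return (r + 1.0) * next->second / cur->second;
}
```

With the pairs (3, 1), (1, 4), (2, 2) given in that order, N is 11, so
unseenMass gives 4/11, and turingEstimate gives 1.0 for r = 1 and 1.5 for
r = 2. For r = 3 there is no N(4), so the sketch reports the estimate as
undefined.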

Sampson's original code was issued with no restrictions on use. In keeping
with the spirit of this, the code here is issued under an open source licence
which allows essentially unrestricted use.

LICENCE
-------
Copyright (c) David Elworthy 2004.
All rights reserved.

Redistribution and use in source and binary forms for any purpose, with or
without modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright notice,
   this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions, and the disclaimer that follows
   these conditions in the documentation and/or other materials
   provided with the distribution.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Contact details
---------------
You may contact me at david@friendlymoose.com. I would be happy to hear of any
experiences you have with the code; please feel free to send me updated
versions. The reference site for the code is http://www.friendlymoose.com/.

Files and use
-------------
There are three files, counting this README:
  sgt.h         SGT header file
  sgttest.cpp   A test and example program

There is no separate source file for the class, as SGT is a template over the
observation type, typically either an int or a double.
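
To show why a template class needs no source file of its own, here is a
minimal sketch of a header-only class template over an observation type. The
class name Tally and its members are made up for this illustration and are not
taken from sgt.h; a C++11 compiler is assumed. Because the whole definition
lives in the header, the compiler can instantiate it for int or double
wherever it is used.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for a header-only class template; the name Tally
// and its interface are illustrative only and do not mirror sgt.h.
template <typename Obs>
class Tally {
public:
    // Record one observation; calls may arrive in any order.
    void add(const Obs& value) { ++counts_[value]; }

    // How many times a value has been recorded (0 if never).
    std::size_t count(const Obs& value) const
    {
        const auto it = counts_.find(value);
        return it == counts_.end() ? 0 : it->second;
    }

    // Number of distinct values recorded so far.
    std::size_t distinct() const { return counts_.size(); }

private:
    std::map<Obs, std::size_t> counts_;
};

// Example use with the two observation types mentioned above.
inline void tallyExample()
{
    Tally<int> ints;
    ints.add(3);
    ints.add(1);
    ints.add(3);
    assert(ints.count(3) == 2);

    Tally<double> reals;
    reals.add(0.5);
    assert(reals.distinct() == 1);
}
```

A program that includes such a header compiles as a single translation unit,
which is why one compiler command on the test program is enough.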

Information about using the class is included in the header file. The code has
been tested with g++ version 3.2 on Cygwin and Microsoft Visual Studio version
6 on Windows 2000. You can compile and link the test program with g++ using
the command
  g++ -o sgttest sgttest.cpp

For Visual Studio, from the command line, you can compile and link with
  cl -GX sgttest.cpp

Version history
---------------
Initial version released January 2004.
Updated to a better implementation April 2004.