prhyme/sgt
Eric Ihli bf6836bb69 More SGT exploration
..
ReadMe.txt Working completions
a.out More SGT exploration
freq_freqs.txt Working completions
sgt.h More SGT exploration
sgt.h.gch More SGT exploration
sgt.zip Working completions
sgttest.cpp Working completions

ReadMe.txt

This file contains invisible Unicode characters!

This file contains invisible Unicode characters that may be processed differently from what appears below. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to reveal hidden characters.

SGT

===



The files here contain a C++ class for implementing simple Good-Turing

re-estimation, as described by Geoff Sampson in the book Empirical Linguistics

(2001), and on the web at http://www.grsampson.net/RGoodTur.html. The code

here is a C++ adaptation of the published code by Sampson and Gale, with the

bug fix issued in 2000. It is encapsulated as a class to allow it to be

incorporated into other programs. An additional coding change is that the data

can be presented in any order, whereas the original code required the data to

be in ascending order.



Sampson's original code was issued with no restrictions on use. In keeping

with the spirit of this, the code here is issued under an open source licence

which allows essentially unrestricted use.



LICENCE

-------

Copyright (c) David Elworthy 2004.

All rights reserved.



Redistribution and use in source and binary forms for any purpose, with or

without modification, are permitted provided that the following conditions

are met:



1. Redistributions of source code must retain the above copyright notice,

   this list of conditions, and the following disclaimer.

 

2. Redistributions in binary form must reproduce the above copyright

   notice, this list of conditions, and the disclaimer that follows 

   these conditions in the documentation and/or other materials 

   provided with the distribution.



THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED

WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN

NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,

SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED

TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR

PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF

LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING

NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS

SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.



Contact details

---------------

You may contact me at david@friendlymoose.com. I would be happy to hear of any

experiences you have with the code; please feel free to send me updated

versions. The reference site for the code is http://www.friendlymoose.com/.



Files and use

-------------

There are three files:

sgt.h       SGT header file

sgttest.cpp A test and example program



There is no source file, as the SGT class is a template over the observation

type, typically either an int or a double.



Information about using the class is included in the header file. The code has

been tested with g++ version 3.2 on cygwin and Microsoft Visual Studio version

6 on Windows 2000. You can compile and link the test program using g++ using

the command

     g++ -o sgttest sgttest.cpp



For Visual Studio, from the command line, you can compile and link with

     cl -GX sgttest.cpp



Version history

---------------

Initial version released January 2004.

Updated to a better implementation April 2004.