From 4612afdcb277875f5babf48ff3890b16f2651c1a Mon Sep 17 00:00:00 2001 From: Eric Ihli Date: Thu, 15 Jul 2021 20:38:33 -0500 Subject: [PATCH] Update build path for js --- web/README_WGU.org | 12 +- web/resources/public/README_WGU.htm | 692 +++++++++++++++++----------- web/wgu-app/shadow-cljs.edn | 2 +- web/wgu-app/src/main/wgu/app.cljs | 1 - 4 files changed, 429 insertions(+), 278 deletions(-) diff --git a/web/README_WGU.org b/web/README_WGU.org index ec9f99c..029bbae 100644 --- a/web/README_WGU.org +++ b/web/README_WGU.org @@ -13,7 +13,7 @@ The document you are reading now contains or points to each of the requirements The section immediately following this contains notes on how to view and run the software locally. In addition, I'm hosting a demo of the application at https://darklimericks.com/wgu. -After I describe the steps to initialize a development environment, you'll find a [[Letter Of Transmittal]], [[#executive-summary][Technical Executive Summary]], [[#requirements-documentation][links to the final product and details of how it meets each requirement]], and the [[#remaining-documentation][remaining required documentation]]. +After I describe the steps to initialize a development environment, you'll find a [[#letter-of-transmittal][Letter Of Transmittal]], [[#executive-summary][Technical Executive Summary]], [[#requirements-documentation][links to the final product and details of how it meets each requirement]], and the [[#remaining-documentation][remaining required documentation]]. * Evaluation Technical Documentation @@ -33,12 +33,16 @@ After I describe the steps to initialize a development environment, you'll find - [[https://www.docker.com/][Docker]] *** Steps - 1. Run ~./db/run.sh && ./kv/run.sh~ to start the docker containers for the database and key-value store. -2. Navigate to the root directory of this git repo and run ~java -jar darklimericks-dev.jar~ -3. Visit http://localhost:8000/wgu + a. The ~run.sh~ scripts only need to run once. 
They initialize development data containers. Subsequent development can continue with ~docker start db && docker start kv~. +2. The application's ~jar~ builds with a ~make~ run from the root directory. (See [[file:../Makefile][Makefile]]). +3. Navigate to the root directory of this git repo and run ~java -jar darklimericks-dev.jar~ +4. Visit http://localhost:8000/wgu * A. Letter Of Transmittal +:PROPERTIES: +:CUSTOM_ID: letter-of-transmittal +:END: ** Problem Summary diff --git a/web/resources/public/README_WGU.htm b/web/resources/public/README_WGU.htm index 9c68b32..bd88fb1 100644 --- a/web/resources/public/README_WGU.htm +++ b/web/resources/public/README_WGU.htm @@ -3,7 +3,7 @@ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> - + RhymeStorm™ - WGU CSCI Capstone Project @@ -215,6 +215,28 @@ /*]]>*///--> // @license-end + +
@@ -223,85 +245,182 @@

Table of Contents

-
-

1 A. Letter Of Transmittal

+
+

1 WGU Evaluator Notes

+

+Hello! I hope you enjoy your time with this evaluation! +

+ +

+Here’s a quick introduction to help you navigate this project. +

+ +

+The document you are reading now contains or points to each of the requirements listed at the course task overview page for C964. +

+ +

+The section immediately following this contains notes on how to view and run the software locally. In addition, I’m hosting a demo of the application at https://darklimericks.com/wgu. +

+ +

+After I describe the steps to initialize a development environment, you’ll find a Letter Of Transmittal, Technical Executive Summary, links to the final product and details of how it meets each requirement, and the remaining required documentation. +

+
+
+ +
+

2 Evaluation Technical Documentation

+
+
+
+

2.1 How To Initialize Development Environment

+
-
-

1.1 Problem Summary

-
+
+

2.1.1 Required Software

+ +
+
+ +
+

2.2 How To Run Software Locally

+
+
+
+

2.2.1 Requirements

+
+ +
+
+ +
+

2.2.2 Steps

+
+
    +
  1. Run ./db/run.sh && ./kv/run.sh to start the docker containers for the database and key-value store. +
      +
    1. The run.sh scripts only need to run once. They initialize development data containers. Subsequent development can continue with docker start db && docker start kv.
    2. +
  2. +
  3. The application’s jar builds with a make run from the root directory. (See Makefile).
  4. +
  5. Navigate to the root directory of this git repo and run java -jar darklimericks-dev.jar
  6. +
  7. Visit http://localhost:8000/wgu
  8. +
+
+
+
+
+ +
+

3 A. Letter Of Transmittal

+
+
+ +
+

3.1 Problem Summary

+

Songwriters, artists, and record labels can save time and discover better lyrics with the help of a machine learning tool that supports their creative endeavours.

@@ -312,9 +431,9 @@ Songwriters have several old-fashioned tools at their disposal including diction
-
-

1.2 Benefits

-
+
+

3.2 Benefits

+

How many sensible phrases can you think of that rhyme with “war on poverty”? What if I say that there’s a restriction to only come up with phrases that are exactly 14 syllables? That’s a common restriction when a songwriter is trying to match the meter of a previous line. What if I add another restriction that there must be primary stress at certain spots in that 14 syllable phrase?

@@ -329,11 +448,11 @@ And this is a process that is perfect for machine learning. Machine learning can
-
-

1.3 Product - RhymeStorm®

-
+
+

3.3 Product - RhymeStorm™

+

-RhymeStorm® is a tool to help songwriters brainstorm. It provides lyrics automatically generated based on training data from existing songs while adhering to restrictions based on rhyme scheme, meter, genre, and more. +RhymeStorm™ is a tool to help songwriters brainstorm. It provides lyrics automatically generated based on training data from existing songs while adhering to restrictions based on rhyme scheme, meter, genre, and more.

@@ -358,9 +477,9 @@ This auto-complete functionality will be similar to the auto-complete that is co

-
-

1.4 Data

-
+
+

3.4 Data

+

The initial model will be trained on the lyrics from http://darklyrics.com. This is a publicly available data set with minimal meta-data. Record labels will have more valuable datasets that will include meta-data along with lyrics, such as the date the song was popular, the number of radio plays of the song, the profit of the song/artist, etc…

@@ -371,9 +490,9 @@ The software can be augmented with additional algorithms to account for the type
-
-

1.5 Objectives

-
+
+

3.5 Objectives

+

This software will accomplish its primary objective if it makes its way into the daily toolkit of a handful of singers/songwriters.

@@ -392,9 +511,9 @@ Another example is the package that turns phrases into phones. That package can
-
-

1.6 Development Methodology - Agile

-
+
+

3.6 Development Methodology - Agile

+

This project will be developed with an iterative Agile methodology. Since a large part of data science and machine learning is exploration, this project will benefit from ongoing exploration in tandem with development.

@@ -409,9 +528,9 @@ The prices quoted below are for an initial minimum-viable-product that will serv
-
-

1.7 Costs

-
+
+

3.7 Costs

+

Funding requirements are minimal. The initial dataset is public and freely available. On a typical consumer laptop, Hidden Markov Models can be trained on fairly large datasets in a short time, and the training doesn’t require expensive hardware like the GPUs used to train Deep Neural Networks.

@@ -495,18 +614,18 @@ These are my estimates for the time and cost of different aspects of initial dev
-
-

1.8 NO the impact of the solution on stakeholders

-
+
+

3.8 NO the impact of the solution on stakeholders

+

-This seems redundant or irrelevant. The only stakeholders in the project I’m describing would be the record labels or songwriters and the impact on them is described in the 1.2 section above. +This seems redundant or irrelevant. The only stakeholders in the project I’m describing would be the record labels or songwriters and the impact on them is described in the 3.2 section above.

-
-

1.9 Ethical And Legal Considerations

-
+
+

3.9 Ethical And Legal Considerations

+

Web scraping, the method used to obtain the initial dataset from http://darklyrics.com, is protected given the ruling in https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn.

@@ -517,9 +636,9 @@ The use of publicly available data in generative works is less clear. But Micros
-
-

1.10 Expertise

-
+
+

3.10 Expertise

+

I have 10 years of experience as a programmer and have worked extensively on frontend technologies like HTML/JavaScript, backend technologies like Django, and building libraries/packages/frameworks.

@@ -531,30 +650,30 @@ I’ve also been writing limericks my entire life and hold the International
-
-

2 B. Executive Summary - RhymeStorm® Technical Notes And Requirements

-
+
+

4 B. Executive Summary - RhymeStorm™ Technical Notes And Requirements

+

Write an executive summary directed to IT professionals that addresses each of the following requirements:

-
-

2.1 Decision Support Opportunity

-
+
+

4.1 Decision Support Opportunity

+

-Songwriters expend a lot of time and effort finding the perfect rhyming word or phrase. RhymeStorm® is going to amplify user’s creative abilities by searching its machine learning model for sensible and proven-successful words and phrases that meet the rhyme scheme and meter requirements requested by the user. +Songwriters expend a lot of time and effort finding the perfect rhyming word or phrase. RhymeStorm™ is going to amplify users’ creative abilities by searching its machine learning model for sensible and proven-successful words and phrases that meet the rhyme scheme and meter requirements requested by the user.

-When a songwriter needs to find likely phrases that rhyme with “war on poverty” and has 14 syllables, RhymeStorm® will automatically generate dozens of possibilities and rank them by “perplexity” and rhyme quality. The songwriter can focus there efforts on simple touch-ups to perfect the automatically generated lyrics. +When a songwriter needs to find likely phrases that rhyme with “war on poverty” and have 14 syllables, RhymeStorm™ will automatically generate dozens of possibilities and rank them by “perplexity” and rhyme quality. The songwriter can focus their efforts on simple touch-ups to perfect the automatically generated lyrics.

-
-

2.2 Customer Needs And Product Description

-
+
+

4.2 Customer Needs And Product Description

+

Songwriters spend money on dictionaries, compilations of slang, thesauruses, and phrase dictionaries. They spend their time daydreaming, brainstorming, contemplating, and mixing and matching the knowledge they acquire through these traditional means.

@@ -573,9 +692,9 @@ Computers can process and sort this information and sort the results by quality
-
-

2.3 Existing Products

-
+
+

4.3 Existing Products

+

We’re all familiar with dictionaries, thesauruses, and their shortcomings.

@@ -590,15 +709,15 @@ RhymeZone is limited in its capability. It doesn’t do well finding rhymes
-
-

2.4 Available Data And Future Data Lifecycle

-
+
+

4.4 Available Data And Future Data Lifecycle

+

The initial dataset will be gathered by downloading lyrics from http://darklyrics.com and future models can be generated by downloading lyrics from other websites. Alternatively, data can be provided by record labels and combined with meta-data that the record label may have, such as how many radio plays each song gets and how much profit they make from each song.

-RhymeStorm® can offer multiple models depending on the genre or theme that the songwriter is looking for. With the initial dataset from http://darklyrics.com, all suggestions will have a heavy metal theme. But future data sets can be trained on rap, pop, or other genres. +RhymeStorm™ can offer multiple models depending on the genre or theme that the songwriter is looking for. With the initial dataset from http://darklyrics.com, all suggestions will have a heavy metal theme. But future data sets can be trained on rap, pop, or other genres.

@@ -615,11 +734,11 @@ Each new model can be uploaded to the web server and users can select which mode

-
-

2.5 Methodology - Agile

-
+
+

4.5 Methodology - Agile

+

-RhymeStorm® development will proceed with an iterative Agile methodology. It will be composed of several independent modules that can be worked on independently, in parallel, and iteratively. +RhymeStorm™ development will proceed with an iterative Agile methodology. It will be composed of several independent modules that can be worked on independently, in parallel, and iteratively.

@@ -640,9 +759,13 @@ Much of data science is exploratory and taking an iterative Agile approach can t

-
-

2.6 Deliverables

-
+
+

4.6 Deliverables

+
+

+Three aspects of this project are available as open source repositories on GitHub. +

+

Tightly Packed Trie

@@ -655,15 +778,19 @@ Much of data science is exploratory and taking an iterative Agile approach can t Data Processing, Markov, and Rhyme Algorithms

+

+The trained data model and web interface have been deployed at the following address and the code will be provided in an archive file. +

+

Web GUI and Documentation

-
-

2.7 Implementation Plan And Anticipations

-
+
+

4.7 Implementation Plan And Anticipations

+

the plan for implementation of your data product, including the anticipated outcomes from this development

@@ -677,7 +804,7 @@ Then I’ll write a website that imports and uses those libraries.

-Since I’ll be writing and releasing these packages iteratively as open source, I’ll share them publicly as I progress and can use feedback to improve them before RhymeStorm® takes its final form. +Since I’ll be writing and releasing these packages iteratively as open source, I’ll share them publicly as I progress and can use feedback to improve them before RhymeStorm™ takes its final form.

@@ -686,9 +813,9 @@ In anticipation of user growth, I’ll be deploying the final product on Dig

-
-

2.8 Requirements Validation And Verification

-
+
+

4.8 Requirements Validation And Verification

+

the methods for validating and verifying that the developed data product meets the requirements and subsequently the needs of the customers

@@ -707,9 +834,9 @@ The final website will integrate multiple technologies and the integrations won&
-
-

2.9 Programming Environments And Costs

-
+
+

4.9 Programming Environments And Costs

+

the programming environments and any related costs, as well as the human resources that are necessary to execute each phase in the development of the data product

@@ -732,9 +859,9 @@ All code was written and all models were trained on a Lenovo T15G with an Intel
-
-

2.10 Timeline And Milestones

-
+
+

4.10 Timeline And Milestones

+
@@ -803,25 +930,25 @@ All code was written and all models were trained on a Lenovo T15G with an Intel -
-

3 C. RhymeStorm Capstone Requirements Documentation

-
+
+

5 C. RhymeStorm™ Capstone Requirements Documentation

+

-RhymeStorm is an application to help singers and songwriters brainstorm new lyrics. +RhymeStorm™ is an application to help singers and songwriters brainstorm new lyrics.

-
-

3.1 Descriptive And Predictive Methods

-
+
+

5.1 Descriptive And Predictive Methods

+
-
-

3.1.1 Descriptive Method

-
+
+

5.1.1 Descriptive Method

+
    -
  1. Most Common Grammatical Structures In A Set Of Lyrics
    -
    +
  2. Most Common Grammatical Structures In A Set Of Lyrics
    +

    By filtering songs by metrics such as popularity, number of awards, etc… we can use this software package to determine the most common grammatical phrase structure for different filtered categories.

    @@ -867,28 +994,28 @@ In the example below, you’ll see that a simple noun-phrase is the most pop
- - + + - - + + - - + + - - + + - - + +
(TOP (NP (NNP) (.)))6(TOP (S (S (S (S (S (NP (NN)) (VP (VBP) (S (NP (JJ) (NNS)) (VP (VBD) (S (VP (TO) (VP (VB)))))))) nil (CC) (S (NP (PRP)) (VP (VBP) (ADJP (RB) (JJ)) (S (VP (TO) (VP (VB) (NP (PRP)) (SBAR (S (NP (DT) (NN)) (VP (VBZ) (RB) (PP (IN) (NP (JJ)))))))))))) (.) (NP (PRP)) (VP (MD) (VP (VB)))) nil (IN) (S (NP (PRP)) (VP (MD) (VP (VB) (S (VP (TO) (VP (VB)))))))) (.) (S (CC) (NP (PRP)) (VP (VBP) (ADJP (RB) (JJ)) (SBAR (S (NP (DT)) (VP (VBZ) (NP (NP (RB) (DT) (NN)) (SBAR (S (NP (PRP)) (VP (VBZ) (SBAR (S (NP (PRP)) (VP (MD) (ADVP (RB)) (VP (VB) (S (NP (PRP)) (ADJP (JJ))) (SBAR (IN) (S (VP (VBP) (NP (DT) (JJ) (NN)))))))))))))))))) (.)))1
(TOP (S (NP (PRP)) (VP (VBP) (ADJP (JJ))) (.)))6(INC (NP (DT) (JJ) (NN)) (.) (NP (DT) (NN)) (VBZ) (ADJP (JJ)) (.) (NP (PRP)) (VP (VB)) (CC) (VB) (IN) (NP (NNS)) (.) (CC) (NP (NN)) (VBZ) (ADVP (RB)) (VP (VBD)) (.) (NP (PRP)) (VBP) (IN) (NP (PRP)) (:) (NP (PRP)) (VBP) (IN) (NP (PRP)) (.) (JJ) (NN) (VBZ) (ADJP (JJ)) (ADVP (RB)) (.) (CC) (JJ) (IN) (VBG) (NP (PRP$) (JJ) (NN)) (.) (CC) (NP (NN)) (VBZ) (VP (VBN)) (.) (NNP) (NN) (.))1
(INC (NP (JJ) (NN)) nil (IN) (NP (DT)) (NP (PRP)) (VBP))4(TOP (S (S (S (S (NP (PRP)) (VP (VBZ) (ADJP (JJ)) (S (VP (TO) (VP (VB)))))) (.) (S (SBAR (WHADVP (WRB)) (S (S (NP (DT) (NN)) (VP (VBZ) (VP (VBN) (PP (IN) (NP (PRP\() (VBN) (NN)))))) (.) (NP (PRP)) (VP (VBP) (PP (IN) (NP (PRP\)) (JJ) (NN))) (PP (IN) (NP (PRP\() (JJ) (NN)))))) (NP (PRP)) (VP (VBP) (NP (CD) (NNS)) (SBAR (RB) (.) (S (NP (PRP)) (VP (VBZ) (RB) (NP (PRP\)) (NNS)) (SBAR (IN) (S (NP (NP (NN)) (SBAR (S (NP (PRP)) (VP (VBZ) (NP (ADJP (JJR)) (DT) (NN) (SBAR (IN) (S (NP (PRP)) (ADVP (RB)) (VP (VBP))))))))) (.) (NP (PRP)) (VP (VBP) (CC) (VP (VBP) (S (NP (NN)) (ADVP (RB)) (VP (VBG) (S (VP (TO) (VP (VB) (NP (DT) (NN))))))))))))))))) (.) (NP (PRP$) (NNS)) (VP (NN))) (IN) (S (S (NP (NP (DT) (NN)) (SBAR (S (NP (CD) (NNS)) (ADVP (RB)) (NP (PRP)) (VP (VBP) (ADJP (VBN)))))) (.) (NP (PRP)) (VP (VP (VBP) (NP (DT) (NN))) (CC) (VP (VB) (NP (DT) (JJ) (NN))))) (.) (NP (DT) (JJ) (NN)) (VP (VBZ) (PP (IN) (CC)) (VP (IN)))) (.)))1
(TOP (NP (NP (JJ) (NN)) nil (NP (NN) (CC) (NN))))4(TOP (S (S (S (S (NP (NP (NNP)) (SBAR (S (NP (NP (NP (PRP\() (NNS)) (PP (IN) (NP (JJ) (NNS)))) nil (NP (NP (NNP)) (PP (IN) (NP (DT) (JJ) (NNS))))) (VP (VBD) (SBAR (S (NP (NN)) (VP (VBZ) (NP (PRP\)) (NN)) (SBAR (IN) (S (NP (PRP)) (VP (VBD) (VP (VBG) (NP (JJ) (NNS))))))))))))) (.) (S (NP (NN)) (VP (VBZ) (ADVP (RB) (CC) (RB) (JJ)))) (.) (S (PP (IN) (NP (PRP\() (NN))) (.) (NP (DT) (VBN) (NN)) (VP (VBZ) (VP (VBN))))) (.) (NP (PRP\)) (NN) (NNS)) (SBAR (IN) (S (NP (DT)) (VP (VBZ) (NP (NP (PRP\() (NN)) (PP (IN) (NP (JJ) (NN)))))))) (.) (NP (PRP\)) (NN)) (VP (VBD) (RB) (VP (VB) (S (NP (PRP)) (ADJP (DT) (JJR)))))) (.) (VP (VB) (SBAR (S (S (NP (PRP$) (NNS)) (.) (VP (VB) (S (NP (JJ) (JJ) (NNS)) (.) (VP (VB) (ADJP (JJ)))))) (VP (VBZ) (RB) (PP (IN) (NP (NN))))))) (.)))1
(TOP (S (NP (JJ) (NN)) nil (VP (VBG) (ADJP (JJ)))))4(TOP (S (S (S (ADVP (RB)) (PP (PP (IN) (NP (NN) (NNS))) (CC) (PP (IN) (NP (NP (JJ) (NNS)) (SBAR (S (NP (PRP)) (VP (MD) (ADVP (RB)) (S (NP (NNS)) (VP (VBP) (PP (IN) (NP (NP (NP (DT) (NN)) (PP (IN) (NP (DT) (JJ) (NN)))) (SBAR (S (NP (PRP)) (ADVP (RB)) (VP (VBP) (NP (DT) (JJ) (NN))))))))))))))) (.) (CC) (S (ADVP (RB)) (NP (NNS)) (VP (VBP) (ADVP (RB)) (NP (NP (NN)) (CC) (NP (NP (NP (DT) (NN)) (PP (IN) (NP (NNS)))) (PP (IN) (NP (NN))))))) (.) (NP (NNP)) (VP (VBD) (PP (IN) (NP (NP (DT) (NN)) (PP (IN) (NP (NNS))))))) (.) (RB) (S (NP (DT)) (VP (VBZ) (S (JJ) nil (S (ADVP (RB)) (NP (PRP)) (VP (VBP) (SBAR (IN) (S (NP (PRP)) (VP (VBD) (RB) (VP (VB) (SBAR (S (NP (PRP)) (VP (MD) (VP (VB) (ADJP (JJR))))))))))))))) (.) (S (SBAR (IN) (S (NP (PRP)) (VP (VBD) (RB) (VP (VB) (PP (IN) (NP (DT) (JJ) (CC) (CD))))))) (NP (PRP)) (VP (MD) (VP (VB) (NP (NP (PRP\() (NN)) (CC) (NP (PRP\)) (NNS))))))) (.) () (NP (NNS)) (VP (MD) (VP (VB) (ADVP (RB)))) (.)))1
@@ -897,13 +1024,13 @@ In the example below, you’ll see that a simple noun-phrase is the most pop
-
-

3.1.2 Prescriptive Method

-
+
+

5.1.2 Prescriptive Method

+
    -
  1. Most Likely Word To Follow A Given Phrase
    -
    +
  2. Most Likely Word To Follow A Given Phrase
    +

    To help songwriters think of new lyrics, we provide an API to receive a list of words that commonly follow/precede a given phrase.

    @@ -998,9 +1125,9 @@ In the example below, we provide a seed suffix of “bother me” and as
-
-

3.2 Datasets

-
+
+

5.2 Datasets

+

The dataset currently in use was generated from the publicly available lyrics at http://darklyrics.com.

@@ -1011,13 +1138,13 @@ Further datasets will need to be provided by the end-user.
-
-

3.3 Decision Support Functionality

-
+
+

5.3 Decision Support Functionality

+
-
-

3.3.1 Choosing Words For A Lyric Based On Markov Likelihood

-
+
+

5.3.1 Choosing Words For A Lyric Based On Markov Likelihood

+

Entire phrases can be generated using the previously mentioned functionality of generating lists of likely prefix/suffix words.

@@ -1032,9 +1159,9 @@ The user can supply criteria such as restrictions on the number of syllables, nu
-
-

3.3.2 Choosing Words To Complete A Lyric Based On Rhyme Quality

-
+
+

5.3.2 Choosing Words To Complete A Lyric Based On Rhyme Quality

+

Another part of the decision support functionality is filtering and ordering predicted words based on their rhyme quality.
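One way to picture rhyme-quality ranking is to count how many phones two pronunciations share, starting from the end of each word. The sketch below is only an illustration under stated assumptions: the phone sequences are hand-written, ARPAbet-style placeholders, and `rhyme-quality`, `example-phones`, and `rank-rhymes` are hypothetical names, not functions from the prhyme library.

```clojure
(defn rhyme-quality
  "Number of phones two pronunciations share, counted from the end."
  [phones-a phones-b]
  (->> (map = (reverse phones-a) (reverse phones-b))
       (take-while true?)
       count))

(def example-phones
  ;; Hypothetical, hand-written phone sequences (stress markers omitted).
  {"poverty"  ["P" "AA" "V" "ER" "T" "IY"]
   "property" ["P" "R" "AA" "P" "ER" "T" "IY"]
   "party"    ["P" "AA" "R" "T" "IY"]
   "hole"     ["HH" "OW" "L"]})

(defn rank-rhymes
  "Pair each candidate with its rhyme quality against target, best first."
  [target candidates]
  (->> candidates
       (map (fn [w] [w (rhyme-quality (example-phones target)
                                      (example-phones w))]))
       (sort-by second >)))

(rank-rhymes "poverty" ["hole" "party" "property"])
```

A real scorer would likely weight stressed vowels more heavily than trailing consonants; this version treats every phone equally to keep the idea visible.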

@@ -1259,9 +1386,9 @@ In the example below, you’ll see that the first 20 or so rhymes are perfec
-
-

3.4 Featurizing, Parsing, Cleaning, And Wrangling Data

-
+
+

5.4 Featurizing, Parsing, Cleaning, And Wrangling Data

+

The data processing code is in https://github.com/eihli/prhyme

@@ -1297,9 +1424,9 @@ words can be compared: “Foo” is the same as “foo”.
-
-

3.5 Data Exploration And Preparation

-
+
+

5.5 Data Exploration And Preparation

+

The primary data structure and algorithms supporting exploration of the data are a Markov Trie
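As a rough mental model, a Markov trie can be sketched with plain nested maps, where each node stores how often the n-gram ending at that node was seen. This is a simplified illustration, not the `com.owoga.trie` implementation; `ngrams`, `add-ngram`, `build-trie`, and `children-counts` are hypothetical helper names, and the three input lines are made up.

```clojure
(require '[clojure.string :as string])

(defn ngrams
  "All contiguous n-grams of size 1..max-n from a sequence of tokens."
  [max-n tokens]
  (for [n (range 1 (inc max-n))
        gram (partition n 1 tokens)]
    (vec gram)))

(defn add-ngram
  "Increment the :count stored at the node reached by following gram."
  [trie gram]
  (update-in trie
             (conj (vec (interpose :children gram)) :count)
             (fnil inc 0)))

(defn build-trie
  "Build a nested-map trie of n-gram counts from whitespace-separated lines."
  [max-n lines]
  (reduce add-ngram
          {}
          (mapcat #(ngrams max-n (string/split % #"\s+")) lines)))

(defn children-counts
  "Map of next-token -> count for the node reached by a non-empty prefix."
  [trie prefix]
  (let [path (conj (vec (interpose :children prefix)) :children)]
    (into {}
          (map (fn [[token node]] [token (:count node 0)]))
          (get-in trie path))))

(def example-trie
  (build-trie 3 ["the pain is real" "the pain is gone" "the night is black"]))

(children-counts example-trie ["the"])
```

Looking up the children of a prefix is the operation that next-word suggestion relies on; a tightly packed trie serves the same lookups from a compact binary encoding instead of nested maps.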

@@ -1346,27 +1473,27 @@ All Trie code is hosted in the git repo located at -

3.6 TODO Data Visualization Functionalities For Data Exploration And Inspection

-
+
+

5.6 TODO Data Visualization Functionalities For Data Exploration And Inspection

+
  • graph of phrase complexity on one axis and rhyme quality on another axis.
-
-

3.7 TODO Implementation Of Interactive Queries

-
+ -
-

3.8 TODO implementation of machine-learning methods and algorithms

-
+
+

5.8 Implementation Of Machine Learning Methods

+

The machine learning method chosen for this software is a Hidden Markov Model.

@@ -1430,21 +1557,22 @@ The algorithm for generating predictions from the HMM is as follows. (eduction (data-transform/xf-file-seq 501 2))) database (atom {:next-id 1}) trie (file-seq->markov-trie database files 1 3)] + (pprint/pprint [(map (comp (partial map @database) first) (take 10 (drop 105 trie)))]))
-
-[(("<s>" "call" "me")
-  ("<s>" "call")
-  ("<s>" "right" "</s>")
-  ("<s>" "right")
-  ("<s>" "that's" "proportional")
-  ("<s>" "that's")
-  ("<s>" "don't" "</s>")
-  ("<s>" "don't")
-  ("<s>" "yourself" "in")
-  ("<s>" "yourself"))]
+
+[(("<s>" "pain")
+  ("<s>" "lone" "i")
+  ("<s>" "lone")
+  ("<s>" "black" "is")
+  ("<s>" "black")
+  ("<s>" "to" "rip")
+  ("<s>" "to")
+  ("<s>" "too" "late")
+  ("<s>" "too")
+  ("<s>" "how" "wrong"))]
 

@@ -1460,16 +1588,24 @@ It also performs compaction and serialization. Song lyrics are typically provide

-
(defn train-backwards
+
(require '[com.owoga.corpus.markov :as markov]
+         '[taoensso.nippy :as nippy]
+         '[clojure.java.io :as io]
+         '[com.owoga.prhyme.data-transform :as data-transform]
+         '[clojure.pprint :as pprint]
+         '[clojure.string :as string]
+         '[com.owoga.trie :as trie]
+         '[com.owoga.tightly-packed-trie :as tpt])
+
+(defn train-backwards
   "For building lines backwards so they can be seeded with a target rhyme."
   [files n m trie-filepath database-filepath tightly-packed-trie-filepath]
   (let [database (atom {:next-id 1})
-        trie (file-seq->backwards-markov-trie database files n m)]
+        trie (markov/file-seq->backwards-markov-trie database files n m)]
     (nippy/freeze-to-file trie-filepath (seq trie))
     (println "Froze" trie-filepath)
     (nippy/freeze-to-file database-filepath @database)
     (println "Froze" database-filepath)
-    (save-tightly-packed-trie trie database tightly-packed-trie-filepath)
+    (markov/save-tightly-packed-trie trie database tightly-packed-trie-filepath)
     (let [loaded-trie (->> trie-filepath
                            nippy/thaw-from-file
                            (into (trie/make-trie)))
@@ -1477,45 +1613,47 @@ It also performs compaction and serialization. Song lyrics are typically provide
                          nippy/thaw-from-file)
           loaded-tightly-packed-trie (tpt/load-tightly-packed-trie-from-file
                                       tightly-packed-trie-filepath
-                                      (decode-fn loaded-db))]
+                                      (markov/decode-fn loaded-db))]
       (println "Loaded trie:" (take 5 loaded-trie))
       (println "Loaded database:" (take 5 loaded-db))
       (println "Loaded tightly-packed-trie:" (take 5 loaded-tightly-packed-trie))
       (println "Successfully loaded trie and database."))))
 
-(comment
-  (time
-   (let [files (->> "dark-corpus"
-                    io/file
-                    file-seq
-                    (eduction (xf-file-seq 0 250000)))
-         [trie database] (train-backwards
-                          files
-                          1
-                          5
-                          "/home/eihli/.models/markov-trie-4-gram-backwards.bin"
-                          "/home/eihli/.models/markov-database-4-gram-backwards.bin"
-                          "/home/eihli/.models/markov-tightly-packed-trie-4-gram-backwards.bin")]))
-
-  (time
-   (def markov-trie (into (trie/make-trie) (nippy/thaw-from-file "/home/eihli/.models/markov-trie-4-gram-backwards.bin"))))
-  (time
-   (def database (nippy/thaw-from-file "/home/eihli/.models/markov-database-4-gram-backwards.bin")))
-  (time
-   (def markov-tight-trie
-     (tpt/load-tightly-packed-trie-from-file
-      "/home/eihli/.models/markov-tightly-packed-trie-4-gram-backwards.bin"
-      (decode-fn database))))
-  (take 20 markov-tight-trie)
-  )
+(let [files (->> "/home/eihli/src/prhyme/dark-corpus"
+                 io/file
+                 file-seq
+                 (eduction (data-transform/xf-file-seq 0 4)))
+      [trie database] (train-backwards
+                       files
+                       1
+                       5
+                       "/tmp/markov-trie-4-gram-backwards.bin"
+                       "/tmp/markov-database-4-gram-backwards.bin"
+                       "/tmp/markov-tightly-packed-trie-4-gram-backwards.bin")])
+
+(def markov-trie (into (trie/make-trie) (nippy/thaw-from-file "/tmp/markov-trie-4-gram-backwards.bin")))
+(def database (nippy/thaw-from-file "/tmp/markov-database-4-gram-backwards.bin"))
+(def markov-tight-trie
+  (tpt/load-tightly-packed-trie-from-file
+   "/tmp/markov-tightly-packed-trie-4-gram-backwards.bin"
+   (markov/decode-fn database)))
+
+(println "\n\n Example n-gram frequencies from Hidden Markov Model:\n")
+(pprint/pprint
+ (->> markov-tight-trie
+      (drop 600)
+      (take 10)
+      (map
+       (fn [[ngram-ids [id freq]]]
+         [(string/join " " (map database ngram-ids)) freq]))))
 
-
-

3.9 Functionalities To Evaluate The Accuracy Of The Data Product

-
+
+

5.9 Functionalities To Evaluate The Accuracy Of The Data Product

+

Since creative brainstorming is the goal, “accuracy” is subjective.

@@ -1529,9 +1667,9 @@ We can, however, measure and compare language generation algorithms against how '[com.owoga.tightly-packed-trie :as tpt] '[com.owoga.corpus.markov :as markov]) -(defonce database (nippy/thaw-from-file "/home/eihli/.models/markov-database-4-gram-backwards.bin")) +(def database (nippy/thaw-from-file "/home/eihli/.models/markov-database-4-gram-backwards.bin")) -(defonce markov-tight-trie +(def markov-tight-trie (tpt/load-tightly-packed-trie-from-file "/home/eihli/.models/markov-tightly-packed-trie-4-gram-backwards.bin" (markov/decode-fn database))) @@ -1557,21 +1695,16 @@ We can, however, measure and compare language generation algorithms against how (map database) (markov/perplexity 4 markov-tight-trie)) word)))) - ["a" "this" "that"]) - nil) + ["a" "this" "that"]))
-"a" has preceeded "hole" "</s>" "</s>" a total of 250 times
-"this" has preceeded "hole" "</s>" "</s>" a total of 173 times
-"that" has preceeded "hole" "</s>" "</s>" a total of 45 times
--12.184088569934774 is the perplexity of "a" "hole" "</s>" "</s>"
--12.552930899563904 is the perplexity of "this" "hole" "</s>" "</s>"
--13.905719644461469 is the perplexity of "that" "hole" "</s>" "</s>"
+class clojure.lang.Compiler$CompilerException
 
+

The results above make intuitive sense. The most common word to precede “hole” at the end of a sentence is the word “a”. There are 250 instances of sentences ending in “… a hole.”, compared to 173 instances of “… this hole.” and 45 instances of “… that hole.”.
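To make the comparison concrete, the three counts quoted above can be turned into relative frequencies and log-probabilities. This is a minimal sketch that normalizes over only these three candidates, which is an assumption made for illustration; the trained model normalizes over its whole vocabulary, so the numbers here will not match the perplexities the library reports. The names `preceding-counts`, `relative-frequencies`, and `log2` are hypothetical.

```clojure
(def preceding-counts
  ;; Counts quoted in the text for words preceding "hole" at the end of a line.
  {"a" 250, "this" 173, "that" 45})

(defn relative-frequencies
  "Divide each count by the total of the counts supplied."
  [counts]
  (let [total (reduce + (vals counts))]
    (into {}
          (map (fn [[word c]] [word (double (/ c total))]))
          counts)))

(defn log2
  "Base-2 logarithm."
  [x]
  (/ (Math/log x) (Math/log 2)))

;; Print each candidate with its relative frequency and log2 probability,
;; most frequent first.
(doseq [[word p] (sort-by val > (relative-frequencies preceding-counts))]
  (println word (format "%.3f" p) (format "%.3f" (log2 p))))
```

The ordering is what matters for ranking suggestions: a higher count gives a higher probability and a log-probability closer to zero.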

@@ -1586,9 +1719,9 @@ This standardized measure of accuracy can be used to compare different language
-
-

3.10 Security Features

-
+
+

5.10 Security Features

+

Artists/Songwriters place a lot of value in the secrecy of their content. Therefore, all communication with the web-based interface occurs over a secure connection using HTTPS.

@@ -1603,9 +1736,9 @@ With this precaution in place, attackers will not be able to snoop the content t
-
-

3.11 TODO Tools To Monitor And Maintain The Product

-
+
+

5.11 TODO Tools To Monitor And Maintain The Product

+
  • Script to auto-update SSL cert
  • Enable NGINX dashboard?
  • @@ -1613,41 +1746,56 @@ With this precaution in place, attackers will not be able to snoop the content t
-
-

3.12 TODO A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types

+
+

5.12 TODO A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types

-
-

4 D. Documentation

-
+
+

6 D. Documentation

+

Create each of the following forms of documentation for the product you have developed:

-
-

4.1 Business Vision

-
+
+

6.1 Business Vision

+

Provide rhyming lyric suggestions optionally constrained by syllable count.

+ +
+

6.1.1 Requirements

+
+
    +
  • [ ] Given a word or phrase, suggest rhymes (ranked by quality) (Trie)
  • +
  • [ ] Given a word or phrase, suggest lyric completion (Hidden Markov Model) +
      +
    • [ ] Restrict suggestion by syllable count
    • +
    • [ ] Restrict suggestion by rhyme quality
    • +
    • [ ] Show graph of suggestions with perplexity on one axis and rhyme quality on the other
    • +
  • +
+
+
-
-

4.2 Data Sets

-
+
+

6.2 Data Sets

+

See resources/darklyrics-markov.tpt

-
-

4.3 Data Analysis

-
+
+

6.3 Data Analysis

+

See src/com/owoga/darklyrics/core.clj

@@ -1658,9 +1806,9 @@ See https://github.com/eihli/prhyme
-
-

4.4 Assessment

-
+
+

6.4 Assessment

+

See visualization of rhyme suggestion in action.

@@ -1671,9 +1819,9 @@ See perplexity?
-
-

4.5 Visualizations

-
+
+

6.5 Visualizations

+

See visualization of smoothing technique.

@@ -1684,36 +1832,36 @@ See wordcloud
-
-

4.6 Accuracy

-
+
+

6.6 Accuracy

+

• assessment of the product’s accuracy

-
-

4.7 Testing

-
+
+

6.7 Testing

+

• the results from the data product testing, revisions, and optimization based on the provided plans, including screenshots

-
-

4.8 Source

-
+
+

6.8 Source

+

• source code and executable file(s)

-
-

4.9 Quick Start

-
+
+

6.9 Quick Start

+

• a quick start guide summarizing the steps necessary to install and use the product

@@ -1721,9 +1869,9 @@ See wordcloud
-
-

5 Notes

-
+
+

7 Notes

+

http-kit doesn’t support https so no need to bother with keystore stuff like you would with jetty. Just proxy from haproxy.

@@ -1732,7 +1880,7 @@ http-kit doesn’t support https so no need to bother with keystore stuff li

Author: Eric Ihli

-

Created: 2021-07-14 Wed 19:43

+

Created: 2021-07-15 Thu 20:35

diff --git a/web/wgu-app/shadow-cljs.edn b/web/wgu-app/shadow-cljs.edn index 4ddeea3..f422b27 100644 --- a/web/wgu-app/shadow-cljs.edn +++ b/web/wgu-app/shadow-cljs.edn @@ -10,6 +10,6 @@ :builds {:frontend {:target :browser - :output-dir "/home/eihli/src/darklimericks/web/resources/public/wgu/" + :output-dir "../resources/public/wgu/" :assets-path "/assets/" :modules {:main {:init-fn wgu.app/init}}}}} diff --git a/web/wgu-app/src/main/wgu/app.cljs b/web/wgu-app/src/main/wgu/app.cljs index d7f0154..6a43a65 100644 --- a/web/wgu-app/src/main/wgu/app.cljs +++ b/web/wgu-app/src/main/wgu/app.cljs @@ -5,7 +5,6 @@ [reagent.core :as r])) - (defn play-data [& names] (for [n names i (range 20)]