Slight modifications to README

main
Eric Ihli 3 years ago
parent dadbb78c37
commit 673998c709

@ -145,8 +145,6 @@ I've also been writing limericks my entire life and hold the International Limer
:CUSTOM_ID: executive-summary
:END:
Write an executive summary directed to IT professionals that addresses each of the following requirements:
** Decision Support Opportunity
Songwriters expend a lot of time and effort finding the perfect rhyming word or phrase. RhymeStorm™ is going to amplify users' creative abilities by searching its machine learning model for sensible and proven-successful words and phrases that meet the rhyme scheme and meter requirements requested by the user.
@ -225,8 +223,6 @@ In anticipation of user growth, I'll be deploying the final product on DigitalOc
** Requirements Validation And Verification
the methods for validating and verifying that the developed data product meets the requirements and subsequently the needs of the customers
For the known requirements, I'll personally perform manual tests and quality assurance. This is a small enough project that one individual can thoroughly test all of the primary requirements.
Since the project is broken down into isolated sub-projects, unit tests will be added to the sub-projects to make sure they meet their own goals and performance standards.
@ -235,8 +231,6 @@ The final website will integrate multiple technologies and the integrations won'
** Programming Environments And Costs
the programming environments and any related costs, as well as the human resources that are necessary to execute each phase in the development of the data product
One of the benefits of a Hidden Markov Model is its relative computational affordability when compared to other machine learning techniques, like Deep Neural Networks.
We don't require a GPU or long training times on powerful computers. A 4-gram Hidden Markov Model can be trained on the more than 200,000 songs obtained from http://darklyrics.com in just a few hours on a consumer laptop.
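To give a rough sense of why this kind of training is cheap, the core of an n-gram model is a single counting pass over the tokenized corpus. The sketch below is a plain 4-gram frequency count written in Python rather than the project's Clojure, and it is only a simplified stand-in for the actual model. The names `count_ngrams` and `next_word_probabilities` and the toy lyrics are hypothetical and are not taken from the RhymeStorm™ source.

```python
from collections import Counter, defaultdict


def count_ngrams(lines, n=4):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for line in lines:
        tokens = line.lower().split()
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            counts[context][tokens[i + n - 1]] += 1
    return counts


def next_word_probabilities(counts, context):
    """Turn the raw follow-up counts for one context into relative frequencies."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()} if total else {}


# Toy corpus standing in for the real lyrics (illustrative only).
lyrics = [
    "we ride into the night",
    "we ride into the storm",
    "we ride into the night again",
]
model = count_ngrams(lyrics, n=4)
probs = next_word_probabilities(model, ["ride", "into", "the"])
```

Because the work is one linear scan plus dictionary updates, the cost grows with the number of tokens in the corpus rather than with repeated passes over the data.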
@ -829,7 +823,7 @@ You'll see 3 input fields.
The first input field is for a word or phrase for which you wish to find a rhyme. Submitting that field will return three visualizations to help you pick a rhyme.
The first visualization is a scatter plot of rhyming words with the "quality" of the rhyme on the Y axis and the number of times that rhyming word/phrase occurrs in the training corpus on the X axis.
The first visualization is a scatter plot of rhyming words with the "quality" of the rhyme on the Y axis and the number of times that rhyming word/phrase occurs in the training corpus on the X axis.
[[file:images/wgu-vis.png]]
@ -846,11 +840,15 @@ The third visualization is a table that lists all of the rhymes, their pronuncia
:CUSTOM_ID: remaining-documentation
:END:
Create each of the following forms of documentation for the product you have developed:
** Business Vision
Provide rhyming lyric suggestions optionally constrained by syllable count.
Supercharge songwriters' abilities with automated rhyming lyric suggestions for brainstorming.
Without the physical constraints imposed by paperback rhyming dictionaries, and with the full power of machine learning training, RhymeStorm™ will find rhymes that don't show up in typical rhyming dictionaries.
Rhymes and lyric suggestions will further be honed to target specific genres based on the training data set.
These two features combine with the speed of modern-day processing to provide rapid-fire rhyming suggestions never before seen.
*** Requirements
@ -929,10 +927,18 @@ and more.
** Visualizations
RhymeStorm™ provides three visualizations to help songwriters find the perfect lyric.
The first visualization is a scatterplot comparing rhyme quality to the frequency with which the rhyming word or phrase appears in the training corpus.
[[file:images/rhyme-scatterplot.png]]
The second visualization is a word cloud where each word's size is in proportion to the frequency with which the word appears in the training corpus.
[[file:images/wordcloud.png]]
And the third visualization is a sorted table of rhyme suggestions. The rhymes are sorted first by quality and then by popularity.
[[file:images/rhyme-table.png]]
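The ordering used by the table can be sketched as a two-key sort. This is a minimal Python illustration, not the application's Clojure code. It assumes that "popularity" means frequency in the training corpus and that both keys sort in descending order. The sample rows and the name `sort_suggestions` are hypothetical.

```python
from operator import itemgetter

# Hypothetical rhyme suggestions: (phrase, rhyme quality, corpus frequency).
suggestions = [
    ("sovereignty", 5, 12),
    ("property", 7, 40),
    ("novelty", 7, 9),
    ("honesty", 5, 88),
]


def sort_suggestions(rows):
    """Order rows by quality first, then by frequency, both descending."""
    return sorted(rows, key=itemgetter(1, 2), reverse=True)


table = sort_suggestions(suggestions)
```

With this ordering, a rarely used phrase with a strong rhyme still ranks above a common phrase with a weaker rhyme, and frequency only breaks ties between equally good rhymes.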
** Accuracy
@ -1035,6 +1041,8 @@ Here is an example of the test suite for the code related to syllabification: [[
** Source Code
I wrote three Clojure libraries and one Clojure application that combine to make RhymeStorm™.
*** Tightly Packed Trie
This is the data structure that backs the Hidden Markov Model.
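To show how a trie can back an n-gram model, here is a minimal sketch in Python. It uses an ordinary dictionary-based trie, not the tightly packed encoding the library implements, and the names `TrieNode`, `insert`, and `lookup` are hypothetical rather than the library's API. Each path from the root spells out a word sequence, and each node stores how many times that sequence was seen.

```python
class TrieNode:
    """One node of a plain (not tightly packed) trie keyed by words."""

    def __init__(self):
        self.children = {}
        self.count = 0


def insert(root, words):
    """Walk or create the path for `words`, incrementing each node on it."""
    node = root
    for word in words:
        node = node.children.setdefault(word, TrieNode())
        node.count += 1


def lookup(root, words):
    """Return how many inserted sequences start with `words` (0 if none)."""
    node = root
    for word in words:
        node = node.children.get(word)
        if node is None:
            return 0
    return node.count


root = TrieNode()
for ngram in [
    ("into", "the", "night"),
    ("into", "the", "storm"),
    ("into", "the", "night"),
]:
    insert(root, ngram)
```

Sharing prefixes this way means the count for a context such as `("into", "the")` and the counts for each word that follows it live on the same path, so a next-word lookup is a short walk from the root.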

@ -3,7 +3,7 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<!-- 2021-07-23 Fri 16:05 -->
<!-- 2021-07-23 Fri 17:16 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>RhymeStorm™ - WGU CSCI Capstone Project</title>
@ -223,122 +223,122 @@
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#orgb2d6a45">1. WGU Evaluator Notes</a></li>
<li><a href="#org269b8de">2. Evaluation Technical Documentation</a>
<li><a href="#orgd38ae05">1. WGU Evaluator Notes</a></li>
<li><a href="#orgc9caa35">2. Evaluation Technical Documentation</a>
<ul>
<li><a href="#orgcb57995">2.1. How To Initialize Development Environment</a>
<li><a href="#org4f30257">2.1. How To Initialize Development Environment</a>
<ul>
<li><a href="#org69f29e5">2.1.1. Required Software</a></li>
<li><a href="#orgceea7ac">2.1.2. Steps</a></li>
<li><a href="#orgf188cf0">2.1.1. Required Software</a></li>
<li><a href="#orgf197a7a">2.1.2. Steps</a></li>
</ul>
</li>
<li><a href="#orgce86f08">2.2. How To Run Software Locally</a>
<li><a href="#org853b767">2.2. How To Run Software Locally</a>
<ul>
<li><a href="#org4796d1f">2.2.1. Requirements</a></li>
<li><a href="#orgf490951">2.2.2. Steps</a></li>
<li><a href="#org6ee0f4f">2.2.1. Requirements</a></li>
<li><a href="#orgedf3725">2.2.2. Steps</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#letter-of-transmittal">3. A. Letter Of Transmittal</a>
<ul>
<li><a href="#orgebb7bab">3.1. Problem Summary</a></li>
<li><a href="#org2366696">3.2. Benefits</a></li>
<li><a href="#org965c4ba">3.3. Product - RhymeStorm™</a></li>
<li><a href="#org14b7943">3.4. Data</a></li>
<li><a href="#orgc8b62fe">3.5. Objectives</a></li>
<li><a href="#orgbbc3e05">3.6. Development Methodology - Agile</a></li>
<li><a href="#orgf4db435">3.7. Costs</a></li>
<li><a href="#org8a7247d">3.8. Stakeholder Impact</a></li>
<li><a href="#orgff1422e">3.9. Ethical And Legal Considerations</a></li>
<li><a href="#org13188cc">3.10. Expertise</a></li>
<li><a href="#org5ee843e">3.1. Problem Summary</a></li>
<li><a href="#org890f8eb">3.2. Benefits</a></li>
<li><a href="#orgf459037">3.3. Product - RhymeStorm™</a></li>
<li><a href="#org853971e">3.4. Data</a></li>
<li><a href="#org1e5fdb9">3.5. Objectives</a></li>
<li><a href="#org8b76d5a">3.6. Development Methodology - Agile</a></li>
<li><a href="#orgc9a5773">3.7. Costs</a></li>
<li><a href="#orgc102cbe">3.8. Stakeholder Impact</a></li>
<li><a href="#org63d5a71">3.9. Ethical And Legal Considerations</a></li>
<li><a href="#orge7ed6b6">3.10. Expertise</a></li>
</ul>
</li>
<li><a href="#executive-summary">4. B. Executive Summary - RhymeStorm™ Technical Notes And Requirements</a>
<ul>
<li><a href="#orga01d182">4.1. Decision Support Opportunity</a></li>
<li><a href="#org3940b7a">4.2. Customer Needs And Product Description</a></li>
<li><a href="#org6fd0b72">4.3. Existing Products</a></li>
<li><a href="#org1f4bbcb">4.4. Available Data And Future Data Lifecycle</a></li>
<li><a href="#org5172de8">4.5. Methodology - Agile</a></li>
<li><a href="#orgd870b43">4.6. Deliverables</a></li>
<li><a href="#org3e2bddf">4.7. Implementation Plan And Anticipations</a></li>
<li><a href="#org5ac346f">4.8. Requirements Validation And Verification</a></li>
<li><a href="#org111778f">4.9. Programming Environments And Costs</a></li>
<li><a href="#orgecf463b">4.10. Timeline And Milestones</a></li>
<li><a href="#org0ffe6ee">4.1. Decision Support Opportunity</a></li>
<li><a href="#org24903e6">4.2. Customer Needs And Product Description</a></li>
<li><a href="#orgc7e0d50">4.3. Existing Products</a></li>
<li><a href="#orgd471480">4.4. Available Data And Future Data Lifecycle</a></li>
<li><a href="#org46d6de3">4.5. Methodology - Agile</a></li>
<li><a href="#orga321efb">4.6. Deliverables</a></li>
<li><a href="#orgada24b3">4.7. Implementation Plan And Anticipations</a></li>
<li><a href="#org8467485">4.8. Requirements Validation And Verification</a></li>
<li><a href="#orga48f74d">4.9. Programming Environments And Costs</a></li>
<li><a href="#org1712f4e">4.10. Timeline And Milestones</a></li>
</ul>
</li>
<li><a href="#requirements-documentation">5. C. RhymeStorm™ Capstone Requirements Documentation</a>
<ul>
<li><a href="#orgcb78d3b">5.1. Descriptive And Predictive Methods</a>
<li><a href="#orgda35db8">5.1. Descriptive And Predictive Methods</a>
<ul>
<li><a href="#orgb66d992">5.1.1. Descriptive Method</a></li>
<li><a href="#org185fb9a">5.1.2. Prescriptive Method</a></li>
<li><a href="#orgab98aaf">5.1.1. Descriptive Method</a></li>
<li><a href="#orgc07d72f">5.1.2. Prescriptive Method</a></li>
</ul>
</li>
<li><a href="#orgdd2dd18">5.2. Datasets</a></li>
<li><a href="#orgb53da05">5.3. Decision Support Functionality</a>
<li><a href="#org8f499c5">5.2. Datasets</a></li>
<li><a href="#org2d4eaec">5.3. Decision Support Functionality</a>
<ul>
<li><a href="#org71793f7">5.3.1. Choosing Words For A Lyric Based On Markov Likelihood</a></li>
<li><a href="#org7d8037f">5.3.2. Choosing Words To Complete A Lyric Based On Rhyme Quality</a></li>
<li><a href="#org7c927a3">5.3.1. Choosing Words For A Lyric Based On Markov Likelihood</a></li>
<li><a href="#org0a51a02">5.3.2. Choosing Words To Complete A Lyric Based On Rhyme Quality</a></li>
</ul>
</li>
<li><a href="#org3f14d9e">5.4. Featurizing, Parsing, Cleaning, And Wrangling Data</a></li>
<li><a href="#org47f5845">5.5. Data Exploration And Preparation</a></li>
<li><a href="#orgc1d3d92">5.6. Data Visualization Functionalities For Data Exploration And Inspection</a></li>
<li><a href="#orgfa2b06c">5.7. Implementation Of Interactive Queries</a>
<li><a href="#orgc667065">5.4. Featurizing, Parsing, Cleaning, And Wrangling Data</a></li>
<li><a href="#org6b7a95d">5.5. Data Exploration And Preparation</a></li>
<li><a href="#org1d3435f">5.6. Data Visualization Functionalities For Data Exploration And Inspection</a></li>
<li><a href="#orgec327c6">5.7. Implementation Of Interactive Queries</a>
<ul>
<li><a href="#orgf9fd9be">5.7.1. Generate Rhyming Lyrics</a></li>
<li><a href="#org567d88d">5.7.2. Complete Lyric Containing Suffix</a></li>
<li><a href="#org92a52fa">5.7.1. Generate Rhyming Lyrics</a></li>
<li><a href="#org4eb310c">5.7.2. Complete Lyric Containing Suffix</a></li>
</ul>
</li>
<li><a href="#orgc382b6e">5.8. Implementation Of Machine Learning Methods</a></li>
<li><a href="#org5962734">5.9. Functionalities To Evaluate The Accuracy Of The Data Product</a></li>
<li><a href="#org74d6640">5.10. Security Features</a></li>
<li><a href="#orgaab8668">5.11. Tools To Monitor And Maintain The Product</a></li>
<li><a href="#orgb54a6ca">5.12. A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types</a></li>
<li><a href="#org875011a">5.8. Implementation Of Machine Learning Methods</a></li>
<li><a href="#org5824f12">5.9. Functionalities To Evaluate The Accuracy Of The Data Product</a></li>
<li><a href="#org88dc329">5.10. Security Features</a></li>
<li><a href="#org613bd8f">5.11. Tools To Monitor And Maintain The Product</a></li>
<li><a href="#orgc6266b7">5.12. A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types</a></li>
</ul>
</li>
<li><a href="#remaining-documentation">6. D. Documentation</a>
<ul>
<li><a href="#org953dbce">6.1. Business Vision</a>
<li><a href="#org9df4605">6.1. Business Vision</a>
<ul>
<li><a href="#org3fdc5ba">6.1.1. Requirements</a></li>
<li><a href="#orga3bdd1c">6.1.1. Requirements</a></li>
</ul>
</li>
<li><a href="#org9a18e32">6.2. Data Sets</a></li>
<li><a href="#orgf82dd4e">6.3. Data Analysis</a></li>
<li><a href="#orgf3a4715">6.4. Assessment Of Hypothesis</a></li>
<li><a href="#org2dec1b4">6.5. Visualizations</a></li>
<li><a href="#orgc2586f0">6.6. Accuracy</a>
<li><a href="#orgd136d58">6.2. Data Sets</a></li>
<li><a href="#orgf736042">6.3. Data Analysis</a></li>
<li><a href="#org407721c">6.4. Assessment Of Hypothesis</a></li>
<li><a href="#org2d951c6">6.5. Visualizations</a></li>
<li><a href="#org60086e9">6.6. Accuracy</a>
<ul>
<li><a href="#org564fa6e">6.6.1. Percentage Of Generated Lines That Are Valid English Sentences</a></li>
<li><a href="#orgd2e3d30">6.6.1. Percentage Of Generated Lines That Are Valid English Sentences</a></li>
</ul>
</li>
<li><a href="#orgf63ab34">6.7. Testing</a></li>
<li><a href="#org965436f">6.8. Source Code</a>
<li><a href="#org8d29ef2">6.7. Testing</a></li>
<li><a href="#orgbcd20cb">6.8. Source Code</a>
<ul>
<li><a href="#org2aea5c0">6.8.1. Tightly Packed Trie</a></li>
<li><a href="#orgb75e602">6.8.2. Phonetics</a></li>
<li><a href="#orgc333af7">6.8.3. Rhyming</a></li>
<li><a href="#org5e9ab5a">6.8.4. Web Server And User Interface</a></li>
<li><a href="#orgb5bde0d">6.8.1. Tightly Packed Trie</a></li>
<li><a href="#org68009bd">6.8.2. Phonetics</a></li>
<li><a href="#org615c902">6.8.3. Rhyming</a></li>
<li><a href="#org8ffc320">6.8.4. Web Server And User Interface</a></li>
</ul>
</li>
<li><a href="#org5689c17">6.9. Quick Start</a>
<li><a href="#org9010313">6.9. Quick Start</a>
<ul>
<li><a href="#org980ba69">6.9.1. How To Initialize Development Environment</a></li>
<li><a href="#org609caa1">6.9.2. How To Run Software Locally</a></li>
<li><a href="#org00f3e76">6.9.1. How To Initialize Development Environment</a></li>
<li><a href="#org7cd2611">6.9.2. How To Run Software Locally</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#org6806ed2">7. Citations</a></li>
<li><a href="#orgffa2fb6">7. Citations</a></li>
</ul>
</div>
</div>
<div id="outline-container-orgb2d6a45" class="outline-2">
<h2 id="orgb2d6a45"><span class="section-number-2">1</span> WGU Evaluator Notes</h2>
<div id="outline-container-orgd38ae05" class="outline-2">
<h2 id="orgd38ae05"><span class="section-number-2">1</span> WGU Evaluator Notes</h2>
<div class="outline-text-2" id="text-1">
<p>
Hello! I hope you enjoy your time with this evaluation!
@ -362,20 +362,20 @@ After I describe the steps to initialize a development environment, you&rsquo;ll
</div>
</div>
<div id="outline-container-org269b8de" class="outline-2">
<h2 id="org269b8de"><span class="section-number-2">2</span> Evaluation Technical Documentation</h2>
<div id="outline-container-orgc9caa35" class="outline-2">
<h2 id="orgc9caa35"><span class="section-number-2">2</span> Evaluation Technical Documentation</h2>
<div class="outline-text-2" id="text-2">
<p>
It&rsquo;s probably not necessary for you to replicate my development environment in order to evaluate this project. You can access the deployed application at <a href="https://darklimericks.com/wgu">https://darklimericks.com/wgu</a> and the libraries and supporting code that I wrote for this project at <a href="https://github.com/eihli/clj-tightly-packed-trie">https://github.com/eihli/clj-tightly-packed-trie</a>, <a href="https://github.com/eihli/syllabify">https://github.com/eihli/syllabify</a>, and <a href="https://github.com/eihli/prhyme">https://github.com/eihli/prhyme</a>. The web server and web application are not hosted publicly, but you will find them uploaded with my submission as a <code>.tar</code> archive.
</p>
</div>
<div id="outline-container-orgcb57995" class="outline-3">
<h3 id="orgcb57995"><span class="section-number-3">2.1</span> How To Initialize Development Environment</h3>
<div id="outline-container-org4f30257" class="outline-3">
<h3 id="org4f30257"><span class="section-number-3">2.1</span> How To Initialize Development Environment</h3>
<div class="outline-text-3" id="text-2-1">
</div>
<div id="outline-container-org69f29e5" class="outline-4">
<h4 id="org69f29e5"><span class="section-number-4">2.1.1</span> Required Software</h4>
<div id="outline-container-orgf188cf0" class="outline-4">
<h4 id="orgf188cf0"><span class="section-number-4">2.1.1</span> Required Software</h4>
<div class="outline-text-4" id="text-2-1-1">
<ul class="org-ul">
<li><a href="https://www.docker.com/">Docker</a></li>
@ -385,8 +385,8 @@ It&rsquo;s probably not necessary for you to replicate my development environmen
</div>
</div>
<div id="outline-container-orgceea7ac" class="outline-4">
<h4 id="orgceea7ac"><span class="section-number-4">2.1.2</span> Steps</h4>
<div id="outline-container-orgf197a7a" class="outline-4">
<h4 id="orgf197a7a"><span class="section-number-4">2.1.2</span> Steps</h4>
<div class="outline-text-4" id="text-2-1-2">
<ol class="org-ol">
<li>Run <code>./db/run.sh &amp;&amp; ./kv/run.sh</code> to start the docker containers for the database and key-value store.
@ -400,12 +400,12 @@ It&rsquo;s probably not necessary for you to replicate my development environmen
</div>
</div>
<div id="outline-container-orgce86f08" class="outline-3">
<h3 id="orgce86f08"><span class="section-number-3">2.2</span> How To Run Software Locally</h3>
<div id="outline-container-org853b767" class="outline-3">
<h3 id="org853b767"><span class="section-number-3">2.2</span> How To Run Software Locally</h3>
<div class="outline-text-3" id="text-2-2">
</div>
<div id="outline-container-org4796d1f" class="outline-4">
<h4 id="org4796d1f"><span class="section-number-4">2.2.1</span> Requirements</h4>
<div id="outline-container-org6ee0f4f" class="outline-4">
<h4 id="org6ee0f4f"><span class="section-number-4">2.2.1</span> Requirements</h4>
<div class="outline-text-4" id="text-2-2-1">
<ul class="org-ul">
<li><a href="https://www.java.com/download/ie_manual.jsp">Java</a></li>
@ -414,8 +414,8 @@ It&rsquo;s probably not necessary for you to replicate my development environmen
</div>
</div>
<div id="outline-container-orgf490951" class="outline-4">
<h4 id="orgf490951"><span class="section-number-4">2.2.2</span> Steps</h4>
<div id="outline-container-orgedf3725" class="outline-4">
<h4 id="orgedf3725"><span class="section-number-4">2.2.2</span> Steps</h4>
<div class="outline-text-4" id="text-2-2-2">
<ol class="org-ol">
<li>Run <code>./db/run.sh &amp;&amp; ./kv/run.sh</code> to start the docker containers for the database and key-value store.
@ -436,8 +436,8 @@ It&rsquo;s probably not necessary for you to replicate my development environmen
<div class="outline-text-2" id="text-letter-of-transmittal">
</div>
<div id="outline-container-orgebb7bab" class="outline-3">
<h3 id="orgebb7bab"><span class="section-number-3">3.1</span> Problem Summary</h3>
<div id="outline-container-org5ee843e" class="outline-3">
<h3 id="org5ee843e"><span class="section-number-3">3.1</span> Problem Summary</h3>
<div class="outline-text-3" id="text-3-1">
<p>
Songwriters, artists, and record labels can save time and discover better lyrics with the help of a machine learning tool that supports their creative endeavours.
@ -449,8 +449,8 @@ Songwriters have several old-fashioned tools at their disposal including diction
</div>
</div>
<div id="outline-container-org2366696" class="outline-3">
<h3 id="org2366696"><span class="section-number-3">3.2</span> Benefits</h3>
<div id="outline-container-org890f8eb" class="outline-3">
<h3 id="org890f8eb"><span class="section-number-3">3.2</span> Benefits</h3>
<div class="outline-text-3" id="text-3-2">
<p>
How many sensible phrases can you think of that rhyme with &ldquo;war on poverty&rdquo;? What if I say that there&rsquo;s a restriction to only come up with phrases that are exactly 14 syllables? That&rsquo;s a common restriction when a songwriter is trying to match the meter of a previous line. What if I add another restriction that there must be primary stress at certain spots in that 14 syllable phrase?
@ -466,8 +466,8 @@ And this is a process that is perfect for machine learning. Machine learning can
</div>
</div>
<div id="outline-container-org965c4ba" class="outline-3">
<h3 id="org965c4ba"><span class="section-number-3">3.3</span> Product - RhymeStorm™</h3>
<div id="outline-container-orgf459037" class="outline-3">
<h3 id="orgf459037"><span class="section-number-3">3.3</span> Product - RhymeStorm™</h3>
<div class="outline-text-3" id="text-3-3">
<p>
RhymeStorm™ is a tool to help songwriters brainstorm. It provides lyrics automatically generated based on training data from existing songs while adhering to restrictions based on rhyme scheme, meter, genre, and more.
@ -495,8 +495,8 @@ This auto-complete functionality will be similar to the auto-complete that is co
</div>
</div>
<div id="outline-container-org14b7943" class="outline-3">
<h3 id="org14b7943"><span class="section-number-3">3.4</span> Data</h3>
<div id="outline-container-org853971e" class="outline-3">
<h3 id="org853971e"><span class="section-number-3">3.4</span> Data</h3>
<div class="outline-text-3" id="text-3-4">
<p>
The initial model will be trained on the lyrics from <a href="http://darklyrics.com">http://darklyrics.com</a>. This is a publicly available data set with minimal meta-data. Record labels will have more valuable datasets that will include meta-data along with lyrics, such as the date the song was popular, the number of radio plays of the song, the profit of the song/artist, etc&#x2026;
@ -508,8 +508,8 @@ The software can be augmented with additional algorithms to account for the type
</div>
</div>
<div id="outline-container-orgc8b62fe" class="outline-3">
<h3 id="orgc8b62fe"><span class="section-number-3">3.5</span> Objectives</h3>
<div id="outline-container-org1e5fdb9" class="outline-3">
<h3 id="org1e5fdb9"><span class="section-number-3">3.5</span> Objectives</h3>
<div class="outline-text-3" id="text-3-5">
<p>
This software will accomplish its primary objective if it makes its way into the daily toolkit of a handful of singers/songwriters.
@ -529,8 +529,8 @@ Another example is the package that turns phrases into phones (symbols of pronun
</div>
</div>
<div id="outline-container-orgbbc3e05" class="outline-3">
<h3 id="orgbbc3e05"><span class="section-number-3">3.6</span> Development Methodology - Agile</h3>
<div id="outline-container-org8b76d5a" class="outline-3">
<h3 id="org8b76d5a"><span class="section-number-3">3.6</span> Development Methodology - Agile</h3>
<div class="outline-text-3" id="text-3-6">
<p>
This project will be developed with an iterative Agile methodology. Since a large part of data science and machine learning is exploration, this project will benefit from ongoing exploration in tandem with development.
@ -546,8 +546,8 @@ The prices quoted below are for an initial minimum-viable-product that will serv
</div>
</div>
<div id="outline-container-orgf4db435" class="outline-3">
<h3 id="orgf4db435"><span class="section-number-3">3.7</span> Costs</h3>
<div id="outline-container-orgc9a5773" class="outline-3">
<h3 id="orgc9a5773"><span class="section-number-3">3.7</span> Costs</h3>
<div class="outline-text-3" id="text-3-7">
<p>
Funding requirements are minimal. The initial dataset is public and freely available. On a typical consumer laptop, Hidden Markov Models can be trained on fairly large datasets in short time and the training doesn&rsquo;t require the use of expensive hardware like the GPUs used to train Deep Neural Networks.
@ -631,17 +631,17 @@ These are my estimates for the time and cost of different aspects of initial dev
</div>
</div>
<div id="outline-container-org8a7247d" class="outline-3">
<h3 id="org8a7247d"><span class="section-number-3">3.8</span> Stakeholder Impact</h3>
<div id="outline-container-orgc102cbe" class="outline-3">
<h3 id="orgc102cbe"><span class="section-number-3">3.8</span> Stakeholder Impact</h3>
<div class="outline-text-3" id="text-3-8">
<p>
The only stakeholders in the project will be the record labels or songwriters. I describe the only impact to them in the <a href="#org2366696">3.2</a> section above.
The only stakeholders in the project will be the record labels or songwriters. I describe the only impact to them in the <a href="#org890f8eb">3.2</a> section above.
</p>
</div>
</div>
<div id="outline-container-orgff1422e" class="outline-3">
<h3 id="orgff1422e"><span class="section-number-3">3.9</span> Ethical And Legal Considerations</h3>
<div id="outline-container-org63d5a71" class="outline-3">
<h3 id="org63d5a71"><span class="section-number-3">3.9</span> Ethical And Legal Considerations</h3>
<div class="outline-text-3" id="text-3-9">
<p>
Web scraping, the method used to obtain the initial dataset from <a href="http://darklyrics.com">http://darklyrics.com</a>, is protected given the ruling in <a href="https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn">https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn</a> (HiQ Labs v. LinkedIn 2021).
@ -653,8 +653,8 @@ The use of publicly available data in generative works is less clear. But Micros
</div>
</div>
<div id="outline-container-org13188cc" class="outline-3">
<h3 id="org13188cc"><span class="section-number-3">3.10</span> Expertise</h3>
<div id="outline-container-orge7ed6b6" class="outline-3">
<h3 id="orge7ed6b6"><span class="section-number-3">3.10</span> Expertise</h3>
<div class="outline-text-3" id="text-3-10">
<p>
I have 10 years of experience as a programmer and have worked extensively on frontend technologies like HTML/JavaScript, backend technologies like Django, and building libraries/packages/frameworks.
@ -670,13 +670,10 @@ I&rsquo;ve also been writing limericks my entire life and hold the International
<div id="outline-container-executive-summary" class="outline-2">
<h2 id="executive-summary"><span class="section-number-2">4</span> B. Executive Summary - RhymeStorm™ Technical Notes And Requirements</h2>
<div class="outline-text-2" id="text-executive-summary">
<p>
Write an executive summary directed to IT professionals that addresses each of the following requirements:
</p>
</div>
<div id="outline-container-orga01d182" class="outline-3">
<h3 id="orga01d182"><span class="section-number-3">4.1</span> Decision Support Opportunity</h3>
<div id="outline-container-org0ffe6ee" class="outline-3">
<h3 id="org0ffe6ee"><span class="section-number-3">4.1</span> Decision Support Opportunity</h3>
<div class="outline-text-3" id="text-4-1">
<p>
Songwriters expend a lot of time and effort finding the perfect rhyming word or phrase. RhymeStorm™ is going to amplify users&rsquo; creative abilities by searching its machine learning model for sensible and proven-successful words and phrases that meet the rhyme scheme and meter requirements requested by the user.
@ -688,8 +685,8 @@ When a songwriter needs to find likely phrases that rhyme with &ldquo;war on pov
</div>
</div>
<div id="outline-container-org3940b7a" class="outline-3">
<h3 id="org3940b7a"><span class="section-number-3">4.2</span> Customer Needs And Product Description</h3>
<div id="outline-container-org24903e6" class="outline-3">
<h3 id="org24903e6"><span class="section-number-3">4.2</span> Customer Needs And Product Description</h3>
<div class="outline-text-3" id="text-4-2">
<p>
Songwriters spend money on dictionaries, compilations of slang, thesauruses, and phrase dictionaries. They spend their time daydreaming, brainstorming, contemplating, and mixing and matching the knowledge they acquire through these traditional means.
@ -709,8 +706,8 @@ Computers can process and sort this information and sort the results by quality
</div>
</div>
<div id="outline-container-org6fd0b72" class="outline-3">
<h3 id="org6fd0b72"><span class="section-number-3">4.3</span> Existing Products</h3>
<div id="outline-container-orgc7e0d50" class="outline-3">
<h3 id="orgc7e0d50"><span class="section-number-3">4.3</span> Existing Products</h3>
<div class="outline-text-3" id="text-4-3">
<p>
We&rsquo;re all familiar with dictionaries, thesauruses, and their shortcomings.
@ -726,8 +723,8 @@ RhymeZone is limited in its capability. It doesn&rsquo;t do well finding rhymes
</div>
</div>
<div id="outline-container-org1f4bbcb" class="outline-3">
<h3 id="org1f4bbcb"><span class="section-number-3">4.4</span> Available Data And Future Data Lifecycle</h3>
<div id="outline-container-orgd471480" class="outline-3">
<h3 id="orgd471480"><span class="section-number-3">4.4</span> Available Data And Future Data Lifecycle</h3>
<div class="outline-text-3" id="text-4-4">
<p>
The initial dataset will be gathered by downloading lyrics from <a href="http://darklyrics.com">http://darklyrics.com</a> and future models can be generated by downloading lyrics from other websites. Alternatively, data can be provided by record labels and combined with meta-data that the record label may have, such as how many radio plays each song gets and how much profit they make from each song.
@ -751,8 +748,8 @@ Each new model can be uploaded to the web server and users can select which mode
</div>
</div>
<div id="outline-container-org5172de8" class="outline-3">
<h3 id="org5172de8"><span class="section-number-3">4.5</span> Methodology - Agile</h3>
<div id="outline-container-org46d6de3" class="outline-3">
<h3 id="org46d6de3"><span class="section-number-3">4.5</span> Methodology - Agile</h3>
<div class="outline-text-3" id="text-4-5">
<p>
RhymeStorm™ development will proceed with an iterative Agile methodology. It will be composed of several independent modules that can be worked on independently, in parallel, and iteratively.
@ -776,8 +773,8 @@ Much of data science is exploratory and taking an iterative Agile approach can t
</div>
</div>
<div id="outline-container-orgd870b43" class="outline-3">
<h3 id="orgd870b43"><span class="section-number-3">4.6</span> Deliverables</h3>
<div id="outline-container-orga321efb" class="outline-3">
<h3 id="orga321efb"><span class="section-number-3">4.6</span> Deliverables</h3>
<div class="outline-text-3" id="text-4-6">
<ul class="org-ul">
<li>Supporting libraries source code</li>
@ -811,8 +808,8 @@ The trained data model and web interface has been deployed at the following addr
</div>
</div>
<div id="outline-container-org3e2bddf" class="outline-3">
<h3 id="org3e2bddf"><span class="section-number-3">4.7</span> Implementation Plan And Anticipations</h3>
<div id="outline-container-orgada24b3" class="outline-3">
<h3 id="orgada24b3"><span class="section-number-3">4.7</span> Implementation Plan And Anticipations</h3>
<div class="outline-text-3" id="text-4-7">
<p>
I&rsquo;ll start by writing and releasing the supporting libraries and packages: Tries, Syllabification/Phonetics, Rhyming.
@ -832,13 +829,9 @@ In anticipation of user growth, I&rsquo;ll be deploying the final product on Dig
</div>
</div>
<div id="outline-container-org5ac346f" class="outline-3">
<h3 id="org5ac346f"><span class="section-number-3">4.8</span> Requirements Validation And Verification</h3>
<div id="outline-container-org8467485" class="outline-3">
<h3 id="org8467485"><span class="section-number-3">4.8</span> Requirements Validation And Verification</h3>
<div class="outline-text-3" id="text-4-8">
<p>
the methods for validating and verifying that the developed data product meets the requirements and subsequently the needs of the customers
</p>
<p>
For the known requirements, I&rsquo;ll personally perform manual tests and quality assurance. This is a small enough project that one individual can thoroughly test all of the primary requirements.
</p>
@ -853,13 +846,9 @@ The final website will integrate multiple technologies and the integrations won&
</div>
</div>
<div id="outline-container-org111778f" class="outline-3">
<h3 id="org111778f"><span class="section-number-3">4.9</span> Programming Environments And Costs</h3>
<div id="outline-container-orga48f74d" class="outline-3">
<h3 id="orga48f74d"><span class="section-number-3">4.9</span> Programming Environments And Costs</h3>
<div class="outline-text-3" id="text-4-9">
<p>
the programming environments and any related costs, as well as the human resources that are necessary to execute each phase in the development of the data product
</p>
<p>
One of the benefits of a Hidden Markov Model is its relative computational affordability when compared to other machine learning techniques, like Deep Neural Networks.
</p>
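<p>
To illustrate why this kind of model is cheap to train, here is a minimal sketch of n-gram counting in Clojure. It is an illustration under assumptions, not the project&rsquo;s actual training code: the function name <code>ngram-counts</code> is hypothetical, and a plain map stands in for the Trie. Training reduces to a single counting pass over the tokens, with no iterative optimization and no GPU.
</p>

```clojure
;; Hypothetical helper, not part of the project's libraries.
;; Counts every contiguous n-gram in a sequence of tokens in one pass.
(defn ngram-counts
  [n tokens]
  (frequencies (partition n 1 tokens)))

;; Count the bigrams in a toy lyric.
(ngram-counts 2 ["call" "me" "call" "me" "now"])
```

<p>
The cost grows linearly with the number of tokens, which is consistent with training on a consumer laptop.
</p>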
@ -878,8 +867,8 @@ All code was written and all models were trained on a Lenovo T15G with an Intel
</div>
</div>
<div id="outline-container-orgecf463b" class="outline-3">
<h3 id="orgecf463b"><span class="section-number-3">4.10</span> Timeline And Milestones</h3>
<div id="outline-container-org1712f4e" class="outline-3">
<h3 id="org1712f4e"><span class="section-number-3">4.10</span> Timeline And Milestones</h3>
<div class="outline-text-3" id="text-4-10">
<table border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">
@ -957,16 +946,16 @@ RhymeStorm™ is an application to help singers and songwriters brainstorm new l
</p>
</div>
<div id="outline-container-orgcb78d3b" class="outline-3">
<h3 id="orgcb78d3b"><span class="section-number-3">5.1</span> Descriptive And Predictive Methods</h3>
<div id="outline-container-orgda35db8" class="outline-3">
<h3 id="orgda35db8"><span class="section-number-3">5.1</span> Descriptive And Predictive Methods</h3>
<div class="outline-text-3" id="text-5-1">
</div>
<div id="outline-container-orgb66d992" class="outline-4">
<h4 id="orgb66d992"><span class="section-number-4">5.1.1</span> Descriptive Method</h4>
<div id="outline-container-orgab98aaf" class="outline-4">
<h4 id="orgab98aaf"><span class="section-number-4">5.1.1</span> Descriptive Method</h4>
<div class="outline-text-4" id="text-5-1-1">
</div>
<ol class="org-ol">
<li><a id="org510f9eb"></a>Most Common Grammatical Structures In A Set Of Lyrics<br />
<li><a id="org00830d9"></a>Most Common Grammatical Structures In A Set Of Lyrics<br />
<div class="outline-text-5" id="text-5-1-1-1">
<p>
By filtering songs by metrics such as popularity or number of awards, we can use this software package to determine the most common grammatical phrase structure for each filtered category.
@ -1043,12 +1032,12 @@ In the example below, you&rsquo;ll see that a simple noun-phrase is the most pop
</ol>
</div>
<div id="outline-container-org185fb9a" class="outline-4">
<h4 id="org185fb9a"><span class="section-number-4">5.1.2</span> Prescriptive Method</h4>
<div id="outline-container-orgc07d72f" class="outline-4">
<h4 id="orgc07d72f"><span class="section-number-4">5.1.2</span> Prescriptive Method</h4>
<div class="outline-text-4" id="text-5-1-2">
</div>
<ol class="org-ol">
<li><a id="org4b6e650"></a>Most Likely Word To Follow A Given Phrase<br />
<li><a id="org79ddf78"></a>Most Likely Word To Follow A Given Phrase<br />
<div class="outline-text-5" id="text-5-1-2-1">
<p>
To help songwriters think of new lyrics, we provide an API to receive a list of words that commonly follow/precede a given phrase.
@ -1144,8 +1133,8 @@ In the example below, we provide a seed suffix of &ldquo;bother me&rdquo; and as
</div>
</div>
<div id="outline-container-orgdd2dd18" class="outline-3">
<h3 id="orgdd2dd18"><span class="section-number-3">5.2</span> Datasets</h3>
<div id="outline-container-org8f499c5" class="outline-3">
<h3 id="org8f499c5"><span class="section-number-3">5.2</span> Datasets</h3>
<div class="outline-text-3" id="text-5-2">
<p>
The dataset currently in use was generated from the publicly available lyrics at <a href="http://darklyrics.com">http://darklyrics.com</a>.
@ -1161,12 +1150,12 @@ The trained dataset is available as a resource in this repository at <code>web/r
</div>
</div>
<div id="outline-container-orgb53da05" class="outline-3">
<h3 id="orgb53da05"><span class="section-number-3">5.3</span> Decision Support Functionality</h3>
<div id="outline-container-org2d4eaec" class="outline-3">
<h3 id="org2d4eaec"><span class="section-number-3">5.3</span> Decision Support Functionality</h3>
<div class="outline-text-3" id="text-5-3">
</div>
<div id="outline-container-org71793f7" class="outline-4">
<h4 id="org71793f7"><span class="section-number-4">5.3.1</span> Choosing Words For A Lyric Based On Markov Likelihood</h4>
<div id="outline-container-org7c927a3" class="outline-4">
<h4 id="org7c927a3"><span class="section-number-4">5.3.1</span> Choosing Words For A Lyric Based On Markov Likelihood</h4>
<div class="outline-text-4" id="text-5-3-1">
<p>
Entire phrases can be generated using the previously mentioned functionality of generating lists of likely prefix/suffix words.
@ -1182,8 +1171,8 @@ The user can supply criteria such as restrictions on the number of syllables, nu
</div>
</div>
<div id="outline-container-org7d8037f" class="outline-4">
<h4 id="org7d8037f"><span class="section-number-4">5.3.2</span> Choosing Words To Complete A Lyric Based On Rhyme Quality</h4>
<div id="outline-container-org0a51a02" class="outline-4">
<h4 id="org0a51a02"><span class="section-number-4">5.3.2</span> Choosing Words To Complete A Lyric Based On Rhyme Quality</h4>
<div class="outline-text-4" id="text-5-3-2">
<p>
Another part of the decision support functionality is filtering and ordering predicted words based on their rhyme quality.
@ -1409,8 +1398,8 @@ In the example below, you&rsquo;ll see that the first 20 or so rhymes are perfec
</div>
</div>
<div id="outline-container-org3f14d9e" class="outline-3">
<h3 id="org3f14d9e"><span class="section-number-3">5.4</span> Featurizing, Parsing, Cleaning, And Wrangling Data</h3>
<div id="outline-container-orgc667065" class="outline-3">
<h3 id="orgc667065"><span class="section-number-3">5.4</span> Featurizing, Parsing, Cleaning, And Wrangling Data</h3>
<div class="outline-text-3" id="text-5-4">
<p>
The data processing code is in <a href="https://github.com/eihli/prhyme">https://github.com/eihli/prhyme</a>
@ -1446,8 +1435,8 @@ words can be compared: &ldquo;Foo&rdquo; is the same as &ldquo;foo&rdquo;.
</div>
</div>
<div id="outline-container-org47f5845" class="outline-3">
<h3 id="org47f5845"><span class="section-number-3">5.5</span> Data Exploration And Preparation</h3>
<div id="outline-container-org6b7a95d" class="outline-3">
<h3 id="org6b7a95d"><span class="section-number-3">5.5</span> Data Exploration And Preparation</h3>
<div class="outline-text-3" id="text-5-5">
<p>
The primary data structure and algorithms supporting exploration of the data are a Markov Trie
@ -1495,8 +1484,8 @@ All Trie code is hosted in the git repo located at <a href="https://github.com/e
</div>
</div>
<div id="outline-container-orgc1d3d92" class="outline-3">
<h3 id="orgc1d3d92"><span class="section-number-3">5.6</span> Data Visualization Functionalities For Data Exploration And Inspection</h3>
<div id="outline-container-org1d3435f" class="outline-3">
<h3 id="org1d3435f"><span class="section-number-3">5.6</span> Data Visualization Functionalities For Data Exploration And Inspection</h3>
<div class="outline-text-3" id="text-5-6">
<p>
The functionality to explore and visualize data is baked into the Trie data structure.
@ -1506,7 +1495,7 @@ The functionality to explore and visualize data is baked into the Trie data stru
By simply viewing the Trie in a Clojure REPL, you can inspect the Trie&rsquo;s structure.
</p>
<pre class="example" id="org3a9fa2a">
<pre class="example" id="orgd88bf1a">
(let [initialized-trie (-&gt;&gt; (trie/make-trie "dog" "dog" "dot" "dot" "do" "do"))]
initialized-trie)
;; =&gt; {(\d \o \g) "dog", (\d \o \t) "dot", (\d \o) "do", (\d) nil}
@ -1548,12 +1537,12 @@ The Hidden Markov Model data structure doesn&rsquo;t lend itself to any useful g
</div>
</div>
<div id="outline-container-orgfa2b06c" class="outline-3">
<h3 id="orgfa2b06c"><span class="section-number-3">5.7</span> Implementation Of Interactive Queries</h3>
<div id="outline-container-orgec327c6" class="outline-3">
<h3 id="orgec327c6"><span class="section-number-3">5.7</span> Implementation Of Interactive Queries</h3>
<div class="outline-text-3" id="text-5-7">
</div>
<div id="outline-container-orgf9fd9be" class="outline-4">
<h4 id="orgf9fd9be"><span class="section-number-4">5.7.1</span> Generate Rhyming Lyrics</h4>
<div id="outline-container-org92a52fa" class="outline-4">
<h4 id="org92a52fa"><span class="section-number-4">5.7.1</span> Generate Rhyming Lyrics</h4>
<div class="outline-text-4" id="text-5-7-1">
<p>
This interactive query will return a list of rhyming phrases to any word or phrase you enter.
@ -1696,8 +1685,8 @@ The interactive query for the above can be found at <a href="https://darklimeric
</div>
</div>
<div id="outline-container-org567d88d" class="outline-4">
<h4 id="org567d88d"><span class="section-number-4">5.7.2</span> Complete Lyric Containing Suffix</h4>
<div id="outline-container-org4eb310c" class="outline-4">
<h4 id="org4eb310c"><span class="section-number-4">5.7.2</span> Complete Lyric Containing Suffix</h4>
<div class="outline-text-4" id="text-5-7-2">
<p>
This interactive query will return a list of lyrics completing the given suffix with randomly generated prefixes.
@ -1799,8 +1788,8 @@ The interactive query for the above can be found at <a href="https://darklimeric
</div>
</div>
<div id="outline-container-orgc382b6e" class="outline-3">
<h3 id="orgc382b6e"><span class="section-number-3">5.8</span> Implementation Of Machine Learning Methods</h3>
<div id="outline-container-org875011a" class="outline-3">
<h3 id="org875011a"><span class="section-number-3">5.8</span> Implementation Of Machine Learning Methods</h3>
<div class="outline-text-3" id="text-5-8">
<p>
The machine learning method chosen for this software is a Hidden Markov Model.
@ -1870,7 +1859,7 @@ The algorithm for generating predictions from the HMM is as follows.
</pre>
</div>
<pre class="example" id="orgdf854c2">
<pre class="example" id="orgeb7813e">
[(("&lt;s&gt;" "call" "me")
("&lt;s&gt;" "call")
("&lt;s&gt;" "right" "&lt;/s&gt;")
@ -1960,8 +1949,8 @@ It also performs compaction and serialization. Song lyrics are typically provide
</div>
<div id="outline-container-org5962734" class="outline-3">
<h3 id="org5962734"><span class="section-number-3">5.9</span> Functionalities To Evaluate The Accuracy Of The Data Product</h3>
<div id="outline-container-org5824f12" class="outline-3">
<h3 id="org5824f12"><span class="section-number-3">5.9</span> Functionalities To Evaluate The Accuracy Of The Data Product</h3>
<div class="outline-text-3" id="text-5-9">
<p>
Since creative brainstorming is the goal, &ldquo;accuracy&rdquo; is subjective.
@ -2036,8 +2025,8 @@ This standardized measure of accuracy can be used to compare different language
</div>
</div>
<div id="outline-container-org74d6640" class="outline-3">
<h3 id="org74d6640"><span class="section-number-3">5.10</span> Security Features</h3>
<div id="outline-container-org88dc329" class="outline-3">
<h3 id="org88dc329"><span class="section-number-3">5.10</span> Security Features</h3>
<div class="outline-text-3" id="text-5-10">
<p>
Artists/Songwriters place a lot of value in the secrecy of their content. Therefore, all communication with the web-based interface occurs over a secure connection using HTTPS.
@ -2053,15 +2042,15 @@ With this precaution in place, attackers will not be able to snoop the content t
</div>
</div>
<div id="outline-container-orgaab8668" class="outline-3">
<h3 id="orgaab8668"><span class="section-number-3">5.11</span> Tools To Monitor And Maintain The Product</h3>
<div id="outline-container-org613bd8f" class="outline-3">
<h3 id="org613bd8f"><span class="section-number-3">5.11</span> Tools To Monitor And Maintain The Product</h3>
<div class="outline-text-3" id="text-5-11">
<p>
Because the application server sits behind an HAProxy load balancer, we can use the built-in HAProxy stats page to monitor the amount of traffic and the health of the application servers.
</p>
<div id="org235130f" class="figure">
<div id="org2112f75" class="figure">
<p><img src="images/stats.png" alt="stats.png" />
</p>
</div>
@ -2080,8 +2069,8 @@ The server also includes the <code>certbot</code> script for updating and mainta
</div>
</div>
<div id="outline-container-orgb54a6ca" class="outline-3">
<h3 id="orgb54a6ca"><span class="section-number-3">5.12</span> A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types</h3>
<div id="outline-container-orgc6266b7" class="outline-3">
<h3 id="orgc6266b7"><span class="section-number-3">5.12</span> A User-Friendly, Functional Dashboard That Includes At Least Three Visualization Types</h3>
<div class="outline-text-3" id="text-5-12">
<p>
You can access an example of the user interface at <a href="https://darklimericks.com/wgu">https://darklimericks.com/wgu</a>.
@ -2096,11 +2085,11 @@ The first input field is for a word or phrase for which you wish to find a rhyme
</p>
<p>
The first visualization is a scatter plot of rhyming words with the &ldquo;quality&rdquo; of the rhyme on the Y axis and the number of times that rhyming word/phrase occurrs in the training corpus on the X axis.
The first visualization is a scatter plot of rhyming words with the &ldquo;quality&rdquo; of the rhyme on the Y axis and the number of times that rhyming word/phrase occurs in the training corpus on the X axis.
</p>
<div id="orge641d29" class="figure">
<div id="org6aa1adf" class="figure">
<p><img src="images/wgu-vis.png" alt="wgu-vis.png" />
</p>
</div>
@ -2110,7 +2099,7 @@ The second visualization is a word cloud where the size of each word is based on
</p>
<div id="orga473162" class="figure">
<div id="org950c96a" class="figure">
<p><img src="images/wgu-vis-cloud.png" alt="wgu-vis-cloud.png" />
</p>
</div>
@ -2120,7 +2109,7 @@ The third visualization is a table that lists all of the rhymes, their pronuncia
</p>
<div id="org04fe17a" class="figure">
<div id="org215dc00" class="figure">
<p><img src="images/wgu-vis-table.png" alt="wgu-vis-table.png" />
</p>
</div>
@ -2131,21 +2120,30 @@ The third visualization is a table that lists all of the rhymes, their pronuncia
<div id="outline-container-remaining-documentation" class="outline-2">
<h2 id="remaining-documentation"><span class="section-number-2">6</span> D. Documentation</h2>
<div class="outline-text-2" id="text-remaining-documentation">
<p>
Create each of the following forms of documentation for the product you have developed:
</p>
</div>
<div id="outline-container-org953dbce" class="outline-3">
<h3 id="org953dbce"><span class="section-number-3">6.1</span> Business Vision</h3>
<div id="outline-container-org9df4605" class="outline-3">
<h3 id="org9df4605"><span class="section-number-3">6.1</span> Business Vision</h3>
<div class="outline-text-3" id="text-6-1">
<p>
Provide rhyming lyric suggestions optionally constrained by syllable count.
Supercharge songwriters&rsquo; abilities with automated rhyming lyric suggestions for brainstorming.
</p>
<p>
Without the physical constraints imposed by paperback rhyming dictionaries, and with the full power of machine learning training, RhymeStorm™ will find rhymes that don&rsquo;t show up in typical rhyming dictionaries.
</p>
<p>
Rhymes and lyric suggestions will further be honed to target specific genres based on the training data set.
</p>
<p>
These two features combine with the speed of modern-day processing to provide rapid-fire rhyming suggestions never before seen.
</p>
</div>
<div id="outline-container-org3fdc5ba" class="outline-4">
<h4 id="org3fdc5ba"><span class="section-number-4">6.1.1</span> Requirements</h4>
<div id="outline-container-orga3bdd1c" class="outline-4">
<h4 id="orga3bdd1c"><span class="section-number-4">6.1.1</span> Requirements</h4>
<div class="outline-text-4" id="text-6-1-1">
<ul class="org-ul">
<li class="on"><code>[X]</code> Given a word or phrase, suggest rhymes (ranked by quality) (Trie)</li>
@ -2161,8 +2159,8 @@ Provide rhyming lyric suggestions optionally constrained by syllable count.
</div>
</div>
<div id="outline-container-org9a18e32" class="outline-3">
<h3 id="org9a18e32"><span class="section-number-3">6.2</span> Data Sets</h3>
<div id="outline-container-orgd136d58" class="outline-3">
<h3 id="orgd136d58"><span class="section-number-3">6.2</span> Data Sets</h3>
<div class="outline-text-3" id="text-6-2">
<p>
I obtained the dataset from <a href="http://darklyrics.com">http://darklyrics.com</a>.
@ -2186,8 +2184,8 @@ See <code>web/resources/models/</code>
</div>
</div>
<div id="outline-container-orgf82dd4e" class="outline-3">
<h3 id="orgf82dd4e"><span class="section-number-3">6.3</span> Data Analysis</h3>
<div id="outline-container-orgf736042" class="outline-3">
<h3 id="orgf736042"><span class="section-number-3">6.3</span> Data Analysis</h3>
<div class="outline-text-3" id="text-6-3">
<p>
I wrote code to perform certain types of data analysis, but I didn&rsquo;t find it useful to meet the business requirements of this project.
@ -2199,8 +2197,8 @@ For example, there is natural language processing code at <a href="https://githu
</div>
</div>
<div id="outline-container-orgf3a4715" class="outline-3">
<h3 id="orgf3a4715"><span class="section-number-3">6.4</span> Assessment Of Hypothesis</h3>
<div id="outline-container-org407721c" class="outline-3">
<h3 id="org407721c"><span class="section-number-3">6.4</span> Assessment Of Hypothesis</h3>
<div class="outline-text-3" id="text-6-4">
<p>
I&rsquo;ll use an example output to subjectively assess the results of the project.
@ -2416,31 +2414,47 @@ and more.
</div>
</div>
<div id="outline-container-org2dec1b4" class="outline-3">
<h3 id="org2dec1b4"><span class="section-number-3">6.5</span> Visualizations</h3>
<div id="outline-container-org2d951c6" class="outline-3">
<h3 id="org2d951c6"><span class="section-number-3">6.5</span> Visualizations</h3>
<div class="outline-text-3" id="text-6-5">
<p>
RhymeStorm™ provides three visualizations to help songwriters find the perfect lyric.
</p>
<div id="org6486c92" class="figure">
<p>
The first visualization is a scatterplot comparing rhyme quality to the frequency with which the rhyming word or phrase appears in the training corpus.
</p>
<div id="orgfb59f99" class="figure">
<p><img src="images/rhyme-scatterplot.png" alt="rhyme-scatterplot.png" />
</p>
</div>
<p>
The second visualization is a word cloud where each word&rsquo;s size is in proportion to the frequency with which the word appears in the training corpus.
</p>
<div id="org2e16384" class="figure">
<div id="org3403aad" class="figure">
<p><img src="images/wordcloud.png" alt="wordcloud.png" />
</p>
</div>
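<p>
As a rough sketch of the proportional sizing described above, the function below scales a font size linearly with corpus frequency. The name <code>word-size</code> and its parameters are hypothetical and chosen for illustration. The actual word cloud may use a different scaling.
</p>

```clojure
;; Hypothetical helper: scales a word's font size linearly with its
;; corpus frequency, so the most frequent word is drawn at max-px.
(defn word-size
  [max-px max-freq freq]
  (* max-px (/ (double freq) max-freq)))

;; A word seen 25 times, when the most frequent word is seen 100 times.
(word-size 48 100 25) ;; => 12.0
```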
<p>
And the third visualization is a sorted table of rhyme suggestions. The rhymes are sorted first by quality and then by popularity.
</p>
<div id="orgf699c68" class="figure">
<div id="orgea5f528" class="figure">
<p><img src="images/rhyme-table.png" alt="rhyme-table.png" />
</p>
</div>
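<p>
The two-level ordering used by the table can be sketched as follows. This is a minimal illustration that assumes each rhyme is a map with hypothetical <code>:quality</code> and <code>:frequency</code> keys, which may not match the application&rsquo;s real data shape.
</p>

```clojure
;; Hypothetical record shape: {:word ..., :quality ..., :frequency ...}.
;; Sorts by descending quality, breaking ties by descending frequency.
(defn sort-rhymes
  [rhymes]
  (sort-by (juxt (comp - :quality) (comp - :frequency)) rhymes))

(map :word
     (sort-rhymes [{:word "free" :quality 3 :frequency 10}
                   {:word "me"   :quality 3 :frequency 50}
                   {:word "tree" :quality 2 :frequency 90}]))
;; => ("me" "free" "tree")
```

<p>
Negating both keys lets a single ascending <code>sort-by</code> produce the descending order, and <code>juxt</code> builds the composite sort key.
</p>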
</div>
</div>
<div id="outline-container-orgc2586f0" class="outline-3">
<h3 id="orgc2586f0"><span class="section-number-3">6.6</span> Accuracy</h3>
<div id="outline-container-org60086e9" class="outline-3">
<h3 id="org60086e9"><span class="section-number-3">6.6</span> Accuracy</h3>
<div class="outline-text-3" id="text-6-6">
<p>
It&rsquo;s difficult to objectively test the model&rsquo;s accuracy since the goal of &ldquo;brainstorm new lyric&rdquo; is so subjective. A valid test of that goal would require many human subjects to evaluate their own performance with the tool compared to their performance without it.
@ -2451,8 +2465,8 @@ If we allow ourselves the assumption that the close a generated phrase is to a v
</p>
</div>
<div id="outline-container-org564fa6e" class="outline-4">
<h4 id="org564fa6e"><span class="section-number-4">6.6.1</span> Percentage Of Generated Lines That Are Valid English Sentences</h4>
<div id="outline-container-orgd2e3d30" class="outline-4">
<h4 id="orgd2e3d30"><span class="section-number-4">6.6.1</span> Percentage Of Generated Lines That Are Valid English Sentences</h4>
<div class="outline-text-4" id="text-6-6-1">
<p>
We can use <a href="https://opennlp.apache.org/">Apache OpenNLP</a> to parse sentences into a grammar structure conforming to the parts of speech specified by the <a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html">University of Pennsylvania&rsquo;s Treebank Project</a>.
@ -2530,8 +2544,8 @@ Where <code>nlp/valid-sentence?</code> is defined as follows.
</div>
</div>
<div id="outline-container-orgf63ab34" class="outline-3">
<h3 id="orgf63ab34"><span class="section-number-3">6.7</span> Testing</h3>
<div id="outline-container-org8d29ef2" class="outline-3">
<h3 id="org8d29ef2"><span class="section-number-3">6.7</span> Testing</h3>
<div class="outline-text-3" id="text-6-7">
<p>
My language of choice for this project encourages a programming technique or paradigm known as REPL-driven development. REPL stands for Read-Eval-Print-Loop. This is a way to write and test code in real-time without a compilation step. Individual code chunks can be evaluated inside an editor, resulting in rapid feedback.
@ -2575,12 +2589,16 @@ Here is an example of the test suite for the code related to syllabification: <a
</div>
</div>
<div id="outline-container-org965436f" class="outline-3">
<h3 id="org965436f"><span class="section-number-3">6.8</span> Source Code</h3>
<div id="outline-container-orgbcd20cb" class="outline-3">
<h3 id="orgbcd20cb"><span class="section-number-3">6.8</span> Source Code</h3>
<div class="outline-text-3" id="text-6-8">
<p>
I wrote three Clojure libraries and one Clojure application that combine to make RhymeStorm™.
</p>
</div>
<div id="outline-container-org2aea5c0" class="outline-4">
<h4 id="org2aea5c0"><span class="section-number-4">6.8.1</span> Tightly Packed Trie</h4>
<div id="outline-container-orgb5bde0d" class="outline-4">
<h4 id="orgb5bde0d"><span class="section-number-4">6.8.1</span> Tightly Packed Trie</h4>
<div class="outline-text-4" id="text-6-8-1">
<p>
This is the data structure that backs the Hidden Markov Model.
@ -2592,8 +2610,8 @@ This is the data structure that backs the Hidden Markov Model.
</div>
</div>
<div id="outline-container-orgb75e602" class="outline-4">
<h4 id="orgb75e602"><span class="section-number-4">6.8.2</span> Phonetics</h4>
<div id="outline-container-org68009bd" class="outline-4">
<h4 id="org68009bd"><span class="section-number-4">6.8.2</span> Phonetics</h4>
<div class="outline-text-4" id="text-6-8-2">
<p>
This is the helper library that syllabifies and manipulates words, phones, and syllables.
@ -2605,8 +2623,8 @@ This is the helper library that syllabifies and manipulates words, phones, and s
</div>
</div>
<div id="outline-container-orgc333af7" class="outline-4">
<h4 id="orgc333af7"><span class="section-number-4">6.8.3</span> Rhyming</h4>
<div id="outline-container-org615c902" class="outline-4">
<h4 id="org615c902"><span class="section-number-4">6.8.3</span> Rhyming</h4>
<div class="outline-text-4" id="text-6-8-3">
<p>
This library contains code for analyzing rhymes and sentence structure and for manipulating corpora.
@ -2618,8 +2636,8 @@ This library contains code for analyzing rhymes, sentence structure, and manipul
</div>
</div>
<div id="outline-container-org5e9ab5a" class="outline-4">
<h4 id="org5e9ab5a"><span class="section-number-4">6.8.4</span> Web Server And User Interface</h4>
<div id="outline-container-org8ffc320" class="outline-4">
<h4 id="org8ffc320"><span class="section-number-4">6.8.4</span> Web Server And User Interface</h4>
<div class="outline-text-4" id="text-6-8-4">
<p>
This application is not publicly available. I&rsquo;ll upload it with submission of the project.
@ -2628,16 +2646,16 @@ This application is not publicly available. I&rsquo;ll upload it with submission
</div>
</div>
<div id="outline-container-org5689c17" class="outline-3">
<h3 id="org5689c17"><span class="section-number-3">6.9</span> Quick Start</h3>
<div id="outline-container-org9010313" class="outline-3">
<h3 id="org9010313"><span class="section-number-3">6.9</span> Quick Start</h3>
<div class="outline-text-3" id="text-6-9">
</div>
<div id="outline-container-org980ba69" class="outline-4">
<h4 id="org980ba69"><span class="section-number-4">6.9.1</span> How To Initialize Development Environment</h4>
<div id="outline-container-org00f3e76" class="outline-4">
<h4 id="org00f3e76"><span class="section-number-4">6.9.1</span> How To Initialize Development Environment</h4>
<div class="outline-text-4" id="text-6-9-1">
</div>
<ol class="org-ol">
<li><a id="org5418910"></a>Required Software<br />
<li><a id="org3ad8643"></a>Required Software<br />
<div class="outline-text-5" id="text-6-9-1-1">
<ul class="org-ul">
<li><a href="https://www.docker.com/">Docker</a></li>
@ -2647,7 +2665,7 @@ This application is not publicly available. I&rsquo;ll upload it with submission
</div>
</li>
<li><a id="orgb469978"></a>Steps<br />
<li><a id="org5eaa8dc"></a>Steps<br />
<div class="outline-text-5" id="text-6-9-1-2">
<ol class="org-ol">
<li>Run <code>./db/run.sh &amp;&amp; ./kv/run.sh</code> to start the docker containers for the database and key-value store.
@ -2662,12 +2680,12 @@ This application is not publicly available. I&rsquo;ll upload it with submission
</ol>
</div>
<div id="outline-container-org609caa1" class="outline-4">
<h4 id="org609caa1"><span class="section-number-4">6.9.2</span> How To Run Software Locally</h4>
<div id="outline-container-org7cd2611" class="outline-4">
<h4 id="org7cd2611"><span class="section-number-4">6.9.2</span> How To Run Software Locally</h4>
<div class="outline-text-4" id="text-6-9-2">
</div>
<ol class="org-ol">
<li><a id="orga645e28"></a>Requirements<br />
<li><a id="orga03ff0d"></a>Requirements<br />
<div class="outline-text-5" id="text-6-9-2-1">
<ul class="org-ul">
<li><a href="https://www.java.com/download/ie_manual.jsp">Java</a></li>
@ -2676,7 +2694,7 @@ This application is not publicly available. I&rsquo;ll upload it with submission
</div>
</li>
<li><a id="orgf1bc042"></a>Steps<br />
<li><a id="org37b7c9e"></a>Steps<br />
<div class="outline-text-5" id="text-6-9-2-2">
<ol class="org-ol">
<li>Run <code>./db/run.sh &amp;&amp; ./kv/run.sh</code> to start the docker containers for the database and key-value store.
@ -2695,8 +2713,8 @@ This application is not publicly available. I&rsquo;ll upload it with submission
</div>
<div id="outline-container-org6806ed2" class="outline-2">
<h2 id="org6806ed2"><span class="section-number-2">7</span> Citations</h2>
<div id="outline-container-orgffa2fb6" class="outline-2">
<h2 id="orgffa2fb6"><span class="section-number-2">7</span> Citations</h2>
<div class="outline-text-2" id="text-7">
<p>
Wikimedia Foundation. (2021, July 16). Markov Model. Wikipedia.
@ -2730,7 +2748,7 @@ Ulrich Germann, Eric Joanis, and Samuel Larkin. 2009. Tightly packed tries: How
</div>
<div id="postamble" class="status">
<p class="author">Author: Eric Ihli</p>
<p class="date">Created: 2021-07-23 Fri 16:05</p>
<p class="date">Created: 2021-07-23 Fri 17:16</p>
</div>
</body>
</html>
