Projects

Note: Most projects have code that can be viewed, but for a variety of reasons, some do not. These are marked below with an asterisk (*). All others should link to a repository.

CLI Search Engine
Code available here
  • Allows user to index and search a directory of text files
  • Uses an inverted index to track, for each word, the documents it appears in, its positions, and its frequency
  • Allows user to search continuously without having to reload the index file
  • Written in Java
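
A minimal sketch of the inverted-index idea described above. The project itself is written in Java; this is an illustrative Python version, not the project's code. The names `build_index` and `search` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to {document name: [positions of the word]}."""
    index = defaultdict(dict)
    for name, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(name, []).append(pos)
    return index


def search(index, word):
    """Return (document, frequency, positions) tuples, most frequent first."""
    postings = index.get(word.lower(), {})
    results = [(name, len(positions), positions)
               for name, positions in postings.items()]
    # Sort by descending frequency, then by document name for stable output.
    return sorted(results, key=lambda r: (-r[1], r[0]))


# Hypothetical sample data: the index is built once and queried many times.
docs = {"a.txt": "the cat sat on the mat", "b.txt": "the dog"}
index = build_index(docs)
print(search(index, "the"))
```

Because the index is built once and kept in memory, repeated queries only need a dictionary lookup rather than a rescan of the files.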
CLI Spellcheck
Code available here
  • Allows user to spellcheck basic text files
  • Utilizes Trie data structure to represent an approved list of correctly spelled words
  • Allows user to use a word list of their choosing to build the spellcheck Trie
  • Written in Java
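
A minimal sketch of how a trie can hold an approved word list for spellchecking. The project itself is written in Java; this Python version is only illustrative, and the names `Trie`, `contains`, and `misspelled` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        node = self.root
        for ch in word.lower():
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word


def misspelled(trie, text):
    """Return the whitespace-separated words of text not found in the trie."""
    return [w for w in text.split() if not trie.contains(w)]


# Hypothetical word list; a real run would load one from a file of the user's choosing.
trie = Trie(["hello", "world", "help"])
print(misspelled(trie, "hello wrld"))
```

This sketch does no punctuation handling; a word such as "hello," would be reported as misspelled unless the text is cleaned first.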

Code available here
Site available here
  • Allows user to visually edit and create .bdf font files pixel by pixel
  • Allows user to import and edit .bdf fonts
  • Allows user to customize which glyphs they include in their font
  • Written in HTML/CSS/JavaScript

Code available here
Site available here
  • Allows users to quickly view an AQI value for their area based on the closest PurpleAir sensors
  • Written in Python
  • My contributions:
    • Implemented math for calculating nearest sensors in NumPy, which greatly increased the speed of AQI calculation for the system
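
One way a vectorized nearest-sensor lookup might look in NumPy. This is a sketch under assumptions: it uses the haversine great-circle distance, which may not be the formula the project uses, and the function name, parameters, and coordinates are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def nearest_sensors(user_lat, user_lon, sensor_lats, sensor_lons, k=3):
    """Return (indices, distances in km) of the k sensors closest to the user.

    Distances are computed for all sensors at once with the haversine
    formula, avoiding a Python-level loop over sensors.
    """
    lat1, lon1 = np.radians(user_lat), np.radians(user_lon)
    lat2 = np.radians(np.asarray(sensor_lats, dtype=float))
    lon2 = np.radians(np.asarray(sensor_lons, dtype=float))

    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

    order = np.argsort(dist)[:k]
    return order, dist[order]


# Hypothetical sensor coordinates, for illustration only.
idx, km = nearest_sensors(47.61, -122.3,
                          [47.6, 47.7, 40.0],
                          [-122.3, -122.3, -100.0], k=2)
print(idx, km)
```

Operating on whole arrays is what gives the speedup over computing one distance at a time in plain Python.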

Code available here
  • Allows users to segment a directory of audio clips into smaller clips
  • Utilizes the pydub package
  • Written in Python

  • Built in collaboration with other students as part of a graduate-level course
  • Summarizes a given set of documents in approximately 100 words
  • Written in Python
  • Utilizes TF-IDF scoring for content selection, multiple-expert information ordering, regex-based content compression, and BLEU-score-based redundancy limitation
  • Utilizes NLTK (tokenization, BLEU scores), Splitta (sentence boundary detection), and more
  • My contributions:
    • Handled initial data extraction with BeautifulSoup
    • Added sentence boundary detection post-data extraction using NLTK/Splitta
    • Helped to implement TF-IDF scoring for content selection
    • Wrote a Bash script to handle automated testing of our system
    • Actively helped debug the system throughout its development
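
A minimal sketch of TF-IDF scoring used for content selection, written with only the standard library. It is not the course project's implementation; the function names, the unsmoothed IDF, and the mean-score sentence ranking are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """documents: list of token lists. Returns one {term: tf-idf} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def rank_sentences(sentences, term_scores):
    """Order sentences by the mean tf-idf score of their tokens, highest first."""
    def score(sentence):
        tokens = sentence.lower().split()
        if not tokens:
            return 0.0
        return sum(term_scores.get(t, 0.0) for t in tokens) / len(tokens)
    return sorted(sentences, key=score, reverse=True)


# Hypothetical toy corpus, for illustration only.
corpus = [["the", "volcano", "erupted"], ["the", "market", "fell"]]
scores = tfidf_scores(corpus)
print(rank_sentences(["the the", "volcano erupted"], scores[0]))
```

Terms that occur in every document, such as "the" here, receive a score of zero, so sentences made up of distinctive terms rise to the top.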

Code available here
  • Built as part of a Master's capstone project
  • Allows users to search a bilingual FAQ from the Alaska Labor Standards and Safety Division's Wage and Hour - Employee FAQ with speech in English and Yup'ik
  • Implemented in Python
  • Search with English speech utilizes DeepSpeech's pre-trained English model and scorer
  • Search with Yup'ik speech utilizes an acoustic model trained with DeepSpeech's fine-tuning capabilities (starting from the pre-trained English model); the scorer was built with DeepSpeech's modified KenLM scorer creator
  • Utilizes sounddevice, wavio, sox, PySimpleGUI, and more

Code available here
  • Built as part of a graduate-level course at the University of Washington in collaboration with two other group members
  • Game logic built in Python, utilizing Rasa's ConveRT pipeline for NLU
  • Uses a small text adventure game to show the accessibility and usability of NLU in games
  • NLU allows players to use natural language when interacting with the game, unlike most text adventure games that require terse language
  • My contributions:
    • Built the original NLU model using Rasa and our group's handmade utterances
    • Researched how to interact with Rasa's NLU server system and wrote the initial toy script wrapper for our code to do this locally
    • Wrote the JSON script file used to return game dialogue to the user
    • Offered consultation to the group member who worked on the game logic in their design of the logic, especially as it pertained to the design of the script

Code available here
  • Allows users to track time spent on tasks with active timers/task descriptions, or to manually add tasks and time spent on them
  • Utilizes a simple command-line interface with options for file specification and file overwrite
  • Written in Python

Code and grammar available here
  • Built in collaboration with Lonny Strunk and Dr. Emily Bender (instructor) as part of a graduate grammar engineering course
  • A partial grammar engineered for Central Alaskan Yup'ik (esu)
  • HPSG-based, built in TDL using the LKB environment
  • Built on research by Miyaoka, O. and Jacobson, S.

Code available here
  • Generates random sentences of changeable size in the style of author Jane Austen, based on text from three of her novels
  • Built in Python, using a basic bigram language model to generate random text
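
A minimal sketch of a bigram text generator of the kind described above. It is not the project's code: the function names, the whitespace tokenization, and the tiny training string are assumptions made for illustration.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Map each word to the list of words that follow it in the text."""
    tokens = text.split()
    model = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev].append(nxt)  # repeats preserve the observed frequencies
    return model


def generate(model, start, max_words=12, rng=None):
    """Generate up to max_words words by repeatedly sampling a follower."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words:
        followers = model.get(words[-1])
        if not followers:
            break  # dead end: the last word was never followed by anything
        words.append(rng.choice(followers))
    return " ".join(words)


# Tiny training text for illustration; a real model would be trained on full novels.
model = train_bigrams("it is a truth universally acknowledged")
print(generate(model, "it", max_words=4))
```

Sampling from the raw follower lists makes frequent word pairs proportionally more likely, which is what gives the generated text its resemblance to the source.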