.. _gsoc-proposal-template:

Proposal template 
#################

Introduction
*************

Summary links
=============

- **Contributor:** `Michael Clemens <https://forum.beagleboard.org/u/mclemcrew>`_
- **Mentors:** `Jack Armitage <https://forum.beagleboard.org/u/jarm>`_, `Chris Kiefer <https://forum.beagleboard.org/u/luuma>`_

Status
=======

This project is currently just a proposal.

Proposal
========

About 
=====

- **Forum:** :fab:`discourse` `u/mclemcrew (Michael Clemens) <https://forum.beagleboard.org/u/mclemcrew>`_
- **OpenBeagle:** :fab:`gitlab` `mclem (Michael Clemens) <https://openbeagle.org/mclem>`_
- **Github:** :fab:`github` `mclemcrew (Michael Clemens) <https://github.com/mclemcrew>`_
- **School:** :fas:`school` University of Utah
- **Country:** :fas:`flag` United States
- **Primary language:** :fas:`language` English
- **Typical work hours:** :fas:`clock` 7AM-3PM US Mountain
- **Previous GSoC participation:** :fab:`google` N/A

Project
********

**Project name:** Differential Logic Based Adaptive Soundwalk

Soundwalking
==============

Soundwalking is a practice that was initially developed by the World Soundscape Project, an international research group founded by R. Murray Schafer (Truax 2021).  The project focused on acoustic ecology and the idea that noise pollution in our sonic environment has impaired our ability to actively listen (Staśko-Mazur 2015).  Soundwalking was first used to improve our listening skills and understand the human experience of aural perception in modern society.  Today, it is used to help listeners appreciate the importance of the soundscape in their environment and is widely used in soundscape research.

Other definitions of soundwalking from researchers include:

- An empirical method for identifying and analyzing the soundscape and its components in different locations (Adams, 2008)
- '...any excursion whose main purpose is listening to the environment' (Drever, 2009)
- A practice that promotes an aural perception of the environment as both a physical space and a space of social and political tensions, divisions, and flows (Carras, 2019)

In summary, soundwalking focuses primarily on the soundscape's auditory elements rather than other sources of sensory information.  It is a way of connecting with nature that is often overlooked due to our reliance on visual cues.  This project draws inspiration from the following innovative soundwalk concepts:

- **"Alter Bahnhof Video Walk"** by `Janet Cardiff and George Bures Miller <https://cardiffmiller.com/walks/alter-bahnhof-video-walk/>`_ offers an uncanny soundwalking experience for users.  Participants use an iPod and headphones to navigate Kassel's train station.  The video on the screen mirrors their physical environment, and they are provided with a directional narration via the `video provided <https://www.youtube.com/watch?v=sOkQE7m31Pw>`_.
- **"Ambulation"** by `Shaw and Bowers <https://blog.bela.io/ambulation-sound-walk-tim-shaw/>`_ focuses on in situ soundwalks.  Sounds are collected and processed live from the participant's environment and broadcasted through wireless headphones, avoiding pre-recorded elements.  Shaw initially implemented this on a laptop with PureData but now uses Bela as a way to realize this work.

This project proposal is a derivative of these works and leverages soundwalks as a medium for creative exploration.  Users will wear wireless headphones while walking along a predetermined route provided by the aural narrator (similar to Cardiff and Miller's presentation).  The system, using Bela, will continuously analyze the environment.  It will identify specific audio patterns (e.g., footsteps on concrete, birdsong) and trigger curated narration tailored to the user's surroundings, roughly as sketched below.
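
As a rough sketch of the intended runtime behavior (the ``classifier`` and ``player`` objects, the labels, and the confidence threshold are all placeholders, not a settled design):

.. code-block:: python

   # Placeholder mapping from detected sound patterns to narration clips.
   NARRATION = {
       "footsteps_concrete": "path_segment_1.wav",
       "birdsong": "forest_interlude.wav",
   }

   def on_audio_frame(frame, classifier, player):
       """Classify the latest audio frame and trigger curated narration
       when a known pattern is detected with sufficient confidence."""
       label, confidence = classifier.predict(frame)
       if confidence > 0.8 and label in NARRATION and not player.busy():
           player.play(NARRATION[label])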

To ensure accurate, real-time sound classification, this project proposes implementing DiffLogic to address the latency issues commonly associated with deploying such models on embedded platforms such as Bela.  The following section delves into the specifics of DiffLogic implementation.

Differential Logic
====================
Traditional logic gate networks cannot conventionally leverage gradient-based training because of their non-differentiable nature. Petersen et al. (2022) address this limitation by introducing a relaxation process, making the gates differentiable and enabling standard training methods. Here is how their approach compares to classic neural networks:

A standard neural network uses a perceptron, which is a binary linear classifier and consists of the following four parts:

#. Input values - A vector from a previous layer or discretized raw input (e.g., image pixel values, audio amplitudes)
#. Weights and bias - Learned parameters indicating how strongly each input influences the node
#. Weighted Sum - Inputs multiplied by their weights, summed, and offset by the bias
#. Activation Function - Non-linear function mapping the weighted sum to an output, enabling complex decision-making

The following image illustrates the steps of the perceptron.

.. image:: ../_static/images/perceptron.png
  :alt: An image of a perceptron describing each stage
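
To make these four stages concrete, here is a minimal sketch of a single perceptron forward pass in PyTorch (the values are illustrative, not learned):

.. code-block:: python

   import torch

   x = torch.tensor([0.8, 0.2])   # input values
   w = torch.tensor([0.5, -1.0])  # learned weights
   b = torch.tensor(0.1)          # learned bias

   z = torch.dot(w, x) + b        # weighted sum
   y = torch.sigmoid(z)           # activation function, maps to (0, 1)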

A logic gate is similar to a perceptron in that it, too, can act as a binary linear classifier.  XOR is the exception: it is not linearly separable and requires the equivalent of a multi-layer perceptron.  The key idea in leveraging logic gates as perceptrons is that both function as binary classifiers, as demonstrated below:

.. image:: ../_static/images/logic-gates-fix.png
  :alt: An image of logic gates acting as linear classifiers

A table of key differences between the network architectures follows:

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Logic Gates
     - Neurons
   * - Can have non-linear decision boundaries (e.g., XOR)
     - Can only model linear decision boundaries
   * - Algorithm-like: Fixed function
     - Neural-like: Flexible learning of function
   * - Fast inference
     - Slow inference

Although any logic function can be constructed from NAND gates alone (or, equivalently, NOR gates alone; see `DeMorgan's theorems <https://www.allaboutcircuits.com/textbook/digital/chpt-7/demorgans-theorems/>`_), the authors of this work use the full set of 16 two-input operators, including the constant false and true gates, as listed below:

.. list-table::
   :widths: 10 25 25 10 10 10 10
   :header-rows: 1

   * - ID
     - Operator
     - Real-valued
     - 00
     - 01
     - 10
     - 11
   * - 1
     - False
     - 0
     - 0
     - 0
     - 0
     - 0
   * - 2
     - A ∧ B
     - A · B
     - 0
     - 0
     - 0
     - 1
   * - 3
     - ¬(A ⇒ B)
     - A - AB
     - 0
     - 0
     - 1
     - 0
   * - 4
     - A
     - A
     - 0
     - 0
     - 1
     - 1
   * - 5
     - ¬(A ⇐ B)
     - B - AB
     - 0
     - 1
     - 0
     - 0
   * - 6
     - B
     - B
     - 0
     - 1
     - 0
     - 1
   * - 7
     - A ⊕ B
     - A + B - 2AB
     - 0
     - 1
     - 1
     - 0
   * - 8
     - A ∨ B
     - A + B - AB
     - 0
     - 1
     - 1
     - 1
   * - 9
     - ¬(A ∨ B)
     - 1 - (A + B - AB)
     - 1
     - 0
     - 0
     - 0
   * - 10
     - ¬(A ⊕ B)
     - 1 - (A + B - 2AB)
     - 1
     - 0
     - 0
     - 1
   * - 11
     - ¬B
     - 1 - B
     - 1
     - 0
     - 1
     - 0
   * - 12
     - A ⇐ B
     - 1 - B + AB
     - 1
     - 0
     - 1
     - 1
   * - 13
     - ¬A
     - 1 - A
     - 1
     - 1
     - 0
     - 0
   * - 14
     - A ⇒ B
     - 1 - A + AB
     - 1
     - 1
     - 0
     - 1
   * - 15
     - ¬(A ∧ B)
     - 1 - AB
     - 1
     - 1
     - 1
     - 0
   * - 16
     - True
     - 1
     - 1
     - 1
     - 1
     - 1

Leveraging these gates, the authors used gradient-based training to learn the optimal logic gate implementation for each node.  Their classification models show that inference time is dramatically reduced while training costs are often higher.  This speed advantage is crucial in applications like embedded computing, where models are typically small, and training occurs offline, making training cost less of a concern.
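
The heart of the relaxation can be sketched in a few lines of PyTorch: each node evaluates all 16 real-valued gate formulas from the table and mixes them with a learnable softmax distribution, so gradients can flow through the choice of gate; after training, the argmax gate is kept for fast inference.  This is a minimal illustration of the idea, not the ``difflogic`` library's implementation:

.. code-block:: python

   import torch
   import torch.nn as nn
   import torch.nn.functional as F

   def gate_outputs(a, b):
       """The 16 real-valued gate relaxations from the table above,
       stacked along a new leading dimension."""
       return torch.stack([
           torch.zeros_like(a),      # 1:  False
           a * b,                    # 2:  A AND B
           a - a * b,                # 3:  NOT(A implies B)
           a,                        # 4:  A
           b - a * b,                # 5:  NOT(B implies A)
           b,                        # 6:  B
           a + b - 2 * a * b,        # 7:  A XOR B
           a + b - a * b,            # 8:  A OR B
           1 - (a + b - a * b),      # 9:  NOR
           1 - (a + b - 2 * a * b),  # 10: XNOR
           1 - b,                    # 11: NOT B
           1 - b + a * b,            # 12: B implies A
           1 - a,                    # 13: NOT A
           1 - a + a * b,            # 14: A implies B
           1 - a * b,                # 15: NAND
           torch.ones_like(a),       # 16: True
       ])

   class SoftLogicGate(nn.Module):
       """One relaxed gate node: a learnable softmax over the 16 ops.
       After training, the argmax selects the discrete gate to deploy."""
       def __init__(self):
           super().__init__()
           self.logits = nn.Parameter(torch.randn(16))

       def forward(self, a, b):
           weights = F.softmax(self.logits, dim=0)   # mixture over gates
           return (weights[:, None] * gate_outputs(a, b)).sum(dim=0)

   gate = SoftLogicGate()
   a, b = torch.rand(8), torch.rand(8)   # a batch of 8 relaxed bits each
   out = gate(a, b)                      # shape (8,), values in [0, 1]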

Related Works
====================
Digital musical instruments (DMIs) demand low latency systems to ensure artists maintain full agency over their creative output.  McPherson et al. (2016) underscore this need, demonstrating that common microcontrollers fail to meet accepted latency guidelines (generally between 4ms-10ms as supported by Lago and Kon (2004) and Wessel and Wright (2002)).  Bela stands out in this context, offering submillisecond latencies and jitter under 50μs.

Bela's minimal delay makes it ideal for complex, interactive DMIs. As machine learning becomes increasingly prevalent in music and musical performance, creators require reliable ways to deploy these models on embedded AI systems.  This was a key theme of the 2022 Embedded AI workshop at NIME, led by this project's mentors and other researchers in the field (`Pelinski et al. 2022 <https://www.youtube.com/watch?v=Xb7Q3tk79JI>`_).  A previous GSoC contributor presented at the workshop on achieving suitable performance within an embedded context using `Bela <https://aimc2023.pubpub.org/pub/t2l10z49/release/3>`_ (Pierce et al. 2023).  Pelinski et al. (2023) followed up with *"Pipeline for recording datasets and running neural networks on the Bela embedded hardware platform."*  These works collectively highlight efforts to streamline ML/DL deployment on the Bela platform.

Building upon this foundation, our mentors successfully trained a differential logic gate network on `Bela for sound classification <https://github.com/jarmitage/DiffLogicBela/tree/main>`_.  Previous work by Solomes and Stowell (2020) demonstrates similar success using Bela for the `DCASE 2018 - Bird Audio Detection task <https://github.com/qiuqiangkong/dcase2018_task3>`_.
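
On the training side, a network might be assembled with Petersen et al.'s ``difflogic`` PyTorch library roughly as follows.  This is a sketch following the library's README; the layer widths, input size, and class count are placeholders for whatever the Bela feature front end produces:

.. code-block:: python

   import torch
   from difflogic import LogicLayer, GroupSum

   # Placeholder sizes: 640 binarized input features, 8 sound classes.
   model = torch.nn.Sequential(
       LogicLayer(640, 4096),    # each node learns one of the 16 gates
       LogicLayer(4096, 4096),
       LogicLayer(4096, 4096),
       GroupSum(k=8, tau=30),    # pools gate outputs into 8 class scores
   )

   x = torch.randint(0, 2, (1, 640)).float()  # one binarized feature vector
   logits = model(x)                          # shape: (1, 8)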

This proposal aims to extend these initiatives, explicitly targeting the intersection of NIME and embedded AI on Bela.  The goal is to help empower musicians and makers to seamlessly create and utilize custom DMIs that harness the power of machine learning models.

Project Goals
====================
The two main goals of this project are as follows:

#. **DiffLogic Implementation Pipeline for Bela**:
    * **Comprehensive Guide:** Step-by-step instructions on building, training, and deploying deep logic gate models on Bela. This will include documentation, Jupyter notebooks, and video tutorials.
    * **Performance Metrics:** Evaluate the ideal use cases for this model architecture (e.g., sound classification, gesture recognition) and provide relevant performance metrics.
    * **Learning Resources:** Create supplementary materials to enhance understanding of DiffLogic.
#. **Soundwalk with Adaptive Narrative:**
    * Implement a soundwalk where the aural narrative dynamically updates based on fast, accurate sound classification of the user's sonic environment.
    * **Feature Exploration:** Utilize FFTs, GFCCs, and MFCCs for sound classification, potentially leveraging insights from Zhao and Wang (2013) on GFCCs' potential for accuracy improvement (see the feature-extraction sketch after this list).
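
For example, a first-pass MFCC front end could be prototyped with ``torchaudio`` as sketched below; the file name is a placeholder, and thresholding at the per-coefficient median is just one simple way to binarize features for a logic gate network:

.. code-block:: python

   import torchaudio

   # Placeholder clip: any mono recording from the soundwalk corpus.
   waveform, sr = torchaudio.load("street_corner.wav")

   # 40 MFCC coefficients, a common front end for sound classification.
   mfcc = torchaudio.transforms.MFCC(
       sample_rate=sr,
       n_mfcc=40,
       melkwargs={"n_fft": 1024, "hop_length": 512, "n_mels": 64},
   )(waveform)

   # Threshold at the per-coefficient median to get the {0, 1} inputs
   # a logic gate network expects.
   binary = (mfcc > mfcc.median(dim=-1, keepdim=True).values).float()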

The majority of the project's effort will be dedicated to establishing a robust and user-friendly DiffLogic pipeline for Bela users. Initial focus will be on sound classification using the following datasets:

- `UrbanSound8K <https://paperswithcode.com/dataset/urbansound8k-1>`_: To evaluate the soundwalk implementation's effectiveness in a complex urban environment.
- `DCASE 2018 - Bird Audio Detection <https://github.com/qiuqiangkong/dcase2018_task3>`_: To facilitate direct performance comparisons between DiffLogic and existing CNN architectures on Bela.
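
As a sanity check for dataset handling, a single labeled clip can be loaded as below, assuming UrbanSound8K's standard directory layout (a metadata CSV plus ``audio/fold1`` through ``audio/fold10``):

.. code-block:: python

   import pandas as pd
   import torchaudio

   # Standard UrbanSound8K layout: one metadata CSV describing all clips.
   meta = pd.read_csv("UrbanSound8K/metadata/UrbanSound8K.csv")

   row = meta.iloc[0]
   path = f"UrbanSound8K/audio/fold{row.fold}/{row.slice_file_name}"
   waveform, sr = torchaudio.load(path)
   label = row["class"]  # e.g., "dog_bark", "siren", "street_music"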

If time permits, I will explore additional audio classification tasks such as speech recognition or sentence utterances. While gesture recognition could be highly valuable for NIME, it is currently considered outside the scope of this project.

Metrics
====================
Building on the findings of Petersen et al. (2022), this project aims to achieve significant reductions in inference time without sacrificing accuracy when deploying models on Bela. We'll measure and compare the following metrics across classification tasks using the UrbanSound8K and DCASE 2018 - Bird Audio Detection datasets:

- Accuracy
- Training time
- Number of parameters
- Inference time
- Storage space

Models evaluated will include a decision tree learner, a logistic regression, a neural network, and a DiffLogic network.
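
Inference time in particular can be measured with a simple, uniform harness across the PyTorch models (the non-PyTorch baselines would use an analogous loop); a minimal sketch:

.. code-block:: python

   import time

   import torch

   def mean_inference_ms(model, example, n_runs=100):
       """Average single-example forward-pass time in milliseconds."""
       model.eval()
       with torch.no_grad():
           model(example)                      # warm-up pass
           start = time.perf_counter()
           for _ in range(n_runs):
               model(example)
           elapsed = time.perf_counter() - start
       return 1000 * elapsed / n_runs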

The project's soundwalk experience will take place in Indiana Dunes National Park in the latter half of summer 2024, featuring aural narrations by the project lead. We will fine-tune the classifier on a corpus of 500 labeled sound bites collected from the region. Participants will receive a short explanation of what they will experience and be guided to the starting point; the aural narrative will then be delivered in real time, updating according to their current sonic surroundings.



Software
=========

- Python (Jupyter, PyTorch)
- C
- C++


Hardware
========

- `Bela <https://shop.bela.io/collections/bela-and-bela-mini/products/bela-starter-kit>`_
- `Microphone <https://github.com/sparkfun/SparkFun_Analog_MEMS_Microphone_Breakout_ICS-40180>`_
- `BT Transmitter <https://www.amazon.com/Acxico-Bluetooth-Transmitter-Wireless-Headphones/dp/B0964T7FBP>`_

Early Experiments / Troubleshooting
=====================================
Within the difflogic repo, the ``main.py`` and ``main_baseline.py`` files include the Iris dataset in the code but not in the documented terminal commands.  I `updated their code <https://github.com/mclemcrew/difflogic/commit/56c2ffb32d6e4bfbf06d6e431b899e28d360c89c>`_ and began running the experiments to see if I could verify and replicate their results.  Unfortunately, I could not do so despite several attempts.  On my university's Center for High-Performance Computing cluster, I ran into a GCC compiler version error (PyTorch required GCC 9; our node ran GCC 8).  Building GCC from source produced more errors than progress, so I decided to spin up an EC2 instance instead and install everything manually to ensure correct versioning.  Before CUDA could be used, I needed to request a vCPU quota increase in my deployment region so that a GPU could be attached to the instance.  Even after setting everything up from scratch, I kept receiving the following error when installing via pip or building from source:

.. code-block:: bash

   CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

After looking into this, I believe there is still a mismatch between the torch version and the CUDA toolkit, but I ran out of time to resolve it before getting anything running.  The EC2 instance was the best solution for the time being, and despite following the repo's installation support file, I still encountered issues that needed resolving.  I'll work through the CUDA installation guide in more detail, since I'm sure I overlooked something while setting up the EC2 instance.

Timeline
********


.. note:: This timeline is based on the `official GSoC timeline <https://developers.google.com/open-source/gsoc/timeline>`_


Timeline summary
=================

.. table:: 

    +------------------------+----------------------------------------------------------------------------------------------------+
    | Date                   | Activity                                                                                           |                                  
    +========================+====================================================================================================+
    | February 26            | Connect with possible mentors and request review on first draft                                    |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | March 4                | Complete prerequisites, verify value to community and request review on second draft               |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | March 11               | Finalize timeline and request review on final draft                                                |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | March 21               | Submit application                                                                                 |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | May 1                  | Start bonding                                                                                      |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | May 27                 | Start coding and introductory video                                                                |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | June 3                 | Release introductory video and complete milestone #1                                               |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | June 10                | Complete milestone #2                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | June 17                | Complete milestone #3                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | June 24                | Complete milestone #4                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | July 1                 | Complete milestone #5                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | July 8                 | Submit midterm evaluations                                                                         |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | July 15                | Complete milestone #6                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | July 22                | Complete milestone #7                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | July 29                | Complete milestone #8                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | August 5               | Complete milestone #9                                                                              |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | August 12              | Complete milestone #10                                                                             |
    +------------------------+----------------------------------------------------------------------------------------------------+
    | August 19              | Submit final project video, submit final work to GSoC site and complete final mentor evaluation    |
    +------------------------+----------------------------------------------------------------------------------------------------+

Timeline detailed
=================

Community Bonding Period (May 1st - May 26th)
==============================================
GSoC contributors get to know their mentors, read documentation, and get up to speed to begin working on their projects.

Coding begins (May 27th)
=========================
++++++++++++++++++++
Project Foundations
++++++++++++++++++++
- Set up development environment for Bela platform
- Begin experimenting with the pipeline for building ML/DL models and deploying/testing on Bela

Milestone #1, Introductory YouTube video (June 3rd)
===================================================
- Begin implementing the pipeline for DiffLogic on Bela platform 
- Experiment with different logic gate architectures and training methods
- Start drafting comprehensive guide and documentation for DiffLogic implementation

Milestone #2 (June 10th)
==========================
- Iterate on DiffLogic implementation based on initial testing and feedback
- Evaluate and select appropriate datasets for classifier training (UrbanSound8K, DCASE 2018)

Milestone #3 (June 17th)
=========================
- Begin training the DiffLogic model using selected datasets
- Evaluate performance metrics including accuracy, training time, and inference time
- Compare DiffLogic network with traditional classification models (decision tree, logistic regression, neural network)

Milestone #4 (June 24th)
==========================
- Develop supplementary materials such as Jupyter notebooks and video tutorials
- Complete writing comprehensive guide and documentation for DiffLogic implementation

Milestone #5 (July 1st)
========================
- Continue developing supplementary materials such as Jupyter notebooks and video tutorials
- Start building soundwalking prototype 
- Conduct initial tests and simulations to validate the effectiveness of real-time narrative updates based on sound classification

Submit midterm evaluations (July 8th)
=====================================
- Implementation and results from DiffLogic implementation

.. important:: 
    
    **July 12 - 18:00 UTC:** Midterm evaluation deadline (standard coding period) 

Milestone #6 (July 15th)
=========================
- Start designing the soundwalk experience, including route planning and narrative structure
- Explore potential audio classification features (FFT, GFCCs, MFCCs) for sound classification
- Test initial sound samples with labels

Milestone #7 (July 22nd)
=========================
- Build a prototype of the soundwalk system integrating the DiffLogic classifier
- Conduct initial tests and simulations to validate the effectiveness of real-time narrative updates based on sound classification

Milestone #8 (July 29th)
=========================
- Optimize the soundwalk system for performance and efficiency
- Fine-tune the DiffLogic model based on feedback and performance evaluation
- Address any remaining issues or bugs in the implementation

Milestone #9 (Aug 5th)
=======================
- Finalize all project deliverables including documentation, tutorials, and the soundwalk system
- Conduct thorough testing and quality assurance checks to ensure everything functions as intended
- Prepare for the upcoming soundwalk experience in Indiana Dunes National Park

Milestone #10 (Aug 12th)
========================
- Conduct the soundwalk experience in Indiana Dunes National Park
- Analyze feedback and data collected during the soundwalk experience
- Finalize deliverables

Final YouTube video (Aug 19th)
===============================
- Video of DiffLogic implementation and soundwalking example

- Submit final project video, submit final work to GSoC site, and complete final mentor evaluation

Final Submission (Aug 24th)
============================

.. important::

    **August 19 - 26 - 18:00 UTC:** Final week: GSoC contributors submit their final work 
    product and their final mentor evaluation (standard coding period)

    **August 26 - September 2 - 18:00 UTC:** Mentors submit final GSoC contributor 
    evaluations (standard coding period)

Initial results (September 3)
=============================

.. important:: 
    **September 3 - November 4:** GSoC contributors with extended timelines continue coding

    **November 4 - 18:00 UTC:** Final date for all GSoC contributors to submit their final work product and final evaluation

    **November 11 - 18:00 UTC:** Final date for mentors to submit evaluations for GSoC contributor projects with extended deadline

Experience and approach
************************

Although I have limited experience with Bela specifically, I have worked on numerous embedded projects that demonstrate my proficiency in this area, including my `MIDI Mosaic project <http://mclem.in/#/projects/midi-mosaic>`_, which used a Teensy microcontroller to process capacitive touch sensing and handle MIDI data.

I'm currently building a `co-creative music mixing agent <https://github.com/mclemcrew/MixologyDB>`_ that leverages non-copyrighted audio data to recommend audio effects, their parameters, and associated values based on user input through a chat interface.  Chat interfaces for music pose real challenges, as Martin Mull's quip "Writing about music is like dancing about architecture" suggests.  With this in mind, I'm leveraging XAI principles such as chain-of-thought reasoning and counterfactuals to help recommend mixing basics for `amateurs through pro-ams <https://arxiv.org/abs/2304.03407>`_.  This project demonstrates my experience in applying advanced machine learning techniques to real-world problems.

Through my research in the field of natural language processing (NLP) and my hands-on experience with various projects, I have developed a strong foundation in machine learning and deep learning.  While I am aware of related differentiable-processing approaches from papers such as `"High-Fidelity Noise Reduction with Differentiable Signal Processing" <https://arxiv.org/abs/2310.11364>`_, I have not yet had the opportunity to implement DiffLogic directly.  However, I am excited to apply my existing knowledge and skills to explore this concept further within the context of embedded AI.

Contingency
===========

There are a few places to reach out for help, including:

- The `Beagle Forum <https://forum.beagleboard.org/u/mclemcrew>`_ or Discord
- The `Bela Forum <https://forum.bela.io/u/mclem>`_
- `X <https://x.com/mclemcrew>`_ (Formerly Twitter)
- `DASP-PyTorch <https://github.com/csteinmetz1/dasp-pytorch>`_ 
- `Diff Logic PyTorch library <https://github.com/Felix-Petersen/difflogic>`_
- `Bela DL Pipeline <https://github.com/pelinski/bela-dl-pipeline>`_
- `C++ Real-time Audio Programming with Bela <https://github.com/BelaPlatform/bela-online-course>`_
- `PRU Programming for Beagleboard <https://old.beagleboard.org/pru>`_
- `Compiled models on RPi2040 <https://github.com/lnfiniteMonkeys/uSEQ/blob/main/docs/useq.md>`_
- `Experiments with DiffLogic and Bela <https://github.com/jarmitage/DiffLogicBela/tree/main/projects>`_
- `PyBela <https://github.com/BelaPlatform/pybela>`_

Each of these resources benefits the overall project in some manner.  The Beagle Forum and Bela Forum are the best places to ask for assistance with Bela-related queries.  Within the differential logic and ML/DL communities, X has been helpful for connecting with the right body of work or the right person who may be able to help briefly.  Differentiable signal processing has recently become prominent in music generation, and I would reach out to researchers I know within these projects should I require assistance before I meet with my mentor again.

Benefit
========

If completed, this project will have a positive impact on the BeagleBoard community by providing a robust and accessible framework for implementing embedded AI designs, specifically in the domain of machine listening and audio processing.

The project will demonstrate the feasibility and effectiveness of leveraging Differential Logic techniques for real-time audio processing on the BeagleBoard platform. By showcasing the successful integration of differentiable logic gate networks with the BeagleBoard's hardware capabilities, this project seeks to inspire and encourage other developers and researchers to explore similar approaches in their own work.

This project will also contribute a set of demos, code snippets, and tutorials for implementing on-device machine learning models, sound classification, and machine listening using differential logic on the Bela platform. These resources will serve as a foundation for the BeagleBoard community to build upon, ensuring that embedded AI projects are more accessible to the wider BeagleBoard and NIME communities.

Relevant Forum Posts:

- `Embedded differentiable logic gate networks for real-time interactive and creative applications <https://forum.beagleboard.org/t/embedded-differentiable-logic-gate-networks-for-real-time-interactive-and-creative-applications/37768/1>`_
- `Enhanced Media Experience with AI-Powered Commercial Detection and Replacement <https://forum.beagleboard.org/t/enhanced-media-experience-with-ai-powered-commercial-detection-and-replacement/37358>`_

Misc
====

This is the `link <https://github.com/jadonk/gsoc-application/pull/186>`_ to the merge request.

References
===========

- Zhao, X., & Wang, D. (2013). Analyzing noise robustness of MFCC and GFCC features in speaker identification. *2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, Vancouver, BC, Canada, 7204–7208. `https://doi.org/10.1109/ICASSP.2013.6639061 <https://doi.org/10.1109/ICASSP.2013.6639061>`_
- Adams, M., Bruce, N., Davies, W., Cain, R., Jennings, P., Carlyle, A., Cusack, P., Hume, K., & Plack, C. (2008). *Soundwalking as a methodology for understanding soundscapes*. `https://www.semanticscholar.org/paper/Soundwalking-as-a-methodology-for-understanding-Adams-Bruce/f9e9eb719fb213c3f1f72dd000cabcf52169e5c7 <https://www.semanticscholar.org/paper/Soundwalking-as-a-methodology-for-understanding-Adams-Bruce/f9e9eb719fb213c3f1f72dd000cabcf52169e5c7>`_
- Carras, C. (2019). Soundwalks: An experiential path to new sonic art. *Organised Sound*, *24*(3), 261–273. `https://doi.org/10.1017/S1355771819000335 <https://doi.org/10.1017/S1355771819000335>`_
- Drever, J. L. (2009). *Soundwalking: Aural Excursions into the Everyday*. `https://www.semanticscholar.org/paper/Soundwalking%3A-Aural-Excursions-into-the-Everyday-Drever/870b40f70636933df0efb2d9706dbea678f3cf3a <https://www.semanticscholar.org/paper/Soundwalking%3A-Aural-Excursions-into-the-Everyday-Drever/870b40f70636933df0efb2d9706dbea678f3cf3a>`_
- Staśko-Mazur, K. (2015). Soundwalk as a multifaceted practice. *Argument : Biannual Philosophical Journal*. `https://www.semanticscholar.org/paper/Soundwalk-as-a-multifaceted-practice-Sta%C5%9Bko-Mazur/571241ba10d121a0b6d8e29cc7efb73387eaa436 <https://www.semanticscholar.org/paper/Soundwalk-as-a-multifaceted-practice-Sta%C5%9Bko-Mazur/571241ba10d121a0b6d8e29cc7efb73387eaa436>`_
- Truax, B. (2021). R. Murray Schafer (1933–2021) and the World Soundscape Project. *Organised Sound*, *26*(3), 419–421. `https://doi.org/10.1017/S1355771821000509 <https://doi.org/10.1017/S1355771821000509>`_
- Pierce, E., Shepardson, V., Armitage, J., & Magnusson, T. (2023). Bela-IREE: An Approach to Embedded Machine Learning for Real-Time Music Interaction. AIMC 2023. Retrieved from https://aimc2023.pubpub.org/pub/t2l10z49
- Pelinski, T., Shepardson, V., Symons, S., Caspe, F. S., Benito Temprano, A. L., Armitage, J., … McPherson, A. (2022). Embedded AI for NIME: Challenges and Opportunities. International Conference on New Interfaces for Musical Expression. `https://doi.org/10.21428/92fbeb44.76beab02 <https://doi.org/10.21428/92fbeb44.76beab02>`_
- Petersen, F., Borgelt, C., Kuehne, H., & Deussen, O. (2022). Deep Differentiable Logic Gate Networks. *Advances in Neural Information Processing Systems (NeurIPS 2022)*. `https://arxiv.org/abs/2210.08277 <https://arxiv.org/abs/2210.08277>`_
- Solomes, A. M., & Stowell, D. (2020). Efficient bird sound detection on the Bela embedded system. `https://qmro.qmul.ac.uk/xmlui/handle/123456789/67750 <https://qmro.qmul.ac.uk/xmlui/handle/123456789/67750>`_
- Shaw, T., & Bowers, J. (2020). Ambulation: Exploring listening technologies for an extended sound walking practice. *New Interfaces for Musical Expression*, 23–28.
- McPherson, A. P., Jack, R. H., & Moro, G. (2016). Action-sound latency: Are our tools fast enough? `https://qmro.qmul.ac.uk/xmlui/handle/123456789/12479 <https://qmro.qmul.ac.uk/xmlui/handle/123456789/12479>`_
- Pelinski, T., Diaz, R., Benito Temprano, A. L., & McPherson, A. J. (2023). Pipeline for recording datasets and running neural networks on the Bela embedded hardware platform. *arXiv*, abs/2306.11389. `https://doi.org/10.48550/arXiv.2306.11389 <https://doi.org/10.48550/arXiv.2306.11389>`_
- Lago, N. P., & Kon, F. (2004). The quest for low latency. *Proceedings of the International Computer Music Conference (ICMC)*.
- Wessel, D., & Wright, M. (2002). Problems and prospects for intimate musical control of computers. *Computer Music Journal*, 26(3), 11–22.