Commits on Source (138), showing 681 additions and 161 deletions
image: beagle/sphinx-build-env:latest
# The Docker image that will be used to build your app
image: registry.git.beagleboard.org/docs/sphinx-build-env:latest
pages:
tags:
- docker-amd64
before_script:
- source ./venv-build-env.sh
script:
- "./gitlab-build.sh"
artifacts:
paths:
- public
\ No newline at end of file
- public
.. _C:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/c.html
.. _Assembly:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/assembly.html
.. _Verilog:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/verilog.html
.. _Zephyr:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/zephyr.html
.. _Linux:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/linux.html
.. _device-tree:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/device-tree.html
.. _FPGA:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/fpga.html
.. _basic wiring:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/basic-wiring.html
.. _motors:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/motors.html
.. _embedded serial interfaces:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/embedded-serial.html
.. _OpenBeagle CI:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/openbeagle-ci.html
.. _verification:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/verification.html
.. _wireless communications:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/wireless-communications.html
.. _Buildroot:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/buildroot.html
.. _RISC-V ISA:
https://docs.beagleboard.cc/docs/latest/intro/beagle101/riscv.html
\ No newline at end of file
......@@ -17,15 +17,24 @@ from sphinx.application import Sphinx
sys.path.append(str(Path(".").resolve()))
project = 'gsoc.beagleboard.io'
copyright = '2024, BeagleBoard.org'
copyright = '2025, BeagleBoard.org'
author = 'BeagleBoard.org'
# Add epilog details to rst_epilog
rst_epilog = ""
rst_epilog_path = "_static/epilog/"
for (dirpath, dirnames, filenames) in os.walk(rst_epilog_path):
    for filename in filenames:
        with open(os.path.join(dirpath, filename)) as f:
            rst_epilog += f.read()
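The loop above stitches every file under ``_static/epilog/`` into Sphinx's ``rst_epilog``, so link targets defined there are appended to every page. A self-contained sketch of the same pattern (hypothetical file names; ``sorted()`` is added here for deterministic ordering, which the conf.py loop does not guarantee):

```python
import os
import tempfile

def collect_epilog(root):
    """Concatenate all files under root, as conf.py does for rst_epilog."""
    epilog = ""
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):  # sorted for deterministic builds
            with open(os.path.join(dirpath, filename)) as f:
                epilog += f.read()
    return epilog

# Demo with a throwaway directory standing in for _static/epilog/
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "links.rst"), "w") as f:
        f.write(".. _Linux: https://docs.beagleboard.cc/docs/latest/intro/beagle101/linux.html\n")
    print(collect_epilog(root))
```

Because the result is appended to every document, any ``.. _Name: URL`` target placed in that directory becomes usable as ``` `Name`_ ``` site-wide.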
# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
extensions = [
"sphinx_design",
"sphinxcontrib.youtube",
"sphinxcontrib.images",
"sphinx_copybutton"
]
......@@ -122,8 +131,8 @@ html_theme_options = {
"use_edit_page_button": True,
"show_toc_level": 1,
"navbar_align": "right",
"show_nav_level": 2,
"announcement": "Welcome to the new site for BeagleBoard.org GSoC 2024 projects!",
"show_nav_level": 1,
"announcement": "Welcome to the site for BeagleBoard.org GSoC 2025 projects!",
# "show_version_warning_banner": True,
"navbar_center": ["navbar-nav"],
"navbar_start": ["navbar-logo"],
......@@ -181,6 +190,7 @@ latex_elements = {
),
}
sd_fontawesome_latex = True
latex_engine = "xelatex"
latex_logo = str("_static/images/logo-latex.pdf")
latex_documents = []
......
......@@ -12,14 +12,14 @@ Guides
Spend your summer break writing code and learning about open source development while earning money!
Accepted contributors work with a mentor and become a part of the open source community. Many become lifetime
open source developers! The 2024 contributor application window will be open from
`March 18th 2024 <https://developers.google.com/open-source/gsoc/timeline#march_18_-_1800_utc>`_ to
`April 2nd 2024 <https://developers.google.com/open-source/gsoc/timeline#april_2_-_1800_utc>`_!
open source developers! The 2025 contributor application window will be open from
`March 24 2025 <https://opensource.googleblog.com/2025/01/google-summer-of-code-2025-is-here.html>`_ to
`April 8 2025 <https://opensource.googleblog.com/2025/01/google-summer-of-code-2025-is-here.html>`_!
But don't wait for then to engage! Come to our `Discord <https://bbb.io/gsocchat>`_ and
`Forum <https://bbb.io/gsocml>`_ to share ideas today.
This section includes guides for :ref:`contributors <gsoc-contributor-guide>` & :ref:`mentors <gsoc-mentor-guide>` who want to participate
in GSoC 2024 with `BeagleBoard.org <www.beagleboard.org>`_. It's highly recommended to check `GSoC Frequently Asked Questions
in GSoC 2025 with `BeagleBoard.org <https://www.beagleboard.org>`_. It's highly recommended to check the `GSoC Frequently Asked Questions
<https://developers.google.com/open-source/gsoc/faq>`_. For anyone who just wants to contribute to this site, we also have
a step-by-step :ref:`contribution guide <gsoc-site-editing-guide>`.
......
......@@ -14,7 +14,7 @@ Mentor Guide
become familiar with the code base and testing practices, to finally releasing their code on
`OpenBeagle <https://openbeagle.org/>`_ for the world to use!
You will also need be invited by an administrator to register on the GSoC site and request
You will also need to be invited by an administrator to register on the GSoC site and request
to be a mentor for `BeagleBoard.org <https://www.beagleboard.org/>`_.
Who Are Mentors?
......@@ -25,7 +25,7 @@ with a GSoC contributor. Mentors provide guidance such as pointers to useful doc
In addition to providing GSoC contributors with feedback and pointers, a mentor acts as an ambassador to help
GSoC contributors integrate into their project’s community. `BeagleBoard.org <https://www.beagleboard.org/>`_
always assigns more than one mentor to each GSoC contributor. Many members of the `BeagleBoard.org <https://www.
beagleboard.org/>`_ community also provide guidance to GSoC contributors without mentoring in an “official”
capacity, as much as they would answer anyone’s questions on our `Discord <https://bbb.io/gsocchat>`_ and our
`Forum <https://bbb.io/gsocml>`_.
......@@ -34,7 +34,7 @@ Idea Submission Process
Mentors should:
1. Submit projects ideas to our `Forum <https://bbb.io/gsocml>`_ and then
1. Submit project ideas to our `Forum <https://bbb.io/gsocml>`_ and then
2. Contribute an update to our :ref:`gsoc-project-ideas` page using our :ref:`gsoc-site-editing-guide` to promote their idea to contributors.
Only ideas deemed by administrators as being sufficiently supported by qualified mentors will be merged.
......@@ -44,11 +44,11 @@ Only ideas deemed by administrators as being sufficiently supported by qualified
BeagleBoard.org-mentored GSoC projects are supposed to be software projects that serve the Beagle and general open source
embedded systems community, not theses, how-to guides, or “what I did over my summer vacation” ideas.
Prospective mentors, sudents will use our `Discord <https://bbb.io/gsocchat>`_ and `Forum <https://bbb.io/gsocml>`_
Prospective mentors, students will use our `Discord <https://bbb.io/gsocchat>`_ and `Forum <https://bbb.io/gsocml>`_
to make contact with you, so be sure to provide up-to-date information. Please feel free to add yourself to the mentors page and we will monitor
and police that list. Acceptance as an official mentor, with the ability to rate proposals and grade contributors, will come via the Google system.
We will only approve official mentors who have a proven track record with Beagle, but welcome all community members to provide guidance to both
mentors and contributors to best service the community as a whole. Don’t be shy and don’t be offended when we edit. We are thrilled to have you on-board!
mentors and contributors to best serve the community as a whole. Don’t be shy, and don’t be offended when we edit. We are thrilled to have you on board!
......
......@@ -28,30 +28,31 @@ Ideas
| :bdg-info:`Low complexity` | :bdg-info-line:`90 hours` |
+------------------------------------+-------------------------------+
.. card:: Low-latency I/O RISC-V CPU core in FPGA fabric
.. tip::
Below are the latest project ideas. You can also check out our :ref:`gsoc-old-ideas` and :ref:`Past_Projects` for inspiration.
:fas:`microchip;pst-color-primary` FPGA gateware improvements :bdg-success:`Medium complexity` :bdg-success-line:`175 hours`
.. card:: A Conversational AI Assistant for BeagleBoard using RAG and Fine-tuning
^^^^
:fas:`brain;pst-color-secondary` Deep Learning :bdg-success:`Medium complexity` :bdg-success-line:`175 hours`
BeagleV-Fire features RISC-V 64-bit CPU cores and FPGA fabric. In that FPGA fabric, we'd like to
implement a RISC-V 32-bit CPU core with operations optimized for low-latency GPIO. This is similar
to the programmable real-time unit (PRU) RISC cores popularized on BeagleBone Black.
^^^^
| **Goal:** RISC-V-based CPU on BeagleV-Fire FPGA fabric with GPIO
| **Hardware Skills:** Verilog, Verification, FPGA
| **Software Skills:** RISC-V ISA, assembly, `Linux`_
| **Possible Mentors:** `Cyril Jean <https://forum.beagleboard.org/u/vauban>`_, `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_
BeagleBoard currently lacks an AI-powered assistant to help users troubleshoot errors. This project aims to address that need while also streamlining the onboarding process for new contributors, enabling them to get started more quickly.
| **Goal:** Develop a domain-specific chatbot for BeagleBoard using a combination of RAG and fine-tuning of an open-source LLM (like Llama 3, Mixtral, or Gemma). This chatbot will assist users with troubleshooting, provide information about BeagleBoard products, and streamline the onboarding process for new contributors.
| **Hardware Skills:** Ability to test applications on BeagleBone AI-64/BeagleY-AI and optimize for performance using quantization techniques.
| **Software Skills:** Python, RAG, Scraping techniques, Fine tuning LLMs, Gradio, Hugging Face Inference Endpoints, NLTK/spaCy, Git
| **Possible Mentors:** `Aryan Nanda <https://forum.beagleboard.org/u/aryan_nanda/>`_
++++
.. button-link:: https://forum.beagleboard.org/t/low-latency-risc-v-i-o-cpu-core/37156
.. button-link:: https://forum.beagleboard.org/t/beaglemind/40806
:color: danger
:expand:
:fab:`discourse;pst-color-light` Discuss on forum
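As a toy illustration of the retrieval half of the RAG approach the card above describes — hypothetical documents and a deliberately naive bag-of-words similarity, where a real system would use an embedding model and a vector store:

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list) -> str:
    """Return the document most similar to the query; its text would then
    be pasted into the LLM prompt as grounding context."""
    q = Counter(query.lower().split())
    return max(docs, key=lambda d: cosine(q, Counter(d.lower().split())))

docs = [
    "Flash the BeagleY-AI image to a microSD card and boot from it.",
    "Use the PRU cookbook to program the real-time units.",
    "Join the forum to discuss GSoC project ideas.",
]
print(retrieve("how do I boot my BeagleY-AI from microSD", docs))
```

The "fine-tuning" half of the project then adapts the open-source LLM itself to BeagleBoard terminology; retrieval and fine-tuning are complementary, not alternatives.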
.. card:: Update beagle-tester for mainline testing
:fab:`linux;pst-color-primary` Linux kernel improvements :bdg-success:`Medium complexity` :bdg-danger-line:`350 hours`
......@@ -63,8 +64,8 @@ Ideas
and device-tree overlays on various Beagle computers.
| **Goal:** Execution on Beagle test farm with over 30 mikroBUS boards testing all mikroBUS enabled cape interfaces (PWM, ADC, UART, I2C, SPI, GPIO and interrupt) performing weekly mainline Linux regression verification
| **Hardware Skills:** basic wiring, familiarity with embedded serial interfaces
| **Software Skills:** device-tree, `Linux`_, `C`_, continuous integration with GitLab, Buildroot
| **Hardware Skills:** `basic wiring`_, `embedded serial interfaces`_
| **Software Skills:** `device-tree`_, `Linux`_, `C`_, `OpenBeagle CI`_, `Buildroot`_
| **Possible Mentors:** `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_, `Anuj Deshpande <https://forum.beagleboard.org/u/Anuj_Deshpande>`_, `Dhruva Gole <https://forum.beagleboard.org/u/dhruvag2000>`_
++++
......@@ -86,7 +87,7 @@ Ideas
acceptable upstream.
| **Goal:** Add functional gaps, submit upstream patches for these drivers and respond to feedback
| **Hardware Skills:** Familiarity with wireless communication
| **Hardware Skills:** `wireless communications`_
| **Software Skills:** `C`_, `Linux`_
| **Possible Mentors:** `Ayush Singh <https://forum.beagleboard.org/u/ayush1325>`_, `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_
......@@ -108,7 +109,7 @@ Ideas
needs to be cleaned up. We can also work on support for Raspberry Pi if UCSD releases their Hat for it.
| **Goal:** Update librobotcontrol for Robotics Cape on BeagleBone AI, BeagleBone AI-64 and BeagleV-Fire
| **Hardware Skills:** Basic wiring, some DC motor familiarity
| **Hardware Skills:** `basic wiring`_, `motors`_
| **Software Skills:** `C`_, `Linux`_
| **Possible Mentors:** `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_, `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_
......@@ -122,7 +123,7 @@ Ideas
.. card:: Upstream Zephyr Support on BBAI-64 R5
:fas:`timeline;pst-color-secondary` RTOS/microkernel improvements :bdg-success:`Medium complexity` :bdg-success-line:`350 hours`
:fas:`timeline;pst-color-secondary` RTOS/microkernel improvements :bdg-success:`Medium complexity` :bdg-danger-line:`350 hours`
^^^^
......@@ -143,26 +144,9 @@ Ideas
:fab:`discourse;pst-color-light` Discuss on forum
.. card:: Enhanced Media Experience with AI-Powered Commercial Detection and Replacement
:fas:`brain;pst-color-secondary` Deep Learning :bdg-success:`Medium complexity` :bdg-success-line:`350 hours`
^^^^
Leveraging the capabilities of BeagleBoard’s powerful processing units, the project will focus on creating a real-time, efficient solution that enhances media consumption experiences by seamlessly integrating custom audio streams during commercial breaks.
| **Goal:** Build a deep learning model, training data set, training scripts, and a runtime for detection and modification of the video stream.
| **Hardware Skills:** Ability to capture and display video streams using `Beagleboard ai-64 <https://www.beagleboard.org/boards/beaglebone-ai-64>`_
| **Software Skills:** `Python <https://www.python.org/>`_, `TensorFlow <https://www.tensorflow.org/>`_, `TFlite <https://www.tensorflow.org/lite>`_, `Keras <https://www.tensorflow.org/guide/keras>`_, `GStreamer <https://gstreamer.freedesktop.org/>`_, `OpenCV <https://opencv.org/>`_
| **Possible Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_
++++
.. button-link:: https://forum.beagleboard.org/t/enhanced-media-experience-with-ai-powered-commercial-detection-and-replacement/37358
:color: danger
:expand:
:fab:`discourse;pst-color-light` Discuss on forum
.. button-link:: https://forum.beagleboard.org/tag/gsoc-ideas
:color: danger
......@@ -171,13 +155,7 @@ Ideas
:fab:`discourse;pst-color-light` Visit our forum to see newer ideas being discussed!
.. toctree::
:hidden:
.. tip::
You can also check out our :ref:`gsoc-old-ideas` and :ref:`Past_Projects` for inspiration.
.. _Linux:
https://docs.beagleboard.org/latest/intro/beagle101/linux.html
.. _C:
https://jkridner.beagleboard.io/docs/latest/intro/beagle101/learning-c.html
old/index
\ No newline at end of file
......@@ -17,13 +17,15 @@ into professional automation tasks, is strongly desired.
^^^^
- **Goal:** Complete implementation of librobotcontrol on BeagleBone AI/AI-64.
- **Hardware Skills:** Basic wiring
- **Software Skills:** C, Linux
- **Possible Mentors:** jkridner, lorforlinux
- **Expected Size of Project:** 350 hrs
- **Hardware Skills:** `basic wiring`_, `motors`_
- **Software Skills:** `C`_, `Linux`_
- **Possible Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_
- **Expected Size of Project:** 175 hrs
- **Rating:** Medium
- **Upstream Repository:** https://github.com/jadonk/librobotcontrol/tree/bbai
- **References:**
- **Upstream Repository:** `BeagleBoard.org / librobotcontrol · GitLab <https://openbeagle.org/beagleboard/librobotcontrol>`_
- **References:**
- `Robotics Control Library — BeagleBoard Documentation <https://docs.beagle.cc/projects/librobotcontrol/docs/index.html>`_
- `Robot Control Library: Main Page <https://old.beagleboard.org/static/librobotcontrol/>`_
- http://www.strawsondesign.com/docs/librobotcontrol/index.html
++++
......
......@@ -14,6 +14,48 @@ For some background, be sure to check out `simplify embedded edge AI development
<https://e2e.ti.com/blogs_/b/process/posts/simplify-embedded-edge-ai-development>`_
post from TI.
.. card:: Enhanced Media Experience with AI-Powered Commercial Detection and Replacement
:fas:`brain;pst-color-secondary` Deep Learning :bdg-success:`Medium complexity` :bdg-danger-line:`350 hours`
^^^^
Leveraging the capabilities of BeagleBoard’s powerful processing units, the project will focus on creating a real-time, efficient solution that enhances media consumption experiences by seamlessly integrating custom audio streams during commercial breaks.
| **Goal:** Build a deep learning model, training data set, training scripts, and a runtime for detection and modification of the video stream.
| **Hardware Skills:** Ability to capture and display video streams using `BeagleBone AI-64 <https://www.beagleboard.org/boards/beaglebone-ai-64>`_
| **Software Skills:** `Python <https://www.python.org/>`_, `TensorFlow <https://www.tensorflow.org/>`_, `TFlite <https://www.tensorflow.org/lite>`_, `Keras <https://www.tensorflow.org/guide/keras>`_, `GStreamer <https://gstreamer.freedesktop.org/>`_, `OpenCV <https://opencv.org/>`_
| **Possible Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_
++++
.. button-link:: https://forum.beagleboard.org/t/enhanced-media-experience-with-ai-powered-commercial-detection-and-replacement/37358
:color: danger
:expand:
:fab:`discourse;pst-color-light` Discuss on forum
.. card:: Embedded differentiable logic gate networks for real-time interactive and creative applications
:fas:`brain;pst-color-secondary` Creative AI :bdg-success:`Medium complexity` :bdg-danger-line:`350 hours`
^^^^
This project seeks to explore the potential of creative embedded AI, specifically using `Differentiable Logic (DiffLogic) <https://github.com/Felix-Petersen/difflogic>`_, by creating a system that can perform tasks like machine listening, sensor processing, sound and gesture classification, and generative AI.
| **Goal:** Develop an embedded machine learning system on BeagleBone that leverages `Differentiable Logic (DiffLogic) <https://github.com/Felix-Petersen/difflogic>`_ for real-time interactive music creation and environment sensing.
| **Hardware Skills:** Audio and sensor IO with `Bela.io <http://bela.io>`_
| **Software Skills:** Machine learning, deep learning, BeagleBone Programmable Real Time Unit (PRU) programming (see `PRU Cookbook <https://docs.beagleboard.org/latest/books/pru-cookbook/index.html>`_).
| **Possible Mentors:** `Jack Armitage <https://forum.beagleboard.org/u/jarm>`_, `Chris Kiefer <https://forum.beagleboard.org/u/luuma>`_
++++
.. button-link:: https://forum.beagleboard.org/t/embedded-differentiable-logic-gate-networks-for-real-time-interactive-and-creative-applications/37768
:color: danger
:expand:
:fab:`discourse;pst-color-light` Discuss on forum
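For readers unfamiliar with DiffLogic, here is a toy sketch of its core idea — each gate holds trainable logits over candidate Boolean functions, relaxed to real-valued arithmetic so gradients can flow; after training, the argmax gate is kept, yielding a pure logic circuit. This is an illustration only, not the difflogic library's API:

```python
import math

# Real-valued relaxations of a few two-input logic gates: with inputs
# in [0, 1] these agree with the Boolean gate at the corners.
GATES = {
    "AND":   lambda a, b: a * b,
    "OR":    lambda a, b: a + b - a * b,
    "XOR":   lambda a, b: a + b - 2 * a * b,
    "NOT_A": lambda a, b: 1 - a,
}

def soft_gate(a, b, logits):
    """A 'differentiable gate': softmax over the gate choices, output is
    the probability-weighted mixture of all gate outputs. Training adjusts
    the logits; inference hardens them to a single discrete gate."""
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    outs = [g(a, b) for g in GATES.values()]
    return sum(p * o for p, o in zip(probs, outs))

# With nearly all probability mass on XOR, the soft gate behaves like XOR:
logits = [-20, -20, 20, -20]
print(soft_gate(1, 0, logits))
```

Networks of such gates are tiny and fast at inference, which is what makes the approach attractive for real-time audio and sensor work on embedded boards.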
.. card::
:fas:`brain;pst-color-secondary` **YOLO models on the X15/AI-64**
......
......@@ -3,6 +3,29 @@
FPGA based projects
####################
.. card:: Low-latency I/O RISC-V CPU core in FPGA fabric
:fas:`microchip;pst-color-primary` FPGA gateware improvements :bdg-success:`Medium complexity` :bdg-success-line:`175 hours`
^^^^
BeagleV-Fire features RISC-V 64-bit CPU cores and FPGA fabric. In that FPGA fabric, we'd like to
implement a RISC-V 32-bit CPU core with operations optimized for low-latency GPIO. This is similar
to the programmable real-time unit (PRU) RISC cores popularized on BeagleBone Black.
| **Goal:** RISC-V-based CPU on BeagleV-Fire FPGA fabric with GPIO
| **Hardware Skills:** `Verilog`_, `verification`_, `FPGA`_
| **Software Skills:** `RISC-V ISA`_, `assembly`_, `Linux`_
| **Possible Mentors:** `Cyril Jean <https://forum.beagleboard.org/u/vauban>`_, `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_
++++
.. button-link:: https://forum.beagleboard.org/t/low-latency-risc-v-i-o-cpu-core/37156
:color: danger
:expand:
:fab:`discourse;pst-color-light` Discuss on forum
.. card::
:fas:`microchip;pst-color-secondary` **RISC-V Based PRU on FPGA**
......
:orphan:
.. _gsoc-old-ideas:
Old GSoC Ideas
......
.. _gsoc-2024-projects:
:far:`calendar-days` 2024
##########################
.. note:: Only 3 out of 4 :ref:`accepted students <gsoc-2024-proposals>` were able to complete the program in 2024.
Enhanced Media Experience with AI-Powered Commercial Detection and Replacement
********************************************************************************
.. youtube:: Kagg8JycOfo
:width: 100%
| **Summary:** Leveraging the capabilities of BeagleBoard’s powerful processing units, the project will focus on creating a real-time, efficient solution that enhances media consumption experiences by seamlessly integrating custom audio streams during commercial breaks.
- Develop a neural network model: Combine Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to analyze video and audio data, accurately identifying commercial segments within video streams.
- Implement a real-time pipeline: Create a real-time pipeline for BeagleBoard that utilizes the trained model to detect commercials in real-time and replace them with alternative content or obfuscate them, alongside replacing the audio with predefined streams.
- Optimize for BeagleBoard: Ensure the entire system is optimized for real-time performance on BeagleBoard hardware, taking into account its unique computational capabilities and constraints.
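One small piece such a real-time pipeline needs, independent of the model itself, is smoothing noisy per-frame predictions into stable commercial segments before replacement is triggered. A minimal sketch with hypothetical scores and thresholds:

```python
def smooth(scores, window=5, threshold=0.5):
    """Majority-vote smoothing: a frame is labeled 'commercial' when the
    mean model score over a centered window exceeds the threshold. This
    suppresses single-frame flicker before segments are cut or replaced."""
    labels = []
    half = window // 2
    for i in range(len(scores)):
        w = scores[max(0, i - half): i + half + 1]
        labels.append(sum(w) / len(w) > threshold)
    return labels

# A noisy run of per-frame commercial scores with one spurious dip:
scores = [0.9, 0.8, 0.2, 0.9, 0.85, 0.1, 0.05, 0.1]
print(smooth(scores))
```

The window length trades detection latency against stability; a real deployment would tune it against the stream's frame rate.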
**Contributor:** Aryan Nanda
**Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_, Kumar Abhishek
.. grid:: 2 2 2 2
.. grid-item::
.. button-link:: https://summerofcode.withgoogle.com/archive/2024/projects/UOX7iDEU
:color: info
:shadow:
:expand:
:fab:`google;pst-color-light` - GSoC Registry
.. grid-item::
.. button-ref:: gsoc-2024-proposal-aryan-nanda
:color: primary
:shadow:
:expand:
Proposal
Low-latency I/O RISC-V CPU core in FPGA fabric
************************************************
.. youtube:: ic0RRK6d3hg
:width: 100%
| **Summary:** Implementation of PRU subsystem on BeagleV-Fire’s FPGA fabric, resulting in a real-time microcontroller system working alongside the main CPU, providing low-latency access to I/O.
**Contributor:** Atharva Kashalkar
**Mentors:** `Cyril Jean <https://forum.beagleboard.org/u/vauban>`_, `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, Vedant Paranjape, Kumar Abhishek
.. grid:: 2 2 2 2
.. grid-item::
.. button-link:: https://summerofcode.withgoogle.com/archive/2024/projects/KjUoFlg2
:color: info
:shadow:
:expand:
:fab:`google;pst-color-light` - GSoC Registry
.. grid-item::
.. button-ref:: gsoc-2024-proposal-roger18
:color: primary
:shadow:
:expand:
Proposal
Differentiable Logic for Interactive Systems and Generative Music - Ian Clester
********************************************************************************
.. youtube:: NvHxMCF8sAQ
:width: 100%
| **Summary:** Developing an embedded machine learning system on BeagleBoard that leverages Differentiable Logic (DiffLogic) for real-time interactive music creation and environment sensing. The system will enable on-device learning, fine-tuning, and efficient processing for applications in new interfaces for musical expression.
**Contributor:** Ian Clester
**Mentors:** `Jack Armitage <https://forum.beagleboard.org/u/jarm/summary>`_, Chris Kiefer
.. grid:: 2 2 2 2
.. grid-item::
.. button-link:: https://summerofcode.withgoogle.com/archive/2024/projects/FBk0MM8g
:color: info
:shadow:
:expand:
:fab:`google;pst-color-light` - GSoC Registry
.. grid-item::
.. button-ref:: gsoc-2024-proposal-ijc
:color: primary
:shadow:
:expand:
Proposal
\ No newline at end of file
......@@ -14,6 +14,11 @@ GSoC over the previous years is given in the section that follows.
:margin: 4 4 0 0
:gutter: 4
.. grid-item-card:: :far:`calendar-days` 2024
:text-align: center
:link: gsoc-2024-projects
:link-type: ref
.. grid-item-card:: :far:`calendar-days` 2023
:text-align: center
:link: gsoc-2023-projects
......@@ -83,6 +88,7 @@ GSoC over the previous years is given in the section that follows.
:maxdepth: 1
:hidden:
2024
2023
2022
2021
......
proposals/2024/aryan_nanda/images/Figure6.png (new file, 108 KiB)
.. _gsoc-2024-proposal-aryan-nanda:
.. _gsoc-proposal-template:
Enhanced Media Experience with AI-Powered Commercial Detection and Replacement
###############################################################################
Enhanced Media Experience with AI-Powered Commercial Detection and Replacement - Aryan Nanda
############################################################################################
Introduction
*************
......@@ -15,12 +14,13 @@ Summary links
- **Contributor:** `Aryan Nanda <https://forum.beagleboard.org/u/aryan_nanda>`_
- **Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_
- **GSoC Repository:** TBD
- **Repository:** `Main Code Repository on GitLab <https://openbeagle.org/aryan_nanda/gsoc_2024-enhanced_media_experience_with_ai-powered_commercial_detection_and_replacement>`_, `Mirror of Code Repository on GitHub <https://github.com/AryanNanda17/GSoC_2024-Enhanced_Media_Experience_with_AI-Powered_Commercial_Detection_and_Replacement>`_
- **Weekly Updates:** `Forum Thread <https://forum.beagleboard.org/t/weekly-progress-report-thread-enhanced-media-experience-with-ai-powered-commercial-detection-and-replacement/38487>`_
Status
=======
This project is currently just a proposal.
This project has been accepted for GSoC 2024.
Proposal
========
......@@ -31,7 +31,7 @@ Proposal
About
=====
- **Resume** - Find my resume `here <https://drive.google.com/file/d/1BblSPdncbjKf4qG7s9ldb7ssIhfGN5bA/view?usp=sharing>`_
- **Resume** - Find my resume `here <https://drive.google.com/file/d/1UPXxEo_Z-qPHpVlnPLcai9cBInQj_c5j/view?usp=sharing>`_
- **Forum:** :fab:`discourse` `u/aryan_nanda <https://forum.beagleboard.org/u/aryan_nanda>`_
- **OpenBeagle:** :fab:`gitlab` `aryan_nanda <https://openbeagle.org/aryan_nanda>`_
- **Github:** :fab:`github` `AryanNanda17 <https://github.com/AryanNanda17>`_
......@@ -130,13 +130,13 @@ This way we can use the features learned by MoViNets on the larger dataset with
This can help improve the model's performance even with limited data.
.. image:: Assets/Figure1.png
.. image:: images/Figure1.png
:alt: Stream buffer in MoViNets
.. centered::
Figure 1: Stream buffer in MoViNets [2]
.. image:: Assets/Figure2.png
.. image:: images/Figure2.png
:alt: Standard Convolution Vs Causal Convolution
.. centered::
......@@ -150,7 +150,7 @@ The depth of the queue will be determined through experimentation to find the op
The Conv+LSTMs model will perform well as it considers both the spatial and temporal features of videos just like a Conv3D model. The only reason it is not my first choice is because MoViNets are considered to be better for real-time performance.
.. image:: Assets/Figure3.png
.. image:: images/Figure3.png
:alt: Conv3D+LSTMs
.. centered::
......@@ -277,7 +277,7 @@ In order to infer a DNN, SDK expects the DNN and associated artifacts in the bel
Therefore, after choosing the model to be used in the GStreamer pipeline, I will generate the artifacts directory by following the instructions mentioned in the TexasInstruments:edgeai-tidl-tools examples [7].
.. image:: Assets/Figure4.png
.. image:: images/Figure4.png
:alt: TFLite Runtime
.. centered::
......@@ -303,7 +303,7 @@ NNStreamer provides efficient and flexible data streaming for machine learning
applications, making it suitable for tasks such as running inference on video frames.
So, I will use NNStreamer elements to do inferencing of videos.
.. image:: Assets/Figure5.png
.. image:: images/Figure5.png
:alt: GStreamer Pipeline
.. centered::
......@@ -320,7 +320,7 @@ The above GStreamer pipeline is a demo pipeline inspired from edge_ai_apps/data_
Project Workflow
===================
.. image:: Assets/Figure6.png
.. image:: images/Figure6.png
:alt: Project Workflow
.. centered::
......@@ -351,73 +351,75 @@ Timeline summary
.. table::
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| Date | Activity |
+========================+========================================================================================================================================================+
| February 26 - March 3 | Connect with possible mentors and request review on first draft |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| March 4 - March 10 | Complete prerequisites, verify value to community and request review on second draft |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| March 11 - March 20 | Finalized timeline and request review on final draft |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| March 21 - April 2 | Proposal review and Submit application |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| April 3 - May 1 | Understanding GStreamer pipeline and TFLite runtime of BeagleBone AI-64. |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| May 2 - May 10         | Start bonding and discussing implementation ideas with mentors.                                                                                        |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| May 11 - May 31 | Focus on college exams. |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| June 1 - June 3 | Start coding and introductory video |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| June 3 - June 9 | :ref:`milestone #1<Milestone1>` -> Releasing introductory video and developing Commercial dataset |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| June 10 - June 16 | :ref:`milestone #2<Milestone2>` -> Developing Non-Commercial dataset and dataset Preprocessing |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| June 17 - June 23 | :ref:`milestone #3<Milestone3>` -> Transfer learning and fine-tuning MoViNets architecture |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| June 24 - June 30 | :ref:`milestone #4<Milestone4>` -> Transfer learning and fine-tuning ResNet architecture |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| July 1 - July 7 | :ref:`milestone #5<Milestone5>` -> Evaluate performance metrics to choose the best-performing model. |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| July 8 - July 14 | :ref:`Submit midterm evaluations <Submit midterm evaluation>` |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| July 15 - July 21 | :ref:`milestone #6<Milestone6>` -> Finalizing the best model by performing real-time inferencing |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| July 22 - July 28 | :ref:`milestone #7<Milestone7>` -> Compiling the model and generating artifacts and building pre-processing part of GStreamer pipeline |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| July 29 - August 4 | :ref:`milestone #8<Milestone8>` -> Building the compute pipeline using NNStreamer |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| August 5 - August 11 | :ref:`milestone #9<Milestone9>` -> Building the post-processing part of GStreamer pipeline |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| August 12 - August 18 | :ref:`milestone #10<Milestone10>` -> Enhancing real-time performance |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| August 19 | :ref:`Submit final project video, submit final work to GSoC site and complete final mentor evaluation<Final project video>` |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
+------------------------+--------------------------------------------------------------------------------------+
| Date | Activity |
+========================+======================================================================================+
| February 26 - March 3 | Connect with possible mentors and request review on first draft |
+------------------------+--------------------------------------------------------------------------------------+
| March 4 - March 10 | Complete prerequisites, verify value to community and request review on second draft |
+------------------------+--------------------------------------------------------------------------------------+
| March 11 - March 20 | Finalized timeline and request review on final draft |
+------------------------+--------------------------------------------------------------------------------------+
| March 21 - April 2 | Proposal review and Submit application |
+------------------------+--------------------------------------------------------------------------------------+
| April 3 - May 1 | Understanding GStreamer pipeline and TFLite runtime of BeagleBone AI-64. |
+------------------------+--------------------------------------------------------------------------------------+
| May 2 - May 10 | :ref:`ACRBonding` |
+------------------------+--------------------------------------------------------------------------------------+
| May 11 - May 31 | Focus on college exams. |
+------------------------+--------------------------------------------------------------------------------------+
| June 1 - June 3 | Start coding and introductory video |
+------------------------+--------------------------------------------------------------------------------------+
| June 3 - June 9 | :ref:`ACRMilestone1` |
+------------------------+--------------------------------------------------------------------------------------+
| June 10 - June 16 | :ref:`ACRMilestone2` |
+------------------------+--------------------------------------------------------------------------------------+
| June 17 - June 23 | :ref:`ACRMilestone3` |
+------------------------+--------------------------------------------------------------------------------------+
| June 24 - June 30 | :ref:`ACRMilestone4` |
+------------------------+--------------------------------------------------------------------------------------+
| July 1 - July 7 | :ref:`ACRMilestone5` |
+------------------------+--------------------------------------------------------------------------------------+
| July 8 - July 14 | :ref:`ACRSubmit-midterm-evaluations` |
+------------------------+--------------------------------------------------------------------------------------+
| July 15 - July 21 | :ref:`ACRMilestone6` |
+------------------------+--------------------------------------------------------------------------------------+
| July 22 - July 28 | :ref:`ACRMilestone7` |
+------------------------+--------------------------------------------------------------------------------------+
| July 29 - August 4 | :ref:`ACRMilestone8` |
+------------------------+--------------------------------------------------------------------------------------+
| August 5 - August 11 | :ref:`ACRMilestone9` |
+------------------------+--------------------------------------------------------------------------------------+
| August 12 - August 18 | :ref:`ACRMilestone10` |
+------------------------+--------------------------------------------------------------------------------------+
| August 19 | :ref:`ACRFinal-project-video` |
+------------------------+--------------------------------------------------------------------------------------+
Timeline detailed
==================
.. _ACRBonding:
Community Bonding Period (May 1st - May 10th)
----------------------------------------------
- Discuss implementation ideas with mentors.
- Discuss the scope of the project.
.. _ACRMilestone1:
Milestone #1, Releasing introductory video and developing commercial dataset (June 3)
-------------------------------------------------------------------------------------
- Making an Introductory Video.
- Commercial dataset acquisition:
- Web scrape videos marked as advertisements from YouTube 8-M dataset.
- Ensure proper labeling and categorization of commercial videos.
.. _ACRMilestone2:
Milestone #2, Developing non-commercial dataset and dataset preprocessing (June 10)
-------------------------------------------------------------------------------------
- Non-commercial dataset acquisition:
- Web scrape random videos from other categories of YouTube 8-M dataset.
- Divide datasets into train, validation, and test sets.
- Shuffle the data randomly at the clip level, keeping frame order within each clip intact so temporal dependencies are preserved.
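The dataset-splitting step above can be sketched in plain Python (the `split_dataset` helper, the split ratios, and the clip file names are illustrative assumptions, not from an existing codebase):

```python
import random

def split_dataset(clips, train=0.8, val=0.1, seed=42):
    """Shuffle at the clip level (frames inside each clip stay ordered),
    then split into train/validation/test sets."""
    clips = list(clips)
    random.Random(seed).shuffle(clips)
    n = len(clips)
    n_train = int(n * train)
    n_val = int(n * val)
    return (clips[:n_train],
            clips[n_train:n_train + n_val],
            clips[n_train + n_val:])

# Example: 100 labelled clips, each a (path, label) pair
# where 1 = commercial, 0 = non-commercial.
clips = [(f"clip_{i}.mp4", i % 2) for i in range(100)]
train_set, val_set, test_set = split_dataset(clips)
```

Shuffling whole clips (rather than individual frames) is what keeps the temporal structure inside each clip usable by the model.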
.. _ACRMilestone3:
Milestone #3, Transfer learning and fine-tuning MoViNets architecture (June 17)
-------------------------------------------------------------------------------------
- Transfer learning and fine-tuning MoViNets architecture:
- Apply transfer learning on MoViNets and fine-tune its last few layers.
- Train MoViNets on the prepared dataset for video classification.
.. _ACRMilestone4:
Milestone #4, Transfer learning and fine-tuning ResNet architecture (June 24)
-------------------------------------------------------------------------------------
- Transfer learning and fine-tuning ResNet architecture:
- Adding additional layers of LSTMs for extracting temporal dependencies.
- Developing ResNet-LSTMs model architecture for video classification.
- Train the ResNet-LSTMs model on the prepared dataset.
.. _ACRMilestone5:
Milestone #5, Evaluate performance metrics to choose the best-performing model (July 1)
---------------------------------------------------------------------------------------
- Finalize the best model:
- Save all trained models to local disk
- Evaluate performance metrics to choose the best-performing model.
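The metric comparison above can be sketched in plain Python (the `evaluate` helper is hypothetical; in practice these numbers would likely come from a library such as scikit-learn):

```python
def evaluate(y_true, y_pred):
    """Basic binary-classification metrics for comparing candidate models
    (1 = commercial, 0 = non-commercial)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy ground truth vs. one model's predictions on a held-out test set.
m = evaluate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

Running the same evaluation over each trained model on the common test set gives a like-for-like basis for choosing the best performer.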
.. _ACRSubmit-midterm-evaluations:
Submit midterm evaluations (July 8th)
-------------------------------------------------------------------------------------
- Document the progress made during the first phase of the project.
**July 12 - 18:00 UTC:** Midterm evaluation deadline (standard coding period)
.. _ACRMilestone6:
Milestone #6, Finalizing the best model by performing real-time inferencing (July 15)
--------------------------------------------------------------------------------------
- Finalize the best model:
- Perform real-time inference using OpenCV to determine the model that yields the best results with high-performance.
- Based on all the options tried in Phase 1, decide on the final model to be used in the GStreamer pipeline.
.. _ACRMilestone7:
Milestone #7, Compiling the model and generating artifacts and building pre-processing part of GStreamer pipeline (July 22)
----------------------------------------------------------------------------------------------------------------------------
- Compile the chosen model and generate artifacts for TFLite runtime.
- Building the pre-processing part of GStreamer pipeline:
- Develop the pre-processing module to prepare video frames for inference.
.. _ACRMilestone8:
Milestone #8, Building the compute pipeline using NNStreamer (July 29)
----------------------------------------------------------------------------------------------------------------------------
- Building the compute pipeline using NNStreamer:
- Implement NNStreamer for inferencing videos using the compiled model.
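As a rough sketch, the compute stage might be assembled as a gst-launch-style pipeline description along these lines (`tensor_converter` and `tensor_filter` are NNStreamer's standard elements; the source element, caps, sink name, and model path are placeholder assumptions):

```python
def build_compute_pipeline(model_path, width=224, height=224):
    """Assemble a gst-launch-style description for the compute stage:
    raw frames -> tensors -> TFLite inference -> application sink."""
    return " ! ".join([
        "v4l2src",                                    # camera source (placeholder)
        f"videoscale ! video/x-raw,width={width},height={height},format=RGB",
        "tensor_converter",                           # frames -> NNStreamer tensors
        f"tensor_filter framework=tensorflow-lite model={model_path}",
        "tensor_sink name=classification",            # hand results to the app
    ])

desc = build_compute_pipeline("movinet_commercials.tflite")
```

The application would then attach a callback to the sink element to receive per-frame classification tensors as they are produced.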
.. _ACRMilestone9:
Milestone #9, Building the post-processing part of GStreamer pipeline (August 5)
----------------------------------------------------------------------------------------------------------------------------
- Building the post-processing part of GStreamer pipeline:
- Develop the post-processing module to perform actions based on classification results.
- Implement replacement or obscuring of commercial segments and audio substitution.
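Per-frame predictions are typically noisy, so the post-processing module would likely smooth them before acting on segments; a minimal plain-Python sketch (the helper names and window size are assumptions for illustration):

```python
def smooth(labels, window=5):
    """Majority vote over a sliding window to suppress single-frame flicker
    in per-frame commercial/non-commercial labels."""
    half = window // 2
    out = []
    for i in range(len(labels)):
        win = labels[max(0, i - half):i + half + 1]
        out.append(1 if sum(win) * 2 > len(win) else 0)
    return out

def commercial_segments(labels):
    """Collapse smoothed per-frame labels into (start, end) frame ranges
    that the pipeline can then replace, obscure, or mute."""
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab == 1 and start is None:
            start = i
        elif lab == 0 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments

raw = [0, 0, 1, 0, 1, 1, 1, 1, 0, 0]   # noisy per-frame model output
segs = commercial_segments(smooth(raw))
```

The resulting frame ranges are what the replacement and audio-substitution logic would operate on.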
.. _ACRMilestone10:
Milestone #10, Enhancing real-time performance (August 12)
----------------------------------------------------------------------------------------------------------------------------
- Enhancing real-time performance:
- Optimize the GStreamer pipeline for real-time performance using native hardware accelerators.
- Ensure smooth and efficient processing of video streams.
.. _ACRFinal-project-video:
Submit final project video, submit final work to GSoC site and complete final mentor evaluation (August 19)
----------------------------------------------------------------------------------------------------------------------------
- Submit final project video, submit final work to GSoC site and complete final mentor evaluation.
Final Submission (Aug 24th)
----------------------------------------------------------------------------------------------------------------------------
.. important::
evaluations (standard coding period)
Initial results (September 3)
----------------------------------------------------------------------------------------------------------------------------
.. important::
**September 3 - November 4:** GSoC contributors with extended timelines continue coding
- If I get stuck on my project and my mentor isn’t around, I will use the following resources:
- `MoViNets <https://www.tensorflow.org/hub/tutorials/movinet>`_
- `GStreamer Docs <https://gstreamer.freedesktop.org/>`_
- `BeagleBone AI-64 docs <https://docs.beagleboard.org/latest/boards/beaglebone/ai-64/01-introduction.html>`_
- `NNStreamer <https://nnstreamer.github.io/>`_
- Moreover, the BeagleBoard community is extremely helpful and active in resolving doubts, which makes it a great resource for project guidance and clarification.
- I intend to remain involved and provide ongoing support for this project beyond the duration of the GSOC timeline.
- Relevant Coursework: `Neural Networks and Deep Learning <https://www.coursera.org/account/accomplishments/verify/LKHTEA9XRWML>`_, `Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization <https://www.coursera.org/account/accomplishments/verify/E52UFAHAY5UG>`_, `Convolutional Neural Networks <https://www.coursera.org/account/accomplishments/verify/9L4QL25AEL3L>`_
References
***********
1. Youtube: `YouTube-8M: A Large and Diverse Labeled Video Dataset <https://research.google.com/youtube8m/>`_
2. Dan Kondratyuk*, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong: `MoViNets: Mobile Video Networks for Efficient Video Recognition. <https://arxiv.org/pdf/2103.11511.pdf>`_
.. _gsoc-2024-proposal-ijc:
Differentiable Logic for Interactive Systems and Generative Music - Ian Clester
###############################################################################
Introduction
*************
Summary links
=============
- **Contributor:** `Ian Clester <https://forum.beagleboard.org/u/ijc>`_
- **Mentors:** `Jack Armitage <https://forum.beagleboard.org/u/jarm>`_, `Chris Kiefer <https://forum.beagleboard.org/u/luuma>`_
- **GSoC:** `Google Summer of Code <https://summerofcode.withgoogle.com/archive/2023/projects/iTfGBkDk>`_
- **Weekly Updates:** `Forum Thread <https://forum.beagleboard.org/t/weekly-progress-report-differentiable-logic-for-interactive-systems-and-generative-music/38486>`_
- **Repository**: `embedded-difflogic <https://openbeagle.org/ijc/embedded-difflogic>`_
Status
=======
This project has been accepted for GSoC 2024.
About
=====
- **Forum:** :fab:`discourse` `u/ijc (Ian Clester) <https://forum.beagleboard.org/u/ijc>`_
- **OpenBeagle:** :fab:`gitlab` `openbeagle.org/ijc <https://openbeagle.org/ijc>`_
- **Discord:** :fas:`comments` `bbb.io/gsocchat <https://bbb.io/gsocchat>`_
- **Github:** :fab:`github` `ijc8 (Ian Clester) <https://github.com/ijc8>`_
- **School:** :fas:`school` Georgia Institute of Technology
- **Country:** :fas:`flag` United States
- **Primary language:** :fas:`language` English
- **Typical work hours:** :fas:`clock` 9AM-6PM US Eastern
- **Previous GSoC participation:** :fab:`google` `Better Faust on the Web (2023) <https://summerofcode.withgoogle.com/archive/2023/projects/L6oI4LhW>`_
Project
********
**Project name:** Differentiable Logic for Interactive Systems and Generative Music
Description
============
The general aim of this project is to enable the development of models that are suitably efficient for use in real-time interactive applications on embedded systems (particularly the BeagleBone-based Bela).
At the project's core is difflogic [1]_, a recent technique that employs a sparsely-connected network composed of basic logic gates (rather than densely-connected neurons with complex activation functions) to obtain small models and fast inference.
Thus, the first and foremost goal of the project is to enable a convenient workflow for developing difflogic models and running them on the Bela. The expected use case is developing and training models on a larger machine (e.g. a laptop, desktop, or server), followed by exporting the model to C and cross-compiling it for the BeagleBone - either the main CPU (ARM Cortex-A8) or the PRUs.
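As a rough illustration of the technique difflogic is built on (a minimal sketch of the idea, not the library's actual API): each two-input gate is relaxed to a real-valued function so the choice of gate can be trained by gradient descent, then hardened back to a single discrete gate for fast inference.

```python
# Real-valued relaxations of some two-input logic gates, in the spirit
# of differentiable logic gate networks: these agree with the Boolean
# gates on {0, 1} but are differentiable in between.
SOFT_GATES = {
    "AND":  lambda a, b: a * b,
    "OR":   lambda a, b: a + b - a * b,
    "XOR":  lambda a, b: a + b - 2 * a * b,
    "NAND": lambda a, b: 1 - a * b,
}

def soft_gate(a, b, weights):
    """Training-time node: a weighted mixture over candidate gates.
    In difflogic the weights come from a softmax over learned logits."""
    return sum(w * g(a, b) for w, g in zip(weights, SOFT_GATES.values()))

def hard_gate(a, b, weights):
    """Inference-time node: keep only the most probable gate, which is
    what makes the exported C model so small and fast."""
    best = max(range(len(weights)), key=weights.__getitem__)
    return round(list(SOFT_GATES.values())[best](a, b))
```

The exported C code corresponds to the hardened form: each node is a single logic operation, with no floating-point arithmetic left at inference time.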
To support this workflow, I will develop wrappers for exporting compiled difflogic models for use in the various languages supported on Bela (C++, Pure Data, SuperCollider, Csound).
These wrappers will likely take inspiration from other projects that bring machine learning into computer music environments, such as `nn~ <https://github.com/acids-ircam/nn_tilde>`_ and `FluCoMa <https://www.flucoma.org/>`_.
This first goal, along with profiling and benchmarking the performance of difflogic models on both the main CPU and the PRUs, constitutes roughly the first half of the project.
The other, more exploratory half of the project consists of building out integrations and applications of difflogic for the rapid development of useful audio models.
To that end, I intend to explore the possibilities of combining difflogic networks with techniques such as DDSP (differentiable digital signal processing) [2]_, possibly also leveraging Faust auto-differentiation.
I also intend to investigate the feasibility of "porting" well-known ML architectures such as VAEs to difflogic networks, and of training difflogic networks to approximate the behavior of existing neural networks (i.e. knowledge distillation).
Audio models such as RAVE [3]_, PESTO [4]_, and Whisper [5]_ may be of particular interest.
Furthermore, I will explore opportunities to combine difflogic networks with other cheap, effective techniques like the $Q recognizer [6]_ for gestural control, linear predictive coding for audio analysis & resynthesis, and toolkits such as `RapidLib <https://github.com/jarmitage/RapidLibBela>`_.
Such combinations may be particularly useful for interactive machine learning (as in Wekinator [7]_), should fine-tuning difflogic models on-device prove too costly.
In this phase of the project, I will develop example applications involving sound analysis, classification, and synthesis, and experiment with interactive machine learning.
Finally, I intend to dedicate some time to a specific creative application: generating networks of logic gates to approximate particular sounds and exploring the space of such sound-generating networks.
This application is inspired by bytebeat [8]_, a practice which involves writing short expressions that describe audio as a function of time, generating music sample-by-sample.
Typically, these expressions involve many bit-twiddling operations, consisting primarily of logic gates (bitwise AND, OR, XOR, NOT) and shifts --- a fact that suggests a remarkably good fit for difflogic, wherein models consist of networks of gates.
Other inspirations include work on sound matching: reproducing a given sound or family of sounds by estimating synthesizer parameters [9]_, generating patches [10]_, or training models [11]_.
In this vein, I will attempt to train difflogic gates to reproduce particular sounds, treating the entire network as a bytebeat-style function of time (sample index) that outputs samples.
Thanks to the tricks difflogic employs to train a network of discrete gates, this approach will enable sound matching via gradient descent and backpropagation (as in e.g. DDSP) rather than evolutionary methods, while still ultimately generating a discrete function.
Lastly, I will build an interactive application to explore the space of sound-generating networks (e.g. by mutating a network, or morphing between two networks) and visualize the execution of logic gate networks.
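For reference, here is the bytebeat idea in miniature: a single bit-twiddling function of the sample index generates the audio, which is exactly the representation the sound-matching experiments would target (this is a generic example expression, not any particular known piece):

```python
def bytebeat(t):
    """Map the sample index t to an unsigned 8-bit sample via shifts and
    bitwise ops; bytebeat is conventionally played at 8 kHz."""
    return (t * (t >> 10 | t >> 8)) & 0xFF

samples = [bytebeat(t) for t in range(8000)]  # one second of audio
```

A trained difflogic network playing the same role would take the bits of ``t`` as inputs and emit the bits of the output sample, making the whole piece a network of logic gates.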
Software
=========
- C
- C++
- Python
- PyTorch
- difflogic
- dasp
- Faust
- Linux
Hardware
========
- Bela
- BeagleBone Black
- Bela Cape
- Microphone
- Speaker
- OLED screen
Timeline
********
.. note:: This timeline is based on the `official GSoC timeline <https://developers.google.com/open-source/gsoc/timeline>`_
Timeline summary
=================
.. table::
+------------------------+----------------------------------------------------------------------------------------------------+
| Date | Activity |
+========================+====================================================================================================+
| February 26 | Connect with possible mentors and request review on first draft |
+------------------------+----------------------------------------------------------------------------------------------------+
| March 4 | Complete prerequisites, verify value to community and request review on second draft |
+------------------------+----------------------------------------------------------------------------------------------------+
| March 11 | Finalized timeline and request review on final draft |
+------------------------+----------------------------------------------------------------------------------------------------+
| March 21 | Submit application |
+------------------------+----------------------------------------------------------------------------------------------------+
| May 1 | Start bonding |
+------------------------+----------------------------------------------------------------------------------------------------+
| May 27 | Start coding and introductory video |
+------------------------+----------------------------------------------------------------------------------------------------+
| June 3 | Release introductory video and complete milestone #1 |
+------------------------+----------------------------------------------------------------------------------------------------+
| June 10 | Complete milestone #2 |
+------------------------+----------------------------------------------------------------------------------------------------+
| June 17 | Complete milestone #3 |
+------------------------+----------------------------------------------------------------------------------------------------+
| June 24 | Complete milestone #4 |
+------------------------+----------------------------------------------------------------------------------------------------+
| July 1 | Complete milestone #5 |
+------------------------+----------------------------------------------------------------------------------------------------+
| July 8 | Submit midterm evaluations |
+------------------------+----------------------------------------------------------------------------------------------------+
| July 15 | Complete milestone #6 |
+------------------------+----------------------------------------------------------------------------------------------------+
| July 22 | Complete milestone #7 |
+------------------------+----------------------------------------------------------------------------------------------------+
| July 29 | Complete milestone #8 |
+------------------------+----------------------------------------------------------------------------------------------------+
| August 5 | Complete milestone #9 |
+------------------------+----------------------------------------------------------------------------------------------------+
| August 12 | Complete milestone #10 |
+------------------------+----------------------------------------------------------------------------------------------------+
| August 19 | Submit final project video, submit final work to GSoC site and complete final mentor evaluation |
+------------------------+----------------------------------------------------------------------------------------------------+
Timeline detailed
=================
Community Bonding Period (May 1st - May 26th)
----------------------------------------------------------------------------
GSoC contributors get to know mentors, read documentation, get up to speed to begin working on their projects
Coding begins (May 27th)
----------------------------------------------------------------------------
Milestone #1, Introductory YouTube video (June 3rd)
----------------------------------------------------------------------------
- Setup development environment
- Train trivial difflogic network on laptop & run generated C on Bela (main CPU)
Milestone #2 (June 10th)
----------------------------------------------------------------------------
- Run difflogic network on PRU
- Perform feature extraction (FFT, MFCCs) on PRU
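For illustration, the feature-extraction math in question in naive plain-Python form (a real PRU implementation would use a fixed-point FFT in C, and MFCCs add a mel filterbank and DCT on top):

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive O(N^2) discrete Fourier transform; enough to show the math,
    far too slow for real-time use (hence an FFT on the PRU)."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal)))
            for k in range(n)]

# A pure tone at bin 2 of an 8-point window concentrates energy in
# bins 2 and 6 (the conjugate-symmetric pair), each with magnitude n/2.
sig = [math.cos(2 * math.pi * 2 * i / 8) for i in range(8)]
mags = dft_magnitudes(sig)
```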
Milestone #3 (June 17th)
----------------------------------------------------------------------------
- Build wrappers to simplify use of difflogic networks in Bela projects
- C++ (namespace & wrapper around difflogic-generated C)
- SuperCollider (UGen)
Milestone #4 (June 24th)
----------------------------------------------------------------------------
- Build wrappers to simplify use of difflogic networks in Bela projects
- Pure Data (external)
- Csound (UDO)
Milestone #5 (July 1st)
----------------------------------------------------------------------------
- Explore feasibility of combining difflogic with DDSP techniques (via dasp and possibly Faust auto-differentiation)
- Use difflogic network to control synthesizer parameters
Submit midterm evaluations (July 8th)
----------------------------------------------------------------------------
.. important::
**July 12 - 18:00 UTC:** Midterm evaluation deadline (standard coding period)
Milestone #6 (July 15th)
----------------------------------------------------------------------------
- Investigate feasibility of interactive machine learning (e.g. fine-tuning) with difflogic networks
- Combine difflogic networks with complementary cheap techniques (e.g. LPC, template matching via $Q, RapidLib)
Milestone #7 (July 22nd)
----------------------------------------------------------------------------
- Work on example applications
- Classify short mouth sounds for interactive system control (à la `parrot.py <https://github.com/chaosparrot/parrot.py>`_)
- Perform real-time pitch estimation (à la PESTO)
Milestone #8 (July 29th)
----------------------------------------------------------------------------
- Experiment with implementing popular architectures (e.g. VAEs, as in RAVE) as difflogic networks
- Experiment with difflogic knowledge distillation: training a difflogic network to approximate the behavior of a pre-trained, conventional neural network (student/teacher)
Milestone #9 (Aug 5th)
----------------------------------------------------------------------------
- Experiment with training difflogic networks for sound reconstruction
- Bytebeat-inspired: feed increasing timestamps to network, get subsequent audio samples out
Milestone #10 (Aug 12th)
----------------------------------------------------------------------------
- Creative application: Interactive exploration of space of difflogic sound reconstruction models
- "Glitch" - random perturbations of network (mutate gates & connections)
- "Morph" - interpolate (in terms of tree edit-distance) between different sound-generating networks
- Visualize difflogic networks & their execution
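The "glitch" operation above can be sketched over a toy network representation (the triple encoding and the `glitch` helper are assumptions for illustration, not an existing interface):

```python
import random

GATES = ["AND", "OR", "XOR", "NAND", "NOR", "XNOR"]

def glitch(network, rate=0.1, seed=None):
    """Randomly perturb a gate network: each node is a (gate, in_a, in_b)
    triple, and with probability `rate` its gate type is swapped for a
    different one, leaving the wiring intact."""
    rng = random.Random(seed)
    mutated = []
    for gate, a, b in network:
        if rng.random() < rate:
            gate = rng.choice([g for g in GATES if g != gate])
        mutated.append((gate, a, b))
    return mutated

net = [("AND", 0, 1), ("XOR", 1, 2), ("OR", 0, 2)]
mutated = glitch(net, rate=1.0, seed=1)  # rate=1.0: every gate changes
```

"Morph" would interpolate between two such networks instead of perturbing one, e.g. by stepping through a sequence of single-node edits between them.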
Final YouTube video (Aug 19th)
----------------------------------------------------------------------------
Submit final project video, submit final work to GSoC site
and complete final mentor evaluation
Final Submission (Aug 24th)
----------------------------------------------------------------------------
.. important::
**August 19 - 26 - 18:00 UTC:** Final week: GSoC contributors submit their final work
product and their final mentor evaluation (standard coding period)
**August 26 - September 2 - 18:00 UTC:** Mentors submit final GSoC contributor
evaluations (standard coding period)
Initial results (September 3)
----------------------------------------------------------------------------
.. important::
**September 3 - November 4:** GSoC contributors with extended timelines continue coding
**November 4 - 18:00 UTC:** Final date for all GSoC contributors to submit their final work product and final evaluation
**November 11 - 18:00 UTC:** Final date for mentors to submit evaluations for GSoC contributor projects with extended deadline
Experience and approach
***********************
I have extensive experience with embedded systems and real-time audio.
As an undergraduate, I worked on embedded systems during internships at Astranis and Google.
For a final class project, I developed a multi-effects pedal with a configurable signal chain in C using fixed-point arithmetic on the `Cypress PSoC 5 <https://www.infineon.com/cms/en/product/microcontroller/32-bit-psoc-arm-cortex-microcontroller/32-bit-psoc-5-lp-arm-cortex-m3/>`_ (an ARM-based system-on-a-chip with configurable digital and analog blocks).
My `master's work <https://dspace.mit.edu/handle/1721.1/129201>`_ involved localizing RFID tags using software-defined radios with framerates sufficient for interactive systems.
Currently, I am a teaching assistant for a class on Audio Software Engineering (in Rust, with a focus on real-time audio software), in which I have been responsible for preparing much of the material and lectures.
I have worked with a variety of microcontrollers and single-board computers, from writing assembly on the Intel 8051, to C++ on Arduinos and ESP32s, to Python and JS on Raspberry Pis.
I have also employed machine learning techniques to build interactive systems.
In a graduate course on multimodal user interaction, I gained experience with classic machine learning techniques, and employed cheap techniques for gesture recognition in a `tablet-based musical sketchpad <https://github.com/ijc8/notepad>`_.
In the meantime, I have been following developments in machine learning for audio (particularly those that are feasible to run locally, especially sans GPU), and I have experimented with models such as RAVE and Whisper (using the latter for a recent interactive audiovisual `hackathon project <https://github.com/ijc8/hackathon-2024>`_).
Much of my graduate work has focused on generative music and computational representations of music.
My recent work on `ScoreCard <https://ijc8.me/s>`_ has put an extreme emphasis on fitting music-generating programs (typically written in C) into efficient, self-contained packages that are small enough to store in a QR code (\< 3kB).
Previous projects such as `Blocks <https://ijc8.me/blocks>`_ (an audiovisual installation) and `kilobeat <https://ijc8.me/kilobeat>`_ (a collaborative livecoding tool) have probed the musical potential of extremely short fragments of code (bytebeat & floatbeat expressions).
These projects also explore methods of visualizing musical programs, either in terms of their output or their execution.
More information about my work is available on `my website <https://ijc8.me>`_ and `GitHub <https://github.com/ijc8>`_.
I am particularly interested in difflogic because it occupies an intersection between lightweight machine learning techniques (cheaper is better!) and compact representations of musical models (less is more!), and I am strongly motivated to see what it can do.
Contingency
===========
If I get stuck on something related to BeagleBoard or Bela development, I plan to take advantage of resources within those communities (such as documentation, forums, and Discord servers).
If I get stuck on something related to ML or DSP, I plan to refer back to reference texts and the papers and code of related work (DDSP, RAVE, PESTO, etc.), and I may reach out to colleagues within the ML space (such as those in the Music Information Retrieval lab within my department) for advice.
If I get stuck on something related to music or design, I plan to take a break and go on a walk. :-)
Benefit
=======
The first half of this project will provide a straightforward means to develop models with difflogic and run them on embedded systems such as BeagleBoards and particularly Bela. (The wrappers for Bela's supported languages may also prove generally useful outside of embedded contexts.)
Making it easier for practitioners to use difflogic models in creative applications will, in turn, aid in the development of NIMEs and DMIs that can benefit from the small size and fast inference (and corresponding portability and low latency) of difflogic networks.
The second half of this project, depending on the results of my explorations, may demonstrate effective ways to combine difflogic with other ML & DSP techniques, and provide interesting audio-focused applications that serve both as demonstrations of the possibilities for ML on the BeagleBoard and as starting points for others.
Misc
====
`Here <https://github.com/jadonk/gsoc-application/pull/194>`_ is my pull request demonstrating cross-compilation and version control.
References
==========
.. [1] Petersen, F. et al. 2022. Deep Differentiable Logic Gate Networks. Proceedings of the 36th Conference on Neural Information Processing Systems (Oct. 2022).
.. [2] Engel, J. et al. 2020. DDSP: Differentiable Digital Signal Processing. Proceedings of the International Conference on Learning Representations (2020).
.. [3] Caillon, A. and Esling, P. 2021. RAVE: A variational autoencoder for fast and high-quality neural audio synthesis. arXiv.
.. [4] Riou, A. et al. 2023. PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective. Proceedings of the 24th International Society for Music Information Retrieval Conference (Sep. 2023).
.. [5] Radford, A. et al. 2023. Robust Speech Recognition via Large-Scale Weak Supervision. Proceedings of the 40th International Conference on Machine Learning (2023).
.. [6] Vatavu, R.-D. et al. 2018. $Q: a super-quick, articulation-invariant stroke-gesture recognizer for low-resource devices. Proceedings of the 20th International Conference on Human-Computer Interaction with Mobile Devices and Services (New York, NY, USA, Sep. 2018), 1–12.
.. [7] Fiebrink, R. et al. 2009. A Meta-Instrument for Interactive, On-the-fly Machine Learning. Proceedings of the International Conference on New Interfaces for Musical Expression (2009), 280–285.
.. [8] Heikkilä, V.-M. 2011. Discovering novel computer music techniques by exploring the space of short computer programs. arXiv.
.. [9] Yee-King, M. and Roth, M. 2008. Synthbot: An unsupervised software synthesizer programmer. ICMC (2008).
.. [10] Macret, M. and Pasquier, P. 2014. Automatic design of sound synthesizers as pure data patches using coevolutionary mixed-typed cartesian genetic programming. Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (New York, NY, USA, Jul. 2014), 309–316.
.. [11] Caspe, F. et al. 2022. DDX7: Differentiable FM Synthesis of Musical Instrument Sounds. Proceedings of the 23rd International Society for Music Information Retrieval Conference. (2022).