From 6154da3db13e0dfa590b99a639999a0ab12d2692 Mon Sep 17 00:00:00 2001
From: Deepak Khatri <deepaklorkhatri7@gmail.com>
Date: Thu, 25 Jan 2024 22:36:28 +0530
Subject: [PATCH] Add deep learning projects

---
 ideas/deep-learning.rst | 60 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/ideas/deep-learning.rst b/ideas/deep-learning.rst
index 8bd7af5..319d1d0 100644
--- a/ideas/deep-learning.rst
+++ b/ideas/deep-learning.rst
@@ -3,5 +3,61 @@
 Deep Learning
 ###############
 
-YOLO models on the X15/AI-64
-*******************************
\ No newline at end of file
+BeagleBoard-X15, BeagleBone-AI and BeagleBone-AI64 all have accelerators for running deep
+learning tasks using TIDL (1, 2). We'd love projects that enable people to build more deep
+learning applications and end nodes and leverage cloud-based training more easily. The goal
+here is to create tools that make learning about and applying AI and deep learning easier.
+Contributions to projects like ArduPilot and DonkeyCar (DIY Robocars and BlueDonkey) to
+introduce autonomous navigation to mobile robots are good candidates.
+
+For some background, be sure to check out the `simplify embedded edge AI development
+<https://e2e.ti.com/blogs_/b/process/posts/simplify-embedded-edge-ai-development>`_
+post from TI.
+
+.. card::
+
+    **YOLO models on the X15/AI-64**
+    ^^^^
+
+    Port the YOLO model(s) to the X15/AI so the accelerator blocks can be leveraged. Currently,
+    running a frame through YOLOv2-tiny takes anywhere from 15 to 35 seconds depending on
+    how the code is run on the ARM: 35 seconds for a pure brute-force compilation for ARM, and
+    15 seconds when utilizing NEON and tweaked algorithms. The goal is to get things down to
+    1 second or less using the onboard accelerators. Note that there are at least 6 different
+    variants of YOLO (YOLOv1, YOLOv2 and YOLOv3, each with a full-size and a tiny version).
+    The main interest is in getting either the YOLOv2 or YOLOv3 version running. Please discuss
+    the desired approach with potential mentors, as there are many options. To name a few:
+    porting the YOLO model into TIDL; using OpenCL directly; integrating OpenCL with the
+    acceleration library; integrating TIDL support with an acceleration library.
+
+    - **Goal:** Run YOLOv2 or YOLOv3 with the onboard hardware acceleration.
+    - **Hardware Skills:** None
+    - **Software Skills:** C, C++, Linux kernel, Understanding of NNs and Convolution.
+    - **Possible Mentors:** Hunyue Yau (ds2)
+    - **Expected Size of Project:** 350 hrs
+    - **Rating:** Medium
+    - **Upstream Repository:** Numerous
+    - **References:** https://pjreddie.com/darknet/yolo/
+
+    ++++
+
+.. card::
+
+    **OpenGLES acceleration for DL**
+    ^^^^
+
+    Current acceleration on the X15/AI focuses on using the EVE and DSP hardware blocks.
+    The SoC on those boards also features an OpenGLES-enabled GPU. The goal of this project
+    is to use shaders to perform computations. A possible framework to apply this to is
+    the Darknet CNN framework.
+
+    - **Goal:** Accelerate as many layer types as possible using OpenGLES.
+    - **Hardware Skills:** None
+    - **Software Skills:** C, C++, Linux kernel, OpenGLES, Understanding of NNs and Convolution.
+    - **Possible Mentors:** Hunyue Yau (ds2)
+    - **Expected Size of Project:** 350 hrs
+    - **Rating:** Medium
+    - **Upstream Repository:** Numerous
+    - **References:** https://pjreddie.com/darknet/
+
+    ++++
\ No newline at end of file
-- 
GitLab