diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 6ac69b740a54a17fc241fee31374f761bc7f2f52..de8aa22d2ff408ec086383991cde55058b4eb2b5 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -1,5 +1,5 @@ +# The Docker image that will be used to build your app image: beagle/sphinx-build-env:latest - pages: tags: - docker-amd64 @@ -7,4 +7,4 @@ pages: - "./gitlab-build.sh" artifacts: paths: - - public \ No newline at end of file + - public diff --git a/proposals/Assets/Figure1.png b/proposals/Assets/Figure1.png new file mode 100644 index 0000000000000000000000000000000000000000..7efa530714c1b9350e3f7f6787498dc9bb64893b Binary files /dev/null and b/proposals/Assets/Figure1.png differ diff --git a/proposals/Assets/Figure2.png b/proposals/Assets/Figure2.png new file mode 100644 index 0000000000000000000000000000000000000000..ee57b2366fcdce40d4c4013ddc52a9c0cf14c389 Binary files /dev/null and b/proposals/Assets/Figure2.png differ diff --git a/proposals/Assets/Figure3.png b/proposals/Assets/Figure3.png new file mode 100644 index 0000000000000000000000000000000000000000..34e8986c725e990adc5cbfe8b0de1d9041fb3fec Binary files /dev/null and b/proposals/Assets/Figure3.png differ diff --git a/proposals/Assets/Figure4.png b/proposals/Assets/Figure4.png new file mode 100644 index 0000000000000000000000000000000000000000..7e44bf290e8a5fdacc033b378d5903c22cf85277 Binary files /dev/null and b/proposals/Assets/Figure4.png differ diff --git a/proposals/Assets/Figure5.png b/proposals/Assets/Figure5.png new file mode 100644 index 0000000000000000000000000000000000000000..9fdd3dc32baa8897de68c8d2239803e88061c8e7 Binary files /dev/null and b/proposals/Assets/Figure5.png differ diff --git a/proposals/Assets/Figure6.png b/proposals/Assets/Figure6.png new file mode 100644 index 0000000000000000000000000000000000000000..8ba089b7d561f8251156e20c0807ddadf3bc9c9c Binary files /dev/null and b/proposals/Assets/Figure6.png differ diff --git a/proposals/commercial_detection_and_replacement.rst b/proposals/commercial_detection_and_replacement.rst new file mode 100644 index 0000000000000000000000000000000000000000..2b0c4ae72774180848e0ee488ef940e0773ce524 --- /dev/null +++ b/proposals/commercial_detection_and_replacement.rst @@ -0,0 +1,598 @@ + +.. _gsoc-proposal-template: + +Enhanced Media Experience with AI-Powered Commercial Detection and Replacement +############################################################################### + +Introduction +************* + +The BeagleBone® AI-64 from the BeagleBoard.org Foundation is a complete system for developing artificial intelligence (AI) and machine-learning solutions with the convenience and expandability of the BeagleBone platform and onboard peripherals to start learning and building applications. +Leveraging the capabilities of BeagleBoard’s powerful processing units, the project will focus on creating a real-time, efficient solution that enhances media consumption experiences by seamlessly integrating custom audio streams during commercial breaks. + +Summary links +============= + +- **Contributor:** `Aryan Nanda <https://forum.beagleboard.org/u/aryan_nanda>`_ +- **Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux>`_ +- **GSoC Repository:** TBD + +Status +======= + +This project is currently just a proposal. 
+ +Proposal +======== +- Created accounts across `OpenBeagle <https://openbeagle.org/aryan_nanda>`_, `Discord <https://discord.com/users/758929156401528892>`_ and `Beagle Forum <https://forum.beagleboard.org/u/aryan_nanda>`_ +- The PR Request for Cross Compilation: `#185 <https://github.com/jadonk/gsoc-application/pull/185>`_ +- Created a project proposal using the `proposed template <https://gsoc.beagleboard.io/proposals/template.html>`_ + +About +===== + +- **Resume** - Find my resume `here <https://drive.google.com/file/d/1UPXxEo_Z-qPHpVlnPLcai9cBInQj_c5j/view?usp=sharing>`_ +- **Forum:** :fab:`discourse` `u/aryan_nanda <https://forum.beagleboard.org/u/aryan_nanda>`_ +- **OpenBeagle:** :fab:`gitlab` `aryan_nanda <https://openbeagle.org/aryan_nanda>`_ +- **Github:** :fab:`github` `AryanNanda17 <https://github.com/AryanNanda17>`_ +- **School:** :fas:`school` `Veermata Jijabai Technological Institute (VJTI) <https://vjti.ac.in/>`_ +- **Country:** :fas:`flag` India +- **Primary language:** :fas:`language` English, Hindi +- **Typical work hours:** :fas:`clock` 9AM-5PM Indian Standard Time +- **Previous GSoC participation:** :fab:`google` This would be my first time participating in GSOC + +**About the Project** +********************** + +**Project name:** Enhanced Media Experience with AI-Powered Commercial Detection and Replacement + +Description +============ + +I propose developing **GStreamer Plugins** capable of processing video inputs based on their classification. +The plugins will identify commercials and either replace them with alternative content or obscure them, +while also substituting the audio with predefined streams. This enhancement aims to improve the media +consumption experience by eliminating unnecessary interruptions. I intend to **explore various video +classification models** to achieve accurate detection and utilize TensorFlow Lite to leverage the **native +accelerators of BeagleBone AI-64** for high-performance, real-time inferencing with minimal latency. +I believe real-time high-performance would be the most critical thing for this project and I intend on testing +a few different ways to see which one works best. + +Goals and Objectives +===================== + +The goal of this project is to detect and replace commercials in video streams on BeagleBoard hardware +using a GStreamer pipeline which includes a model that accurately detects commercials with minimal latency. +Comparison of different model accuracy can be done by doing some manual analysis and trying different video +classification models and to finally use the best performing option to be included in the GStreamer pipeline +for inferencing of real-time videos. This would be the result presented at the end of the project timeline. +For phase 1 evaluation, the goal is to build a training dataset, preprocess it and fine-tune and train a Video +Classification model. For phase 2 evaluation, the goal is to use the the best model identified in phase 1 for +commercial detection and build a GStreamer pipeline and use native accelerators present in BeagleBone AI-64 +for high-performance. + +In order to accomplish this project the following objectives need to be met. + +1. Phase 1:- + - Develop a dataset of videos and corresponding labels indicating the presence of commercials in specific segments. + - Preprocess the dataset to ensure it's suitable for input into deep learning models. Moreover divide the datset into train, validation and test set. 
+   - Apply transfer learning and fine-tune various deep learning models, training them on the prepared dataset to identify the most accurate one for commercial detection in videos.
+   - Save all trained models to local disk and perform real-time inference using OpenCV to determine the model that yields the best results with high performance.
+2. Phase 2:-
+   - Based on all the options tried in Phase 1, decide on the final model to be used in the GStreamer pipeline.
+   - Compile the model and generate artifacts so that it can be used with the TFLite runtime.
+   - Build a GStreamer pipeline that takes real-time media input and identifies the commercial segments in it.
+   - If a commercial segment is identified, the GStreamer pipeline will either replace it with alternative content or obscure it, while also substituting the audio with predefined streams.
+   - I will also try to cut the commercial out completely and splice the ends.
+   - Enhance real-time performance using the native hardware accelerators present in BeagleBone AI-64.
+
+**Methods**
+***********************
+In this section, I specify in greater detail the methods I plan to use for the training dataset, the models, the GStreamer pipeline, etc.
+
+Building training Dataset and Preprocessing
+============================================
+To train the model effectively, we need a dataset with accurate labels. Since a suitable commercial video
+dataset isn't readily available, I'll create one. This dataset will consist of two classes: commercial and
+non-commercial. By dividing the dataset into commercial and non-commercial segments, I am focusing on
+"content categorization". Separating the dataset into commercials and non-commercials allows our model to
+learn distinct features associated with each category. For commercials, this might include fast-paced editing,
+product logos, specific jingles, or other visual/audio cues. Non-commercial segments may include slower-paced
+scenes, dialogue, or narrative content.
+
+
+To build this dataset, I'll refer to the **Youtube-8M dataset** [1],
+which includes videos categorized as TV advertisements. However, since the Youtube-8M dataset provides encoded
+feature vectors instead of the actual videos, direct usage would result in significant latency. Therefore,
+I'll use it as a reference and download the videos it labels as advertisements to build our dataset.
+I will use a web scraper to automate this process by extracting the URLs of the commercial videos. For the
+non-commercial part, I will download random videos from other categories of the Youtube-8M dataset.
+After the dataset is ready, I will preprocess it to ensure it's suitable for input into deep learning models.
+
+
+Moreover, I'll divide the dataset into training, validation, and test sets. To address temporal dependencies during
+training, I intend to employ random shuffling of the dataset using
+``tf.keras.preprocessing.image_dataset_from_directory()`` with ``shuffle=True``. This approach ensures that
+videos from different folders are presented to the model randomly, allowing it to learn scene change detection
+effectively.
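+
+As a rough sketch, the loading-and-splitting step could look like the snippet below, assuming frames extracted
+from the commercial and non-commercial videos are stored in class-named subdirectories. The directory layout,
+image size, and batch size are illustrative assumptions, and the test set would be held out in a separate
+directory in the same way.
+
+.. code-block:: python
+
+    # Assumed layout (illustrative only):
+    #   dataset/commercial/...      frames extracted from commercial clips
+    #   dataset/non_commercial/...  frames extracted from non-commercial clips
+    import tensorflow as tf
+
+    IMG_SIZE = (172, 172)   # placeholder input resolution
+    BATCH = 32
+
+    train_ds = tf.keras.utils.image_dataset_from_directory(
+        "dataset",
+        validation_split=0.2,
+        subset="training",
+        seed=42,
+        shuffle=True,              # random shuffling to break temporal ordering
+        image_size=IMG_SIZE,
+        batch_size=BATCH,
+    )
+    val_ds = tf.keras.utils.image_dataset_from_directory(
+        "dataset",
+        validation_split=0.2,
+        subset="validation",
+        seed=42,
+        shuffle=True,
+        image_size=IMG_SIZE,
+        batch_size=BATCH,
+    )
+
+    # Prefetch so the accelerator is not starved during training.
+    train_ds = train_ds.prefetch(tf.data.AUTOTUNE)
+    val_ds = val_ds.prefetch(tf.data.AUTOTUNE)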
+
+
+Video Classification models
+============================
+**MoViNets** is a good model for our task as it can operate on streaming videos for online inference. The main reason for trying MoViNets first is that they perform quick, continuous analysis of incoming video streams.
+MoViNet utilizes NAS (Neural Architecture Search) to balance accuracy and efficiency, incorporates stream buffers for constant memory usage [Fig. 1], and improves accuracy via temporal ensembles [2]. The MoViNet architecture uses 3D convolutions that are "causal". Causal convolution ensures that the output at time t is computed using only inputs up to time t [2][Fig. 2]. This allows for efficient streaming.
+This makes MoViNets a perfect choice for our case.
+
+
+Since we don't have a big dataset, we will use the pre-trained MoViNet model as a feature extractor
+and fine-tune it on our dataset. I will remove the classification layers of MoViNet and use its
+pre-trained weights to extract features from our dataset. Then I will train a smaller classifier (e.g., a few fully connected layers) on top of these features.
+This way we can reuse the features MoViNet learned on a much larger dataset with minimal risk of overfitting.
+This can help improve the model's performance even with limited data.
+
+
+.. image:: Assets/Figure1.png
+   :alt: Stream buffer in MoViNets
+
+.. centered::
+   Figure 1: Stream buffer in MoViNets [2]
+
+.. image:: Assets/Figure2.png
+   :alt: Standard Convolution Vs Causal Convolution
+
+.. centered::
+   Figure 2: Standard Convolution Vs Causal Convolution [2]
+
+
+If MoViNet does not perform well, then we can use other models like **Conv+LSTMs** [Fig. 3][3]. Since a video is just a series of frames, a naive video classification method would be to pass each frame from a video file through a CNN, classify each frame individually and independently of the others, choose the label with the largest corresponding probability, label the frame, and assign the most frequently assigned frame label to the video.
+
+To solve the problem of "prediction flickering", where the label for the video changes rapidly when scenes get labeled differently, I will use **rolling prediction averaging** to reduce flickering in the results. I will also maintain a queue that stores the predictions for the last few frames, and whenever a scene change is detected, all frames in the queue will be marked with the current result, allowing for retroactive scene modification.
+The depth of the queue will be determined through experimentation to find the optimal setting.
+
+The Conv+LSTM model should perform well as it considers both the spatial and temporal features of videos, just like a Conv3D model. The only reason it is not my first choice is that MoViNets are considered better for real-time performance.
+
+.. image:: Assets/Figure3.png
+   :alt: Conv3D+LSTMs
+
+.. centered::
+   Figure 3: Conv+LSTMs [3]
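+
+A minimal Keras sketch of this Conv+LSTM fallback is shown below: a frozen CNN backbone extracts per-frame
+features (``TimeDistributed``), an LSTM aggregates them over time, and a small dense head performs the binary
+classification. The backbone choice (MobileNetV2), frame count, and resolution are placeholders rather than
+final choices; the same frozen-backbone-plus-small-head recipe is what I intend to apply to MoViNet as well.
+
+.. code-block:: python
+
+    import tensorflow as tf
+    from tensorflow.keras import layers, models
+
+    NUM_FRAMES, HEIGHT, WIDTH = 16, 224, 224   # placeholder clip length and resolution
+
+    backbone = tf.keras.applications.MobileNetV2(
+        include_top=False, pooling="avg", input_shape=(HEIGHT, WIDTH, 3))
+    backbone.trainable = False                 # use the pre-trained CNN purely as a feature extractor
+
+    model = models.Sequential([
+        layers.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, 3)),
+        layers.TimeDistributed(backbone),       # per-frame spatial features
+        layers.LSTM(64),                        # temporal aggregation across frames
+        layers.Dropout(0.3),
+        layers.Dense(2, activation="softmax"),  # commercial vs. non-commercial
+    ])
+    model.compile(optimizer="adam",
+                  loss="sparse_categorical_crossentropy",
+                  metrics=["accuracy"])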
+
+Optional Methods
+-----------------
+- **ViViT**: A Video Vision Transformer. This is a pure Transformer-based model which extracts spatio-temporal tokens from the input video, which are then encoded by a series of transformer layers [4]. I have kept this as an optional method because our problem is binary classification (either commercial or non-commercial), so using such a complex model for this small problem may not be as efficient as the other models.
+- **Audio fingerprinting**: This method involves extracting unique characteristics or features from audio signals to create a compact representation, often called a fingerprint. These fingerprints can then be compared against a database or used for various audio processing tasks [5]. I have kept it as an optional method because it may sometimes yield poorer results compared to deep learning models like MoViNets and Conv+LSTMs, particularly in tasks requiring complex audio understanding.
+- **Scene Change Detection**: This approach involves detecting scene changes to segment the video into distinct segments or shots, based on, for example, the difference between the pixel values of two consecutive frames [6], and then applying the video classification model to the segmented frames. I have kept this as an optional approach because I think adding an additional step would cause unnecessary challenges.
+
+Choosing the Best Performing model
+===================================
+I will choose the best-performing model based on:
+
+1. **Evaluation metrics**
+
+- Accuracy: This metric provides a general overview of how well our model is performing. It's a good starting point for evaluating performance, but it might not be sufficient on its own, especially if the classes are imbalanced.
+- Precision and Recall: These metrics provide insights into the model's ability to minimize false positives (precision) and false negatives (recall). Since our problem involves binary classification, precision and recall are essential for understanding the model's performance on each class (commercial and non-commercial).
+- F1 Score: It provides a balanced measure of the model's performance. A high F1 score indicates that the model effectively balances precision and recall, ensuring both high accuracy and coverage in detecting commercials.
+
+2. **Real-time inferencing**
+
+- I will use OpenCV to evaluate the real-time performance of the models. This ensures that each model's performance is evaluated under conditions similar to those it will encounter in deployment.
+
+This code snippet illustrates how the model will be assessed in real time, focusing on both **detection accuracy** and **frames per second (FPS)** as the primary evaluation metrics.
+
+.. code-block:: python
+
+    # Import necessary libraries
+    from keras.models import load_model
+    from collections import deque
+    import numpy as np
+    import pickle
+    import cv2
+    import time
+
+    model = load_model("./our_trained_commercial_detection_model")      # Load the pre-trained Keras model
+    lb = pickle.load(open("./videoClassification.pickle", "rb"))        # Load the label binarizer used during training
+    mean = np.array([123.68, 116.779, 103.939], dtype="float32")        # Define the mean value for image preprocessing
+    Queue = deque(maxlen=128)                                           # Deque to store predictions of past frames
+    capture_video = cv2.VideoCapture("./example_clips/DemoVideo.mp4")   # Open the video file for reading
+    (Width, Height) = (None, None)                                      # Frame dimensions, filled on the first frame
+    ptime = 0                                                           # Timestamp of the previous frame (for FPS)
+
+    # Loop through each frame of the video
+    while True:
+        (taken, frame) = capture_video.read()    # Read a frame from the video
+
+        if not taken:                            # Break the loop if no more frames are available
+            break
+
+        if Width is None or Height is None:      # Get frame dimensions if not already obtained
+            (Height, Width) = frame.shape[:2]    # shape is (rows, cols), i.e. (height, width)
+
+        output = frame.copy()                    # Make a copy of the frame for display and post-processing
+
+        #---------------------------Pre-Processing--------------------------#
+
+        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)                          # Convert color space
+        frame = cv2.resize(frame, (sizeRequiredByModelHere)).astype("float32")  # Resize to the model's input size
+        frame -= mean                            # Subtract the mean value for normalization
+
+        #--------------------------model-inferencing-----------------------#
+
+        preds = model.predict(np.expand_dims(frame, axis=0))[0]  # Make predictions on the preprocessed frame
+        Queue.append(preds)                       # Append the predictions to the deque
+        results = np.array(Queue).mean(axis=0)    # Average the predictions from past frames
+        i = np.argmax(results)                    # Index of the class with the highest average prediction
+        label = lb.classes_[i]                    # Label corresponding to the predicted class
+
+        #---------------------------Post-Processing--------------------------#
+
+        if label == "commercial":                 # Apply Gaussian blur to the output frame if a commercial is detected
+            output = cv2.GaussianBlur(output, (99, 99), 0)
+
+        ctime = time.time()                       # Calculate and display FPS
+        fps = int(1 / (ctime - ptime))
+        cv2.putText(output, str(fps), (20, 200), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 255), 2)
+        ptime = ctime
+
+        cv2.imshow("In progress", output)         # Show the processed frame
+
+        key = cv2.waitKey(1) & 0xFF               # Handle user input (press 'q' to quit)
+        if key == ord("q"):
+            break
+
+    # Release resources when finished
+    capture_video.release()
+    cv2.destroyAllWindows()
+
+
+Model Execution on BeagleBone AI-64
+=====================================
+BeagleBone AI-64 Linux for Edge AI supports importing pre-trained custom models to run inference on the target. Moreover, Edge AI BeagleBone AI-64 images have TensorFlow Lite already installed with acceleration enabled.
+The Debian-based SDK makes use of pre-compiled DNN (Deep Neural Network) models and performs inference using various OSRTs (open-source runtimes) such as the TFLite runtime, ONNX runtime, etc.
+
+In order to infer a DNN, the SDK expects the DNN and associated artifacts in the directory structure below [7].
+
+.. code-block:: text
+
+    project_root
+    │
+    ├── param.yaml
+    │
+    ├── artifacts
+    │   ├── 264_tidl_io_1.bin
+    │   ├── 264_tidl_net.bin
+    │   ├── 264_tidl_net.bin.layer_info.txt
+    │   ├── 264_tidl_net.bin_netLog.txt
+    │   ├── 264_tidl_net.bin.svg
+    │   ├── allowedNode.txt
+    │   └── runtimes_visualization.svg
+    │
+    └── model
+        └── my_commercial_detection_model.tflite
+
+1. model: This directory contains the DNN targeted for inference.
+
+2. artifacts: This directory contains the artifacts generated after compilation of the DNN for the SDK.
+
+3. param.yaml: A configuration file in YAML format that provides basic information about the DNN and its associated pre- and post-processing parameters.
+
+Therefore, after choosing the model to be used in the GStreamer pipeline, I will generate the artifacts directory by following the instructions in the TexasInstruments:edgeai-tidl-tools examples [7].
+
+.. image:: Assets/Figure4.png
+   :alt: TFLite Runtime
+
+.. centered::
+   Figure 4: TFLite Runtime [7]
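+
+A minimal sketch of loading the compiled model on the board with the TFLite runtime is shown below. The TIDL
+delegate library name and the option keys follow TI's edgeai-tidl-tools examples and are assumptions that
+should be verified against the SDK version installed on the board.
+
+.. code-block:: python
+
+    import numpy as np
+    import tflite_runtime.interpreter as tflite
+
+    # Offload supported layers to the native accelerators through the TIDL delegate
+    # (library name and option key are assumptions based on TI's examples).
+    delegate = tflite.load_delegate("libtidl_tfl_delegate.so",
+                                    {"artifacts_folder": "project_root/artifacts"})
+
+    interpreter = tflite.Interpreter(
+        model_path="project_root/model/my_commercial_detection_model.tflite",
+        experimental_delegates=[delegate])
+    interpreter.allocate_tensors()
+
+    inp = interpreter.get_input_details()[0]
+    out = interpreter.get_output_details()[0]
+
+    # Placeholder input; in the pipeline this would be a preprocessed video frame.
+    frame = np.zeros(inp["shape"], dtype=inp["dtype"])
+    interpreter.set_tensor(inp["index"], frame)
+    interpreter.invoke()
+    scores = interpreter.get_tensor(out["index"])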
+
+GStreamer Pipeline
+===================
+The data flow in the GStreamer pipeline at a high level can be split into three parts [8]:-
+
+1. Input Pipeline - Grabs a frame from the input source.
+2. Output Pipeline - Sends the output to the display.
+3. Compute Pipeline - Performs pre-processing, inference and post-processing.
+
+I will create a GStreamer pipeline that receives input from an **HDMI source** and grabs it frame by frame. Each frame will be split into two paths.
+
+The "analytics" path normalizes the frame and resizes the input to match the resolution required to run the deep learning model.
+The "visualization" path is provided to the post-processing module, which performs the post-processing required by the model.
+If a commercial video is detected, we apply blurring to the video frames and replace the audio.
+If a non-commercial video is detected, the normal visualization process proceeds without blurring or replacing the audio.
+Post-processed output is then sent to display [8]. + +NNStreamer provides efficient and flexible data streaming for machine learning +applications, making it suitable for tasks such as running inference on video frames. +So, I will use NNStreamer elements to do inferencing of videos. + +.. image:: Assets/Figure5.png + :alt: GStreamer Pipeline + +.. centered:: + Figure 5: GStreamer Pipeline [8] + +The above GStreamer pipeline is a demo pipeline inspired from edge_ai_apps/data_flows [8] and there could be a few more changes to it depending upon our specific need. + +- **"hdmisrc"** element is used for capturing audio and video data from an HDMI source. +- **"videoconvert"** ensure proper format conversion for display. +- **"tiovxcolorconvert"** is used to perform color space conversion. +- **"tiovxmultiscaler"** is used to perform multi-scaling operations on video frames. It allows us to efficiently scale the input frames to multiple desired resolutions in a single step. +- **"tiovxdlpreproc"** is used to perform pre-processing of input data in deep learning inference pipelines using TIOVX (TI OpenVX) framework. +- **"kmssink"** is used for displaying video on systems. + +Project Workflow +=================== +.. image:: Assets/Figure6.png + :alt: Project Workflow + +.. centered:: + Figure 6: Project Workflow + +Software +========= + +- `Python <https://www.python.org/>`_ +- `C++ <https://isocpp.org/>`_ +- `TensorFlow <https://www.tensorflow.org/>`_ +- `TFLite <https://www.tensorflow.org/lite>`_ +- `GStreamer <https://gstreamer.freedesktop.org/>`_ +- `OpenCV <https://opencv.org/>`_ +- `Build Systems <https://www.gnu.org/software/make/>`_ + +Hardware +======== + +- Ability to capture and display video streams using `BeagleBone AI-64 <https://www.beagleboard.org/boards/beaglebone-ai-64>`_ + +**Timeline** +************* + + +Timeline summary +================= + +.. table:: + + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | Date | Activity | + +========================+========================================================================================================================================================+ + | February 26 - March 3 | Connect with possible mentors and request review on first draft | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | March 4 - March 10 | Complete prerequisites, verify value to community and request review on second draft | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | March 11 - March 20 | Finalized timeline and request review on final draft | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | March 21 - April 2 | Proposal review and Submit application | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | April 3 - May 1 | Understanding GStreamer pipeline and TFLite runtime of BeagleBone AI-64. 
| + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | May 2 - May 10 | Start bonding and Discussing implementation ideas with mentors. | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | May 11 - May 31 | Focus on college exams. | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | June 1 - June 3 | Start coding and introductory video | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | June 3 - June 9 | :ref:`milestone #1<Milestone1>` -> Releasing introductory video and developing Commercial dataset | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | June 10 - June 16 | :ref:`milestone #2<Milestone2>` -> Developing Non-Commercial dataset and dataset Preprocessing | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | June 17 - June 23 | :ref:`milestone #3<Milestone3>` -> Transfer learning and fine-tuning MoViNets architecture | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | June 24 - June 30 | :ref:`milestone #4<Milestone4>` -> Transfer learning and fine-tuning ResNet architecture | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | July 1 - July 7 | :ref:`milestone #5<Milestone5>` -> Evaluate performance metrics to choose the best-performing model. 
| + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | July 8 - July 14 | :ref:`Submit midterm evaluations <Submit midterm evaluation>` | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | July 15 - July 21 | :ref:`milestone #6<Milestone6>` -> Finalizing the best model by performing real-time inferencing | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | July 22 - July 28 | :ref:`milestone #7<Milestone7>` -> Compiling the model and generating artifacts and building pre-processing part of GStreamer pipeline | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | July 29 - August 4 | :ref:`milestone #8<Milestone8>` -> Building the compute pipeline using NNStreamer | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | August 5 - August 11 | :ref:`milestone #9<Milestone9>` -> Building the post-processing part of GStreamer pipeline | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | August 12 - August 18 | :ref:`milestone #10<Milestone10>` -> Enhancing real-time performance | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + | August 19 | :ref:`Submit final project video, submit final work to GSoC site and complete final mentor evaluation<Final project video>` | + +------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+ + +Timeline detailed +================== + +Community Bonding Period (May 1st - May 10th) +============================================== + +- Discuss implementation ideas with mentors. +- Discuss the scope of the project. + +.. _Milestone1: + +Milestone #1, Introductory YouTube video (June 3rd) +=================================================== + +- Making an Introductory Video. +- Commercial dataset acquisition: + - Web scrape videos marked as advertisements from YouTube 8-M dataset. + - Ensure proper labeling and categorization of commercial videos. + +.. _Milestone2: + +Milestone #2 (June 10th) +========================== + +- Non-commercial dataset acquisition: + - Web scrape random videos from other categories of YouTube 8-M dataset. + - Ensure diversity and relevance of non-commercial videos. +- Dataset preprocessing: + - Preprocess acquired datasets for suitability in deep learning models. + - Divide datasets into train, validation, and test sets. + - Perform random shuffling of data to maintain temporal dependencies. + +.. _Milestone3: + +Milestone #3 (June 17th) +========================= + +- Transfer learning and fine-tuning MoViNets architecture: + - Apply transfer learning on MoViNets and fine-tune its last few layers. 
+ - Train MoViNets on the prepared dataset for video classification. + +.. _Milestone4: + +Milestone #4 (June 24th) +========================== + +- Transfer learning and fine-tuning ResNet architecture: + - Adding additional layers of LSTMs for extracting temporal dependencies. + - Developing ResNet-LSTMs model architecture for video classification. + - Train the ResNet-LSTMs model on the prepared dataset. + +.. _Milestone5: + +Milestone #5 (July 1st) +======================== +- Finalize the best model: + - Save all trained models to local disk + - Evaluate performance metrics to choose the best-performing model. + + +.. _Submit midterm evaluation: + +Submit midterm evaluations (July 8th) +===================================== + +- Document the progress made during the first phase of the project. + +.. important:: + + **July 12 - 18:00 UTC:** Midterm evaluation deadline (standard coding period) + +.. _Milestone6: + +Milestone #6 (July 15th) +========================= + +- Finalize the best model: + - Perform real-time inference using OpenCV to determine the model that yields the best results with high-performance. + - Based on all the options tried in Phase 1, decide on the final model to be used in the GStreamer pipeline. + +.. _Milestone7: + +Milestone #7 (July 22nd) +========================= + +- Compile the chosen model and generate artifacts for TFLite runtime. +- Building the pre-processing part of GStreamer pipeline: + - Develop the pre-processing module to prepare video frames for inference. + +.. _Milestone8: + +Milestone #8 (July 29th) +========================= + +- Building the compute pipeline using NNStreamer: + - Implement NNStreamer for inferencing videos using the compiled model. + +.. _Milestone9: + +Milestone #9 (Aug 5th) +======================= + +- Building the post-processing part of GStreamer pipeline: + - Develop the post-processing module to perform actions based on classification results. + - Implement replacement or obscuring of commercial segments and audio substitution. + +.. _Milestone10: + +Milestone #10 (Aug 12th) +======================== + +- Enhancing real-time performance: + - Optimize the GStreamer pipeline for real-time performance using native hardware accelerators. + - Ensure smooth and efficient processing of video streams. + + +.. _Final project video: + +Final YouTube video (Aug 19th) +=============================== + +- Submit final project video, submit final work to GSoC site and complete final mentor evaluation. + +Final Submission (Aug 24nd) +============================ + +.. important:: + + **August 19 - 26 - 18:00 UTC:** Final week: GSoC contributors submit their final work + product and their final mentor evaluation (standard coding period) + + **August 26 - September 2 - 18:00 UTC:** Mentors submit final GSoC contributor + evaluations (standard coding period) + +Initial results (September 3) +============================= + +.. important:: + **September 3 - November 4:** GSoC contributors with extended timelines continue coding + + **November 4 - 18:00 UTC:** Final date for all GSoC contributors to submit their final work product and final evaluation + + **November 11 - 18:00 UTC:** Final date for mentors to submit evaluations for GSoC contributor projects with extended deadline + +Experience and approach +*********************** + +This project requires prior experience with machine learning, multimedia processing and embedded systems. 
+
+- As a good starting point for this project, I built a `Sports Video Classification model <https://github.com/AryanNanda17/VideoProcessing-Based-on-Video-classifcation>`_ and did **video processing on it based on video classification** using OpenCV (video processing based on video classification is an important part of this project). `Demo <https://youtu.be/hoKE2dr2nT4>`_ :fas:`external-link`
+- We will be building a pure C++ GStreamer pipeline from input to output, so experience with C++ codebases and build systems is required.
+   - Relevant contribution - `#123 <https://github.com/SRA-VJTI/Pixels_Seminar/pull/123>`_ (OpenCV/C++)
+- I have previously worked on the project `GestureSense <https://github.com/AryanNanda17/GestureSense/blob/master/GestureDetection/BgEliminationAndMotionDetection.py>`_, in which I did image processing based on image classification using OpenCV/Python.
+- I have past experience with the ESP32 microcontroller and have previously worked on the project `Multi-Code-Esp <https://github.com/AryanNanda17/multi_code_esp>`_, in which I built a multi-code ESP component.
+- Experience in open source:
+   - Contributed enhancements to the PyMC repository:
+      - `#7132 <https://github.com/pymc-devs/pymc/pull/7132>`_ (merged)
+      - `#7125 <https://github.com/pymc-devs/pymc/pull/7125>`_ (merged)
+   - Resolved one issue in the OpenCV repository (improved documentation):
+      - `#22177 <https://github.com/opencv/opencv/issues/22177>`_ (merged)
+- Contributions in `openbeagle.org/gsoc <https://openbeagle.org/gsoc/gsoc.beagleboard.io>`_
+   - Resolved PDF pageBreak issue - `#33 <https://openbeagle.org/gsoc/gsoc.beagleboard.io/-/merge_requests/33>`_ (merged)
+   - Added new idea - `#25 <https://openbeagle.org/gsoc/gsoc.beagleboard.io/-/merge_requests/25>`_ (merged)
+   - Improved documentation - `#23 <https://openbeagle.org/gsoc/gsoc.beagleboard.io/-/merge_requests/23#note_18477>`_ (merged)
+
+Contingency
+===========
+- If I get stuck on my project and my mentor isn't around, I will use the following resources:
+   - `MoViNets <https://www.tensorflow.org/hub/tutorials/movinet>`_
+   - `GStreamer Docs <https://gstreamer.freedesktop.org/>`_
+   - `BeagleBone AI-64 <https://docs.beagleboard.org/latest/boards/beaglebone/ai-64/01-introduction.html>`_
+   - `NNStreamer <https://nnstreamer.github.io/>`_
+- Moreover, the BeagleBoard community is extremely helpful and active in resolving doubts, which makes it a great source of guidance and clarification for the project.
+- I intend to remain involved and provide ongoing support for this project beyond the duration of the GSoC timeline.
+
+Benefit
+========
+
+This project will not only enhance the media consumption experience for users of BeagleBoard hardware but also serve as an educational resource on integrating AI and machine learning capabilities into embedded systems. It will provide valuable insights into:
+
+- The practical challenges of deploying neural network models in resource-constrained environments.
+- The development of custom GStreamer plugins for multimedia processing.
+- Real-world applications of machine learning in enhancing digital media experiences.
+ +Misc +==== + +- The PR Request for Cross Compilation: `#185 <https://github.com/jadonk/gsoc-application/pull/185>`_ +- Relevant Coursework: `Neural Networks and Deep Learning <https://www.coursera.org/account/accomplishments/verify/LKHTEA9XRWML>`_, `Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization <https://www.coursera.org/account/accomplishments/verify/E52UFAHAY5UG>`_, `Convolutional Neural Networks <https://www.coursera.org/account/accomplishments/verify/9L4QL25AEL3L>`_ + +References +*********** + +1. Youtube: `YouTube-8M: A Large and Diverse Labeled Video Dataset <https://research.google.com/youtube8m/>`_ +2. Dan Kondratyuk*, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong: `MoViNets: Mobile Video Networks for Efficient Video Recognition. <https://arxiv.org/pdf/2103.11511.pdf>`_ +3. Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell: `Long-term Recurrent Convolutional Networks for Visual Recognition and Description <https://arxiv.org/pdf/1411.4389.pdf>`_ +4. Anurag Arnab* Mostafa Dehghani* Georg Heigold Chen Sun Mario Luciˇ c´†Cordelia Schmid†: `ViViT: A Video Vision Transformer <https://arxiv.org/pdf/2103.15691.pdf>`_ +5. Nilesh M. Patil, Dr. Milind U. Nemade: `Content-Based Audio Classification and Retrieval: A Novel Approach <https://www.academia.edu/40346310/Content_Based_Audio_Classification_and_Retrieval_A_Novel_Approach>`_ +6. Igor Bieda, Anton Kisil, Taras Panchenko: `An Approach to Scene Change Detection <https://ieeexplore.ieee.org/document/9660887>`_ +7. TexasInstruments: `edgeai-tidl-tools <https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/examples/osrt_python/README.md>`_ +8. BeagleBone AI-64: `data-flows in edge_ai_apps <https://docs.beagleboard.org/latest/boards/beaglebone/ai-64/edge_ai_apps/data_flows.html>`_ diff --git a/proposals/himanshuk.rst b/proposals/himanshuk.rst new file mode 100644 index 0000000000000000000000000000000000000000..08eb00ebc70948bbd6e798d38d45a62bc355830c --- /dev/null +++ b/proposals/himanshuk.rst @@ -0,0 +1,399 @@ +.. _gsoc-proposal-Himanshu Kohale: + +Librobotcontrol support for newer boards +######################################## + +Introduction +************* + +Introducing librobotcontrol package support with newer boards. + +Summary links +============= + +- **Contributor:** `Himanshu Kohale <https://forum.beagleboard.org/u/ayush1325>`_ +- **Mentors:** `Jason Kridner <https://forum.beagleboard.org/u/jkridner>`_, `Deepak Khatri <https://forum.beagleboard.org/u/lorforlinux/summary>`_ +- **Code:** TBD +- **Documentation:** TBD +- **GSoC:** NA + +Status +======= + +This project is currently just a proposal. + +Proposal +======== + +Completed all the requirements listed on the `ideas page <https://gsoc.beagleboard.io/ideas/>`_ + +* Created accounts on `openbeagle <https://openbeagle.org/Himanshuk>`_ , `forum <https://forum.beagleboard.org/u/himanshuk/summary>`_ , `Discord <https://discord.com/users/869908108565168198>`_ +* Source dive and get to know with packages and examples. 
+* The code for the cross-compilation task was submitted through this `pull request <https://github.com/jadonk/gsoc-application/pull/191>`_
+* Proposal: `librobotcontrol support for newer boards <https://gsoc-beagleboard-io-himanshuk-afa51c0f037cce3ef5f7bf31158de2bf3.beagleboard.io/proposals/himanshuk.html>`_
+
+About
+=====
+
+- **Forum:** :fab:`discourse` `Himanshuk (Himanshu Kohale) <https://forum.beagleboard.org/u/himanshuk/summary>`_
+- **OpenBeagle:** :fab:`gitlab` `Himanshuk (Himanshu Kohale) <https://openbeagle.org/Himanshuk>`_
+- **Github:** :fab:`github` `Himanshukohale22 (Himanshu Kohale) <https://github.com/Himanshukohale22>`_
+- **School:** :fas:`school` `Veermata Jijabai Technological Institute <https://vjti.ac.in/>`_
+- **Country:** :fas:`flag` India
+- **Primary language:** :fas:`language` English, Hindi, Marathi
+- **Typical work hours:** :fas:`clock` 8AM-5PM Indian Standard Time
+- **Previous GSoC participation:** :fab:`google` N/A
+
+Project
+********
+
+**Project name:** librobotcontrol support for newer boards.
+
+Description
+============
+
+**Overview**
+
+- librobotcontrol is a C library package that contains examples and test programs for robot-control projects built around BeagleBone capes such as the `Robotics Cape <https://www.beagleboard.org/boards/beaglebone-robotics-cape>`_ sold by BeagleBoard.org. The BeagleBone Black (BBB, am33xx) supports the librobotcontrol package thanks to Deepak Khatri, who previously worked on the cape compatibility layer for the BBB. The BBB supports the Robotics Cape with the librobotcontrol package through device tree overlays that identify the Robotics Cape as specific hardware, which is useful when accessing the various peripherals and devices on the cape.
+
+- BeagleBone AI supports the librobotcontrol package, but the support is still in draft form and there is no stable device tree overlay for the Robotics Cape in the AI image. To use the package, one has to check which drivers pass their tests, use the cape with the AI, and make changes accordingly. BeagleBone AI is based on the Texas Instruments AM5729 dual-core Cortex-A15 SoC with flexible BeagleBone Black header and mechanical compatibility, and it uses the AM572x device tree binary files. The current implementation has these problems, and this proposal will address them.
+
+- BeagleBone AI-64 uses the TI J721E-family TDA4VM system-on-chip (SoC), which is part of the K3 multicore SoC architecture. Each TI EVM has a unique device tree binary file required by the kernel: just as the BeagleBone Black needs a (ti,am33xx) device tree binary (.dtb), BeagleBone AI-64 needs a (ti,j721e) one. BeagleBone AI-64 can also run the librobotcontrol package, but there are few tutorials and no refined code for supporting librobotcontrol on the newer boards. The device tree overlays need to be refined to use the librobotcontrol package with the AI-64, and since librobotcontrol has no Robotics Cape .dtb support for BeagleBone AI-64, we need to write the device tree overlays.
+
+- BeagleV®-Fire is a revolutionary SBC powered by Microchip's PolarFire® MPFS025T RISC-V System on Chip (SoC) with FPGA fabric. It has the same P8 and P9 cape header pins as the BeagleBone Black, allowing a BeagleBone cape to be stacked on top to expand its capability. It is built around the powerful and energy-efficient RISC-V instruction set architecture (ISA) along with a versatile FPGA fabric. BeagleV-Fire also supports the Robotics Cape with the librobotcontrol package, and cape gateware for the Robotics Cape is pre-installed in the V-Fire image `here <https://openbeagle.org/himanshuk/gateware/-/tree/main/sources/FPGA-design/script_support/components/CAPE/ROBOTICS?ref_type=heads>`_. However, while the BBB supports librobotcontrol with more functionality and flexible applications, BeagleV-Fire cannot yet make full use of the librobotcontrol package because of the small number of child nodes (PWM, GPIO) present in the Robotics Cape (.dts) file pre-installed in the V-Fire image. Customizing the cape gateware on the V-Fire, which provides more flexibility for the board with the cape, will allow the librobotcontrol package to support the V-Fire with the Robotics Cape.
+
+- The main goal of the project is to update the librobotcontrol package for the BeagleBone AI (am5x), BeagleBone AI-64 (j721e), and BeagleV-Fire (PolarFire SoC) boards. librobotcontrol nominally supports all of these boards, but it cannot yet use the Robotics Cape as flexibly as on the BBB.
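+
+Once the overlays and gateware expose the cape peripherals correctly, the same librobotcontrol programs should run across all of these boards. The snippet below is only an illustrative sketch modeled on the librobotcontrol examples; the exact headers and function names should be verified against the installed library version.
+
+.. code-block:: c
+
+    /* Illustrative librobotcontrol-style check: blink the green LED and pulse motor channel 1.
+     * Based on the librobotcontrol examples; verify names against the installed headers. */
+    #include <stdio.h>
+    #include <rc/led.h>
+    #include <rc/motor.h>
+    #include <rc/time.h>
+
+    int main(void)
+    {
+        if (rc_motor_init()) {                /* powers up the motor driver on the Robotics Cape */
+            fprintf(stderr, "failed to initialize motors\n");
+            return -1;
+        }
+        for (int i = 0; i < 5; i++) {
+            rc_led_set(RC_LED_GREEN, 1);      /* LED on */
+            rc_motor_set(1, 0.3);             /* 30% duty on motor channel 1 */
+            rc_usleep(500000);
+            rc_led_set(RC_LED_GREEN, 0);      /* LED off */
+            rc_motor_set(1, 0.0);
+            rc_usleep(500000);
+        }
+        rc_motor_cleanup();
+        return 0;
+    }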
+
+
+**Implementation**
+
+- Device tree overlays are data structures for describing hardware. Rather than hard-coding every detail of a device into an operating system, many aspects of the hardware can be described in a data structure that is passed to the OS at boot time.
+
+Implementation of a device tree:
+
+.. image:: https://devicetree-specification.readthedocs.io/en/stable/_images/graphviz-96d4d843650908846790f318227ab351de33e252.png
+   :alt: Device tree Implementation
+   :align: center
+
+- As in the example above, the root node begins the overall process of accessing hardware information. Below it, the CPUs, memory and various peripherals appear as nodes, and the sub-nodes are device nodes in which the information about the specific hardware using each peripheral is written.
+- Similar to the example above, the Robotics Cape needs UART, I2C, SPI and GPIO nodes for the librobotcontrol package.
+- Device tree overlays have to be written for the corresponding peripherals according to the cape's interface with the BeagleBoards.
+
+Below is a simple device tree overlay example for accessing the GPIOs to blink LEDs on header pins P8_7 and P8_8, which drive built-in LEDs on the Robotics Cape.
+
+.. code-block::
+
+    /dts-v1/;
+    /plugin/;
+
+    / {
+        compatible = "ti,beaglebone-AI64";   /* board the overlay applies to */
+        part-number = "LED_GPIO_TEST";
+        version = "00A0";
+
+        fragment@0 {
+            target = <&builtin_led_pins>;    /* placeholder pinmux node for the target board */
+            __overlay__ {
+                pinctrl-single,pins = <
+                    0x090 0x0F    /* P8.7, MODE7 */
+                    0x094 0x0F    /* P8.8, MODE7 */
+                >;
+            };
+        };
+    };
+
+- After writing the source file for the Robotics Cape, the source file for each board can be compiled using the device tree compiler:
+
+.. code-block:: bash
+
+    $ dtc -@ -I dts -O dtb -o robotic_cape.dtbo robotic-cape.dts
+
+- In the case of the BeagleV-Fire board, the cape gateware supports customization, which can be very useful here. A `Robotic_cape.dts <https://openbeagle.org/himanshuk/gateware/-/blob/main/sources/FPGA-design/script_support/components/CAPE/ROBOTICS/device-tree-overlay/robotics-cape.dtso?ref_type=heads>`_ source file is already present in the V-Fire image, but it is minimal, exposing only the PWM and GPIO peripherals, which is the bare minimum needed to use the Robotics Cape with librobotcontrol. So, for librobotcontrol use, the device tree in the cape gateware has to be updated.
+
+V-Fire gateware architecture:
+
+.. image:: https://docs.beagleboard.org/latest/_images/Gateware-Flow-simplified-overview.png
+   :alt: V-fire Gateware architecture
+   :align: center
+
+- The cape gateware in the V-Fire gateware architecture is responsible for handling the P8 and P9 connector signals.
The gateware is extended or customized by creating additional directories within the component directory of interest. + + +Previous work:- + +Previously Deepak khatri who worked upon the cape compatibility for beagleboards. use the robotic cape for various tasks. +using pre-work upon robotic cape, i can take a deep dive to robotic cape compatibility with BeaglBboneBlack (BBB) and how its works. +In previous gsoc-application 2022 participation kai yamada work upon same project which was about robotic-cape support with BeagleBone-AI (BB-AI). +In both projects implementation was about the device tree overlayes for BBB and AI for specific pheripherals to enabling functionality of PWM, I2C and SPI and UART for robotic-cape. + + +Software +========= + +- Device tree's overlays for beagleboards will be used.The project requires the use of the device tree compiler (dtc) for compiling the device tree source (ex. *.dts, *.dtsi) files. +- Primarily VScode and gitlab with web-IDE is use in this project for deep dive into code and firmware of librobotcontrol and rc (robot control library) examples. +- C language. + +Hardware +======== + +A list of hardware that you are going to use for this project. + +- `Beaglebone Black <https://www.digikey.in/en/products/detail/beagleboard-by-seeed-studio/102110420/12719590?cur=INR&lang=en&utm_adgroup=&utm_source=google&utm_medium=cpc&utm_campaign=PMax%20Shopping_Product_High%20ROAS&utm_term=&productid=12719590&utm_content=&utm_id=go_cmp-20122528480_adg-_ad-__dev-c_ext-_prd-12719590_sig-Cj0KCQjw8J6wBhDXARIsAPo7QA8aIQNqlJuRD5bNfrHXhCPfGk6LSU2nxmVaauLzHgc6BreuyUqskmEaAsJoEALw_wcB&gad_source=1&gclid=Cj0KCQjw8J6wBhDXARIsAPo7QA8aIQNqlJuRD5bNfrHXhCPfGk6LSU2nxmVaauLzHgc6BreuyUqskmEaAsJoEALw_wcB>`_ +- `BeagleBone-AI <https://www.digikey.in/en/products/detail/seeed-technology-co-ltd/102110362/10492208>`_ +- `Beaglebone AI 64 <https://www.digikey.in/en/products/detail/beagleboard-by-seeed-studio/102110646/15929655?cur=INR&lang=en&utm_adgroup=&utm_source=google&utm_medium=cpc&utm_campaign=PMax%20Shopping_Product_High%20ROAS&utm_term=&productid=15929655&utm_content=&utm_id=go_cmp-20122528480_adg-_ad-__dev-c_ext-_prd-15929655_sig-Cj0KCQjw8J6wBhDXARIsAPo7QA8OHJluOkNDsca6onRdfGL-SiAdurymvfiCgGq1_E1YqW2WvDsyjZYaAnUmEALw_wcB&gad_source=1&gclid=Cj0KCQjw8J6wBhDXARIsAPo7QA8OHJluOkNDsca6onRdfGL-SiAdurymvfiCgGq1_E1YqW2WvDsyjZYaAnUmEALw_wcB>`_ +- `BeagleV-fire <https://www.digikey.in/en/products/detail/beagleboard-by-seeed-studio/102110898/21706497>`_ +- Beaglebone-capes + - `Robotic cape <https://in.element14.com/beagleboard/bb-cape-robotics/robotics-cape-for-beaglebone-black/dp/2612581>`_ +- Additional hardware for project:- + - `Jumper cables <https://www.renaissancerobotics.com/JST_Jumper_Bundle.html>`_ :- + - 4-wire jst cables + - 6-wire jst cables + - `DC motors <https://www.sparkfun.com/products/13302>`_ + - `Servo motor <https://www.digikey.in/en/products/detail/900-00005/900-00005-ND/361277?WT.mc_id=IQ_7595_G_pla361277&wt.srch=1&wt.medium=cpc&WT.srch=1&gclid=CJz-qdC9n9ICFRO4wAodOjYLuQ>`_ + - `FTDI-TTL serial wire <https://www.adafruit.com/product/70>`_ + - `SD-card <https://www.amazon.in/SanDisk-Ultra-microSD-UHS-I-120MB/dp/B08L5FM4JC/ref=sr_1_3?dchild=1&keywords=64gb+sd+card&qid=1617689846&sr=8-3>`_ + - `power supply 12v <https://www.amazon.in/REES52-Adapter-Switch-Charger-Raspberry/dp/B07WJ34VJL>`_ +- Useful testing tools:- + - Oscilloscope + - Multimeter + - Soldering station + - Mechanical toolbox + + +Timeline +******** + +Timeline summary +================= + +.. 
table:: + + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | Date | Activity | + +========================+===============================================================================================================+ + | February 26 | Connect with possible mentors and request review on first draft | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | March 4 | Complete prerequisites, verify value to community and request review on second draft | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | March 11 | Finalized timeline and request review on final draft | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | March 21 | Submit application | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | May 1 | Start bonding | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | May 27 | Start coding and introductory video | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | June 3 | Release introductory video and complete milestone #1 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | June 10 | Complete milestone #2 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | June 17 | Complete milestone #3 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | June 24 | Complete milestone #4 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | July 1 | Complete milestone #5 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | July 8 | Submit midterm evaluations | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | July 15 | Complete milestone #6 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | July 22 | Complete milestone #7 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | July 29 | Complete milestone #8 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | August 5 | Complete milestone #9 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | August 12 | Complete milestone #10 | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + | August 19 | 
Submit final project video, submit final work to GSoC site and complete final mentor evaluation | + +------------------------+---------------------------------------------------------------------------------------------------------------+ + + +Timeline detailed +================= + +Community Bonding Period (May 1st - May 26th) +============================================== + +- Get to know with community, Read resources for librobotcontrol and beagleboards, get up to speed to begin working on the projects. +- At current period of time, all the required hardware will be available. +- Setup all the beagleboard hardware (Flashing OS and test hello world). +- Check all hardware with beagleboard like DC motors, Servo motors and available sensors. +- Use robotic-cape with beagleboard BeaglBboneBlack (BBB) and librobotcontrol. +- Use robotic-cape with BeagleBone-AI. +- Because my end-semester exams are starting, my availability for contributing will be limited. + +Coding begins (May 27th) +========================= + +- Given the onset of my end-semester exams, my ability to contribute will be reduced. Thank you for your understanding. +- Understand device tree overlays for BeaglBboneBlack (BBB) and AI written for robotic cape. + +Milestone #1, Introductory YouTube video (June 3rd) +=================================================== + +- Due to my End-semester Exam i won't able to contribute more, thank you for understanding. +- Include introductory video. + + +Milestone #2 (June 10th) +========================== + +- End of exam and I can start to contribute more in project. +- Start to write Device tree for GPIO's and PWM support for AI-64. +- Test a device tree overlay to allow AI-64 to light the power LEDs with GPIO support. +- Test PWM Device tree overlay with robotics-cape with help of Hardware specification and check with oscilloscope. +- Get feeback from mentors. + + +Milestone #3 (June 17th) +========================= + +- Write I2C node device tree for AI-64. +- Test I2C with IMU. +- Get feedback from mentor. + +Milestone #4 (June 24th) +========================== + +- Create merge request for I2C Device tree node. +- Write SPI device tree overlay for AI-64. +- Test with robotic-cape. +- Get feedback from mentor. + +Milestone #5 (July 1st) +======================== + +- Create RoboticsCape.dts file for robotic-cape which will support AI-64 using pre-work. +- Test .dts file with robotic cape with AI-64. +- Test example of librobotcontrol with AI-64. +- get feedback from mentor. +- Create merge request for RoboticsCape.dts. + +Submit midterm evaluations (July 8th) +===================================== + +.. important:: + + **July 12 - 18:00 UTC:** Midterm evaluation deadline (standard coding period) + +Milestone #6 (July 15th) +========================= + +- Test RoboticsCape with cape gateware for beagleV-fire pre-installed in image. +- Understand the customization process for cape Gateware. + +Milestone #7 (July 22nd) +========================= + +- Customized LED example for robotic-cape gateware. +- Test GPIO's, Robotic cape with beaglV-fire. +- Create merge request for LED blink with beaglV-fire. + +Milestone #8 (July 29th) +========================= + +- Examine SPI support for beagleV-fire with robotic-cape. +- Create I2C device tree to test barometer on robotic-cape. +- Create merge request for I2C support. +- Discuss results and features with mentor. 
+
+Milestone #9 (Aug 5th)
+========================
+
+- Test all of the preparatory librobotcontrol and Robotics Cape work with the BeagleV-Fire.
+- Upgrade the robotic_cape.dts file and gateware for the BeagleV-Fire using that earlier work.
+- Create documentation and get feedback from mentors.
+
+Milestone #10 (Aug 12th)
+==========================
+
+- Finalize the work on robotic-cape.dts for the BeagleV-Fire and test the librobotcontrol examples.
+- Create documentation for the current process.
+- Fix other bugs, typos, etc. found while documenting.
+
+Final YouTube video (Aug 19th)
+================================
+
+- Submit the final project video, submit the final work to the GSoC site and complete the final mentor evaluation.
+
+Final Submission (Aug 24th)
+=============================
+
+.. important::
+
+   **August 19 - 26 - 18:00 UTC:** Final week: GSoC contributors submit their final work
+   product and their final mentor evaluation (standard coding period)
+
+   **August 26 - September 2 - 18:00 UTC:** Mentors submit final GSoC contributor
+   evaluations (standard coding period)
+
+Initial results (September 3)
+===============================
+
+.. important::
+
+   **September 3 - November 4:** GSoC contributors with extended timelines continue coding
+
+   **November 4 - 18:00 UTC:** Final date for all GSoC contributors to submit their final work product and final evaluation
+
+   **November 11 - 18:00 UTC:** Final date for mentors to submit evaluations for GSoC contributor projects with extended deadline
+
+Experience and approach
+*************************
+
+Experience:
+
+- I am well experienced with embedded systems and C, with hands-on embedded programming and hardware design experience across various boards and projects.
+- Here are some projects that demonstrate my proficiency in embedded systems and robotics:
+
+  1. `Martian rover used in IRC (International Rover Challenge) <https://github.com/vishwaspace>`_
+     The Martian rover is a prototype of Curiosity, the NASA Mars rover, that performed functions like soil testing, sample collection and planet monitoring.
+     The project required embedded hardware and firmware design for motor control, arm control and configuration of the science sensors with ROS.
+  2. `STM32 custom board <https://github.com/Himanshukohale22/stm32-custom-board-v1.2>`_
+     The STM32 custom board was made to learn embedded programming and hardware design; it is an open-source development project.
+  3. `Vaayu – AQI and concentration calculation for gases present in air <https://github.com/Himanshukohale22/FYP_GreenSpace>`_
+     Vaayu is an air-quality monitoring device that measures the concentrations of different gases and displays them on a GUI and TFT display.
+  4. `TVC rocketry – Thrust vector control <https://github.com/Himanshukohale22/CYRUS>`_
+     TVC rocketry is a learning project about thrust-vector-controlled rockets, based on a PID implementation and sensor configuration.
+
+  More of my projects can be found on my `github <https://github.com/Himanshukohale22>`_.
+
+- I have designed various double- and four-layer boards for clients and projects using KiCad, Eagle and Altium Designer `(Designs) <github/Himanshu/my_designs>`_. This shows that I have a good understanding of reading schematics and of circuit design for embedded development, which is required for this project.
+
+Approach:
+
+In my experience, projects like this demand a comprehensive understanding of both software and hardware. Before changing the main packages, hardware setup and debugging will require more time than the software work. This involves meticulous reading of documentation and references, demanding patience and focus. I believe this work can be completed without any problems.
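+
+As a concrete example of this bring-up style, the smallest useful user-space check for the GPIO and LED work in Milestones #2 and #7 would look roughly like the snippet below. It is only a sketch: it assumes the standard librobotcontrol GPIO API (which takes the GPIOHANDLE_REQUEST_* flags from linux/gpio.h) and a hypothetical chip/line pair that the new overlay routes to a cape LED.
+
+.. code-block:: c
+
+   #include <stdio.h>
+   #include <rc/gpio.h>
+   #include <rc/time.h>
+
+   /* hypothetical controller/line pair routed to a cape LED by the new overlay */
+   #define LED_CHIP 1
+   #define LED_PIN  18
+
+   int main(void)
+   {
+           int i;
+
+           if (rc_gpio_init(LED_CHIP, LED_PIN, GPIOHANDLE_REQUEST_OUTPUT) < 0) {
+                   fprintf(stderr, "failed to claim gpio %d.%d\n", LED_CHIP, LED_PIN);
+                   return -1;
+           }
+           /* a visible blink confirms pinmux, direction and polarity in one go */
+           for (i = 0; i < 10; i++) {
+                   rc_gpio_set_value(LED_CHIP, LED_PIN, i % 2);
+                   rc_usleep(500000);
+           }
+           rc_gpio_cleanup(LED_CHIP, LED_PIN);
+           return 0;
+   }
+
+If the blink works, the same overlay can then be exercised with the stock librobotcontrol test programs before any changes are made to the library itself.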
+
+Contingency
+===========
+
+What will you do if you get stuck on your project and your mentor isn’t around?
+
+Unexpected software and hardware problems are common in any project. In such cases:
+
+1. If I encounter compatibility issues between a board and librobotcontrol, I will use the BeagleBone Black (BBB) for testing, as the BBB natively supports the librobotcontrol package.
+2. If there is a hardware issue with a board, I will first review the hardware's datasheets and manuals; if the issue is in the circuitry, I will debug it with an oscilloscope, a multimeter and other test equipment.
+3. If the problem is with the SoC, I will check the datasheet of that particular SoC.
+4. Here are a few references I can quickly consult for guidance while debugging:
+
+   - `librobotcontrol package Documentation <http://strawsondesign.com/docs/librobotcontrol/>`_
+   - `librobotcontrol github <https://github.com/beagleboard/librobotcontrol>`_
+   - `Getting started with beaglebone AI-64 <https://docs.beagleboard.org/latest/boards/beaglebone/ai-64/index.html>`_
+   - `Getting started with beagleV-fire <https://docs.beagleboard.org/latest/boards/beaglev/fire/index.html>`_
+   - Device tree: `github <https://github.com/Himanshukohale22/BeagleBoard-DeviceTrees>`_ , `example blog <https://www.beagleboard.org/blog/2022-02-15-using-device-tree-overlays-example-on-beaglebone-cape-add-on-boards>`_ , `FDT <https://devicetree-specification.readthedocs.io/en/stable/flattened-format.html>`_ , `ref <https://elinux.org/Device_Tree_Reference>`_ , `tutorial <https://octavosystems.com/app_notes/osd335x-design-tutorial/osd335x-lesson-2-minimal-linux-boot/linux-device-tree-overlay/>`_
+   - `Cape interface docs <https://elinux.org/Beagleboard:BeagleBone_cape_interface_spec#cite_note-2>`_
+   - `TDA4VM device tree <https://software-dl.ti.com/jacinto7/esd/processor-sdk-linux-sk-tda4vm/09_01_00/exports/docs/linux/Foundational_Components_Kernel_Users_Guide.html>`_
+   - `Validation scripts for understanding device trees <https://github.com/jadonk/validation-scripts>`_
+
+Benefit
+========
+
+If successfully completed, what will its impact be on the `BeagleBoard.org <https://www.beagleboard.org/>`_ community? Include quotes from `BeagleBoard.org <https://www.beagleboard.org/>`_
+community members who can be found on our `Discord <https://bbb.io/gsocchat>`_ and `BeagleBoard.org forum <https://bbb.io/gsocml/13>`_.
+
+* The librobotcontrol package will support the BeagleBone AI, BeagleBone AI-64 and BeagleV-Fire.
+* Various tutorials and documentation will be added for the Robotics Cape to help users understand how to use it with the librobotcontrol package.
+
+Misc
+====
+
+Please complete the requirements listed in the `General Requirements <https://gsoc.beagleboard.io/guides/contributor#general-requirements>`_. Provide a link to the merge request.
+
+- All prerequisite tasks have been completed:
+
+ * Source dive into the librobotcontrol package and read all of its documentation.
+ * Check the hardware specification, setup and device trees for the BBB.
+ * Here the 'Hello world' cross-compilation task Pull request : `merge request <https://github.com/jadonk/gsoc-application/pull/191>`_ diff --git a/proposals/images_support/BeaeglV-fire_gateware.png b/proposals/images_support/BeaeglV-fire_gateware.png new file mode 100644 index 0000000000000000000000000000000000000000..d4600c8534b62df2a2eaa7d2c94bb42c8e7a0031 Binary files /dev/null and b/proposals/images_support/BeaeglV-fire_gateware.png differ diff --git a/proposals/images_support/Device_tree.png b/proposals/images_support/Device_tree.png new file mode 100644 index 0000000000000000000000000000000000000000..448c27dd1570eb0ff14adf66d769deaf883bc76b Binary files /dev/null and b/proposals/images_support/Device_tree.png differ diff --git a/proposals/index.rst b/proposals/index.rst index 27f103da62b366abcbdef82d68ebd9881973082f..09b125d95b5fba87d065738488d5d8223d78ce34 100644 --- a/proposals/index.rst +++ b/proposals/index.rst @@ -1,6 +1,5 @@ -.. _proposals: +.. _gsoc-proposal-Himanshu kohale: -Proposals ######### .. tip:: @@ -10,6 +9,7 @@ Proposals .. toctree:: :hidden: - + suraj-sonawane + commercial_detection_and_replacement template