
My Blog


Introduction

With the rise of Deep Learning and Computer Vision comes the ability to build better autonomous vehicles. Drones are one such vehicle. Their applications include surveillance, delivery, precision agriculture, and weather forecasting. This project is one such application.
Built on Python-based Face Recognition and Tracking and a Convolutional Neural Network, the application gives the Drone autonomous flight abilities. There are two modes: Manual and Autonomous. Additional features include Normal, Sports, and Berserk modes for faster flight speeds, Flips (Forward, Backward, Left, and Right), Patrolling, Live Video Streaming, and autonomous Snapshots.


The Drones supported by my project are the DJI Tello and Tello Edu. Both have several features that make them good candidates for this project, such as:
  • Affordability
  • Relatively small size
  • Programmable with Python and Swift
  • Embedded camera
  • Intel processor for stable flight and turbulence reduction

fig 2. DJI Tello (on the right) and DJI Tello Edu (on the left)

The software tools and technologies used in this project are:
  • Programming and Markup Languages: Python, HTML, CSS
  • IDEs and Frameworks: Flask, Ajax, Anaconda, Jupyter Notebook, PyCharm
  • Python Libraries: OpenCV 4, numpy, Haar cascade XML file for Face Recognition, FFMPEG, logging, socket, threading, sys, time, contextlib

Methodology

I started by writing a program that streams video from the Drone to my laptop.


fig 3. Network Backend of the Application

There are two basic network streams: a two-way (full-duplex) connection for sending commands from the laptop to the drone and receiving its responses, and a one-way connection from the drone to the laptop for video streaming. Once the connection with live streaming was stable enough, I ran OpenCV on the streamed video for Facial Recognition. The next step was to make the drone's flight follow the movement of my face. I used Coordinate Geometry to achieve this.
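The command channel can be sketched with Python's socket module. This is a minimal sketch, not the project's actual code: the address 192.168.10.1:8889 and the "command" / "streamon" strings are taken from the public Tello SDK documentation as I understand it, and the helper name send_command is hypothetical.

```python
import socket

# Assumption: address and port of the drone's command channel, per the Tello SDK.
TELLO_ADDR = ("192.168.10.1", 8889)


def send_command(sock, command, addr=TELLO_ADDR, timeout=5.0):
    """Send one text command over UDP and return the text reply, or None on timeout."""
    sock.settimeout(timeout)
    sock.sendto(command.encode("utf-8"), addr)
    try:
        reply, _ = sock.recvfrom(1024)
    except socket.timeout:
        return None  # UDP gives no delivery guarantee, so a missing reply is normal
    return reply.decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # "command" switches the drone into SDK mode; "streamon" asks it to
        # start sending video to the laptop on a separate one-way UDP stream.
        print(send_command(sock, "command"))
        print(send_command(sock, "streamon"))
```

Because both channels are UDP, a lost packet is simply lost, which is why the sketch treats a timeout as a normal outcome rather than an error.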

fig 4. Coordinate Geometry behind the Follow Algorithm
The Follow Algorithm
  1. The pixel layout of a video frame starts at (0,0) in the top-left corner and extends to the frame's resolution ((1280,720) for HD or (3840,2160) for 4K). We use this to form an x-y pixel graph of the video.
  2. In this graph, I started by drawing a Rectangle fixed to the frame of the video. Then I placed a fixed point at the centroid of this rectangle. This point is a static point of reference for the other, moving points on the screen.
  3. The second point, which is dynamic, is the centroid of the rectangle containing all the faces detected in the video.
  4. The locations of these points relative to each other determine the command to be sent to the Drone. There are two basic values we need to send to the drone: the direction and the speed of the movement.
  5. The Direction in which the drone flies is determined by the position of the dynamic point (centroid of the Rectangle enclosing the detected faces) relative to the static point (centroid of the static Rectangle enclosing the frame).
  6. The Speed of the drone is directly proportional to the distance between these two points.
    The result looked like this:
    fig 5. Implementation of the Algorithm using Python
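The steps of the Follow Algorithm can be sketched in plain Python. This is an illustration under assumptions, not the code from the project: the function names, the (x, y, w, h) face-box format (the format OpenCV's cascade detector returns), the dead zone, and the four direction labels are all hypothetical choices.

```python
import math


def enclosing_centroid(faces):
    """Centroid of the smallest rectangle enclosing all (x, y, w, h) face boxes."""
    left = min(x for x, y, w, h in faces)
    top = min(y for x, y, w, h in faces)
    right = max(x + w for x, y, w, h in faces)
    bottom = max(y + h for x, y, w, h in faces)
    return ((left + right) / 2, (top + bottom) / 2)


def follow_command(frame_size, faces, max_speed=100, dead_zone=20):
    """Return (direction, speed) from the offset between the two centroids."""
    if not faces:
        return ("hover", 0)
    width, height = frame_size
    cx, cy = width / 2, height / 2            # static point: centre of the frame
    fx, fy = enclosing_centroid(faces)        # dynamic point: centre of the faces
    dx, dy = fx - cx, fy - cy
    distance = math.hypot(dx, dy)
    if distance <= dead_zone:                 # close enough: do not move
        return ("hover", 0)
    # Direction: follow the dominant axis. Image y grows downward.
    if abs(dx) >= abs(dy):
        direction = "right" if dx > 0 else "left"
    else:
        direction = "down" if dy > 0 else "up"
    # Speed: proportional to the distance, scaled so a corner maps to max_speed.
    speed = round(max_speed * distance / math.hypot(cx, cy))
    return (direction, speed)


print(follow_command((960, 720), [(860, 310, 100, 100)]))  # ('right', 72)
```

Depending on whether the video is mirrored, the left/right mapping may need to be swapped before the command is sent to the drone.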


    I then set up a local web server using Flask to provide a UI for the project. At this point, the backend was pretty much ready. The next step was creating a cool UI/UX for my application, for which I used HTML and CSS.

    Results

    This video shows me moving around a small room and the drone following me.

    Test 1:



     Test 2:
     
     
    The drone was able to track my face and follow me well. There is visible latency, which better hardware could likely reduce.

    Future Work

    Although the Drone-Follow algorithm works well, there are some things that could really improve the application:
    • Integrating C++ to improve latency and response time, bypassing the comparatively slow Python I/O. I've also been seeing significantly improved response times with Intel's distribution of Python, so I will consider using that too.
    • The DJI camera is good enough to track faces, but there is always scope for more functionality such as more range, better resolution, zooming capabilities, and depth sensing.
    • Upgrading the WiFi card in the drone would likely reduce the transmission latency, especially in local networks.
    • I'm curious to see the results with other Computer Vision libraries such as Dlib or YOLO.
    • Improved battery life for the Drone.
    Finally, you can find my project in my Github repo here.  
    I would also like to thank the awesome developers of the online video editors: EZGif and Kapwing


    Data Structures and Algorithms are two of the core Computer Science subjects. Data Structures, as the name suggests, deal with storing data in the most efficient manner with respect to both Space and Time Complexity. Algorithms, on the other hand, provide a step-by-step process for executing instructions to achieve the desired output. From a Computer Science perspective, any program has the following timeline:
    'INPUT' -> 'PROCESS' -> 'OUTPUT'

    It is relatively easy for us to be aware of the 'INPUT' we provide to the program and the 'OUTPUT' we receive from it, compared to the 'PROCESS' part of the program. Implementing the 'PROCESS' through complex data structures and algorithms can be challenging for anyone. Visualizing such concepts enhances our understanding and learning experience by letting us see our code in action. This is the basis of my application:

    AlgoWiz - 'Enhancing the understanding of complex Data Structures and Algorithms through Visualization'.

    AlgoWiz is a Multi-Platform Application that allows users to visualize various Data Structures and Algorithms. It has a user-friendly GUI that visualizes algorithm execution on Random Trees/Graphs for users with little or no experience, and on Custom, more Complex Data Structures for intermediate and more experienced users.
    To build auto-generated Trees and Graphs, users enter the Number of Nodes and create the Data Structure as follows:

    fig.1 Auto-Generated Tree based on the Number of Nodes

    fig.2 Auto-Generated Graph based on the Number of Nodes
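One simple way to auto-generate a tree from a Number of Nodes is to attach each new node to a randomly chosen earlier node. This is only a sketch of that idea: the function name random_tree and the edge-list output are hypothetical, and AlgoWiz may well generate its trees differently.

```python
import random


def random_tree(num_nodes, seed=None):
    """Return the (parent, child) edges of a random tree on nodes 0..num_nodes-1."""
    rng = random.Random(seed)
    edges = []
    for child in range(1, num_nodes):
        # Picking the parent among already-placed nodes keeps the structure
        # connected and makes a cycle impossible, so the result is a tree.
        parent = rng.randrange(child)
        edges.append((parent, child))
    return edges


print(random_tree(6, seed=42))
```

A tree with n nodes always has n - 1 edges, which gives a quick sanity check on the output.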

    The application implements the following Data Structures:

    • Singly Linked Lists: Fundamental uni-directional, linear way of storing data.
    fig.3 Singly Linked List

    • Doubly Linked Lists: Bi-directional, Linear Way of Storing data.
    fig.4 Doubly Linked List

    • Stacks: Last In, First Out Data Structure.
    fig.5 Stack

    • Queues: First In, First Out Data Structure.
    fig.6 Queue

    • Trees: AlgoWiz includes Binary Tree (BT), Complete BT, Full BT, etc.
    fig.7  Binary Tree

    • Graphs: The application allows us to generate a Simple Graph, Multi-Graph, Pseudo-Graph, and Weighted Graph.
    fig.8 Weighted Graph
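The difference between a Stack and a Queue is easiest to see by removing the same items from both. A minimal sketch using only the standard library (the function names are hypothetical and not taken from AlgoWiz):

```python
from collections import deque


def stack_order(items):
    """Push every item on a stack, then pop them all: Last In, First Out."""
    stack = []
    for item in items:
        stack.append(item)          # push on top
    return [stack.pop() for _ in range(len(stack))]   # pop from the top


def queue_order(items):
    """Enqueue every item, then dequeue them all: First In, First Out."""
    queue = deque()
    for item in items:
        queue.append(item)          # enqueue at the back
    return [queue.popleft() for _ in range(len(queue))]  # dequeue from the front


print(stack_order([1, 2, 3]))  # [3, 2, 1]
print(queue_order([1, 2, 3]))  # [1, 2, 3]
```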

    The Algorithms included in the Application are:

    • In-order, Pre-order, Post-order, and Level-order Traversals: These algorithms traverse a tree, more specifically a Binary Tree. A Binary Tree node consists of a Value, a Pointer to the Left Child, and a Pointer to the Right Child.
    class Node:
        # A Binary Tree node: a value plus links to the left and right children.
        def __init__(self, val):
            self.left = None    # left child (None when absent)
            self.value = val    # data stored in the node
            self.right = None   # right child (None when absent)


      • The in-order sequence of traversal follows : Left-> Root-> Right
    fig.9 In-Order Traversal

      • The pre-order sequence of traversal follows: Root-> Left-> Right
    fig.10 Pre-order Traversal


      • The post-order sequence of traversal follows: Left-> Right-> Root
    fig.11 Post-order Traversal

      • The Level order sequence of traversal follows: All the nodes in Level 0, 1, 2...,n
    fig.12 Level-Order Traversal
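The four traversal orders can be compared side by side on one small tree. This is a minimal, self-contained sketch rather than the code inside AlgoWiz; the Node constructor here takes optional children purely to keep the example short.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):      # Left -> Root -> Right
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def pre_order(node):     # Root -> Left -> Right
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def post_order(node):    # Left -> Right -> Root
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


def level_order(root):   # Level 0, then level 1, and so on, using a queue
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
    return order


#        1
#       / \
#      2   3
#     / \
#    4   5
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(in_order(tree))     # [4, 2, 5, 1, 3]
print(pre_order(tree))    # [1, 2, 4, 5, 3]
print(post_order(tree))   # [4, 5, 2, 3, 1]
print(level_order(tree))  # [1, 2, 3, 4, 5]
```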


    The following example from the application visualizes a traversal:
    fig.13 Level-order Traversal Visualization in AlgoWiz



    • Breadth-First Search and Depth-First Search:  
    The Breadth-First Search Algorithm grows wide, layer by layer. Starting from the source, BFS checks all the neighbors in the first layer, then all the neighbors in the second layer, and so on, until it finds the element. Below is the representation of this algorithm in a Weighted Graph Data Structure.

    fig.14 Breadth-First Search of a Weighted Graph that searches 5

    The Depth-First Search Algorithm, on the other hand, grows deep. Starting from the source, DFS recursively follows one neighbor, then that neighbor's neighbor, and so on, backtracking only when it reaches a dead end.

    fig.15 Depth-First Search of a Weighted Graph that searches 5
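The two searches differ only in which node they expand next, which shows up in the order of visited nodes. Below is a minimal sketch on a small undirected graph stored as an adjacency dictionary; the graph and the function names are made up for illustration, and edge weights are left out because neither search uses them.

```python
from collections import deque


def bfs(graph, source, target):
    """Visit nodes layer by layer; return the visit order up to the target."""
    order, seen, queue = [], {source}, deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        if node == target:
            break
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)      # mark when queued so it is queued once
                queue.append(neighbor)
    return order


def dfs(graph, source, target):
    """Go as deep as possible first; return the visit order up to the target."""
    order, seen = [], set()

    def visit(node):
        seen.add(node)
        order.append(node)
        if node == target:
            return True
        # any() stops at the first branch that reaches the target.
        return any(visit(nb) for nb in graph[node] if nb not in seen)

    visit(source)
    return order


graph = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 5], 5: [3, 4]}
print(bfs(graph, 1, 5))  # [1, 2, 3, 4, 5]
print(dfs(graph, 1, 5))  # [1, 2, 4, 5]
```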

    • Dijkstra's Shortest Path:
    Dijkstra's Shortest Path Algorithm finds the minimum-weight path between two nodes in a Weighted Graph with non-negative edge weights.
    fig.16 Shortest Path Algorithm
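A common way to implement Dijkstra's algorithm is with a priority queue from Python's heapq module. The sketch below is one such version, not necessarily the one AlgoWiz uses; the example graph and the function name are hypothetical, and all edge weights are assumed to be non-negative.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total_weight, path) for the lightest path, or None if unreachable."""
    dist = {source: 0}          # best known distance to each node
    prev = {}                   # predecessor on the best known path
    done = set()                # nodes whose distance is final
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue            # stale entry left over from an earlier update
        done.add(node)
        if node == target:
            break
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:   # walk the predecessors back to the source
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}
print(dijkstra(graph, "A", "D"))  # (4, ['A', 'C', 'B', 'D'])
```

In this example the direct edge A-B (weight 4) loses to the detour A-C-B (weight 3), which is why the path goes through C.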


    • Prim's and Kruskal's Minimum Spanning Tree:
    A Spanning Tree of a connected weighted Graph is a subgraph that includes all of its vertices and connects them without forming a Cycle. A Graph can have many Spanning Trees. A Minimum Spanning Tree is one where the sum of the weights of the tree's edges is the least among all the spanning trees of the Graph. Given a weighted Graph, there are two main algorithms to find its Minimum Spanning Tree: Prim's and Kruskal's MST Algorithms. AlgoWiz includes visualizations for both of these algorithms.

    Prim's Algorithm:
    fig.17 Prim's Algorithm to find the Minimum Spanning Tree of the Graph
    Kruskal's Algorithm:
    fig.18 Kruskal's Algorithm to find the Minimum Spanning Tree of the Graph
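Both algorithms can be sketched in a few lines each. Prim's grows a single tree outward from a start node, while Kruskal's adds the lightest remaining edge that does not close a Cycle. The code below is an illustrative version with a made-up graph and hypothetical function names, not the implementation inside AlgoWiz; it assumes a connected, undirected graph given as (weight, u, v) edges.

```python
import heapq


def prim(nodes, edges, start):
    """Grow the tree from start, always taking the lightest edge leaving it."""
    adjacent = {n: [] for n in nodes}
    for w, u, v in edges:
        adjacent[u].append((w, u, v))
        adjacent[v].append((w, v, u))
    seen, tree = {start}, []
    heap = list(adjacent[start])
    heapq.heapify(heap)
    while heap and len(seen) < len(nodes):
        w, u, v = heapq.heappop(heap)
        if v in seen:
            continue                      # edge no longer leaves the tree
        seen.add(v)
        tree.append((w, u, v))
        for edge in adjacent[v]:
            if edge[2] not in seen:
                heapq.heappush(heap, edge)
    return tree


def kruskal(nodes, edges):
    """Take edges from lightest to heaviest, skipping any that close a cycle."""
    parent = {n: n for n in nodes}

    def find(n):                          # root of n's component (union-find)
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for w, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:              # different components: no cycle
            parent[root_u] = root_v
            tree.append((w, u, v))
    return tree


nodes = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (3, "A", "C"), (2, "B", "C"), (4, "B", "D"), (5, "C", "D")]
print(sum(w for w, _, _ in prim(nodes, edges, "A")))   # 7
print(sum(w for w, _, _ in kruskal(nodes, edges)))     # 7
```

The two algorithms may pick edges in a different order, but the total weight of the resulting Minimum Spanning Tree is the same.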

    Technologies used in the Application

    The Application is written in Python (specifically Python 3.7) and uses the following libraries and tools:
    • Tkinter: For the GUI Framework
    • Matplotlib, Pylab, and NetworkX for Graphs and Plotting
    • IDEs: Jupyter Notebook, Visual Studio, and Visual Studio Code
    • PyInstaller for packaging the Python files together
    • Agile Methodology: Scrum and Task Boards for Project Management
    • Additional Project Management Tools: Todoist and Trello
    The Animation and Graph Integration was done through a custom-developed algorithm written in plain built-in Python syntax.

    Future Work

    All the data structures and algorithms in the current version of the application work well. Upcoming iterations of the application would include more advanced data structures such as Segment Trees, Red-Black Trees, and Heaps, and algorithms for Searching, Multipath Finding, Scheduling Problems, Finding the Perfect Match, etc.
    Latency is a well-known limitation of Python programs. In this application it shows up while unpacking the application and when switching between windows, which is a limitation of PyInstaller. I also tried Py2exe for packaging, and it showed somewhat similar results.

    Finally, you can find the project in my Github Repository linked here. Thanks for checking out my project. I would love to hear your thoughts or any feedback on the application. You can also reach me through my email-id: akshatbajpai.biz@gmail.com

    Contact Me

    Want to join the squad? Feel free to reach out to me anytime! Here's my contact info: