Commit: showing 10 changed files with 133 additions and 17 deletions.
---
title: MaixCAM MaixPy Self-Learning Detection Tracker
---

## MaixPy Self-Learning Detection Tracker

Similar to the self-learning classifier, this tracker does not require training: simply draw a box around the target object, and the system will detect and track it, which makes it very useful in simple detection scenarios. Unlike the self-learning classifier, the detection tracker also gives the coordinates and size of the object.

<video playsinline controls autoplay loop muted preload src="/static/video/self_learn_tracker.mp4" style="width: 100%; min-height: 20em;"></video>

## Using the Self-Learning Detection Tracker in MaixPy

MaixPy currently offers a single-target learning detection tracking algorithm: you select the target object once by drawing a box, and the tracker then follows it continuously. The algorithm used here is [NanoTrack](https://github.com/HonglinChu/SiamTrackers/tree/master/NanoTrack); if you are interested in the underlying principles, you can study it on your own.

You can flash the latest system image (>=2024.9.5_v4.5.0) and use the built-in self-learning tracking application directly to see the results.

To use it, call the `maix.nn.NanoTrack` class: after initializing the object, first call the `init` method to specify the target to be detected, then call the `track` method repeatedly to keep tracking the target. Below is a simplified code example:
```python
from maix import nn

model_path = "/root/models/nanotrack.mud"
tracker = nn.NanoTrack(model_path)

# img: a maix.image.Image frame; x, y, w, h: the target box drawn by the user
tracker.init(img, x, y, w, h)

# Call track on every new frame; pos holds the tracked target's position
pos = tracker.track(img)
```
Note that this uses a built-in model, already present in the system at `/root/models`. You can also download the model from the [MaixHub model library](https://maixhub.com/model/zoo/437).
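To show where `img` and the target box come from, here is a minimal end-to-end sketch. It assumes a fixed initial box instead of an interactive selection (the full example referenced below lets the user draw the box on the touchscreen), and the box coordinates are placeholder values:

```python
from maix import camera, display, app, nn, image

tracker = nn.NanoTrack("/root/models/nanotrack.mud")

# Match the camera output format to what the model expects
disp = display.Display()
cam = camera.Camera(disp.width(), disp.height(), tracker.input_format())

# Assumption: the target starts in this region; a real app would let
# the user select it on screen (see the full example below)
img = cam.read()
tracker.init(img, 100, 100, 80, 80)

while not app.need_exit():
    img = cam.read()
    r = tracker.track(img)  # r carries x, y, w, h and a confidence score
    img.draw_rect(r.x, r.y, r.w, r.h, image.Color.from_rgb(255, 0, 0), 4)
    img.draw_string(r.x, r.y - 16, f"{r.score:.2f}", image.Color.from_rgb(255, 0, 0), 1.5)  # 16 px offset is arbitrary
    disp.show(img)
```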
For more detailed code, refer to [MaixPy/examples/vision/self_learn_tracker.py](https://github.com/sipeed/MaixPy/tree/main/examples/vision/self_learn_tracker.py).
## Other Self-Learning Tracking Algorithms

Currently, the NanoTrack algorithm is implemented. It is highly stable and reliable in simple scenarios and provides a sufficient frame rate. Its limitations: because it searches around the target's last known position, an object that leaves the view must come back near where it disappeared to be picked up again, and it can only track one target at a time.
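If you need a rough notion of "target lost", one option is to watch the confidence score returned by `track` and fall back to re-selection when it stays low for a while. This is only a sketch under assumptions: the threshold and frame count are arbitrary values to tune, not library constants:

```python
from maix import camera, display, app, nn, image

tracker = nn.NanoTrack("/root/models/nanotrack.mud")
disp = display.Display()
cam = camera.Camera(disp.width(), disp.height(), tracker.input_format())

# Assumption: fixed starting box; a real app would use touchscreen selection
img = cam.read()
tracker.init(img, 100, 100, 80, 80)

SCORE_LOST = 0.3   # assumed cutoff, not a library constant; tune per scene
LOST_FRAMES = 30   # consecutive weak frames before declaring the target lost
lost = 0

while not app.need_exit():
    img = cam.read()
    r = tracker.track(img)
    lost = lost + 1 if r.score < SCORE_LOST else 0
    if lost > LOST_FRAMES:
        print("Target lost, please re-select it")  # e.g. jump back to a selection UI
        break
    img.draw_rect(r.x, r.y, r.w, r.h, image.Color.from_rgb(255, 0, 0), 2)
    disp.show(img)
```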
If you have a better algorithm, you can implement it by referring to the existing NanoTrack implementation; discussion and code PRs are welcome.
The Chinese version of the document, with the equivalent changes:

---
title: MaixCAM MaixPy Self-Learning Detection Tracker
---

## MaixPy Self-Learning Detection Tracker

Similar to the self-learning classifier, no training is required: simply draw a box around the target object and it can be detected and tracked, which is very useful in simple detection scenarios. Unlike the self-learning classifier, since it is a detector, it also gives the object's coordinates and size.

<video playsinline controls autoplay loop muted preload src="/static/video/self_learn_tracker.mp4" style="width: 100%; min-height: 20em;"></video>

## Using the Self-Learning Detection Tracker in MaixPy

MaixPy currently provides a single-target learning detection tracking algorithm: you draw a box around the target object at the start, and it keeps tracking that object afterwards. The algorithm used here is [NanoTrack](https://github.com/HonglinChu/SiamTrackers/tree/master/NanoTrack); if you are interested in the principles, you can study it on your own.

You can flash the latest system image (>=2024.9.5_v4.5.0) and directly use the built-in self-learning tracking application to see the effect.

Use the `maix.nn.NanoTrack` class: after initializing the object, first call the `init` method to specify the target to detect, then call the `track` method to continuously track the target. Simplified code below:
```python
from maix import nn

model_path = "/root/models/nanotrack.mud"
tracker = nn.NanoTrack(model_path)

# img: a maix.image.Image frame; x, y, w, h: the target box drawn by the user
tracker.init(img, x, y, w, h)

# Call track on every new frame; pos holds the tracked target's position
pos = tracker.track(img)
```
Note that this uses the built-in model, already present in the system under `/root/models`; you can also download the model from the [MaixHub model library](https://maixhub.com/model/zoo/437).

For the full code, see [MaixPy/examples/vision/self_learn_tracker.py](https://github.com/sipeed/MaixPy/tree/main/examples/vision/self_learn_tracker.py).
## Other Self-Learning Tracking Algorithms

Currently, the NanoTrack algorithm is implemented. It is very stable and reliable in simple scenarios, and the frame rate is high enough. The drawbacks: an object that leaves the field of view must come back near where it last disappeared to be detected again, and only one target can be detected at a time.

If you have a better algorithm, you can implement it by referring to the existing NanoTrack implementation; discussion and code PRs are welcome.
Binary file not shown.
New file [MaixPy/examples/vision/self_learn_tracker.py](https://github.com/sipeed/MaixPy/tree/main/examples/vision/self_learn_tracker.py):
```python
from maix import camera, display, app, time, nn, touchscreen, image

# Initialize variables
model_path = "/root/models/nanotrack.mud"
tracker = nn.NanoTrack(model_path)
print(f"Load NanoTrack model {model_path} success")

disp = display.Display()
touch = touchscreen.TouchScreen()
cam = camera.Camera(disp.width(), disp.height(), tracker.input_format())
print("Open camera success")

status = 0  # 0: select target box, 1: tracking
pressing = False
target = nn.Object()
btn_str = "Select"
font_size = image.string_size(btn_str)
img_back = image.load("/maixapp/share/icon/ret.png")
back_rect = [0, 0, img_back.width(), img_back.height()]

def is_in_button(x, y, btn_pos):
    # btn_pos is [x, y, w, h]
    return btn_pos[0] < x < btn_pos[0] + btn_pos[2] and btn_pos[1] < y < btn_pos[1] + btn_pos[3]

while not app.need_exit():
    img = cam.read()
    touch_status = touch.read()
    if status == 0:  # Selecting target
        if touch_status[2]:  # Finger press detected
            if not pressing:
                target.x = touch_status[0]
                target.y = touch_status[1]
                print("Start select")
                pressing = True
        else:
            if pressing:  # Finger released, finalize selection
                target.w = touch_status[0] - target.x
                target.h = touch_status[1] - target.y
                if target.w > 0 and target.h > 0:
                    print(f"Init tracker with rectangle x: {target.x}, y: {target.y}, w: {target.w}, h: {target.h}")
                    tracker.init(img, target.x, target.y, target.w, target.h)
                    print("Init tracker ok")
                    status = 1
                else:
                    print(f"Rectangle invalid, x: {target.x}, y: {target.y}, w: {target.w}, h: {target.h}")
                pressing = False
        if pressing:
            img.draw_string(2, img.height() - font_size[1] * 2, "Select and release to complete", image.Color.from_rgb(255, 0, 0), 1.5)
            img.draw_rect(target.x, target.y, touch_status[0] - target.x, touch_status[1] - target.y, image.Color.from_rgb(255, 0, 0), 3)
        else:
            img.draw_string(2, img.height() - font_size[1] * 2, "Select target on screen", image.Color.from_rgb(255, 0, 0), 1.5)
    else:  # Tracking
        if touch_status[2]:  # Finger pressed while tracking
            pressing = True
        else:
            if pressing and is_in_button(touch_status[0], touch_status[1], [disp.width() - 100, disp.height() - 60, 100, 60]):
                status = 0  # "Select" button released: return to selection mode
            pressing = False
        # Track the target in the new frame; r carries the box, score and raw points
        r = tracker.track(img)
        img.draw_rect(r.x, r.y, r.w, r.h, image.Color.from_rgb(255, 0, 0), 4)
        img.draw_rect(r.points[0], r.points[1], r.points[2], r.points[3], image.Color.from_rgb(158, 158, 158), 1)
        img.draw_rect(r.points[4] - r.points[7] // 2, r.points[5] - r.points[7] // 2, r.points[7], r.points[7], image.Color.from_rgb(158, 158, 158), 1)
        img.draw_string(r.x, r.y - font_size[1] - 2, f"{r.score:.2f}", image.Color.from_rgb(255, 0, 0), 1.5)
        # Draw the "Select" button in the bottom-right corner
        img.draw_rect(disp.width() - 100, disp.height() - 60, 100, 60, image.Color.from_rgb(255, 255, 255), 4)
        img.draw_string(disp.width() - 100 + (100 - font_size[0]) // 2, disp.height() - 60 + (60 - font_size[1]) // 2, btn_str, image.Color.from_rgb(255, 255, 255), 1)
    # Back button in the top-left corner exits the app
    if touch_status[2] and is_in_button(touch_status[0], touch_status[1], back_rect):
        app.set_exit_flag(True)
    img.draw_image(0, 0, img_back)
    disp.show(img)
```
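The example is a small two-state loop: in state 0, a touchscreen press-and-release defines the target rectangle and `init` is called once; in state 1, every frame goes through `track`, the resulting box and score are drawn, and an on-screen button switches back to selection. This keeps the tracker's one-time `init` cleanly separated from the per-frame `track` calls.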