linyilyi / pose-monitor

“让爷康康” (PoseMon) is a mobile AI app that monitors poor sitting posture and gives spoken reminders.

License: Apache License 2.0

Kotlin 3.30% Jupyter Notebook 96.69% Python 0.01%

pose-monitor's Introduction

PoseMon 让爷康康

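“让爷康康” (PoseMon) is an Android app that monitors poor sitting posture in real time and gives spoken reminders. The project is based mainly on the official Tensorflow Lite pose estimation example. Its AI part consists of MoveNet, which performs pose estimation, and a fully connected network that classifies the resulting pose. The app works offline: all AI features run locally on the phone and no video is sent to an external server. The only permission it needs is camera access, used to capture the pose images. A video introduction is available on bilibili or YouTube.

File structure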

├───android
│   ├───app
│   │   └───src
│   └───gradle
├───doc_images
├───main
│   └───pose_data
│       └───train
│           ├───forwardhead
│           └───standard

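The project's two main folders are android/ and main/. android/ contains all code for the mobile app. main/ holds the training data for the classification network and pose_classification.ipynb, the notebook that records the training process. The training data lives under main/pose_data/train/; to keep the repository small, only the two sample images used by pose_classification.ipynb were uploaded. To train the classification model yourself, follow the instructions in pose_classification.ipynb to populate the main/pose_data/train/ and main/pose_data/test/ folders. doc_images/ contains the example images used in this document and no project code.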
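Before training, it can help to check that the data folders are actually populated. The following is a minimal sketch, not part of the project: the helper name count_images and the accepted image extensions are assumptions, and it simply counts image files in each class subfolder of a split directory such as main/pose_data/train/.

```python
from pathlib import Path

# Assumption: training photos use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(split_dir):
    """Return {class_folder_name: image_count} for a split folder.

    A missing split folder yields an empty dict instead of an error,
    so the result can be printed as a quick sanity check.
    """
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return {}
    counts = {}
    for class_dir in sorted(split_dir.iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling count_images("main/pose_data/train") from the repository root would list each class folder with its number of images; a folder reported with 0 images still needs to be filled.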

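Building and running the app in Android Studio

The Android part of this project has already been built into an apk installer, which can be downloaded from the project's release page and installed for testing. For further development and testing, the Android project can be built in Android Studio.

Prerequisites

Building the Android project requires Android Studio; download and install it by following the instructions on the official website.
You also need an Android phone.

Building the app

Clone this project with git clone, or download it as a zip archive and extract it.
Open Android Studio, choose Open an existing Android Studio project on the initial Welcome screen, and open the Android project folder inside this project.
The Android project is located in this project's android/ folder; select that folder in the Android Studio dialog. Once the project opens, Android Studio may ask to run a Gradle sync. Accept and wait for the sync to finish.
Connect a phone with developer mode enabled to the computer over USB; see the official tutorial for the exact steps. If your phone model appears on the right side of the top toolbar, the device is connected.
If this is a fresh Android Studio installation, a number of development tools may still need to be installed. Click the green triangle button Run 'app' in the upper right corner to run the app directly. If any tools are missing, Android Studio will prompt for them; install them one by one as prompted.

Models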

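This project uses two neural network model files, both included in the repository, so nothing extra needs to be downloaded. The first is the MoveNet Thunder model in int8 format; see the link to the official model file for details. MoveNet is a lightweight human pose estimation model released by Google and comes in two versions, Thunder and Lightning. Thunder runs more slowly but is more accurate, and it is the version used here. Thunder is in turn available in two data formats, float16 and int8. The float16 model can only run on a general-purpose GPU, whereas the int8 model can run either on a general-purpose GPU or on the Hexagon DSP (digital signal processor) of Qualcomm Snapdragon chips. On Hexagon the AI code runs faster and uses less power, so Hexagon is the recommended first choice when deploying AI models on mobile. Google has also released its own Google Tensor chips, the latest being Tensor G2. How to invoke the AI acceleration unit of a Tensor chip is not yet clear; this document will be updated once a device is available for testing.

Training your own classification network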

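In addition to MoveNet Thunder, this project uses a simple fully connected network to classify the pose information output by MoveNet (the coordinates of 17 human body keypoints), deciding which of three states the person in the frame is in: “standard sitting posture”, “crossed legs”, or “forward head with hunched back”. For an introduction to this classification network and a walkthrough of the training process, see the Tensorflow Lite Jupyter Notebook tutorial, or the modified and annotated version included in this project.

To classify these three poses, about 300 photos of each pose were collected as the training set (876 photos in total) and about 30 of each as the test set (74 photos in total). The training and test sets feature different people, so that overfitting can be spotted promptly during training. Training data should be placed in the three folders standard, crossleg and forwardhead under main/pose_data/train/, and test data under main/pose_data/test/.

The Jupyter Notebook used to train the classification network automatically converts the raw data into a training data package. In the process it generates the MoveNet detection result for each photo and labels each photo as one of the three poses. It then stores everything in main/pose_data/train_data.csv and main/pose_data/test_data.csv and generates a text file with the label information, main/pose_data/pose_labels.txt. After training completes in the Notebook, a .tflite weight file is generated automatically under main/pose_data/. Import it into the Android Studio project, replacing android\app\src\main\assets\classifier.tflite, to use it.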
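To make the "detection result plus label" step more concrete, here is a rough sketch of how one photo's 17 keypoints could be flattened into a single CSV row. The keypoint order is MoveNet's standard 17-keypoint order; the column layout (file name, then x, y and score per keypoint, then class number and class name) and the helper names csv_header and csv_row are assumptions for illustration and may differ from what the notebook actually writes.

```python
# MoveNet's 17 keypoints in output order.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def csv_header():
    """Column names for the assumed layout: 1 + 17 * 3 + 2 = 54 columns."""
    header = ["file_name"]
    for name in KEYPOINT_NAMES:
        header += [f"{name}_x", f"{name}_y", f"{name}_score"]
    return header + ["class_no", "class_name"]


def csv_row(file_name, keypoints, class_no, class_name):
    """Flatten one detection into a row matching csv_header().

    keypoints is a sequence of 17 (x, y, score) tuples.
    """
    if len(keypoints) != len(KEYPOINT_NAMES):
        raise ValueError(
            f"expected {len(KEYPOINT_NAMES)} keypoints, got {len(keypoints)}"
        )
    row = [file_name]
    for x, y, score in keypoints:
        row += [x, y, score]
    return row + [class_no, class_name]
```

Rows built this way can be written with Python's csv.writer; the classifier then only ever sees these numbers, never the original photo.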

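Results

With the phone connected to the computer, Android Studio can build the project and install the app on the phone. Open the app and grant camera access; the app then monitors sitting posture and gives spoken reminders based on the real-time detection results. The screen is divided into top, middle and bottom parts: the top shows the AI's verdict on the current pose, the middle shows the live camera image, and the bottom is an information panel. In that panel, the “运算设备” (compute device) field selects whether computation runs on the CPU, the GPU or NNAPI (the Hexagon AI accelerator); NNAPI is the fastest and the most power-efficient. To avoid false alarms, the app adds some decision logic to improve precision. When an unhealthy posture is detected in 30 consecutive frames, the app enters an alert state; only if the following 30 frames are also all judged unhealthy does it issue a spoken reminder. The result is shown in the figure:

Acknowledgements

This project is based mainly on the Tensorflow Lite Pose Estimation example project and would not exist without open-source frameworks and tools such as Tensorflow and Jupyter Notebook. Thanks to all developers for their contributions to the open-source community!

[2022.11.06] Thanks to @zhengbangbo for adding front camera support. (Note: the front camera usually captures only the upper body, which may make recognition of the “crossed legs” pose unstable, but it opens the way to more upper-body posture checks in the future.)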
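The two-stage check described above can be sketched as a small counter-based state machine. This is an illustration in Python, not the app's Kotlin code: the class name PostureAlarm is hypothetical, and two behaviours are assumptions the text does not spell out, namely that any healthy frame resets the count and that the count starts over after a reminder is issued.

```python
class PostureAlarm:
    """Debounce per-frame posture results before issuing a spoken reminder."""

    def __init__(self, frames_to_warn=30, frames_to_alert=30):
        self.frames_to_warn = frames_to_warn    # streak needed to enter the alert state
        self.frames_to_alert = frames_to_alert  # further streak needed to speak
        self.bad_streak = 0                     # consecutive unhealthy frames so far

    @property
    def state(self):
        """'warning' once the first threshold is reached, otherwise 'normal'."""
        return "warning" if self.bad_streak >= self.frames_to_warn else "normal"

    def update(self, is_unhealthy):
        """Feed one frame's result; return True when a reminder should be spoken."""
        if not is_unhealthy:
            self.bad_streak = 0  # assumption: one healthy frame resets the streak
            return False
        self.bad_streak += 1
        if self.bad_streak >= self.frames_to_warn + self.frames_to_alert:
            self.bad_streak = 0  # assumption: start over after speaking
            return True
        return False
```

With the default thresholds, a reminder fires only on the 60th consecutive unhealthy frame, so a single misclassified frame cannot trigger it and precision improves at the cost of a short delay.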

pose-monitor's Issues

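Two points for discussion after using the app

When part of the torso is out of frame, the predicted skeleton is guessed and keeps jittering.
When several people are in the frame, only one person's prediction is displayed, but the predicted skeleton keeps jumping between them.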

Error when generating the pose classification model

Define a Keras model for pose classification
print(landmarks.shape) gives (None, 17, 2)

ValueError: Exception encountered when calling layer 'flatten' (type Flatten).

Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.

Call arguments received by layer 'flatten' (type Flatten):
• inputs=<KerasTensor: shape=(None, 17, 2) dtype=float32 (created by layer 'tf.math.truediv')>
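Why is that?

Can multiple people be recognized at once? In my test only one person was recognized

Can multiple people be recognized at the same time? In my test only one person was recognized.
Is it feasible to recognize several people at once? Any pointers would be appreciated.
(Maybe the performance requirements would be high? Even two people at once would do.)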

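Discussion: use in rehabilitation therapy in medical settings

Lin, your open-source project made me think of a medical use case:
Hospitals have a rehabilitation department whose main goal is to help people with disabilities and other patients recover. Much of that work involves judging whether a patient's rehabilitation exercises are performed to standard, for simple movements such as lifting a foot, raising a hand or bending at the waist.
With this app, patients could complete simple movement training on their own, freeing up therapists' time to serve more patients.

I see that the example uses the camera as its input source. In principle video should work as well, since both go through a SurfaceView. If so, I would like to pick a local video of a patient doing rehabilitation exercises and, while the body parts are drawn, also draw the movement trajectory of a particular part (the knee, for example) during the exercise.

This really is a great project with many potential applications to be explored. Everyone is welcome to join the discussion.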
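As a starting point for the trajectory idea raised in this issue, the sketch below collects one keypoint's position across frames. It is only an illustration: the class name KeypointTrail, the buffer length and the confidence threshold are assumptions, and actual drawing on the Android side is left out. Index 13 is the left knee in MoveNet's 17-keypoint output.

```python
from collections import deque

LEFT_KNEE = 13  # index of the left knee in MoveNet's 17-keypoint output


class KeypointTrail:
    """Keep the most recent positions of one keypoint across video frames."""

    def __init__(self, keypoint_index=LEFT_KNEE, max_points=90, min_score=0.3):
        self.keypoint_index = keypoint_index
        self.min_score = min_score              # assumption: skip low-confidence detections
        self.points = deque(maxlen=max_points)  # oldest points drop off automatically

    def add_frame(self, keypoints):
        """keypoints: sequence of 17 (x, y, score) tuples for one frame."""
        x, y, score = keypoints[self.keypoint_index]
        if score >= self.min_score:
            self.points.append((x, y))

    def segments(self):
        """Return consecutive point pairs, ready to be drawn as line segments."""
        pts = list(self.points)
        return list(zip(pts, pts[1:]))
```

Each pair returned by segments() can be drawn as a line on top of the frame, which traces the path of the knee over the last max_points confident detections.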

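Question: training my own poses fails when generating the tflite file

I want to train some poses of my own, but running the author's notebook fails

Error location: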
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

Error message: Some of the operators in the model are not supported by the standard TensorFlow Lite runtime. If those are native TensorFlow operators, you might be able to use the extended runtime by passing --enable_select_tf_ops, or by setting target_ops=TFLITE_BUILTINS,SELECT_TF_OPS when calling tf.lite.TFLiteConverter(). Otherwise, if you have a custom implementation for them you can disable this error with --allow_custom_ops, or by setting allow_custom_ops=True when calling tf.lite.TFLiteConverter(). Here is a list of builtin operators you are using: ADD, DIV, EXPAND_DIMS, FLOOR_DIV, FULLY_CONNECTED, GATHER, MAXIMUM, MUL, PACK, REDUCE_MAX, RESHAPE, SOFTMAX, SQRT, SQUEEZE, STRIDED_SLICE, SUB, SUM. Here is a list of operators for which you will need custom implementations: BroadcastTo, Size.

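This line needs to be added: converter.allow_custom_ops = True

The final code:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.allow_custom_ops = True
tflite_model = converter.convert()

With this the conversion succeeds, but the resulting tflite file fails to run in AS (Android Studio). I would like to know why the author's conversion succeeds without adding this line.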

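Question

In import org.tensorflow.posemon.data.BodyPart, posemon is highlighted in red. How do I fix this? I don't know much about Android development.

Lin, could you write a version that uses a remote camera?

Using only the phone's camera isn't flexible enough. I first tried to use an esp32-cam camera, but my Android skills are too weak and I couldn't get it working after a long time.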

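This is so much fun, haha

Please add support for switching to the front camera

Device: Xiaomi 6, near-stock Android 11

Lin!!! Please support Kirin

Requesting Kirin support

Crashes immediately on launch on MI 8 (sad)

Could it be because the camera permission can't be set to always allow?

Lag after switching to the front camera

On the Windows 11 Android subsystem (a virtual machine) the camera view is rotated sideways; please add an option to adjust it

For upper-body poses, add a warning when the user is too close to the screen

I have a Pixel 6 Pro available and can provide it if needed 📣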

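Idea: using this project in physical classrooms

Used in a physical classroom, it could detect whether students are looking at the blackboard during class, or, at university, use students' attentive posture to evaluate the quality of the teaching in turn. After each class a report would be generated and uploaded to the teachers' and students' app. I'm just not sure whether such an app would be heavily criticized. Also, I just tried it and it can track only one target at a time; I don't know whether tracking multiple targets simultaneously is possible. I'd like to work on this project. Could you mentor me?

Is Dad's voice pre-recorded, or generated with some open-source AI speech library?

Looking for an open-source speech framework

Like, coin and favorite, all three~ Next time for sure~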

Error when loading the movenat_thunder.tflite model

Hi,
I tried loading the movenat_thunder.tflite model with tf.lite.Interpreter, using the following code:

import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path='movenat_thunder.tflite')

But was hit with the following error:

ValueError Traceback (most recent call last)
/home/generate_cc_array.ipynb Cell 5 in <cell line: 1>()
----> 1 interpreter = tf.lite.Interpreter(model_path='movenat_thunder.tflite')
2 interpreter.allocate_tensors()
4 input_details = interpreter.get_input_details()[0]

File ~/virtual_environments/utkface/lib/python3.10/site-packages/tensorflow/lite/python/interpreter.py:455, in Interpreter.__init__(self, model_path, model_content, experimental_delegates, num_threads, experimental_op_resolver_type, experimental_preserve_all_tensors)
448 custom_op_registerers_by_name = [
449 x for x in self._custom_op_registerers if isinstance(x, str)
450 ]
451 custom_op_registerers_by_func = [
452 x for x in self._custom_op_registerers if not isinstance(x, str)
453 ]
454 self._interpreter = (
--> 455 _interpreter_wrapper.CreateWrapperFromFile(
456 model_path, op_resolver_id, custom_op_registerers_by_name,
457 custom_op_registerers_by_func, experimental_preserve_all_tensors))
458 if not self._interpreter:
459 raise ValueError('Failed to open {}'.format(model_path))

ValueError: quantized_dimension must be in range [0, 1). Was 3.Tensor 33 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 36 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 40 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 44 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 48 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 52 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 56 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 60 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 64 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 68 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 72 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 76 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 80 has invalid quantization parameters.quantized_dimension must be in range [0, 1). Was 3.Tensor 84 has invalid quantization parameters.

Tensorflow version : 2.9.1
Python version: 3.10.4

How can I solve this issue? Please advise.

Thanks

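How do I use the front camera?

Probably because Kirin doesn't support NNAPI: the app crashes right on launch

It works as long as NNAPI is not set as the default option.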
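One way to avoid this kind of launch crash is to try the preferred compute device first and fall back to the next one when initialisation fails. The sketch below shows only the pattern, in Python rather than the app's Kotlin; the function create_with_fallback and the factory callables passed to it are hypothetical stand-ins, not TensorFlow Lite APIs.

```python
def create_with_fallback(factories):
    """Try each (name, factory) pair in order and return the first that works.

    factories: list of (device_name, zero-argument callable) in preference
    order, for example NNAPI first and CPU last. Each callable is a
    hypothetical stand-in for "create an interpreter on this device".
    """
    errors = []
    for name, factory in factories:
        try:
            return name, factory()
        except Exception as exc:  # a failed device must not crash the app
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no compute device could be initialised: " + "; ".join(errors))
```

Putting CPU last in the list means the app can still start on phones where the accelerator is missing or fails to prepare the model.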

Crash error message

Device: Samsung Galaxy S9. The app crashes on launch. Error message from AS: Process: lyi.linyi.posemon, PID: 31265
java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: NN API returned error ANEURALNETWORKS_OP_FAILED at line 4274 while completing NNAPI compilation.

Node number 157 (TfLiteNnapiDelegate) failed to prepare.

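Xiaomi 11u: lag after switching to NNAPI

After switching

Before switching

The 888 should support NNAPI, right?

Crashes on several Sony models

As the title says~
Test devices: 1 III, 10 II, XA2, XZ2. All four crash after camera permission is granted.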

data and ml modules missing?

from data import BodyPart
from ml import Movenet
Where are those two modules?

Crash on launch

Doesn't run on a Samsung Note 10+

Error, help needed

AttributeError Traceback (most recent call last)
in
12 image_bad = tf.io.decode_jpeg(image_bad)
13 person = detect(image_bad)
---> 14 _ = draw_prediction_on_image(image_bad.numpy(), person, crop_region=None,
15 close_figure=False, keep_input_size=True)

in draw_prediction_on_image(image, person, crop_region, close_figure, keep_input_size)
18 """
19 # Draw the detection result on top of the image.
---> 20 image_np = utils.visualize(image, [person])
21
22 # Plot the image with detection results.

AttributeError: module 'utils' has no attribute 'visualize'
Many places fail like this with things that can't be found. What should I do?