Re: nnstreamer & openVINO

Olda Šmíd

Hi, I wonder whether it is possible to run nnstreamer with OpenVINO support (the NCS2 USB stick)?
On my laptop the program runs fast, but on the Raspberry Pi (Ubuntu Server) everything is slow: I get about one frame per second. It occurs to me that using TensorFlow Lite models there is probably not reasonable. On the other hand, the question is whether other models would help :)

On Fri, Mar 5, 2021 at 3:10, MyungJoo Ham <myungjoo.ham@...> wrote:

- You need two videoconverts in your pipeline because the format required by cairooverlay+ximagesink (BGRx, BGRA, RGB16) is different from the format required by the neural network (RGB).


- Each imagesink element has different characteristics. You should choose an image sink element according to your graphics/UI framework (or try autovideosink). For more information on these GStreamer base/good/bad plugins, you will need to talk with the GStreamer community; I don't think NNStreamer committers have a good understanding of GStreamer's original plugins.


- If you are replacing X with Wayland: yes, I suppose so, provided your main overhead comes from the UI stack, which is why Tizen switched from X to Wayland. But you should be careful: you first need to analyze whether that really is the main overhead (the bottleneck) and whether your application's (functional) requirements can be met by Wayland.
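
To illustrate the first point above: RGB packs three bytes per pixel, while BGRx packs four bytes per pixel with the channel order reversed, so each frame has to be repacked between the two branches. A minimal C sketch of that repacking follows. The function name rgb_to_bgrx is hypothetical, rows are assumed to be tightly packed (no stride padding), and this is not GStreamer's actual videoconvert implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Repack n_pixels tightly packed RGB pixels (3 bytes each) into
 * BGRx pixels (4 bytes each). Hypothetical helper for illustration. */
void rgb_to_bgrx(const uint8_t *rgb, uint8_t *bgrx, size_t n_pixels)
{
    assert(rgb != NULL && bgrx != NULL);
    for (size_t i = 0; i < n_pixels; i++) {
        bgrx[4 * i + 0] = rgb[3 * i + 2]; /* B */
        bgrx[4 * i + 1] = rgb[3 * i + 1]; /* G */
        bgrx[4 * i + 2] = rgb[3 * i + 0]; /* R */
        bgrx[4 * i + 3] = 0xFF;           /* x: padding byte, set arbitrarily */
    }
}
```

The output buffer must hold 4 * n_pixels bytes. The per-pixel copy is the reason the second videoconvert has a cost at all, even though it is usually small next to inference.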




--------- Original Message ---------

Sender : Olda Šmíd <Olda476@...>

Date : 2021-03-05 03:18 (GMT+9)

Title : Re: [NNStreamer Technical Discuss] nnstreamer & openVINO


Now I wonder where I made a mistake in the RTSP player code that produced the increase in performance :) What you write really makes sense. I just hadn't understood why everyone uses parse_launch and doesn't bother with a dynamic pipeline.
I use this pipeline:
"rtspsrc location=rtsp://8554/test latency=0 ! decodebin ! videoconvert ! videoscale ! "
      "video/x-raw,width=%d,height=%d,format=RGB ! tee name=t_raw "
      "t_raw. ! queue ! videoconvert ! cairooverlay name=tensor_res ! ximagesink name=img_tensor "
      "t_raw. ! queue leaky=2 max-size-buffers=2 ! videoscale ! video/x-raw,width=%d,height=%d ! tensor_converter ! "
      "tensor_transform mode=arithmetic option=typecast:float32,add:-127.5,div:127.5 ! "
      "tensor_filter framework=tensorflow-lite model=%s ! "
      "tensor_sink name=tensor_sink"

- I think I'm stupidly using videoconvert twice. I get H.264 video from the camera, and I understand that I have to convert it to a raw format, but I don't understand why I convert twice, with videoscale in between.
- I find that glimagesink works very well on the Raspberry Pi, but ximagesink is used in the example, so I left it.
- Do you think Wayland would help with speed on the RPi? It's very imperfect software, but it seemed very fast to me.
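
As a rough illustration of what the tensor_transform step in the pipeline above computes, assuming the arithmetic options are applied in the order listed (typecast, then add, then div): each 8-bit channel value is mapped from [0, 255] to [-1, 1]. The helper names normalize_pixel and normalize_frame are hypothetical; this is plain C for illustration, not NNStreamer code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror "typecast:float32,add:-127.5,div:127.5" for a single value. */
float normalize_pixel(uint8_t v)
{
    float f = (float) v; /* typecast:float32 */
    f += -127.5f;        /* add:-127.5 */
    f /= 127.5f;         /* div:127.5 */
    return f;
}

/* Apply the same mapping to a whole buffer of n channel values. */
void normalize_frame(const uint8_t *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = normalize_pixel(in[i]);
}
```

So 0 maps to -1.0 and 255 maps to 1.0, the input range that many float TensorFlow Lite image models are trained on.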

On Thu, Mar 4, 2021 at 14:17, MyungJoo Ham via <> wrote:

I would like to join GitHub, I just don't know if I should publish on my account or join yours :)

On GitHub, you are usually supposed to "fork" upstream to your personal repo and do the development in your own personal repo. (I also follow this development model.)

You can discuss in the upstream GitHub issues and "send" your code commits from your personal repo to upstream (a so-called "pull request").

So... you are supposed to publish in your account (personal repo) and participate upstream. You are not required to join any group for 1) forking nnstreamer, 2) doing development in your fork, 3) sending "pull requests" to upstream, 4) discussing in nnstreamer GitHub issues, 5) reviewing others' code commits, and so on.

It's very interesting - so it's relatively useless to convert an example to C - I just need to convert parse_launch to a dynamic pipeline -> use the syntax I know from C and connect elements via gst_bin_add_many and gst_element_link_many. As I wrote, I would like to use nnstreamer on the Raspberry Pi, where the performance is insufficient. So I need every possible code acceleration. My friend and I have a video player on my Raspberry Pi, which displays any number of windows with an RTSP stream. The RPi handles at most four streams with a dynamic pipeline, but at most two with parse_launch.

I don't think the runtime performance of pipelines would differ between a GST-API implementation (what you are doing) and a parse-launched implementation. If you look at the gst-parse code, it actually calls the GST-API to generate pipelines. Once the pipeline is parsed and started, there should be no difference. The only overhead of parse-launch is parsing the string, which affects the start-up time, and the parser implementation should be efficient enough for an RPi3 to execute it within a few milliseconds. If there are performance problems with parse-launch, I'd investigate something else (the pipeline topology itself).

Cheers, MyungJoo


