Re: Aggregating edges based on the source & target vertex attributes
The processing time does not really surprise me: JanusGraph has to do everything in Java. For the typical JanusGraph use case, the storage backend is the limiting factor and the Java processing time does not really matter. If you want to run this query fast, in memory and on multiple cores, you are better off with Python and Dask or the like, doing the aggregation on a single dataframe with the edge id, inV label and outV label as columns. I would not be surprised if pandas, using a single core, already does this within a second.
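To make the dataframe suggestion concrete, here is a minimal sketch of the aggregation in pandas. The column names (`edge_id`, `outV_label`, `inV_label`), the sample rows and the helper `aggregate_edges` are made up for illustration; how you export the edges from JanusGraph into such a table is left open.

```python
import pandas as pd

# Hypothetical edge list: one row per edge, carrying the labels of the
# source (outV) and target (inV) vertex. Real data would be exported
# from the graph; these rows are illustrative only.
edges = pd.DataFrame({
    "edge_id": [1, 2, 3, 4, 5],
    "outV_label": ["person", "person", "person", "company", "company"],
    "inV_label": ["company", "company", "person", "person", "person"],
})


def aggregate_edges(df: pd.DataFrame) -> pd.DataFrame:
    """Count edges per (source label, target label) pair."""
    return (
        df.groupby(["outV_label", "inV_label"])
        .size()
        .reset_index(name="edge_count")
    )


result = aggregate_edges(edges)
print(result)
```

If a single core turns out to be too slow, the same `groupby` on a Dask dataframe should be the natural next step, but I have not measured either variant on your data.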
For the queries given above, I believe only a single core is used when they are run as OLTP queries. Because this N x N query is not easy for TinkerPop to parallelize, you have to take care in how you run it as an OLAP query. I would guess that withComputer(SparkGraphComputer) with a single Spark executor with 8 cores will work best, because then the Spark cores share the same memory. This is automatically the case for spark.master=local[*].
Best wishes, Marc
PS Thanks for introducing me to the Indian numbering system. Happily, you do not have 1.5 crore vertices!
On Monday, 21 December 2020 at 09:16:08 UTC+1, vishnu gajendran wrote: