ES with JG


Suny <sahithiy...@...>
 

I implemented a mixed index on the 'type' attribute of my vertices.

Whenever I query :> g.V().has('type',textContains('car'))

the result comes back really fast.

When I run :> g.V().has('type',textContains('site')).valueMap() it takes a very long time to retrieve the data. I have 1500 vertices of type car.

Is it because the first query gets its result from ES alone, while the second query gets the vertex IDs from ES but then has to hit JG again to fetch the attributes?

Is there a way to speed up the second query?


Jason Plurad <plu...@...>
 

1500 vertices is a fair amount, but how much data is getting returned in the valueMap? Is there a large number of properties in the map, or perhaps are the values very large? It could very easily be the cost of serializing that map. In another post you were asking about string size restrictions...


In reply to Suny's message of Thursday, September 21, 2017 at 2:40:06 PM UTC-4.


Suny <sahithiy...@...>
 

Each vertex has 3-5 attributes on it. 


In reply to Jason Plurad's message of Friday, September 22, 2017 at 9:02:15 AM UTC-4.


Suny <sahithiy...@...>
 

The values are small, like strings of length 20.

In reply to Suny's message of Friday, September 22, 2017 at 10:03:34 AM UTC-4.


Robert Dale <rob...@...>
 

How much time is 'slow'? Are your indexes actually enabled? What is the byte size of the results? Have you ruled out resource issues - cpu, ram, disk, network?

What does `g.V().has('type',textContains('site')).valueMap().profile()` show?

Robert Dale

In reply to Suny's message of Fri, Sep 22, 2017 at 10:10 AM.

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/8a3c629b-10c4-4d55-8f8f-8e5397f6ff87%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.


Suny <sahithiy...@...>
 

The index is enabled in ES. I verified this by running the same query, g.V().has('type',textContains('site')), with the force-index property set to true, and it works.

The initial hit took 46000ms and later ones took 600-750ms.

For the second query - g.V().has('type',textContains('site')).valueMap()

The initial hit took 129826ms and later ones took 550-700ms.

gremlin> :> g.V().has('type',textContains('site')).valueMap().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(site)])                        1186        1186          17.913     0.03
  optimization                                                                                 0.203
  backend-query                                                     1186                       5.714
PropertyMapStep(value)                                              1186        1186       64199.498    99.97
                                            >TOTAL                     -           -       64217.412        -
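
For anyone reading this profile later, a quick way to see where the time goes is to divide the PropertyMapStep time by the number of vertices. The Python sketch below only does that arithmetic, with the numbers copied from the output above. The variable names are made up for illustration, and reading the per-vertex figure as "one backend round trip per vertex" is an assumption, not something the profile itself proves.

```python
# Numbers copied from the profile() output in this message.
vertices = 1186
index_step_ms = 17.913        # JanusGraphStep (the ES index lookup)
property_step_ms = 64199.498  # PropertyMapStep (the valueMap() part)

# Average cost of fetching the properties of a single vertex.
per_vertex_ms = property_step_ms / vertices

# Fraction of the measured time spent in PropertyMapStep.
share = property_step_ms / (index_step_ms + property_step_ms)

print(f"valueMap cost per vertex: {per_vertex_ms:.1f} ms")
print(f"share of time in PropertyMapStep: {share:.2%}")
```

That works out to roughly 54 ms per vertex, so the index lookup itself is cheap and nearly all of the time is spent loading properties after the IDs come back.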




In reply to Robert Dale's message of Friday, September 22, 2017 at 10:17:23 AM UTC-4.


Robert Dale <rob...@...>
 

First, I'm not sure if an index is being used here. Second, my full table scan over 10k vertices is faster. (I'm also using 0.2.0-SNAPSHOT.) So you probably have some other issues going on. Are your Cassandra and ES local or over the internet?

No index, no cache:

gremlin> g.V().has('type',textContains('car')).valueMap().profile()
11:15:01 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [(type textContains car)]. For better performance, use indexes
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(car)])                        10000       10000        1640.758    95.02
    \_condition=(type textContains car)
    \_isFitted=false
    \_query=[]
    \_orders=[]
    \_isOrdered=true
  optimization                                                                                 3.493
  scan                                                                                         0.000
    \_condition=VERTEX
    \_query=[]
    \_fullscan=true
PropertyMapStep(value)                                             10000       10000          85.950     4.98
                                            >TOTAL                     -           -        1726.709        -

With Index, No Cache:

gremlin> g.V().has('type',textContains('car')).valueMap().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(car)])                        10000       10000         910.588    35.06
    \_condition=(type textContains car)
    \_isFitted=true
    \_query=[(type textContains car)]:typeIndex
    \_index=typeIndex
    \_orders=[]
    \_isOrdered=true
    \_index_impl=search
  optimization                                                                                 6.669
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                    10000                     700.633
    \_query=typeIndex:[(type textContains car)]:typeIndex
PropertyMapStep(value)                                             10000       10000        1686.415    64.94
                                            >TOTAL                     -           -        2597.003        -
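
To put the two setups side by side, here is a small Python sketch that compares the per-vertex PropertyMapStep cost in the indexed run above with the one Suny posted earlier in the thread. All numbers are copied from the two profile outputs. The variable names are illustrative only, and the comparison assumes the two graphs hold similarly sized vertices, which the thread suggests but does not confirm.

```python
# PropertyMapStep figures copied from the two profile() outputs in this thread.
local_vertices, local_property_ms = 10000, 1686.415   # indexed run, no cache (this message)
remote_vertices, remote_property_ms = 1186, 64199.498  # Suny's earlier profile

local_per_vertex_ms = local_property_ms / local_vertices
remote_per_vertex_ms = remote_property_ms / remote_vertices

# How many times more expensive each vertex is in the slow setup.
slowdown = remote_per_vertex_ms / local_per_vertex_ms

print(f"local:  {local_per_vertex_ms:.3f} ms per vertex")
print(f"remote: {remote_per_vertex_ms:.3f} ms per vertex")
print(f"slowdown: about {slowdown:.0f}x")
```

A gap of a few hundred times per vertex is hard to explain by data size alone, which is consistent with the suspicion that something else in the environment is involved.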



In reply to Suny's message of Friday, September 22, 2017 at 10:26:46 AM UTC-4.


Suny <sahithiy...@...>
 

I have Cassandra and ES hosted on AWS.

JG and ES run on one node and Cassandra runs on a different node.

Do you think that is causing issues?

I am trying to set up JG, ES and Cassandra on the same node to try it out. Any other suggestions?


In reply to Robert Dale's message of Friday, September 22, 2017 at 11:29:04 AM UTC-4.


Suny <sahithiy...@...>
 

Any other suggestions on this?


In reply to Suny's message of Friday, September 22, 2017 at 11:40:31 AM UTC-4.


Suny <sahithiy...@...>
 

Testing with JG and Cassandra running locally is very fast. Connecting to the remote JG and Cassandra hosted on AWS is what makes it really slow.
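
A rough way to see why the remote setup hurts so much is a toy latency model: if properties are fetched with one backend round trip per batch of vertices, total time grows with the number of round trips times the round-trip time. The Python sketch below is only that model. The function name, the `batch_size` parameter, and the 0.2 ms and 54 ms round-trip figures are all hypothetical values chosen for illustration, not measurements from this thread, and whether JanusGraph really makes one round trip per vertex here is an assumption.

```python
import math


def estimated_fetch_ms(vertices: int, round_trip_ms: float, batch_size: int = 1) -> float:
    """Toy model: one backend round trip per batch of vertices."""
    round_trips = math.ceil(vertices / batch_size)
    return round_trips * round_trip_ms


vertices = 1186  # vertex count from the profile posted earlier in the thread

# Assumed round-trip times; replace with values measured in your own setup.
for label, rtt_ms in [("local, assumed 0.2 ms", 0.2), ("remote, assumed 54 ms", 54.0)]:
    print(f"{label}: {estimated_fetch_ms(vertices, rtt_ms):.0f} ms")

# Same remote latency, but fetching 100 vertices per round trip.
print(f"remote, batches of 100: {estimated_fetch_ms(vertices, 54.0, batch_size=100):.0f} ms")
```

Under these assumptions the remote case lands near the 64 seconds seen in the profile, while fewer and larger round trips would bring it down to well under a second, so keeping JG close to Cassandra (or cutting the number of round trips) is the thing to aim for.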


On Wednesday, November 8, 2017 at 12:14:53 PM UTC-5, Suny wrote:
Any other suggestions on this ?

On Friday, September 22, 2017 at 11:40:31 AM UTC-4, Suny wrote:
I have Cassandra and ES hosted on AWS.

JG and ES on one node and Cassandra running on different node.

Do you think that is causing issues ? 

I am trying to setup JS,ES and Cassandra on same node and try it out. Any other suggestions ?

On Friday, September 22, 2017 at 11:29:04 AM UTC-4, Robert Dale wrote:
First, I'm not sure if an index is being used here. Second, my full table scans with 10k vertices is faster. (I'm also using 0.2.0-SNAPSHOT)  So you probably have some other issues going on.  Is your Cassandra and ES local or over internet?

No index, no cache:

gremlin> g.V().has('type',textContains('car')).valueMap().profile()
11:15:01 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [(type textContains car)]. For better performance, use indexes
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(car)])                        10000       10000        1640.758    95.02
    \_condition=(type textContains car)
    \_isFitted=false
    \_query=[]
    \_orders=[]
    \_isOrdered=true
  optimization                                                                                 3.493
  scan                                                                                         0.000
    \_condition=VERTEX
    \_query=[]
    \_fullscan=true
PropertyMapStep(value)                                             10000       10000          85.950     4.98
                                            >TOTAL                     -           -        1726.709        -

With Index, No Cache:

gremlin> g.V().has('type',textContains('car')).valueMap().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(car)])                        10000       10000         910.588    35.06
    \_condition=(type textContains car)
    \_isFitted=true
    \_query=[(type textContains car)]:typeIndex
    \_index=typeIndex
    \_orders=[]
    \_isOrdered=true
    \_index_impl=search
  optimization                                                                                 6.669
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                                                0.000
    \_query=typeIndex:[(type textContains car)]:typeIndex
  backend-query                                                    10000                     700.633
    \_query=typeIndex:[(type textContains car)]:typeIndex
PropertyMapStep(value)                                             10000       10000        1686.415    64.94
                                            >TOTAL                     -           -        2597.003        -



On Friday, September 22, 2017 at 10:26:46 AM UTC-4, Suny wrote:
The index is enabled in ES. I verified the same query g.V().has('type',textContains('site')) by setting the force-index property to true, and it works.

The initial hit took 46000ms and later ones took 600-750ms.

For the second query - g.V().has('type',textContains('site')).valueMap()

The initial hit took 129826ms and later ones took 550 - 700ms 

gremlin> :> g.V().has('type',textContains('site')).valueMap().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[type.textContains(site)])                        1186        1186          17.913     0.03
  optimization                                                                                 0.203
  backend-query                                                     1186                       5.714
PropertyMapStep(value)                                              1186        1186       64199.498    99.97
                                            >TOTAL                     -           -       64217.412        -
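
A rough back-of-the-envelope reading of the two profiles above, as a sketch rather than a diagnosis: dividing the PropertyMapStep time by the number of vertices gives the average cost of fetching one vertex's properties. The numbers are copied from the profile() outputs in this thread; the dictionary keys and the helper name are made up for illustration. A per-vertex cost in the tens of milliseconds would be consistent with (though it does not prove) one storage round trip per vertex over a slow network link.

```python
# Figures copied from the profile() outputs quoted in this thread.
# The labels and the helper name are illustrative, not part of JanusGraph.
profiles = {
    "Suny": {"vertices": 1186, "property_map_ms": 64199.498},
    "Robert Dale (with index)": {"vertices": 10000, "property_map_ms": 1686.415},
}


def per_vertex_ms(vertices, property_map_ms):
    """Average PropertyMapStep time spent per vertex, in milliseconds."""
    return property_map_ms / vertices


costs = {name: per_vertex_ms(**p) for name, p in profiles.items()}

for name, ms in costs.items():
    print(f"{name}: {ms:.3f} ms per vertex")
```

On these figures the per-vertex cost in Suny's run is a few hundred times higher than in Robert Dale's, even though the index lookup itself (JanusGraphStep) is fast in both.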




On Friday, September 22, 2017 at 10:17:23 AM UTC-4, Robert Dale wrote:
How much time is 'slow'? Are your indexes actually enabled? What is the byte size of the results? Have you ruled out resource issues - cpu, ram, disk, network?

What does `g.V().has('type',textContains('site')).valueMap().profile()` show?

Robert Dale

On Fri, Sep 22, 2017 at 10:10 AM, Suny <sahi...@...> wrote:
Values are small, like strings of length 20.

On Friday, September 22, 2017 at 10:03:34 AM UTC-4, Suny wrote:
Each vertex has 3-5 attributes on it. 

On Friday, September 22, 2017 at 9:02:15 AM UTC-4, Jason Plurad wrote:
1500 vertices is a fair amount, but how much data is getting returned in the valueMap? Is there a large number of properties in the map or perhaps are the values very large? Very easily could be the cost of serializing that map. In another post you were asking about string size restrictions...

On Thursday, September 21, 2017 at 2:40:06 PM UTC-4, Suny wrote:
I implemented a mixed index on the 'type' attribute on vertices.

Whenever I query for :> g.V().has('type',textContains('car'))

the result comes back really fast.

When I do :> g.V().has('type',textContains('site')).valueMap() it takes a long time to retrieve the data. I have 1500 vertices of type car.

Is it because the first query gets its result from ES, while the second query gets the vertex ids from ES but then needs to hit JG again to get the attributes?

Is there a way to speed up the second query?

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/8a3c629b-10c4-4d55-8f8f-8e5397f6ff87%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.


Calvin Lei <ckp...@...>
 

Have you tried deploying your code to an AWS EC2 instance? You should see at least some improvement. Then I would use JanusGraph's profiling to debug further.
Also make sure ES, Cassandra, and your code are all in the same region.



On Wednesday, November 8, 2017 at 9:16:05 AM UTC-8, Suny wrote:
Testing with JG and Cassandra locally is very fast. Connecting to remote JG and Cassandra hosted in AWS makes it really slow.

On Wednesday, November 8, 2017 at 12:14:53 PM UTC-5, Suny wrote:
Any other suggestions on this ?

On Friday, September 22, 2017 at 11:40:31 AM UTC-4, Suny wrote:
I have Cassandra and ES hosted on AWS.

JG and ES are on one node and Cassandra is running on a different node.

Do you think that is causing issues?

I am going to set up JG, ES and Cassandra on the same node and try it out. Any other suggestions?




Ankur Goel <ankur...@...>
 

Application/ES/Cassandra/JanusGraph-Server should be in the same network.

~




Don Omondi <don.e...@...>
 

At only 1500 vertices, JanusGraph, Cassandra and Elasticsearch should all be on the same VPS if you ask me.
