Phantom vertices


Rohit Jain <rohit.j...@...>
 

Hi folks,

I created 4916 vertices with the label 'movie' and a property 'movieid' where the movieid goes from 1 to 4916.
I created 8491 vertices with the label 'person' and a property 'personid' where the personid goes from 1 to 8491.
I created 'role' edges from these person vertices to the movie vertices, with a 'roletype' property indicating actor or director.

When I do g.V().count() I get 13420 when I should get 13407.

Looks like I have some phantom vertices.  How do I find them?

Also, on a g.V().hasLabel('person').count(), I get 8492 instead of 8491.  So I even have a phantom personid that I don't know how to locate.

Rohit


Robert Dale <rob...@...>
 


Let's see what you've got:

g.V().groupCount().by(label)

g.V().values('movieid').min()
g.V().values('movieid').max()

g.V().values('personid').min()
g.V().values('personid').max()


Robert Dale

--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Rohit Jain <rohit.j...@...>
 

Robert,

This is EXACTLY what I was looking for!  I could not find how to do this.  I must educate myself better but don't seem to find an easy way to do that :-(  Maybe SQL is too ingrained in my blood and this learning requires different skills -- teaching an old dog new tricks ain't easy.

Okay, so it seems I had some demigod, titan, location, god, human, and monster vertices that I don't remember adding.
Now on the person vertices I get a min of 1 and a max of 8491, and I know I have a unique index on personid.  But the count still shows 8492.  This phantom is really strange.  I thought I would dump the vertices to a log file, pull it into Excel, and figure out there what is going on.  I am wondering whether the count is using an index that somehow got corrupted.  As you may have seen, I have an index stuck in a strange INSTALLED state that I cannot REINDEX or REMOVE.

Anyway, thanks for this.  This is good progress.

Rohit
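
A note on why min() and max() could not expose the extra vertex: values('personid') only emits a value for vertices that actually carry the property, so a 'person' vertex with no personid is counted by count() but is invisible to min() and max(). The unfamiliar labels also match JanusGraph's Graph of the Gods sample data set; if that sample was loaded into the same graph at some point (an assumption, the thread does not confirm it), its 12 vertices plus one stray person vertex would account for the surplus of 13. A minimal sketch for the Gremlin console, assuming the labels and property keys described in this thread:

```groovy
// Person vertices that have no 'personid' property at all.
g.V().hasLabel('person').hasNot('personid')

// personid values carried by more than one vertex
// (none are expected if the unique index is working).
g.V().hasLabel('person').has('personid').
  groupCount().by('personid').
  unfold().where(select(values).is(gt(1)))

// Vertices whose label is neither 'movie' nor 'person', grouped by label.
g.V().not(hasLabel('movie', 'person')).groupCount().by(label)
```

Without a suitable index these are full scans, which should be tolerable at roughly 13,000 vertices.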


Rohit Jain <rohit.j...@...>
 

I found my phantom person Vertex.

So, I was doing this query to find the top 5 actors who had acted in the most movies.  I don't think the query is correct for what I am trying to do since the result does not seem to be right.  However, when I first ran this query:

g.V().hasLabel("person").groupCount().by("personid").order(local).by(values,decr).select(keys).limit(local, 5)

I got an error saying that it could not find "personid" for this specific vertex id.  I did the following for the vertex id:

gremlin> g.V(5578792).valueMap()
==>[]

This told me that I had a vertex that did not have a personid for some reason.  That was my phantom vertex.  I dropped it and then the above query ran.  The result of the above query is ==>[1,2,3,4,5], which I know is not right.  From my SQL query using our product EsgynDB running on Apache Trafodion, I get:

PERSON_ID  NAME                                 NUM
---------  -----------------------------------  --------------------
     6830  Robert De Niro                                         53
     5828  Morgan Freeman                                         43
     1053  Bruce Willis                                           38
     5363  Matt Damon                                             37
     4057  Johnny Depp                                            36

Rohit
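
For anyone repeating the cleanup, it may be worth checking what the stray vertex is attached to before dropping it. A rough sketch for the Gremlin console, using the vertex id reported above and assuming the graph instance is bound to a variable named graph (a hypothetical name, adjust to your session):

```groovy
// Inspect the suspect vertex before deleting it.
g.V(5578792).label()
g.V(5578792).bothE().count()

// Drop the vertex (its incident edges go with it) and commit the transaction.
g.V(5578792).drop()
graph.tx().commit()
```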


Robert Dale <rob...@...>
 

You probably want to count the edges:

g.V().hasLabel('person').group().by('name').by(outE().count()).order(local).by(values,decr).limit(local,5)

Robert Dale
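
Some background on the ==>[1,2,3,4,5] result: groupCount().by("personid") counts how many vertices share each personid, which is 1 for every person, so the ordering is a tie and five arbitrary keys come back. Counting edges avoids that, but a bare outE() counts every outgoing edge, director roles included. A sketch that restricts the count to acting roles, assuming the edge label is 'role' and that roletype is stored as the string 'actor' (the thread does not give the exact value):

```groovy
// Top 5 people by number of outgoing 'role' edges with roletype 'actor'.
// Keyed by personid so that two people sharing a name are not merged.
g.V().hasLabel('person').
  group().by('personid').by(outE('role').has('roletype', 'actor').count()).
  order(local).by(values, decr).
  limit(local, 5)
```

As far as I know, newer TinkerPop releases spell the sort order desc instead of decr.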
