Hi JanusGraph team,

I have created vertex-centric indexes, as shown below, and I want to use them to fetch the top 500 edges in descending order. However, the execution time is the same as without the vertex-centric index. How can I use the index to sort faster and extract the first 500 edges more quickly?

Here is the schema I built:

graph = JanusGraphFactory.open('janusgraph-cql-es-server-test2.properties')
mgmt = graph.openManagement()
mgmt.makeVertexLabel('VirtualAddress').make()
addr = mgmt.makePropertyKey('address').dataType(String.class).cardinality(SINGLE).make()
token_addr = mgmt.makePropertyKey('token_addr').dataType(String.class).cardinality(SINGLE).make()
transfer_to=mgmt.makeEdgeLabel('TRANSFER_TO').multiplicity(MULTI).make()
amount = mgmt.makePropertyKey('amount').dataType(Double.class).cardinality(SINGLE).make()
tx_hash = mgmt.makePropertyKey('tx_hash').dataType(String.class).cardinality(SINGLE).make()
tx_index = mgmt.makePropertyKey('tx_index').dataType(Integer.class).cardinality(SINGLE).make()
created_time = mgmt.makePropertyKey('created_time').dataType(Date.class).cardinality(SINGLE).make()
updated_time = mgmt.makePropertyKey('updated_time').dataType(Date.class).cardinality(SINGLE).make()
mgmt.buildIndex('addressComposite', Vertex.class).addKey(addr).buildCompositeIndex()
mgmt.buildIndex('addressTokenUniqComposite', Vertex.class).addKey(addr).addKey(token_addr).unique().buildCompositeIndex()
mgmt.buildEdgeIndex(transfer_to,"transferOutAmountTs", Direction.OUT, Order.desc,amount,created_time)
mgmt.buildEdgeIndex(transfer_to,"transferOutTs", Direction.OUT, Order.desc,created_time)
mgmt.commit()

Here is the data I inserted: one starting vertex, a million edges attached to it, and 100 endpoints.

graph_conf = 'janusgraph-cql-es-server-test2.properties'
graph = JanusGraphFactory.open(graph_conf)
g = graph.traversal()
String line = "5244613,tx_hash_00,token_addr_00,from_addr_00,to_addr_00,,,6000,19305.57174337591,72,1520896044"
int start_value = 1
int end_value = 1000000
line = "tx_hash_00,token_addr_00,from_addr_00,to_addr_00,6000,72,1520896044"
columns = line.split(',', -1)
(tx_hash, token_addr, from_addr, to_addr, amount, log_index, timestamp) = columns
from_addr_node = g.addV('VirtualAddress').property('address', from_addr).property('token_addr', token_addr).next()
from_id = from_addr_node.id()
amount = amount.toBigDecimal()
tx_index = log_index.toInteger()
for (int i = start_value; i <= end_value; i++) {
    // One new target vertex per edge
    to_addr_node = g.addV('VirtualAddress').property('address', to_addr + String.valueOf(i)).property('token_addr', token_addr).next()
    to_id = to_addr_node.id()
    Date ts = new Date((timestamp.toLong() - i) * 1000)
    g.addE('TRANSFER_TO').from(g.V(from_id)).to(g.V(to_id))
        .property('amount', amount + i)
        .property('tx_hash', tx_hash)
        .property('tx_index', tx_index + i)
        .property('created_time', ts)
        .next()
    // Every 20000 edges: commit, then close and reopen the graph
    if (i % 20000 == 0) {
        println("[total:${i}]")
        Thread.sleep(500)
        g.tx().commit()
        graph.close()
        Thread.sleep(5000)
        graph = JanusGraphFactory.open(graph_conf)
        g = graph.traversal()
        Thread.sleep(5000)
    }
    g.tx().commit()
}
graph.close()

Here is my query:

g.V().has('address', 'from_addr_00').outE('TRANSFER_TO').order().by('amount', desc).limit(500).valueMap().toList()
You already specified that the vertex-centric index on the amount key is ordered when you created the index. If you explicitly reorder the results in the traversal, the index cannot take effect, because the reordering requires all edges to be retrieved instead of just the first 500.
HTH, Marc
(In reply to 18...@..., Saturday, 25 July 2020, 20:52:37 UTC+2.)
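
The point about reordering can be illustrated with a toy model. This is plain Python, not JanusGraph code: `CountingEdges`, `top_k_by_sorting` and `top_k_from_sorted_index` are hypothetical names, and the "index" is assumed to behave like a list that is already stored in descending order of amount. The sketch only counts how many entries each approach has to read.

```python
import heapq
import random
from itertools import islice


class CountingEdges:
    """Holds edge amounts and counts how many entries have been read."""

    def __init__(self, amounts):
        self._amounts = amounts
        self.reads = 0

    def scan(self):
        for amount in self._amounts:
            self.reads += 1
            yield amount


def top_k_by_sorting(edges, k):
    # Models an explicit order().by('amount', desc).limit(k) over unordered
    # edges: every edge is read before the k largest are known.
    return heapq.nlargest(k, edges.scan())


def top_k_from_sorted_index(index, k):
    # Models reading entries that are already stored in descending order:
    # islice stops after k entries without pulling any more.
    return list(islice(index.scan(), k))


n, k = 100_000, 500
amounts = [6000 + i for i in range(1, n + 1)]  # same shape as amount + i in the loader

shuffled = amounts[:]
random.Random(0).shuffle(shuffled)

unordered = CountingEdges(shuffled)
presorted = CountingEdges(sorted(amounts, reverse=True))

slow = top_k_by_sorting(unordered, k)
fast = top_k_from_sorted_index(presorted, k)

print("same result:", slow == fast)
print("entries read when sorting explicitly:", unordered.reads)
print("entries read from presorted index:", presorted.reads)
```

Both approaches return the same 500 amounts, but the explicit sort has to read every entry while the presorted read stops after 500. How JanusGraph actually plans the traversal may differ; the model only shows why the two costs can diverge.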
How can I use the index to get the top 500 edges, sorted by amount in descending order, faster?
(In reply to HadoopMarc, Sunday, 26 July 2020, 3:19:56 PM UTC+8.)
g.V().has('address', 'from_addr_00').outE('TRANSFER_TO').has('amount', gte(6000)).limit(500).valueMap().toList()

I am not sure the has() step is even necessary, or maybe just has('amount') is sufficient to trigger the (already sorted) index.
Best wishes,
Marc
(In reply to 18...@..., Sunday, 26 July 2020, 11:15:16 UTC+2.)
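
Why a range predicate followed by a limit can be cheap on a sorted index can be sketched as follows. Again this is a plain-Python toy model, not JanusGraph's implementation: `prefix_end` and `has_gte_limit` are hypothetical helpers, and the index is assumed to be a list of amounts in descending order. In such a list, all values greater than or equal to a bound form a contiguous prefix, so the limit only needs the first entries of that prefix.

```python
def prefix_end(desc_amounts, lower_bound):
    """Return the position of the first entry < lower_bound in a descending list."""
    lo, hi = 0, len(desc_amounts)
    while lo < hi:
        mid = (lo + hi) // 2
        if desc_amounts[mid] >= lower_bound:
            lo = mid + 1  # mid still satisfies the predicate, look further right
        else:
            hi = mid      # mid is below the bound, the prefix ends at or before mid
    return lo


def has_gte_limit(desc_amounts, lower_bound, limit):
    """Model has('amount', gte(lower_bound)).limit(limit) on a descending index."""
    end = prefix_end(desc_amounts, lower_bound)
    # Matching entries are desc_amounts[:end]; the limit takes its first entries.
    return desc_amounts[:min(end, limit)]


# Amounts 6001..1006000, stored in descending order (same shape as the loader's data).
desc_index = [6000 + i for i in range(1_000_000, 0, -1)]

top = has_gte_limit(desc_index, 6000, 500)
narrow = has_gte_limit(desc_index, 1_005_900, 500)
empty = has_gte_limit(desc_index, 2_000_000, 500)

print(len(top), top[0], top[-1])
print(len(narrow))
print(empty)
```

With a bound of 6000 every edge matches, so the result is simply the 500 largest amounts, already in descending order and without any explicit sort step.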
Hi Marc,
After testing, I found that the has() step is necessary. This solution is very effective. Thank you for all your assistance.

Warm regards,
Leah

(In reply to HadoopMarc, Monday, 27 July 2020, 3:26:38 AM UTC+8.)