Raghavendar T S <raghav...@...>
Hi. We are a charging station company based in India, and we are building an IoT platform to manage our charging stations. We plan to use JanusGraph (Cassandra for storage, Elasticsearch as the index store) as our primary database and source of truth for real-world entities and relationships. Clients, including our web and mobile applications, call our APIs, and each request is first persisted to the graph and published to a message queue. Stream processors running in the background build denormalized views in Elasticsearch: we use the graph for multi-level traversals and persist the results in ES.
Is it recommended to use JanusGraph as a primary database? We are concerned about production issues we might face and about the support we can expect from the community. I am fairly sure that many companies use JanusGraph in production, and I just want to gain some confidence. Are any companies using JanusGraph for real-time, client-facing applications rather than only for analytics? Your input would help us make a better decision.
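To make the write path described above concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the class name `PlatformSketch`, the method names, and the sample vertex ids are hypothetical, and plain Python containers stand in for JanusGraph, the message queue, and Elasticsearch. It only shows the order of operations (graph write, event, traversal, denormalized view), not a real client.

```python
import queue
from collections import defaultdict


class PlatformSketch:
    """In-memory stand-ins for the graph, the message queue, and the search index."""

    def __init__(self):
        self.edges = defaultdict(set)   # stand-in for the graph: vertex -> neighbours
        self.events = queue.Queue()     # stand-in for the message queue
        self.views = {}                 # stand-in for denormalized Elasticsearch documents

    def handle_api_write(self, source, target):
        """API path: persist the relationship to the graph, then publish an event."""
        self.edges[source].add(target)
        self.events.put(source)

    def traverse(self, start, max_depth):
        """Multi-level traversal: vertices reachable from start within max_depth hops."""
        reached, frontier = set(), {start}
        for _ in range(max_depth):
            neighbours = {n for v in frontier for n in self.edges.get(v, ())}
            frontier = neighbours - reached - {start}
            if not frontier:
                break
            reached |= frontier
        return reached

    def process_events(self, max_depth=2):
        """Stream-processor path: rebuild one denormalized view per queued event."""
        while not self.events.empty():
            vertex = self.events.get()
            self.views[vertex] = {
                "id": vertex,
                "reachable": sorted(self.traverse(vertex, max_depth)),
            }


if __name__ == "__main__":
    platform = PlatformSketch()
    platform.handle_api_write("site-1", "station-1")       # hypothetical ids
    platform.handle_api_write("station-1", "connector-1")
    platform.process_events()
    print(platform.views["site-1"])
```

Note that a view is only as fresh as the last event processed for its vertex, which is one reason the graph, and not the denormalized index, is treated as the source of truth in this design.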
Thanks & Regards Raghavendar T S
|
|
Oleksandr Porunov <alexand...@...>
Hi,
In short, it depends on your data. Some data is very well suited to JanusGraph and some isn't. JanusGraph is a graph database layer on top of other databases, so you need to understand your data to know which storage backend to use (I suggest thinking in terms of the CAP theorem).
Cassandra consistency is configurable, but there is no transaction isolation. Elasticsearch consistency depends on the index refresh configuration and on indexing time. If your entities are added or updated rarely, you can configure Cassandra for write consistency ALL and use the Elasticsearch refresh API to make sure all your data is consistent; you "may" then use read consistency level ONE. Write QUORUM combined with read QUORUM also guarantees consistent responses, but you should check the tradeoffs for your own project.
JanusGraph is very well suited to data with many relations. I guess that, since your project is IoT, your data will be highly connected and therefore a good fit for JanusGraph.
Also, I find that ScyllaDB has better overall performance than Cassandra, so it may be a good choice for real-time data, but you should check it against your own scenarios because ScyllaDB uses more CPU time than Cassandra (to reduce latency).
That said, JanusGraph is well suited to real-time workloads as well as analytics if everything is well configured (storage, index storage, and JanusGraph itself). As a small suggestion: if you are using a JVM-based language and your real-time traversals are not very complex, I would recommend using JanusGraph in embedded mode (https://docs.janusgraph.org/basics/configuration/#janusgraph-embedded), as it lets you benefit from the DataStax Cassandra driver's query routing optimizations and eliminates an additional hop. If, on the other hand, your queries are complex, it may be better to use JanusGraph servers located closer to the storage servers. Again, this should be checked for your specific scenarios.
The above doesn't answer your questions precisely, but maybe it helps somehow.
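The consistency-level combinations mentioned above (write ALL with read ONE, or QUORUM with QUORUM) follow the usual rule of thumb that a read is guaranteed to touch at least one replica holding the latest acknowledged write when W + R > RF. The sketch below only does that arithmetic; the function names and the replication factor of 3 are assumptions for illustration, and it ignores multi-datacenter levels such as LOCAL_QUORUM.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge an operation at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_overlap_writes(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when every read replica set must intersect every write replica set (W + R > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, for illustration only
    for write_level, read_level in [("ALL", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ONE")]:
        ok = reads_overlap_writes(write_level, read_level, rf)
        print(f"write={write_level:<6} read={read_level:<6} overlap guaranteed: {ok}")
```

The tradeoff is availability: write ALL fails as soon as a single replica is unreachable, while QUORUM keeps working with a minority of replicas down.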
|
|
Raghavendar T S <raghav...@...>
Hi Oleksandr
Thank you for the detailed and helpful explanation. I have good experience with DataStax Graph from my previous organisation, and we are sure that our use cases will fit JanusGraph. Since we are a startup, we are not ready to use DataStax Graph because of the licensing cost. Can you also give some information on how production issues are generally resolved, in case we face any? Backup and restore of the Cassandra database is one option. We do not yet know which issues we are going to face; our only concern is production support.
Thanks & Regards Raghavendar T S
|
|
Oleksandr Porunov <alexand...@...>
If your startup doesn't want to spend money on production support (i.e. licensing), the only options are to hire an expert in the specific field (which pays off in the long term but not in the short term) or to rely on community support (which is free, but offers a lower level of support than a paid license). Both Cassandra and ScyllaDB have quite good community support. You can subscribe to the Cassandra mailing lists here: https://cassandra.apache.org/community/ Or use the ScyllaDB Google group here: https://groups.google.com/forum/#!forum/scylladb-users
Moreover, many companies have special pricing plans for startups, which might be quite good, so I would suggest contacting them directly and asking whether they have special offers or discounts for startups.
My advice is not to worry about production support at such an early stage. If your startup becomes profitable, you will most likely have the money to buy production licenses or hire a specialist in the specific field. If your startup doesn't take off, then it doesn't matter whether any support is available, because you won't need it. Of course, choosing the right data store is critical for the business, but it is very rare to see a startup fail because of the data store or other technologies it chose.
Getting back to your original question: it depends on your size and your concrete use case. Backup and restore is good, of course, but restoring a large data set takes time, which is sometimes critical. I would suggest configuring proper replication, running compaction regularly, using monitoring, and trying to prevent scenarios in which you need to restore your data from backup.
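As a rough illustration of why replication is preferable to relying on restores, the sketch below compares two back-of-the-envelope numbers: how many replicas can be down before requests at a given consistency level start failing, and how long a restore takes at a given throughput. The function names, the 2000 GB data size, and the 100 MB/s throughput are hypothetical inputs, not measurements from any real cluster.

```python
def tolerated_replica_failures(level: str, replication_factor: int) -> int:
    """Replicas that may be unavailable while requests at this level still succeed."""
    required = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }[level]
    return replication_factor - required


def estimated_restore_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Time to copy a backup back at a sustained throughput (decimal units)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for rf in (1, 3, 5):
        for level in ("ONE", "QUORUM", "ALL"):
            down = tolerated_replica_failures(level, rf)
            print(f"RF={rf} level={level:<6} tolerates {down} replica(s) down")
    # Hypothetical figures: a 2000 GB data set restored at 100 MB/s.
    print(f"restore estimate: {estimated_restore_hours(2000, 100):.1f} hours")
```

With a replication factor of 1 nothing is tolerated at any level, so every node failure turns into a restore; higher replication factors let the cluster absorb failures while a node is repaired or replaced.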
I wish your startup luck!
|
|
Raghavendar T S <raghav...@...>
Hi Oleksandr
Thank you very much for the valuable information. Just for your information, we contacted DataStax: they have no startup program, and we would have to purchase a license if we run it in production.
Thanks & Regards Raghavendar T S
--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/9a642f9b-ba51-4742-a982-d3e5283c9e3e%40googlegroups.com.
|
|
Samik Raychaudhuri <sam...@...>
Based on our experience, and on the talks I have heard at various meetups, I wouldn't recommend using JanusGraph + Cassandra + ES as the back end for OLTP-type queries. Consider using a different database (even Cassandra), or a caching layer in between, for real-time purposes. The best performance you can get is as Oleksandr suggests: use JanusGraph in embedded mode, but I suspect that will not satisfy your requirements completely.
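A "caching layer in between" can take many forms; one common one is a read-through cache with a time-to-live in front of the graph lookups. The sketch below is a minimal in-process version under stated assumptions: `ReadThroughCache`, the loader function, and the 30-second TTL are all hypothetical, and a real deployment would more likely use a shared cache service and would need an invalidation strategy for writes.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Serve repeated lookups from memory; call the loader on a miss or after expiry."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader            # e.g. a function that runs a graph traversal
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]              # fresh hit: no call to the backing store
        value = self._loader(key)        # miss or expired: go to the backing store
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a cached entry, for example after the entity is updated."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    def load_station(station_id):
        # Hypothetical stand-in for a traversal against the graph.
        print(f"loading {station_id} from the graph")
        return {"id": station_id, "status": "available"}

    cache = ReadThroughCache(load_station, ttl_seconds=30)
    cache.get("station-1")   # goes to the loader
    cache.get("station-1")   # served from the cache
```

The TTL bounds how stale a client-facing read can be, so it has to be chosen against how quickly station state changes.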
Best.
-Samik
|
|
Oleksandr Porunov <alexand...@...>
Hi Samik,
Would you mind sharing your experience and the talks you mentioned? What led you to that conclusion?
Here are some positive reports from teams using JanusGraph + ScyllaDB + Elasticsearch in production: https://www.scylladb.com/2020/05/14/zeotap-a-graph-of-twenty-billion-ids-built-on-scylla-and-janusgraph/ https://www.scylladb.com/2020/02/04/fireeye-providing-real-time-threat-analysis-using-a-graph-database/
Also, a tutorial that might be helpful: https://www.scylladb.com/2019/05/14/powering-a-graph-data-system-with-scylla-janusgraph/
It would be really helpful if you could share your experience or research as well, even if that experience is negative.
Best regards, Oleksandr
|
|