[DISCUSS] Should the two GraphFactories have graph-deletion wrapper methods?
David Pitera <piter...@...>
The discussion so far has mostly taken place on my PR, starting at this high-level comment and continuing through the last comment: https://github.com/JanusGraph/janusgraph/pull/392#issuecomment-313802891
To summarize, Sjudeng so far is of the attitude that the `JanusGraphFactory` and `ConfiguredGraphFactory` should not have wrapper methods to delete your graphs because it is dangerous and not advised in a production scenario.
So far I and Keith Lohnes are of the attitude that the factories should enable graph deletion for:
1. A complete UX
2. A complete DB management toolbox
3. If users do not delete their graphs correctly, i.e. by (1) removing the graph from the cache, (2) closing it, and (3) clearing the storage, in that order, they can find their DB in an unexpected state.
I would appreciate some thoughts/comments on the matter to get the discussion going. I want to get this PR in for 0.2.0 and would also like this change in 0.2.0.
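To make the ordering concern in point 3 concrete, here is a minimal sketch of what a factory-level wrapper could look like. It is only an illustration under stated assumptions: `SketchGraph`, `SketchGraphFactory`, and `drop` are hypothetical stand-ins written for this example and are not the JanusGraph API or the code in the PR. The point is that the wrapper owns the sequence (remove from cache, close, clear storage), so the caller cannot get it wrong.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Entry point for the sketch: opens a graph, drops it, prints the recorded step order.
final class DropOrderDemo {
    public static void main(String[] args) {
        SketchGraphFactory factory = new SketchGraphFactory();
        factory.open("graph1");
        factory.drop("graph1");
        System.out.println(factory.log);
    }
}

// Hypothetical stand-in for an open graph instance (NOT the JanusGraph API).
final class SketchGraph {
    final String name;
    private final List<String> log;
    boolean open = true;
    boolean storageCleared = false;

    SketchGraph(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    void close() {
        open = false;
        log.add("close");
    }

    // Refuses to clear storage while the graph is still open,
    // so a wrong ordering fails loudly instead of corrupting state.
    void clearStorage() {
        if (open) {
            throw new IllegalStateException("graph must be closed before its storage is cleared");
        }
        storageCleared = true;
        log.add("clearStorage");
    }
}

// Hypothetical factory whose drop() wrapper fixes the order of the three steps.
final class SketchGraphFactory {
    private final Map<String, SketchGraph> cache = new HashMap<>();
    final List<String> log = new ArrayList<>();

    SketchGraph open(String name) {
        return cache.computeIfAbsent(name, n -> new SketchGraph(n, log));
    }

    boolean isCached(String name) {
        return cache.containsKey(name);
    }

    void drop(String name) {
        SketchGraph graph = cache.remove(name);   // step 1: remove from cache
        if (graph == null) {
            throw new IllegalArgumentException("unknown graph: " + name);
        }
        log.add("removeFromCache");
        graph.close();                            // step 2: close
        graph.clearStorage();                     // step 3: clear storage
    }
}
```

Without such a wrapper, a user calling `clearStorage()` on a still-open graph would hit the `IllegalStateException` in this sketch; in a real deployment the failure mode could be less visible, which is the argument for providing the wrapper.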
Benjamin Anderson <b...@...>
Good tools usually have sharp edges; infantilizing users by protecting
them from themselves is not traditionally well-received in the systems
programming world. Is there any technical benefit to the elision?
E.g., does including a graph deletion method preclude us from
implementing optimizations we'd prefer to take advantage of? Dropping
data is a standard DBMS function (relational or otherwise), so users
are almost certain to ask for it; "we don't want you to hurt yourself"
is a bit of a lame reason to give them.
From another perspective: if there is in fact risk of the users doing
the /wrong/ thing here, e.g., executing the steps in the wrong order,
and compromising system integrity, then it seems we'd be doing them a
service by simplifying the process.
+1 for inclusion of the deletion methods.
On Thu, Jul 13, 2017 at 9:52 AM, David Pitera <piter...@...> wrote:
The discussion so far has mostly taken place on my PR here:
David, Keith and Benjamin - Thanks a lot for your work and insights on this. I like the sharp edges analogy. You've got me convinced, I'm good with adding the drop functionality to the core factory API. Maybe give it another few days for any lingering comments here and then move forward?
If/when the delete functionality is added to the core factory API, we should make sure it's a complete feature and behaves in the expected manner, which, as has been pointed out in the PR and here, is the equivalent of DBMS drop functionality across all storage and indexing backends. The test case for each storage/indexing backend would be "assert keyspace/table/index/etc. does not exist"? Agreed?
One thing to note on this is that existing delete implementations in the storage/indexing backends are not generally consistent with the above expected behavior as it appears they were written to support the testing rather than production use case. For example, Cassandra CQL, Thrift, Astyanax and Embedded store managers all truncate instead of drop keyspaces. Likewise for HBaseStoreManager. ElasticSearchIndex appears to do the right thing (though need to confirm this works as expected with multi-type indices in pending-merge #336) as does LuceneIndex. On the other hand SolrIndex does not appear to delete collections as would be expected.
The challenge here will be to support dropping keyspaces/etc. through the proposed core factory API delete functionality but still maintain the existing clearStorage implementations for testing. The latter is critical because it's been found that dropping keyspaces/tables during testing has a significant impact on performance (see #384 which showed truncation, among other updates, decreased runtime for the TinkerPop test suite in the CQL module from 15 hours to 3 hours).
I'm thinking this might be why the existing delete implementation is not already in the core factory API ... the current implementation does not consistently perform the expected operation and until it does the feature is hidden off in the JanusGraphCleanup utility class.
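As a rough illustration of the truncate-versus-drop distinction and of the proposed "does not exist" test, here is a small in-memory sketch. `SketchBackend` and its methods are hypothetical and model a backend as a map from keyspace name to rows; they are not the store manager or index provider interfaces, and a real test would query the actual backend for the keyspace/table/index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Entry point for the sketch: shows that truncation leaves the keyspace in place
// while a drop removes it.
final class TruncateVsDropDemo {
    public static void main(String[] args) {
        SketchBackend backend = new SketchBackend();
        backend.insert("janusgraph", "row1");

        backend.truncate("janusgraph");
        System.out.println("exists after truncate: " + backend.exists("janusgraph"));

        backend.drop("janusgraph");
        System.out.println("exists after drop: " + backend.exists("janusgraph"));
    }
}

// Hypothetical in-memory model of a storage backend: keyspace name -> rows.
final class SketchBackend {
    private final Map<String, List<String>> keyspaces = new HashMap<>();

    void insert(String keyspace, String row) {
        keyspaces.computeIfAbsent(keyspace, k -> new ArrayList<>()).add(row);
    }

    // Test-oriented cleanup: removes the rows but keeps the keyspace itself.
    void truncate(String keyspace) {
        List<String> rows = keyspaces.get(keyspace);
        if (rows != null) {
            rows.clear();
        }
    }

    // DBMS-style drop: the keyspace itself no longer exists afterwards.
    void drop(String keyspace) {
        keyspaces.remove(keyspace);
    }

    boolean exists(String keyspace) {
        return keyspaces.containsKey(keyspace);
    }

    int rowCount(String keyspace) {
        List<String> rows = keyspaces.get(keyspace);
        return rows == null ? 0 : rows.size();
    }
}
```

Keeping `truncate` and `drop` as separate operations mirrors the challenge described above: tests could keep using the cheaper truncation, while the factory-level delete would be checked with an "exists is false" assertion.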
On Friday, July 14, 2017 at 1:02:49 PM UTC-5, Benjamin Anderson wrote:
Good tools usually have sharp edges; infantilizing users by protecting