I am using Apache Jena 5.1.
Consider the following RDF in Turtle format. The entities have all been collapsed
into one namespace for simplicity:
@prefix ex: <http://example.org/foo#> .
ex:thing1 ex:connectsTo ex:thing2 .
ex:thing2 ex:connectsTo ex:thing3, ex:thing4 .
ex:thing3 ex:connectsTo ex:thing5 .
ex:thing7 ex:whatever "not in this film" .
ex:thing8 ex:connectsTo ex:thing5 .
I seek a SPARQL statement that, given an initial pipeline producing 1 or more
matches for a subject, e.g. ex:thing1 or (ex:thing1,ex:thing8), will produce a
proper recursive “from-to” result.
Assume we have a bound variable named components that can be 0 or more of the
ex:thingN entities.
If components is just ex:thing1, then the result will be something like the
following. I am not wed to this exact output format, but the precise walk down
the dependency chain is critical. Any alternative result from the engine, for
example individual from-to pairs, is fine.
ex:thing1 -> ex:thing2 -> ex:thing3 -> ex:thing5
ex:thing1 -> ex:thing2 -> ex:thing4
If components is (ex:thing1,ex:thing8), then the result will be:
ex:thing1 -> ex:thing2 -> ex:thing3 -> ex:thing5
ex:thing1 -> ex:thing2 -> ex:thing4
ex:thing8 -> ex:thing5
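To pin down the semantics I am after, here is a minimal Python sketch of the walk. It is neither SPARQL nor Jena. The EDGES dictionary is a hand-copied stand-in for the ex:connectsTo triples above. The function names from_to_pairs and chains are made up for illustration, and chains assumes the data has no cycles.

```python
from collections import deque

# Hypothetical in-memory copy of the ex:connectsTo triples from the sample data.
EDGES = {
    "ex:thing1": ["ex:thing2"],
    "ex:thing2": ["ex:thing3", "ex:thing4"],
    "ex:thing3": ["ex:thing5"],
    "ex:thing8": ["ex:thing5"],
}


def from_to_pairs(components, edges=EDGES):
    """Return every (start, end) edge reachable from any seed in components."""
    seen = set()
    pairs = []
    queue = deque(components)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already expanded; also guards against cycles
        seen.add(node)
        for target in edges.get(node, []):
            pairs.append((node, target))
            queue.append(target)
    return pairs


def chains(components, edges=EDGES):
    """Return each seed-to-leaf walk as a list of nodes (assumes no cycles)."""
    result = []

    def walk(path):
        targets = edges.get(path[-1], [])
        if not targets:
            if len(path) > 1:  # skip seeds that connect to nothing
                result.append(path)
            return
        for target in targets:
            walk(path + [target])

    for seed in components:
        walk([seed])
    return result


if __name__ == "__main__":
    for chain in chains(["ex:thing1", "ex:thing8"]):
        print(" -> ".join(chain))
```

Either shape would work for me: from_to_pairs gives the individual from-to edges, and chains gives the full walks shown above. An empty components list yields an empty result in both cases.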
The closest I got is this:
PREFIX ex: <http://example.org/foo#>

SELECT DISTINCT ?start ?end
WHERE {
  # For simplicity just bind one value -- but ?components could be 2 or more!
  BIND(<http://example.org/foo#thing1> AS ?components)

  # Create the initial tree and get level 0 connectsTo:
  {
    ?components ex:connectsTo ?end .
    BIND(?components AS ?start)
  }
  UNION
  # Now find all the rest:
  {
    ?start ex:connectsTo ?end .
    FILTER EXISTS {
      ?components ex:connectsTo+ ?start .
    }
  }
}
But the result coming back from Jena bizarrely includes ex:thing7.
I have to believe there is a SPARQL recipe to walk a recursive list and produce
not just “a bag of all the dependencies” but a proper dependency graph, maybe
similar to what Maven would produce for the transitive closure of, e.g.,
com.google.guava:guava:jar:33.2.1-jre with output format tgf:
[INFO] 1829313709 com.google.guava:guava:jar:33.2.1-jre:compile
[INFO] 2087075503 com.google.guava:failureaccess:jar:1.0.2:compile
[INFO] 1996385500 com.google.guava:listenablefuture:jar:9999.0-empty-to-avoid-conflict-with-guava:compile
[INFO] 804993772 com.google.code.findbugs:jsr305:jar:3.0.2:compile
[INFO] 2020602315 org.checkerframework:checker-qual:jar:3.42.0:compile
[INFO] 1174641185 com.google.errorprone:error_prone_annotations:jar:2.26.1:compile
[INFO] 1126074033 com.google.j2objc:j2objc-annotations:jar:3.0.0:compile
[INFO] #
[INFO] 1829313709 2087075503 compile
[INFO] 1829313709 1996385500 compile
[INFO] 1829313709 804993772 compile
[INFO] 1829313709 2020602315 compile
[INFO] 1829313709 1174641185 compile
[INFO] 1829313709 1126074033 compile
[INFO] 2067586671 1829313709 compile
Again, the less formatted the output the better, as long as the from-to
dependencies are captured.
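For reference, turning plain from-to pairs into TGF-like text is trivial on the client side, which is why raw pairs from the engine are enough for me. A minimal Python sketch follows. The helper name to_tgf and the sequential node ids are my own choices for illustration, not anything Maven or Jena produces.

```python
def to_tgf(pairs):
    """Render (start, end) pairs as Trivial Graph Format text."""
    ids = {}
    for start, end in pairs:
        for node in (start, end):
            # Assign sequential ids in order of first appearance.
            ids.setdefault(node, len(ids) + 1)
    lines = [f"{node_id} {label}" for label, node_id in ids.items()]
    lines.append("#")  # TGF separator between the node list and the edge list
    lines.extend(f"{ids[start]} {ids[end]}" for start, end in pairs)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical from-to pairs, as they might come back for ex:thing1.
    sample = [
        ("ex:thing1", "ex:thing2"),
        ("ex:thing2", "ex:thing3"),
        ("ex:thing2", "ex:thing4"),
        ("ex:thing3", "ex:thing5"),
    ]
    print(to_tgf(sample))
```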
Any suggestions?