I am trying to upload RDF data stored in .ttl files on my computer to an Apache Jena Fuseki server. I ran Fuseki as a standalone server, following the guidance on the Apache Jena Fuseki documentation page (https://jena.apache.org/documentation/fuseki2/fuseki-webapp.html#fuseki-web-application) and an online article (https://medium.com/@fadirra/setting-up-jena-fuseki-with-update-in-windows-10-2c8a2802ee8f).
The server appears to be running when I go to localhost:3030. The code I wrote for uploading the data works fine for small files. However, for large files the data does not get uploaded. Looking at the server logs, I found the following error:
<code>Caused by: java.lang.IllegalStateException: form too large > 20000000
at org.eclipse.jetty.server.FormFields.checkMaxLength(FormFields.java:318) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.server.FormFields.parse(FormFields.java:307) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.server.FormFields.parse(FormFields.java:39) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.io.content.ContentSourceCompletableFuture.parse(ContentSourceCompletableFuture.java:104) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1212) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.server.handler.ContextRequest$OnContextDemand.run(ContextRequest.java:74) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.util.thread.SerializedInvoker$Link.run(SerializedInvoker.java:191) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.server.internal.HttpConnection$DemandContentCallback.succeeded(HttpConnection.java:679) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99) ~[fuseki-server.jar:5.0.0]
at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) ~[fuseki-server.jar:5.0.0]
</code>
Here is the code I used for uploading the RDF data:
<code>from SPARQLWrapper import SPARQLWrapper, POST

input_location = "C:/......../Added_Triples.ttl"

# Read the whole Turtle file into memory
with open(input_location, 'r') as f:
    content = f.read()

# Drop the Turtle @prefix lines; the prefixes are redeclared in SPARQL syntax below
rdf_string_no_prefixes = "\n".join(line for line in content.split("\n") if not line.startswith("@prefix"))

# Wrap the remaining triples in a single INSERT DATA update
update_query = """
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX CSRO: <http://www.semanticweb.org/aagr657/ontologies/2023/9/CraneSpaceRepresentationOntology#>
PREFIX LinkOnt: <http://purl.org/ConstructLinkOnt/LinkOnt#>
PREFIX bot: <https://w3id.org/bot#>
PREFIX expr: <https://w3id.org/express#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geom: <http://rdf.bg/geometry.ttl#>
PREFIX ifc: <https://standards.buildingsmart.org/IFC/DEV/IFC2X3/TC1/OWL>
PREFIX inst: <https://www.ugent.be/myAwesomeFirstBIMProject#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX sf: <http://www.opengis.net/ont/sf#>
PREFIX omg: <https://w3id.org/omg#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX lbd: <https://linkedbuildingdata.org/LBD#>
PREFIX props: <http://lbd.arch.rwth-aachen.de/props#>
PREFIX unit: <http://qudt.org/vocab/unit/>
PREFIX IFC4-PSD: <https://www.linkedbuildingdata.net/IFC4-PSD#>
PREFIX smls: <https://w3id.org/def/smls-owl#>
PREFIX fog: <https://w3id.org/fog#>
PREFIX cc: <http://creativecommons.org/ns#>
PREFIX dce: <http://purl.org/dc/elements/1.1/>
PREFIX express: <https://w3id.org/express#>
PREFIX list: <https://w3id.org/list#>
PREFIX vann: <http://purl.org/vocab/vann/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX : <https://standards.buildingsmart.org/IFC/DEV/IFC2x3/TC1/OWL#>
INSERT DATA {
%s
}
""" % (rdf_string_no_prefixes)

sparql = SPARQLWrapper("http://localhost:3030/your-dataset/update")
sparql.setMethod(POST)
sparql.setQuery(update_query)
# Execute the SPARQL Update query
sparql.query()
</code>
I read a few questions on Stack Overflow about similar errors with other servers, which suggested editing the jetty.xml file. However, in my case, I cannot find any such file on my computer. As mentioned above, the code works perfectly fine for smaller files; the issue only arises with bigger files.
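For reference, here is a rough way to check in advance whether a given update would cross the 20000000 figure from the log. This is only a sketch under one assumption: that the update travels as a URL-encoded form field named `update`, which the `FormFields` frames in the stack trace seem to suggest. The helper names `encoded_form_size` and `exceeds_form_limit` are hypothetical and not part of SPARQLWrapper or Fuseki.

```python
from urllib.parse import urlencode

# Limit taken from the Jetty log line "form too large > 20000000".
FORM_LIMIT_BYTES = 20_000_000


def encoded_form_size(update_query: str) -> int:
    """Bytes in the request body if the update is sent as a URL-encoded `update` field."""
    return len(urlencode({"update": update_query}).encode("ascii"))


def exceeds_form_limit(update_query: str, limit: int = FORM_LIMIT_BYTES) -> bool:
    """True if the encoded form body would be larger than the limit seen in the log."""
    return encoded_form_size(update_query) > limit


if __name__ == "__main__":
    sample = "INSERT DATA { <http://example.org/s> <http://example.org/p> 1 . }"
    print(encoded_form_size(sample), exceeds_form_limit(sample))
```

Note that percent-encoding inflates the payload: characters such as `<` and `#` become three bytes each, so a Turtle file noticeably smaller than 20 MB on disk may still exceed the limit once encoded.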
For the time being, I have split the bigger RDF files into smaller chunks and uploaded them separately. However, this takes a lot of time, because the time required for chunking is added on top of the upload. Therefore, I do not want to use this as a solution.
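The chunking workaround can be sketched roughly as below. It is an illustration, not the exact code used: it assumes the data has first been serialized with one complete triple per line and full IRIs (N-Triples style), since splitting ordinary Turtle by line would break statements that span several lines. The helper names `batched_lines` and `insert_data_queries` and the batch size are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched_lines(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` non-empty, stripped lines."""
    stripped = (line.strip() for line in lines)
    non_empty = (line for line in stripped if line)
    while True:
        batch = list(islice(non_empty, size))
        if not batch:
            return
        yield batch


def insert_data_queries(triple_lines: Iterable[str], size: int = 10_000) -> Iterator[str]:
    """Wrap each batch of one-triple-per-line input in its own INSERT DATA update."""
    for batch in batched_lines(triple_lines, size):
        yield "INSERT DATA {\n" + "\n".join(batch) + "\n}"
```

Each string yielded by `insert_data_queries` would then be sent as a separate update request. One caveat of this approach: a blank node label that appears in two different requests is treated as two different nodes, so it only suits data whose blank nodes do not span chunks.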
Any help on how to solve this issue without chunking would be appreciated. Ideally, I would like the whole graph file to be uploaded in one go, in the least time possible.