SPARQL 1.1 Specification for PostgreSQL Implementation
Version: SPARQL 1.1 (W3C Recommendation, March 2013)
Target: PostgreSQL Query Engine Implementation
Research Date: December 2025
Project: RuVector-Postgres
Table of Contents
- Introduction
- Core SPARQL Components
- SPARQL Syntax
- SPARQL Operations
- SPARQL Update
- Built-in Functions
- SPARQL Algebra
- Query Result Formats
- Implementation Considerations
- References
Introduction
SPARQL (SPARQL Protocol and RDF Query Language) is the W3C standard language for querying and manipulating RDF (Resource Description Framework) data. RDF is a directed, labeled graph data model that represents information as triples (subject, predicate, object).
SPARQL 1.1 Enhancements
SPARQL 1.1 adds significant features over SPARQL 1.0:
- Subqueries: Nested SELECT queries
- Value assignment: BIND and VALUES clauses
- Property paths: Regular expression-like path matching
- Aggregates: COUNT, SUM, AVG, MIN, MAX, GROUP_CONCAT, SAMPLE
- Negation: NOT EXISTS and MINUS operators
- Service federation: Querying remote SPARQL endpoints
- Update operations: INSERT, DELETE, LOAD, CLEAR, etc.
Core SPARQL Components
1. RDF Triple Model
The foundation of SPARQL is the RDF triple:
<subject> <predicate> <object>
Components:
- Subject: IRI or blank node
- Predicate: IRI only
- Object: IRI, blank node, or literal
Example:
<http://example.org/person/Alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/person/Bob>
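The position rules above can be sketched as a small data structure. This is a minimal illustration, not part of any existing library: the `Term` and `Triple` classes and the kind labels `"iri"`, `"bnode"`, and `"literal"` are assumptions chosen for this example.

```python
from dataclasses import dataclass

# Term kinds allowed in each triple position, as listed above.
SUBJECT_KINDS = {"iri", "bnode"}
PREDICATE_KINDS = {"iri"}
OBJECT_KINDS = {"iri", "bnode", "literal"}


@dataclass(frozen=True)
class Term:
    kind: str   # "iri", "bnode", or "literal" (hypothetical labels)
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Term
    predicate: Term
    object: Term

    def __post_init__(self):
        # Reject terms that are not legal in their position.
        if self.subject.kind not in SUBJECT_KINDS:
            raise ValueError("subject must be an IRI or blank node")
        if self.predicate.kind not in PREDICATE_KINDS:
            raise ValueError("predicate must be an IRI")
        if self.object.kind not in OBJECT_KINDS:
            raise ValueError("object must be an IRI, blank node, or literal")


# The example triple from the text.
alice_knows_bob = Triple(
    Term("iri", "http://example.org/person/Alice"),
    Term("iri", "http://xmlns.com/foaf/0.1/knows"),
    Term("iri", "http://example.org/person/Bob"),
)
```

Constructing a triple with a literal subject raises `ValueError`, which mirrors the constraint that only objects may be literals.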
2. Graph Patterns
Graph patterns are the building blocks of SPARQL queries:
Basic Graph Pattern (BGP)
A set of triple patterns that must all match:
?person foaf:name ?name .
?person foaf:age ?age .
Group Graph Pattern
Multiple patterns enclosed in braces:
{
?person foaf:name ?name .
?person foaf:age ?age .
FILTER(?age >= 18)
}
Optional Graph Pattern
Extends solutions with additional patterns if they match:
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
Semantics: LEFT JOIN - keeps all solutions from the first pattern whether or not the OPTIONAL pattern matches.
Union Graph Pattern
Alternatives - tries multiple patterns:
{
{ ?person foaf:name ?name }
UNION
{ ?person rdfs:label ?name }
}
Filter Pattern
Constrains solutions with boolean expressions:
?person foaf:age ?age .
FILTER(?age >= 21 && ?age < 65)
3. Query Forms
SPARQL defines four query forms:
SELECT Query
Returns variable bindings as a table:
SELECT ?name ?age
WHERE {
?person foaf:name ?name .
?person foaf:age ?age .
}
Returns: Solution sequence (table of variable bindings)
CONSTRUCT Query
Builds an RDF graph using a template:
CONSTRUCT {
?person ex:hasName ?name .
?person ex:hasAge ?age .
}
WHERE {
?person foaf:name ?name .
?person foaf:age ?age .
}
Returns: RDF graph
Shorthand:
CONSTRUCT WHERE {
?s ?p ?o .
}
ASK Query
Returns boolean indicating if pattern matches:
ASK {
?person foaf:name "Alice" .
}
Returns: true or false
DESCRIBE Query
Returns RDF description of resources:
DESCRIBE <http://example.org/person/Alice>
Returns: Implementation-specific RDF graph describing the resource
4. Solution Modifiers
Modifiers that transform query results:
ORDER BY
Sorts results by one or more expressions:
SELECT ?name ?age
WHERE { ?person foaf:name ?name ; foaf:age ?age }
ORDER BY DESC(?age) ?name
Options: ASC(?expr) (ascending, default), DESC(?expr) (descending)
DISTINCT
Removes duplicate solutions:
SELECT DISTINCT ?name
WHERE { ?person foaf:name ?name }
REDUCED
Allows (but doesn't require) duplicate elimination:
SELECT REDUCED ?name
WHERE { ?person foaf:name ?name }
LIMIT
Restricts result count:
SELECT ?name
WHERE { ?person foaf:name ?name }
LIMIT 10
OFFSET
Skips initial solutions:
SELECT ?name
WHERE { ?person foaf:name ?name }
OFFSET 20
LIMIT 10
GROUP BY
Groups solutions for aggregation:
SELECT ?company (COUNT(?employee) AS ?empCount)
WHERE {
?employee foaf:workplaceHomepage ?company .
}
GROUP BY ?company
HAVING
Filters grouped results:
SELECT ?company (COUNT(?employee) AS ?empCount)
WHERE {
?employee foaf:workplaceHomepage ?company .
}
GROUP BY ?company
HAVING (COUNT(?employee) >= 10)
SPARQL Syntax
PREFIX Declarations
Associate prefix labels with IRIs:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name
WHERE {
?person foaf:name ?name .
}
BASE Declaration
Define base IRI for relative IRIs:
BASE <http://example.org/>
SELECT ?name
WHERE {
<person/Alice> foaf:name ?name .
}
Variable Syntax
Variables start with ? or $:
?name
$age
Note: ?var and $var refer to the same variable.
URI/IRI Syntax
Three ways to specify IRIs:
- Full IRI: <http://example.org/resource>
- Prefixed name: prefix:localPart (e.g., foaf:name)
- Relative IRI: <resource> (resolved against BASE)
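A parser has to turn all three forms into absolute IRIs before matching. The sketch below shows one way to do that, assuming tokens arrive as strings and prefixes as a plain dictionary; `resolve_iri` is a hypothetical helper name, and `urllib.parse.urljoin` is used as a stand-in for full IRI reference resolution.

```python
from urllib.parse import urljoin


def resolve_iri(token, prefixes, base=None):
    """Return the absolute IRI for a <...> or prefix:local token."""
    if token.startswith("<") and token.endswith(">"):
        ref = token[1:-1]
        # Relative references are resolved against BASE when one is given;
        # absolute references pass through urljoin unchanged.
        return urljoin(base, ref) if base else ref
    prefix, sep, local = token.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {token!r}")
    # A prefixed name is the prefix IRI concatenated with the local part.
    return prefixes[prefix] + local


prefixes = {"foaf": "http://xmlns.com/foaf/0.1/"}
name_iri = resolve_iri("foaf:name", prefixes)
alice_iri = resolve_iri("<person/Alice>", prefixes, base="http://example.org/")
```

Here `name_iri` is `http://xmlns.com/foaf/0.1/name` and `alice_iri` is `http://example.org/person/Alice`.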
Literal Syntax
String Literals
"simple string"
'another string'
"""multi-line
string"""
'''another
multi-line'''
Numeric Literals
42 # xsd:integer
3.14 # xsd:decimal
1.5e6 # xsd:double
Boolean Literals
true
false
Language-Tagged Literals
"chat"@en
"chat"@fr
Typed Literals
"42"^^xsd:integer
"2025-12-09"^^xsd:date
"P1Y2M"^^xsd:duration
Blank Node Syntax
Labeled Blank Nodes
_:label
_:alice
Anonymous Blank Nodes
[] # Empty blank node
[ foaf:name "Alice" ] # Blank node with properties
Blank Node Property Lists
[
foaf:name "Alice" ;
foaf:age 30 ;
foaf:knows [ foaf:name "Bob" ]
]
Triple Pattern Abbreviations
Semicolon (;) - Shared Subject
?person foaf:name "Alice" ;
foaf:age 30 ;
foaf:knows ?friend .
# Equivalent to:
?person foaf:name "Alice" .
?person foaf:age 30 .
?person foaf:knows ?friend .
Comma (,) - Shared Subject-Predicate
?person foaf:knows ?bob, ?charlie, ?diana .
# Equivalent to:
?person foaf:knows ?bob .
?person foaf:knows ?charlie .
?person foaf:knows ?diana .
rdf:type Shorthand (a)
?person a foaf:Person .
# Equivalent to:
?person rdf:type foaf:Person .
Collections (RDF Lists)
?list rdf:rest*/rdf:first ?item .
# Or using collection syntax:
?x foaf:knows ( :Alice :Bob :Charlie ) .
SPARQL Operations
1. Pattern Matching
Basic Triple Patterns
SELECT ?subject ?object
WHERE {
?subject foaf:knows ?object .
}
Multiple Patterns (Conjunction)
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
?person foaf:email ?email .
}
2. FILTER Expressions
Apply constraints to solutions:
SELECT ?name ?age
WHERE {
?person foaf:name ?name .
?person foaf:age ?age .
FILTER(?age >= 18 && ?age < 65)
}
FILTER Operators
Logical:
- && (AND)
- || (OR)
- ! (NOT)
Comparison:
- = (equals)
- != (not equals)
- < (less than)
- > (greater than)
- <= (less than or equal)
- >= (greater than or equal)
Arithmetic:
- + (addition)
- - (subtraction)
- * (multiplication)
- / (division)
Other:
- IN (set membership)
- NOT IN (set non-membership)
3. OPTIONAL Patterns
Left join - augments solutions if pattern matches:
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
}
Multiple OPTIONAL blocks:
SELECT ?name ?email ?phone
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
OPTIONAL { ?person foaf:phone ?phone }
}
OPTIONAL with FILTER:
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
OPTIONAL {
?person foaf:email ?email .
FILTER(CONTAINS(?email, "@example.com"))
}
}
4. UNION Patterns
Disjunction - tries alternative patterns:
SELECT ?name
WHERE {
{
?person foaf:name ?name .
}
UNION
{
?person rdfs:label ?name .
}
}
5. Property Paths
Regular expression-like patterns over properties:
Operators
| Operator | Syntax | Description |
|---|---|---|
| Sequence | elt1 / elt2 | Follow elt1, then elt2 |
| Alternative | elt1 \| elt2 | Try elt1 or elt2 |
| Inverse | ^elt | Reverse direction (object to subject) |
| Zero or more | elt* | Zero or more occurrences |
| One or more | elt+ | One or more occurrences |
| Zero or one | elt? | Optional occurrence |
| Negation | !iri | Any property other than this one (applies to IRIs, not arbitrary paths) |
| Negated set | !(iri1 \| iri2) | None of these properties |
Examples
Transitive closure:
SELECT ?ancestor
WHERE {
?person foaf:knows+ ?ancestor .
}
Path sequence:
SELECT ?grandchild
WHERE {
?person ex:hasChild / ex:hasChild ?grandchild .
}
Inverse path:
SELECT ?child
WHERE {
?child ^ex:hasChild ?parent .
# Equivalent to:
?parent ex:hasChild ?child .
}
Alternative paths:
SELECT ?name
WHERE {
?person (foaf:name | rdfs:label) ?name .
}
Negated property set:
SELECT ?x ?y
WHERE {
?x !rdf:type ?y .
}
6. Subqueries
Nested SELECT queries:
SELECT ?name ?avgAge
WHERE {
{
SELECT ?company (AVG(?age) AS ?avgAge)
WHERE {
?employee foaf:workplaceHomepage ?company .
?employee foaf:age ?age .
}
GROUP BY ?company
}
?company rdfs:label ?name .
}
7. Negation
NOT EXISTS
Pattern must not match:
SELECT ?person
WHERE {
?person a foaf:Person .
FILTER NOT EXISTS { ?person foaf:email ?email }
}
MINUS
Set difference operation:
SELECT ?person
WHERE {
?person a foaf:Person .
MINUS { ?person foaf:email ?email }
}
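Difference: NOT EXISTS is a filter that tests the inner pattern with each solution's bindings substituted in. MINUS is a set operation that removes solutions compatible with some solution of the right-hand pattern, and it removes nothing when the two sides share no variables.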
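The two forms give the same answer for the queries above but diverge when the inner pattern shares no variable with the outer one. The following sketch models solutions as dictionaries; `compatible`, `minus`, and `filter_not_exists` are illustrative names, not engine APIs.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if shared variables agree."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def minus(left, right):
    """Drop a left solution if some right solution is compatible with it
    AND shares at least one variable with it."""
    return [
        m for m in left
        if not any(compatible(m, r) and (m.keys() & r.keys()) for r in right)
    ]


def filter_not_exists(left, pattern_matches):
    """Keep a left solution only if the inner pattern, evaluated with that
    solution's bindings substituted in, has no match."""
    return [m for m in left if not pattern_matches(m)]


# Outer pattern binds ?s; inner pattern { ?x ?y ?z } shares no variable with it.
outer = [{"s": "a"}]
inner = [{"x": "a", "y": "b", "z": "c"}]

after_minus = minus(outer, inner)
after_not_exists = filter_not_exists(outer, lambda m: len(inner) > 0)
```

With no shared variables, `after_minus` still contains the outer solution, while `after_not_exists` is empty because the inner pattern matches the data.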
8. VALUES
Inject inline data:
SELECT ?name ?age
WHERE {
VALUES (?person ?age) {
(<http://example.org/alice> 30)
(<http://example.org/bob> 25)
}
?person foaf:name ?name .
}
Single variable:
VALUES ?x { :a :b :c }
Multiple variables:
VALUES (?x ?y) {
(:a 1)
(:b 2)
(:c UNDEF)
}
9. BIND
Assign values to variables:
SELECT ?person ?fullName
WHERE {
?person foaf:givenName ?first .
?person foaf:familyName ?last .
BIND(CONCAT(?first, " ", ?last) AS ?fullName)
}
10. Aggregates
Aggregate Functions
- COUNT(?var) or COUNT(*) - Count solutions
- SUM(?expr) - Sum numeric values
- AVG(?expr) - Average numeric values
- MIN(?expr) - Minimum value
- MAX(?expr) - Maximum value
- GROUP_CONCAT(?expr) - Concatenate strings
- SAMPLE(?expr) - Arbitrary value
GROUP BY Example
SELECT ?company (COUNT(?employee) AS ?count) (AVG(?salary) AS ?avgSalary)
WHERE {
?employee foaf:workplaceHomepage ?company .
?employee ex:salary ?salary .
}
GROUP BY ?company
HAVING Example
SELECT ?company (AVG(?salary) AS ?avgSalary)
WHERE {
?employee foaf:workplaceHomepage ?company .
?employee ex:salary ?salary .
}
GROUP BY ?company
HAVING (AVG(?salary) > 50000)
11. Named Graphs
Query specific graphs:
SELECT ?name
FROM <http://example.org/graph1>
WHERE {
?person foaf:name ?name .
}
GRAPH keyword:
SELECT ?name ?graph
WHERE {
GRAPH ?graph {
?person foaf:name ?name .
}
}
Multiple graphs:
SELECT ?name
FROM <http://example.org/graph1>
FROM <http://example.org/graph2>
WHERE {
?person foaf:name ?name .
}
SPARQL Update
SPARQL 1.1 Update provides operations for modifying RDF graphs.
1. INSERT DATA
Add ground triples (no variables):
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
<http://example.org/alice> foaf:name "Alice" .
<http://example.org/alice> foaf:age 30 .
}
With named graph:
INSERT DATA {
GRAPH <http://example.org/graph1> {
<http://example.org/alice> foaf:name "Alice" .
}
}
Behavior:
- Creates graph if it doesn't exist (SHOULD)
- Blank nodes are "fresh" (distinct from existing nodes)
- No effect if triples already exist
2. DELETE DATA
Remove specific triples:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DELETE DATA {
<http://example.org/alice> foaf:age 30 .
}
With named graph:
DELETE DATA {
GRAPH <http://example.org/graph1> {
<http://example.org/alice> foaf:age 30 .
}
}
Behavior:
- No error if triples don't exist
- No variables or blank nodes allowed
3. DELETE/INSERT
Pattern-based updates using WHERE clause:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DELETE { ?person foaf:age ?age }
INSERT { ?person foaf:age 31 }
WHERE {
?person foaf:name "Alice" .
?person foaf:age ?age .
FILTER(?age = 30)
}
Order: The WHERE clause is evaluated once; all DELETE triples are then removed before any INSERT triples are added.
Only DELETE:
DELETE { ?person foaf:email ?email }
WHERE {
?person foaf:email ?email .
FILTER(CONTAINS(?email, "@oldcompany.com"))
}
Only INSERT:
INSERT {
?person foaf:age 0 .
}
WHERE {
?person a foaf:Person .
FILTER NOT EXISTS { ?person foaf:age ?age }
}
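The evaluation order of a DELETE/INSERT operation can be sketched over an in-memory set of triples. This is an assumption-laden model: the graph is a Python set of `(s, p, o)` tuples, and `delete_insert`, `where`, and the template callables are hypothetical names standing in for a real update executor.

```python
def delete_insert(graph, where, delete_template, insert_template):
    """Apply a DELETE/INSERT update to `graph`, a set of (s, p, o) tuples.

    `where` returns the solutions (dicts) for the WHERE clause; each
    template maps one solution to the triples to remove or add.
    """
    solutions = list(where(graph))      # WHERE is evaluated once, up front
    to_delete = {t for s in solutions for t in delete_template(s)}
    to_insert = {t for s in solutions for t in insert_template(s)}
    graph -= to_delete                  # deletions are applied first
    graph |= to_insert                  # then insertions
    return graph


AGE = "foaf:age"
NAME = "foaf:name"
graph = {("ex:alice", NAME, "Alice"), ("ex:alice", AGE, 30)}


def where(g):
    # ?person foaf:name "Alice" ; foaf:age ?age . FILTER(?age = 30)
    for (s, p, o) in g:
        if p == AGE and o == 30 and (s, NAME, "Alice") in g:
            yield {"person": s, "age": o}


delete_insert(
    graph,
    where,
    delete_template=lambda s: [(s["person"], AGE, s["age"])],
    insert_template=lambda s: [(s["person"], AGE, 31)],
)
```

After the call, the graph holds Alice's name and the new age 31; the old age triple is gone. Computing both template sets before touching the graph is what keeps the insertions from seeing the effect of the deletions.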
4. DELETE WHERE
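Shorthand for DELETE { pattern } WHERE { pattern }, where the same pattern is both matched and deleted. The pattern may contain only triple (quad) patterns; FILTER is not allowed, so filtered deletes need the full DELETE ... WHERE form shown above:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DELETE WHERE {
?person foaf:email ?email .
}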
5. LOAD
Load RDF document from IRI into graph:
LOAD <http://example.org/data.ttl>
Into named graph:
LOAD <http://example.org/data.ttl> INTO GRAPH <http://example.org/graph1>
Silent mode (no error):
LOAD SILENT <http://example.org/data.ttl>
6. CLEAR
Remove all triples from graph:
CLEAR GRAPH <http://example.org/graph1>
Options:
- CLEAR DEFAULT - Clear default graph
- CLEAR NAMED - Clear all named graphs
- CLEAR ALL - Clear all graphs
- CLEAR SILENT GRAPH <uri> - No error if graph doesn't exist
7. CREATE
Create empty graph:
CREATE GRAPH <http://example.org/graph1>
Silent mode:
CREATE SILENT GRAPH <http://example.org/graph1>
8. DROP
Remove graph entirely:
DROP GRAPH <http://example.org/graph1>
Options:
- DROP DEFAULT - Equivalent to CLEAR DEFAULT
- DROP NAMED - Drop all named graphs
- DROP ALL - Drop all graphs
- DROP SILENT GRAPH <uri> - No error if graph doesn't exist
9. COPY
Copy graph content to another graph:
COPY GRAPH <http://example.org/source> TO GRAPH <http://example.org/dest>
Behavior:
- Destination graph cleared first
- Source unchanged
10. MOVE
Move graph content to another graph:
MOVE GRAPH <http://example.org/source> TO GRAPH <http://example.org/dest>
Behavior:
- Destination graph cleared first
- Source graph removed after copy
11. ADD
Add graph content to another graph:
ADD GRAPH <http://example.org/source> TO GRAPH <http://example.org/dest>
Behavior:
- Destination graph augmented
- Source unchanged
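The behavior of COPY, MOVE, and ADD can be compared side by side with a toy graph store. The sketch assumes a dictionary mapping graph names to sets of triples; the function names are illustrative, and a missing source graph is treated as empty here, whereas a real implementation would raise an error unless SILENT is given.

```python
def add(store, src, dst):
    """ADD: union the source triples into the destination."""
    store.setdefault(dst, set()).update(store.get(src, set()))


def copy(store, src, dst):
    """COPY: replace the destination content with the source triples."""
    if src == dst:
        return                      # copying a graph onto itself is a no-op
    store[dst] = set(store.get(src, set()))


def move(store, src, dst):
    """MOVE: like COPY, then remove the source graph."""
    if src == dst:
        return                      # moving a graph onto itself is a no-op
    copy(store, src, dst)
    store.pop(src, None)


store = {"source": {("a", "p", "b")}, "dest": {("x", "p", "y")}}
add(store, "source", "dest")        # dest now holds both triples
```

Running `copy` instead of `add` on the same starting store would leave only `("a", "p", "b")` in `dest`, and `move` would additionally remove `source`.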
12. WITH Clause
Default graph for update operations:
WITH <http://example.org/graph1>
DELETE { ?person foaf:age ?age }
INSERT { ?person foaf:age 31 }
WHERE {
?person foaf:name "Alice" .
?person foaf:age ?age .
}
13. USING and USING NAMED
Specify graphs for WHERE clause:
DELETE { GRAPH <http://example.org/dest> { ?s ?p ?o } }
USING <http://example.org/source>
WHERE {
?s ?p ?o .
FILTER(?p = foaf:age)
}
Built-in Functions
1. Logical and Conditional
bound(?var)
Test if variable is bound:
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
FILTER(bound(?email))
}
IF(?cond, ?then, ?else)
Conditional expression:
SELECT ?name (IF(?age >= 18, "adult", "minor") AS ?status)
WHERE {
?person foaf:name ?name .
?person foaf:age ?age .
}
COALESCE(?expr1, ?expr2, ...)
Return first non-error value:
SELECT ?name (COALESCE(?email, ?phone, "no contact") AS ?contact)
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
OPTIONAL { ?person foaf:phone ?phone }
}
EXISTS { pattern }
Test if pattern matches:
SELECT ?person
WHERE {
?person a foaf:Person .
FILTER EXISTS { ?person foaf:email ?email }
}
NOT EXISTS { pattern }
Test if pattern doesn't match:
SELECT ?person
WHERE {
?person a foaf:Person .
FILTER NOT EXISTS { ?person foaf:email ?email }
}
IN and NOT IN
Set membership:
SELECT ?name
WHERE {
?person foaf:name ?name .
?person foaf:age ?age .
FILTER(?age IN (20, 25, 30, 35))
}
2. RDF Term Functions
isIRI(?x) / isURI(?x)
Test if value is IRI:
FILTER(isIRI(?resource))
isBlank(?x)
Test if value is blank node:
FILTER(isBlank(?node))
isLiteral(?x)
Test if value is literal:
FILTER(isLiteral(?value))
isNumeric(?x)
Test if value is numeric:
FILTER(isNumeric(?value))
str(?x)
Convert to string:
SELECT (str(?uri) AS ?string)
WHERE { ?s ?p ?uri }
lang(?x)
Extract language tag:
SELECT ?name (lang(?name) AS ?language)
WHERE {
?person foaf:name ?name .
FILTER(lang(?name) = "en")
}
datatype(?x)
Get datatype IRI:
SELECT ?value (datatype(?value) AS ?type)
WHERE { ?s ?p ?value }
IRI(?x) / URI(?x)
Construct IRI from string:
BIND(IRI(CONCAT("http://example.org/", ?id)) AS ?resource)
BNODE() / BNODE(?label)
Create blank node:
BIND(BNODE() AS ?newNode)
STRDT(?string, ?datatype)
Create typed literal:
BIND(STRDT("42", xsd:integer) AS ?number)
STRLANG(?string, ?language)
Create language-tagged literal:
BIND(STRLANG("hello", "en") AS ?greeting)
UUID()
Generate a fresh IRI in the urn:uuid: scheme:
BIND(UUID() AS ?id)
STRUUID()
Generate a UUID as a plain string (without the urn:uuid: prefix):
BIND(STRUUID() AS ?idString)
3. String Functions
STRLEN(?string)
String length:
SELECT ?name (STRLEN(?name) AS ?length)
WHERE { ?person foaf:name ?name }
SUBSTR(?string, ?start, ?length)
Extract substring:
SELECT (SUBSTR(?name, 1, 3) AS ?prefix)
WHERE { ?person foaf:name ?name }
Note: Start position is 1-based
UCASE(?string)
Convert to uppercase:
SELECT (UCASE(?name) AS ?upper)
WHERE { ?person foaf:name ?name }
LCASE(?string)
Convert to lowercase:
SELECT (LCASE(?name) AS ?lower)
WHERE { ?person foaf:name ?name }
STRSTARTS(?string, ?prefix)
Test if string starts with prefix:
FILTER(STRSTARTS(?email, "admin@"))
STRENDS(?string, ?suffix)
Test if string ends with suffix:
FILTER(STRENDS(?email, "@example.com"))
CONTAINS(?string, ?substring)
Test if string contains substring:
FILTER(CONTAINS(?description, "important"))
STRBEFORE(?string, ?search)
Extract text before substring:
SELECT (STRBEFORE(?email, "@") AS ?username)
WHERE { ?person foaf:email ?email }
STRAFTER(?string, ?search)
Extract text after substring:
SELECT (STRAFTER(?email, "@") AS ?domain)
WHERE { ?person foaf:email ?email }
ENCODE_FOR_URI(?string)
Percent-encode for URI:
SELECT (ENCODE_FOR_URI(?name) AS ?encoded)
WHERE { ?person foaf:name ?name }
CONCAT(?string1, ?string2, ...)
Concatenate strings:
SELECT (CONCAT(?first, " ", ?last) AS ?fullName)
WHERE {
?person foaf:givenName ?first .
?person foaf:familyName ?last .
}
langMatches(?tag, ?range)
Match language tags:
FILTER(langMatches(lang(?label), "en"))
Language ranges:
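- "en" - Matches en and any tag that starts with en- (e.g. en-US, en-GB)
- "*" - Any non-empty language tag
- "en-US" - Matches en-US and more specific tags only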
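langMatches uses the basic filtering scheme of RFC 4647, which is a case-insensitive prefix match on whole subtags. A minimal sketch, with `lang_matches` as a hypothetical helper name:

```python
def lang_matches(tag, lang_range):
    """Basic filtering (RFC 4647, section 3.3.1), case-insensitive."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""            # "*" matches any non-empty tag
    # Either an exact match, or the range followed by a subtag separator.
    return tag == lang_range or tag.startswith(lang_range + "-")
```

For example, `lang_matches("en-US", "en")` is true, while `lang_matches("en", "en-US")` and `lang_matches("eng", "en")` are false. A literal without a language tag (empty string) does not match `"*"`.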
REGEX(?string, ?pattern) / REGEX(?string, ?pattern, ?flags)
Regular expression matching:
FILTER(REGEX(?email, "^[a-z]+@example\\.com$", "i"))
Flags:
- "i" - Case insensitive
- "s" - Dot matches newline
- "m" - Multi-line mode
- "x" - Ignore whitespace
REPLACE(?string, ?pattern, ?replacement) / REPLACE(?string, ?pattern, ?replacement, ?flags)
String replacement:
SELECT (REPLACE(?phone, "[^0-9]", "") AS ?digitsOnly)
WHERE { ?person foaf:phone ?phone }
4. Numeric Functions
abs(?number)
Absolute value:
SELECT (abs(?diff) AS ?absDiff)
WHERE { ?item ex:expected ?a ; ex:actual ?b . BIND(?a - ?b AS ?diff) }
round(?number)
Round to nearest integer:
SELECT (round(?value) AS ?rounded)
WHERE { ?item ex:price ?value }
ceil(?number)
Ceiling function:
SELECT (ceil(?value) AS ?ceiling)
WHERE { ?item ex:price ?value }
floor(?number)
Floor function:
SELECT (floor(?value) AS ?floor)
WHERE { ?item ex:price ?value }
RAND()
Random number [0, 1):
SELECT ?item
WHERE {
?item a ex:Product .
FILTER(RAND() < 0.1)
}
5. Date/Time Functions
now()
Current timestamp:
BIND(now() AS ?currentTime)
year(?datetime)
Extract year:
SELECT (year(?date) AS ?year)
WHERE { ?event ex:date ?date }
month(?datetime)
Extract month (1-12):
SELECT (month(?date) AS ?month)
WHERE { ?event ex:date ?date }
day(?datetime)
Extract day (1-31):
SELECT (day(?date) AS ?day)
WHERE { ?event ex:date ?date }
hours(?datetime)
Extract hours (0-23):
SELECT (hours(?timestamp) AS ?hour)
WHERE { ?event ex:timestamp ?timestamp }
minutes(?datetime)
Extract minutes (0-59):
SELECT (minutes(?timestamp) AS ?minute)
WHERE { ?event ex:timestamp ?timestamp }
seconds(?datetime)
Extract seconds (0-59.999...):
SELECT (seconds(?timestamp) AS ?second)
WHERE { ?event ex:timestamp ?timestamp }
timezone(?datetime)
Extract timezone as an xsd:dayTimeDuration (raises an error if the value has no timezone):
SELECT (timezone(?timestamp) AS ?tz)
WHERE { ?event ex:timestamp ?timestamp }
tz(?datetime)
Timezone as a simple literal (e.g. "Z" or "-05:00"; the empty string if there is none):
SELECT (tz(?timestamp) AS ?tzString)
WHERE { ?event ex:timestamp ?timestamp }
6. Hash Functions
MD5(?string)
MD5 hash (lowercase hex):
SELECT (MD5(?email) AS ?hash)
WHERE { ?person foaf:email ?email }
SHA1(?string)
SHA-1 hash:
SELECT (SHA1(?password) AS ?hash)
WHERE { ?user ex:password ?password }
SHA256(?string)
SHA-256 hash:
SELECT (SHA256(?data) AS ?hash)
WHERE { ?item ex:data ?data }
SHA384(?string)
SHA-384 hash:
SELECT (SHA384(?data) AS ?hash)
WHERE { ?item ex:data ?data }
SHA512(?string)
SHA-512 hash:
SELECT (SHA512(?data) AS ?hash)
WHERE { ?item ex:data ?data }
7. Aggregate Functions
COUNT(?var) / COUNT(*)
Count solutions:
SELECT ?company (COUNT(?employee) AS ?count)
WHERE {
?employee foaf:workplaceHomepage ?company .
}
GROUP BY ?company
DISTINCT modifier:
SELECT (COUNT(DISTINCT ?type) AS ?typeCount)
WHERE { ?s rdf:type ?type }
SUM(?expr)
Sum numeric values:
SELECT ?department (SUM(?salary) AS ?totalSalary)
WHERE {
?employee ex:department ?department .
?employee ex:salary ?salary .
}
GROUP BY ?department
AVG(?expr)
Average numeric values:
SELECT ?department (AVG(?salary) AS ?avgSalary)
WHERE {
?employee ex:department ?department .
?employee ex:salary ?salary .
}
GROUP BY ?department
MIN(?expr)
Minimum value:
SELECT ?department (MIN(?salary) AS ?minSalary)
WHERE {
?employee ex:department ?department .
?employee ex:salary ?salary .
}
GROUP BY ?department
MAX(?expr)
Maximum value:
SELECT ?department (MAX(?salary) AS ?maxSalary)
WHERE {
?employee ex:department ?department .
?employee ex:salary ?salary .
}
GROUP BY ?department
GROUP_CONCAT(?expr) / GROUP_CONCAT(?expr; SEPARATOR = ?sep)
Concatenate values:
SELECT ?person (GROUP_CONCAT(?skill; SEPARATOR = ", ") AS ?skills)
WHERE {
?person ex:hasSkill ?skill .
}
GROUP BY ?person
SAMPLE(?expr)
Arbitrary value from group:
SELECT ?company (SAMPLE(?employee) AS ?anyEmployee)
WHERE {
?employee foaf:workplaceHomepage ?company .
}
GROUP BY ?company
SPARQL Algebra
The SPARQL algebra defines formal semantics for query evaluation.
Core Algebraic Operators
1. Basic Graph Pattern (BGP)
A set of triple patterns:
BGP(tp1, tp2, ..., tpn)
2. Join (⋈)
Combines two graph patterns:
P1 Join P2
Definition: Solutions that match both P1 and P2 with compatible bindings.
3. LeftJoin (⟕)
Left outer join (OPTIONAL):
LeftJoin(P1, P2, expr)
Definition: All solutions from P1, augmented with P2 solutions where compatible and expr is true.
Formal:
LeftJoin(Ω1, Ω2, expr) =
{ merge(μ1, μ2) | μ1 ∈ Ω1, μ2 ∈ Ω2, compatible(μ1, μ2), expr(merge(μ1, μ2)) = true }
∪ { μ1 | μ1 ∈ Ω1, ∀μ2 ∈ Ω2: ¬compatible(μ1, μ2) ∨ expr(merge(μ1, μ2)) = false }
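The LeftJoin semantics can be checked with a direct, unoptimized evaluation over lists of solution mappings. In this sketch a solution from the left side is kept unextended only when no compatible right-side solution passes the filter expression, which is how the SPARQL 1.1 algebra defines it. The function names and the dictionary representation are assumptions for illustration.

```python
def compatible(m1, m2):
    """Mappings are compatible if every shared variable has the same value."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def merge(m1, m2):
    """Union of two compatible mappings."""
    return {**m1, **m2}


def left_join(omega1, omega2, expr=lambda m: True):
    result = []
    for m1 in omega1:
        extended = [
            merge(m1, m2)
            for m2 in omega2
            if compatible(m1, m2) and expr(merge(m1, m2))
        ]
        # Keep m1 unextended only when no compatible m2 passes expr.
        result.extend(extended if extended else [m1])
    return result


names = [
    {"person": "alice", "name": "Alice"},
    {"person": "bob", "name": "Bob"},
]
emails = [{"person": "alice", "email": "alice@example.com"}]

joined = left_join(names, emails)
```

`joined` contains Alice extended with her email and Bob unchanged. Passing an `expr` that rejects Alice's email leaves both solutions unextended rather than dropping Alice.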
4. Filter (σ)
Selection operation:
Filter(expr, P)
Definition: Solutions from P where expr evaluates to true.
5. Union (∪)
Disjunction:
Union(P1, P2)
Definition: Solutions from P1 or P2 (bag union).
6. Minus
Set difference:
Minus(P1, P2)
Definition: Solutions from P1 for which no solution from P2 is both compatible and shares at least one variable.
7. Graph
Named graph pattern:
Graph(g, P)
Definition: Evaluate P against graph g.
8. Extend (Bind)
Add variable binding:
Extend(P, ?var, expr)
Definition: Add binding ?var = expr to each solution in P.
9. Project (π)
Project variables:
Project(P, vars)
Definition: Keep only specified variables from P.
10. Distinct
Remove duplicates:
Distinct(P)
11. Reduced
Allow duplicate removal:
Reduced(P)
12. OrderBy
Sort solutions:
OrderBy(P, conditions)
13. Slice
Limit and offset:
Slice(P, start, length)
14. ToList
Convert to solution sequence:
ToList(P)
Algebraic Properties
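Important: SPARQL's LeftJoin does not, in general, distribute over Union in its right-hand (optional) argument:
LeftJoin(P1, Union(P2, P3), expr) ≠ Union(LeftJoin(P1, P2, expr), LeftJoin(P1, P3, expr))
A P1 solution that extends through P2 but not P3 yields one solution on the left-hand side and two on the right (one extended, one unextended), so an optimizer cannot in general push OPTIONAL into the branches of a UNION.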
Query Translation
A SPARQL query is translated to algebra in this order:
- Parse query text to syntax tree
- Translate graph patterns to algebra
- Apply grouping, aggregates, and HAVING
- Apply ORDER BY
- Apply projection (SELECT variables), then DISTINCT/REDUCED
- Apply slice (LIMIT/OFFSET)
Example:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
}
ORDER BY ?name
LIMIT 10
Translates to:
Slice(
Project(
OrderBy(
LeftJoin(
BGP(?person foaf:name ?name),
BGP(?person foaf:email ?email),
true
),
[ASC(?name)]
),
{?name, ?email}
),
0,
10
)
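In the SPARQL 1.1 translation, ORDER BY is applied before projection, so a sort key does not have to be one of the selected variables. The sketch below applies the three outer operators to a hand-written solution list; the helper names are illustrative, and `order_by` handles only a single ascending key that is bound in every solution.

```python
def order_by(solutions, var):
    """Sort ascending by one variable (assumed bound in every solution)."""
    return sorted(solutions, key=lambda m: m[var])


def project(solutions, variables):
    """Keep only the listed variables; unbound ones stay absent."""
    return [{v: m[v] for v in variables if v in m} for m in solutions]


def slice_(solutions, start, length):
    """OFFSET start, LIMIT length."""
    return solutions[start:start + length]


# Output of the LeftJoin for a two-person dataset.
solutions = [
    {"person": "p2", "name": "Bob"},
    {"person": "p1", "name": "Alice", "email": "alice@example.com"},
]

# Slice(Project(OrderBy(...), {?name, ?email}), 0, 10)
result = slice_(project(order_by(solutions, "name"), ["name", "email"]), 0, 10)
```

`result` lists Alice first and drops `?person` from both rows. Swapping `project` and `order_by` would still work for this query, but would fail for `ORDER BY ?person`, which is why the ordering step sits inside the projection.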
Query Result Formats
SPARQL supports multiple serialization formats for query results.
1. JSON Format
For SELECT and ASK queries.
SELECT Results
{
"head": {
"vars": ["name", "email"]
},
"results": {
"bindings": [
{
"name": {
"type": "literal",
"value": "Alice"
},
"email": {
"type": "literal",
"value": "alice@example.com"
}
},
{
"name": {
"type": "literal",
"value": "Bob"
}
}
]
}
}
RDF Term Types
IRI:
{
"type": "uri",
"value": "http://example.org/alice"
}
Literal:
{
"type": "literal",
"value": "Alice"
}
Language-tagged literal:
{
"type": "literal",
"value": "Alice",
"xml:lang": "en"
}
Typed literal:
{
"type": "literal",
"value": "42",
"datatype": "http://www.w3.org/2001/XMLSchema#integer"
}
Blank node:
{
"type": "bnode",
"value": "b0"
}
ASK Results
{
"head": {},
"boolean": true
}
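Producing this format from an engine's internal rows is mostly a matter of omitting unbound variables and attaching the optional `xml:lang` or `datatype` keys. A minimal sketch follows; `term_to_json` and `select_results` are hypothetical helper names, and rows are assumed to be dictionaries from variable name to an already-encoded term.

```python
import json


def term_to_json(kind, value, datatype=None, lang=None):
    """Encode one RDF term; kind is "uri", "literal", or "bnode"."""
    out = {"type": kind, "value": value}
    if lang:
        out["xml:lang"] = lang
    elif datatype:
        out["datatype"] = datatype
    return out


def select_results(variables, rows):
    """Build the SELECT result document; unbound variables are simply
    left out of a binding rather than encoded as null."""
    return {
        "head": {"vars": list(variables)},
        "results": {
            "bindings": [
                {v: row[v] for v in variables if v in row} for row in rows
            ]
        },
    }


doc = select_results(
    ["name", "email"],
    [
        {
            "name": term_to_json("literal", "Alice"),
            "email": term_to_json("literal", "alice@example.com"),
        },
        {"name": term_to_json("literal", "Bob")},
    ],
)
print(json.dumps(doc, indent=2))
```

The printed document matches the SELECT example above: Bob's binding has no `email` key at all.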
2. XML Format
SELECT Results
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="name"/>
<variable name="email"/>
</head>
<results>
<result>
<binding name="name">
<literal>Alice</literal>
</binding>
<binding name="email">
<literal>alice@example.com</literal>
</binding>
</result>
<result>
<binding name="name">
<literal>Bob</literal>
</binding>
</result>
</results>
</sparql>
RDF Term Elements
IRI:
<uri>http://example.org/alice</uri>
Literal:
<literal>Alice</literal>
Language-tagged:
<literal xml:lang="en">Alice</literal>
Typed literal:
<literal datatype="http://www.w3.org/2001/XMLSchema#integer">42</literal>
Blank node:
<bnode>b0</bnode>
ASK Results
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head/>
<boolean>true</boolean>
</sparql>
3. CSV Format
Simplified format without type information:
name,email
Alice,alice@example.com
Bob,
Characteristics:
- Lossy: No type information (IRI vs literal vs blank node)
- Simple: Easy to consume in applications
- Header row: Variable names
- Empty cells: Unbound variables
4. TSV Format
Tab-separated with type encoding:
?name ?email
"Alice" "alice@example.com"
"Bob"
RDF Term Encoding:
- IRI: <http://example.org/resource>
- Literal: "value"
- Language-tagged: "value"@en
- Typed literal: "value"^^<datatype>
- Blank node: _:label
Characteristics:
- Lossless: Preserves all type information
- SPARQL/Turtle syntax: Uses standard RDF term syntax
- Simple parsing: Split on tabs
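The term encoding can be sketched as a small serializer. It assumes each bound value is a tuple `(kind, value[, datatype[, lang]])` with hypothetical kind labels `"iri"`, `"bnode"`, and `"literal"`; literals are always written in quoted form here, although the format also permits bare numeric literals.

```python
def encode_tsv_term(kind, value, datatype=None, lang=None):
    """Encode one RDF term using SPARQL/Turtle term syntax."""
    if kind == "iri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    # Literal: escape characters that would break the line or column layout.
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    literal = f'"{escaped}"'
    if lang:
        return f"{literal}@{lang}"
    if datatype:
        return f"{literal}^^<{datatype}>"
    return literal


def encode_tsv(variables, rows):
    """Header row of ?variables, then one tab-separated line per solution."""
    lines = ["\t".join("?" + v for v in variables)]
    for row in rows:
        lines.append(
            "\t".join(
                encode_tsv_term(*row[v]) if v in row else "" for v in variables
            )
        )
    return "\n".join(lines) + "\n"


tsv = encode_tsv(
    ["name", "email"],
    [
        {"name": ("literal", "Alice"), "email": ("literal", "alice@example.com")},
        {"name": ("literal", "Bob")},
    ],
)
```

An unbound variable becomes an empty field, so Bob's line ends with a tab followed by nothing.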
Implementation Considerations
For PostgreSQL Integration
1. Data Model Mapping
RDF Triples → PostgreSQL Tables:
-- Triple store table
CREATE TABLE rdf_triples (
id BIGSERIAL PRIMARY KEY,
subject TEXT NOT NULL,
subject_type VARCHAR(10) NOT NULL, -- 'iri', 'bnode'
predicate TEXT NOT NULL,
object TEXT NOT NULL,
object_type VARCHAR(10) NOT NULL, -- 'iri', 'literal', 'bnode'
object_datatype TEXT,
object_language VARCHAR(10),
graph TEXT
);
-- Indexes for query performance
CREATE INDEX idx_triples_spo ON rdf_triples(subject, predicate, object);
CREATE INDEX idx_triples_pos ON rdf_triples(predicate, object, subject);
CREATE INDEX idx_triples_osp ON rdf_triples(object, subject, predicate);
CREATE INDEX idx_triples_graph ON rdf_triples(graph);
2. Query Translation
SPARQL → SQL Translation:
SPARQL BGP:
?person foaf:name ?name .
?person foaf:age ?age .
SQL Translation:
SELECT t1.subject AS person, t1.object AS name, t2.object AS age
FROM rdf_triples t1
JOIN rdf_triples t2 ON t1.subject = t2.subject
WHERE t1.predicate = 'http://xmlns.com/foaf/0.1/name'
AND t2.predicate = 'http://xmlns.com/foaf/0.1/age';
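The translation above generalizes to any basic graph pattern: one table alias per triple pattern, an equality condition for each constant, and an equality between columns whenever a variable repeats. The generator below is a simplified sketch under several assumptions: patterns are string triples with variables prefixed by `?`, placeholders use the `%s` style common to PostgreSQL drivers, and term types, datatypes, and graphs are ignored. It emits a comma-separated FROM list with the join conditions in WHERE, which is equivalent to the explicit JOIN ... ON form.

```python
def bgp_to_sql(patterns):
    """Translate a list of (s, p, o) patterns into (sql, params)."""
    select, where, params = [], [], []
    first_seen = {}                     # variable -> "tN.column" of first use
    for i, triple in enumerate(patterns, start=1):
        for column, term in zip(("subject", "predicate", "object"), triple):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: equate with its first occurrence.
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # Constant term: bind it as a query parameter.
                where.append(f"{ref} = %s")
                params.append(term)
    tables = ", ".join(f"rdf_triples t{i}" for i in range(1, len(patterns) + 1))
    sql = f"SELECT {', '.join(select)} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


FOAF = "http://xmlns.com/foaf/0.1/"
sql, params = bgp_to_sql([
    ("?person", FOAF + "name", "?name"),
    ("?person", FOAF + "age", "?age"),
])
```

For the two-pattern example, `sql` selects `person`, `name`, and `age` from `t1` and `t2` with the condition `t2.subject = t1.subject`, and `params` holds the two predicate IRIs in placeholder order. Variable names are used directly as column aliases, so a production version would need to validate or quote them.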
3. OPTIONAL → LEFT JOIN
SPARQL:
?person foaf:name ?name .
OPTIONAL { ?person foaf:email ?email }
SQL:
SELECT t1.subject AS person, t1.object AS name, t2.object AS email
FROM rdf_triples t1
LEFT JOIN rdf_triples t2 ON t1.subject = t2.subject
AND t2.predicate = 'http://xmlns.com/foaf/0.1/email'
WHERE t1.predicate = 'http://xmlns.com/foaf/0.1/name';
4. UNION → UNION ALL
SPARQL:
{ ?person foaf:name ?name }
UNION
{ ?person rdfs:label ?name }
SQL:
SELECT subject AS person, object AS name
FROM rdf_triples
WHERE predicate = 'http://xmlns.com/foaf/0.1/name'
UNION ALL
SELECT subject AS person, object AS name
FROM rdf_triples
WHERE predicate = 'http://www.w3.org/2000/01/rdf-schema#label';
5. FILTER → WHERE
SPARQL:
?person foaf:age ?age .
FILTER(?age >= 18)
SQL:
SELECT subject AS person, object AS age
FROM rdf_triples
WHERE predicate = 'http://xmlns.com/foaf/0.1/age'
AND object_type = 'literal'
AND object_datatype = 'http://www.w3.org/2001/XMLSchema#integer'
AND CAST(object AS NUMERIC) >= 18; -- xsd:integer is unbounded, so NUMERIC avoids overflow
6. Property Paths
Property paths require recursive queries:
SPARQL:
?person foaf:knows+ ?ancestor .
SQL (PostgreSQL CTE):
WITH RECURSIVE transitive AS (
-- Base case
SELECT subject, object
FROM rdf_triples
WHERE predicate = 'http://xmlns.com/foaf/0.1/knows'
UNION
-- Recursive case
SELECT t.subject, r.object
FROM rdf_triples t
JOIN transitive r ON t.object = r.subject
WHERE t.predicate = 'http://xmlns.com/foaf/0.1/knows'
)
SELECT * FROM transitive;
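The recursive query can be tried without a PostgreSQL server. The sketch below runs the same CTE against an in-memory SQLite database, used here only as a stand-in because it also supports WITH RECURSIVE; the three-row dataset and the reduced table definition are assumptions for the demonstration.

```python
import sqlite3

KNOWS = "http://xmlns.com/foaf/0.1/knows"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rdf_triples (subject TEXT, predicate TEXT, object TEXT)"
)
conn.executemany(
    "INSERT INTO rdf_triples VALUES (?, ?, ?)",
    [("a", KNOWS, "b"), ("b", KNOWS, "c"), ("c", KNOWS, "a")],  # a cycle
)

rows = conn.execute(
    """
    WITH RECURSIVE transitive(subject, object) AS (
        SELECT subject, object FROM rdf_triples WHERE predicate = ?
        UNION   -- UNION (not UNION ALL) discards repeated rows,
                -- so the recursion terminates even on cyclic data
        SELECT t.subject, r.object
        FROM rdf_triples t
        JOIN transitive r ON t.object = r.subject
        WHERE t.predicate = ?
    )
    SELECT subject, object FROM transitive ORDER BY subject, object
    """,
    (KNOWS, KNOWS),
).fetchall()
```

Because the data forms a cycle, every node reaches every node (including itself), and `rows` holds all nine pairs. With UNION ALL the same query would not terminate on this input.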
7. Aggregates
SPARQL aggregates map to SQL aggregates:
SPARQL:
SELECT ?company (COUNT(?employee) AS ?count)
WHERE { ?employee foaf:workplaceHomepage ?company }
GROUP BY ?company
SQL:
SELECT object AS company, COUNT(*) AS count
FROM rdf_triples
WHERE predicate = 'http://xmlns.com/foaf/0.1/workplaceHomepage'
GROUP BY object;
8. Optimization Strategies
Statistics-based query planning:
- Collect statistics on predicate frequencies
- Estimate selectivity of triple patterns
- Order joins by selectivity
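The three steps above can be sketched as a simple ordering heuristic. Everything here is an assumption made for illustration: the predicate counts are invented, the tenfold reduction per constant subject or object is arbitrary, and a real planner would also account for whether consecutive patterns share variables.

```python
def estimate_cardinality(pattern, predicate_counts, total_triples):
    """Rough row estimate for one triple pattern."""
    s, p, o = pattern
    # Start from the predicate frequency when the predicate is a constant.
    if p.startswith("?"):
        estimate = total_triples
    else:
        estimate = predicate_counts.get(p, 0)
    # Crude assumption: each constant subject/object cuts the estimate tenfold.
    for term in (s, o):
        if not term.startswith("?"):
            estimate /= 10
    return estimate


def order_patterns(patterns, predicate_counts, total_triples):
    """Most selective (smallest estimate) pattern first."""
    return sorted(
        patterns,
        key=lambda tp: estimate_cardinality(tp, predicate_counts, total_triples),
    )


predicate_counts = {"rdf:type": 5000, "foaf:name": 1000, "foaf:age": 900}
ordered = order_patterns(
    [
        ("?p", "foaf:name", "?n"),
        ("?p", "rdf:type", "foaf:Person"),
        ("?p", "foaf:age", "?a"),
    ],
    predicate_counts,
    total_triples=10000,
)
```

With these made-up statistics the `rdf:type` pattern is evaluated first (estimate 500), then `foaf:age` (900), then `foaf:name` (1000).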
Materialized views:
- Pre-compute common property paths
- Cache frequently accessed subgraphs
Indexes:
- SPO, POS, OSP indexes (covering all access patterns)
- Partial indexes for specific predicates
- GiST/GIN indexes for full-text search
Caching:
- Query result cache
- Parsed query cache
- Compiled SQL cache
9. PostgreSQL Extensions
Leverage existing PostgreSQL features:
- JSONB: Store complex objects
- Full-text search: Text matching in literals
- GiST indexes: Spatial/hierarchical data
- CTEs: Recursive queries for property paths
- Window functions: Advanced analytics
- Parallel query: Scale to large datasets
10. Integration with RuVector
Combine SPARQL with vector operations:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rv: <http://ruvector.org/functions/>
SELECT ?name ?similarity
WHERE {
?person foaf:name ?name .
?person rv:embedding ?embedding .
# Use RuVector distance function ($query_vector is an ordinary variable that the caller must bind, e.g. via VALUES)
BIND(rv:cosine_similarity(?embedding, $query_vector) AS ?similarity)
FILTER(?similarity > 0.8)
}
ORDER BY DESC(?similarity)
LIMIT 10
Implementation:
SELECT t1.object AS name,
ruvector_cosine_similarity(
t2.object::ruvector,
$1::ruvector
) AS similarity
FROM rdf_triples t1
JOIN rdf_triples t2 ON t1.subject = t2.subject
WHERE t1.predicate = 'http://xmlns.com/foaf/0.1/name'
AND t2.predicate = 'http://ruvector.org/functions/embedding' -- rv:embedding expanded with the rv: prefix declared in the query
AND ruvector_cosine_similarity(t2.object::ruvector, $1::ruvector) > 0.8
ORDER BY similarity DESC
LIMIT 10;
Performance Considerations
- Index Strategy: SPO, POS, OSP covering all join orders
- Query Optimization: Statistics-based join reordering
- Caching: Parsed queries and compiled SQL
- Parallelization: Leverage PostgreSQL parallel query
- Partitioning: By graph, predicate, or subject
- Connection Pooling: Reuse database connections
- Prepared Statements: Reduce parsing overhead
Standards Compliance
Implement according to:
- SPARQL 1.1 Query Language (W3C Recommendation)
- SPARQL 1.1 Update (W3C Recommendation)
- SPARQL 1.1 Protocol (HTTP bindings)
- SPARQL 1.1 Results JSON/XML/CSV/TSV Formats
Consider SPARQL 1.2 draft features:
- Enhanced property paths
- New functions
- Improved federation
References
W3C Specifications
- SPARQL 1.1 Query Language - W3C Recommendation, March 2013
- SPARQL 1.1 Update - W3C Recommendation, March 2013
- SPARQL 1.1 Protocol - W3C Recommendation, March 2013
- SPARQL 1.1 Results JSON Format - W3C Recommendation, March 2013
- SPARQL 1.1 Results CSV/TSV Formats - W3C Recommendation, March 2013
- SPARQL 1.1 Property Paths - W3C Working Draft (superseded; property paths are specified in the SPARQL 1.1 Query Language Recommendation)
- SPARQL Query Language for RDF (1.0) - W3C Recommendation, January 2008
Draft Specifications
- SPARQL 1.2 Overview - W3C Working Draft
- SPARQL 1.2 Query Language - W3C Working Draft
Formal Semantics
- SPARQL Algebra - W3C Draft
- ARQ's SPARQL Algebra - Apache Jena documentation
Appendix: Grammar Summary
Query Structure
Query ::= Prologue ( SelectQuery | ConstructQuery | DescribeQuery | AskQuery ) ValuesClause
Prologue ::= ( BaseDecl | PrefixDecl )*
BaseDecl ::= 'BASE' IRIREF
PrefixDecl ::= 'PREFIX' PNAME_NS IRIREF
SelectQuery ::= SelectClause DatasetClause* WhereClause SolutionModifier
ConstructQuery ::= 'CONSTRUCT' ( ConstructTemplate DatasetClause* WhereClause | DatasetClause* 'WHERE' '{' TriplesTemplate? '}' ) SolutionModifier
DescribeQuery ::= 'DESCRIBE' ( VarOrIri+ | '*' ) DatasetClause* WhereClause? SolutionModifier
AskQuery ::= 'ASK' DatasetClause* WhereClause SolutionModifier
DatasetClause ::= 'FROM' ( DefaultGraphClause | NamedGraphClause )
WhereClause ::= 'WHERE'? GroupGraphPattern
SolutionModifier ::= GroupClause? HavingClause? OrderClause? LimitOffsetClauses?
GroupClause ::= 'GROUP' 'BY' GroupCondition+
HavingClause ::= 'HAVING' HavingCondition+
OrderClause ::= 'ORDER' 'BY' OrderCondition+
LimitOffsetClauses ::= LimitClause OffsetClause? | OffsetClause LimitClause?
LimitClause ::= 'LIMIT' INTEGER
OffsetClause ::= 'OFFSET' INTEGER
Graph Patterns
GroupGraphPattern ::= '{' ( SubSelect | GroupGraphPatternSub ) '}'
GroupGraphPatternSub ::= TriplesBlock? ( GraphPatternNotTriples '.'? TriplesBlock? )*
GraphPatternNotTriples ::= GroupOrUnionGraphPattern | OptionalGraphPattern | MinusGraphPattern | GraphGraphPattern | ServiceGraphPattern | Filter | Bind | InlineData
OptionalGraphPattern ::= 'OPTIONAL' GroupGraphPattern
GraphGraphPattern ::= 'GRAPH' VarOrIri GroupGraphPattern
ServiceGraphPattern ::= 'SERVICE' 'SILENT'? VarOrIri GroupGraphPattern
Bind ::= 'BIND' '(' Expression 'AS' Var ')'
InlineData ::= 'VALUES' DataBlock
MinusGraphPattern ::= 'MINUS' GroupGraphPattern
GroupOrUnionGraphPattern ::= GroupGraphPattern ( 'UNION' GroupGraphPattern )*
Filter ::= 'FILTER' Constraint
Triple Patterns
TriplesBlock ::= TriplesSameSubjectPath ( '.' TriplesBlock? )?
TriplesSameSubjectPath ::= VarOrTerm PropertyListPathNotEmpty | TriplesNodePath PropertyListPath
PropertyListPath ::= PropertyListPathNotEmpty?
PropertyListPathNotEmpty ::= ( VerbPath | VerbSimple ) ObjectListPath ( ';' ( ( VerbPath | VerbSimple ) ObjectList )? )*
VerbPath ::= Path
VerbSimple ::= Var
ObjectListPath ::= ObjectPath ( ',' ObjectPath )*
ObjectPath ::= GraphNodePath
Path ::= PathAlternative
PathAlternative ::= PathSequence ( '|' PathSequence )*
PathSequence ::= PathEltOrInverse ( '/' PathEltOrInverse )*
PathElt ::= PathPrimary PathMod?
PathEltOrInverse ::= PathElt | '^' PathElt
PathMod ::= '?' | '*' | '+'
PathPrimary ::= iri | 'a' | '!' PathNegatedPropertySet | '(' Path ')'
PathNegatedPropertySet ::= PathOneInPropertySet | '(' ( PathOneInPropertySet ( '|' PathOneInPropertySet )* )? ')'
PathOneInPropertySet ::= iri | 'a' | '^' ( iri | 'a' )
Update Operations
Update ::= Prologue ( Update1 ( ';' Update )? )?
Update1 ::= Load | Clear | Drop | Add | Move | Copy | Create | InsertData | DeleteData | DeleteWhere | Modify
Load ::= 'LOAD' 'SILENT'? iri ( 'INTO' GraphRef )?
Clear ::= 'CLEAR' 'SILENT'? GraphRefAll
Drop ::= 'DROP' 'SILENT'? GraphRefAll
Create ::= 'CREATE' 'SILENT'? GraphRef
Add ::= 'ADD' 'SILENT'? GraphOrDefault 'TO' GraphOrDefault
Move ::= 'MOVE' 'SILENT'? GraphOrDefault 'TO' GraphOrDefault
Copy ::= 'COPY' 'SILENT'? GraphOrDefault 'TO' GraphOrDefault
InsertData ::= 'INSERT DATA' QuadData
DeleteData ::= 'DELETE DATA' QuadData
DeleteWhere ::= 'DELETE WHERE' QuadPattern
Modify ::= ( 'WITH' iri )? ( DeleteClause InsertClause? | InsertClause ) UsingClause* 'WHERE' GroupGraphPattern
DeleteClause ::= 'DELETE' QuadPattern
InsertClause ::= 'INSERT' QuadPattern
UsingClause ::= 'USING' ( iri | 'NAMED' iri )
GraphRefAll ::= GraphRef | 'DEFAULT' | 'NAMED' | 'ALL'
GraphRef ::= 'GRAPH' iri
End of Specification Document
This document provides comprehensive coverage of SPARQL 1.1 for implementing a query engine in PostgreSQL. For complete formal definitions and edge cases, refer to the official W3C specifications linked in the References section.