Getting started

Ontology-based creation, manipulation, and querying of atomic structures.

Imports

from atomrdf import KnowledgeGraph
import atomrdf.build as build

The first step is to create a knowledge graph:

kg = KnowledgeGraph(enable_log=True)

Creation of structures

We will create two structures for the demonstration.

First, a BCC iron structure:

struct_Fe = build.bulk("Fe", cubic=True, graph=kg)

Note that we passed the argument graph=kg, which ensures that the structure is added to the graph automatically as soon as it is created. We can now visualise the graph:

kg.visualise(hide_types=True)
[Figure: visualisation of the knowledge graph after adding the Fe structure]

Next, a diamond-cubic Si structure:

struct_Si = build.bulk("Si", cubic=True, graph=kg)
kg.visualise(hide_types=True, size=(60,30))
[Figure: visualisation of the knowledge graph after adding the Si structure]
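The two samples differ in the number of atoms per cell, which matters for the query later on. The following is a minimal sketch of that difference using only fractional coordinates. It assumes that cubic=True yields the conventional cubic cell in both cases: two atoms for BCC and, by assumption here, eight for diamond. The names FCC_POINTS, DIAMOND_BASIS, BCC_POINTS and diamond_cell are illustrative and not part of atomrdf.

```python
from fractions import Fraction

# Fractional coordinates of the four FCC lattice points in a conventional cubic cell.
FCC_POINTS = [
    (Fraction(0), Fraction(0), Fraction(0)),
    (Fraction(0), Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(0), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(1, 2), Fraction(0)),
]

# Two-atom basis of the diamond structure, attached to every FCC lattice point.
DIAMOND_BASIS = [
    (Fraction(0), Fraction(0), Fraction(0)),
    (Fraction(1, 4), Fraction(1, 4), Fraction(1, 4)),
]

# Conventional BCC cell: one corner atom plus one body-centre atom.
BCC_POINTS = [
    (Fraction(0), Fraction(0), Fraction(0)),
    (Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)),
]


def diamond_cell():
    """Return the fractional positions in a conventional cubic diamond cell."""
    positions = []
    for point in FCC_POINTS:
        for shift in DIAMOND_BASIS:
            # Wrap back into [0, 1) so that every atom lies inside the cell.
            positions.append(tuple((p + s) % 1 for p, s in zip(point, shift)))
    return positions


si_positions = diamond_cell()
print(len(BCC_POINTS), "atoms in the conventional BCC cell")
print(len(si_positions), "atoms in the conventional diamond cell")
```

Under these assumptions, filtering the graph by number of atoms is enough to tell the Fe sample from the Si sample.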

We can save the graph to a file and reload it as needed:

kg.write('serial.ttl', format='ttl')
kg = KnowledgeGraph(graph_file='serial.ttl')
kg.n_samples
2

Querying the graph

An example question would be: what are the space groups of all structures with 2 atoms?

The corresponding SPARQL query looks like this:

query = """
PREFIX cmso: <http://purls.helmholtz-metadaten.de/cmso/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?symbol
WHERE {
    ?sample cmso:hasNumberOfAtoms ?number .
    ?sample cmso:hasMaterial ?material .
    ?material cmso:hasStructure ?structure .
    ?structure cmso:hasSpaceGroupSymbol ?symbol .
    FILTER (?number="2"^^xsd:integer)
}"""
res = kg.query(query)

Printing the result gives:

res
symbol
0 Im-3m
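The query above chains four triple patterns (sample, material, structure, space group symbol) and keeps only samples with the requested number of atoms. The sketch below mimics that chain on a toy set of plain Python tuples to make the traversal explicit. It is not how atomrdf or a SPARQL engine evaluates the query. The subject identifiers are made up, and the Si entries (8 atoms, space group Fd-3m) are assumptions for a conventional diamond cell rather than values taken from the output above.

```python
# Toy triple store: (subject, predicate, object).
# Subject identifiers are hypothetical; predicate names follow the query above.
# The Si values (8 atoms, Fd-3m) are assumptions for a conventional diamond cell.
triples = {
    ("sample:Fe", "cmso:hasNumberOfAtoms", 2),
    ("sample:Fe", "cmso:hasMaterial", "material:Fe"),
    ("material:Fe", "cmso:hasStructure", "structure:Fe"),
    ("structure:Fe", "cmso:hasSpaceGroupSymbol", "Im-3m"),
    ("sample:Si", "cmso:hasNumberOfAtoms", 8),
    ("sample:Si", "cmso:hasMaterial", "material:Si"),
    ("material:Si", "cmso:hasStructure", "structure:Si"),
    ("structure:Si", "cmso:hasSpaceGroupSymbol", "Fd-3m"),
}


def objects(subject, predicate):
    """Return all objects linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def space_groups_with_n_atoms(n):
    """Follow sample -> material -> structure -> symbol for samples with n atoms."""
    symbols = set()  # a set plays the role of SELECT DISTINCT
    for sample, predicate, number in triples:
        # Equivalent of the FILTER clause on ?number.
        if predicate != "cmso:hasNumberOfAtoms" or number != n:
            continue
        for material in objects(sample, "cmso:hasMaterial"):
            for structure in objects(material, "cmso:hasStructure"):
                symbols.update(objects(structure, "cmso:hasSpaceGroupSymbol"))
    return sorted(symbols)


print(space_groups_with_n_atoms(2))
```

Each nested loop corresponds to one triple pattern in the WHERE block, and the shared variable names in SPARQL correspond to the values passed from one loop to the next.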

The query system can also be used without experience in SPARQL or detailed knowledge of the ontology terms. The same query can be written as:

df = kg.query(kg.terms.cmso.AtomicScaleSample, [kg.terms.cmso.hasSpaceGroupSymbol, kg.terms.cmso.hasNumberOfAtoms==2])
df
                             AtomicScaleSample hasSpaceGroupSymbolvalue hasNumberOfAtomsvalue
0  sample:08fd0411-9078-410a-a671-95c971e0f9db                    Im-3m                     2

sample = df.AtomicScaleSample.values[0]

We can write this sample to a file, for example in the LAMMPS data format, for use in further simulations:

kg.to_file(sample, 'lammps.data', format="lammps-data")
! more lammps.data
(written by ASE)

2 atoms
1 atom types

0.0      2.8700000000000001  xlo xhi
0.0      2.8700000000000001  ylo yhi
0.0      2.8700000000000001  zlo zhi

Atoms # atomic

     1   1                       0                       0                       0
     2   1      1.4350000000000001      1.4350000000000001      1.4350000000000001
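As a quick sanity check, the file contents shown above can be read back with a few lines of plain Python. This is a minimal sketch that handles only this small "atomic" style file. The helper parse_lammps_atomic is hypothetical and not part of atomrdf, ASE or LAMMPS. It confirms that the second atom sits at the body centre of the cubic box, as expected for BCC.

```python
# File contents copied from the output above.
LAMMPS_DATA = """\
(written by ASE)

2 atoms
1 atom types

0.0      2.8700000000000001  xlo xhi
0.0      2.8700000000000001  ylo yhi
0.0      2.8700000000000001  zlo zhi

Atoms # atomic

     1   1                       0                       0                       0
     2   1      1.4350000000000001      1.4350000000000001      1.4350000000000001
"""


def parse_lammps_atomic(text):
    """Parse atom count, box bounds and positions from a small 'atomic' data file."""
    n_atoms = None
    box = {}
    atoms = []
    in_atoms = False
    for line in text.splitlines()[1:]:  # the first line is a free-form comment
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Atoms":
            in_atoms = True
        elif in_atoms:
            # Columns in 'atomic' style: id, type, x, y, z.
            atom_id, atom_type = int(fields[0]), int(fields[1])
            x, y, z = (float(v) for v in fields[2:5])
            atoms.append((atom_id, atom_type, (x, y, z)))
        elif fields[-1] == "atoms":
            n_atoms = int(fields[0])
        elif len(fields) == 4 and fields[2].endswith("lo"):
            # Lines such as '0.0 2.87 xlo xhi' give the bounds along one axis.
            box[fields[2][0]] = (float(fields[0]), float(fields[1]))
    return n_atoms, box, atoms


n_atoms, box, atoms = parse_lammps_atomic(LAMMPS_DATA)
a = box["x"][1] - box["x"][0]  # edge length of the cubic box
body_centre = atoms[1][2]
print(n_atoms, a, body_centre)
```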