.. -*- coding: utf-8 -*- .. _TutosMuseumsDataManagement: Data management with CubicWeb ----------------------------- Import data ~~~~~~~~~~~ With our application customized, let's see how to import more data. There is several ways to import data in CubicWeb. In our tutorial, we want to import our museums from a `csv` file. This file is provided by the France's Ministry of Culture, and is available here_. .. _here: https://data.culture.gouv.fr/explore/dataset/liste-et-localisation-des-musees-de-france/export/ There are several ways to import data in CubicWeb; in this tutorial, we will use one of them, the others are described here: :ref:`dataimport`. First of all, we define in :file:`tuto/cubicweb_tuto/dataimport.py` a function which will read a file from a `filepath` and create the corresponding entities, using a `CubicWeb connection`: .. sourcecode:: python import csv def import_museums(cnx, filepath): existing_cities = dict(cnx.execute("Any Z, C Where C is City, C zip_code Z")) existing_cities_nb = len(existing_cities) created_museum_nb = 0 with open(filepath) as fileobj: reader = csv.DictReader(fileobj, delimiter=";") for record in reader: museum_name = record["NOM DU MUSEE"] street = record["ADR"] zip_code = record["CP"] city_name = record["VILLE"] try: lat, lng = record["coordonnees_finales"].split(",") lat_long = { "latitude": lat, "longitude": lng, } except (AttributeError, ValueError): lat_long = {} try: city = existing_cities[zip_code] except KeyError: city = cnx.create_entity("City", name=city_name, zip_code=zip_code) existing_cities[zip_code] = city.eid cnx.create_entity( "Museum", name=museum_name, postal_address=f"{street}, {zip_code} {city_name}", is_in=city, **lat_long, ) created_museum_nb += 1 print( "Import finished! {} existing cities, {} cities created, {} museums created.".format( existing_cities_nb, len(existing_cities) - existing_cities_nb, created_museum_nb, ) ) To be sure we don't have several time the same city, we first query CubicWeb to ask for all existing city. To do this, we use a specific language called **RQL**. As for SPARQL, it's a query language designed to query linked data. See :ref:`rql_intro` for more information about it. Then, we put existing cities in a dictionary, using zip code as key. In the real world, a zip code can concern several cities, but it's not really an issue in this tutorial. For each line of our `csv` file, we will check if we already have the city in our base. If not, we create it. Then, we create our Museum entity with all needed arguments. To create an entity, we use the `create_entity` method of the CubicWeb connection. This method takes as first argument the type of the entity (ie: the name of the corresponding class previously defined in :file:`tuto/cubicweb_tuto/schema.py`), and then all arguments of the entity type. In our example, a city needs a name and a zip code. A museum needs a name, a postal address, a latitude, a longitude and a city. As `is_in` is a relation, we give to the corresponding argument the eid of the city. .. Note:: As we have defined Museum in the schema, we have to link each instance of Museum to a City, that's why we create the city before the museum, and give this city as argument of the museum. If the city wasn't mandatory, we could add it later, using: .. sourcecode:: python museum_entity.cw_set(is_in=city) To use our function we need to create a CubicWeb command that will call it. First, we create a file :file:`tuto/cubicweb_tuto/ccplugin.py` (the name doesn't matter, but it is commonly used for all new CubicWeb commands). Then, we write the following code: .. sourcecode:: python from cubicweb.cwctl import CWCTL from cubicweb.toolsutils import Command from cubicweb.utils import admincnx from cubicweb_tuto.dataimport import import_museums @CWCTL.register class ImportMuseums(Command): """ Import Museums and Cities from a CSV from: https://data.culture.gouv.fr/explore/dataset/liste-et-localisation-des-musees-de-france/export/ """ arguments = " " name = "import-museums" min_args = max_args = 2 def run(self, args): appid, csv_file = args[:2] with admincnx(appid) as cnx: import_museums(cnx, csv_file) cnx.commit() * ``@CWCTL.register`` allows to register the command and then make it available with ``cubicweb-ctl`` command by its ``name``. * ``arguments`` defines which arguments take our command. * ``name`` defines the name of the command. * ``with admincnx(appid) as cnx`` allows to have an admin access to our instance, and then be able to create new entities. Thus, to execute our import command, we just have to enter in our shell (within our virtual env): .. code-block:: console cubicweb-ctl import-museums tuto_instance After this script, we should be able to see that we have much more cities and museums by visiting the homepage of our CubicWeb instance: .. image:: ../../images/tutos-museum_finished_import.png :alt: A CubicWeb instance with several cities and museums. RDF serialisation ~~~~~~~~~~~~~~~~~ .. TODO complete documentation Content negotiation ~~~~~~~~~~~~~~~~~~~ .. TODO complete documentation