cubicweb.dataimport
Package containing various utilities to import data into cubicweb.
Utilities
-
cubicweb.dataimport.ucsvreader_pb(stream_or_path, encoding='utf-8', delimiter=',', quotechar='"', skipfirst=False, withpb=True, skip_empty=True, separator=None, quote=None)
Same as ucsvreader(), but a progress bar is displayed while iterating over the rows.
-
cubicweb.dataimport.ucsvreader(stream, encoding='utf-8', delimiter=',', quotechar='"', skipfirst=False, ignore_errors=False, skip_empty=True, separator=None, quote=None)
A CSV reader that accepts files with any encoding and outputs unicode strings.
If skip_empty is true (the default), lines without any values (only separators) are skipped. This is useful for Excel exports, which may be full of such lines.
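The skip_empty behaviour described above can be illustrated with the standard library alone. This is a minimal sketch, not cubicweb's implementation: the helper name `read_rows` is hypothetical, and it assumes an already-decoded text stream rather than handling the encoding itself.

```python
import csv
import io


def read_rows(stream, delimiter=',', quotechar='"', skipfirst=False,
              skip_empty=True):
    """Yield each CSV row as a list of strings (hypothetical helper)."""
    reader = csv.reader(stream, delimiter=delimiter, quotechar=quotechar)
    if skipfirst:
        next(reader, None)  # drop the header row, if there is one
    for row in reader:
        # A line made only of separators parses to a list of empty strings,
        # and a blank line parses to an empty list; both count as empty here.
        if skip_empty and not any(cell.strip() for cell in row):
            continue
        yield row


sample = io.StringIO('login,firstname\njohndoe,John\n,\n\njanedoe,Jane\n')
rows = list(read_rows(sample, skipfirst=True))
# rows == [['johndoe', 'John'], ['janedoe', 'Jane']]
```

With skip_empty=False the separator-only line would be yielded as `['', '']`, which is the kind of row an Excel export tends to produce in bulk.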
-
cubicweb.dataimport.callfunc_every(func, number, iterable)
Yield items of iterable one by one and call function func every number iterations. Always call function func at the end.
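The behaviour described for callfunc_every can be sketched as a small generator. This is an illustration written from the description above, not the library's source: the name `callfunc_every_sketch` is hypothetical, and the exact iteration on which cubicweb fires the callback may differ.

```python
def callfunc_every_sketch(func, number, iterable):
    """Yield items one by one, calling func every `number` items and at the end."""
    for count, item in enumerate(iterable, start=1):
        yield item
        # Runs when the consumer asks for the next item, i.e. after
        # `count` items have been handed out.
        if count % number == 0:
            func()
    # Final call, so trailing items that did not fill a batch are handled too.
    func()


calls = []
items = list(callfunc_every_sketch(lambda: calls.append('commit'), 2, range(5)))
# items == [0, 1, 2, 3, 4]; func ran after items 2 and 4, then once at the end
```

A typical use would be passing a commit or flush callable as func, so that work is persisted in batches while the rows are being consumed.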
-
cubicweb.dataimport.lazytable(*args, **kwargs)
-
cubicweb.dataimport.lazydbtable(*args, **kwargs)
-
cubicweb.dataimport.mk_entity(*args, **kwargs)
Sanitizing/coercing functions
-
cubicweb.dataimport.optional(*args, **kwargs)
-
cubicweb.dataimport.required(*args, **kwargs)
-
cubicweb.dataimport.todatetime(*args, **kwargs)
-
cubicweb.dataimport.call_transform_method(*args, **kwargs)
-
cubicweb.dataimport.call_check_method(*args, **kwargs)
Integrity functions
-
cubicweb.dataimport.check_doubles(*args, **kwargs)
-
cubicweb.dataimport.check_doubles_not_none(*args, **kwargs)
Object Stores
-
class cubicweb.dataimport.ObjectStore
Store objects in memory for faster validation (development mode).
However, it does not enforce the constraints of the schema and will therefore miss some problems.
>>> store = ObjectStore()
>>> user = store.prepare_insert_entity('CWUser', login=u'johndoe')
>>> group = store.prepare_insert_entity('CWGroup', name=u'unknown')
>>> store.prepare_insert_relation(user, 'in_group', group)
-
prepare_insert_entity(etype, **data)
Given an entity type, attributes and inlined relations, return an eid for the entity that would be inserted with a real store.
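To make the idea of an in-memory store concrete, here is a minimal sketch of a class that hands out eids and records relations without touching a database. It is a hypothetical stand-in (`InMemoryStore` and its attributes are assumptions, not cubicweb's ObjectStore), and like the store described above it performs no schema validation.

```python
import itertools


class InMemoryStore:
    """Hypothetical in-memory store; not cubicweb's ObjectStore."""

    def __init__(self):
        self._eids = itertools.count(1)  # fake eid sequence
        self.entities = {}               # eid -> (etype, attributes)
        self.relations = set()           # (eid_from, rtype, eid_to)

    def prepare_insert_entity(self, etype, **data):
        """Record the entity and return the eid it was assigned."""
        eid = next(self._eids)
        self.entities[eid] = (etype, data)
        return eid

    def prepare_insert_relation(self, eid_from, rtype, eid_to):
        """Record a relation between two previously returned eids."""
        self.relations.add((eid_from, rtype, eid_to))


store = InMemoryStore()
user = store.prepare_insert_entity('CWUser', login='johndoe')
group = store.prepare_insert_entity('CWGroup', name='unknown')
store.prepare_insert_relation(user, 'in_group', group)
```

Because nothing is checked against a schema, a misspelled attribute or an invalid relation would be stored silently, which is why such a store suits quick development runs rather than final imports.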
-
-
class cubicweb.dataimport.RQLObjectStore(cnx, commit=None)
Bases: cubicweb.dataimport.stores.NullStore
Store that works by making RQL queries, hence with all of cubicweb's machinery activated.
-
class cubicweb.dataimport.NoHookRQLObjectStore(cnx, metagen=None)
Bases: cubicweb.dataimport.stores.RQLObjectStore
Store that works by accessing CubicWeb's low-level source API, with all hooks deactivated. It may be given a metadata generator object to handle metadata that is usually handled by hooks.
Arguments:
- cnx: a connection to the repository
- metagen: optional MetadataGenerator instance
-
class cubicweb.dataimport.SQLGenObjectStore(cnx, dump_output_dir=None, nb_threads_statement=1)
Bases: cubicweb.dataimport.stores.NoHookRQLObjectStore
Controller of the data import process. This version is based on direct insertions through SQL commands (COPY FROM or execute many).
>>> store = SQLGenObjectStore(cnx)
>>> store.create_entity('Person', ...)
>>> store.flush()
Initialize a SQLGenObjectStore.
Parameters:
- cnx: connection to the cubicweb instance
- dump_output_dir: a directory to dump failed statements for easier recovery. Default is None (no dump).
Import Controller
-
class cubicweb.dataimport.CWImportController(store, askerror=0, catcherrors=None, tell=<function wrapped>, commitevery=50)
Bases: object
Controller of the data import process.
>>> ctl = CWImportController(store)
>>> ctl.generators = list_of_data_generators
>>> ctl.data = dict_of_data_tables
>>> ctl.run()