https://github.com/ricobl/django-importer

Create Data Importers for Django models.

# django-importer

Importers for Django models.

Provides basic functionality to import data into Django models, allowing easy
creation of custom importers. Highly extensible and customizable.

Data formats are commonly denormalized. The project doesn't aim to be the
"all-in-one" / "every-format" importer, but to provide a clean and flexible
interface for writing custom importers.

# Features

* currently supported formats: XML
* easy to support new formats (CSV, Yaml, JSON, etc.)
* maps source values to model fields
* detects new / changed items
* many hooks to customize the importer behaviour

# Usage

Actions speak louder than words, so let's go ahead with a practical example.

Let's say you have a news application in your project and want to import data from an XML file:

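```xml
<!-- Element names follow the importer's field_map below; the root element name is illustrative -->
<items>
  <item>
    <id>1</id>
    <date>2009-04-20</date>
    <title>django-importer released</title>
    <content>Today, django-importer has been released...</content>
  </item>
  ...
</items>
```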

The model definition:

```python
from django.db import models


class Entry(models.Model):
    # External source ID, to keep track of already imported items
    external_id = models.IntegerField()
    # News entry properties
    headline = models.CharField(max_length=100)
    creation_date = models.DateTimeField()
    pub_date = models.DateTimeField()
    story = models.TextField()
```

Now the magic begins: let's write the importer. We must populate each field of our news
entry model, convert the creation date from a string to a Python `datetime`, and schedule the
publication date one hour ahead.

```python
from django_importer.importers.xml import XmlImporter
from datetime import datetime, timedelta

from news.models import Entry

class MyXmlImporter(XmlImporter):
    # Specify the model this Importer works against
    model = Entry
    # XmlImporter specific property: the node name that identifies an XML item
    item_tag_name = 'item'
    # A list of model field names expected to be imported from the source
    fields = ('external_id', 'headline', 'creation_date', 'story')
    # A dictionary mapping model field names to data source identifiers
    # In this case the mappings point to XML nodes
    field_map = {
        'external_id': 'id',
        'creation_date': 'date',
        'headline': 'title',
        'story': 'content',
    }
    # List of fields that identify an item as unique
    unique_fields = ('external_id',)

    def parse_creation_date(self, item, field_name, source_name):
        # Get the value `source_name` from the XML `item` for the field `field_name`
        # In other words: read the `date` node content to populate the field `creation_date` of our model.
        val = self.get_value(item, source_name)
        # Convert the 'YYYY-MM-DD' string to a Python datetime
        # (datetime() itself only accepts integers, so parse with strptime)
        return datetime.strptime(val, '%Y-%m-%d')

    def save_item(self, item, data, instance, commit=True):
        # If the item is new, set up a publication date one hour from now
        if not instance.pk:
            instance.pub_date = datetime.now() + timedelta(hours=1)
        if commit:
            instance.save()
        return instance
```
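To make the role of `field_map` and the date conversion more concrete, here is a minimal standalone sketch that uses only the Python standard library. It is not how django-importer is implemented internally. The `extract_items` and `parse_date` helpers and the `<items>` root element are assumptions made for illustration.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Sample source; the <items> root element name is an assumption for this sketch.
SOURCE = """
<items>
  <item>
    <id>1</id>
    <date>2009-04-20</date>
    <title>django-importer released</title>
    <content>Today, django-importer has been released...</content>
  </item>
</items>
"""

# Model field name -> XML node name, mirroring the importer's field_map.
FIELD_MAP = {
    'external_id': 'id',
    'creation_date': 'date',
    'headline': 'title',
    'story': 'content',
}


def parse_date(value):
    """Convert a 'YYYY-MM-DD' string into a datetime at midnight."""
    return datetime.strptime(value, '%Y-%m-%d')


def extract_items(xml_text, item_tag='item', field_map=FIELD_MAP):
    """Hypothetical helper: return one dict per item, keyed by model field name."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for node in root.iter(item_tag):
        # Read each mapped child node's text into the matching field name.
        row = {field: node.findtext(source) for field, source in field_map.items()}
        row['external_id'] = int(row['external_id'])
        row['creation_date'] = parse_date(row['creation_date'])
        rows.append(row)
    return rows


rows = extract_items(SOURCE)
print(rows[0]['headline'], rows[0]['creation_date'])
```

Each resulting dict uses model field names as keys, which is roughly the shape of data an importer would hand to the model before saving.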

And that's it. Now we can instantiate our importer and start parsing.

```python
from news.importers import MyXmlImporter

importer = MyXmlImporter('path/to/source.xml')
importer.parse()
```
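When the same source is parsed more than once, `unique_fields` is what identifies an item as already imported. The sketch below illustrates the general idea of detecting new and changed items with plain dictionaries. The `classify` function is hypothetical and does not reflect django-importer's actual code, which works against the Django ORM.

```python
def classify(existing, incoming, unique_fields=('external_id',)):
    """Hypothetical helper: split incoming rows into new, changed and unchanged."""
    def key(row):
        # Build a lookup key from the fields that identify an item as unique.
        return tuple(row[name] for name in unique_fields)

    index = {key(row): row for row in existing}
    result = {'new': [], 'changed': [], 'unchanged': []}
    for row in incoming:
        current = index.get(key(row))
        if current is None:
            result['new'].append(row)
        elif any(current.get(field) != value for field, value in row.items()):
            result['changed'].append(row)
        else:
            result['unchanged'].append(row)
    return result


existing = [
    {'external_id': 1, 'headline': 'Old title'},
    {'external_id': 2, 'headline': 'Same title'},
]
incoming = [
    {'external_id': 1, 'headline': 'New title'},
    {'external_id': 2, 'headline': 'Same title'},
    {'external_id': 3, 'headline': 'Fresh item'},
]
result = classify(existing, incoming)
print({name: [row['external_id'] for row in rows] for name, rows in result.items()})
```

Keying the lookup on a tuple keeps the comparison working when more than one field is needed to identify an item.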