https://github.com/spithash/xml-extractor

An XML extractor for products matching specific elements using regular expressions written in Python.

python wpallimport xml xml-extractor xml-parser


# XML-Extractor
A Python script that extracts products from an XML feed when a chosen element matches a regular expression. It also shows a progress bar while fetching the XML.

## WHY?
I use this script to pull products in certain categories out of an XML URL containing thousands of products, keep only the ones I select, and write them to another file.

Use it to create custom (category-specific) XML files and import the products from 'output.xml' into WpAllImport.

## USAGE
Change the selector values: pick the element to match on (for example `level3_category_description`; change it to match your feed), then set the 'desired_category' variable to the category whose products you want. The relevant lines are:
```
selectorprefix = ""
selectorsuffix = ""
```
and
```
desired_category = re.compile("My category name.*")
```
`level3_category_description` is an element of a product entry. Selecting it and changing the ```desired_category``` pattern (a regular expression) __selects__ the product category.
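
To make the selection step concrete, here is a minimal sketch of how a prefix/suffix pair and the compiled pattern could work together. It is not the script's actual code: the selector values, the `<product>` entry tag, the `select_entries` helper and the sample feed are all assumptions made for illustration.

```python
import re

# Assumed selector values (the README's own values are not shown here).
selectorprefix = "<level3_category_description>"
selectorsuffix = "</level3_category_description>"
desired_category = re.compile("Smartphones.*")

# Hypothetical entry tag: replace "product" with whatever wraps one entry in your feed.
ENTRY_RE = re.compile(r"<product>.*?</product>", re.DOTALL)


def select_entries(xml_text):
    """Return the raw entries whose selected element matches desired_category."""
    selector = re.compile(
        re.escape(selectorprefix) + r"(.*?)" + re.escape(selectorsuffix),
        re.DOTALL,
    )
    matches = []
    for entry in ENTRY_RE.findall(xml_text):
        found = selector.search(entry)
        # Keep the entry only if the element exists and its text matches.
        if found and desired_category.match(found.group(1).strip()):
            matches.append(entry)
    return matches


# Made-up sample feed with two entries in different categories.
sample = (
    "<products>"
    "<product><name>Phone A</name>"
    "<level3_category_description>Smartphones</level3_category_description></product>"
    "<product><name>Cable B</name>"
    "<level3_category_description>Cables</level3_category_description></product>"
    "</products>"
)
selected = select_entries(sample)
print(len(selected))
```

Only the entry whose category text matches the pattern is kept. A regex-based approach like this works on the raw text, so it depends on the feed using the exact tag spelling given in the prefix and suffix.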

To select another category, change the pattern in ```desired_category = re.compile("Smartphones.*")``` to match your selection.
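
How a pattern like this behaves depends on whether it is applied with `match` (anchored at the start of the text) or `search` (anywhere in the text). Which of the two the script uses is not stated here, so the sketch below shows both, using made-up category names.

```python
import re

desired_category = re.compile("Smartphones.*")

# Made-up category names for illustration.
names = [
    "Smartphones",
    "Smartphones & Accessories",
    "Refurbished Smartphones",
    "Tablets",
]

# match() only succeeds when the text starts with "Smartphones".
anchored = [n for n in names if desired_category.match(n)]

# search() also accepts "Smartphones" appearing later in the text.
anywhere = [n for n in names if desired_category.search(n)]

# Alternation selects several categories with one pattern.
several = re.compile("(Smartphones|Tablets).*")
multi = [n for n in names if several.match(n)]

print(anchored)
print(anywhere)
print(multi)
```

If a category name contains regex metacharacters such as `(` or `+`, wrapping the literal part in `re.escape` avoids accidental mismatches.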

Also change the ```output_file_name``` variable to the name of your output file; an existing file with that name will be overwritten.

### Example product entry
```

22301
22929
ΤΕΜ
Product Name
Product Description
https://example.com/photos/e7207152345c.jpg
Smartphones
147-2
10
out of stock
2.52
2.03
1.64

```

### OUTPUT
The script will output matching products (or rather entries) in an XML file called **output.xml**.
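
As a rough illustration of the output step, the sketch below writes a list of matched entries into one XML file. The `write_output` helper, the `<products>` root element and the sample entry are assumptions, not taken from the script. Opening the file in `"w"` mode is what replaces any earlier file of the same name.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_output(entries, output_file_name, root="products"):
    """Write raw entry strings inside a single root element (hypothetical helper)."""
    # "w" mode truncates an existing file, so earlier output is overwritten.
    with open(output_file_name, "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        fh.write(f"<{root}>\n")
        for entry in entries:
            fh.write(entry + "\n")
        fh.write(f"</{root}>\n")


# Made-up entry; a temporary directory keeps the example from touching real files.
entries = ["<product><name>Phone A</name></product>"]
output_path = os.path.join(tempfile.mkdtemp(), "output.xml")
write_output(entries, output_path)

# Parse the result back to confirm it is well-formed XML.
entry_count = len(ET.parse(output_path).getroot())
print(entry_count)
```

Parsing the file back is a cheap way to check that it is well-formed before importing it elsewhere.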

Enjoy :)