https://github.com/spithash/xml-extractor
An XML extractor for products matching specific elements using regular expressions written in Python.
python wpallimport xml xml-extractor xml-parser
- Host: GitHub
- URL: https://github.com/spithash/xml-extractor
- Owner: spithash
- License: gpl-3.0
- Created: 2023-02-01T16:29:44.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-13T10:16:40.000Z (almost 2 years ago)
- Last Synced: 2024-11-17T11:51:29.870Z (3 months ago)
- Topics: python, wpallimport, xml, xml-extractor, xml-parser
- Language: Python
- Homepage:
- Size: 30.3 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# XML-Extractor
An XML extractor for products matching specific elements using regular expressions, written in Python. There's also a progress bar while the XML is being fetched.
## WHY?
I personally use this script to extract (match) products belonging to certain categories from an XML URL containing thousands of products, keep only the ones I select, and write them to another file. Use it to create custom (category-specific) XMLs and import the products of 'output.xml' into WpAllImport.
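The progress bar mentioned in the introduction could look roughly like the sketch below. It is not the script's actual code: `read_with_progress` is a hypothetical helper, and an in-memory stream stands in for the network response so the example runs offline.

```python
import io
import sys


def read_with_progress(stream, total_bytes, chunk_size=8192, out=sys.stderr):
    """Read a file-like object in chunks and draw a simple text progress bar."""
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received.extend(chunk)
        if total_bytes:
            # Fraction completed, capped at 100% in case the size was underreported.
            done = min(len(received) / total_bytes, 1.0)
            bar = "#" * int(done * 30)
            out.write(f"\r[{bar:<30}] {done:4.0%}")
            out.flush()
    out.write("\n")
    return bytes(received)


# Stand-in for a downloaded feed (assumption: the real script reads from a URL).
fake_feed = b"<products>" + b"<product/>" * 1000 + b"</products>"
data = read_with_progress(io.BytesIO(fake_feed), len(fake_feed), chunk_size=1024)
```

With a real feed, the object returned by `urllib.request.urlopen(url)` can be passed as `stream`, and the total size can usually be taken from the `Content-Length` response header when the server sends one.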
## USAGE
Change the selector values: pick the element to match on, for example `<level3_category_description>` (change it to match your own feed), and match the products belonging to a category by editing the value of 'desired_category' in the lines below:
```
selectorprefix = "<level3_category_description>"
selectorsuffix = "</level3_category_description>"
```
and
```
desired_category = re.compile("My category name.*")
```
level3_category_description is an element of a product entry. Selecting that element and changing the ```desired_category``` string value (which also supports regex) __selects__ the product category. So if you want to select another category, change ```desired_category = re.compile("Smartphones.*")``` to match your selection.
You should also change the ```output_file_name``` variable to the name of your output file, because an existing file with that name will be overwritten.
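To illustrate how the selector and `desired_category` might work together, here is a minimal sketch. It is not the script's actual code: the `<product>` wrapper tag, the `<name>` element, the selector values, and the `matching_entries` helper are all assumptions made for the example.

```python
import re

# Assumed values: the element name comes from this README, while the
# <product> wrapper tag is a guess at what a feed entry looks like.
selectorprefix = "<level3_category_description>"
selectorsuffix = "</level3_category_description>"
desired_category = re.compile("Smartphones.*")

# Hypothetical pattern that captures one whole entry at a time.
entry_pattern = re.compile(r"<product>.*?</product>", re.DOTALL)


def matching_entries(xml_text):
    """Return the entries whose selected element matches desired_category."""
    selector = re.compile(
        re.escape(selectorprefix) + r"(.*?)" + re.escape(selectorsuffix),
        re.DOTALL,
    )
    matches = []
    for entry in entry_pattern.findall(xml_text):
        found = selector.search(entry)
        # Pattern.match anchors at the start of the element text.
        if found and desired_category.match(found.group(1).strip()):
            matches.append(entry)
    return matches


sample = """<products>
<product><name>Phone A</name><level3_category_description>Smartphones</level3_category_description></product>
<product><name>Cable B</name><level3_category_description>Cables</level3_category_description></product>
<product><name>Phone C</name><level3_category_description>Smartphones and tablets</level3_category_description></product>
</products>"""

selected = matching_entries(sample)
print(len(selected))  # 2
```

Because the pattern ends in `.*`, it matches any category that starts with "Smartphones", which is why "Smartphones and tablets" is selected as well.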
### Example product entry
```
22301
22929
ΤΕΜ
Product Name
Product Description
https://example.com/photos/e7207152345c.jpg
Smartphones
147-2
10
out of stock
2.52
2.03
1.64
```
### OUTPUT
The script writes the matching products (or rather, entries) to an XML file called **output.xml**. Enjoy :)
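As a rough sketch of the output step, assuming the matched entries are kept as strings and wrapped in a single root element: the `write_output` helper and the `products` root tag are hypothetical, not taken from the script. The demo writes into a temporary directory so that no existing file is overwritten.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

output_file_name = "output.xml"


def write_output(entries, path=output_file_name, root="products"):
    """Write the entries inside one root element, replacing any old file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', f"<{root}>"]
    lines.extend(entries)
    lines.append(f"</{root}>")
    # write_text truncates an existing file, so old output is lost.
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(entries)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / output_file_name
    written = write_output(["<product><name>Phone A</name></product>"], target)
    # Parse the file back to check that it is well-formed XML.
    root = ET.parse(target).getroot()
    print(written, root.tag, len(root))
```

Wrapping the entries in one root element keeps the file well-formed, which an importer will generally expect.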