https://github.com/parvez3019/home24-page-analyser
- Host: GitHub
- URL: https://github.com/parvez3019/home24-page-analyser
- Owner: parvez3019
- Created: 2021-09-08T17:02:43.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2021-09-12T10:48:13.000Z (over 3 years ago)
- Last Synced: 2024-10-12T07:24:13.971Z (3 months ago)
- Language: Go
- Size: 7.09 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Home24 Assignment
### Task description
Create a web application which takes a website URL as an input and provides general information
about the contents of the page:
- HTML Version
- Page Title
- Headings count by level
- Number of internal and external links
- Number of inaccessible links
- Whether the page contains a login form

## Instructions to RUN
- Install Docker and start the Docker daemon.
- Run the following command to start the server:

``` shell
make docker-run
```

- Wait for the server to start.
- Run the following curl command to send a request to the server:

``` shell
curl --location --request POST 'localhost:8000/analyse' \
--header 'Content-Type: application/json' \
--data-raw '{"page_url": "https://www.facebook.com"}'
```

- Replace the `page_url` value to analyse a different page.
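
The same request can also be sent from Go. The following is a minimal sketch, assuming the server is running locally on port 8000 as in the curl command above. The names `analyseRequest`, `buildBody`, and `analyse` are illustrative and are not taken from the repository.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// analyseRequest mirrors the JSON body sent by the curl command.
// The type name is hypothetical; only the "page_url" key comes from the README.
type analyseRequest struct {
	PageURL string `json:"page_url"`
}

// buildBody encodes the request payload for the given page URL.
func buildBody(pageURL string) ([]byte, error) {
	return json.Marshal(analyseRequest{PageURL: pageURL})
}

// analyse posts the payload to the endpoint and returns the raw response body.
func analyse(endpoint, pageURL string) (string, error) {
	body, err := buildBody(pageURL)
	if err != nil {
		return "", err
	}
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	// Assumes the server was started with `make docker-run`.
	out, err := analyse("http://localhost:8000/analyse", "https://www.facebook.com")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(out)
}
```

If the server is not running, the program prints the connection error instead of a response.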
### Sample Request
```json
{
  "page_url": "https://www.facebook.com"
}
```

### Sample Response
```json
{
  "has_login_form": true,
  "header_count": {
    "h1": 1,
    "h2": 2,
    "h3": 3,
    "h4": 4,
    "h5": 5,
    "h6": 6
  },
  "html_version": "HTML 3",
  "links": {
    "external": {
      "count": 4,
      "urls": [
        "https://www.facebook.com",
        "https://www.google.com",
        "https://www.amazon.com",
        "https://www.example.com/inaccessible"
      ]
    },
    "internal": {
      "count": 2,
      "urls": [
        "/internal1",
        "/internal2"
      ]
    },
    "inaccessible": {
      "count": 1,
      "urls": [
        "https://www.example.com/inaccessible"
      ]
    }
  },
  "title": "Parvez Hassan Test Page"
}
```

### Notes for reviewer
- I have written integration and unit tests for every file, but could not complete all unit test cases for the parser
  file due to time constraints.
- I have made the following assumptions in my implementation, because these requirements were not clear in the
  problem statement:
  - Since the login form implementation can differ from page to page, I currently detect it based on the
    existence of both a password-type field and a submit-type field.
  - A link is considered accessible only if it returns a 200 response. The 3xx series could also be treated as a
    valid response if required.
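
The two assumptions above can be sketched in Go as follows. This is an illustration under stated assumptions, not the repository's actual code: it uses regular expressions from the standard library as a stand-in for a real HTML parser, it looks for the two field types anywhere on the page rather than inside a single form, and the names `hasLoginForm`, `isAccessibleStatus`, and `checkLink` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// Regex-based stand-ins for a real HTML parser (a simplification of this sketch).
var (
	passwordInput = regexp.MustCompile(`(?i)<input\b[^>]*\btype\s*=\s*["']?password\b`)
	submitControl = regexp.MustCompile(`(?i)<(?:input|button)\b[^>]*\btype\s*=\s*["']?submit\b`)
)

// hasLoginForm reports whether the markup contains both a password-type field
// and a submit-type control, mirroring the heuristic described in the notes.
func hasLoginForm(html string) bool {
	return passwordInput.MatchString(html) && submitControl.MatchString(html)
}

// isAccessibleStatus applies the accessibility rule: only 200 counts by default.
// With allowRedirects set, any 3xx status is accepted as well.
func isAccessibleStatus(code int, allowRedirects bool) bool {
	if code == http.StatusOK {
		return true
	}
	return allowRedirects && code >= 300 && code < 400
}

// checkLink requests the URL and classifies the response status.
// Redirects are not followed, so a 3xx status is seen as-is.
func checkLink(url string, allowRedirects bool) bool {
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return false // network errors count as inaccessible
	}
	defer resp.Body.Close()
	return isAccessibleStatus(resp.StatusCode, allowRedirects)
}

func main() {
	page := `<form><input type="password"><input type="submit"></form>`
	fmt.Println("login form:", hasLoginForm(page))
	fmt.Println("301 strict:", isAccessibleStatus(301, false))
	fmt.Println("301 relaxed:", isAccessibleStatus(301, true))
}
```

Keeping the status rule in its own function makes the 200-only versus 3xx-tolerant choice a single flag, so the behaviour can be changed without touching the request logic.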